• Level: GCSE
• Subject: Maths
• Word count: 7036

# Data Handling Coursework

Extracts from this document...

Introduction

Mamoona Mohsin

Data handling coursework

Introduction:

I have been provided with a set of data on 1183 pupils, including height, weight, gender, year group, eye colour, age, hair colour, left- or right-handedness, favourite music, favourite sport, favourite colour and many more. I am going to consider weight, height, year group and gender. I am going to set up three hypotheses about relationships between these factors and test them using different statistical techniques.

Sampling:

Stratified sampling:

I will take a 10% sample of the pupils using random stratified sampling.

Total pupils in school = 1183

10% of 1183 ≈ 118 pupils

Sample size for each year group = (pupils in the year group / total pupils in school) × total sample size

| Year group | Random sample size |
|---|---|
| 7 | (282/1183) × 118 ≈ 28 |
| 8 | (270/1183) × 118 ≈ 27 |
| 9 | (261/1183) × 118 ≈ 26 |
| 10 | (200/1183) × 118 ≈ 20 |
| 11 | (170/1183) × 118 ≈ 17 |
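The proportional allocation above can be checked with a short Python sketch. The year-group sizes are taken from the table; the names `year_sizes` and `stratum_size` are illustrative only, and rounding to the nearest whole pupil is an assumption.

```python
# Proportional (stratified) allocation of a 10% sample across year groups.
year_sizes = {7: 282, 8: 270, 9: 261, 10: 200, 11: 170}  # pupils per year group

population = sum(year_sizes.values())    # total pupils in the school
sample_size = round(0.10 * population)   # 10% of the school, as a whole number


def stratum_size(pupils_in_year: int) -> int:
    """Share of the sample for one year group, rounded to a whole pupil."""
    return round(pupils_in_year / population * sample_size)


allocation = {year: stratum_size(n) for year, n in year_sizes.items()}
print(allocation)
```

Summing the values of `allocation` is a quick way to confirm that rounding has not changed the overall sample size.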

Random and systematic sampling:

I randomly chose the 8th pupil as the starting point and then selected every 10th pupil using systematic sampling, to complete the sample of 118 pupils.
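As a rough illustration of systematic sampling, the sketch below takes every 10th pupil starting from the 8th. It assumes the pupils are numbered 1 to 1183 in a single list, which the coursework does not state; `systematic_sample` and `pupils` are hypothetical names.

```python
def systematic_sample(items, step, start):
    """Return every `step`-th item, beginning at 0-based index `start`."""
    return items[start::step]


pupils = list(range(1, 1184))  # hypothetical pupil numbers 1..1183
start = 7                      # 0-based index of the 8th pupil, as in the text
# A random starting point could instead be drawn with random.randrange(10).
sample = systematic_sample(pupils, step=10, start=start)
print(sample[:3], len(sample))
```

Under these assumptions the slice picks pupils 8, 18, 28 and so on, giving 118 pupils in total.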

Data:

One weight value is missing from the data, possibly because the person who collected the data did not weigh that pupil.

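| Year Group | Gender | Height (m) | Weight (kg) |
|---|---|---|---|
| 7 | Female | 1.60 | 42 |
| 7 | Female | 1.64 | 140 |
| 7 | Female | 1.53 | 40 |
| 7 | Female | 1.52 | 50 |
| 7 | Female | 1.64 | 47 |
| 7 | Female | 1.61 | 47 |
| 7 | Female | 1.66 | 45 |
| 7 | Female | 1.59 | 38 |
| 7 | Female | 1.73 | 49 |
| 7 | Female | 1.59 | 38 |
| 7 | Female | 1.51 | 50 |
| 7 | Female | 1.46 | 45 |
| 7 | Female | 1.63 | 51 |
| 7 | Male | 1.47 | 50 |
| 7 | Male | 1.48 | 44 |
| 7 | Male | 1.57 | 56 |
| 7 | Male | 1.55 | 53 |
| 7 | Male | 1.63 | 50 |
| 7 | Male | 1.57 | 48 |
| 7 | Male | 1.70 | 57 |
| 7 | Male | 1.55 | 32 |
| 7 | Male | 1.62 | 42 |
| 7 | Male | 1.69 | 57 |
| 7 | Male | 1.51 | 59 |
| 7 | Male | 1.54 | 43 |
| 7 | Male | 1.55 | 50 |
| 7 | Male | 1.49 | 38 |
| 7 | Male | 1.45 | (missing) |
| 8 | Female | 1.32 | 48 |
| 8 | Female | 1.72 | 50 |
| 8 | Female | 1.59 | 55 |
| 8 | Female | 1.61 | 48 |
| 8 | Female | 1.62 | 46 |
| 8 | Female | 1.56 | 54 |
| 8 | Female | 1.52 | 58 |
| 8 | Female | 1.57 | 45 |
| 8 | Female | 1.60 | 57 |
| 8 | Female | 1.59 | 44 |
| 8 | Female | 1.58 | 52 |
| 8 | Female | 1.59 | 50 |
| 8 | Female | 1.55 | 42 |
| 8 | Male | 1.72 | 57 |
| 8 | Male | 1.52 | 43 |
| 8 | Male | 1.48 | 40 |
| 8 | Male | 1.72 | 50 |
| 8 | Male | 1.50 | 50 |
| 8 | Male | 1.62 | 38 |
| 8 | Male | 1.48 | 26 |
| 8 | Male | 1.74 | 45 |
| 8 | Male | 1.77 | 54 |
| 8 | Male | 1.73 | 56 |
| 8 | Male | 1.55 | 43 |
| 8 | Male | 1.45 | 72 |
| 8 | Male | 1.52 | 45 |
| 8 | Male | 2.00 | 35 |
| 9 | Female | 1.65 | 48 |
| 9 | Female | 1.60 | 41 |
| 9 | Female | 1.52 | 45 |
| 9 | Female | 1.66 | 45 |
| 9 | Female | 1.55 | 36 |
| 9 | Female | 1.48 | 47 |
| 9 | Female | 1.54 | 54 |
| 9 | Female | 1.59 | 52 |
| 9 | Female | 1.62 | 42 |
| 9 | Female | 1.60 | 46 |
| 9 | Female | 1.60 | 46 |
| 9 | Female | 1.53 | 48 |
| 9 | Female | 1.78 | 59 |
| 9 | Female | 1.06 | 74 |
| 9 | Male | 1.60 | 40 |
| 9 | Male | 1.48 | 43 |
| 9 | Male | 1.60 | 49 |
| 9 | Male | 1.58 | 65 |
| 9 | Male | 1.80 | 64 |
| 9 | Male | 1.75 | 63 |
| 9 | Male | 1.70 | 42 |
| 9 | Male | 1.58 | 50 |
| 9 | Male | 1.43 | 60 |
| 9 | Male | 1.65 | 51 |
| 9 | Male | 1.74 | 61 |
| 9 | Male | 1.81 | 68 |
| 10 | Female | 1.61 | 54 |
| 10 | Female | 1.50 | 40 |
| 10 | Female | 1.55 | 50 |
| 10 | Female | 1.73 | 45 |
| 10 | Female | 1.47 | 45 |
| 10 | Female | 1.56 | 56 |
| 10 | Female | 1.51 | 36 |
| 10 | Female | 1.62 | 52 |
| 10 | Female | 1.60 | 50 |
| 10 | Female | 1.70 | 60 |
| 10 | Male | 1.54 | 76 |
| 10 | Male | 1.85 | 70 |
| 10 | Male | 1.75 | 45 |
| 10 | Male | 1.79 | 75 |
| 10 | Male | 1.57 | 49 |
| 10 | Male | 1.57 | 54 |
| 10 | Male | 1.70 | 59 |
| 10 | Male | 1.55 | 64 |
| 10 | Male | 1.68 | 64 |
| 10 | Male | 1.55 | 50 |
| 11 | Female | 1.68 | 54 |
| 11 | Female | 1.52 | 44 |
| 11 | Female | 1.67 | 52 |
| 11 | Female | 1.68 | 48 |
| 11 | Female | 1.65 | 52 |
| 11 | Female | 1.60 | 55 |
| 11 | Female | 1.62 | 51 |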

Middle

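| Weight (a) kg | Frequency (f) | Frequency (f) x weight (a) | weight x weight (a²) | f x a² |
|---|---|---|---|---|
| 26 | 1 | 26 x 1 = 26 | 676 | 676 |
| 32 | 1 | 32 x 1 = 32 | 1024 | 1024 |
| 35 | 1 | 35 x 1 = 35 | 1225 | 1225 |
| 38 | 2 | 38 x 2 = 76 | 1444 | 2888 |
| 40 | 2 | 40 x 2 = 80 | 1600 | 3200 |
| 42 | 2 | 42 x 2 = 84 | 1764 | 3528 |
| 43 | 4 | 43 x 4 = 172 | 1849 | 7396 |
| 44 | 1 | 44 x 1 = 44 | 1936 | 1936 |
| 45 | 3 | 45 x 3 = 135 | 2025 | 6075 |
| 48 | 1 | 48 x 1 = 48 | 2304 | 2304 |
| 49 | 2 | 49 x 2 = 98 | 2401 | 4802 |
| 50 | 8 | 50 x 8 = 400 | 2500 | 20000 |
| 51 | 1 | 51 x 1 = 51 | 2601 | 2601 |
| 53 | 1 | 53 x 1 = 53 | 2809 | 2809 |
| 54 | 2 | 54 x 2 = 108 | 2916 | 5832 |
| 56 | 2 | 56 x 2 = 112 | 3136 | 6272 |
| 57 | 3 | 57 x 3 = 171 | 3249 | 9747 |
| 58 | 1 | 58 x 1 = 58 | 3364 | 3364 |
| 59 | 2 | 59 x 2 = 118 | 3481 | 6962 |
| 60 | 2 | 60 x 2 = 120 | 3600 | 7200 |
| 61 | 1 | 61 x 1 = 61 | 3721 | 3721 |
| 63 | 2 | 63 x 2 = 126 | 3969 | 7938 |
| 64 | 3 | 64 x 3 = 192 | 4096 | 12288 |
| 65 | 1 | 65 x 1 = 65 | 4225 | 4225 |
| 66 | 1 | 66 x 1 = 66 | 4356 | 4356 |
| 68 | 2 | 68 x 2 = 136 | 4624 | 9248 |
| 70 | 1 | 70 x 1 = 70 | 4900 | 4900 |
| 72 | 2 | 72 x 2 = 144 | 5184 | 10368 |
| 75 | 1 | 75 x 1 = 75 | 5625 | 5625 |
| 76 | 1 | 76 x 1 = 76 | 5776 | 5776 |
| 92 | 1 | 92 x 1 = 92 | 8464 | 8464 |
| Total | ∑f = 58 | ∑fa = 3124 | ∑a² = 100844 | ∑fa² = 176750 |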

Mean (ā) = ∑fa/∑f = 3124/58 = 53.86 kg

$n = \sum f$

Standard deviation $(s) = \sqrt{\dfrac{\sum fa^2 - n\bar{a}^2}{n - 1}} = \sqrt{\dfrac{176750 - 58 \times 53.86^2}{58 - 1}}$

$s = 12.2$

Variance $(s^2) = 148.84$

Lower end = 53.86 – 2 x 12.2 = 29.46

Upper end = 53.86 + 2 x 12.2 = 78.26

26 < 29.46 therefore 26 is an outlier.

92 > 78.26 therefore 92 is an outlier.
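The calculation above (mean, sample standard deviation with the n − 1 divisor, and the "mean ± 2 standard deviations" outlier test) can be reproduced with a minimal Python sketch. The frequencies are copied from the frequency table above; `weight_freq` and `summarise` are illustrative names, not part of the coursework.

```python
from math import sqrt

# Weight (kg) -> frequency, copied from the frequency table above.
weight_freq = {26: 1, 32: 1, 35: 1, 38: 2, 40: 2, 42: 2, 43: 4, 44: 1, 45: 3,
               48: 1, 49: 2, 50: 8, 51: 1, 53: 1, 54: 2, 56: 2, 57: 3, 58: 1,
               59: 2, 60: 2, 61: 1, 63: 2, 64: 3, 65: 1, 66: 1, 68: 2, 70: 1,
               72: 2, 75: 1, 76: 1, 92: 1}


def summarise(freq):
    """Mean, sample standard deviation and 2-sd outliers of a frequency table."""
    n = sum(freq.values())                              # sum of f
    sum_fa = sum(a * f for a, f in freq.items())        # sum of f * a
    sum_fa2 = sum(a * a * f for a, f in freq.items())   # sum of f * a^2
    mean = sum_fa / n
    s = sqrt((sum_fa2 - n * mean ** 2) / (n - 1))
    low, high = mean - 2 * s, mean + 2 * s
    outliers = sorted(a for a in freq if a < low or a > high)
    return mean, s, outliers


mean, s, outliers = summarise(weight_freq)
print(round(mean, 2), round(s, 1), outliers)  # 53.86 12.2 [26, 92]
```

The same function can be applied to any of the other frequency tables by swapping in a different dictionary.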

Girls:

| Weight (a) kg | Frequency (f) | Frequency (f) x weight (a) | weight x weight (a²) | f x a² |
|---|---|---|---|---|
| 36 | 2 | 36 x 2 = 72 | 1296 | 2592 |
| 38 | 2 | 38 x 2 = 76 | 1444 | 2888 |
| 40 | 2 | 40 x 2 = 80 | 1600 | 3200 |
| 41 | 1 | 41 x 1 = 41 | 1681 | 1681 |
| 42 | 3 | 42 x 3 = 126 | 1764 | 5292 |
| 44 | 2 | 44 x 2 = 88 | 1936 | 3872 |
| 45 | 7 | 45 x 7 = 315 | 2025 | 14175 |
| 46 | 3 | 46 x 3 = 138 | 2116 | 6348 |
| 47 | 3 | 47 x 3 = 141 | 2209 | 6627 |
| 48 | 5 | 48 x 5 = 240 | 2304 | 11520 |
| 49 | 1 | 49 x 1 = 49 | 2401 | 2401 |
| 50 | 6 | 50 x 6 = 300 | 2500 | 15000 |
| 51 | 3 | 51 x 3 = 153 | 2601 | 7803 |
| 52 | 5 | 52 x 5 = 260 | 2704 | 13520 |
| 54 | 4 | 54 x 4 = 216 | 2916 | 11664 |
| 55 | 2 | 55 x 2 = 110 | 3025 | 6050 |
| 56 | 1 | 56 x 1 = 56 | 3136 | 3136 |
| 57 | 1 | 57 x 1 = 57 | 3249 | 3249 |
| 58 | 1 | 58 x 1 = 58 | 3364 | 3364 |
| 59 | 1 | 59 x 1 = 59 | 3481 | 3481 |
| 60 | 2 | 60 x 2 = 120 | 3600 | 7200 |
| 74 | 1 | 74 x 1 = 74 | 5476 | 5476 |
| 140 | 1 | 140 x 1 = 140 | 19600 | 19600 |
| Total | ∑f = 59 | ∑fa = 2969 | ∑a² = 76428 | ∑fa² = 160139 |

Mean (ā) = ∑fa/∑f = 2969/59 = 50.32 kg

Standard deviation $(s) = \sqrt{\dfrac{\sum fa^2 - n\bar{a}^2}{n - 1}} = \sqrt{\dfrac{160139 - 59 \times 50.32^2}{59 - 1}}$

$s = 13.6$

Variance $(s^2) = 184.96$

Lower end = 50.32 – 2 x 13.6 = 23.12

Upper end = 50.32 + 2 x 13.6 = 77.52

There is no outlier at lower extreme, but 140 is an outlier at upper extreme because 140 > 77.52.

The variance for girls is greater than for boys, which means the girls' weights are more varied. This could be due to the outlier at the upper extreme.

Year 7:

n = 28, but there is one value missing in the data for a male weight therefore n = 27.

Male

n = 14

32, 38, 42, 43, 44, 48, 50, 50, 50, 53, 56, 57, 57, 59

Female

n = 13

38, 38, 40, 42, 45, 45, 47, 47, 49, 50, 50, 51, 140

Key 30|2 = 32

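| Female | Weight (kg) | Male |
|---:|:---:|:---|
| 8 8 | 30 | 2 8 |
| 9 7 7 5 5 2 0 | 40 | 2 3 4 8 |
| 1 0 0 | 50 | 0 0 0 3 6 7 7 9 |
| ⋮ | ⋮ | |
| 0 | 140 | |

| Averages | Boys | Girls |
|---|---|---|
| Modal class (kg) | 50 - 59 | 40 - 49 |
| Median (kg) | 50 | 47 |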

The table above shows that the modal class and the median are both greater for boys than for girls; this supports my hypothesis that boys in Year 7 are heavier than girls.

Range (kg): Boys: 59 – 32 = 27 kg; Girls: 140 – 38 = 102 kg

The range for girls is greater than for boys; the reason could be the anomalous value of one girl who weighs 140 kg.
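The Year 7 averages and ranges can be recomputed directly from the two ordered lists of weights. This is only a sketch: `modal_class` is a hypothetical helper that assumes 10 kg class intervals (30 - 39, 40 - 49, ...), matching the stems of the diagram.

```python
from collections import Counter
from statistics import median

# Year 7 weights (kg), as listed above.
boys = [32, 38, 42, 43, 44, 48, 50, 50, 50, 53, 56, 57, 57, 59]
girls = [38, 38, 40, 42, 45, 45, 47, 47, 49, 50, 50, 51, 140]


def modal_class(weights, width=10):
    """Most frequent class interval, as (lower, upper) with both ends inclusive."""
    counts = Counter(w // width * width for w in weights)
    lower = counts.most_common(1)[0][0]
    return lower, lower + width - 1


for name, data in (("Boys", boys), ("Girls", girls)):
    # median, modal class and range (largest value minus smallest value)
    print(name, median(data), modal_class(data), max(data) - min(data))
```

Because the median and the modal class ignore how extreme the largest value is, they are much less affected by the 140 kg value than the range.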

Boys:

| Weight (a) kg | Frequency (f) | Frequency (f) x weight (a) | weight x weight (a²) | f x a² |
|---|---|---|---|---|
| 32 | 1 | 32 x 1 = 32 | 1024 | 1024 |
| 38 | 1 | 38 x 1 = 38 | 1444 | 1444 |
| 42 | 1 | 42 x 1 = 42 | 1764 | 1764 |
| 43 | 1 | 43 x 1 = 43 | 1849 | 1849 |
| 44 | 1 | 44 x 1 = 44 | 1936 | 1936 |
| 48 | 1 | 48 x 1 = 48 | 2304 | 2304 |
| 50 | 3 | 50 x 3 = 150 | 2500 | 7500 |
| 53 | 1 | 53 x 1 = 53 | 2809 | 2809 |
| 56 | 1 | 56 x 1 = 56 | 3136 | 3136 |
| 57 | 2 | 57 x 2 = 114 | 3249 | 6498 |
| 59 | 1 | 59 x 1 = 59 | 3481 | 3481 |
| Total | ∑f = 14 | ∑fa = 679 | ∑a² = 25496 | ∑fa² = 33745 |

Mean (ā) = ∑fa/∑f = 679/14 = 48.50 kg

Standard deviation $(s) = \sqrt{\dfrac{\sum fa^2 - n\bar{a}^2}{n - 1}} = \sqrt{\dfrac{33745 - 14 \times 48.50^2}{14 - 1}}$

$s = 7.91$

Variance $(s^2) = 62.57$

Lower end = 48.50 – 2 x 7.91 = 32.68

Upper end = 48.50 + 2 x 7.91 = 64.32

32 < 32.68 therefore 32 is an outlier. There is no outlier at the upper extreme for year 7 boys.

Girls:

| Weight (a) kg | Frequency (f) | Frequency (f) x weight (a) | weight x weight (a²) | f x a² |
|---|---|---|---|---|
| 38 | 2 | 38 x 2 = 76 | 1444 | 2888 |
| 40 | 1 | 40 x 1 = 40 | 1600 | 1600 |
| 42 | 1 | 42 x 1 = 42 | 1764 | 1764 |
| 45 | 2 | 45 x 2 = 90 | 2025 | 4050 |
| 47 | 2 | 47 x 2 = 94 | 2209 | 4418 |
| 49 | 1 | 49 x 1 = 49 | 2401 | 2401 |
| 50 | 2 | 50 x 2 = 100 | 2500 | 5000 |
| 51 | 1 | 51 x 1 = 51 | 2601 | 2601 |
| 140 | 1 | 140 x 1 = 140 | 19600 | 19600 |
| Total | ∑f = 13 | ∑fa = 682 | ∑a² = 36144 | ∑fa² = 44322 |

Mean (ā) = ∑fa/∑f = 682/13 = 52.46 kg

Standard deviation $(s) = \sqrt{\dfrac{\sum fa^2 - n\bar{a}^2}{n - 1}} = \sqrt{\dfrac{44322 - 13 \times 52.46^2}{13 - 1}}$

$s = 26.68$

Variance $(s^2) = 711.8$

Lower end = 52.46 – 2 x 26.68 = -0.9

Upper end = 52.46 + 2 x 26.68 = 105.82

There is no outlier at the lower extreme, but 140 is an outlier at the upper extreme because 140 > 105.82. This outlier affects the standard deviation, variance and mean.
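To illustrate how strongly the single 140 kg value affects these statistics, the following sketch recomputes the mean and sample standard deviation with and without it. The outlier-free figures are not part of the original coursework and are shown only for comparison; `without_outlier` is an illustrative name.

```python
from statistics import mean, stdev

# Year 7 girls' weights (kg), as listed above.
girls = [38, 38, 40, 42, 45, 45, 47, 47, 49, 50, 50, 51, 140]
without_outlier = [w for w in girls if w != 140]

for label, data in (("with 140 kg", girls), ("without 140 kg", without_outlier)):
    # statistics.stdev uses the n - 1 divisor, matching the formula in the text
    print(label, round(mean(data), 2), round(stdev(data), 2))
```

Comparing the two printed lines shows how much of the spread comes from that one value.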

Year 11:

n = 17

Key 50|8 = 58

Boys

n = 8

50, 58, 60, 63, 66, 68, 72, 92

Girls

n = 9

44, 48, 51, 51, 52, 52, 54, 55, 60

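| Girls | Weight (kg) | Boys |
|---:|:---:|:---|
| 8 4 | 40 | |
| 5 4 2 2 1 1 | 50 | 0 8 |
| 0 | 60 | 0 3 6 8 |
| | 70 | 2 |
| | 80 | |
| | 90 | 2 |

| Averages | Boys | Girls |
|---|---|---|
| Modal class (kg) | 60 - 69 | 50 - 59 |
| Median (kg) | 64.5 | 52 |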

The table above shows that the modal class and the median are both greater for boys than for girls; this supports my hypothesis that boys in Year 11 are heavier than girls.

Range (kg): Boys: 92 – 50 = 42 kg; Girls: 60 – 44 = 16 kg

The range for boys is greater than for girls, which further strengthens my hypothesis.
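A back-to-back stem-and-leaf diagram like the one above can be built programmatically. The sketch below assumes stems in tens of kilograms (so 50|8 = 58, as in the key); `leaves_by_stem` and `back_to_back` are hypothetical helper names.

```python
from collections import defaultdict

# Year 11 weights (kg), as listed above.
boys = [50, 58, 60, 63, 66, 68, 72, 92]
girls = [44, 48, 51, 51, 52, 52, 54, 55, 60]


def leaves_by_stem(values):
    """Group the units digit (leaf) of each value under its tens stem."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10 * 10].append(v % 10)
    return groups


def back_to_back(left, right):
    """Rows of (left leaves reversed, stem, right leaves) for every stem in range."""
    lhs, rhs = leaves_by_stem(left), leaves_by_stem(right)
    stems = range(min(min(lhs), min(rhs)), max(max(lhs), max(rhs)) + 10, 10)
    return [(lhs.get(s, [])[::-1], s, rhs.get(s, [])) for s in stems]


rows = back_to_back(girls, boys)
for girl_leaves, stem, boy_leaves in rows:
    left = " ".join(map(str, girl_leaves))
    right = " ".join(map(str, boy_leaves))
    print(f"{left:>12} | {stem:2d} | {right}")
```

The left-hand leaves are reversed so that, as in a hand-drawn diagram, the smallest leaf sits next to the stem on both sides.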

Boys:

| Weight (a) kg | Frequency (f) | Frequency (f) x weight (a) | weight x weight (a²) | f x a² |
|---|---|---|---|---|
| 50 | 1 | 50 x 1 = 50 | 2500 | 2500 |
| 58 | 1 | 58 x 1 = 58 | 3364 | 3364 |
| 60 | 1 | 60 x 1 = 60 | 3600 | 3600 |
| 63 | 1 | 63 x 1 = 63 | 3969 | 3969 |
| 66 | 1 | 66 x 1 = 66 | 4356 | 4356 |
| 68 | 1 | 68 x 1 = 68 | 4624 | 4624 |
| 72 | 1 | 72 x 1 = 72 | 5184 | 5184 |
| 92 | 1 | 92 x 1 = 92 | 8464 | 8464 |
| Total | ∑f = 8 | ∑fa = 529 | ∑a² = 36061 | ∑fa² = 36061 |

Mean (ā) = ∑fa/∑f = 529/8 = 66.13 kg

Standard deviation $(s) = \sqrt{\dfrac{\sum fa^2 - n\bar{a}^2}{n - 1}} = \sqrt{\dfrac{36061 - 8 \times 66.13^2}{8 - 1}}$

$s = 12.43$

Variance $(s^2) = 154.41$

Lower end = 66.13 – 2 x 12.43 = 41.27

Upper end = 66.13 + 2 x 12.43 = 90.99

There is no outlier at lower extreme for year 11 boys, but 92 is an outlier at upper extreme.

Girls:

| Weight (a) kg | Frequency (f) | Frequency (f) x weight (a) | weight x weight (a²) | f x a² |
|---|---|---|---|---|
| 44 | 1 | 44 x 1 = 44 | 1936 | 1936 |
| 48 | 1 | 48 x 1 = 48 | 2304 | 2304 |
| 51 | 2 | 51 x 2 = 102 | 2601 | 5202 |
| 52 | 2 | 52 x 2 = 104 | 2704 | 5408 |
| 54 | 1 | 54 x 1 = 54 | 2916 | 2916 |
| 55 | 1 | 55 x 1 = 55 | 3025 | 3025 |
| 60 | 1 | 60 x 1 = 60 | 3600 | 3600 |
| Total | ∑f = 9 | ∑fa = 467 | ∑a² = 19086 | ∑fa² = 24391 |

Conclusion

Then I looked at the relationship between age and weight for males. All of the results (range, mean, modal class, standard deviation and variance) for Year 11 males were greater than for Year 7 males, even though the data for males in each year group contained an outlier. The overall conclusion I can draw is that the outlier for Year 7 females weakened my hypothesis, whereas the outliers for males had no effect.

2c. The third hypothesis I set up was that pupils' height would increase with age.

• To look for this relationship, I drew box-and-whisker plots of the heights (m) for each year group separately to see the distribution of the data. They showed that the Year 8 data was the most widely spread.
• Then I drew up a table showing the number in the sample, mean, mode, median, lower quartile, upper quartile, interquartile range, distribution and outliers.
• Median height (m) increased as year group increased, except for Year 10; this gave strength to my hypothesis.
• Modal heights for Year 7 and Year 10 were the same. Year 8 was bimodal, and the modes for Year 10 and Year 11 were larger than the others.
• Mean height increased as year group increased, except for Year 8; this could be due to the two modal heights for Year 8.
• The distribution for Year 7 was symmetrical, for Years 8, 9 and 10 it was positively skewed and for Year 11 it was negatively skewed. This affected the interquartile range.
• The interquartile range was affected by outliers. There were no outliers for Years 7, 8 and 10. For Year 9 there was one outlier (1.06 m) and for Year 11 there were three outliers (1.52 m, 1.79 m and 1.82 m); these outliers affected the distribution of the data.
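The quartile-based checks described in these points can be sketched in Python using the Year 9 heights from the sample table. The extract does not state which outlier rule or quartile convention was used, so the common 1.5 × IQR rule and the default method of `statistics.quantiles` are assumptions here; `box_plot_summary` is an illustrative name.

```python
from statistics import quantiles

# Year 9 heights (m) from the sample table, girls followed by boys.
year9_heights = [1.65, 1.60, 1.52, 1.66, 1.55, 1.48, 1.54, 1.59, 1.62, 1.60,
                 1.60, 1.53, 1.78, 1.06, 1.60, 1.48, 1.60, 1.58, 1.80, 1.75,
                 1.70, 1.58, 1.43, 1.65, 1.74, 1.81]


def box_plot_summary(values):
    """Quartiles, interquartile range and points beyond 1.5 x IQR of the quartiles."""
    q1, q2, q3 = quantiles(values, n=4)   # default 'exclusive' method (assumed)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sorted(v for v in values if v < low or v > high)
    return q1, q2, q3, iqr, outliers


q1, q2, q3, iqr, outliers = box_plot_summary(year9_heights)
print(outliers)  # [1.06]
```

Under these assumptions the only Year 9 height flagged is 1.06 m, which agrees with the outlier reported above.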

The overall result shows that I have not been completely successful in proving my hypotheses. The outliers and some anomalous values affected the distribution of the data. One weight value was missing, which also affected the results. A different sample would give different results, so the same results cannot be assumed for another sample.

This student written piece of work is one of many that can be found in our GCSE Height and Weight of Pupils and other Mayfield High School investigations section.
