  • Level: GCSE
  • Subject: Maths
  • Word count: 7036

Data Handling Coursework


Introduction

Mamoona Mohsin

Data handling coursework

Introduction:

I have been provided with a data set covering 1183 pupils, recording height, weight, gender, year group, eye colour, age, hair colour, handedness (left or right), favourite music, favourite sport, favourite colour and more. I am going to consider weight, height, year group and gender. I will set up three hypotheses about the relationships between these factors and test them using different statistical techniques.

Sampling:

Stratified sampling:

I will take a 10% sample of the pupils using random stratified sampling.

Total pupils in school = 1183

10% of 1183 ≈ 118 pupils

Sample size for each year group = (pupils in the year / total pupils in school) x total sample size

Year group  Random sample size
7           (282/1183) x 118 ≈ 28
8           (270/1183) x 118 ≈ 27
9           (261/1183) x 118 ≈ 26
10          (200/1183) x 118 ≈ 20
11          (170/1183) x 118 ≈ 17
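The stratified calculation above can be checked with a short Python sketch. The year-group sizes are the ones used in the calculations; the function name `stratified_sizes` is only an illustrative choice and is not part of the coursework.

```python
# Pupils in each year group, taken from the calculations above.
year_sizes = {7: 282, 8: 270, 9: 261, 10: 200, 11: 170}

sample_total = 118  # 10% of 1183, rounded as in the text


def stratified_sizes(sizes, sample_total):
    """Return the rounded sample size for each stratum (year group)."""
    total = sum(sizes.values())
    return {year: round(count / total * sample_total)
            for year, count in sizes.items()}


sample_sizes = stratified_sizes(year_sizes, sample_total)
for year, size in sample_sizes.items():
    print(f"Year {year}: {size} pupils")
```

Rounding each stratum separately does not always add back up to the intended total; here the five rounded sizes happen to sum to exactly 118.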

Random and systematic sampling:

I chose a random starting point, which was the 8th pupil, and then used systematic sampling, taking every 10th pupil after that, to complete the sample of 118 pupils.
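A minimal sketch of the systematic step, assuming the pupils are simply numbered 1 to 1183 in one list. The text does not say how the systematic selection was combined with the year-group strata, so the single register used here is a hypothetical stand-in for the real records.

```python
def systematic_sample(pupils, start, step):
    """Take the pupil at 1-based position `start`, then every `step`-th pupil after it."""
    return pupils[start - 1::step]


# Hypothetical register: pupil numbers 1..1183 stand in for the real records.
register = list(range(1, 1184))

chosen = systematic_sample(register, start=8, step=10)
print(chosen[:5])   # → [8, 18, 28, 38, 48]
print(len(chosen))  # → 118
```

Starting at pupil 8 and stepping by 10 reaches pupil 1178 as the last selection, which gives 118 pupils from a list of 1183.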

Data:

One weight value is missing from the data, possibly because the person who collected the data did not weigh that pupil.

Year group  Gender    Height (m)  Weight (kg)
7           Female    1.60        42
7           Female    1.64        140
7           Female    1.53        40
7           Female    1.52        50
7           Female    1.64        47
7           Female    1.61        47
7           Female    1.66        45
7           Female    1.59        38
7           Female    1.73        49
7           Female    1.59        38
7           Female    1.51        50
7           Female    1.46        45
7           Female    1.63        51
7           Male      1.47        50
7           Male      1.48        44
7           Male      1.57        56
7           Male      1.55        53
7           Male      1.63        50
7           Male      1.57        48
7           Male      1.70        57
7           Male      1.55        32
7           Male      1.62        42
7           Male      1.69        57
7           Male      1.51        59
7           Male      1.54        43
7           Male      1.55        50
7           Male      1.49        38
7           Male      1.45        (missing)
8           Female    1.32        48
8           Female    1.72        50
8           Female    1.59        55
8           Female    1.61        48
8           Female    1.62        46
8           Female    1.56        54
8           Female    1.52        58
8           Female    1.57        45
8           Female    1.60        57
8           Female    1.59        44
8           Female    1.58        52
8           Female    1.59        50
8           Female    1.55        42
8           Male      1.72        57
8           Male      1.52        43
8           Male      1.48        40
8           Male      1.72        50
8           Male      1.50        50
8           Male      1.62        38
8           Male      1.48        26
8           Male      1.74        45
8           Male      1.77        54
8           Male      1.73        56
8           Male      1.55        43
8           Male      1.45        72
8           Male      1.52        45
8           Male      2.00        35
9           Female    1.65        48
9           Female    1.60        41
9           Female    1.52        45
9           Female    1.66        45
9           Female    1.55        36
9           Female    1.48        47
9           Female    1.54        54
9           Female    1.59        52
9           Female    1.62        42
9           Female    1.60        46
9           Female    1.60        46
9           Female    1.53        48
9           Female    1.78        59
9           Female    1.06        74
9           Male      1.60        40
9           Male      1.48        43
9           Male      1.60        49
9           Male      1.58        65
9           Male      1.80        64
9           Male      1.75        63
9           Male      1.70        42
9           Male      1.58        50
9           Male      1.43        60
9           Male      1.65        51
9           Male      1.74        61
9           Male      1.81        68
10          Female    1.61        54
10          Female    1.50        40
10          Female    1.55        50
10          Female    1.73        45
10          Female    1.47        45
10          Female    1.56        56
10          Female    1.51        36
10          Female    1.62        52
10          Female    1.60        50
10          Female    1.70        60
10          Male      1.54        76
10          Male      1.85        70
10          Male      1.75        45
10          Male      1.79        75
10          Male      1.57        49
10          Male      1.57        54
10          Male      1.70        59
10          Male      1.55        64
10          Male      1.68        64
10          Male      1.55        50
11          Female    1.68        54
11          Female    1.52        44
11          Female    1.67        52
11          Female    1.68        48
11          Female    1.65        52
11          Female    1.60        55
11          Female    1.62        51

...read more.

Middle

Weight (a) kg  Frequency (f)  f x a            a²      f x a²
26             1              26 x 1 = 26      676     676
32             1              32 x 1 = 32      1024    1024
35             1              35 x 1 = 35      1225    1225
38             2              38 x 2 = 76      1444    2888
40             2              40 x 2 = 80      1600    3200
42             2              42 x 2 = 84      1764    3528
43             4              43 x 4 = 172     1849    7396
44             1              44 x 1 = 44      1936    1936
45             3              45 x 3 = 135     2025    6075
48             1              48 x 1 = 48      2304    2304
49             2              49 x 2 = 98      2401    4802
50             8              50 x 8 = 400     2500    20000
51             1              51 x 1 = 51      2601    2601
53             1              53 x 1 = 53      2809    2809
54             2              54 x 2 = 108     2916    5832
56             2              56 x 2 = 112     3136    6272
57             3              57 x 3 = 171     3249    9747
58             1              58 x 1 = 58      3364    3364
59             2              59 x 2 = 118     3481    6962
60             2              60 x 2 = 120     3600    7200
61             1              61 x 1 = 61      3721    3721
63             2              63 x 2 = 126     3969    7938
64             3              64 x 3 = 192     4096    12288
65             1              65 x 1 = 65      4225    4225
66             1              66 x 1 = 66      4356    4356
68             2              68 x 2 = 136     4624    9248
70             1              70 x 1 = 70      4900    4900
72             2              72 x 2 = 144     5184    10368
75             1              75 x 1 = 75      5625    5625
76             1              76 x 1 = 76      5776    5776
92             1              92 x 1 = 92      8464    8464

Total: ∑f = 58, ∑fa = 3124, ∑a² = 100844, ∑fa² = 176750

Mean (ā) = ∑fa/∑f = 3124/58 = 53.86 kg

n = ∑f

Standard deviation (s):

$$ s = \sqrt{\frac{\sum f a^{2} - n\bar{a}^{2}}{n - 1}} = \sqrt{\frac{176750 - 58 \times 53.86^{2}}{58 - 1}} \approx 12.2 $$

Variance (s²) = 148.84

Lower end = 53.86 – 2 x 12.2 = 29.46

Upper end = 53.86 + 2 x 12.2 = 78.26

26 < 29.46 therefore 26 is an outlier.

92 > 78.26 therefore 92 is an outlier.
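The mean, standard deviation and the "mean ± 2 standard deviations" outlier check above can be reproduced from the frequency table with a short Python sketch. The frequencies are copied from the boys' table; the function names `summarise` and `outliers` are illustrative only.

```python
from math import sqrt

# Weight (kg) -> frequency, copied from the boys' frequency table above.
boys = {26: 1, 32: 1, 35: 1, 38: 2, 40: 2, 42: 2, 43: 4, 44: 1, 45: 3,
        48: 1, 49: 2, 50: 8, 51: 1, 53: 1, 54: 2, 56: 2, 57: 3, 58: 1,
        59: 2, 60: 2, 61: 1, 63: 2, 64: 3, 65: 1, 66: 1, 68: 2, 70: 1,
        72: 2, 75: 1, 76: 1, 92: 1}


def summarise(freq):
    """Return (n, mean, sample standard deviation) for a value -> frequency table."""
    n = sum(freq.values())
    sum_fa = sum(a * f for a, f in freq.items())
    sum_fa2 = sum(a * a * f for a, f in freq.items())
    mean = sum_fa / n
    variance = (sum_fa2 - n * mean ** 2) / (n - 1)
    return n, mean, sqrt(variance)


def outliers(freq, k=2):
    """Values lying more than k standard deviations from the mean."""
    _, mean, s = summarise(freq)
    low, high = mean - k * s, mean + k * s
    return sorted(a for a in freq if a < low or a > high)


n, mean, s = summarise(boys)
print(n, round(mean, 2), round(s, 1))
print(outliers(boys))
```

The same two functions can be applied to any of the other frequency tables in this section by swapping in a different dictionary.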

Girls:

Weight (a) kg  Frequency (f)  f x a            a²      f x a²
36             2              36 x 2 = 72      1296    2592
38             2              38 x 2 = 76      1444    2888
40             2              40 x 2 = 80      1600    3200
41             1              41 x 1 = 41      1681    1681
42             3              42 x 3 = 126     1764    5292
44             2              44 x 2 = 88      1936    3872
45             7              45 x 7 = 315     2025    14175
46             3              46 x 3 = 138     2116    6348
47             3              47 x 3 = 141     2209    6627
48             5              48 x 5 = 240     2304    11520
49             1              49 x 1 = 49      2401    2401
50             6              50 x 6 = 300     2500    15000
51             3              51 x 3 = 153     2601    7803
52             5              52 x 5 = 260     2704    13520
54             4              54 x 4 = 216     2916    11664
55             2              55 x 2 = 110     3025    6050
56             1              56 x 1 = 56      3136    3136
57             1              57 x 1 = 57      3249    3249
58             1              58 x 1 = 58      3364    3364
59             1              59 x 1 = 59      3481    3481
60             2              60 x 2 = 120     3600    7200
74             1              74 x 1 = 74      5476    5476
140            1              140 x 1 = 140    19600   19600

Total: ∑f = 59, ∑fa = 2969, ∑a² = 76428, ∑fa² = 160139

Mean (ā) = ∑fa/∑f = 2969/59 = 50.32 kg

Standard deviation (s):

$$ s = \sqrt{\frac{\sum f a^{2} - n\bar{a}^{2}}{n - 1}} = \sqrt{\frac{160139 - 59 \times 50.32^{2}}{59 - 1}} \approx 13.6 $$

Variance (s²) = 184.96

Lower end = 50.32 – 2 x 13.6 = 23.12

Upper end = 50.32 + 2 x 13.6 = 77.52

There is no outlier at the lower extreme, but 140 is an outlier at the upper extreme because 140 > 77.52.

The variance for girls is greater than for boys, which means the girls' weights are more varied. This could be due to the outlier at the upper extreme.

Year 7:

n = 28, but one male weight is missing from the data, so n = 27 for the weight analysis.

Male (n = 14): 32, 38, 42, 43, 44, 48, 50, 50, 50, 53, 56, 57, 57, 59

Female (n = 13): 38, 38, 40, 42, 45, 45, 47, 47, 49, 50, 50, 51, 140

Key: 30|2 = 32 kg

       Female | Weight (kg) | Male
          8 8 |     30      | 2 8
9 7 7 5 5 2 0 |     40      | 2 3 4 8
        1 0 0 |     50      | 0 0 0 3 6 7 7 9
              |      ¦      |
            0 |    140      |

Averages          Boys     Girls
Modal class (kg)  50 - 59  40 - 49
Median (kg)       50       47

The table above shows that both averages (modal class and median) for boys are greater than for girls; this supports my hypothesis that Year 7 boys are heavier than Year 7 girls.
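The median and modal class in the table can be checked against the Year 7 weight lists with a small Python sketch. The class width of 10 kg matches the classes used in the table; the helper name `modal_class` is an illustrative choice.

```python
from collections import Counter
from statistics import median

# Year 7 weights (kg) as listed above.
boys = [32, 38, 42, 43, 44, 48, 50, 50, 50, 53, 56, 57, 57, 59]
girls = [38, 38, 40, 42, 45, 45, 47, 47, 49, 50, 50, 51, 140]


def modal_class(weights, width=10):
    """Return (lower, upper) bounds of the class holding the most values."""
    counts = Counter(w // width * width for w in weights)
    lower = max(counts, key=counts.get)  # assumes a single fullest class
    return lower, lower + width - 1


print("Boys: ", modal_class(boys), median(boys))
print("Girls:", modal_class(girls), median(girls))
```

Both statistics are resistant to the single 140 kg value, which is why they suit this comparison better than the mean.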

Range (kg)

Boys: 59 – 32 = 27 kg

Girls: 140 – 38 = 102 kg

The range for girls is greater than for boys; the likely reason is the anomalous value of one girl weighing 140 kg.

Boys:

Weight (a) kg  Frequency (f)  f x a            a²      f x a²
32             1              32 x 1 = 32      1024    1024
38             1              38 x 1 = 38      1444    1444
42             1              42 x 1 = 42      1764    1764
43             1              43 x 1 = 43      1849    1849
44             1              44 x 1 = 44      1936    1936
48             1              48 x 1 = 48      2304    2304
50             3              50 x 3 = 150     2500    7500
53             1              53 x 1 = 53      2809    2809
56             1              56 x 1 = 56      3136    3136
57             2              57 x 2 = 114     3249    6498
59             1              59 x 1 = 59      3481    3481

Total: ∑f = 14, ∑fa = 679, ∑a² = 25496, ∑fa² = 33745

Mean (ā) = ∑fa/∑f = 679/14 = 48.50 kg

Standard deviation (s):

$$ s = \sqrt{\frac{\sum f a^{2} - n\bar{a}^{2}}{n - 1}} = \sqrt{\frac{33745 - 14 \times 48.50^{2}}{14 - 1}} \approx 7.91 $$

Variance (s²) = 62.57

Lower end = 48.50 – 2 x 7.91 = 32.68

Upper end = 48.50 + 2 x 7.91 = 64.32

32 < 32.68 therefore 32 is an outlier. There is no outlier at the upper extreme for year 7 boys.

Girls:

Weight (a) kg  Frequency (f)  f x a            a²      f x a²
38             2              38 x 2 = 76      1444    2888
40             1              40 x 1 = 40      1600    1600
42             1              42 x 1 = 42      1764    1764
45             2              45 x 2 = 90      2025    4050
47             2              47 x 2 = 94      2209    4418
49             1              49 x 1 = 49      2401    2401
50             2              50 x 2 = 100     2500    5000
51             1              51 x 1 = 51      2601    2601
140            1              140 x 1 = 140    19600   19600

Total: ∑f = 13, ∑fa = 682, ∑a² = 36144, ∑fa² = 44322

Mean (ā) = ∑fa/∑f = 682/13 = 52.46 kg

Standard deviation (s):

$$ s = \sqrt{\frac{\sum f a^{2} - n\bar{a}^{2}}{n - 1}} = \sqrt{\frac{44322 - 13 \times 52.46^{2}}{13 - 1}} \approx 26.68 $$

Variance (s²) = 711.8

Lower end = 52.46 – 2 x 26.68 = -0.9

Upper end = 52.46 + 2 x 26.68 = 105.82

There is no outlier at the lower extreme, but 140 is an outlier at the upper extreme because 140 > 105.82. This outlier affects the standard deviation, variance and mean.
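To illustrate how strongly the single 140 kg value affects the mean and standard deviation, the sketch below recomputes both with and without it. This recomputation is not part of the original coursework; it is only an illustration using the Year 7 girls' weights listed earlier.

```python
from statistics import mean, stdev

# Year 7 girls' weights (kg) as listed earlier in this section.
girls = [38, 38, 40, 42, 45, 45, 47, 47, 49, 50, 50, 51, 140]

# Same data with the 140 kg outlier left out (illustrative only).
trimmed = [w for w in girls if w != 140]

print(f"With outlier:    mean = {mean(girls):.2f}, s = {stdev(girls):.2f}")
print(f"Without outlier: mean = {mean(trimmed):.2f}, s = {stdev(trimmed):.2f}")
```

Without the 140 kg value the girls' mean falls to about 45.2 kg, below the Year 7 boys' mean of 48.50 kg, which is in line with the comparison of medians.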

Year 11:

n = 17

Boys (n = 8): 50, 58, 60, 63, 66, 68, 72, 92

Girls (n = 9): 44, 48, 51, 51, 52, 52, 54, 55, 60

Key: 50|8 = 58 kg

      Girls | Weight (kg) | Boys
        8 4 |     40      |
5 4 2 2 1 1 |     50      | 0 8
          0 |     60      | 0 3 6 8
            |     70      | 2
            |     80      |
            |     90      | 2
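A back-to-back stem-and-leaf diagram like the one above can be generated from the two weight lists. The sketch below assumes stems of width 10 kg, as in the key; the function name `back_to_back` and the column widths are illustrative choices.

```python
def back_to_back(left, right, width=10):
    """Build the rows of a back-to-back stem-and-leaf diagram as strings."""
    everything = left + right
    first = min(everything) // width * width
    last = max(everything) // width * width
    rows = []
    for stem in range(first, last + width, width):
        # Left-hand leaves read outwards from the stem, so sort them descending.
        left_leaves = sorted((v % width for v in left if v // width * width == stem),
                             reverse=True)
        right_leaves = sorted(v % width for v in right if v // width * width == stem)
        left_text = " ".join(map(str, left_leaves))
        right_text = " ".join(map(str, right_leaves))
        rows.append(f"{left_text:>12} | {stem:>3} | {right_text}")
    return rows


# Year 11 weights (kg) as listed above.
girls = [44, 48, 51, 51, 52, 52, 54, 55, 60]
boys = [50, 58, 60, 63, 66, 68, 72, 92]

rows = back_to_back(girls, boys)
print("\n".join(rows))
```

Empty stems such as 80 are still printed, which keeps the gap before the 92 kg value visible.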

Averages          Boys     Girls
Modal class (kg)  60 - 69  50 - 59
Median (kg)       64.5     52

The table above shows that both averages (modal class and median) for boys are greater than for girls; this supports my hypothesis that Year 11 boys are heavier than Year 11 girls.

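Range (kg)

Boys: 92 – 50 = 42 kg

Girls: 60 – 44 = 16 kg

The range for boys is greater than for girls, so the boys' weights are more spread out. The heaviest boys weigh well above the heaviest girls, which is consistent with my hypothesis.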

Boys:

Weight (a) kg  Frequency (f)  f x a            a²      f x a²
50             1              50 x 1 = 50      2500    2500
58             1              58 x 1 = 58      3364    3364
60             1              60 x 1 = 60      3600    3600
63             1              63 x 1 = 63      3969    3969
66             1              66 x 1 = 66      4356    4356
68             1              68 x 1 = 68      4624    4624
72             1              72 x 1 = 72      5184    5184
92             1              92 x 1 = 92      8464    8464

Total: ∑f = 8, ∑fa = 529, ∑a² = 36061, ∑fa² = 36061

Mean (ā) = ∑fa/∑f = 529/8 = 66.13 kg

Standard deviation (s):

$$ s = \sqrt{\frac{\sum f a^{2} - n\bar{a}^{2}}{n - 1}} = \sqrt{\frac{36061 - 8 \times 66.13^{2}}{8 - 1}} \approx 12.43 $$

Variance (s²) = 154.41

Lower end = 66.13 – 2 x 12.43 = 41.27

Upper end = 66.13 + 2 x 12.43 = 90.99

There is no outlier at lower extreme for year 11 boys, but 92 is an outlier at upper extreme.

Girls:

Weight (a) kg  Frequency (f)  f x a            a²      f x a²
44             1              44 x 1 = 44      1936    1936
48             1              48 x 1 = 48      2304    2304
51             2              51 x 2 = 102     2601    5202
52             2              52 x 2 = 104     2704    5408
54             1              54 x 1 = 54      2916    2916
55             1              55 x 1 = 55      3025    3025
60             1              60 x 1 = 60      3600    3600

Total: ∑f = 9, ∑fa = 467, ∑a² = 19086, ∑fa² = 24391

...read more.

Conclusion

Then I compared the weights of males of different ages. All of the results, such as range, mean, modal class, standard deviation and variance, were greater for Year 11 males than for Year 7 males, even though each of these year groups contained an outlier among the males. The overall conclusion I can draw is that the outlier for Year 7 females weakened my hypothesis, whereas the outliers for males had no effect.

2c. The third hypothesis I set up was that the pupils’ height would increase with their increasing age.

  • To see the relationship, I drew box and whisker plots of the heights (m) for each year group separately to show the distribution of the data. They showed that the data for Year 8 was widely spread.
  • Then I drew a table showing the number in the sample, mean, mode, median, lower quartile, upper quartile, interquartile range, distribution and outliers.
  • Median height (m) increased as year group increased, except for Year 10; this gave strength to my hypothesis.
  • The modal heights for Year 7 and Year 10 were the same. Year 8 was bimodal, and the modes for Year 10 and Year 11 were larger than the others.
  • Mean height increased as year group increased, except for Year 8; this could be due to the two modal heights for Year 8.
  • The distribution for Year 7 was symmetrical, for Years 8, 9 and 10 it was positively skewed, and for Year 11 it was negatively skewed. This affected the interquartile range.
  • The interquartile range was affected by outliers. There were no outliers for Years 7, 8 and 10. For Year 9 there was one outlier, 1.06 m, and for Year 11 there were three outliers, 1.52 m, 1.79 m and 1.82 m; these outliers affected the distribution of the data.
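The quartiles and outlier check behind a box and whisker plot can be sketched in Python for the Year 9 heights from the data table. The coursework does not state which quartile method or outlier rule was used for the box plots, so the 1.5 x IQR rule and the default method of `statistics.quantiles` are assumptions made only for this illustration.

```python
from statistics import quantiles

# Year 9 heights (m) from the data table: 14 females followed by 12 males.
year9_heights = [1.65, 1.60, 1.52, 1.66, 1.55, 1.48, 1.54, 1.59, 1.62, 1.60,
                 1.60, 1.53, 1.78, 1.06,
                 1.60, 1.48, 1.60, 1.58, 1.80, 1.75, 1.70, 1.58, 1.43, 1.65,
                 1.74, 1.81]


def box_plot_summary(values, k=1.5):
    """Quartiles, IQR and outliers using the (assumed) k x IQR fence rule."""
    lq, med, uq = quantiles(values, n=4)  # default 'exclusive' method
    iqr = uq - lq
    low, high = lq - k * iqr, uq + k * iqr
    return {"lq": lq, "median": med, "uq": uq, "iqr": iqr,
            "outliers": sorted(v for v in values if v < low or v > high)}


summary = box_plot_summary(year9_heights)
print(summary)
```

Under these assumptions the only Year 9 value flagged is 1.06 m, which agrees with the single outlier reported for that year group.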

The overall result shows that I have not been completely successful in proving my hypotheses. The outliers and some anomalous values affected the distribution of the data. One weight value was missing, which also affected the results. A different sample would give different results, so the same results cannot be assumed for another sample.

...read more.

This student written piece of work is one of many that can be found in our GCSE Height and Weight of Pupils and other Mayfield High School investigations section.
