# Statistical investigation.


Introduction

Mathematics GCSE

Mayfield High School

Year Group | Number of Boys | Number of Girls | Total |

7 | 151 | 131 | 282 |

8 | 145 | 125 | 270 |

9 | 118 | 143 | 261 |

10 | 106 | 94 | 200 |

11 | 84 | 86 | 170 |

The total number of students at the school is 1183.

I have been given data on all the students, covering a range of things such as hair colour, eye colour, number of brothers or sisters, and even favourite music and IQ. There are 27 different categories and therefore a total of 31941 data points.

1183 x 27 = 31941

What I first need to do is decide which line of enquiry to follow. There are several options, but I need to pick one that I think will show what I can do within the area of statistics. I have decided to compare these four things:

- Year Group
- Sex
- Height
- Weight

I feel that these four variables will enable me to carry out a statistical investigation. I aim to find out whether there is any relationship between them and whether there are differences by age and sex.

Collecting Data

The first step is to take a random sample of the data. Before I do this I need to decide how many students' data I want to analyse. I think 60 would be a good number, allowing me to have 30 boys and 30 girls. I now need to use stratified sampling to find out how many boys and girls I need from each year group.

Year 7: 282 / 1183 x 60 = 14.30262

Year 8: 270 / 1183 x 60 = 13.694

Year 9: 261 / 1183 x 60 = 13.23753

Year 10: 200 / 1183 x 60 = 10.1437

Year 11: 170 / 1183 x 60 = 8.622147

Total = 60

The reason I have decided to use a stratified sample is that stratified sampling represents each year group in proportion to its size. Selecting randomly within each year group makes the process fairer, because every student in a group has an equal chance of being selected. Because I can't have 13.694 pieces of data, I need to round each value to the nearest whole number. This gives 14, 14, 13, 10 and 9 students from Years 7 to 11, which still totals 60.
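The stratified calculation above can be sketched in a few lines of Python. This is only an illustration: the year-group totals come from the school table, while the idea that students in each year group are numbered from 1 to N (so that random numbers can pick them) is an assumption made for the example.

```python
import random

# Year-group totals taken from the school table.
year_totals = {7: 282, 8: 270, 9: 261, 10: 200, 11: 170}
SAMPLE_SIZE = 60

population = sum(year_totals.values())  # 1183 students

# Exact proportional share for each year group, then rounded to a whole student.
exact_shares = {year: n / population * SAMPLE_SIZE for year, n in year_totals.items()}
sample_sizes = {year: round(share) for year, share in exact_shares.items()}

for year in year_totals:
    print(f"Year {year}: {exact_shares[year]:.3f} -> {sample_sizes[year]}")
print("Total:", sum(sample_sizes.values()))

# Assumption: students in each year group are numbered 1..N, so a simple random
# sample of those numbers chooses the students for that stratum.
chosen = {
    year: sorted(random.sample(range(1, n + 1), sample_sizes[year]))
    for year, n in year_totals.items()
}
```

Rounding each share separately does not always add back up to the target; here it happens to give exactly 60, but in general the total should be checked.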


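Year Group | Gender | Height (m) | Weight (kg) | Number |

8 | Male | 1.20 | 36 | 23 |

8 | Male | 1.32 | 47 | 87 |

8 | Male | 1.69 | 59 | 39 |

8 | Male | 1.54 | 42 | 116 |

8 | Male | 1.50 | 52 | 86 |

8 | Male | 1.42 | 26 | 8 |

8 | Male | 1.77 | 54 | 1 |

8 | Male | 1.57 | 62 | 128 |

8 | Female | 1.62 | 49 | 10 |

8 | Female | 1.62 | 54 | 32 |

8 | Female | 1.75 | 64 | 53 |

8 | Female | 1.41 | 39 | 61 |

8 | Female | 1.66 | 72 | 59 |

8 | Female | 1.50 | 57 | 46 |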

Year Group | Gender | Height (m) | Weight (kg) | Number |

9 | Male | 1.54 | 42 | 26 |

9 | Male | 1.69 | 65 | 74 |

9 | Male | 1.56 | 60 | 63 |

9 | Male | 1.64 | 35 | 80 |

9 | Male | 1.61 | 45 | 30 |

9 | Male | 1.73 | 52 | 15 |

9 | Female | 1.45 | 51 | 59 |

9 | Female | 1.60 | 48 | 55 |

9 | Female | 1.56 | 50 | 139 |

9 | Female | 1.40 | 41 | 130 |

9 | Female | 1.47 | 56 | 141 |

9 | Female | 1.68 | 57 | 68 |

9 | Female | 1.62 | 55 | 39 |

Year Group | Gender | Height (m) | Weight (kg) | Number |

10 | Male | 1.74 | 80 | 88 |

10 | Male | 1.63 | 60 | 1 |

10 | Male | 1.57 | 64 | 84 |

10 | Male | 1.82 | 57 | 50 |

10 | Male | 1.60 | 47 | 31 |

10 | Female | 1.68 | 58 | 59 |

10 | Female | 1.60 | 66 | 78 |

10 | Female | 1.56 | 56 | 15 |

10 | Female | 1.40 | 45 | 5 |

10 | Female | 1.68 | 53 | 76 |

Year Group | Gender | Height (m) | Weight (kg) | Number |

11 | Male | 1.82 | 66 | 33 |

11 | Male | 1.65 | 50 | 73 |

11 | Male | 1.62 | 48 | 64 |

11 | Male | 1.78 | 67 | 20 |

11 | Female | 1.03 | 45 | 37 |

11 | Female | 1.65 | 52 | 61 |

11 | Female | 1.80 | 42 | 40 |

11 | Female | 1.69 | 51 | 14 |

11 | Female | 1.73 | 50 | 11 |

Now that I have 60 pieces of data which represent the population proportionally, I can begin my investigation. With this data I can draw several graphs to represent the heights and weights of the different ages or genders. Because there are so many things I can do with the data, I need a systematic approach so that I am not wasting time repeating calculations. Because height and weight are continuous data, I will construct histograms to represent them, and to do this I first need to group the data into frequency tables.
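Grouping continuous heights into classes can be sketched as follows. This is a minimal illustration using the six Year 9 boys from the sample table and a 10 cm class width; the function name `class_interval` is made up for the example.

```python
from collections import Counter

# Heights in metres of the Year 9 boys in the sample table.
heights_m = [1.54, 1.69, 1.56, 1.64, 1.61, 1.73]
CLASS_WIDTH = 10  # class width in cm


def class_interval(height_m, width=CLASS_WIDTH):
    """Return the (lower, upper) bounds in cm of the class containing height_m."""
    height_cm = round(height_m * 100)  # whole centimetres avoid float edge cases
    lower = height_cm // width * width
    return lower, lower + width


# Count how many heights fall into each class: lower <= h < upper.
frequencies = Counter(class_interval(h) for h in heights_m)

for (lower, upper), count in sorted(frequencies.items()):
    print(f"{lower} <= h < {upper} | {count}")
```

Each class includes its lower bound and excludes its upper bound, so a height of exactly 160 cm is counted in 160 ≤ h < 170 and never twice.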

I am now ready to begin recording my results in a table. To start with I will look at height. When I arranged the data in order of height, I was not surprised to find that the tallest person was in Year 11. However, I expected the shortest person to be in Year 7, so I was shocked to find that the shortest male in my data was in Year 8 and the shortest female was in Year 11. The girls' heights varied widely and showed less of a correlation with age.

Boys

Height, h (cm) | Tally | Frequency |

120 ≤ h < 130 | 1 | |

130 ≤ h < 140 | 2 | |

140 ≤ h < 150 | 3 | |

150 ≤ h < 160 | 7 | |

160 ≤ h < 170 | 11 | |

170 ≤ h < 180 | 4 | |

180 ≤ h < 190 | 2 | |

Girls

Height, h (cm) | Tally | Frequency |

100 ≤ h < 110 | 1 | |

110 ≤ h < 120 | 0 | |

120 ≤ h < 130 | 0 | |

130 ≤ h < 140 | 1 | |

140 ≤ h < 150 | 6 | |

150 ≤ h < 160 | 5 | |

160 ≤ h < 170 | 14 | |

170 ≤ |


After that I will calculate Spearman's rank correlation coefficient. I have decided to do this because I want to find out how strongly height and weight are correlated. I will calculate the coefficient for the boys, the girls and the different year groups. That way I will be able to compare the values and decide which year group or gender has the stronger correlation.
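As a rough sketch of the calculation, Spearman's coefficient can be found from the rank differences d using 1 − 6Σd² / (n(n² − 1)). The example below uses the six Year 9 boys from the sample table. It assumes there are no tied values, which is true for this small group, but the simple formula is not exact when ties occur.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes there are no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman's rank coefficient via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Year 9 boys from the sample table: heights in metres, weights in kilograms.
heights = [1.54, 1.69, 1.56, 1.64, 1.61, 1.73]
weights = [42, 65, 60, 35, 45, 52]

rho = spearman_rho(heights, weights)
print(f"Spearman's rank coefficient: {rho:.3f}")
```

A value near +1 would mean taller students are consistently heavier, a value near 0 means little relationship between the two rankings, and a value near −1 would mean the rankings are reversed.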

I will then evaluate all my results and comment on them where necessary. I need to show my understanding of my results by evaluating all of the outcomes.

I may also carry out research on body mass index (BMI). This would help me understand how the relationship between height and weight varies according to age.
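BMI is defined as weight in kilograms divided by the square of height in metres. As a small sketch, it can be worked out for a few rows copied from the Year 10 sample table; the choice of rows is arbitrary and only for illustration.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


# (year group, gender, height in m, weight in kg) copied from the Year 10 sample table.
rows = [
    (10, "Male", 1.74, 80),
    (10, "Male", 1.63, 60),
    (10, "Female", 1.68, 58),
]

for year, gender, height_m, weight_kg in rows:
    print(f"Year {year} {gender}: BMI = {bmi(weight_kg, height_m):.1f}")
```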

Standard Deviation

To begin with I have calculated the standard deviation for the whole population. The standard deviation is a statistic that tells you how tightly the values in a set of data are clustered around the mean. When the values are tightly bunched together and the bell-shaped curve is steep, the standard deviation is small. When the values are spread apart and the bell curve is relatively flat, the standard deviation is relatively large.

The formula for standard deviation is as follows:
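One standard form, and the one that appears to reproduce the Year 9 figure quoted in this section, is the sample standard deviation $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$, where $\bar{x}$ is the mean and $n$ is the number of values. As a minimal check, Python's standard-library `statistics.stdev` applies this formula to the 13 Year 9 heights from the sample table; the population form (dividing by $n$) is shown only for comparison.

```python
import statistics

# Heights in metres of the 13 Year 9 students in the sample table.
year9_heights = [1.54, 1.69, 1.56, 1.64, 1.61, 1.73,
                 1.45, 1.60, 1.56, 1.40, 1.47, 1.68, 1.62]

mean_height = statistics.mean(year9_heights)

# Sample standard deviation: squared deviations are divided by n - 1.
sample_sd = statistics.stdev(year9_heights)

# Population standard deviation for comparison: divides by n instead.
population_sd = statistics.pstdev(year9_heights)

print(f"mean = {mean_height:.4f} m")
print(f"sample sd = {sample_sd:.6f} m")
print(f"population sd = {population_sd:.6f} m")
```

The sample version is always slightly larger than the population version, and the difference shrinks as the number of values grows.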

Group | Standard deviation of height (m) |

Whole population | 0.150282 |

Boys | 0.148333 |

Girls | 0.15358 |

Year 7 | 0.11498686 |

Year 8 | 0.163164112 |

Year 9 | 0.097848653 |

Year 10 | 0.113705272 |

Year 11 | 0.240023147 |

[Figure: scatter graph for year sevens; axis label: Weight]

[Figure: scatter graph for year eights; axis label: Weight]

Equation of the line of best fit: y = 0.9158x + 107.43, so the gradient of the line is 0.9158.
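A line of best fit like this can be found by least squares. The sketch below assumes that x is weight in kilograms and y is height in centimetres, and that the line belongs to the Year 8 sample. Neither is stated outright, but a fit to the 14 Year 8 rows in the sample table gives the same equation to the quoted precision.

```python
# Year 8 sample: weights in kg (x) and heights in metres, copied from the sample table.
weights_kg = [36, 47, 59, 42, 52, 26, 54, 62, 49, 54, 64, 39, 72, 57]
heights_m = [1.20, 1.32, 1.69, 1.54, 1.50, 1.42, 1.77, 1.57,
             1.62, 1.62, 1.75, 1.41, 1.66, 1.50]
heights_cm = [round(h * 100) for h in heights_m]  # y values in whole centimetres

n = len(weights_kg)
mean_x = sum(weights_kg) / n
mean_y = sum(heights_cm) / n

# Least-squares gradient = S_xy / S_xx; the line passes through (mean_x, mean_y).
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weights_kg, heights_cm))
s_xx = sum((x - mean_x) ** 2 for x in weights_kg)

gradient = s_xy / s_xx
intercept = mean_y - gradient * mean_x

print(f"y = {gradient:.4f}x + {intercept:.2f}")
```

The gradient means that, within this group, each extra kilogram of weight goes with roughly 0.9 cm of extra height on average.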

Roy Vivasi 11PB

This student written piece of work is one of many that can be found in our GCSE Height and Weight of Pupils and other Mayfield High School investigations section.
