Introduction

Mayfield High School

I am investigating the pupils of Mayfield High School. It is a fictitious school, although the data is based on that of a real school. The line of enquiry I have decided to follow is the relationship between height and weight of the pupils.

The following table shows the numbers of pupils in the school:

Year Group | Boys | Girls | Total |

7 | 151 | 131 | 282 |

8 | 145 | 125 | 270 |

9 | 118 | 143 | 261 |

10 | 106 | 94 | 200 |

11 | 84 | 86 | 170 |

Total | 604 | 579 | 1183 |

Using this information, I have chosen a sample size of 30. It is large enough to give a fair representation of the population, and it divides exactly into 360, which is convenient if I need to draw any pie charts.

To begin with this line of enquiry, I shall take a random sample of 30 boys and 30 girls from the whole school register, recording their heights and weights. In order to do this I will allocate each student a number, generate random numbers using my calculator, and take the data of the corresponding student.
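The numbering-and-random-number procedure described above can be sketched in Python. This is only an illustration: the register is a hypothetical list of placeholder names, the figure of 604 boys is taken from the table of pupil numbers, and `random.sample` stands in for the calculator's random number function.

```python
import random

# Hypothetical register: 604 boys numbered 1 to 604 (placeholder names only).
boys_register = {number: f"Boy {number}" for number in range(1, 605)}

def draw_sample(register, size, seed=None):
    """Pick `size` distinct pupil numbers at random and return their records."""
    rng = random.Random(seed)
    chosen_numbers = rng.sample(sorted(register), size)
    return [register[number] for number in chosen_numbers]

# The seed is fixed only so that the illustration is repeatable.
sample_of_boys = draw_sample(boys_register, 30, seed=1)
print(len(sample_of_boys))
```

Because `random.sample` draws without replacement, no pupil can be chosen twice. With a calculator, a repeated number would simply be ignored and another one generated.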

Boys: Height (cm) | Boys: Weight (kg) | Girls: Height (cm) | Girls: Weight (kg) |

162 | 48 | 132 | 35 |

141 | 45 | 130 | 36 |

153 | 40 | 173 | 51 |

146 | 53 | 150 | 40 |

147 | 47 | 159 | 38 |

147 | 45 | 142 | 29 |

158 | 48 | 152 | 33 |

165 | 50 | 159 | 52 |

154 | 40 | 166 | 50 |

173 | 59 | 149 | 47 |

164 | 42 | 157 | 45 |

160 | 41 | 171 | 40 |

155 | 68 | 163 | 47 |

154 | 48 | 155 | 66 |

132 | 48 | 160 | 60 |

152 | 38 | 165 | 45 |

155 | 74 | 161 | 38 |

172 | 42 | 169 | 48 |

170 | 50 | 162 | 54 |

170 | 57 | 151 | 39 |

157 | 64 | 154 | 68 |

168 | 64 | 157 | 40 |

152 | 45 | 153 | 65 |

162 | 52 | 190 | 40 |

169 | 65 | 174 | 47 |

180 | 68 | 179 | 45 |

168 | 58 | 163 | 48 |

152 | 38 | 133 | 55 |

152 | 45 | 178 | 55 |

170 | 72 | 159 | 48 |

In doing this I encountered a few extreme values that I had to discard, because they appear to be mistakes made when filling in the forms or entering the data into the database. For example, a lower-school girl had a recorded weight of 140kg, which in my opinion was not feasible, so I removed it from the sample and took another student's data instead.

Here are the frequency tables for the above data, separated by gender. As the data is continuous, I have grouped it into class intervals.

Boys

Height, h (cm) | Tally | Frequency |

130 < h < 140 | ¦ | 1 |

140 < h < 150 | ¦¦¦¦ | 4 |

150 < h < 160 | ¦¦¦¦ ¦¦¦¦ ¦ | 11 |

160 < h < 170 | ¦¦¦¦ ¦¦¦ | 8 |

170 < h < 180 | ¦¦¦¦ | 5 |

180 < h < 190 | ¦ | 1 |

190 < h < 200 | | 0 |

Weight, w (kg) | Tally | Frequency |

20 < w < 30 | | 0 |

30 < w < 40 | ¦¦ | 2 |

40 < w < 50 | ¦¦¦¦ ¦¦¦¦ ¦¦¦¦ | 14 |

50 < w < 60 | ¦¦¦¦ ¦¦ | 7 |

60 < w < 70 | ¦¦¦¦ | 5 |

70 < w < 80 | ¦¦ | 2 |

Girls

Height, h (cm) | Tally | Frequency |

130 < h < 140 | ¦¦¦ | 3 |

140 < h < 150 | ¦¦ | 2 |

150 < h < 160 | ¦¦¦¦ ¦¦¦¦ ¦ | 11 |

160 < h < 170 | ¦¦¦¦ ¦¦¦ | 8 |

170 < h < 180 | ¦¦¦¦ | 5 |

180 < h < 190 | | 0 |

190 < h < 200 | ¦ | 1 |

Weight, w (kg) | Tally | Frequency |

20 < w < 30 | ¦ | 1 |

30 < w < 40 | ¦¦¦¦ ¦ | 6 |

40 < w < 50 | ¦¦¦¦ ¦¦¦¦ ¦¦¦ | 13 |

50 < w < 60 | ¦¦¦¦ ¦ | 6 |

60 < w < 70 | ¦¦¦¦ | 4 |

70 < w < 80 | | 0 |
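The grouping step can be checked with a short Python sketch. It uses the boys' heights from the sample table and assumes that each class includes its lower boundary (so 160 cm is counted in the 160 to 170 class), which is the convention that reproduces the frequencies above. The function name is illustrative only.

```python
# Boys' heights (cm) copied from the sample table.
boys_heights = [
    162, 141, 153, 146, 147, 147, 158, 165, 154, 173,
    164, 160, 155, 154, 132, 152, 155, 172, 170, 170,
    157, 168, 152, 162, 169, 180, 168, 152, 152, 170,
]

def grouped_frequency(values, start, stop, width):
    """Count the values falling in each class lower <= v < lower + width."""
    table = {}
    for lower in range(start, stop, width):
        upper = lower + width
        table[(lower, upper)] = sum(lower <= v < upper for v in values)
    return table

boys_height_table = grouped_frequency(boys_heights, 130, 200, 10)
for (lower, upper), frequency in boys_height_table.items():
    print(f"{lower} <= h < {upper}: {frequency}")
```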

Firstly, I shall consider the trends in height. To do this, I will record the data in a histogram because it is continuous.

In order to draw the histogram I must calculate the frequency density of the bars. This is done using: Frequency density = frequency ÷ class width

Boys

Height (cm) | Frequency | Frequency density |

130 < h < 140 | 1 | 0.1 |

140 < h < 150 | 4 | 0.4 |

150 < h < 160 | 11 | 1.1 |

160 < h < 170 | 8 | 0.8 |

170 < h < 180 | 5 | 0.5 |

180 < h < 190 | 1 | 0.1 |

190 < h < 200 | 0 | 0 |

Girls

Height (cm) | Frequency | Frequency density |

130 < h < 140 | 3 | 0.3 |

140 < h < 150 | 2 | 0.2 |

150 < h < 160 | 11 | 1.1 |

160 < h < 170 | 8 | 0.8 |

170 < h < 180 | 5 | 0.5 |

180 < h < 190 | 0 | 0 |

190 < h < 200 | 1 | 0.1 |
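As a quick check of the frequency density calculation, here is a minimal Python sketch using the boys' height frequencies. Every class here is 10 cm wide, so each density is simply the frequency divided by 10; the helper name is an illustration, not part of the original working.

```python
# Boys' height classes (cm) and their frequencies, from the table above.
boys_height_frequencies = {
    (130, 140): 1,
    (140, 150): 4,
    (150, 160): 11,
    (160, 170): 8,
    (170, 180): 5,
    (180, 190): 1,
    (190, 200): 0,
}

def frequency_density(frequency, lower, upper):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)

boys_height_densities = {
    interval: frequency_density(frequency, *interval)
    for interval, frequency in boys_height_frequencies.items()
}

for (lower, upper), density in boys_height_densities.items():
    print(f"{lower}-{upper} cm: {density}")
```

Frequency density matters most when class widths differ. With equal widths, as here, the histogram has the same shape as a plain frequency chart.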

Now I am able to draw the histograms of girls’ and boys’ heights.

The histograms show that the heights of boys and girls are very similar. They show a small dispersion of results with little variation for the boys, although there are some outlying values for the girls (for example the girl who is over 190cm tall).

In order to make a further comparison between heights of boys and girls, I will use the histograms to draw frequency polygons.

The frequency polygons show that there are fewer boys with heights below 140cm and above 190cm than there are girls, but more who are between 140 and 150cm and 180 and 190cm.

To continue with the line of enquiry, I will sort the data into stem and leaf diagrams as it is grouped, and calculate the averages. This will enable me to compare the heights of the different genders further.

Boys: Frequency | Boys: Leaf | Stem | Girls: Leaf | Girls: Frequency |

1 | 2 | 13 | 0,2,3 | 3 |

4 | 7,7,6,1 | 14 | 2,9 | 2 |

11 | 8,7,5,5,4,4,3,2,2,2,2 | 15 | 0,1,2,3,4,5,7,7,9,9,9 | 11 |

8 | 9,8,8,5,4,2,2,0 | 16 | 0,1,2,3,3,5,6,9 | 8 |

5 | 3,2,0,0,0 | 17 | 1,3,4,8,9 | 5 |

1 | 0 | 18 | | 0 |

0 | | 19 | 0 | 1 |

Key: 13/2 = 132 cm

These are the average results for height:

Heights (cm) | Mean | Modal Class Interval | Median | Range |

Boys | 159 | 150 < h < 160 | 158 | 48 |

Girls | 159 | 150 < h < 160 | 159 | 60 |

Two of the three measures of average were the same for boys and girls, although the median height was slightly lower for boys (158 cm compared to 159cm). The data for boys showed tighter dispersion, with a spread less than that of the girls (the range for boys was 48cm compared to 60cm for the girls).
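The averages and ranges for height can be reproduced from the raw sample with Python's standard `statistics` module. This is a sketch for checking the table rather than the original method. Note that with 30 values the median is the mean of the 15th and 16th values, which for the boys gives 157.5 cm, rounded to 158 cm in the table.

```python
import statistics

# Heights (cm) copied from the sample table.
boys_heights = [
    162, 141, 153, 146, 147, 147, 158, 165, 154, 173,
    164, 160, 155, 154, 132, 152, 155, 172, 170, 170,
    157, 168, 152, 162, 169, 180, 168, 152, 152, 170,
]
girls_heights = [
    132, 130, 173, 150, 159, 142, 152, 159, 166, 149,
    157, 171, 163, 155, 160, 165, 161, 169, 162, 151,
    154, 157, 153, 190, 174, 179, 163, 133, 178, 159,
]

def summarise(values):
    """Return the mean, median and range of a list of measurements."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }

boys_summary = summarise(boys_heights)
girls_summary = summarise(girls_heights)
print(boys_summary)
print(girls_summary)
```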

The evidence from the sample suggests that 11/30, or 37% of both boys and girls have a height of between 150 and 160cm.

Now I shall investigate the weights of the sample, following the same process.

To draw out the histograms of weights, I must again calculate the frequency density.

Boys

Weight (kg) | Frequency | Frequency Density |

20 < w < 30 | 0 | 0 |

30 < w < 40 | 2 | 0.2 |

40 < w < 50 | 14 | 1.4 |

50 < w < 60 | 7 | 0.7 |

60 < w < 70 | 5 | 0.5 |

70 < w < 80 | 2 | 0.2 |

Girls

Weight (kg) | Frequency | Frequency Density |

20 < w < 30 | 1 | 0.1 |

30 < w < 40 | 6 | 0.6 |

40 < w < 50 | 13 | 1.3 |

50 < w < 60 | 6 | 0.6 |

60 < w < 70 | 4 | 0.4 |

70 < w < 80 | 0 | 0 |

Middle

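Boys: Frequency | Boys: Leaf | Stem | Girls: Leaf | Girls: Frequency |

… | … | 4 | 0,0,0,0,5,5,5,5,7,7,7,8,8 | 13 |

7 | 9,8,7,3,2,0,0 | 5 | 0,1,2,4,5,5 | 6 |

5 | 8,8,5,4,4 | 6 | 0,5,6,8 | 4 |

2 | 4,2 | 7 | | 0 |

Key: 2/9 = 29 kg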

Weights (kg) | Mean | Modal Class Interval | Median | Range |

Boys | 48 | 40 < w < 50 | 49 | 36 |

Girls | 47 | 40 < w < 50 | 46 | 39 |

The mean and median were both higher for boys than for girls, while the modal class interval was the same. The data for boys was less widely spread, with a range of 36kg compared to 39kg for the girls. The evidence from the sample suggests that 14/30 or 47% of boys and 13/30 or 43% of girls have a weight between 40 and 50kg.

These conclusions for both height and weight have been taken using a sample of only 30 boys and 30 girls. To confirm that these results are accurate and true of the entire population, I would need to either enlarge the sample size or repeat the whole procedure using a different sample.

Following this line of enquiry, I have made this hypothesis:

In general, the taller a person is, the more they will weigh.

In order to test this hypothesis, I need to take a new sample of 30 students of either gender.

Height (cm) | 145 | 154 | 163 | 160 | 160 | 160 | 159 | 156 | 154 | 165 | 165 | 164 | 172 | 165 | 149 |

Weight (kg) | 52 | 40 | 60 | 50 | 46 | 51 | 52 | 74 | 52 | 56 | 59 | 42 | 46 | 44 | 37 |

Height (cm) | 165 | 170 | 106 | 157 | 180 | 175 | 179 | 162 | 163 | 172 | 160 | 165 | 167 | 177 | 162 |

Weight (kg) | 72 | 52 | 74 | 36 | 42 | 57 | 45 | 72 | 45 | 51 | 55 | 48 | 66 | 57 | 56 |

These values will be plotted on a scatter diagram so that I can identify a correlation and find the relationship between height and weight.

The scatter diagram shows a moderate positive correlation between weight and height, suggesting that the taller a person is, the heavier they are. The line of best fit suggests that a person who is 1.80m tall will weigh 74kg.
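The line of best fit in this investigation was drawn on the scatter diagram. One systematic alternative, sketched below in Python, is a least-squares line together with the product-moment correlation coefficient. This is a different technique from the one used above, so its results need not match the values read from the diagram. The data values are copied exactly as printed in the table, and the function name is illustrative.

```python
from math import sqrt

# Second sample of 30 students, copied as printed in the table above.
heights_cm = [145, 154, 163, 160, 160, 160, 159, 156, 154, 165,
              165, 164, 172, 165, 149, 165, 170, 106, 157, 180,
              175, 179, 162, 163, 172, 160, 165, 167, 177, 162]
weights_kg = [52, 40, 60, 50, 46, 51, 52, 74, 52, 56,
              59, 42, 46, 44, 37, 72, 52, 74, 36, 42,
              57, 45, 72, 45, 51, 55, 48, 66, 57, 56]

def least_squares(xs, ys):
    """Return (gradient, intercept, r) for the least-squares line y = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    r = sxy / sqrt(sxx * syy)  # correlation coefficient, between -1 and 1
    return gradient, intercept, r

gradient, intercept, r = least_squares(heights_cm, weights_kg)
print(f"weight = {gradient:.3f} * height + {intercept:.1f}, r = {r:.2f}")
```

A single extreme point can pull a least-squares line a long way, so any value that looks like a recording error is worth checking before relying on the result.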

Earlier in the investigation I found evidence to suggest that weight, and perhaps height, are affected by gender. I shall now investigate how gender affects the correlation between weight and height. I predict that:

Correlation between height and weight will improve if the genders are considered in isolation.

I will use the random sample of 30 boys and 30 girls taken at the start of the investigation to test this hypothesis, and plot this on 3 different scatter diagrams, showing the genders individually and the sample as a whole.

The evidence in the scatter diagrams supports my hypothesis that correlation between height and weight is stronger if boys and girls are studied individually.

The lines of best fit on the diagrams show that a boy who was 1.80m tall would weigh 70kg, whereas a girl of the same height would weigh 73kg.

The equations of the lines of best fit would enable me to calculate predictions for height or weight.

Finding the equations of the lines requires calculating the gradient of the line, and the point at which it crosses the y-axis.

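In each equation, y is the height in metres and x is the weight in kilograms.

Boys only: $y = \frac{0.1}{10}x + 0.9$, so $y = 0.01x + 0.9$

Girls only: $y = \frac{0.5}{7}x - 2.1$, so $y = \frac{5}{70}x - 2.1$

Mixed population: $y = \frac{0.15}{17}x + 1.2$, so $y = \frac{15}{1700}x + 1.2$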

Using the equation, I can predict that a girl 1.50m tall would weigh 50kg.

$y = \frac{5}{70}x - 2.1$

$x = \frac{70(y + 2.1)}{5}$

$x = \frac{70(1.50 + 2.1)}{5}$

$x = 50.4 \approx 50$ kg
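The rearrangement can be checked numerically. The sketch below codes the girls' line in both directions, taking height in metres and weight in kilograms as in the working above. The function names are illustrative only.

```python
def girl_height_from_weight(weight_kg):
    """Girls' line of best fit: y = (5/70) x - 2.1, with y in metres."""
    return 5 / 70 * weight_kg - 2.1

def girl_weight_from_height(height_m):
    """The same line rearranged for x: x = 70 (y + 2.1) / 5."""
    return 70 * (height_m + 2.1) / 5

predicted_weight = girl_weight_from_height(1.50)
print(round(predicted_weight, 1))  # 50.4

# Feeding the prediction back into the original equation returns the height.
print(round(girl_height_from_weight(predicted_weight), 2))  # 1.5
```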

The line of best fit is an estimate of the relationship between height and weight, based only on the sample of data.

There are anomalous values, for example the girl who is 1.90m tall and weighs 40kg, who does not follow the relationship.

Cumulative frequency is very useful when comparing sets of continuous data. I will use it in cumulative frequency curves to show data trends.

The following tables show the cumulative frequency for height and weight for boys, girls and the mixed population.

Heights (cm) | Boys (cumulative frequency) | Girls (cumulative frequency) | Mixed Population (cumulative frequency) |

< 140 | 1 | 3 | 4 |

< 150 | 5 | 5 | 10 |

< 160 | 16 | 16 | 32 |

< 170 | 24 | 24 | 48 |

< 180 | 29 | 29 | 58 |

< 190 | 30 | 29 | 59 |

< 200 | 30 | 30 | 60 |

Weights (kg) | Boys (cumulative frequency) | Girls (cumulative frequency) | Mixed Population (cumulative frequency) |

< 30 | 0 | 1 | 1 |

< 40 | 2 | 7 | 9 |

< 50 | 16 | 20 | 36 |

< 60 | 23 | 26 | 49 |

< 70 | 28 | 30 | 58 |

< 80 | 30 | 30 | 60 |
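Cumulative frequencies are running totals of the grouped frequencies. As a check on the height columns, this short Python sketch builds them from the frequency tables with `itertools.accumulate`. The variable names are illustrative.

```python
from itertools import accumulate

# Height frequencies for the classes 130-140, 140-150, ..., 190-200 cm.
boys_height_freq = [1, 4, 11, 8, 5, 1, 0]
girls_height_freq = [3, 2, 11, 8, 5, 0, 1]

boys_height_cf = list(accumulate(boys_height_freq))
girls_height_cf = list(accumulate(girls_height_freq))

# The mixed population is the two groups added together.
mixed_height_cf = [b + g for b, g in zip(boys_height_cf, girls_height_cf)]

print(boys_height_cf)
print(girls_height_cf)
print(mixed_height_cf)
```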

The curves will be drawn on the same axes to make comparing them easier.

The curves have enabled me to read off easily and accurately the median, upper and lower quartiles and the interquartile range. These are shown for both height and weight in the following tables.

Heights (cm) | Median | Lower Quartile | Upper Quartile | Interquartile Range |

Mixed | 160 | 154 | 167 | 13 |

Boys | 159 | 153 | 168 | 15 |

Girls | 159 | 153 | 168 | 15 |

Weights (kg) | Median | Lower Quartile | Upper Quartile | Interquartile Range |

Mixed | 47 | 42 | 57 | 15 |

Boys | 49 | 45 | 59 | 14 |

Girls | 45 | 41 | 54 | 13 |
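The quartiles above were read from hand-drawn curves. A rough numerical equivalent, sketched here, joins the cumulative frequency points with straight lines and interpolates between them. This is an approximation to the smooth curve, so its readings can differ slightly from the table. The function name and the starting point (130 cm, cumulative frequency 0) are assumptions made for the illustration.

```python
def value_at(points, target):
    """Estimate the measurement at a given cumulative frequency.

    `points` is a list of (upper boundary, cumulative frequency) pairs in
    increasing order; straight lines are assumed between neighbouring points.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the cumulative frequency range")

# Boys' heights: (upper class boundary in cm, cumulative frequency).
boys_height_points = [(130, 0), (140, 1), (150, 5), (160, 16),
                      (170, 24), (180, 29), (190, 30)]
n = 30

median = value_at(boys_height_points, n / 2)
lower_quartile = value_at(boys_height_points, n / 4)
upper_quartile = value_at(boys_height_points, 3 * n / 4)
interquartile_range = upper_quartile - lower_quartile

print(round(median, 1), round(lower_quartile, 1), round(upper_quartile, 1))
```

For the boys' heights this gives a median of about 159 cm and an upper quartile of about 168 cm, in line with the values read from the curve.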

Conclusion

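… | … | Dispersions from the line | Mean of Vertical Dispersions |

35 | 135 | 21 | 10.8 cm |

37 | 138 | 14 | |

38 | 140 | 4 | |

40 | 143 | 21, 5, 37 | |

41 | 145 | 2 | |

42 | 147 | 3, 13 | |

43 | 148 | 6 | |

44 | 150 | 12 | |

45 | 152 | 13 | |

46 | 154 | 2 | |

47 | 155 | 8 | |

48 | 157 | 3, 10, 23 | |

49 | 159 | 11 | |

50 | 160 | 5 | |

52 | 164 | 3, 18 | |

54 | 167 | 9 | |

55 | 169 | 1 | |

57 | 172 | 2 | |

59 | 175 | 19 | |

65 | 186 | 36 | |

67 | 189 | 34 | |

68 | 191 | 11 | |

72 | 197 | 17, 16 | |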

Comparing the mean of vertical dispersions for the boys in Year 7 with that for all the boys in the stratified sample, the evidence suggests that considering the year groups in isolation gives a stronger correlation between height and weight. The mean of vertical dispersions for the Year 7 boys was 8cm. The mean of vertical dispersions for all of the boys was 12.6cm, more than 1.5 times that of the Year 7 boys.
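The mean of vertical dispersions is the average vertical distance of the points from the line of best fit. As a sketch of the arithmetic, the Python below averages the thirty dispersions that survive in the table above; it assumes they are all measured in centimetres. Their mean rounds to 12.6 cm, the figure quoted for all of the boys.

```python
# Vertical dispersions (cm) listed in the table above, in order.
dispersions_cm = [
    21, 14, 4, 21, 5, 37, 2, 3, 13, 6,
    12, 13, 2, 8, 3, 10, 23, 11, 5, 3,
    18, 9, 1, 2, 19, 36, 34, 11, 17, 16,
]

def mean_dispersion(distances):
    """Average distance of the points from the line of best fit."""
    return sum(distances) / len(distances)

mean_all = mean_dispersion(dispersions_cm)
print(round(mean_all, 1))  # 12.6

# Ratio to the 8 cm reported for the Year 7 boys.
print(round(mean_all / 8, 2))  # 1.58
```

A smaller mean dispersion means the points lie closer to the line, which is why it can be used here to compare the strength of the correlation.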

Final Summary

These are the final conclusions I have made from this investigation after extending the line of enquiry and refining my hypotheses.

- A sample of 30 students stratified over age and gender shows that the mean height is 161 cm for both boys and girls. However, the range of heights was considerably greater for boys than for girls, which suggests that there would be many boys who are shorter than the girls.
- A 10% sample of the boys in Year 7 suggests that this age and gender group has a mean height of 154 cm, with a mean deviation about the mean of 5 cm, excluding exceptional values. Comparing this to the stratified sample for the whole male sample, which has a mean height of 161cm and a mean deviation of 12.6 cm, the evidence suggests that both age and gender affect the strength of the correlation, and therefore the accuracy of the approximation of the relationship between height and weight.

In taking a stratified sample, I eliminated the bias of age, where the proportion of boys to girls and the different ages was not reflected in the original sample. By keeping the sample in the same ratio as the numbers in each age and gender group, I have reduced the possibility of one category being represented more than another and therefore affecting the results. The consequence has been a fairer representation of the school's population, which in theory will have contributed to the increased accuracy and reliability of the results and conclusions that I have drawn.
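The idea of keeping the sample in the same ratio as the population can be sketched as a proportional allocation. The Python below uses the pupil numbers from the table at the start of the investigation and a sample size of 30. The actual allocation used for the stratified sample is not shown in this extract, so the function and its output are only an illustration of the principle.

```python
# Pupil numbers by year group and gender, from the first table.
population = {
    (7, "boys"): 151, (7, "girls"): 131,
    (8, "boys"): 145, (8, "girls"): 125,
    (9, "boys"): 118, (9, "girls"): 143,
    (10, "boys"): 106, (10, "girls"): 94,
    (11, "boys"): 84, (11, "girls"): 86,
}

def proportional_allocation(strata, sample_size):
    """Share the sample between the groups in proportion to their size."""
    total = sum(strata.values())
    return {
        group: round(sample_size * count / total)
        for group, count in strata.items()
    }

allocation = proportional_allocation(population, 30)
for (year, gender), size in allocation.items():
    print(f"Year {year} {gender}: {size}")
print(sum(allocation.values()))
```

Rounding each share separately does not always add up to the intended sample size. With these figures the rounded shares happen to total 30, but in general one or two groups may need adjusting by hand.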
