# Edexcel GCSE Statistics Coursework


Introduction

Edexcel GCSE Statistics 1389

PLANNING SHEET – MAYFIELD HIGH

Student Name: Anya Sweilam Class: 11H3

This investigation is based on the students of Mayfield High School, a fictitious school. There are 1182 students at Mayfield, with data presented in 13 categories. I will be investigating the relationship between height and weight and how these statistics differ between females and males. I have chosen to look at height and weight mainly because the data will be numerical and continuous, meaning that I will be able to produce a more detailed analysis. If I had instead chosen eye colour and hair colour, for example, my analysis would be limited and my investigation might therefore be imprecise.

My aim in this investigation is to find out whether there is a correlation between height and weight and whether this varies between genders. I believe that as a student becomes taller their weight will increase; because of this assumption I expect a graph of weight against height to show a rising trend. Listed below are my hypotheses.

The height and weight of a person are affected by their age and gender. I assume that in years 7-9 girls will generally be taller than boys, because girls tend to grow faster than boys during the early stages of development. Boys will, however, eventually grow taller, and so in years 10-11 it can be assumed that the number of boys taller than girls will be greater. This also applies to adults aged 20 and above. As for weight, boys are generally heavier than girls; this is due to their body structure. I therefore predict that my results will show that boys weigh more than girls.

Middle

1 | 30 | 0.2 | 75 |

On the cumulative frequency graph displaying weight, the females' data produces an almost perfect S-shaped curve, whereas the males' data has what seems to be an anomaly (the third point, at a weight of 45 kg and a cumulative frequency of 9) which affects its shape. For a symmetrical distribution, the median lies halfway between the first and third quartiles. Neither median lies exactly halfway, so neither distribution is exactly symmetrical. The females' median, however, is extremely close to halfway between the two quartiles, showing a more symmetrical distribution than that of the males; this may explain the almost perfect curve which the points plotted for females produce.
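The halfway test described above can be turned into a single number, often called the quartile (Bowley) skewness, which is 0 when the median sits exactly midway between the quartiles. The sketch below is only an illustration. It assumes the raw female weights listed in the standard deviation table later in this document and the default quartile convention of Python's `statistics.quantiles`, so its quartiles need not match values read off a cumulative frequency curve.

```python
from statistics import quantiles

# Female weights (kg), copied from the standard deviation table in this document.
female_weights = [45, 51, 30, 40, 45, 45, 44, 45, 50, 50, 60, 45, 45, 48, 56,
                  52, 42, 48, 49, 49, 54, 38, 47, 48, 49, 54, 50, 42, 60, 56]

def quartile_skew(values):
    """Bowley's quartile skewness: 0 when the median is midway between Q1 and Q3."""
    q1, q2, q3 = quantiles(values, n=4)  # default 'exclusive' method (an assumption)
    return (q3 + q1 - 2 * q2) / (q3 - q1)

skew = quartile_skew(female_weights)
print(f"quartile skewness = {skew:.2f}")  # prints: quartile skewness = 0.04
```

A value this close to 0 agrees with the observation that the females' median is almost halfway between the quartiles.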

The inter-quartile range is a measure of spread, much like the standard deviation. The advantage of the inter-quartile range over the standard deviation, however, is that it covers the middle half of the data regardless of the shape of the distribution, so it is not affected by extreme values. The smaller the inter-quartile range, the more consistent the data is. The inter-quartile range for the weights of males appears to be 15 kg and for the weights of females 10 kg, which is 5 kg less than the males. This shows us that the females' weights are more consistent, another explanation as to why the females' curve on the graph is closer to an S-shape than the males'. Overall, it is evident from the cumulative frequency graph that females generally weigh less than males.
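As a rough cross-check, the inter-quartile range can also be computed from raw values rather than read from a graph. This sketch assumes the female weights from the standard deviation table and the male weights from the Spearman's rank table in this document, and it uses the default quartile method of `statistics.quantiles`. Because the 15 kg and 10 kg figures above come from grouped data, the numbers here are not expected to match them exactly, only to point the same way.

```python
from statistics import quantiles

# Weights (kg) copied from the tables in this document.
female_weights = [45, 51, 30, 40, 45, 45, 44, 45, 50, 50, 60, 45, 45, 48, 56,
                  52, 42, 48, 49, 49, 54, 38, 47, 48, 49, 54, 50, 42, 60, 56]
male_weights = [26, 31, 38, 38, 40, 41, 42, 42, 44, 45, 46, 50, 50, 50, 52,
                54, 54, 54, 54, 54, 55, 55, 56, 57, 60, 60, 64, 65, 69, 75]

def iqr(values):
    """Inter-quartile range: upper quartile minus lower quartile."""
    q1, _, q3 = quantiles(values, n=4)  # quartile convention is an assumption
    return q3 - q1

female_iqr = iqr(female_weights)
male_iqr = iqr(male_weights)
print(f"female IQR = {female_iqr} kg, male IQR = {male_iqr} kg")
# prints: female IQR = 6.25 kg, male IQR = 14.25 kg
```

The smaller female value supports the same conclusion: the females' weights are the more consistent of the two.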

Neither curve on the graph displaying height is a perfect, or near perfect, S-shape, and neither median lies halfway between the first and third quartiles, so neither males nor females have symmetrical distributions. The inter-quartile range for the heights of males appears to be equal to that of the females, showing us that both sexes are equally consistent. Nevertheless, it is clear that males are generally taller than females, as their mean is higher.

Looking back at the cumulative frequency graphs, it is evident, particularly for the height of males, that I could have grouped the data more clearly. The third and fourth rows in the table of male heights show a frequency of 0, which affects the S-shape of the curve on my graph and possibly the lower quartile. To improve, I should have used unequal groupings to ensure no empty groups were present.

Box plots are an informative way to display a range of numerical data. They can show many things about a data set, such as the lowest value, the highest value, the median, the upper quartile and the lower quartile. Reading these from my cumulative frequency curves, I have drawn four box plots. There are no outliers in any of the box plots except one, where there is an extreme value which deviates significantly from the rest of the sample. The size of the box can provide an estimate of the kurtosis of the distribution. A thin box relative to the whiskers indicates that a very high number of cases are contained within a very small segment of the sample, indicating a distribution with a thinner peak, whereas a wider box is indicative of a wider peak; so the wider the box, the more U-shaped the distribution becomes.

Looking at the box plots representing height, we can see that the box plot for females is slightly more negatively skewed than that of the males, showing that most of the data are smaller values and suggesting that females are generally shorter than males. The medians lie at the same point, 1.6 m, and both have an equal inter-quartile range; nevertheless, the tallest male is 0.5 m taller than the tallest female. As both boxes are of equal size, both distributions are equally U-shaped. The box represents the middle 50% of the data sample, so half of all cases are contained within it. The middle 50% of the data for the males ranges between 1.55 m and 1.7 m, whereas for the females it ranges between 1.5 m and 1.65 m, showing us that females are generally shorter than males.

Looking at the box plots for weight, we see that half the females' weights are between 45 and 55 kg, whereas half the males' weights lie between 45 and 60 kg. The highest value for females is 70 kg (ignoring the outlier) and for males 75 kg, and the median for the males' weight is 5 kg higher than that of the females. The lowest value which appears on the box plot for males is 30 kg and the highest is 75 kg, giving a range of 45 kg. Looking at the same values for the females, we can work out that their range is 5 kg less than that of the males. The females' distribution has a thinner peak than the males', since the box for the females' weights is far thinner than the males'. The distribution for the weight of males is, therefore, more U-shaped.

The location of the box within the whiskers can provide insight into the normality of the sample's distribution: when the box is not centred between the whiskers, the sample may be positively or negatively skewed. If the box is shifted significantly to the low end, the distribution is positively skewed; if the box is shifted significantly to the high end, it is negatively skewed. None of the four box plots is shifted significantly to either end. Nevertheless, being analytical, I could say that both box plots showing weight are positively skewed: although the shift is slight, they edge towards the lower end rather than the opposite. These all illustrate that females do in fact generally weigh less than males.

An outlier appears on the box plot showing the weights of females. This may be the result of an error in measurement, in which case it will distort the interpretation of the data, having undue influence on many summary statistics, for example the mean. However, if the outlier is a genuine result it is important, because it may indicate an extreme of behaviour or may have been affected by external factors, for example dietary habits. As I have no information on exercise or dietary habits, I cannot be sure whether it is a genuine result or an error, and for this reason I have left the outlier in the data.
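One common way to decide whether a value counts as an outlier is the 1.5 × IQR rule. That rule is an assumption here, since the text does not say which rule was used for the box plots. The sketch applies it to the raw female weights listed in the standard deviation table in this document, so the value it flags need not be the point marked on the box plot, which was drawn from the cumulative frequency curves.

```python
from statistics import quantiles

# Female weights (kg), copied from the standard deviation table in this document.
female_weights = [45, 51, 30, 40, 45, 45, 44, 45, 50, 50, 60, 45, 45, 48, 56,
                  52, 42, 48, 49, 49, 54, 38, 47, 48, 49, 54, 50, 42, 60, 56]

def outliers_by_iqr(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is an assumed convention."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return sorted(v for v in values if v < low or v > high)

flagged = outliers_by_iqr(female_weights)
print(flagged)  # prints [30]
```

A rule like this only flags a value; whether to keep it still depends on whether it is believed to be a genuine measurement.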

To conclude, my hypothesis was in fact correct. It is evident from all the graphs included that females are generally shorter and weigh less than males. Whether this is attributable, as studies show, to the differing skeletons of the two sexes or to the different hormones produced in female and male bodies, it is known that females are generally shorter and weigh less than males. When the average male and female both reach the age of 20 it is said that ‘females are generally 10 percent shorter than males and 20 per cent lighter’, and between the ages of 11 and 16 ‘males appear to generally be 15 percent taller and heavier than the female sex’. After comparing my results to articles and published graphs on the internet, I am able to confirm that my hypothesis, that females are generally shorter and weigh less than males, was correct.

Hypothesis 2:

After calculating the frequency density for the male and female heights and weights, I created four histograms. The advantage of a histogram is that it shows the shape of the distribution for a large set of data, so I was able to see the shape of the distributions for male and female heights and weights. However, when using histograms it is more difficult to compare two or more data sets, as we are unable to read exact values because the data is grouped into classes. For this reason I used the standard deviation to show whether or not the data is normally distributed. At first glance it is easy to see that the histograms are neither completely symmetrical nor entirely asymmetrical; I expect that if I had used a larger sample the histograms would have appeared more symmetrical.

Tables which I used to create the histograms

Females

Height (m) | Frequency | Frequency density | Upper class boundary |

1<h<1.5 | 7 | 14 | 1.5 |

1.5<h<1.65 | 17 | 113 | 1.65 |

1.65<h<1.75 | 5 | 50 | 1.75 |

1.75<h<1.85 | 1 | 10 | 1.85 |

Weight (kg) | Frequency | Frequency density | Upper class boundary |

30<w<45 | 6 | 0.4 | 45 |

45<w<50 | 13 | 2.6 | 50 |

50<w<60 | 9 | 0.3 | 60 |

60<w<75 | 2 | 0.4 | 75 |

Males

Height (m) | Frequency | Frequency density | Upper class boundary |

1<h<1.45 | 7 | 16 | 1.45 |

1.45<h<1.65 | 11 | 55 | 1.65 |

1.65<h<1.7 | 7 | 14 | 1.7 |

1.7<h<1.8 | 4 | 40 | 1.8 |

1.8<h<1.85 | 1 | 2 | 1.85 |

Weight (kg) | Frequency | Frequency density | Upper class boundary |

25<w<40 | 4 | 0.3 | 30 |

40<w<60 | 20 | 1 | 45 |

60<w<65 | 3 | 0.6 | 65 |

65<w<75 | 3 | 0.3 | 70 |
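Frequency density is the class frequency divided by the class width, which is what makes bars of unequal width comparable on a histogram. The sketch below recomputes it for the female weight classes, taking the class bounds and frequencies from the table above as given. For the 50 to 60 and 60 to 75 classes it gives 0.9 and about 0.13 rather than the 0.3 and 0.4 listed, so one of the bounds, frequencies or densities in those rows may have been mis-copied; which one cannot be told from the table alone.

```python
def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)

# (lower bound, upper bound, frequency) for the female weight classes in the table.
female_weight_classes = [(30, 45, 6), (45, 50, 13), (50, 60, 9), (60, 75, 2)]

densities = [round(frequency_density(lo, hi, f), 2)
             for lo, hi, f in female_weight_classes]
print(densities)  # prints [0.4, 2.6, 0.9, 0.13]
```

The same function can be pointed at the other three tables by swapping in their bounds and frequencies.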

From looking at the histograms, it is clear that only two of them have shapes on which it is appropriate to superimpose normal distribution curves, and so for this reason I will not calculate the normal distribution. If I had selected a bigger sample, it might have been possible to calculate the normal distribution, as the histograms may have been more symmetrical.

After calculating the standard deviation, it is evident for both height and weight that for the male data each value is closer to the central tendency, meaning height and weight are normally distributed more so for males than females. Again it is clear that males weigh more and are taller than females, as the means for the males are higher than those of the females.

Gender | Weight (x) | x - 47.9 | (x - 47.9)² | Gender | Height (x) | x - 1.56 | (x - 1.56)² |

Female | 45 | -2.9 | 8.41 | Female | 1.03 | -0.53 | 1.0609 |

Female | 51 | 3.1 | 9.61 | Female | 1.35 | -0.21 | 1.8225 |

Female | 30 | -17.9 | 320.41 | Female | 1.42 | -0.14 | 2.0164 |

Female | 40 | -7.9 | 62.41 | Female | 1.43 | -0.13 | 2.0449 |

Female | 45 | -2.9 | 8.41 | Female | 1.43 | -0.13 | 2.0449 |

Female | 45 | -2.9 | 8.41 | Female | 1.46 | -0.1 | 2.1316 |

Female | 44 | -3.9 | 15.21 | Female | 1.47 | -0.09 | 2.1609 |

Female | 45 | -2.9 | 8.41 | Female | 1.5 | -0.06 | 2.25 |

Female | 50 | 2.1 | 4.41 | Female | 1.55 | -0.01 | 2.4025 |

Female | 50 | 2.1 | 4.41 | Female | 1.55 | -0.01 | 2.4025 |

Female | 60 | 12.1 | 146.41 | Female | 1.55 | -0.01 | 2.4025 |

Female | 45 | -2.9 | 8.41 | Female | 1.56 | 0 | 2.4336 |

Female | 45 | -2.9 | 8.41 | Female | 1.57 | 0.01 | 2.4649 |

Female | 48 | 0.1 | 0.01 | Female | 1.59 | 0.03 | 2.5281 |

Female | 56 | 8.1 | 65.61 | Female | 1.6 | 0.04 | 2.56 |

Female | 52 | 4.1 | 16.81 | Female | 1.61 | 0.05 | 2.5921 |

Female | 42 | -5.9 | 34.81 | Female | 1.62 | 0.06 | 2.6244 |

Female | 48 | 0.1 | 0.01 | Female | 1.62 | 0.06 | 2.6244 |

Female | 49 | 1.1 | 1.21 | Female | 1.62 | 0.06 | 2.6244 |

Female | 49 | 1.1 | 1.21 | Female | 1.62 | 0.06 | 2.6244 |

Female | 54 | 6.1 | 37.21 | Female | 1.62 | 0.06 | 2.6244 |

Female | 38 | -9.9 | 98.01 | Female | 1.63 | 0.07 | 2.6569 |

Female | 47 | -0.9 | 0.81 | Female | 1.63 | 0.07 | 2.6569 |

Female | 48 | 0.1 | 0.01 | Female | 1.63 | 0.07 | 2.6569 |

Female | 49 | 1.1 | 1.21 | Female | 1.65 | 0.09 | 2.7225 |

Female | 54 | 6.1 | 37.21 | Female | 1.65 | 1.65 | 2.7225 |

Female | 50 | 2.1 | 4.41 | Female | 1.68 | 1.68 | 2.8224 |

Female | 42 | -5.9 | 34.81 | Female | 1.71 | 1.71 | 2.9241 |

Female | 60 | 12.1 | 146.41 | Female | 1.72 | 1.72 | 2.9584 |

Female | 56 | 8.1 | 65.61 | Female | 1.75 | 1.75 | 3.0625 |

Total | 1437 | 0 | 1158.7 | Total | 46.82 | 7.82 | 73.6234 |

SD= | 211.5 | SD= | 13.4 |
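The weight columns of the table can be checked with a short calculation. This sketch assumes the population form of the standard deviation (dividing by n, the number of values), which is an assumption about the formula intended. It reproduces the mean of 47.9 and the total squared deviation of 1158.7 shown above, but gives a standard deviation of about 6.21 kg, which differs from the "SD=" figure in the table; how that figure was obtained is not shown.

```python
from math import sqrt

# Female weights (kg), copied from the table above.
female_weights = [45, 51, 30, 40, 45, 45, 44, 45, 50, 50, 60, 45, 45, 48, 56,
                  52, 42, 48, 49, 49, 54, 38, 47, 48, 49, 54, 50, 42, 60, 56]

def squared_deviation_total(values):
    """Sum of (x - mean)^2 over all values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def population_sd(values):
    """Square root of the mean squared deviation (divides by n, an assumption)."""
    return sqrt(squared_deviation_total(values) / len(values))

mean = sum(female_weights) / len(female_weights)
total = squared_deviation_total(female_weights)
sd = population_sd(female_weights)
print(f"mean = {mean:.1f}, total squared deviation = {total:.1f}, SD = {sd:.2f}")
# prints: mean = 47.9, total squared deviation = 1158.7, SD = 6.21
```

The height columns could be checked in the same way. Note that the entries printed under (x-1.56)SQD appear to be x² rather than squared deviations, so they would not match this calculation.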

Conclusion

54 | 3.3 | 10.89 |

Male | 1.77 | 1.77 | 3.1329 | Male | 50 | -0.7 | 0.49 |

Male | 1.83 | 1.83 | 3.3489 | Male | 60 | 9.3 | 86.49 |

Male | 1.85 | 1.85 | 3.4225 | Male | 75 | 24.3 | 590.49 |

Total | 47.56 | 9.5 | 19.0888 | Total | 50.7 | 0 | 3422.3 |

SD= | 3.49 | SD= | 624.9 |

Gender | Height (m) | Weight (kg) | Height Rank | Weight Rank | Diff | Diff^2 |

female | 1.42 | 30 | 3 | 1 | 2 | 4 |

female | 1.63 | 38 | 23 | 2 | 21 | 441 |

female | 1.43 | 40 | 4.5 | 3 | 1.5 | 2.25 |

female | 1.62 | 42 | 19 | 4.5 | 14.5 | 210.25 |

female | 1.71 | 42 | 28 | 4.5 | 23.5 | 552.25 |

female | 1.47 | 44 | 7 | 6 | 1 | 1 |

female | 1.03 | 45 | 1 | 9.5 | -8.5 | 72.25 |

female | 1.43 | 45 | 4.5 | 9.5 | -5 | 25 |

female | 1.46 | 45 | 6 | 9.5 | -3.5 | 12.25 |

female | 1.5 | 45 | 8 | 9.5 | -1.5 | 2.25 |

female | 1.56 | 45 | 12 | 9.5 | 2.5 | 6.25 |

female | 1.57 | 45 | 13 | 9.5 | 3.5 | 12.25 |

female | 1.63 | 47 | 23 | 13 | 10 | 100 |

female | 1.59 | 48 | 14 | 15 | -1 | 1 |

female | 1.62 | 48 | 19 | 15 | 4 | 16 |

female | 1.63 | 48 | 23 | 15 | 8 | 64 |

female | 1.62 | 49 | 19 | 18 | 1 | 1 |

female | 1.62 | 49 | 19 | 18 | 1 | 1 |

female | 1.65 | 49 | 25.5 | 18 | 7.5 | 56.25 |

female | 1.55 | 50 | 10 | 21 | -11 | 121 |

female | 1.55 | 50 | 10 | 21 | -11 | 121 |

female | 1.68 | 50 | 27 | 21 | 6 | 36 |

female | 1.35 | 51 | 2 | 23 | -21 | 441 |

female | 1.61 | 52 | 16 | 24 | -8 | 64 |

female | 1.62 | 54 | 19 | 25 | -6 | 36 |

female | 1.65 | 54 | 25.5 | 25 | 0.5 | 0.25 |

female | 1.6 | 56 | 15 | 27.5 | -12.5 | 156.25 |

female | 1.75 | 56 | 30 | 27.5 | 2.5 | 6.25 |

female | 1.55 | 60 | 10 | 29.5 | -19.5 | 380.25 |

female | 1.72 | 60 | 29 | 29.5 | -0.5 | 0.25 |

Summation | Sum | 2942.5 |

6 * sum | 17655 | |

Count | n | 30 |

n(n^2-1) | 26970 | |

Spearman's R | 0.3 |
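The rank columns in the table rely on giving tied values the average of the positions they occupy. A minimal sketch of that convention follows, applied to the female weights in the order they appear in the table. The helper name `average_ranks` is hypothetical. Under this convention the two weights of 54 kg share positions 25 and 26 and would each be ranked 25.5, whereas the table lists 25 for both.

```python
def average_ranks(values):
    """Rank from smallest (1) upwards, giving tied values the mean of their positions."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

# Female weights (kg) in the order of the table above.
weights = [30, 38, 40, 42, 42, 44, 45, 45, 45, 45, 45, 45, 47, 48, 48,
           48, 49, 49, 49, 50, 50, 50, 51, 52, 54, 54, 56, 56, 60, 60]

ranks = average_ranks(weights)
rank_of = dict(zip(weights, ranks))
print(rank_of[45], rank_of[54], rank_of[60])  # prints 9.5 25.5 29.5
```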

Gender | Height (m) | Weight (kg) | Height Rank | Weight Rank | Diff | Diff^2 |

male | 1.42 | 26 | 5 | 1 | 4 | 16 |

male | 1.41 | 31 | 4 | 2 | 2 | 4 |

male | 1.32 | 38 | 2 | 3 | -1 | 1 |

male | 1.6 | 38 | 15 | 3 | 12 | 144 |

male | 1.63 | 40 | 18 | 5 | 13 | 169 |

male | 1.65 | 41 | 21 | 6 | 15 | 225 |

male | 1.34 | 42 | 3 | 7.5 | -4.5 | 20.25 |

male | 1.44 | 42 | 6 | 7.5 | -1.5 | 2.25 |

male | 1.26 | 44 | 1 | 9 | -8 | 64 |

male | 1.46 | 45 | 7 | 10 | -3 | 9 |

male | 1.66 | 46 | 24 | 11 | 13 | 169 |

male | 1.55 | 50 | 9 | 13 | -4 | 16 |

male | 1.58 | 50 | 13 | 13 | 0 | 0 |

male | 1.68 | 50 | 25 | 13 | 12 | 144 |

male | 1.59 | 52 | 14 | 17.5 | -3.5 | 12.25 |

male | 1.57 | 54 | 11.5 | 17.5 | -6 | 36 |

male | 1.57 | 54 | 11.5 | 17.5 | -6 | 36 |

male | 1.65 | 54 | 21 | 17.5 | 3.5 | 12.25 |

male | 1.65 | 54 | 21 | 17.5 | 3.5 | 12.25 |

male | 1.77 | 54 | 28 | 17.5 | 10.5 | 110.25 |

male | 1.65 | 55 | 21 | 21.5 | -0.5 | 0.25 |

male | 1.85 | 55 | 30 | 21.5 | 8.5 | 72.25 |

male | 1.62 | 56 | 17 | 23 | -6 | 36 |

male | 1.73 | 57 | 26.5 | 24 | 2.5 | 6.25 |

male | 1.6 | 60 | 15 | 25.5 | -10.5 | 110.25 |

male | 1.73 | 60 | 26.5 | 25.5 | 1 | 1 |

male | 1.55 | 64 | 9 | 27 | -18 | 324 |

male | 1.55 | 65 | 9 | 28 | -19 | 361 |

male | 1.65 | 69 | 21 | 29 | -8 | 64 |

male | 1.83 | 75 | 29 | 30 | -1 | 1 |

Summation | Sum | 2178.5 |

6 * sum | 13071 | |

Count | n | 30 |

n(n^2-1) | 26970 | |

Spearman's R | 0.5 |
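The summary rows follow the usual formula for Spearman's rank correlation coefficient, 1 - 6Σd² / (n(n² - 1)). The sketch below plugs in the two Σd² totals from the tables, taking them as given; the function name is hypothetical. To two decimal places it gives about 0.35 for females and 0.52 for males.

```python
def spearman_from_d2(sum_d2, n):
    """Spearman's rank coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

female_r = spearman_from_d2(2942.5, 30)
male_r = spearman_from_d2(2178.5, 30)
print(f"females: {female_r:.2f}")  # prints females: 0.35
print(f"males: {male_r:.2f}")      # prints males: 0.52
```

This shortcut formula is exact only when there are no tied ranks; with as many ties as in these tables it should be read as an approximation.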

After calculating Spearman's rank correlation coefficient, it is evident that there is a positive correlation between height and weight: the taller the person is, the heavier they tend to be, and vice versa. The correlation is weak for females and moderate for males, for whom it is slightly stronger.

The height and weight of a person are affected by their age and gender. I assumed that in years 7-9 girls would generally be taller than boys, because girls tend to grow faster than boys during the early stages of development. Boys will, however, eventually grow taller, and so I assumed that in years 10-11 the number of boys taller than girls would be greater. I was correct. I also expected the relationship between height and weight to show a rising trend; although the trends for both males and females were weak, they both showed this. It can be seen from all the graphs included that females are generally shorter and weigh less than males. Whether this is attributable to the differing skeletons of the two sexes or to the different hormones produced, my results show that females are generally shorter and weigh less than males.
