This coursework is about a statistical analysis to investigate the height and weight of pupils within Mayfield High School

Introduction

This coursework is a statistical analysis investigating the height and weight of pupils within Mayfield High School. The total number of pupils in the school is 1183, but I did not have time to collect the heights and weights of every student, so I took a stratified random sample of 200 students. I built this sample by calculating the percentage of the school made up by male and female pupils in each year group, and I used those percentages to decide how many students to take from each group. The percentages had to be rounded so that each group contributed a whole number of pupils rather than a half or three-quarters of a pupil; for example, I rounded 10.6% up to 11%. After rounding, the percentages still added up to 100%.
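The proportional allocation described above can be sketched in a few lines of Python. This is only an illustration and not the working used in the coursework: the function name `allocate_sample` and the year-group counts are placeholders (the counts were chosen only so that they total 1183), and the largest-remainder rounding is one possible way of making the rounded group sizes add up to exactly 200.

```python
import math

def allocate_sample(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes.

    Uses largest-remainder rounding: round every share down, then hand
    the leftover places to the strata with the biggest fractional parts,
    so the allocations always sum to sample_size.
    """
    total = sum(strata_sizes.values())
    exact = {k: n * sample_size / total for k, n in strata_sizes.items()}
    alloc = {k: math.floor(v) for k, v in exact.items()}
    shortfall = sample_size - sum(alloc.values())
    # Strata ordered by fractional part, largest first.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:shortfall]:
        alloc[k] += 1
    return alloc

# Placeholder counts per (year, gender); NOT taken from the coursework.
example_strata = {
    ("Year 7", "boys"): 151, ("Year 7", "girls"): 131,
    ("Year 8", "boys"): 145, ("Year 8", "girls"): 125,
    ("Year 9", "boys"): 118, ("Year 9", "girls"): 143,
    ("Year 10", "boys"): 106, ("Year 10", "girls"): 94,
    ("Year 11", "boys"): 84, ("Year 11", "girls"): 86,
}

example_allocation = allocate_sample(example_strata, 200)
for stratum, count in example_allocation.items():
    print(stratum, count)
```

Within each group, the pupils themselves would then be picked at random, for example with `random.sample` from Python's standard library.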

The working out looked like this:

While selecting my data I found errors in the database. For example, I found 3 female students in year 7 all weighing between 110kg and 140kg, and 4 students weighing under 10kg, lighter than a sack of rice.

I also found some implausible heights.
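A check like this can be automated before sampling. The sketch below flags records whose height or weight falls outside a plausible range. The field names (`id`, `height_m`, `weight_kg`) and both ranges are assumptions made for illustration; the coursework does not state which limits were used, so they would need adjusting to the real database.

```python
def flag_suspect_records(records,
                         weight_range=(20.0, 110.0),   # kg, assumed limits
                         height_range=(1.0, 2.2)):     # metres, assumed limits
    """Return (record, reasons) pairs for records with implausible values."""
    flagged = []
    for rec in records:
        reasons = []
        if not weight_range[0] <= rec["weight_kg"] <= weight_range[1]:
            reasons.append("weight out of range")
        if not height_range[0] <= rec["height_m"] <= height_range[1]:
            reasons.append("height out of range")
        if reasons:
            flagged.append((rec, reasons))
    return flagged

# Hypothetical records, not taken from the Mayfield database.
sample_records = [
    {"id": 1, "height_m": 1.60, "weight_kg": 50.0},
    {"id": 2, "height_m": 1.55, "weight_kg": 8.0},    # under 10kg
    {"id": 3, "height_m": 4.65, "weight_kg": 45.0},   # impossible height
]

for rec, reasons in flag_suspect_records(sample_records):
    print(rec["id"], reasons)
```

Flagged records can then be checked by hand and either corrected or left out of the sample.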

Hypotheses

  1. As height increases, weight increases.
  2. Boys are naturally taller than girls.
  3. As age increases, the rate of growth decreases.
  4. Girls’ heights are more widely spread than boys’ heights.
  5. Students stop growing at year 10 and are, on average, the same height and weight as year 11 students.

Hypothesis 1: As height increases, weight increases

To test this hypothesis I have decided to use a scatter graph, because scatter graphs are useful for comparing two variables, in this case height and weight. I entered my data into a spreadsheet and produced this graph using the program Autograph.

Fig 1.0

From my graph (figure 1.0) I can see that as height increases, weight also increases. The average person has a height of 1.597 m and a weight of 49.24 kg. I think this is quite a small person, but the average takes into account children who are quite young (11 years old in year 7). This shows a disadvantage of using all the data at once and would be a reason to separate the data into year groups (see section 2).

Figure 1.0 also indicates a positive correlation, but I don’t think it is a strong one because the points are widely scattered. To get a better idea of how reliable the line of best fit is, I will work out the correlation coefficient. Because I know the correlation is positive, I will treat a coefficient between 0.5 and 1 as showing that the straight line is a good fit to the data, and a coefficient closer to 0 as showing that it is not.
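As a cross-check on a hand calculation, the coefficient can also be computed with a short Python function. This is a minimal sketch using the standard definition of the product moment correlation coefficient (the sum of products of deviations divided by the square root of the product of the two sums of squared deviations). The function name `pearson_r` and the heights and weights shown are made up for illustration and are not the sample used in this coursework.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (m) and weights (kg), for illustration only.
heights = [1.48, 1.52, 1.55, 1.60, 1.63, 1.70, 1.75]
weights = [40.0, 45.0, 43.0, 50.0, 48.0, 57.0, 60.0]

r = pearson_r(heights, weights)
print(f"r = {r:.3f}")
```

A value of `r` close to 1 would support a strong positive correlation, while a value near 0 would suggest the line of best fit is not reliable.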


Working out the correlation coefficient

The formula to find the Product Moment Correlation Coefficient ...
