Results
Random selection for Year 7
Random selection for Year 8
Random selection for Year 9
Random selection for Year 10
Random selection for Year 11
For each of the stratified data tables recorded I have plotted a scatter graph for each individual year, and on each scatter graph I have also plotted the regression line (the line of best fit) for that year group.
To find the regression line I have used this formula:
Y = a + bx. This is the general equation of a straight-line graph, where x is the height and Y is the weight.
a = where the line crosses the y axis
b = gradient of the line
The following formula is used to determine the value of b:
b = Sxy/Sxx
Sxy = sum of (height × weight) – (sum of height × sum of weight)/number in year group
Sxx = sum of height² – (sum of height × sum of height)/number in year group
The ‘sum of’ equals the total of that data series.
a = (sum of weight)/number in year group – (b × sum of height)/number in year group
These can be represented in another way:
H = sum of all the heights in that year group
W = sum of all the weights in that year group
HW = sum of (height × weight) for every pupil in that year group
H2 = sum of the squares of all the heights in that year group
N = number of pupils in that year group
Therefore:
Sxy = HW – (H × W)/N
Sxx = H2 – (H × H)/N
b = Sxy/Sxx
a = W/N – (b × H)/N
Once I have found the values of a and b, these are substituted into the line of regression equation Y = a + bx.
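As a check on the method, the calculation of a and b can be sketched in Python. This is only an illustration under stated assumptions: the function name regression_line and the label HW (the sum of height × weight) are my own, it uses the standard least-squares formulas for Sxy and Sxx, and the four data points are the pupils quoted in the Evaluation rather than a full year-group sample.

```python
def regression_line(heights, weights):
    """Return (a, b) for the least-squares line: weight = a + b * height."""
    n = len(heights)                                    # N: number of pupils
    H = sum(heights)                                    # sum of all heights
    W = sum(weights)                                    # sum of all weights
    HW = sum(h * w for h, w in zip(heights, weights))   # sum of height x weight
    H2 = sum(h * h for h in heights)                    # sum of squared heights

    Sxy = HW - (H * W) / n
    Sxx = H2 - (H * H) / n

    b = Sxy / Sxx                # gradient of the line
    a = W / n - (b * H) / n      # where the line crosses the y axis
    return a, b


# Illustrative data only: four pupils mentioned in the Evaluation
# (heights in metres, weights in kilograms).
heights = [1.52, 1.82, 1.59, 1.65]
weights = [45, 64, 46, 49]

a, b = regression_line(heights, weights)
print(f"Y = {a:.2f} + {b:.2f}x")
```

A useful property to check by hand is that the line always passes through the point (mean height, mean weight), because a is defined as the mean weight minus b times the mean height.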
Standard Deviation
The standard deviation is a widely used measure of dispersion, i.e. how much the data varies around the average (mean).
The following steps help to find the standard deviation.
(a) Find the mean for the given data.
(b) Find the deviation from the mean for each piece of data.
(c) Square each of the Deviations.
(d) Find the mean of these squares.
(e) The square root of this mean is the standard deviation.
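Steps (a) to (e) can be sketched in Python as follows. This is a minimal illustration, not the spreadsheet method used for the actual results: the function name standard_deviation is my own, and the sample weights are the four pupils quoted in the Evaluation. Because step (d) takes the mean of the squares (dividing by the number of values), this gives the population standard deviation.

```python
import math


def standard_deviation(values):
    """Population standard deviation, following steps (a) to (e)."""
    mean = sum(values) / len(values)              # (a) mean of the data
    deviations = [v - mean for v in values]       # (b) deviation of each value
    squares = [d * d for d in deviations]         # (c) square each deviation
    mean_square = sum(squares) / len(squares)     # (d) mean of the squares
    return math.sqrt(mean_square)                 # (e) square root of that mean


# Illustrative data only: weights in kilograms of four pupils from the text.
sample_weights = [45, 64, 46, 49]
print(round(standard_deviation(sample_weights), 2))
```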
Evaluation
The investigation on the whole was a success, as the results supported my hypothesis. The results obtained from the Mayfield Data were very reliable and accurate.
The results that I collected show that the taller the person, the heavier they tend to be, with weight generally increasing as height increases. From the scatter graphs I have drawn it is possible to identify a positive correlation in all of them, which suggests that the weight of the people in my data increased with their height. However, the points do not lie exactly on a straight line; the regression line only shows the general relationship between height and weight. For example, a short person is likely to be lighter than someone taller than them. Adam Cullin in Year 8, with a height of 1.52 m and a weight of 45 kg, is lighter than Ian Freeman at 1.82 m and 64 kg. Similarly, Louise McDonald at 1.59 m weighs 46 kg, but Amanda Packham at 1.65 m is heavier at 49 kg. The scatter graphs support the hypothesis.
There were several anomalies in the Mayfield Data that did not follow the general trend of the graphs. Some were included in the random sample and some were not. Examples of the ones that were not picked (not all listed) are:
- Mellisa Bailey, in Year 8 – height 1.75 m, weight 72 kg
- Simon Morrison, in Year 11 – height 1.92 m, weight 45 kg
- Semour Banks, in Year 10 – height 1.6 m, weight 9 kg
- Steve Austin, in Year 9 – height 1.8 m, weight 48 kg
These were some of the anomalies that I spotted in my results; however, there were not many, and as a result I concluded that they would not be significant to my final conclusion.
These anomalies suggest that not everyone in the school follows the general trend that the taller the person, the heavier his/her weight; there are exceptions to the rule.
The formulas that I used to find the line of regression were very reliable because they gave the line of best fit that best represented the data at hand. This then helped me decide whether the data showed a negative or a positive correlation and pick out the anomalies in my data, which didn’t follow the trend of the graph.
Recommendations
If I were to perform this experiment again I would use the height and weight and compare them to the body mass index. This would show me whether the mass on the person’s body has any effect on his/her weight or height. I would then use this data to construct a scatter graph and a regression line to display my results.
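The body mass index is the weight in kilograms divided by the square of the height in metres. As a rough sketch of how this could be worked out for each pupil, assuming the data is held as (name, height, weight) entries, the calculation might look like this. The function name body_mass_index is my own, and the four pupils are the ones quoted in the Evaluation, used only as an illustration.

```python
def body_mass_index(weight_kg, height_m):
    """Body mass index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2


# Illustrative data only: (name, height in m, weight in kg) from the text.
pupils = [
    ("Adam Cullin", 1.52, 45),
    ("Ian Freeman", 1.82, 64),
    ("Louise McDonald", 1.59, 46),
    ("Amanda Packham", 1.65, 49),
]

for name, height, weight in pupils:
    bmi = body_mass_index(weight, height)
    print(f"{name}: BMI = {bmi:.1f}")
```

The BMI values could then be plotted against height or weight on a scatter graph in the same way as before.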
Conclusion
The data varied slightly more for males than for females, which showed that girls had a more stable relationship between weight and height than boys. Males were sometimes less consistent than females of the same age, especially in the younger years, for heights in the range 1.4 m to 1.8 m and weights of 30 kg to 70 kg. Females have a similar spread of height to boys, but their overall range of weight for each year group is much smaller, i.e. boys have a larger weight range than girls of the same year group. The regression line for males varies much more than that for females; this is reflective of the larger mean averages for boys.
All the females from Years 7 to 11 had either a positive correlation or no correlation at all; girls tended to follow the general idea that the taller you are, the heavier you are.
All the males from Years 7 to 11 had a positive correlation apart from one year, where we could say that we were shown a negative correlation. Year 9 males did not fit the general trend: there is someone who is 1.55 m tall and weighs 67 kg (approx.) and someone who is 1.80 m tall and weighs 42 kg (approx.), and this results in a negative gradient.
I later went on to see if there was a relationship across every female and every male. I found that there was a stronger correlation among the females, as the points for every pupil were close together and compact. There was a fairly strong correlation among the boys, but a couple of pupils still did not fit the trend.
A larger sample would provide a greater level of confidence in the analysis, as relying upon a random selection can only provide a certain level of mathematically based confidence.
All my work was carried out using Microsoft Word and Excel.