Mayfield High Investigation

Mayfield High Handling Data Coursework

Hypothesis

I believe that the taller a person is, the more they will weigh. This should be more visible in year 11, due to the effects of puberty. Boys in year 11 will mostly be taller than the girls; in year 7, however, the heights will be more even, or the girls may even be taller, because girls tend to go through puberty at a younger age than boys. There should be a positive correlation between the weight and height of the pupils.

Pre-test

I will do a pre-test to check whether my hypothesis is likely to be correct. I will take a stratified sample from year 11, randomly selecting the required number of pupils of each gender, and investigate the correlation between height and weight. I will then plot the data on a scatter graph, which should show a positive correlation between the weight and the height of the students. I have chosen year 11 rather than year 7 because growth during puberty will have slowed as the students approach adulthood, so the results should be clearer. I am selecting 40 students, and I will now work out how many girls and boys to use.

For year 11, there are 86 females (51% approx) and 84 males (49% approx), making 170 for the whole year. Sharing the 40 places in these proportions gives 20 girls and 20 boys.
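The split of the 40 places between girls and boys can be checked with a short calculation. This is only a sketch of the proportional (stratified) working, using the year 11 figures quoted above; the variable names are my own.

```python
# Year 11 population figures quoted in the text.
females = 86
males = 84
sample_size = 40

total = females + males  # 170 pupils in the whole year

# Each gender's share of the sample matches its share of the year group.
female_sample = round(sample_size * females / total)
male_sample = round(sample_size * males / total)

print(female_sample, male_sample)  # 20 20
```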

These are the pupil numbers randomly selected for the stratified sample, in numerical order:

Females: 2, 6, 7, 17, 20, 23, 24, 32, 33, 35, 36, 38, 55, 60, 64, 69, 72, 79, 81, 86

Males: 87, 88, 89, 90, 91, 95, 97, 98, 103, 105, 106, 111, 115, 123, 140, 156, 161, 164, 167, 170
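A random draw like the one above could be produced as follows. This is a minimal sketch, assuming the year 11 pupils are numbered 1 to 170 with girls as 1 to 86 and boys as 87 to 170 (which matches the lists above). The seed is a hypothetical value chosen only so the draw can be repeated, so the numbers it produces will not be the same as the ones listed.

```python
import random

# Assumption: girls are numbered 1-86 and boys 87-170.
rng = random.Random(11)  # hypothetical seed, only to make the draw repeatable

# Draw 20 different pupil numbers from each gender, then sort them.
female_sample_ids = sorted(rng.sample(range(1, 87), 20))
male_sample_ids = sorted(rng.sample(range(87, 171), 20))

print("Females:", female_sample_ids)
print("Males:", male_sample_ids)
```

Sampling without replacement means no pupil can be picked twice.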

Pre-test conclusion

The graphs for my pre-test show that there is a positive correlation between weight and height, and that it differs slightly between females and males: the correlation for males is stronger than that for females. Although the pattern is not extremely clear because the sample is small, it is clear enough to suggest that my hypothesis will hold for the rest of my investigation. To carry out the rest of my investigation I will use bigger samples and more advanced types of data analysis.
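One way to put a number on how strong a correlation is, rather than judging it from the graph alone, is the product-moment correlation coefficient. The function below is a sketch of that calculation and is not something used in the pre-test itself; the name pearson_r and its arguments are my own. A value close to 1 means a strong positive correlation and a value close to 0 means little or no correlation.

```python
import math

def pearson_r(xs, ys):
    """Return the product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Calling it once with the boys' heights and weights and once with the girls' would give two values that can be compared directly.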

Collecting Data

Before I use stratified sampling to choose my data, I first need to remove any outliers, as these would affect my final conclusion. The only anomalous entry I found was number 134, which had no weight recorded, so it was inappropriate to use it in my final data. Because humans are all different, I was unsure which outliers to take out, but I felt that I could not remove any others because no other data point completely stood out.

I will use 25% of each year group for my sample. Because the year groups are different sizes, taking the same percentage from each one keeps the sample representative. As I am using 25%, I can simply divide the number of students in each year by 4.

For year 7, there are 131 females (47% approx), and 150 males (53% approx).

For year 11, there are 86 females (51% approx), and 84 males (49% approx).
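The sample size for each group can be worked out by dividing by 4 as described. The sketch below uses the group sizes quoted above; rounding to the nearest whole pupil, with halves rounded up, is my own assumption about how to deal with answers that are not whole numbers.

```python
import math

def quarter_sample(group_size):
    """Return 25% of a group, rounded to the nearest whole pupil (halves round up)."""
    return math.floor(group_size / 4 + 0.5)

# Group sizes quoted in the text.
groups = {
    "Year 7 girls": 131,
    "Year 7 boys": 150,
    "Year 11 girls": 86,
    "Year 11 boys": 84,
}

sample_sizes = {name: quarter_sample(size) for name, size in groups.items()}

for name, size in sample_sizes.items():
    print(name, size)
```

For year 7 girls this gives 33, which is the sample size used in the analysis of that group.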

Year 7 Girls

Year 7 Boys

Year 11 Girls

Year 11 Boys

In order to test my hypothesis, I will now carry out some analysis on my data. This will help me work out whether my prediction is supported or not. I will start by finding the averages and measures of spread: the estimated mean, the modal class, the median class and the standard deviation, first for each year and then for each gender within that year.

Year 7 girls

The modal class is 150.5 ≤ h < 160.5.
The median is the (33+1)/2 = 17th value, which lies in the class 150.5 ≤ h < 160.5.
Estimated mean: 5109.25/33 = 154.83 cm

Standard deviation: 12.70 cm
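The calculations behind these figures can be sketched as below. The functions take the class midpoints and frequencies from a grouped frequency table; their names are my own, and I am assuming the standard deviation is the usual grouped-data form (the square root of the mean of the squares minus the square of the mean). The last two lines only re-check the arithmetic quoted above.

```python
import math

def grouped_mean(midpoints, frequencies):
    """Estimate the mean from class midpoints and their frequencies."""
    n = sum(frequencies)
    return sum(f * x for x, f in zip(midpoints, frequencies)) / n

def grouped_std(midpoints, frequencies):
    """Estimate the standard deviation from class midpoints and frequencies."""
    n = sum(frequencies)
    mean = grouped_mean(midpoints, frequencies)
    mean_of_squares = sum(f * x * x for x, f in zip(midpoints, frequencies)) / n
    # max() guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, mean_of_squares - mean ** 2))

def median_position(n):
    """Return the position of the median in an ordered list of n values."""
    return (n + 1) / 2

print(median_position(33))     # 17.0
print(round(5109.25 / 33, 2))  # 154.83
```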

...