To test Hypothesis 3, I am going to draw two scatter diagrams, one for girls' heights and weights and one for boys' heights and weights. Instead of just drawing a line of best fit on my graph, I am going to estimate the mean height and the mean weight, plot this mean point and make sure I draw the line of best fit through it. This will give me a more accurate line with which to prove or disprove my hypothesis.
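A calculated (least-squares) line of best fit always passes through the mean point, which is why plotting that point first is a useful check on a line drawn by eye. The sketch below illustrates this in Python. The heights and weights are made-up values, not the pupils in my sample, and the function name `best_fit_through_mean` is only for illustration.

```python
from statistics import mean


def best_fit_through_mean(heights, weights):
    """Return the mean point, slope and intercept of the least-squares line."""
    x_bar, y_bar = mean(heights), mean(weights)
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
    sxx = sum((x - x_bar) ** 2 for x in heights)
    slope = sxy / sxx
    # Choosing the intercept this way forces the line through (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return (x_bar, y_bar), slope, intercept


# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [150, 155, 160, 165, 170]
weights = [42, 45, 50, 54, 59]

mean_point, slope, intercept = best_fit_through_mean(heights, weights)
print("Mean point:", mean_point)
print(f"Line of best fit: weight = {slope:.2f} x height + ({intercept:.1f})")
```

A positive slope corresponds to a positive correlation between height and weight.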
Comparison
My scatter graphs suggest that the boys have a strong positive correlation, whereas the girls' correlation is not as strong. Both graphs show that as height increases, weight tends to increase.
Limitations
My hypothesis has been proven, but I have to consider the fact that I have only used 27 pupils, with more girls than boys, that they are all of a similar age (14-15 years) and that some are more developed than others.
The main problems with my investigation are the sample size and the fact that the pupils are all in year 10. To make this more realistic I will increase the sample size to 100, widen the age range to 11 to 18 years and collect secondary data. To avoid bias I will use roughly 100 girls and 100 boys.
Hypothesis 1
Boys are taller than girls.
Hypothesis 2
Girls are heavier than boys.
Data Collection
These figures were collected from Mayfield High School, which does not have year 12 and year 13 pupils; the data for these year groups was taken from the internet.
There are 1283 pupils in total, and I need a sample of 100. I am going to use stratified sampling because it uses proportional representation. Random sampling would give every pupil an equal chance of being chosen, but the sample could end up containing all year 7 pupils, which would defeat my purpose of having mixed age ranges.
100/1283 × 100% ≈ 7.8%, which is roughly 8% of the pupils.
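The stratified sample sizes can be sketched in Python as follows. Only the total of 1283 pupils and the sample size of 100 come from my data; the individual year group sizes below are assumed values chosen to add up to 1283, and the function name `stratified_sizes` is only for illustration.

```python
def stratified_sizes(strata, sample_size):
    """Share the sample between the strata in proportion to their sizes."""
    total = sum(strata.values())
    return {
        name: round(count / total * sample_size)
        for name, count in strata.items()
    }


# Hypothetical year group sizes (assumed values that total 1283 pupils).
year_groups = {
    "Year 7": 282,
    "Year 8": 270,
    "Year 9": 261,
    "Year 10": 200,
    "Year 11": 170,
    "Year 12": 50,
    "Year 13": 50,
}

sizes = stratified_sizes(year_groups, 100)
for year, size in sizes.items():
    print(year, size)
print("Total sampled:", sum(sizes.values()))
```

Because each share is rounded separately, the rounded sizes will not always add up to exactly the sample size, so the total should be checked and adjusted by hand if necessary.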
Stratified Sampling
Systematic Random Sampling
I will take the sample number of pupils for each year group and divide it by 2 so that there is an equal number of boys and girls. Any odd numbers should be rounded to the nearest even number. For example, to select the pupils in year 7 I will divide the sample number by 2: 22/2 = 11. I will then use the first 11 girls and the first 11 boys in year 7, and I will do the same for all the year groups.
My next stage is to produce histograms. The height of each bar represents the frequency density, which is calculated by dividing the frequency by the class width.
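As a short illustration of this calculation, the Python sketch below works out the frequency density for each class. The class boundaries and frequencies are made-up values, not my actual data, and the function name is only for illustration.

```python
def frequency_density(classes):
    """Frequency density = frequency / class width, for each class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


# Hypothetical height classes: (lower bound in cm, upper bound in cm, frequency).
classes = [(140, 150, 5), (150, 160, 20), (160, 165, 15), (165, 180, 30)]

densities = frequency_density(classes)
for (lower, upper, freq), density in zip(classes, densities):
    print(f"{lower}-{upper} cm: frequency {freq}, frequency density {density}")
```

Using frequency density rather than frequency means that classes of unequal width can be compared fairly, because the area of each bar then represents the frequency.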
Histograms
Conclusion
From drawing histograms, I can conclude that my hypotheses have not been proved. The histograms suggest that boys are heavier than girls and that girls are taller than boys, which is the opposite of my predictions. To further my investigation I will estimate the mean to try to support my hypotheses. I am using the mean because it uses all the data values.
Estimating the Mean
Boys
Height: 16677.5/100 = 166.775 ≈ 167 cm
Weight: 5162.5/100 = 51.625 ≈ 52 kg
Girls
Height: 16175/100 = 161.75 ≈ 162 cm
Weight: 6114.5/100 = 61.145 ≈ 61 kg
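The estimated mean of grouped data is the total of (midpoint × frequency) divided by the total frequency. The Python sketch below shows the method on a small made-up table (not my actual frequency table) and then repeats the final division for boys' heights using the totals above. The function name `estimated_mean` is only for illustration.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data using the class midpoints."""
    total_fx = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    total_f = sum(freq for _, _, freq in classes)
    return total_fx / total_f


# Hypothetical grouped heights: (lower bound in cm, upper bound in cm, frequency).
example = [(150, 160, 2), (160, 170, 5), (170, 180, 3)]
example_mean = estimated_mean(example)
print("Estimated mean of the example table:", example_mean, "cm")

# Final step for boys' heights, using the totals from my working above.
boys_height_mean = 16677.5 / 100
print("Boys' mean height to the nearest cm:", round(boys_height_mean))
```

The result is only an estimate because every value in a class is treated as if it were equal to the class midpoint.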
Conclusion
From estimating the mean, I can see that both my hypotheses, boys are taller than girls and girls are heavier than boys, have been proved. I have rounded the means to the nearest whole number because that degree of accuracy is not needed.
I am now going to take my investigation a stage further by looking at the data spread. I will find the inter-quartile range by drawing a cumulative frequency curve, from which the quartiles and median can be found. With this data I can then draw box and whisker diagrams. These diagrams will allow easy comparisons of the data spread to be made.
Cumulative Frequency
Girls Weight
Girls Height
Boys Weight
Boys Height
I will now draw my cumulative frequency curves to find the lower and upper quartiles, the median, the range and the interquartile range. I will then draw box and whisker diagrams to allow easy comparisons.
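The quartiles read from a cumulative frequency curve can be approximated in Python by linear interpolation within each class, as sketched below. The grouped data is made up rather than taken from my tables, the function names are only for illustration, and values read from a hand-drawn curve may differ slightly from these.

```python
from itertools import accumulate


def cumulative_frequencies(classes):
    """Running total of the class frequencies."""
    return list(accumulate(freq for _, _, freq in classes))


def value_at(classes, position):
    """Estimate the data value at a given cumulative frequency position."""
    running = 0
    for lower, upper, freq in classes:
        if freq > 0 and running + freq >= position:
            # Interpolate linearly between the class boundaries.
            return lower + (position - running) / freq * (upper - lower)
        running += freq
    raise ValueError("position is larger than the total frequency")


# Hypothetical grouped heights: (lower bound in cm, upper bound in cm, frequency).
classes = [(140, 150, 10), (150, 160, 30), (160, 170, 40), (170, 180, 20)]
n = sum(freq for _, _, freq in classes)

cumulative = cumulative_frequencies(classes)
lower_quartile = value_at(classes, n / 4)
median = value_at(classes, n / 2)
upper_quartile = value_at(classes, 3 * n / 4)
interquartile_range = upper_quartile - lower_quartile

print("Cumulative frequencies:", cumulative)
print("Lower quartile:", lower_quartile, "Median:", median,
      "Upper quartile:", upper_quartile)
print("Interquartile range:", interquartile_range)
```

The lower quartile, median and upper quartile found this way are the three values needed for the box of a box and whisker diagram.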
Conclusion
From drawing my cumulative frequency graphs and box and whisker diagrams, I can see that my hypotheses have been proved: boys are taller than girls and girls are heavier than boys.
Standard Deviation
To take my investigation one final step forward, I am going to calculate the standard deviation of the boys' and girls' heights and weights. I will work out the ranges within 1 and within 2 standard deviations either side of the mean. The reason I am going to find the standard deviation is that knowing the spread of the data about the mean gives me a clearer idea of how tall or heavy a person is relative to the rest of the pupils of the same gender.
Standard deviation is the third measure of the spread of data, otherwise known as dispersion. It gives a more detailed picture of the way in which the data is dispersed about the mean as the centre of the distribution.
The formula for standard deviation is:
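A standard form of this formula is given below, assuming the population version in which the sum of squared deviations is divided by the number of values n. Here x is each data value and x̄ is the mean. For grouped data the equivalent form uses the class midpoints x and the frequencies f.

```latex
\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}
\qquad\text{or, for grouped data,}\qquad
\sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \bar{x}^2}
```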
Conclusion
From calculating the standard deviation I can see the spread of the data about the mean. This gives me a clearer idea of how tall or heavy a person is relative to the rest of the pupils of the same gender.
Many data sets produce a bell-shaped, symmetrical histogram. This is known as a symmetrical distribution. In a normal distribution the majority of the data is gathered about the mean, and there are usually a few extreme values. I worked out the range within 1 standard deviation either side of the mean: for boys' height the standard deviation was 11 cm, and 69% of the values were within this range. In a normal distribution about 68% of the values are expected to lie within this range. To further my investigation I also extended the range to 2 standard deviations either side of the mean: for boys' height, 100% of the values were within this range, compared with the expected figure of about 95%. This suggests that my data is close to a normal distribution.
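The check described above can be sketched in Python as follows. The heights are made-up values, not my sample, the population standard deviation (`statistics.pstdev`) is assumed, and the function name `share_within` is only for illustration.

```python
from statistics import mean, pstdev


def share_within(values, k):
    """Fraction of the values lying within k standard deviations of the mean."""
    centre = mean(values)
    spread = pstdev(values)  # population standard deviation (assumed version)
    inside = [v for v in values
              if centre - k * spread <= v <= centre + k * spread]
    return len(inside) / len(values)


# Hypothetical heights in cm, for illustration only.
heights = [150, 155, 160, 162, 165, 165, 168, 170, 175, 180]

within_one = share_within(heights, 1)
within_two = share_within(heights, 2)
print(f"Within 1 standard deviation: {within_one:.0%}")
print(f"Within 2 standard deviations: {within_two:.0%}")
```

These percentages can then be compared with the figures of about 68% and 95% expected for a normal distribution.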
Overall Conclusion
My investigation has been quite successful. I used several different methods to test my hypotheses.
When I drew the scatter diagrams to start my investigation, I found that they only slightly supported my hypothesis, so I needed to try different methods. My next attempt was to draw histograms, but these did not support my hypotheses. I then estimated the mean, because this uses all the data values and I hoped it would give me a better result, which it did. Both my hypotheses were proved: boys are taller than girls and girls are heavier than boys. Finally, to test my hypotheses again, I drew cumulative frequency graphs and box and whisker diagrams. Both of these again proved my hypotheses, so there is evidence that boys are taller than girls and girls are heavier than boys.
Limitations
The limitations of this investigation are that I used children from secondary school aged between 12 and 18. This is the age at which children are developing, and their heights and weights change at different speeds and times, so some pupils were more developed than others. I also only used a sample size of 100 girls and 100 boys.
Future improvements
My future improvements would be to use a much larger sample, made up not just of children but of people of all ages. This would give me a better and less biased conclusion to my hypotheses.