GIRLS
FEMALES
Averages
Since all the data is continuous, it makes more sense to use the modal class interval (the class interval that contains the most values) rather than the mode.
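As a rough sketch of how a modal class interval can be found, the Python below groups a list of heights into class intervals and picks the interval with the highest frequency. The heights listed are made-up illustrative values, not my sample data, and the function name and the 10cm class width are assumptions chosen for the example.

```python
from collections import Counter

def modal_class_interval(values, width=10):
    """Return (lower, upper, frequency) for the class interval holding the most values.

    Each interval is lower <= value < upper. If two intervals tie,
    the one with the smaller lower bound is returned.
    """
    # Map every value to the lower bound of the class interval it falls in.
    counts = Counter((int(v) // width) * width for v in values)
    # Highest frequency first; ties are broken by the smaller lower bound.
    lower, freq = min(counts.items(), key=lambda item: (-item[1], item[0]))
    return lower, lower + width, freq

# Illustrative heights in cm (hypothetical values, NOT the sample data).
example_heights = [152, 158, 161, 163, 165, 167, 169, 171, 174, 180]
print(modal_class_interval(example_heights))
```

For these illustrative values the interval 160 to 170cm holds five of the ten heights, so it is the modal class interval.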
Height:
Of the three measures of average, the boys' mean and median were higher than those of the girls, while the modal class interval was the same for both. However, the range for the boys was 0.4m compared with 0.42m for the girls, which means that the girls' heights were slightly more spread out. The evidence from the sample suggests that 11 out of 27 boys, or 41%, have a height between 160cm and 170cm, whilst 15 out of 33 girls, or 45%, have a height between 160cm and 170cm. The frequency polygons show that there are fewer boys than girls with heights below 140cm.
Weight:
All three measures of average were higher for the boys than for the girls, and the boys' range was 0.37kg compared with 0.28kg for the girls. The evidence from the sample suggests that 13 out of 27 boys, or 48%, have weights between 50kg and 60kg, whilst 14 out of 33 girls, or 42%, have weights between 40kg and 50kg. The frequency polygons show that there are more girls under 50kg than there are boys.
These conclusions are based on data for only 27 boys and 33 girls. Extending or repeating the whole exercise with a larger sample would help to confirm my results.
The reason I am using stem and leaf diagrams is that they make it easier for me to read off the median values.
Extending the investigation:
‘In general the heavier the person is the taller that person is likely to be’
This prediction can be tested with my original data (see pg. 2) and can be seen on the graph entitled "Scatter diagram for mixed population".
Based on the data and the graph, it is evident that there is a positive correlation between weight and height. This suggests that the heavier a person is, the taller they are likely to be.
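In this investigation the correlation is judged by eye from the scatter diagram. One way to put a number on it, shown here only as a sketch, is the product-moment (Pearson) correlation coefficient, which is close to +1 for a strong positive correlation and close to 0 for no correlation. The weights and heights below are made-up illustrative values, not my sample data, and the function name is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when all x or all y are equal")
    return sxy / math.sqrt(sxx * syy)

# Illustrative paired data (hypothetical values, NOT the sample data).
weights_kg = [40, 45, 50, 55, 60, 65]
heights_cm = [150, 158, 160, 166, 170, 175]
r = pearson_r(weights_kg, heights_cm)
print(round(r, 2))
```

A value near +1 for real data would support the statement that heavier people in the sample tend to be taller.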
Further experiments:
My next step will be to extend my line of inquiry and see how my investigation is affected by gender.
This is my hypothesis:
‘There will be a better correlation between weight and height if we consider boys and girls separately’
The evidence does not support my hypothesis, since there is a stronger positive correlation in the mixed population graph than in the separate graphs for boys and girls.
The line of best fit on each scatter diagram can be used to make predictions:
The lines of best fit on my diagrams predict that a girl who is 160cm tall would weigh 43kg, whereas a boy of the same height would weigh 41kg.
However, these predictions may not be reliable, because none of the girls weighs more than 65kg, whereas 6 out of 27, or 22%, of the boys do. So this would not be a fair reflection.
Equations of graphs:
If y represents height in cm and x represents weight in kg, the equations of the lines of best fit for our data set are:
Boys only: y = 0.75x + 128.16
Girls only: y = 0.29x + 147.11
Combined: y = x + 134
These equations can be used to make predictions of weight when you know height, or height when you know weight. For example, to predict the weight of a boy who is 170cm tall:
y = 0.75x + 128.16
x = (y - 128.16) / 0.75
If y = 170 then
x = (170 - 128.16) / 0.75 = 55.79 ≈ 56kg
Using the equations of my lines of best fit, I can predict that a boy who is 170cm tall will weigh about 56kg.
The line of best fit is a best estimate of the relationship between height and weight. There are exceptional values in my data (such as the girl who weighs 55kg and is 133cm tall) which fall outside the general trend. Rounding to the nearest whole number also makes my predictions less accurate.
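The rearrangement used for this prediction can be sketched in Python as follows. The gradient and intercept are the values from my boys-only line of best fit; the function and constant names are assumptions made for the example.

```python
def weight_from_height(height_cm, gradient, intercept):
    """Rearrange y = gradient * x + intercept to give x = (y - intercept) / gradient."""
    return (height_cm - intercept) / gradient

def height_from_weight(weight_kg, gradient, intercept):
    """Use the line of best fit directly: y = gradient * x + intercept."""
    return gradient * weight_kg + intercept

# Boys-only line of best fit from the investigation: y = 0.75x + 128.16
BOYS_GRADIENT = 0.75
BOYS_INTERCEPT = 128.16

predicted = weight_from_height(170, BOYS_GRADIENT, BOYS_INTERCEPT)
print(round(predicted, 2))  # 55.79
print(round(predicted))     # 56
```

Putting the predicted weight back into height_from_weight returns 170cm, which is a quick check that the rearrangement is correct.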
Cumulative Frequency graphs
I will use cumulative frequency, which is a very powerful tool for comparing different data sets.
This table shows the cumulative frequency for heights for boys, girls and for the mixed sample:
The graph for this table appears on the next page.
This table shows the cumulative frequency for weights for boys, girls and for the mixed sample:
The graph for this table appears two pages on.
The benefit of drawing cumulative frequency curves for a continuous variable like height is that you can easily read off the median, upper quartile and interquartile range.
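Reading a median or quartile off a cumulative frequency graph can be imitated with straight-line interpolation between the plotted points, as in the sketch below. The class boundaries and cumulative frequencies are made-up illustrative values, not my tables, and the function name is an assumption. The code also joins the points with straight lines, so its answers may differ slightly from readings taken from a smooth curve.

```python
def value_at_cumulative_frequency(upper_bounds, cum_freqs, lowest_bound, target):
    """Estimate the value at which the cumulative frequency reaches `target`.

    upper_bounds: upper class boundaries in increasing order
    cum_freqs:    cumulative frequency at each upper boundary
    lowest_bound: lower boundary of the first class (cumulative frequency 0)
    """
    if target <= 0:
        return lowest_bound
    prev_bound, prev_cf = lowest_bound, 0
    for bound, cf in zip(upper_bounds, cum_freqs):
        if target <= cf:
            # Fraction of the way through this class at which the target is reached.
            fraction = (target - prev_cf) / (cf - prev_cf)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cf = bound, cf
    raise ValueError("target is larger than the total frequency")

# Illustrative cumulative frequency table (hypothetical values, NOT the sample data).
upper_bounds = [150, 160, 170, 180, 190]   # heights in cm
cum_freqs = [2, 6, 12, 18, 20]             # 20 people in total
total = cum_freqs[-1]

lower_quartile = value_at_cumulative_frequency(upper_bounds, cum_freqs, 140, total / 4)
median = value_at_cumulative_frequency(upper_bounds, cum_freqs, 140, total / 2)
upper_quartile = value_at_cumulative_frequency(upper_bounds, cum_freqs, 140, 3 * total / 4)
print(lower_quartile, round(median, 1), upper_quartile, upper_quartile - lower_quartile)
```

The interquartile range is the upper quartile minus the lower quartile, which is the last value printed.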
You can also use cumulative frequency curves to predict the percentage of students who have a height within a given range. Suppose you wanted to find out the percentage of boys whose height was between 160cm and 180cm. The cumulative frequency curve tells us that 3 boys had heights up to 160cm and 18 boys had heights up to 180cm. This means that 18 - 3 = 15 boys had a height between 160cm and 180cm.
You can use this figure to estimate that 15/27, or 56%, of boys in the school will be between 160cm and 180cm tall.
You could also say that if you select a boy at random, the data suggests that the probability of him having a height between 160cm and 180cm is about 0.56.
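The subtraction and division used for this estimate can be sketched in Python as below. The three numbers (3, 18 and 27) are the ones read from my cumulative frequency curve for boys; the function name is an assumption made for the example.

```python
def proportion_between(cf_at_lower, cf_at_upper, total):
    """Count and proportion of values lying between two points on a cumulative frequency curve."""
    count = cf_at_upper - cf_at_lower
    return count, count / total

# Boys: 3 had heights up to 160cm, 18 had heights up to 180cm, out of 27 boys.
count, proportion = proportion_between(3, 18, 27)
print(count)                 # number of boys between 160cm and 180cm
print(round(proportion, 2))  # estimated probability
print(f"{proportion:.0%}")   # the same figure as a percentage
```

The proportion doubles as the estimated probability that a boy chosen at random has a height in that range.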
You could also use a Box and whisker diagram which shows the minimum and maximum values, the median, and the upper and lower quartiles. This provides a very clear comparison between the different data sets.
From the cumulative frequency graph I can say that the median height for boys is 170cm. The curve also tells us that 23 girls had a height less than 170cm.
So 10 out of 33 girls have a height greater than the median height for boys, which is 10/33 = 30%.
Whilst in general boys are taller than girls, we have evidence to suggest that 30% of the girls have a height greater than the median height of boys.
However, we have to take into account that this data comes from a random sample of only 60 people, so it may be unfair to base percentages on it. The more people and the more data there are, the more accurate and reliable the results will be.
Considering Age
Since age would definitely affect height and weight, it seems necessary to mention that:
When age is taken into consideration, the correlation between weight and height should be better than when age is not considered.
Summary of results and conclusion
There is a positive correlation between height and weight. In general the heavier you are the taller you are likely to be.
As data was collected for only 27 boys and 33 girls, it is hard to give a precise analysis of the scattering of the points on the diagrams. As it is, there are more points close to the line of best fit in the scatter diagram for boys than there are for girls. This suggests that the correlation for the boys' heights is better than that for the girls' heights, which also means that the boys' heights are more predictable.
The points on the separate scatter graphs for boys and girls are more dispersed than the points on the mixed population graph, which suggests that there is a better correlation when boys and girls are considered together.
Since the mixed population graph does not need a curved line of best fit, the overall relationship is approximately linear.
Estimates can be made either by reading off the graph or by using the equation of the line of best fit.
The median height of the boys is greater than that of the girls.
We would have had better results if more data had been collected, as this would make the results more accurate and reliable.