Although the 40-50kg interval was the most common weight group for both male and female pupils, 11 out of 30 male pupils (37%) fell into this category whereas 13 out of 30 female pupils (43%) did, which is easily seen on my frequency polygon. I could not really use that to support my hypothesis in the way the other aspects do. My evidence shows that the average male pupil is 7.3kg heavier than the average female pupil, and also that the median weight for the male pupils is 3.1kg above that of the female pupils. My sample also suggests that the male pupils' weights were more spread out, with a range of 50kg rather than the 31kg shown by the female pupils' results. The difference in range is also shown on my frequency polygon, where the female pupils' weights are present in 5 group intervals whereas the male pupils' weights occurred in 6 of them.
Height
I am now going to use the height frequency tables to produce similar graphs and tables to those I have already produced for the weight. As height is continuous data, as mentioned already, I am going to use histograms to show both the boys' and the girls' heights. I am also going to make another hypothesis:
"In general the boys will be of a greater height than the girls."
Histogram of boys' heights
Histogram of girls' heights
As with the weight, I can see the obvious contrasts between the male and female heights, but the data is not presented in a practical way for a comparison, which is why I am going to present the two data sets on a frequency polygon.
This graph does support my hypothesis, as the male pupils' heights reach up to the 190-200cm interval whereas the female pupils' heights do not have data beyond the 170-180cm interval. Similarly, there were female pupils that fitted into the 120-130cm category, whereas the boys' heights started at 130-140cm. The data, in the format of a stem and leaf diagram, is shown below.
Stem and Leaf Diagram for Male Pupils' Heights (key: 1.3 | 2 means 1.32 m):
1.1 |
1.2 |
1.3 | 2
1.4 | 8
1.5 | 0 0 0 2 2 2 3 3 4 5 5 9
1.6 | 0 0 1 3 3 4 5 6
1.7 | 2 5
1.8 | 0 0 2 6
1.9 | 0 1
Stem and Leaf Diagram for Female Pupils' Heights (key: 1.4 | 2 means 1.42 m):
1.1 |
1.2 | 0
1.3 | 0
1.4 | 2 2 3
1.5 | 4 5 5 6 7 8 9 9
1.6 | 0 1 1 2 2 2 2 3 3 5 7 8 8
1.7 | 0 0 2 3
1.8 |
1.9 |
With these more detailed results, I can now see the exact frequency of each group and exactly which heights fell into each group, which cannot be told from the grouped graphs. For all I know, all of the points in the group 140 ≤ h < 150 could be at 140cm, which is why I feel it is sensible to see exactly what data points I am dealing with. I can also now work out the mean, median and range of the data.
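As a minimal sketch of how the mean, median and range can be worked out from the individual heights, the Python below uses the values read off the two stem and leaf diagrams above (in metres). The function name `summarise` and the rounding to 3 decimal places are illustrative assumptions, and the figures it prints need not agree exactly with the summary values quoted elsewhere in this project, which were taken from the original tables.

```python
from statistics import mean, median

# Heights in metres, read off the stem and leaf diagrams (30 pupils each).
male_heights = [
    1.32, 1.48,
    1.50, 1.50, 1.50, 1.52, 1.52, 1.52, 1.53, 1.53, 1.54, 1.55, 1.55, 1.59,
    1.60, 1.60, 1.61, 1.63, 1.63, 1.64, 1.65, 1.66,
    1.72, 1.75,
    1.80, 1.80, 1.82, 1.86,
    1.90, 1.91,
]
female_heights = [
    1.20, 1.30,
    1.42, 1.42, 1.43,
    1.54, 1.55, 1.55, 1.56, 1.57, 1.58, 1.59, 1.59,
    1.60, 1.61, 1.61, 1.62, 1.62, 1.62, 1.62, 1.63, 1.63, 1.65, 1.67, 1.68, 1.68,
    1.70, 1.70, 1.72, 1.73,
]


def summarise(heights):
    """Return (mean, median, range) of a list of heights, rounded to 3 d.p."""
    spread = max(heights) - min(heights)  # range = largest value - smallest value
    return round(mean(heights), 3), round(median(heights), 3), round(spread, 3)


for label, data in (("Male", male_heights), ("Female", female_heights)):
    avg, med, rng = summarise(data)
    print(f"{label}: mean {avg} m, median {med} m, range {rng} m")
```

With 30 values in each list, the median is the average of the 15th and 16th values once the list is sorted, which is what `statistics.median` computes.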
Unlike the results from my weight evidence, the modal groups for the male and female pupils' heights differ and, much to my surprise, the female pupils' modal class is in fact one group higher than the male pupils'. This is very visible on my frequency polygon, as the female pupils' data line reaches higher than that of the male pupils. This doesn't exactly undermine my hypothesis, however, as the modal class is only the group with the highest frequency; it does not show which gender has the greater height. On the other hand, the average height supports my prediction, as the male pupils' average height is 5.4cm above the female pupils'. The medians differed less for height than for weight, as there was only 0.2cm between the two, although it was the female pupils' median that was higher. When it comes to the range of results, as with the weight, the male pupils' range was wider than the female pupils', although the contrast was nowhere near as great, with a difference of 16cm between the two.
With all of the work I have done so far, my conclusions are only based on a random sample of 30 boys and 30 girls, so they are not necessarily 100% accurate, and therefore I will extend my sample later on in the project.
Before I go on to further my investigation, I feel that it is necessary for me to work out the quartiles and medians of both data sets from the grouped data, rather than from individual points as in my stem and leaf diagrams. To do this I am going to produce cumulative frequency graphs, as these are a very powerful tool when comparing grouped continuous data sets and will allow me to draw a further conclusion when comparing height and weight separately. I am also going to produce box and whisker diagrams for each data set on the same axis as the curves, as this allows me to find the median, the lower and upper quartiles and the interquartile range very simply.
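One way to estimate the median and quartiles from grouped data without drawing the curve is straight-line interpolation inside each class, which is the same as joining the cumulative frequency points with straight lines. The sketch below assumes this method (a hand-drawn smooth curve would give slightly different readings). The group frequencies are counted from the male stem and leaf diagram, and the function names are illustrative only.

```python
# Class intervals (cm) and frequencies for the male heights, counted from the
# stem and leaf diagram. Each tuple is (lower bound, upper bound, frequency).
male_height_groups = [
    (130, 140, 1), (140, 150, 1), (150, 160, 12), (160, 170, 8),
    (170, 180, 2), (180, 190, 4), (190, 200, 2),
]


def cumulative_frequencies(groups):
    """Return (upper bound, running total) pairs: the points plotted on the curve."""
    total, points = 0, []
    for _, upper, freq in groups:
        total += freq
        points.append((upper, total))
    return points


def estimate_value(groups, fraction):
    """Estimate the height below which `fraction` of the pupils fall, using
    straight-line interpolation inside the class that contains that pupil."""
    n = sum(freq for _, _, freq in groups)
    target = fraction * n  # e.g. 0.5 * 30 = the 15th pupil for the median
    below = 0              # pupils counted in the classes already passed
    for lower, upper, freq in groups:
        if freq > 0 and below + freq >= target:
            return lower + (target - below) / freq * (upper - lower)
        below += freq
    return groups[-1][1]


lower_quartile = estimate_value(male_height_groups, 0.25)
median_height = estimate_value(male_height_groups, 0.50)
upper_quartile = estimate_value(male_height_groups, 0.75)

print(cumulative_frequencies(male_height_groups))
print(f"LQ {lower_quartile:.1f} cm, median {median_height:.1f} cm, "
      f"UQ {upper_quartile:.1f} cm, IQR {upper_quartile - lower_quartile:.1f} cm")
```

The three estimates are the values a box and whisker diagram would mark as the edges and centre line of the box.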
I am firstly going to look at weight, and to produce the best comparison possible I am going to plot boys, girls and mixed population on one graph.
Cumulative frequency curves for weight
Box and Whisker Plots for Weight
All three of my curves clearly show the trend towards greater weights amongst boys and girls. From looking at my box and whisker plots and my cumulative frequency diagrams, I have obtained the following evidence:
These results continue to agree with my earlier prediction that the male pupils would be heavier than the female pupils. I can see this as the female pupils' median, lower quartile, upper quartile, interquartile range and mean are all lower than the male pupils'. The male pupils' weights are also shown to be more spread out, as their interquartile range is 5.3kg higher than the female pupils'.
Cumulative frequency for heights
Box and Whisker Plots for Height
These results also show the trend towards a greater height amongst the boys and girls. As with my weight diagrams, I have obtained the following evidence:
As with the weight results, these results continue to support my prediction that the male pupils would be of a greater height than the female pupils. This can be seen from the upper quartile and the mean, which in the female pupils' case are both smaller than the male pupils'.
From all of the graphs and tables I have produced so far, I can fairly confidently say that the male pupils' weights and heights are higher than the female pupils', but none of the evidence collected so far helps me reach a conclusion on my original hypothesis: "The taller the pupil, the heavier they will weigh."
Looking at my cumulative frequency graphs of height and weight, I could say that the two diagrams appear very similar, but I cannot establish any relationship between height and weight from them. I am now going to extend my investigation to see how height and weight are related, and the most effective way to do this is by producing scatter diagrams. I will plot male and female pupils on separate graphs, as I feel the results will produce a stronger correlation when done this way, and it also continues the style I have begun with. Using scatter diagrams allows me to compare the correlations of the two graphs, and the equations of the lines of best fit (the best estimate of the relationship between height and weight) for each gender.
Boys' Scatter diagram of height and weight
This graph shows a positive correlation between height and weight, and most of the data points seem to fit reasonably close to the line of best fit. There is one point, which I have labelled, that does not really fit in with the line of best fit. Such points are called anomalous or outlying points, meaning that they do not fit in with the trend of the results.
Girls' Scatter diagram of height and weight
This graph similarly shows a positive correlation, although the correlation is weaker than for the male pupils, as the points lie less closely around the line of best fit than on the male pupils' graph. The data points on this graph are quite closely bunched together in the middle, whereas on the male pupils' graph there is a wider range of results, which would agree with the conclusion made earlier that the male pupils' heights and weights have a larger range than the female pupils'. As both of my lines of best fit are completely straight, I would assume that the equation of each line would be in the form:
y = mx + c.
When y represents height in m and x represents weight in kg, the equations of the lines of best fit for my data set are as follows. (I obtained these equations from my graphs in "Autograph", as an exact result was available; if I were to find the results myself, I would do so by finding the gradients and looking at the points where the lines intercept the y axis.)
Male Pupils: y = 66.64x + 55.58
Female Pupils: y = 41.59x + 19.67
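For illustration, a gradient m and intercept c for a line y = mx + c can also be calculated directly from the data points. The sketch below assumes the line is a least-squares regression line; whether Autograph uses exactly this method has not been confirmed here. The function name and the data values are invented for the example and are not taken from the sample.

```python
def line_of_best_fit(xs, ys):
    """Least-squares gradient m and intercept c for the line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x  # the line always passes through (mean_x, mean_y)
    return m, c


# Invented data, for illustration only.
example_x = [1.0, 2.0, 3.0, 4.0, 5.0]
example_y = [2.1, 3.9, 6.2, 7.8, 10.0]
m, c = line_of_best_fit(example_x, example_y)
print(f"y = {m:.2f}x + {c:.2f}")
```

The intercept c is the value of y where the line crosses the y axis (x = 0), which matches the by-hand method of reading the gradient and the y-intercept off the graph.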
I have now reached a point in my investigation where my random sample of 30 boys and 30 girls is no longer enough. There have definitely been some clear conclusions made from my graphs and tables already, all of which have in fact fitted in with my predictions. However, my predictions are only based on general trends observed in my data, and in both the male and female samples there were individuals whose results did not fit in with the general trend. I cannot have complete confidence in my results so far, because this is only a random sample of 30 pupils of each gender and their age has not been considered, which I now feel is a necessary factor. I have spent a good amount of time considering the different genders, but now I am going to look at age differences. It is only common sense that age is going to affect height and weight, for you would expect a year 7 pupil to be smaller and lighter than a pupil in year 11. As Mayfield High School is a growing school there are more pupils in year 7 than in year 11, so my random sample was likely to contain more year 7 pupils than year 11 pupils - this is biased and unfair. To ensure that I obtain a data set that accurately represents the whole school, I am going to take a stratified sample. A stratified sample means that the number sampled from each group is in proportion to that group's size within the whole population, e.g. the pupils in year 8 within the whole school.
Below is a table showing the number of girls and boys in each year at Mayfield High School:
I have decided to continue with a sample of 60 pupils, 30 female and 30 male, as I feel from my random sample that this amount of data was easy to work with and produced sufficient results. I now have to work out how many male and female pupils I will need from each year to make sure that my sample is a good representation of the whole school. To do this, I must consider the genders separately, as there are 580 female pupils in the school and 603 male pupils. This is how I would work out the year 7 sample:
Take the total number of year 7 female pupils (131) and divide it by the total number of female pupils in the school (580):
131/580 = 0.22586207. I then have to multiply that number by 30, as that is the total number of female pupils' data I wish to obtain: 0.22586207 × 30 = 6.7758621. If I then round that number to the nearest whole number, it means that I need 7 girls from year 7 in my stratified sample.
These are the calculations performed to obtain my stratified sample numbers:
Year 7 - Girls - 131/580 = 0.22586207; 0.22586207 × 30 = 6.7758621 ≈ 7
Year 7 - Boys - 151/603 = 0.25041459; 0.25041459 × 30 = 7.5124377 ≈ 8
Year 8 - Girls - 125/580 = 0.21551724; 0.21551724 × 30 = 6.4655172 ≈ 6
Year 8 - Boys - 145/603 = 0.24046434; 0.24046434 × 30 = 7.2139302 ≈ 7
Year 9 - Girls - 143/580 = 0.24655172; 0.24655172 × 30 = 7.3965516 ≈ 7
Year 9 - Boys - 118/603 = 0.19568823; 0.19568823 × 30 = 5.8706469 ≈ 6
Year 10 - Girls - 94/580 = 0.16206897; 0.16206897 × 30 = 4.8620691 ≈ 5
Year 10 - Boys - 106/603 = 0.17578773; 0.17578773 × 30 = 5.2736319 ≈ 5
Year 11 - Girls - 86/580 = 0.14827586; 0.14827586 × 30 = 4.4482758 ≈ 4
Year 11 - Boys - 84/603 = 0.13930348; 0.13930348 × 30 = 4.1791044 ≈ 4
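The same calculation can be sketched in a few lines of Python, using the year group numbers and the gender totals quoted above. The function and variable names are illustrative assumptions, and rounding is to the nearest whole pupil.

```python
# Pupils in each year group, as used in the calculations above.
girls_by_year = {7: 131, 8: 125, 9: 143, 10: 94, 11: 86}
boys_by_year = {7: 151, 8: 145, 9: 118, 10: 106, 11: 84}


def stratified_counts(by_year, school_total, sample_size):
    """Pupils to sample from each year:
    (year total / school total) * sample size, rounded to the nearest whole pupil."""
    return {year: round(count / school_total * sample_size)
            for year, count in by_year.items()}


# 580 and 603 are the gender totals quoted in the text; 30 is the sample size.
girls_sample = stratified_counts(girls_by_year, 580, 30)
boys_sample = stratified_counts(boys_by_year, 603, 30)

print("Girls:", girls_sample, "total", sum(girls_sample.values()))
print("Boys: ", boys_sample, "total", sum(boys_sample.values()))
```

Because each year group is rounded separately, the rounded counts are not guaranteed to add up to exactly 30, which is why the sketch also prints the totals as a check.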
Although my new sample of 60 pupils is stratified, I am going to select the required number of male and female pupils from each year randomly, so that again no bias is shown.
Using my new stratified sample, I have produced a scatter graph for each year group and each gender, i.e. a boys' and a girls' scatter graph for each of years 7, 8, 9, 10 and 11. I am going to maintain the same hypothesis of "the greater the height, the greater the weight", but I can also comment on whether the older the pupil, the greater the height or weight.
Year 7
Male Pupils
Female Pupils
For the year 7 graphs, the lines of best fit appear to have similar gradients, although the male pupils' line begins at a higher point on the y axis than the female pupils', which would suggest that the male pupils were taller. The male pupils' points appear more bunched up and closer to the line of best fit, whereas the female pupils' points are more spread out but are still situated quite close to the line. Both lines have a positive correlation, which would agree with the idea that the taller the person, the heavier they weigh.
Year 8
Male Pupils
Female Pupils
Unlike the year 7 graphs, the lines of best fit here have quite different gradients. These graphs show that the male pupils in year 8 follow a strong pattern of the taller you are the heavier you weigh, shown by the positive correlation of the line. However, the girls' graph differs and has only a very slight correlation, which could be for many reasons, one being that girls watch their weight slightly more. The points on both of these graphs are more sparsely distributed around the lines of best fit, whereas the year 7 points were more closely grouped together. This could be because your body starts to change in many different ways as you grow older.
Year 9
Male Pupils
Female Pupils
The year 9 graphs show a greater contrast, although in a similar pattern to the year 8 ones. The male pupils show an even steeper positive correlation, showing that the heavier you weigh the taller you are, and the girls' line of best fit is positioned almost horizontally across the page. Both of these graphs have points positioned very close to the line of best fit, although that could just be coincidence.
Year 10
Male Pupils
Female Pupils
The year 10 graphs show a complete change in both cases, as the male pupils' graph has a practically horizontal line of best fit and the female pupils' graph has a fairly steep gradient. The female pupils' result could be explained by female pupils caring about their appearance more, but the male pupils' change I cannot explain. It could just be chance, as there are only 5 points on the graph anyway, which is a small percentage of all the year 10 male pupils.
Year 11
Male Pupils
Female Pupils
The small number of data points on these graphs is barely enough for me to draw a conclusion. However, the male pupils' graph shows a negative correlation, whereas the female pupils' graph keeps the year 10 pattern and now shows a fairly strong positive correlation, which would suggest that the taller you are the more you weigh.
These graphs have given me some points to consider, one being why the female pupils' graphs tend not to follow "the taller you are the heavier you weigh" as the age increases. I have come to the conclusion that as females reach adolescence and start developing, they become more aware of their appearance and therefore try to watch their weight a bit more. However, I only had a small stratified sample to represent the whole school, so it would not be an accurate source of information from which to draw a reliable conclusion. I did, however, produce this table from all of my data points to see whether a further pattern occurred:
The only part of the table that I can draw a conclusion from is the Spearman's Rank Correlation Coefficient, because the closer the value is to 1, the more evidence I have to support my hypothesis, as 1 indicates a perfect positive correlation. A negative number goes against my hypothesis, as the correlation will be negative.
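As a hedged sketch of how a Spearman's rank correlation coefficient can be calculated, the Python below ranks each variable and applies the usual formula 1 - 6Σd²/(n(n² - 1)), where d is the difference between a pupil's two ranks. The formula is exact only when there are no tied values; with ties it is an approximation. The helper names and the example data are invented for illustration and are not the values from the table.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards, giving tied values the
    average of the ranks they share."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rank correlation: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Invented heights (m) and weights (kg), for illustration only.
heights = [1.50, 1.55, 1.62, 1.68, 1.75]
weights = [45, 52, 48, 60, 63]
print(f"Spearman's rank = {spearman(heights, weights):.2f}")
```

A result close to 1 means the tallest pupils also tend to be the heaviest, a result close to -1 means the opposite ordering, and a result near 0 means there is little relationship between the two rankings.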
Seeing as I didn't have a big enough sample to make any meaningful statements about the data, I have decided to further my investigation again and look in more detail at the individual year groups to see whether I can draw a better conclusion from these results. I have looked at each year group in more detail; however, I am only going to write up an example using the year 9 female pupils, as I do not feel it is necessary for me to show each one in full. I am going to extract a random sample which will add up to 10% of the total pupils in year 9. As there are 143 female pupils in year 9 at Mayfield High School, I will need a total of 14 pupils; this is the random sample I extracted:
I can create a brief summary of the heights and weights in a table, as I have done with the majority of my other samples although I also used graphs with these.
Although I have considered the range of results, there is another measure of spread which I have not yet considered in my project: standard deviation. This is a measure of the scatter of the values about the mean value, and it can also be thought of as a measure of consistency. Standard deviation uses the square of the deviation from the mean, so the bigger the standard deviation, the more spread out the data is. I am firstly going to work out the standard deviation of the year 9 female pupils' heights, where 'x' represents the heights. To find the standard deviation using the equation, we need to first work out the mean value, then square each value and add up these squares. I have put my values in a table, as it is then easier to keep track of them.
I am also going to work out the standard deviation of the female pupils' weights. I am going to use the same method, the only difference being that 'x' now represents weight rather than height.
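The steps described above (find the mean, square each value, add up the squares) can be sketched as follows. This assumes the formula √(Σx²/n - mean²), which divides by n rather than n - 1; the write-up does not confirm which version was used. The heights in the example are invented and are not the year 9 sample.

```python
from math import sqrt


def standard_deviation(values):
    """Standard deviation via sqrt(sum(x^2)/n - mean^2), dividing by n."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum(x * x for x in values)  # square each value, then add up
    variance = sum_of_squares / n - mean ** 2
    return sqrt(max(variance, 0.0))  # guard against a tiny negative rounding error


# Invented heights in cm, for illustration only.
example_heights = [150, 155, 158, 160, 162, 165, 170]
print(f"mean = {sum(example_heights) / len(example_heights):.1f} cm, "
      f"standard deviation = {standard_deviation(example_heights):.1f} cm")
```

The same function works for weights by passing in a list of weights instead, since only the meaning of 'x' changes.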
These are the results I obtained for the standard deviation for each year group – male and female pupils separately;
Looking at the mean averages for each gender separately, the male pupils' height and weight increase as the age increases. The biggest increase in their height was from year 7 to year 8, where the average height increased by 9cm, from 154cm up to 163cm. The weight increases appear slightly more even, although the biggest jump is 7kg from year 9 to year 10. The female pupils' results generally appear to increase as their age increases, although there is one exception in the weight section. Their heights increase most rapidly from year 7 to year 9, where the height increases by 15cm on average (147cm-162cm), although from year 9 to year 11 there is only a very small increase of 2cm, a centimetre for each year. This could be because female pupils develop earlier than male pupils and therefore grow faster when they are younger and slow down when they become older. However, when looking at the weight there is one decrease in the average weight as the age increases: from year 9 to year 10 there is a decrease of 3kg, from 54kg to 51kg. This could be because this is the prime age when female pupils start to become far more concerned about their appearance and therefore watch their weight. Despite this one exception, these results would agree with my hypothesis made earlier that the older you are the heavier/taller you will be. When looking at the male and female pupils' results together, in every case apart from one the male pupils' average height/weight is higher than that of the female pupils. The only point that undermines this pattern is the weight of the year 9 pupils, where the female pupils' average weight is 2kg more than the male pupils'.
Looking at the standard deviation, the year 11 pupils' heights on the whole show the highest level of consistency, with an equal 12cm deviation for both genders. For the weight values, the male pupils in year 11 had a deviation of 12kg, whereas the female pupils' weights proved to be more consistent, with a deviation of 9kg. In general, the male pupils' standard deviation for weight is higher than the female pupils', by an average of 3.5kg. The only year which differs from this is again the year 9 group, where the female pupils' standard deviation is 4kg above the male pupils'; this could be related to the female pupils' average weight also being higher than the male pupils' in this year. I can now say that the female pupils' weights are in general more consistent and therefore have a smaller measure of spread.
The standard deviation of the heights does not show much of a pattern; however, in years 8, 9 and 10 the standard deviation is higher for the male pupils than for the female pupils, by an average of 10cm. This large difference could be because of the unusually high standard deviation of 31cm for the year 10 male pupils, whereas the female pupils only had 11cm. This high value means that the heights of the male pupils in this year group are quite irregular and there is a large spread. I cannot see a reason for this; however, I have to keep in mind that this is only a 10% sample of the whole year group, so it could be that the values selected just happened to cover a large range of heights. When looking to see if there was any pattern in the standard deviations as the age changes, the female pupils showed a slight pattern. From year 7 to year 11 the standard deviation values were 15cm, 13cm, 13cm, 11cm and 12cm, which shows a general decrease as the pupils grow older, despite the one centimetre increase from year 10 to year 11. All of these values are very close to each other (within 4cm of one another), whereas the male pupils' values differ more, with 12cm, 19cm, 17cm, 31cm and 12cm (years 7-11). The only conclusion I can draw from this is that the female pupils' heights are overall far more consistent than the male pupils', and it could be that as the female pupils increase in age the standard deviation becomes smaller (i.e. more consistent) and the data points become less spread out.
Throughout this project I have made many hypotheses, including:
- The heavier the pupil, the taller they will be
- In general male pupils will weigh more than female pupils
- In general male pupils will be of a greater height than female pupils
- The older the pupil the greater the height/weight
I have addressed all of these predictions throughout the project with either graphs or text, and it has been shown that all of the hypotheses I made were in general correct. There have been some slight points which undermine the predictions, but overall they have been successful. My original task was to compare height and weight, although I have not only considered height and weight but also included other influencing factors such as gender and age.
When considering age as a factor, I produced a stratified sample to try to create a suitable representation of the school on a smaller scale. Using the data from this stratified sample, my results showed that in general the older you are the heavier/taller you are; however, there was a group of pupils in year 9 which undermined this prediction. These results are, however, not 100% reliable, as there was only a very small amount of data for each year group and gender.
As well as considering the age factor, I also spent a great deal of time looking at the different genders to see whether that affected the height and weight of pupils at all. When looking at this I produced histograms, frequency polygons, cumulative frequency graphs, box and whisker diagrams, stem and leaf diagrams and scatter diagrams. The overall conclusion was that boys in general are of greater height and weight, mainly shown by their mean values, which were higher than those of the girls.
However, all of these hypotheses were part of my main prediction, "The taller the pupil the heavier they will weigh", and from answering all of these other predictions I can confidently say that it is true. I have come to this conclusion based on all of the graphs, diagrams, tables and statements made. On the other hand, there were cases where certain data undermined this prediction, but that could have been because of the small samples I had chosen to use. When producing the random sample of 60, I felt that was a satisfactory amount to work with, as analysing and producing graphs from this data was simple and done efficiently. However, when it came to the stratified sample, where I was looking at the different age groups, again using a sample of 60 to try to represent the school on a smaller scale, I do not feel it was as successful. If I were to repeat or further this investigation, I would definitely use a larger number of pupils for the stratified sample, as when the numbers of school pupils were put on a smaller scale I ended up in some cases with very few points, such as a scatter graph with only 4 data points for the year 11 students. To obtain accurate results from this method of sampling, I feel it is necessary to use a sample of at least 100. In addition to the stratified work, if I had a larger sample I would also produce additional graphs, i.e. cumulative frequency curves and box and whisker diagrams, as I feel that I could draw a better result from these; I felt the scatter diagrams I produced were rather pointless.
I feel my overall strategy for handling the investigation was satisfactory, although if I had given myself more time to plan what I was going to do I think I would have come up with a better method and possibly a more successful project. One of the positive points about my strategy is that, because I used a range of samples, I was not using the same students' data throughout; I instead used a range of data, therefore maintaining a better representation of Mayfield school on the whole. There is definitely room for improvement in my investigation: if I were to do it again I would spend a lot more time planning what I was going to do instead of starting the investigation in a hurry. Despite that, I feel my investigation was successful, as it did allow me to draw conclusions and summaries from the data used.