Stratified Sampling
Stratified sampling ensures that a fair proportion of responses is sampled from each group. It involves dividing the data into separate categories (strata) and sampling from each category in proportion to its size. If I were to do this I wouldn’t have to put the data into categories, as it is already categorised.
Systematic Sampling
Systematic sampling is a method of choosing items at regular intervals from an unsorted list, e.g. every 20th item. This would be a good method to use as all my data is unsorted.
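As a rough sketch of the idea, the snippet below picks a random starting point and then takes every 20th record. The function name `systematic_sample` and the list of 1183 numbered records are assumptions made for illustration only.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Return every `interval`-th item, starting from a random position."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start within the first interval
    return items[start::interval]

# Hypothetical list standing in for 1183 unsorted student records.
records = list(range(1, 1184))
sample = systematic_sample(records, interval=20, seed=1)
print(len(sample), sample[:3])
```

Choosing the starting point at random stops the first record in the list from always being included.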
Cluster Sampling
Cluster sampling involves dividing the population into groups, or clusters. Clusters are then chosen by random sampling, and every item in each chosen cluster is looked at. This would be useful as the data is already in clusters.
Quota Sampling
Quota sampling gives instructions about how many members of each section of the population, defined by certain characteristics such as age, must be sampled. This method isn’t as good, as some people think that the sampling can be biased.
Convenience Sampling
In this method the most convenient sample is chosen, e.g. if you had to choose 10 people from a class you would choose the first 10 on the register. This method is good as it can be done quickly, although it can be extremely biased.
I think that the best sort of sampling would be stratified sampling. I say this because this form of sampling ensures that a fair proportion of responses comes from each group. It allows you to use a fair number of people from each year. Here is an example of stratified sampling at Mayfield High School. Say that you needed 200 students from a school where there are 1183 students. As I already have the number of boys and girls in each year, I can work it out like this:
(Number of boys in a year group × Number of people you need to choose) ÷ Total number of students in the school
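The calculation above can be sketched in a few lines of code. The year-group counts below are illustrative placeholders chosen to add up to 1183; they are assumptions for the example and are not taken from this investigation. The function name `stratified_allocation` is also made up for this sketch.

```python
# Illustrative year-group counts (assumed values that total 1183 students).
strata = {
    ("Year 7", "boys"): 151, ("Year 7", "girls"): 131,
    ("Year 8", "boys"): 145, ("Year 8", "girls"): 125,
    ("Year 9", "boys"): 118, ("Year 9", "girls"): 143,
    ("Year 10", "boys"): 106, ("Year 10", "girls"): 94,
    ("Year 11", "boys"): 84, ("Year 11", "girls"): 86,
}

def stratified_allocation(strata, sample_size):
    """Share out sample_size across the groups in proportion to their size."""
    total = sum(strata.values())
    return {
        group: round(count * sample_size / total)
        for group, count in strata.items()
    }

allocation = stratified_allocation(strata, sample_size=200)
for group, number_to_choose in allocation.items():
    print(group, number_to_choose)
```

Because each share is rounded to a whole student, the shares may not add up to exactly 200, so one group may need adjusting by a student.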
In this investigation I will not need to use sampling, as it has all been done in Excel on the computer, but if I did have to I would choose stratified sampling.
Plan
For my first hypothesis (I expect that there will be a positive correlation between weight and height. I predict this because usually as a student gets taller their weight also increases) I will need to construct scatter graphs. These suit this hypothesis because scatter graphs show whether two sets of data are related. I will calculate the correlation coefficient to measure the exact strength of the correlation. You use this formula to calculate the correlation coefficient:
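A common choice is Pearson’s product-moment correlation coefficient, r = Sxy ÷ √(Sxx × Syy). The sketch below assumes that this is the coefficient meant here. The function name `pearson_r` and the heights and weights are made-up illustrative values, not Mayfield data.

```python
import math

def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up example values.
heights = [1.50, 1.55, 1.60, 1.65, 1.70]  # metres
weights = [45, 44, 52, 50, 60]            # kilograms
r = pearson_r(heights, weights)
print(round(r, 2))
```

A value close to 1 means a strong positive correlation, a value close to 0 means little or no correlation, and a value close to -1 means a strong negative correlation.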
For my second hypothesis (I also predict that the older students will be taller and weigh more. I think this because usually the older you are the more likely you are to be taller and weigh more) I will need to construct box and whisker diagrams. I will then be able to compare the medians and interquartile ranges of the different ages. Box and whisker diagrams are good for comparing different sets of data side by side.
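The values that a box and whisker diagram is drawn from can be worked out as in the sketch below. The heights are made-up values for one hypothetical year group, and the function name `box_plot_summary` is an assumption for this example. There are several conventions for quartiles; `method="inclusive"` is just one of them, so other software may give slightly different answers.

```python
import statistics

def box_plot_summary(values):
    """Return the values needed to draw a box and whisker diagram."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "minimum": min(values),
        "lower quartile": q1,
        "median": median,
        "upper quartile": q3,
        "maximum": max(values),
        "interquartile range": q3 - q1,
    }

# Made-up heights in centimetres for one year group.
heights_cm = [150, 152, 155, 158, 160, 162, 165, 168, 170]
summary = box_plot_summary(heights_cm)
for name, value in summary.items():
    print(name, value)
```

Working out one summary per year group would allow the medians and interquartile ranges to be compared directly.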
For my third hypothesis (I also predict that the boys will be taller and weigh more than the girls. I predict this because boys usually weigh more than girls on average) I will need to construct histograms. Histograms are good because they show the shape and spread of the data, which makes it easier to compare the mean and the standard deviation of each group. It is also easy to see whether the data follows a normal distribution. You use this formula to find the standard deviation:
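One standard version is the population standard deviation: the square root of the mean of the squared differences from the mean. The sketch below assumes that this is the version meant here. The function name `population_sd` and the weights are made up for illustration.

```python
import math
import statistics

def population_sd(values):
    """Square root of the mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)

# Made-up weights in kilograms.
weights = [42, 44, 44, 44, 45, 45, 47, 49]
sd = population_sd(weights)
print(sd)

# The standard library gives the same answer for the population version.
print(statistics.pstdev(weights))
```

A larger standard deviation means the values are more spread out around the mean.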
However, I can simply use Autograph to calculate the standard deviation.
What I expect to find
Hypothesis 1- I expect to find a positive correlation between weight and height. I predict that as the students get taller their weight will increase.
Hypothesis 2- I expect to find that the weight and height of the older students will have higher averages than the younger ones. I also expect to find that the interquartile range will be bigger for the younger students.
Hypothesis 3- I expect to find that the boys will be taller and weigh more than the girls. I expect to find a normal distribution, as weight and height are the result of natural growth.
Diagrams
Hypothesis 1:
Hypothesis 2:
Hypothesis 3:
Interpretation and Evaluation
Hypothesis 1- From the scatter graphs that I drew for my first hypothesis I found that some of them showed a moderate correlation and others showed a weak correlation. I drew these conclusions from the correlation coefficients that I worked out while making the graphs.
I found a moderate correlation between overall height and weight for the different key stages. The correlation coefficient was 0.4, which is neither particularly high nor particularly low. I can see the positive correlation in the points on the graph, because as the weight gets higher the height usually gets higher too.
I also looked at the correlation between height and weight for the different ages and sexes. From this I found that the correlation between height and weight is higher for boys than for girls. This was true for both key stages. The correlation coefficient for KS3 girls was the weakest, at 0.2, which is very low. For KS4 girls the correlation was stronger, but it was still only 0.35. This is again low, so I conclude that the correlation between height and weight is low for girls. The boys’ correlation was stronger, at 0.4 for both key stages. This is similar to the overall correlation coefficient for all ages.
From these results I can see that usually as the students get taller they also weigh more. I have also found that this correlation is stronger for boys than for girls. I think this is because girls have a bigger variety of heights and weights, as girls are growing quicker than boys. This means that we are seeing some girls who are still very small and some who are already very tall. Also, I think that girls are more concerned about their weight, and they can control how heavy they are but not how tall they are. So this could mean that there are tall girls who don’t weigh much and short girls who weigh more.
As I have seen from these graphs, height and weight are correlated, but not strongly. Another variable that could be correlated with weight is age. I feel that if I had tested weight against age I might have found a stronger correlation, because in real life you usually find that the older you are the more you weigh. It would be very strange to find that a 10 year old weighed more than an 18 year old.
Through my investigation I have found that there is a moderate correlation between height and weight for the students at Mayfield High School. However, in my hypothesis I predicted that there would be a stronger correlation. So the evidence that I have found only partly backs up my hypothesis and doesn’t fully support what I predicted.
Hypothesis 2- For my second hypothesis I decided to draw box and whisker diagrams. This worked out really well because I could easily compare the averages and the interquartile ranges. I drew two box and whisker diagrams, one comparing height and the other comparing weight.
After looking at my first diagram (where I compared height for all five year groups) I found that the averages (medians) usually got higher as the students got older. I found that the average height for year 7s is quite a bit lower than for all the other years. I then found that the average height for the students in years 8 and 9 is the same. Finally I found that the average height for year 10s is a lot higher than for year 9s, and that the averages are the same for year 10s and year 11s.
Also from my diagram I could see that the lower quartiles are fairly similar for years 7, 8 and 9, and then there is a big jump in the lower quartiles for years 10 and 11. I can see from the diagram that the lower quartiles for year 9 and year 10 differ by 10 cm. I can also see that the lower quartile for year 10 is higher than the median for year 9. This shows that about 75% of students in year 10 are taller than the average year 9 student. It also shows that about 75% of students in year 10 are over 1.6 metres, while only about 50% of students in year 9 are over 1.6 metres.
This result is relevant to real life, as in year 7 the students usually haven’t started growing, which shows in the graph where the average height is smaller. Then in years 8 and 9 the students would be starting to grow, so their averages would be getting higher. Then in years 10 and 11 the students will be nearly fully grown, which explains the difference in the averages between year 9 and year 10.
After looking at my second diagram (where I compared weight for all five year groups) I found that the average gets higher every year from year 7 to year 9, although the change isn’t very large. There is then a big jump from year 9 to year 10, where the difference between the averages is 8 kg. Then, surprisingly, the average for year 11 students is lower than the average for year 10 students.
In year 11 there is a very big interquartile range, which is a lot bigger than in all the other years. This shows that in year 11 there is a bigger variety of weights than in the other years. Also, strangely, the lower quartile and the average are both lower than in year 10.
From year 7 to year 10 the average weight goes up every year. This relates to real life, as the students will be getting heavier as they grow. I would have thought this would continue through to year 11, but it didn’t. I think that this result would not be seen in national data. It may just be that this set of year 10 students is larger than usual or that this set of year 11 students is smaller than usual.
In my hypothesis I predicted that the weight and height would get bigger as the students got older. The evidence that I collected showed that my hypothesis was mainly correct. I could see from my diagram showing height that the averages never fell as the students got older. I could also see from my weight diagram that the averages went up until year 11. This was the one piece of evidence that went against my hypothesis; however, I think that this result would not be the same if we recorded similar data from different schools.
Hypothesis 3- For my third hypothesis I predicted that the boys would be taller and weigh more than the girls on average. To test this I decided to draw histograms. Histograms are good as I could compare the means and the standard deviations of the data.
I can compare the means and I can see that for height in KS3 the mean for boys is 1.59 m and for girls it is 1.57 m. This shows that in KS3 the boys are slightly taller than the girls. However, the difference in the means is small and not that significant. The mean for height in KS4 is 1.7 m for boys and 1.63 m for girls. This is a significant difference, which shows a real difference in height between boys and girls. The mean for weight in KS3 is 48 kg for boys and 49 kg for girls. This shows that there is almost no difference between weights for KS3 students. The girls are slightly heavier on average, which is a surprise as you would imagine that boys would weigh more, but the difference is not that significant. The mean for weight in KS4 is 59 kg for boys and 51 kg for girls. There is a difference between weights in KS4, which shows that on average boys are heavier. You can see from the means that, overall, boys are taller and weigh more than girls on average.
You can also compare the spread of the data by working out the standard deviation. The standard deviation for the weight of girls in KS3 is 8.3 and for boys it is 9.9. This shows that there is a large variety of weights for KS3 boys and also quite a bit of variety for girls. The standard deviation for the weight of girls in KS4 is 7.3 and for boys in KS4 it is 11.5. This shows that there is a lot of variety in weight for boys in KS4, whereas for girls the variety is quite small. Overall you can see that there is a bigger variety for boys than for girls in both age groups, especially in KS4.
You can also look at the shape of each graph to see if it fits its normal distribution curve. For the height of KS3 girls you can see that the graph is a reasonable shape but it tilts a bit to the right, where you can see that the modal bar is out of shape and doesn’t fit the normal distribution curve. You can see that the graphs for the height of KS3 boys and of KS4 girls and boys all have a good shape, with the bars all fitting the normal distribution curve.
As you can see from the graphs, boys are usually taller and weigh more. In real life you would predict that boys are usually taller and weigh more on average. However, in KS3 the boys are only slightly taller than the girls on average. I think that this also reflects real life, where girls usually grow quicker than boys. Girls develop earlier than boys; however, there are some boys who are already very tall. I think this is why the boys’ average is higher, because I chose to use the mean to calculate the average. Using the mean means that if there are a handful of students who are very tall they will pull the mean up. I also found from my graphs that there is a bigger variety in height among the boys. I think this is because boys are a lot more varied than girls: girls are usually around a certain height and you don’t find as many girls who are really tall or really small. With boys you find more who are either quite tall or quite small.
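The effect of a handful of tall students on the mean can be shown with a small sketch. The heights below are made-up values, not Mayfield data.

```python
import statistics

# Made-up heights in metres: most students are similar, two are very tall.
heights = [1.50, 1.52, 1.55, 1.56, 1.58, 1.85, 1.90]

mean_height = statistics.mean(heights)
median_height = statistics.median(heights)

print("mean:", round(mean_height, 2))
print("median:", median_height)
```

In this example the two tall students pull the mean above the median, which stays at the height of the middle student.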
Overall I feel that the data that I have collected from these graphs has backed up the hypothesis that I made. You can see from the graphs that on average the boys are taller and weigh more than the girls. However, the graphs do show that in KS3 there is not a substantial difference, and to draw a conclusion on height and weight in KS3 we would need to find more evidence from different schools.
Evaluation
After gathering all my evidence I feel that the hypotheses I made at the beginning were fairly correct. I feel that I have found enough evidence from Mayfield High School to back up all of my hypotheses. However, to say that all these hypotheses are definitely correct I would need more data, not just from an individual school but from a nationwide survey.
The data that I got from Mayfield High School has been very reliable and easy to use, although I found in my data validation that there were quite a lot of outliers that I had to get rid of. I feel that my data validation stage went quite well, as I got rid of all the outliers and it didn’t waste too much of my time. The data that I got from Mayfield High School was very detailed and had lots of information, so to start with I got rid of all the data that I didn’t need. This also helped as my page with all my data was easier to read.
I feel that the graphs I made were very good, as I was able to compare and investigate easily. These graphs helped me come to a conclusion about my hypotheses, and I created lots of different graphs to make sure that I covered everything that I was trying to investigate.
Overall I would say that this investigation was successful and that it went very well. However, if I were to do it again I would collect more data from more schools. This would help me come to a conclusion that I could see is definitely right. At the moment I have evidence that at this school most of my hypotheses are correct, but if I were to do a nationwide survey I am sure that I would find results that are a bit more varied.