Before I start my statistical investigation, I will need to collect the following data from the datasheet: the height to weight ratio of boys and girls in all year groups. I will be able to find this information in a datasheet provided by Edexcel, in which all the data from Mayfield High School are entered. I have chosen this source because it covers boys and girls across several year groups and therefore gives a wide variety of height to weight ratios. I believe this data is reliable because it comes from an actual school and is provided by Edexcel. However, I cannot be sure that this source is completely reliable because there could be either missing data or incorrect data.
For my coursework I will need to collect a sample of 10 percent of the population. I will ensure that my sample is fair by using stratified sampling, in which the population is divided into groups called strata and each stratum is then sampled at random. I have chosen to do this because it gives everyone at Mayfield High School an even chance of being chosen, and because the numbers in each part of the sample are then in the same proportion as the strata in the entire population.
To start off with, I will find how many people there are in 10 percent of 1183 by dividing 1183 by 100 and then multiplying by 10, which gives 118.3. Because I can’t investigate 0.3 of a person, I will round down to 118. To find how many boys and girls I will investigate in each year group, I will first take the number of boys in that year group, divide it by 1183 (the total number of pupils) and then multiply by 118 (the total number of pupils I will investigate in my coursework). I will do the same for the boys and girls in every year. Once I know how many boys and girls I will investigate in each year, I will use random sampling with the Ran# button on the calculator to select which of the boys and girls in each year group I will take the data of.
1183 ÷ 100 * 10 = 118.3 (I cannot investigate 0.3 of a person, so I will round down to 118)
Year 7 boys = 151 ÷ 1183 * 118 = 15.06 so 15 boys
Year 7 girls = 131 ÷ 1183 * 118 = 13.07 so 13 girls
Year 8 boys = 145 ÷ 1183 * 118 = 14.46 so 14 boys
Year 8 girls = 125 ÷ 1183 * 118 = 12.47 so 12 girls
Year 9 boys = 118 ÷ 1183 * 118 = 11.77 so 12 boys
Year 9 girls = 143 ÷ 1183 * 118 = 14.26 so 14 girls
Year 10 boys = 106 ÷ 1183 * 118 = 10.57 so 11 boys
Year 10 girls = 94 ÷ 1183 * 118 = 9.38 so 9 girls
Year 11 boys = 84 ÷ 1183 * 118 = 8.38, which I will round up to 9 boys so that the sample totals 118
Year 11 girls = 86 ÷ 1183 * 118 = 8.58 so 9 girls
Total number of people in sample = 118 people
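The same arithmetic could be checked with a short program. The sketch below is only an illustration in Python, which is not part of the coursework method; the names `population`, `proportional_shares` and `rounded` are made up for this example. It uses the pupil counts listed above. Rounding every share to the nearest whole number does not always add up to the intended total, so the sum is printed as a check and one stratum may need adjusting by hand.

```python
# Pupil counts for each stratum, copied from the calculations above.
population = {
    ("Year 7", "boys"): 151, ("Year 7", "girls"): 131,
    ("Year 8", "boys"): 145, ("Year 8", "girls"): 125,
    ("Year 9", "boys"): 118, ("Year 9", "girls"): 143,
    ("Year 10", "boys"): 106, ("Year 10", "girls"): 94,
    ("Year 11", "boys"): 84, ("Year 11", "girls"): 86,
}


def proportional_shares(counts, sample_size):
    """Return each stratum's unrounded share of the sample."""
    total = sum(counts.values())
    return {stratum: n / total * sample_size for stratum, n in counts.items()}


total_pupils = sum(population.values())   # 1183 pupils in the school
sample_size = total_pupils * 10 // 100    # 10 percent, rounded down to 118

shares = proportional_shares(population, sample_size)
rounded = {stratum: round(share) for stratum, share in shares.items()}

for (year, gender), share in shares.items():
    print(f"{year} {gender}: {share:.2f} so {rounded[(year, gender)]}")

# Check whether the rounded figures still add up to the intended sample size.
print("Target:", sample_size, "Sum of rounded shares:", sum(rounded.values()))
```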
Once I have taken my sample I will use it to compare how year group or gender affects the height to weight ratio of boys and girls in the school. Before I start my coursework I can foresee the following problems. There might be anomalies in the data, in which case I will re-sample. Also, when I am calculating how many people to sample I might not always get a whole number, so I will round up or down as appropriate.
Before I start my statistics coursework I will first take a pre-sample of about 30 people. This is to test whether my hypothesis is worth investigating or not. For this I will use stratified random sampling, as I have stated before. To do this I will first have to find out how many boys and girls I need to take the information of in each year. I will then use the Ran# button on the calculator to select which of the boys and girls in each year I will use. Here are the calculations I will do in order to get a sample size of 30 people.
Year 7 boys = 151 ÷ 1183 * 30 = 4
Year 7 girls = 131 ÷ 1183 * 30 = 3
Year 8 boys = 145 ÷ 1183 * 30 = 4
Year 8 girls = 125 ÷ 1183 * 30 = 3
Year 9 boys = 118 ÷ 1183 * 30 = 3
Year 9 girls = 143 ÷ 1183 * 30 = 4
Year 10 boys = 106 ÷ 1183 * 30 = 3
Year 10 girls = 94 ÷ 1183 * 30 = 2
Year 11 boys = 84 ÷ 1183 * 30 = 2
Year 11 girls = 86 ÷ 1183 * 30 = 2
Total of sample = 30 people
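The random selection inside each stratum could also be done by computer instead of with the calculator. The sketch below is a Python illustration only and rests on an assumption that is not stated in the datasheet: that the pupils in each stratum are numbered 1, 2, 3 and so on in the order they appear. The names `strata` and `pick_pupils` and the seed value are invented for the example. It draws the pre-sample sizes listed above without picking the same pupil twice.

```python
import random

# (number of pupils in the stratum, number to pick for the pre-sample),
# taken from the calculations above.
strata = {
    "Year 7 boys": (151, 4), "Year 7 girls": (131, 3),
    "Year 8 boys": (145, 4), "Year 8 girls": (125, 3),
    "Year 9 boys": (118, 3), "Year 9 girls": (143, 4),
    "Year 10 boys": (106, 3), "Year 10 girls": (94, 2),
    "Year 11 boys": (84, 2), "Year 11 girls": (86, 2),
}


def pick_pupils(stratum_size, how_many, rng):
    """Pick distinct pupil numbers from 1 to stratum_size at random.

    Assumes pupils are numbered in datasheet order (an assumption made
    for this sketch only).
    """
    return sorted(rng.sample(range(1, stratum_size + 1), how_many))


# A fixed seed makes the selection repeatable; any seed could be used.
rng = random.Random(2024)
selection = {name: pick_pupils(size, k, rng) for name, (size, k) in strata.items()}

for name, pupil_numbers in selection.items():
    print(name, pupil_numbers)

print("Total selected:", sum(len(p) for p in selection.values()))
```

Using `sample` rather than repeated single draws avoids choosing the same pupil twice, which on the calculator would mean pressing Ran# again whenever a number repeats.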
Once I have got my sample I will perform the following calculations: averages (mean, median and mode), standard deviation, correlation coefficient, and upper and lower quartiles. These calculations should be useful because they will let me see clearly the correlation between height and weight in the height to weight ratio. The calculations I have chosen will also allow me to compare the data later on. I will also present the information in the following diagrams: scatter graphs, cumulative frequency graphs, histograms and box plots. Together, my calculations and diagrams will show the correlation between height and weight, and will allow me to compare how year group or gender affects the height to weight ratio.
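As an illustration of these calculations, the sketch below works them out in Python. The heights and weights in it are invented purely to show the method and are not taken from the Mayfield datasheet; the names `summarise` and `pearson_r` are also made up for the example. Quartiles can be defined in slightly different ways, so the values from `statistics.quantiles` may differ a little from ones read off a cumulative frequency graph.

```python
import math
import statistics

# Invented example values, NOT real Mayfield data.
heights_m = [1.45, 1.52, 1.60, 1.63, 1.68, 1.72, 1.75, 1.80]
weights_kg = [38, 45, 47, 52, 50, 60, 63, 68]


def summarise(values):
    """Mean, median, standard deviation and quartiles of one list."""
    lower_q, _, upper_q = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "standard deviation": statistics.pstdev(values),
        "lower quartile": lower_q,
        "upper quartile": upper_q,
    }


def pearson_r(xs, ys):
    """Product-moment correlation coefficient between two lists."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print("Heights:", summarise(heights_m))
print("Weights:", summarise(weights_kg))
print("Correlation:", round(pearson_r(heights_m, weights_kg), 3))
```

A correlation coefficient close to 1 would suggest that taller pupils tend to be heavier, while a value close to 0 would suggest little or no linear relationship.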