I added up the girls and boys in each grade group, and then I totalled each column and row.
I found that this sample is biased, as there are fewer girls (56) than boys (64). I shall therefore divide each grade group for the girls by 56 and each grade group for the boys by 64, and then multiply by the sample size of 30.
Girls
A: 12 ÷ 56 × 30 = 6.4 ≈ 6
B: 11 ÷ 56 × 30 = 5.9 ≈ 6
C: 8 ÷ 56 × 30 = 4.3 ≈ 4
D: 10 ÷ 56 × 30 = 5.4 ≈ 5
E: 15 ÷ 56 × 30 = 8.0 ≈ 8
∑ = 29
Boys
A: 14 ÷ 64 × 30 = 6.6 ≈ 7
B: 14 ÷ 64 × 30 = 6.6 ≈ 7
C: 13 ÷ 64 × 30 = 6.1 ≈ 6
D: 14 ÷ 64 × 30 = 6.6 ≈ 7
E: 9 ÷ 64 × 30 = 4.2 ≈ 4
∑ = 31
This is also a biased sample, as the totals for girls (29) and boys (31) are different. To get a fair sample I think I will have to increase the value from 30 to 32. If 32 gives a biased sample I will then continue up in 2s. I haven't chosen 31 because I speculate that, to get an even sample, the value will have to be an even number.
Girls
A: 12 ÷ 56 × 32 = 6.9 ≈ 7
B: 11 ÷ 56 × 32 = 6.3 ≈ 6
C: 8 ÷ 56 × 32 = 4.6 ≈ 5
D: 10 ÷ 56 × 32 = 5.7 ≈ 6
E: 15 ÷ 56 × 32 = 8.6 ≈ 9
∑ = 33
Boys
A: 14 ÷ 64 × 32 = 7
B: 14 ÷ 64 × 32 = 7
C: 13 ÷ 64 × 32 = 6.5 ≈ 7
D: 14 ÷ 64 × 32 = 7
E: 9 ÷ 64 × 32 = 4.5 ≈ 5
∑ = 33
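The rounding, and the search for a sample size that gives girls and boys the same total, can be checked with a short script. This is only a sketch: the grade counts are the ones used in the working above, but the function names `stratum_sizes` and `first_balanced_size` and the upper `limit` are made up for illustration, and halves are assumed to round up (6.5 becomes 7).

```python
# Grade-group counts taken from the working above.
GIRLS = {"A": 12, "B": 11, "C": 8, "D": 10, "E": 15}  # total 56
BOYS = {"A": 14, "B": 14, "C": 13, "D": 14, "E": 9}   # total 64


def stratum_sizes(counts, sample_size):
    """Share sample_size across the grade groups in proportion to their size.

    Integer arithmetic is used so that halves always round up
    (Python's built-in round() would turn 6.5 into 6).
    """
    total = sum(counts.values())
    return {
        grade: (2 * count * sample_size + total) // (2 * total)
        for grade, count in counts.items()
    }


def first_balanced_size(start=30, step=2, limit=100):
    """Go up in steps of 2 until girls and boys get the same rounded total."""
    for size in range(start, limit + 1, step):
        girls = sum(stratum_sizes(GIRLS, size).values())
        boys = sum(stratum_sizes(BOYS, size).values())
        if girls == boys:
            return size, girls
    return None  # nothing balanced up to the (hypothetical) limit


print(stratum_sizes(GIRLS, 30))  # group sizes add up to 29
print(stratum_sizes(BOYS, 30))   # group sizes add up to 31
print(first_balanced_size())     # (32, 33)
```

With a value of 30 the two totals differ, and 32 is the first value in the search where both groups come to 33.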
The numbers needed to gain a random but unbiased sample have been worked out above.
For each grade group I will put the students' numbers into a hat and pick out as many as are needed. This should give a random spread of results.
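Picking numbers out of a hat can be imitated on a computer. The sketch below assumes the 14 grade A boys are simply numbered 1 to 14 (the real student numbers are not given here) and draws 7 of them without replacement; `draw_from_hat` is a made-up helper name.

```python
import random


def draw_from_hat(student_numbers, how_many, seed=None):
    """Pick how_many different student numbers at random, like drawing from a hat."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return sorted(rng.sample(list(student_numbers), how_many))


# Hypothetical numbering: grade A boys are students 1 to 14, and 7 are needed.
boys_grade_a = range(1, 15)
picked = draw_from_hat(boys_grade_a, 7, seed=1)
print(picked)
```

Because `random.sample` never picks the same item twice, each student can appear in the sample only once, just as a slip of paper leaves the hat once it is drawn.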
My first hypothesis is “Boys who are good at maths will be good at science”. To validate this claim it would be wise to see if the data has any relationship (correlation). To find this out I will use a scatter graph, with a line of best fit if there is correlation.
To do this I will take the science and maths results for the boys, then put the maths values on the x-axis and the science values on the y-axis of the scatter graph.
I tallied up the science and maths results of the boys I chose at random, and the following table was formed. The table is needed to help make the scatter graph that will show whether the results have any correlation.
This graph shows that there is positive correlation between the boys' maths results and their science results, so boys who are good at maths tend to be good at science. The graph has a scale of 2 for every gridline. I also notice that most of the results fall between 60 and 80 marks.
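The strength of a correlation and the line of best fit can also be worked out numerically rather than judged by eye. The sketch below uses the standard product-moment correlation coefficient and a least-squares line; the marks in it are invented purely for illustration and are not the coursework data, and the function names are hypothetical.

```python
from math import sqrt


def _sums(xs, ys):
    """Return mean_x, mean_y and the sums Sxy, Sxx, Syy used by both formulas."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def correlation(xs, ys):
    """Correlation coefficient r: near +1 is strong positive, near 0 is none."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def line_of_best_fit(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented example marks (maths on the x-axis, science on the y-axis).
maths = [45, 52, 60, 66, 71, 78, 84]
science = [48, 50, 63, 64, 70, 75, 86]

print(correlation(maths, science))
print(line_of_best_fit(maths, science))
```

A value of r close to 1 would back up what the scatter graph shows, while a value near 0 would suggest that the points only look related by chance.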
To further confirm whether my first theory is accurate I will use the means of both the science and maths results. If the theory is valid then both means should be roughly the same.
So I will find the total of the marks in each subject and then divide each total by 33, as that is the number of students chosen.
2149÷33=62.12
=62
2054÷33=62.24
=62
As we can see, the means for science and maths for the year 8 boys are practically the same, only 0.12 apart. This further validates my hypothesis that boys who are good at maths are good at science.
I have noticed that there is definitely a positive correlation between the boys' maths results and their science results. I expected maybe a weak positive correlation but was surprised to see that there was indeed a strong correlation. The scatter graph, along with the means, validated my theory further. I was, however, shocked that there was only 0.12 between the two sample means. My initial problem was the amount of data present, but stratified random sampling solved this problem and gave a sample that was easy to work with. If I had more time I would like to have used either a larger sample or the whole data set to further authenticate my theory, as the sample chosen could have been an unusually high-scoring one.
My second hypothesis, “girls good at English will be bad at science”, is based upon the idea that English needs creative thinking while science needs logical thinking.
The problem I foresee is trying to find the best diagram to show if there is a relationship between English and science. If there is, I expect that it would have to be negatively correlated or skewed.
However, I feel the best way to see the distribution of the data is a box and whisker plot, although I will need to group the data for this. In addition, it will be important to find the means: if my theory is correct, the girls' mean for English should be higher than their mean for science.
From the table of results I shall construct another table. However, I will change the raw data into grouped data, because to draw a histogram we need grouped data. I will group the data in class intervals, starting with 41 to 50 and continuing in the same way until 90.
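Grouping raw marks into class intervals can be sketched as below. The sketch assumes the intervals carry on as 51 to 60, 61 to 70 and so on (only the first interval and the end point of 90 are stated above), and the marks in the example are invented for illustration.

```python
def class_interval(mark, start=41, width=10):
    """Return the (lowest, highest) marks of the interval containing mark."""
    index = (mark - start) // width
    low = start + index * width
    return low, low + width - 1


def grouped_frequencies(marks):
    """Count how many marks fall into each class interval."""
    counts = {}
    for mark in marks:
        interval = class_interval(mark)
        counts[interval] = counts.get(interval, 0) + 1
    return dict(sorted(counts.items()))


# Invented example marks, not the real table of results.
example_marks = [46, 52, 55, 63, 65, 72, 87]
for (low, high), frequency in grouped_frequencies(example_marks).items():
    print(f"{low}-{high}: {frequency}")
```

Each mark lands in exactly one interval, so the frequencies always add up to the number of marks that were grouped.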
I have decided against using a histogram after all, as it will not show in any detail whether there is a relationship between English and science. For this reason I feel that it would not be wise to continue drawing a histogram.
To construct the box plot I will need to find five pieces of information, namely the lowest value, the lower quartile, the median, the upper quartile and the highest value, for both of the tables.
The lowest value for girls in English is 46. To find the lower quartile we divide the number of values by 4, as the lower quartile lies a quarter of the way through the ordered data and marks the start of the middle 50%:
n ÷ 4, so 33 ÷ 4 = 8.25, i.e. the 8.25th value.
However, we have to order the data first. We then count up to the 8th and 9th values and mark on the table where the lower quartile (Q1) lies.
The 8.25th value lies between 51 and 55, a quarter of the way from 51. Since there are three values between them (52, 53 and 54), Q1 must be 52.
For the median we divide 33 by 2, which gives the 16.5th value.
From the table I can see that Q2, i.e. the median, lies between 62 and 65. If we take 63.5 as our median, it lies 1.5 marks from both the 16th and the 17th values. Therefore 63.5 is our median for the girls' English.
To work out Q3, the upper quartile, we multiply our 33 values by 3 and then divide by 4: 3 × 33 ÷ 4 = 99 ÷ 4 = 24.75.
We mark on our table the 24th value and the 25th value. If we go up in steps of 0.5 from 71, the third step is 72.5, which is our Q3.
Then we find the highest mark, which turns out to be 87 in English.
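The quartile working for English can be checked with a short sketch. It follows the same rule as the text (position = n × k ÷ 4, then move the fractional part of the way between the two neighbouring ordered values). Other textbooks use slightly different rules, such as (n + 1) ÷ 4, so this is one convention rather than the only one. The function names are made up for illustration.

```python
def quartile_position(n, k):
    """Position of the k-th quartile among n ordered values (n * k / 4 rule)."""
    return n * k / 4


def value_at_position(lower, upper, position):
    """Move the fractional part of position from lower towards upper."""
    fraction = position - int(position)
    return lower + fraction * (upper - lower)


n = 33
q1_position = quartile_position(n, 1)      # 8.25
median_position = quartile_position(n, 2)  # 16.5
q3_position = quartile_position(n, 3)      # 24.75

# Neighbouring ordered English marks quoted in the text.
print(value_at_position(51, 55, q1_position))      # 52.0
print(value_at_position(62, 65, median_position))  # 63.5
```

The same helper gives Q3 once the 24th and 25th ordered values are read from the table.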
For the science results the lowest value is found to be 36.
For Q1, Q2 and Q3 the positions in the science table are the same, although the values are different.
Q1 is 51.25, as this is a quarter of the way from 51 and three quarters away from 52.
Q2, the median, is 61.5, as it is halfway between 61 and 62.
Q3 is 71.5, as four steps of 0.5 added on to 70 equal 72 and Q3 is the third step.
The only value missing is the highest value, which is 80.
My scale for the box plot will go up in 10s, starting at 30 and ending at 90.
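The five values found for each subject can be collected side by side before drawing the box plot. The sketch below uses the values worked out above and also computes the interquartile range and the full range, which are extra measures of spread that the text itself does not calculate. The names `FiveNumber`, `interquartile_range` and `full_range` are made up for illustration.

```python
from collections import namedtuple

FiveNumber = namedtuple("FiveNumber", "lowest q1 median q3 highest")

# Values worked out above for the 33 sampled girls.
english = FiveNumber(lowest=46, q1=52, median=63.5, q3=72.5, highest=87)
science = FiveNumber(lowest=36, q1=51.25, median=61.5, q3=71.5, highest=80)


def interquartile_range(summary):
    """Width of the box: the spread of the middle 50% of the marks."""
    return summary.q3 - summary.q1


def full_range(summary):
    """Distance between the ends of the two whiskers."""
    return summary.highest - summary.lowest


for name, summary in (("English", english), ("Science", science)):
    print(name, "IQR:", interquartile_range(summary), "range:", full_range(summary))
# English IQR: 20.5 range: 41
# Science IQR: 20.25 range: 44
```

All of the values lie between 30 and 90, so they fit on the chosen scale.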
From the box plot I can easily see that girls in English have a higher median and Q3, as well as a higher highest value of 87. Girls in science had a lower highest value and a lower median and Q3, which suggests that girls good at English are indeed bad at science. However, to further confirm my theory I will find the means of both English and science for the girls. If my hypothesis is correct, the mean for English should be higher than the mean for science.
2076÷33=62.9
=63
2015÷33=61.1
=61
I have divided the sums for science and English by 33 and rounded to the nearest integer.
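The two divisions can be checked with a few lines of code. This is a minimal sketch using the two totals in the order they appear in the working above; the names `mean_mark`, `first_mean` and `second_mean` are made up for illustration.

```python
def mean_mark(total, number_of_students):
    """Mean = sum of the marks divided by how many students there are."""
    return total / number_of_students


N = 33  # number of girls in the sample

first_mean = mean_mark(2076, N)
second_mean = mean_mark(2015, N)

print(round(first_mean, 1), round(first_mean))    # 62.9 63
print(round(second_mean, 1), round(second_mean))  # 61.1 61
print(round(first_mean - second_mean, 1))         # 1.8
```

Before rounding to whole numbers the two means are about 1.8 marks apart.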
From the results I can see that, although there is only a slim difference between the means for English and science, there is still enough to substantiate my hypothesis that girls good at English will be bad at science.
From the secondary data of the year 8 Christmas exam results I have confirmed both my theories by using diagrams such as box plots and scatter graphs, and by finding the quartiles and the means. By using all of these I have accurately and easily proven my claims. However, because I used stratified random sampling, these results may be inaccurate, as the sample chosen could have been an unusually high-scoring one, which would mean it doesn't truly support my hypotheses.
However, if I had more time I would have liked to take the whole population of 120 year 8s and see if boys who are good at maths really are good at science. I would have taken the results for both maths and science and put them into a table, from which I could find the means and then make a scatter diagram. In addition to this I would also draw box plots from all 120 students' results in maths and science, to see more clearly whether my theory was valid or not.