Handling Data - Mayfield High School
GCSE Mathematics Coursework
Key Stage 2 results at Mayfield School
The Plan
I am investigating the factors that affect Key Stage 2 results. Factors that may affect Key Stage 2 results are:
IQ: If someone is intelligent they are more likely to get good Key Stage 2 results.
Primary school: The standard of education varies between primary schools, so pupils' understanding of their subjects is likely to be affected by which primary school they attended.
Age: The older pupils in a year, those whose birthdays are in September, may have a better understanding of their subjects because their brains are more fully developed. The youngest pupils in a year, those whose birthdays are in August, will be at more of a disadvantage in their Key Stage 2 results as their minds will not be as developed.
Year group: New teaching methods, such as the Numeracy Hour, have been introduced since the older years, such as years 10 and 11, were at primary school. The younger years, for example years 7 and 8, may have benefited from these methods, understand their subjects better and therefore be more likely to get better Key Stage 2 results.
Gender: It is said that in general girls cope better in exam conditions than boys and so are more likely to get better Key Stage 2 results.
I cannot investigate how different primary schools and how new teaching methods, such as the Numeracy Hour, affect the students' Key Stage 2 results as I do not have the data to support this.
I am going to investigate how IQ and gender affect Key Stage 2 results.
Hypothesis 1: There is a strong positive correlation between IQ and Key Stage 2 results.
Hypothesis 2: (a) IQ of boys and girls is normally distributed, in line with national figures.
(b) Gender makes no difference to IQ.
Hypothesis 3: Key stage 2 results are not influenced by gender.
The data I have is secondary data, collected by a school I do not know, so I cannot be sure whether it is reliable. I am not going to use any obviously anomalous data.
Hypothesis 1
I am going to take a stratified sample of the whole school to find out if there is a strong positive correlation between IQ and Key Stage 2 results. I am using a stratified sample so that it fairly represents the data, as there are different numbers of students in each year. I am using a sample of 50 students because it is large enough for the data to be reliable but small enough to be manageable. Fifty points will also be enough to make a clear scatter graph showing whether there is strong correlation or not. Using more data would take more time without making my results noticeably stronger. The table below shows the numbers of students of each year and gender.
Year   Number of boys   Number of girls   Total number of students in year
7      151              131               282
8      145              125               270
9      118              142               260
10     106              94                200
11     84               86                170
Total number of students at school: 1182
The next table shows how many students I will choose from each category in my stratified sample and how I worked this out.
Number I will use in sample = (number in category / total number of students) x 50

Year   Number of boys   Boys in sample        Number of girls   Girls in sample
7      151              (151/1182)*50 = 6     131               (131/1182)*50 = 6
8      145              (145/1182)*50 = 6     125               (125/1182)*50 = 5
9      118              (118/1182)*50 = 5     142               (142/1182)*50 = 6
10     106              (106/1182)*50 = 4     94                (94/1182)*50 = 4
11     84               (84/1182)*50 = 4      86                (86/1182)*50 = 4
Total number of students in sample: 50
Within each category the sample will be random, so every student in a category has an equal chance of being picked and the sample fairly represents the data. To get my random sample I shall use the random number function on my calculator.
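The allocation above can be reproduced with a short Python sketch. The pupil counts come from the calculations in the table; the selection step is only an illustration, with Python's random module standing in for the calculator's random number function, and the numbering of pupils from 1 to n within each category is an assumption.

```python
import random

# Pupil counts by (year, gender), as used in the calculations above.
counts = {
    (7, "boys"): 151, (7, "girls"): 131,
    (8, "boys"): 145, (8, "girls"): 125,
    (9, "boys"): 118, (9, "girls"): 142,
    (10, "boys"): 106, (10, "girls"): 94,
    (11, "boys"): 84, (11, "girls"): 86,
}
SAMPLE_SIZE = 50
total = sum(counts.values())  # number of pupils in the whole school

# Proportional allocation: (pupils in category / pupils in school) * 50, rounded.
allocation = {cat: round(n / total * SAMPLE_SIZE) for cat, n in counts.items()}

# Hypothetical selection step: number the pupils in each category 1..n and
# draw the allotted amount without repeats (stands in for the calculator).
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
chosen = {cat: sorted(rng.sample(range(1, n + 1), allocation[cat]))
          for cat, n in counts.items()}

for cat in counts:
    print(cat, allocation[cat], chosen[cat])
```

With these counts the rounded allocations happen to add up to exactly 50; in general rounding can leave the total one or two out, and a category would then need adjusting by hand.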
Once I have my stratified sample I will plot a scatter graph showing Key Stage 2 results (y axis) against IQ (x axis). I will divide the graph into quadrants which will help me to decide whether there is strong positive correlation between IQ and Key Stage 2 results or not. If there is strong positive correlation I am going to draw a line of best fit on this graph and I will find its equation, which will further show the relationship between IQ and Key Stage 2 results.
Hypothesis 2a
To investigate whether the IQ of boys and girls is normally distributed, in line with national figures, I am going to collect a different stratified sample of 50 girls and 50 boys from year 8. I will collect my sample using the random number function on my calculator.
My main reason for testing this hypothesis is to check the validity of my results. I know from the internet that across Britain IQ is normally distributed with mean 100 and standard deviation 15. If I find that this is also true for year 8 pupils at Mayfield then this will be good evidence that any results I have from my hypotheses are likely to be true across Britain and not just at this school. If it is not true, I might have to limit my conclusions to Mayfield. For example, I might have to change hypothesis 1 to "At Mayfield School there is strong positive correlation between IQ and Key Stage 2 results".
I shall group the IQ of boys and draw a histogram. I shall need to draw a histogram because the class-widths will not be equal. I will then join the midpoints of each class-width on the histogram with a curve. If the curve is more or less symmetrical, 'bell-shaped', the data is normally distributed.
I shall calculate the mean and standard deviation for boys' IQ, and then I will find whether the IQ of boys is normally distributed by finding if 68% of the data lies within 1 standard deviation of the mean, and 95% of the data lies within 2 standard deviations of the mean.
I shall then repeat this for the girls' IQ and decide whether they have a normal distribution in the same way.
I shall then check my results to see whether they are in line with national figures.
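The 68% and 95% figures used in this plan come from the normal distribution itself. As a rough check, the sketch below uses Python's statistics.NormalDist with the national figures quoted above (mean 100, standard deviation 15); the helper name share_within is made up for this example.

```python
from statistics import NormalDist

# National IQ figures quoted in the text: mean 100, standard deviation 15.
national = NormalDist(mu=100, sigma=15)

def share_within(dist, k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

within_1sd = share_within(national, 1)  # IQ between 85 and 115
within_2sd = share_within(national, 2)  # IQ between 70 and 130
print(f"{within_1sd:.1%} within 1 SD, {within_2sd:.1%} within 2 SD")
```

A sample of 50 will not match these proportions exactly, so the comparison can only show whether the data is roughly normal.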
Hypothesis 2b
Knowing the means and standard deviations for the boys' and girls' IQ, and having two histograms should be enough for me to decide whether gender makes a difference to IQ or not.
I may also decide to draw cumulative frequency diagrams, find the medians, lower and upper quartiles, interquartile ranges, and draw box plots to provide more evidence to support my conclusion.
Hypothesis 3
How I test this hypothesis very much depends on the outcome of the other hypotheses.
I think it is likely I will be able to provide good evidence that there is strong positive correlation between IQ and Key Stage 2 results because it is commonly known that generally the more intelligent someone is the better the grades they get.
If I provide evidence to support hypotheses 1 and 2, it would mean that there is a strong positive correlation between IQ and Key Stage 2 results and that gender makes no difference to IQ. Together these would suggest that gender makes no difference to Key Stage 2 results, which would support hypothesis 3. To provide further evidence I would use the same sample I used in Hypothesis 1 but draw two separate scatter graphs, one for boys and one for girls. I would then compare the graphs and decide whether there is any evidence to support my hypothesis that gender makes no difference to Key Stage 2 results.
If I provide evidence to support hypothesis 1 but find that gender does make a difference to IQ, that would suggest gender also makes a difference to Key Stage 2 results, and so I would have evidence that both Hypotheses 2b and 3 are false.
I may have to change my plan depending on the outcome of Hypothesis 1 and 2.
Hypothesis 1
Stratified sample
                      Key Stage 2 results
No.  Pupil no.  IQ    English  Maths  Science  Average
Year 7 boys
1    4          100   5        4      5        5
2    8          72    2        3      3        3
3    20         102   3        3      5        4
4    36         85    3        3      3        3
5    39         110   5        5      5        5
6    11         100   4        4      4        4
Year 7 girls
7    19         95    3        3      4        3
8    14         100   4        4      4        4
9    2          107   5        5      5        5
10   5          101   5        4      4        4
11   1          103   4        4      5        4
12   84         107   5        5      5        5
Year 8 boys
13   34         102   4        4      4        4
14   60         114   5        5      5        5
15   83         99    5        4      4        4
16   8          97    4        4      4        4
17   25         87    3        3      3        3
18   22         109   5        5      5        5
Year 8 girls
19   46         110   5        4      5        5
20   49         103   4        4      5        4
21   6          100   4        4      5        4
22   4          84    3        3      3        3
23   65         104   4        5      5        5
Year 9 boys
24   10         92    4        3      3        3
25   63         108   4        5      5        5
26   45         102   5        4      4        4
27   94         88    3        3      3        3
28   30         92    4        3      3        3
Year 9 girls
29   25         89    4        3      3        3
30   37         108   4        4      5        4
31   41         97    4        4      4        4
32   31         116   5        5      5        5
33   24         78    3        3      3        3
34   27         103   3        3      4        3
Year 10 boys
35   59         109   4        5      5        5
36   86         92    3        4      4        4
37   79         90    3        3      3        3
38   8          100   4        4      4        4
Year 10 girls
39   89         95    3        3      3        3
40   66         90    3        3      3        3
41   25         105   5        5      4        5
42   8          100   4        4      4        4
Year 11 boys
43   44         98    4        4      4        4
44   79         108   5        4      4        4
45   45         118   5        5      5        5
46   20         110   4        4      5        4
Year 11 girls
47   37         100   4        4      4        4
48   2          108   5        5      5        5
49   39         97    3        3      3        3
50   31         104   4        4      4        4
Scatter graph to show relationship between Key Stage 2 results and IQ
There is good evidence to support my hypothesis that there is a strong positive correlation between IQ and Key Stage 2 results.
Looking at the graph the evidence is very significant in showing me the relationship between IQ and Key Stage 2 results. I can see that there is strong positive correlation between Key Stage 2 results and IQ. Almost all of the points on my graph are in the lower left quadrant and upper right quadrant and the line of best fit I've drawn has a positive gradient and so has positive correlation.
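The quadrant argument can also be checked numerically. The sketch below takes the IQ and average Key Stage 2 level of the 50 sampled pupils (IQ read as three-digit values where the table shows them that way), counts the points in each pair of quadrants about the two means, and then fits a least-squares line and Pearson's correlation coefficient. The least-squares line and r are not the method used above, where the line was drawn by eye; they are swapped in here only as an independent check.

```python
from math import sqrt

# (IQ, average Key Stage 2 level) for the 50 pupils in the stratified sample.
pairs = [
    (100, 5), (72, 3), (102, 4), (85, 3), (110, 5), (100, 4),   # Year 7 boys
    (95, 3), (100, 4), (107, 5), (101, 4), (103, 4), (107, 5),  # Year 7 girls
    (102, 4), (114, 5), (99, 4), (97, 4), (87, 3), (109, 5),    # Year 8 boys
    (110, 5), (103, 4), (100, 4), (84, 3), (104, 5),            # Year 8 girls
    (92, 3), (108, 5), (102, 4), (88, 3), (92, 3),              # Year 9 boys
    (89, 3), (108, 4), (97, 4), (116, 5), (78, 3), (103, 3),    # Year 9 girls
    (109, 5), (92, 4), (90, 3), (100, 4),                       # Year 10 boys
    (95, 3), (90, 3), (105, 5), (100, 4),                       # Year 10 girls
    (98, 4), (108, 4), (118, 5), (110, 4),                      # Year 11 boys
    (100, 4), (108, 5), (97, 3), (104, 4),                      # Year 11 girls
]
n = len(pairs)
mean_iq = sum(iq for iq, _ in pairs) / n
mean_ks2 = sum(ks2 for _, ks2 in pairs) / n

# Quadrant check: points above both means or below both means agree with
# positive correlation; points in the other two quadrants count against it.
agree = sum((iq >= mean_iq) == (ks2 >= mean_ks2) for iq, ks2 in pairs)
disagree = n - agree

# Least-squares line of best fit, KS2 = gradient * IQ + intercept,
# and Pearson's correlation coefficient r.
sxx = sum((iq - mean_iq) ** 2 for iq, _ in pairs)
syy = sum((ks2 - mean_ks2) ** 2 for _, ks2 in pairs)
sxy = sum((iq - mean_iq) * (ks2 - mean_ks2) for iq, ks2 in pairs)
gradient = sxy / sxx
intercept = mean_ks2 - gradient * mean_iq
r = sxy / sqrt(sxx * syy)

print(f"{agree} of {n} points lie in the lower-left or upper-right quadrant")
print(f"line of best fit: KS2 = {gradient:.3f} * IQ + {intercept:.2f}, r = {r:.2f}")
```

A value of r close to 1 together with most points in the lower-left and upper-right quadrants would agree with the conclusion drawn from the graph.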
I am confident that my results are reliable because my sample was big enough to represent the data fairly and to give enough points to plot a good graph. My sampling method was reliable as it avoided bias: the sample was stratified and, within each category, random, as I used the random number function on my calculator to pick the pupils. I did not know how reliable the data was as it was secondary data, collected by another school, and mistakes are sometimes made when a lot of data is copied. However, I found no data I considered suspect, so I think the data is reasonably reliable.
I am confident that my results are valid for the whole school because my sample was representative of the whole school.
Hypothesis 2a
Stratified sample
      Girls              Boys
No.   Pupil no.   IQ     Pupil no.   IQ
1     24          99     80          103
2     85          101    107         99
3     79          102    3           100
4     78          117    75          100
5     40          96     31          92
6     54          86     20          100
7     15          106    9           71
8     24          106    63          100
9     44          111    8           100
10    4           104    55          112
11    92          110    15          105
12    29          108    32          100
13    33          94     7           100
14    9           109    1           103
15    72          88     12          99
16    31          107    56          100
17    97          100    24          117
18    109         103    89          100
19    3           98     7           74
20    90          109    108         94
21    75          108    26          98
22    10          104    5           97
23    39          89     59          103
24    8           102    56          100
25    53          85     17          126
26    105         113    92          100
27    38          100    86          100
28    23          106    102         100
29    75          110    32          100
30    2           105    10          100
31    42          94     78          103
32    91          101    62          116
33    88          89     31          134
34    6           106    34          100
35    1           106    48          116
36    5           110    99          100
37    89          90     16          94
38    47          97     61          100
39    81          113    97          100
40    30          107    69          100
41    99          100    32          114
42    37          100    28          126
43    16          102    50          86
44    54          86     29          102
45    94          72     71          102
46    93          101    42          113
47    85          101    91          104
48    82          89     75          100
49    20          102    22          109
50    35          96     53          100
I will now group my data for the boys' IQ and use this to draw a histogram.
Group        Frequency   Class width   Frequency density
70 - 85      2           15            0.1
86 - 95      4           10            0.4
96 - 100     26          5             5.2
101 - 105    8           5             1.6
106 - 115    4           10            0.4
116 - 125    3           10            0.3
126 - 135    3           10            0.3
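The frequency densities in the boys' table follow from frequency density = frequency / class width. A minimal sketch of that calculation, using the frequencies and the class widths of 15, 10 and 5 from the table:

```python
# (group label, frequency, class width) for the boys' IQ, as in the table.
groups = [
    ("70 - 85", 2, 15),
    ("86 - 95", 4, 10),
    ("96 - 100", 26, 5),
    ("101 - 105", 8, 5),
    ("106 - 115", 4, 10),
    ("116 - 125", 3, 10),
    ("126 - 135", 3, 10),
]

# Frequency density is what the height of each histogram bar represents,
# so that the AREA of a bar (density * width) equals its frequency.
densities = [freq / width for _, freq, width in groups]

for (label, freq, width), density in zip(groups, densities):
    print(f"{label:<10} f={freq:<3} width={width:<3} density={density:.1f}")
```

Because the bar areas equal the frequencies, the tall, narrow 96 - 100 bar correctly shows that over half of the boys fall in that class.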
Histogram to show the IQ of boys in year 8 at Mayfield School
The histogram I have drawn to show the IQ of boys in year 8 at Mayfield School has a curve with a 'bell-shape' which suggests that the IQ is normally distributed but to provide further evidence of this I am going to now find the mean and standard deviation of the data. If 68% of the data lies within 1 standard deviation of the mean, and 95% of the data lies within 2 standard deviations of the mean, this means it is normally distributed.
I found the mean of the boys' IQ from my sample to be 101.
To find 1 standard deviation of my data I will use the following formula:
$$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$$
f is the symbol I'm using for frequency
x is the symbol I'm using for the midpoint of the group
σ is the symbol I'm using for standard deviation
Σ is the symbol I'm using for 'the sum of'
x̄ is the symbol I'm using for the mean

Group        x       f     x - x̄    (x - x̄)²   f(x - x̄)²
70 - 85      77.5    2     -23.5    552.25     1104.5
86 - 95      90.5    4     -10.5    110.25     441.0
96 - 100     98.0    26    -3.0     9.0        234.0
101 - 105    103.0   8     2.0      4.0        32.0
106 - 115    110.5   4     9.5      90.25      361.0
116 - 125    120.5   3     19.5     380.25     1140.75
126 - 135    130.5   3     29.5     870.25     2610.75
Σf = 50
Σf(x - x̄)² = 5924.0
σ = √(5924.0 / 50) = 11 (to the nearest whole number)
The mean of the boys' IQ from my sample is 101.
1 standard deviation is 11.
101 - 11 = 90
101 + 11 = 112
2 standard deviations is 22.
101 - 22 = 79
101 + 22 = 123
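The working for the boys' standard deviation can be checked with a few lines of Python. This is a sketch that takes the mean of 101 as given, uses the class midpoints and frequencies from the table, and divides by the total frequency as in the calculation above.

```python
from math import sqrt

# (class midpoint x, frequency f) for the boys' grouped IQ data.
groups = [(77.5, 2), (90.5, 4), (98.0, 26), (103.0, 8),
          (110.5, 4), (120.5, 3), (130.5, 3)]
mean = 101  # the mean quoted in the text, taken as given here

total_f = sum(f for _, f in groups)                    # sum of f
sum_sq = sum(f * (x - mean) ** 2 for x, f in groups)   # sum of f(x - mean)^2
sd = sqrt(sum_sq / total_f)
sd_rounded = round(sd)

print(f"sum f = {total_f}, sum f(x - mean)^2 = {sum_sq}, sd = {sd:.2f}")
for k in (1, 2):
    low, high = mean - k * sd_rounded, mean + k * sd_rounded
    print(f"within {k} standard deviation(s): {low} to {high}")
```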
68% of the data must lie within 1 standard deviation of the mean so between 90 and 112.
From looking at the data in my sample I can see that 39 of the 50 boys have an IQ between 90 and 112, which means 78% lie within 1 standard deviation of the mean. This suggests that so far the data is normally distributed but for it to be normally distributed 95% of the data must lie within 2 standard deviations of the mean so between 79 and 123.
From looking at the data in my sample I can see that 45 of the 50 boys have an IQ between 79 and 123, which means only 90% lie within 2 standard deviations of the mean. Since 95% of the data must lie within 2 standard deviations of the mean for it to be normally distributed, my results suggest that the data isn't quite normally distributed.
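The two counts for the boys can be reproduced by checking each IQ in the sample against the intervals. The sketch below assumes the 50 boys' IQ values read as listed here (three-digit values where the sample table shows them that way) and uses the rounded mean of 101 and standard deviation of 11; the helper name count_within is made up for this example.

```python
# IQ of the 50 year 8 boys in the sample, in table order.
boys_iq = [
    103, 99, 100, 100, 92, 100, 71, 100, 100, 112,
    105, 100, 100, 103, 99, 100, 117, 100, 74, 94,
    98, 97, 103, 100, 126, 100, 100, 100, 100, 100,
    103, 116, 134, 100, 116, 100, 94, 100, 100, 100,
    114, 126, 86, 102, 102, 113, 104, 100, 109, 100,
]
mean, sd = 101, 11  # rounded values from the working above

def count_within(values, centre, spread, k):
    """Count the values lying within k * spread of centre (inclusive)."""
    low, high = centre - k * spread, centre + k * spread
    return sum(low <= v <= high for v in values)

for k, expected in ((1, 0.68), (2, 0.95)):
    inside = count_within(boys_iq, mean, sd, k)
    share = inside / len(boys_iq)
    print(f"{inside} of {len(boys_iq)} ({share:.0%}) within {k} SD; "
          f"a normal distribution would give about {expected:.0%}")
```

The 1 standard deviation share comes out above 68% and the 2 standard deviation share below 95%, so the boys' data is more tightly bunched in the middle, with more extreme values, than a normal distribution would suggest.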
I will now group the girls' data and use this to draw a histogram.
Group        Frequency   Class width   Frequency density
70 - 85      2           15            0.1
86 - 95      9           10            0.9
96 - 100     9           5             1.8
101 - 105    12          5             2.4
106 - 115    17          10            1.7
116 - 125    1           10            0.1
Histogram to show the IQ of girls in year 8 at Mayfield School
The histogram I have drawn to show the IQ of year 8 girls at Mayfield School has a 'bell-shaped' curve which suggests that the data is normally distributed.
To provide further evidence to support my hypothesis I will now find the mean and standard deviation of the data. If 68% of the data lies within 1 standard deviation of the mean, and 95% of the data lies within 2 standard deviations of the mean, this means it is normally distributed.
I found the mean of the girls' IQ from my sample to be 97.
To find 1 standard deviation of my data I will use the same formula as I used to find 1 standard deviation of the boys' data:
Group        x       f     x - x̄    (x - x̄)²   f(x - x̄)²
70 - 85      77.5    2     -19.5    380.25     760.5
86 - 95      90.5    9     -6.5     42.25      380.25
96 - 100     98.0    9     1.0      1.0        9.0
101 - 105    103.0   12    6.0      36.0       432.0
106 - 115    110.5   17    13.5     182.25     3098.25
116 - 125    120.5   1     23.5     552.25     552.25
Σf = 50
Σf(x - x̄)² = 5232.25
σ = √(5232.25 / 50) = 10 (to the nearest whole number)
The mean of the girls' IQ from my sample is 97.
1 standard deviation is 10.
97 - 10 = 87
97 + 10 = 107
2 standard deviations is 20.
97 - 20 = 77
97 + 20 = 117
68% of the data must lie within 1 standard deviation of the mean so between 87 and 107.
From looking at the data in my sample I can see that 37 of the 50 girls have an IQ between 87 and 107 which means 74% of the data lies within 1 standard deviation of the mean. This suggests that so far the data is normally distributed but for it to be normally distributed 95% of the data must lie within 2 standard deviations of the mean so between 77 and 117.
From looking at the data in my sample I can see that 49 of the 50 girls have an IQ between 77 and 117 which means that 98% of the data lies within 2 standard deviations of the mean.
Since over 68% of the data lies within 1 standard deviation of the mean, and over 95% of the data lies within 2 standard deviations of the mean, this suggests that the IQ of the girls in year 8 is normally distributed.
Across Britain IQ is normally distributed with mean 100 and standard deviation 15.
The mean of the boys' IQ in year 8 is 101, which is very close to Britain's mean of 100, as is the mean of the girls' IQ of 97. The standard deviation of the boys' IQ in year 8 is 11, which is fairly close to Britain's standard deviation of 15, as is the standard deviation of the girls' IQ of 10.
There is some evidence to support my hypothesis that IQ of boys and girls is normally distributed, in line with national figures.
Looking at my graphs and results I can see that the IQ of boys in year 8 is not quite normally distributed, whereas the IQ of girls in year 8 is. I don't think this difference is very significant, because what I am testing is whether the IQ of boys and girls is normally distributed in line with national figures. The fact that the means and standard deviations of the IQ of boys and girls in year 8 are close to the national figures is quite significant, as it supports my hypothesis and suggests that my results are valid across Britain and not just at Mayfield School.
I am confident that my results are reliable because I took a sample of 50 for both boys and girls, which gave me enough data to represent the year fairly. My sampling method was also reliable as it avoided bias, because I used the random number function on my calculator to obtain my sample. As far as I can tell the data I have used is reliable as I have not come across any suspect data, though I cannot know this for sure as it is secondary data collected by someone I don't know.
I am confident that my results are valid for not only Mayfield School but also across Britain because my results corresponded with national figures.
I could find further evidence for my conclusions by testing this hypothesis on another year group to see whether the results still remain the same.
Hypothesis 2b
The mean IQ of the sample of boys in year 8 was 101, whereas the mean IQ of the sample of girls in year 8 was 97. This would suggest that boys have a higher IQ than girls, but I don't think the difference in the means is big enough to be considered significant, so it is only weak evidence that gender influences IQ.
I will now draw 2 cumulative frequency diagrams and box plots, using the same stratified samples, to provide further evidence to support my hypothesis that gender makes no difference to IQ.
Boys
Group        Upper class boundary   Frequency   Cumulative frequency
70 - 85      85.5                   2           2
86 - 95      95.5                   4           6
96 - 100     100.5                  26          32
101 - 105    105.5                  8           40
106 - 115    115.5                  4           44
116 - 125    125.5                  3           47
126 - 135    135                    3           50
Cumulative Frequency Diagram of IQ of boys in year 8 at Mayfield School
I will now use this cumulative frequency diagram I have drawn to draw a box plot. I have already used the cumulative frequency diagram to find the Median and the Lower and Upper Quartiles.
Girls
Group        Upper class boundary   Frequency   Cumulative frequency
70 - 85      85.5                   2           2
86 - 95      95.5                   9           11
96 - 100     100.5                  9           20
101 - 105    105.5                  12          32
106 - 115    115.5                  17          49
116 - 125    125                    1           50
Cumulative Frequency Diagram of IQ of girls in year 8 at Mayfield School
I will now use this cumulative frequency diagram I have drawn to draw a box plot. I have already used the cumulative frequency diagram to find the Median and the Lower and Upper Quartiles.
Box plot to show IQ of boys in year 8
Box plot to show IQ of girls in year 8
Now I have drawn box plots to show the IQ of both boys and girls in year 8 I can evaluate whether they provide evidence to support my hypothesis that gender makes no difference to IQ.
The box plots show me that the boys have a much narrower interquartile range than the girls, meaning most of the boys' data is close to the median, whereas the girls' data is more spread out.
Comparing the two box plots I can tell that the highest boys' IQs in year 8 are above the highest girls' IQs. However, the medians of the two sets of data are nearly the same, though the girls' median is a little higher.
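The medians and quartiles read from the cumulative frequency diagrams can be estimated from the two cumulative frequency tables. The sketch below uses straight-line interpolation between the plotted points, which is swapped in for reading values off a hand-drawn curve, so its figures are estimates and not the readings used for the box plots. The lowest class boundary of 69.5 is an assumption, and the function name value_at is made up for this example.

```python
def value_at(target, boundaries, cum_freqs):
    """Estimate the IQ at a cumulative frequency by linear interpolation."""
    for i in range(1, len(boundaries)):
        if cum_freqs[i] >= target:
            lo_b, hi_b = boundaries[i - 1], boundaries[i]
            lo_c, hi_c = cum_freqs[i - 1], cum_freqs[i]
            return lo_b + (target - lo_c) / (hi_c - lo_c) * (hi_b - lo_b)
    raise ValueError("target is above the total frequency")

def summary(boundaries, cum_freqs):
    """Lower quartile, median, upper quartile and interquartile range."""
    n = cum_freqs[-1]
    lq, median, uq = (value_at(n * q, boundaries, cum_freqs)
                      for q in (0.25, 0.5, 0.75))
    return {"lq": lq, "median": median, "uq": uq, "iqr": uq - lq}

# Class boundaries and cumulative frequencies; 69.5 (frequency 0) is assumed.
boys = summary([69.5, 85.5, 95.5, 100.5, 105.5, 115.5, 125.5, 135],
               [0, 2, 6, 32, 40, 44, 47, 50])
girls = summary([69.5, 85.5, 95.5, 100.5, 105.5, 115.5, 125],
                [0, 2, 11, 20, 32, 49, 50])

for name, s in (("boys", boys), ("girls", girls)):
    print(name, {key: round(val, 1) for key, val in s.items()})
```

On these estimates the boys' interquartile range is the narrower of the two and the girls' median is a few points higher, which agrees with the comparison of the box plots.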
There is some evidence to support my hypothesis that gender makes no difference to IQ.
Although I have found that some of the boys in year 8 have a higher IQ than the girls, I have also found that the girls' median is a little higher than the boys'. These results are fairly significant as they show to some extent that gender makes no difference to IQ.
I am confident that my results are reliable because I used the same stratified sample as I used for Hypothesis 2a.
I am confident that my results are valid for the whole school because when I tested Hypothesis 2a I provided some evidence to support the hypothesis that IQ is normally distributed, in line with national figures, so my results should be valid not only for the whole school but also nationwide.
Hypothesis 3
Since I have provided some evidence to support hypotheses 1 and 2 this means there is evidence to support my hypothesis that gender makes no difference to Key Stage 2 results. This is because if there is strong positive correlation between Key Stage 2 results and IQ, and gender makes no difference to IQ, then gender wouldn't make a difference to Key Stage 2 results either.
I am going to use the same sample I used in Hypothesis 1 to draw two scatter graphs, one for boys and one for girls, to try and provide further evidence to support my hypothesis that gender makes no difference to Key Stage 2 results.
Stratified sample for boys
                 Key Stage 2 results
Pupil no.  IQ    English  Maths  Science  Average
Year 7 boys
4          100   5        4      5        5
8          72    2        3      3        3
20         102   3        3      5        4
36         85    3        3      3        3
39         110   5        5      5        5
11         100   4        4      4        4
Year 8 boys
34         102   4        4      4        4
60         114   5        5      5        5
83         99    5        4      4        4
8          97    4        4      4        4
25         87    3        3      3        3
22         109   5        5      5        5
Year 9 boys
10         92    4        3      3        3
63         108   4        5      5        5
45         102   5        4      4        4
94         88    3        3      3        3
30         92    4        3      3        3
Year 10 boys
59         109   4        5      5        5
86         92    3        4      4        4
79         90    3        3      3        3
8          100   4        4      4        4
Year 11 boys
44         98    4        4      4        4
79         108   5        4      4        4
45         118   5        5      5        5
20         110   4        4      5        4
Stratified sample for girls
                 Key Stage 2 results
Pupil no.  IQ    English  Maths  Science  Average
Year 7 girls
19         95    3        3      4        3
14         100   4        4      4        4
2          107   5        5      5        5
5          101   5        4      4        4
1          103   4        4      5        4
84         107   5        5      5        5
Year 8 girls
46         110   5        4      5        5
49         103   4        4      5        4
6          100   4        4      5        4
4          84    3        3      3        3
65         104   4        5      5        5
Year 9 girls
25         89    4        3      3        3
37         108   4        4      5        4
41         97    4        4      4        4
31         116   5        5      5        5
24         78    3        3      3        3
27         103   3        3      4        3
Year 10 girls
89         95    3        3      3        3
66         90    3        3      3        3
25         105   5        5      4        5
8          100   4        4      4        4
Year 11 girls
37         100   4        4      4        4
2          108   5        5      5        5
39         97    3        3      3        3
31         104   4        4      4        4
Scatter graph showing relationship between Boys' IQ and Key Stage 2 results
Scatter graph showing relationship between Girls' IQ and Key Stage 2 results
Comparing the two scatter graphs I've drawn, I can see that they are very similar. Neither graph shows boys or girls getting higher grades than the other. I think this is quite significant as it provides good evidence to support my hypothesis that gender makes no difference to Key Stage 2 results.
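As a rough numerical companion to comparing the two graphs by eye, the sketch below takes the average Key Stage 2 level of each of the 25 boys and 25 girls in the two tables and compares their means and how many pupils reached each level. This is an additional check and not part of the method described above.

```python
from collections import Counter

# Average Key Stage 2 level for each pupil, in table order (years 7 to 11).
boys_avg = [5, 3, 4, 3, 5, 4,  4, 5, 4, 4, 3, 5,  3, 5, 4, 3, 3,
            5, 4, 3, 4,  4, 4, 5, 4]
girls_avg = [3, 4, 5, 4, 4, 5,  5, 4, 4, 3, 5,  3, 4, 4, 5, 3, 3,
             3, 3, 5, 4,  4, 5, 3, 4]

boys_mean = sum(boys_avg) / len(boys_avg)
girls_mean = sum(girls_avg) / len(girls_avg)

# How many pupils of each gender achieved each average level.
boys_levels = Counter(boys_avg)
girls_levels = Counter(girls_avg)

print(f"boys:  mean level {boys_mean:.2f}, levels {sorted(boys_levels.items())}")
print(f"girls: mean level {girls_mean:.2f}, levels {sorted(girls_levels.items())}")
```

The two means differ by only a few hundredths of a level, which is consistent with the two scatter graphs looking very similar.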
I am confident that my results are reliable because I have used the same stratified sample as I used in Hypothesis 1 and I believe this to be a reliable sample as it avoided bias because I used the random number function on my calculator to obtain it.
I am confident that my results are valid for the whole school because the stratified sample I used was representative of the whole school.
I managed to find evidence to support all three hypotheses.
I did not come across any suspect data throughout my investigation so as far as I know all the data I used was reliable but I really cannot be sure as I wasn't the person who collected the data.
I could extend my investigation by testing whether there is still strong positive correlation between IQ and just the English Key Stage 2 result as opposed to an average of all three, English, Maths and Science, results.
I could also extend my investigation by testing whether Age influences Key Stage 2 results. For example, those whose birthday is in September may be at an advantage compared to those whose birthday is in July or August.
Jess Blair 10.7