Maths Coursework Data
Handling Task
Statistics Data
In this piece of coursework I have been set the task to find out about the students in our school.
I need to test the following hypothesis:
'Pupils in Band A perform better than pupils in Band B'
I must suggest whether the hypothesis is correct or incorrect. I will do this by comparing Band A with Band B in the following areas:
* Mean averages from Key Stage 3 (tiers 3-5, 4-6 and 5-7)
* Range of scores
* Mode and median of the scores
There are many different techniques and methods which I can use to solve my above problem. I will use some techniques which will enable me to work out means, ranges, modes and medians of scores. To help me with my work I could also use cumulative frequency graphs, box plots and interquartile ranges. These will all help me to compare the differences between the two bands from looking at their scores.
I am going to focus on 96 pieces of data (pupils), analysing the levels and scores of both Band A and Band B.
The first thing that I am going to do is compare the overall results of both bands in a maths SATs paper. To do this I will use a process called stratified sampling, which reduces the amount of data that I need to compare. Analysing every piece of data would be time consuming on a very large data set, so sampling is handy for me. I am aiming for a stratified sample size of 30, as it is nearly a third of my total data. Stratified sampling ensures that a fair proportion of pupils is chosen from each band.
I will use the maths levels data first, so now I need to find out how many pupils there are in each band.
Band A = 42
Band B = 54
Then I need to find out exactly how many pupils from each band were entered into each maths tier. The tiers are 3-5, 4-6 and 5-7.
Maths entry levels
TIERS     3-5    4-6    5-7
Band A    21     8      13
Band B    21     16     17
As all the pupils are doing different tiers to one another I need to use a certain equation to find out my stratified sample size of 30 pieces of data. This equation will help me work out how many students I need to choose from each tier of each band.
The equation is as follows: (number of pupils in tier * 30 (sample size)) / total number of pupils
Band A
Tier 3-5 = (21 * 30) / 96 = 6.5625, so 7 pupils
Tier 4-6 = (8 * 30) / 96 = 2.5, so 3 pupils
Tier 5-7 = (13 * 30) / 96 = 4.0625, so 4 pupils
Band B
Tier 3-5 = (21 * 30) / 96 = 6.5625, so 7 pupils
Tier 4-6 = (16 * 30) / 96 = 5, so 5 pupils
Tier 5-7 = (17 * 30) / 96 = 5.3125, so 5 pupils
My stratified sample sizes add up to a total of 31. The sample has gone slightly above the 30 I was aiming for because some of the calculations had to be rounded, which may make my results slightly less accurate.
7 + 3 + 4 + 7 + 5 + 5 = 31
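The allocation above can be checked with a short Python sketch. It is only an illustration: the names `TIER_COUNTS`, `round_half_up` and `stratified_allocation` are made up for this example, and it assumes that a result ending in .5 is always rounded up, as was done for the 2.5 in Band A tier 4-6.

```python
import math

# Pupils entered for each tier, taken from the working above.
TIER_COUNTS = {
    ("A", "3-5"): 21, ("A", "4-6"): 8, ("A", "5-7"): 13,
    ("B", "3-5"): 21, ("B", "4-6"): 16, ("B", "5-7"): 17,
}


def round_half_up(x):
    """Round to the nearest whole number, with .5 always rounding up.

    Python's built-in round() sends 2.5 to 2, so it is not used here.
    """
    return math.floor(x + 0.5)


def stratified_allocation(counts, sample_size):
    """Share sample_size between the groups in proportion to their size."""
    total = sum(counts.values())
    return {group: round_half_up(n * sample_size / total)
            for group, n in counts.items()}


allocation = stratified_allocation(TIER_COUNTS, 30)
for (band, tier), pupils in allocation.items():
    print(f"Band {band}, tier {tier}: {pupils} pupils")
print("Total sample size:", sum(allocation.values()))
```

Because each group is rounded separately, the total can drift away from the target of 30, which is what happens here.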
I will use a method called systematic sampling when choosing my sample data. I will do this by picking the first student and then every third one.
Band A
Tier 3-5 - Pupils chosen= 100, 118, 128, 136, 147, 172, 185
Tier 4-6 - Pupils chosen= 101, 134, 174
Tier 5-7 - Pupils chosen= 98, 130, 142, 163
Band B
Tier 3-5 - Pupils chosen= 97, 107, 153, 158, 168, 181, 191
Tier 4-6 - Pupils chosen= 106, 122, 133, 150, 189
Tier 5-7 - Pupils chosen= 105, 114, 120, 140, 170
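Systematic sampling of the kind described (the first pupil, then every third one) could be sketched as below. The function name `systematic_sample` and the list of pupil numbers are hypothetical and only there to show the idea; they are not the school's real register.

```python
def systematic_sample(pupils, step, start=0):
    """Take the pupil at position `start`, then every `step`-th pupil after it."""
    return pupils[start::step]


# Hypothetical tier list of 21 pupil numbers (100 to 120), for illustration only.
tier_pupils = list(range(100, 121))
chosen = systematic_sample(tier_pupils, step=3)
print(chosen)
```

With 21 pupils in a tier, taking every third pupil gives 7 pupils, which is the sample size needed for tier 3-5.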
I have now picked my sample data from each of the tiers, and I will need to record the levels. After this I will be able to find the mean, range, mode and median of Band A, which I can then compare with Band B.
I am aware that I can compare the marks for these students, although there are issues regarding the different tiers: a high mark on an easier paper can still give a lower level than a low mark on a higher tier paper.
Band A
Tier 3-5 = 4, 3, 4, 3, 4, 5, 5
Tier 4-6 = 6, 5, 5
Tier 5-7 = 5, 6, 6, 7
Mean for Band A = 68 / 14 = 4.86 (2dp)
Range = 7 - 3 = 4
Mode = 5
Median = (14 + 1) / 2 = 7.5th value = 5
The data in order is: 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7. The median lies between the 7th and 8th values, which are both 5, so the median is 5.
Lower quartile = 1/4 * 14 = 3.5th value = 4
Upper quartile = 3/4 * 14 = 10.5th value = 5.5
Interquartile range = 5.5 - 4 = 1.5
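The Band A averages and quartiles can be reproduced with the sketch below. The helper names `value_at_position` and `summarise` are made up for this example. It assumes that an in-between position (such as the 3.5th or 10.5th value) is read by interpolating between its two neighbours; other quartile conventions exist and can give slightly different answers.

```python
from statistics import mean, mode


def value_at_position(ordered, position):
    """Value at a 1-based position in an ordered list.

    When the position falls between two values (for example the 7.5th),
    interpolate between the neighbours on either side.
    """
    lower_index = int(position) - 1
    fraction = position - int(position)
    if fraction == 0:
        return ordered[lower_index]
    lower, upper = ordered[lower_index], ordered[lower_index + 1]
    return lower + fraction * (upper - lower)


def summarise(levels):
    """Mean, range, mode, median and quartiles of a list of levels."""
    ordered = sorted(levels)
    n = len(ordered)
    lower_q = value_at_position(ordered, n / 4)
    upper_q = value_at_position(ordered, 3 * n / 4)
    return {
        "mean": round(mean(ordered), 2),
        "range": ordered[-1] - ordered[0],
        "mode": mode(ordered),
        "median": value_at_position(ordered, (n + 1) / 2),
        "lower quartile": lower_q,
        "upper quartile": upper_q,
        "interquartile range": upper_q - lower_q,
    }


# Band A levels: tier 3-5, then tier 4-6, then tier 5-7.
band_a_levels = [4, 3, 4, 3, 4, 5, 5, 6, 5, 5, 5, 6, 6, 7]
summary = summarise(band_a_levels)
for name, value in summary.items():
    print(f"{name}: {value}")
```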
Band B
Tier 3-5 = 3, 4, 4, 4, 3, 3, 4
Tier 4-6 = 5, 6, 5, 5, 6
Tier 5-7 = 6, 6, 6, 6, 6
Mean for Band B = 82 / 17 = 4.82 (2dp)
Range = 6 - 3 = 3
Mode = 6
Median = (17 + 1) / 2 = 9th value = 5
The data in order is: 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, so 5 is the median.
Lower quartile = 1/4 * 17 = 4.25th value = 4
Upper quartile = 3/4 * 17 = 12.75th value = 6
Interquartile range = 6 - 4 = 2
As you can see from the above data, Band A are performing slightly better overall than Band B. Band A has a mean level of 4.86 (2dp) whereas Band B has a mean level of 4.82 (2dp), so on this measure Band A have performed better than Band B. When comparing the range of the levels, Band A has the larger difference between levels: their levels run from 3 to 7, whereas Band B's run from 3 to 6. This tells me that Band A has a wider spread of levels. The medians of Band A and Band B are both 5, so on this measure they are performing as well as each other. In Band A the modal level is 5, whereas in Band B the modal level is 6. This allows me to suggest that Band B pupils have taken higher tier papers than Band A.
Now I am going to compare the two bands tier by tier, using the scores instead of the maths levels. This should show me in which tiers the pupils of one band are not performing as well as those of the other band. I will again use my stratified samples for each tier.
Tier 3-5
Band A, stratified samples are = 73, 32, 88, 57, 92, 102, 102
Mean = (73 + 32 + 88 + 57 + 92 + 102 + 102) / 7 = 78
Range = 102 - 32 = 70
Band B, stratified samples are = 62, 93, 88, 77, 39, 39, 92
Mean = (62 + 93 + 88 + 77 + 39 + 39 + 92) / 7 = 70
Range = 93 - 39 = 54
When comparing the mean averages for the tier 3-5 paper, Band A have performed better than Band B. Band A have a mean of 78 whereas Band B have a mean of 70, a difference of 8 marks. Band A also have the larger range of scores: 70, to Band B's 54. Overall Band A have done better than Band B in this tier, as both their mean and their range are higher.
Tier 4-6
Band A, stratified samples are = 99, 68, 84
Mean = (99 + 68 + 84) / 3 = 83.66... = 83.67 (2dp)
Range = 99 - 68 = 31
Band B, stratified samples are = 79, 90, 85, 61, 91
Mean = (79 + 90 + 85 + 61 + 91) / 5 = 81.2
Range = 91 - 61 = 30
When contrasting Band A and Band B in the tier 4-6 paper, Band A have again performed better, just as they did in tier 3-5. Both their mean score and their range are higher than Band B's, yet this time the differences are only very slight. Band A have a mean of 83.67 (2dp), compared to Band B's 81.2, so there is not a lot between the two bands. The ranges are even closer: Band A's range is 31 and Band B's is 30, a difference of only 1.
Tier 5-7
Band A, stratified samples are = 38, 61, 75, 110
Mean = (38 + 61 + 75 + 110) / 4 = 71
Range = 110 - 38 = 72
Band B, stratified samples are = 70, 84, 67, 78, 87
Mean = (70 + 84 + 67 + 78 + 87) / 5 = 77.2
Range = 87 - 67 = 20
When comparing the mean scores of pupils in tier 5-7, Band B have performed better than Band A. They have a mean score of 77.2, which is 6.2 more than Band A's mean score of 71. Band A, however, have the larger range: 72, a massive 52 greater than Band B's range of 20. So in this tier Band B have performed better on the mean score, but Band A have the greater range of scores.
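The mean and range for every tier sample can be worked out in one go with a sketch like the one below. The dictionary layout and the function name `mean_and_range` are just one possible way of organising the sampled scores.

```python
from statistics import mean

# Sampled scores for each tier and band, as listed in the working above.
samples = {
    "3-5": {"A": [73, 32, 88, 57, 92, 102, 102], "B": [62, 93, 88, 77, 39, 39, 92]},
    "4-6": {"A": [99, 68, 84], "B": [79, 90, 85, 61, 91]},
    "5-7": {"A": [38, 61, 75, 110], "B": [70, 84, 67, 78, 87]},
}


def mean_and_range(scores):
    """Return (mean to 2dp, highest score minus lowest score)."""
    return round(mean(scores), 2), max(scores) - min(scores)


results = {
    tier: {band: mean_and_range(scores) for band, scores in bands.items()}
    for tier, bands in samples.items()
}

for tier, bands in results.items():
    for band, (average, spread) in bands.items():
        print(f"Tier {tier}, Band {band}: mean = {average}, range = {spread}")
```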
At this stage of the investigation, I will start to analyse the scores the students achieved in a maths SATs paper. I will use cumulative frequency tables, cumulative frequency graphs and box plots. There are two ways in which I will look at the scores: firstly by comparing tier by tier, and then by contrasting the overall band scores. I will do it tier by tier because a pupil can get a high mark on an easier tier and still get a lower level than someone with a low mark on a higher tier.
I am not going to sample the data used in the tables below, because after sampling there would not be enough data to draw a cumulative frequency graph. So the scores for Band A and Band B will be put in tables, tier by tier. By doing this I should see which pupils are performing well in which tier.
Tier level 3-5 Maths
Band A
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      1            25           25                       1
30 < s < 40      3            35           105                      4
40 < s < 50      2            45           90                       6
50 < s < 60      1            55           55                       7
60 < s < 70      1            65           65                       8
70 < s < 80      2            75           150                      10
80 < s < 90      2            85           170                      12
90 < s < 100     4            95           380                      16
100 < s < 110    5            105          525                      21
Total            21                        1565
Mean = 1565 / 21 = 74.52380952 = 74.52 (2dp)
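The estimated mean and the cumulative frequency column for a grouped table can be produced as in the sketch below, shown for Band A in tier 3-5. It assumes, as the tables do, that every class is 10 marks wide and that each pupil in a class scored the mid-point; the variable names are made up for this example.

```python
from itertools import accumulate

# Classes 0-10, 10-20, ..., 100-110 and their frequencies (Band A, tier 3-5).
classes = [(lower, lower + 10) for lower in range(0, 110, 10)]
frequencies = [0, 0, 1, 3, 2, 1, 1, 2, 2, 4, 5]

midpoints = [(lower + upper) / 2 for lower, upper in classes]
products = [f * m for f, m in zip(frequencies, midpoints)]  # frequency * mid-point
cumulative = list(accumulate(frequencies))                   # running total of frequency

estimated_mean = sum(products) / sum(frequencies)
print("Cumulative frequency:", cumulative)
print("Estimated mean:", round(estimated_mean, 2))
```

The same few lines work for any of the other tables by swapping in that table's frequencies and classes.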
Band B
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      0            25           0                        0
30 < s < 40      4            35           140                      4
40 < s < 50      0            45           0                        4
50 < s < 60      4            55           220                      8
60 < s < 70      2            65           130                      10
70 < s < 80      1            75           75                       11
80 < s < 90      4            85           340                      15
90 < s < 100     3            95           285                      18
100 < s < 110    3            105          315                      21
Total            21                        1505
Mean = 1505 / 21 = 71.6666... = 71.67 (2dp)
Tier level 4-6 Maths
Band A
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      0            25           0                        0
30 < s < 40      0            35           0                        0
40 < s < 50      0            45           0                        0
50 < s < 60      0            55           0                        0
60 < s < 70      1            65           65                       1
70 < s < 80      0            75           0                        1
80 < s < 90      5            85           425                      6
90 < s < 100     2            95           190                      8
Total            8                         680
Mean = 680 / 8 = 85
Band B
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      0            25           0                        0
30 < s < 40      0            35           0                        0
40 < s < 50      0            45           0                        0
50 < s < 60      1            55           55                       1
60 < s < 70      1            65           65                       2
70 < s < 80      4            75           300                      6
80 < s < 90      3            85           255                      9
90 < s < 100     4            95           380                      13
100 < s < 110    3            105          315                      16
Total            16                        1370
Mean = 1370 / 16 = 85.625 = 85.63 (2dp)
Tier level 5-7 Maths
Band A
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      0            25           0                        0
30 < s < 40      1            35           35                       1
40 < s < 50      1            45           45                       2
50 < s < 60      0            55           0                        2
60 < s < 70      5            65           325                      7
70 < s < 80      2            75           150                      9
80 < s < 90      1            85           85                       10
90 < s < 100     0            95           0                        10
100 < s < 110    1            105          105                      11
110 < s < 120    2            115          230                      13
Total            13                        975
Mean = 975 / 13 = 75
Band B
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      0            25           0                        0
30 < s < 40      0            35           0                        0
40 < s < 50      1            45           45                       1
50 < s < 60      2            55           110                      3
60 < s < 70      1            65           65                       4
70 < s < 80      3            75           225                      7
80 < s < 90      6            85           510                      13
90 < s < 100     2            95           190                      15
100 < s < 110    1            105          105                      16
110 < s < 120    1            115          115                      17
Total            17                        1365
Mean = 1365 / 17 = 80.29411765 = 80.29 (2dp)
Overall scores in Maths
Band A
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      1            25           25                       1
30 < s < 40      4            35           140                      5
40 < s < 50      3            45           135                      8
50 < s < 60      1            55           55                       9
60 < s < 70      7            65           455                      16
70 < s < 80      4            75           300                      20
80 < s < 90      8            85           680                      28
90 < s < 100     6            95           570                      34
100 < s < 110    6            105          630                      40
110 < s < 120    2            115          230                      42
Total            42                        3220
Mean = 3220 / 42 = 76.6666... = 76.67 (2dp)
Band B
Score (s)        Frequency    Mid-point    Frequency * Mid-point    Cumulative frequency
0 < s < 10       0            5            0                        0
10 < s < 20      0            15           0                        0
20 < s < 30      0            25           0                        0
30 < s < 40      4            35           140                      4
40 < s < 50      1            45           45                       5
50 < s < 60      7            55           385                      12
60 < s < 70      4            65           260                      16
70 < s < 80      8            75           600                      24
80 < s < 90      13           85           1105                     37
90 < s < 100     9            95           855                      46
100 < s < 110    7            105          735                      53
110 < s < 120    1            115          115                      54
Total            54                        4240
Mean = 4240 / 54 = 78.51851852 = 78.52 (2dp)
Conclusion
To conclude this investigation, I can see a very clear picture for my first hypothesis, which stated that 'Pupils in Band A perform better than pupils in Band B'.
I don't necessarily agree with this statement, because after sampling the data I found that Band B is performing considerably better than Band A. When you look at the levels that each band achieves, Band B's tend to be slightly better on the whole. I followed up the levels by comparing the scores of the sampled students from each band.
I furthered this investigation by analysing the whole population, for which I produced grouped tables of the scores. In most instances Band B's scores seemed to be better and more consistent than those of Band A. On most occasions Band B had a larger mean score than the other band. In my view this hypothesis can easily be argued against. When I used the grouped data to produce cumulative frequency graphs and box plots, on many occasions Band B again had a low interquartile range, which suggests that their scores are more dependable. The medians of the scores suggested that Band B had higher scores and are therefore performing better than Band A.
After recapping all of this assignment, I have come to a decision regarding the first hypothesis. Overall Band A are not as good performers as Band B, and all my work supports this in some way or another.
My second hypothesis is the statement 'Students with a good reading score also have a good English score'. To test this statement I have used a scatter diagram, which I think is the best way to get an accurate outcome. From the diagram I have noticed that pupils with good reading scores also tend to have good English scores. The scatter diagram shows a positive correlation, which means that the higher the reading score, the better the English score. However, there are several pupils whose cases differ from what I have suggested. These pupils may have had, for example, a good English score but a weak reading score.
So I think the hypothesis that I chose myself is fairly reliable, although on some occasions the situation is different.
In this investigation I am expected to use one of my own hypotheses, where I will try and find out if it is true or not.
The second hypothesis that I am going to use is:
'Students with a good reading score also have a good English score'.
I think the best way for me to explore this is by producing a scatter graph to look for a correlation. By producing a graph I will be able to compare the reading scores with the English scores. As the scatter graph will also have a line of best fit, I will be able to give examples by reading across from the line what score a pupil could expect to get.
The data which I will use in this section will again be the data I gathered earlier by sampling. So in total I will have 31 pieces of data, which I will use to plot a diagram that I can then compare and draw conclusions from. I will also find the mean of both the reading and the English scores for the students below.
I will use the following data which was part of my sampling technique in the scatter diagram.
Pupils chosen by sampling
Pupil    Reading score    English score
97       90               49
98       98               47
100      80               45
101      86               60
105      99               45
106      99               42
107      84               52
114      104              63
118      72               29
120      106              59
122      85               32
128      85               23
130      98               46
133      99               42
134      96               51
136      70               30
140      111              53
142      99               49
147      96               35
150      94               47
153      84               55
158      84               37
163      130              55
168      100              22
170      107              54
172      103              58
174      110              47
181      100              22
185      103              58
189      100              47
191      81               32
Total    2953             1389
Mean of Reading scores = 2953 / 31 = 95.25806452 = 95.26 (2dp)
I have found the mean score for the reading paper by adding all the scores for the 31 pieces of data. After getting a total of 2953, I divided it by the number of pieces of data used (31) to get an overall mean score for the reading paper, which is 95.26 (2dp).
Mean of English scores = 1389 / 31 = 44.80645161 = 44.81 (2dp)
I have found out the mean score for the English scores by adding all the scores for the 31 pieces of data. After getting a total of 1389, I then divided it by the original number of data used (31) to get an overall mean score for the English paper, which is 44.81 (2dp).
When I look at the scatter diagram, I can see that my hypothesis is fairly correct. The correlation of the scores is positive, but it is neither weak nor strong; it lies somewhere in between. The data on the spreadsheet has allowed me to recognise that a pupil performing well in reading also tends to perform well in their overall English score.
For example, most pupils who score 85 or more in reading seem to be getting a fairly high score in English. Their scores range from 45 to 60 in the English paper.
Examples
* Pupil one = say a pupil was getting 95 (a high score) as their reading score; from my line of best fit they should get around 45 as their English score.
* Pupil two = in another case, say a pupil was getting 65 (a low score) as their reading score; from my line of best fit they should expect to get around 23 in English.
So from the examples that I have chosen you can clearly see the difference between the two pupils. Pupil one was getting a high reading score and therefore also got a high English score, whereas pupil two got a low reading score and so got a low English score.
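A line of best fit can also be calculated instead of being drawn by eye. The sketch below uses the least-squares method, which is a swapped-in technique and an assumption: the line in this investigation was drawn by hand, so the predictions it gives will not necessarily match the values of 45 and 23 read from the graph. The reading and English scores are the 31 sampled pairs, with three-digit scores written in full, and the function names are made up for this example.

```python
from math import sqrt

# The 31 sampled (reading, English) pairs, in pupil-number order.
reading = [90, 98, 80, 86, 99, 99, 84, 104, 72, 106, 85, 85, 98, 99, 96, 70,
           111, 99, 96, 94, 84, 84, 130, 100, 107, 103, 110, 100, 103, 100, 81]
english = [49, 47, 45, 60, 45, 42, 52, 63, 29, 59, 32, 23, 46, 42, 51, 30,
           53, 49, 35, 47, 55, 37, 55, 22, 54, 58, 47, 22, 58, 47, 32]


def spread_sums(xs, ys):
    """Return (Sxx, Syy, Sxy): sums of squared and cross deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    sxx, _, sxy = spread_sums(xs, ys)
    slope = sxy / sxx
    intercept = sum(ys) / len(ys) - slope * sum(xs) / len(xs)
    return slope, intercept


def correlation(xs, ys):
    """Correlation coefficient: +1 is perfect positive, 0 is no correlation."""
    sxx, syy, sxy = spread_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


slope, intercept = least_squares_line(reading, english)
r = correlation(reading, english)
print("Correlation coefficient:", round(r, 2))
for score in (95, 65):
    print(f"Reading {score} -> predicted English {slope * score + intercept:.1f}")
```

A correlation coefficient between 0 and 1 agrees with the scatter diagram showing a positive correlation that is not perfect.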
So in conclusion to this hypothesis, I think that it is correct, although in a minority of instances it may not be the case. Generally I agree with this statement and I think the work I have done backs up my case, as from it I was able to compare the scores and justify my suggestions.
Cumulative frequency and box plots
Now I will start comparing the box plots for each tier. These could show how much better one band performs than the other.
I will look at each box plot for Band A and Band B under each tier (3-5, 4-6, 5-7). Looking at the data from this perspective could show me things that I couldn't compare before, when I only had the frequency tables from which I found the mean scores. Box plots enable you to compare the median and interquartile range of each band.
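The median and quartiles that a box plot shows can be estimated from a cumulative frequency table, as in the sketch below for the overall Band A scores. It assumes the cumulative frequency points are joined with straight lines and that the quartiles sit at one quarter, one half and three quarters of the total frequency. Values read from a hand-drawn curve will not necessarily match these estimates, and the function name `value_at_cumulative` is made up for this example.

```python
def value_at_cumulative(upper_bounds, cumulative, target):
    """Estimate the score at which the cumulative frequency reaches `target`.

    Each point is (upper class boundary, cumulative frequency); neighbouring
    points are joined with straight lines, starting from (0, 0).
    """
    if target <= 0:
        raise ValueError("target must be positive")
    prev_bound, prev_cum = 0, 0
    for bound, cum in zip(upper_bounds, cumulative):
        if cum >= target:
            share = (target - prev_cum) / (cum - prev_cum)
            return prev_bound + share * (bound - prev_bound)
        prev_bound, prev_cum = bound, cum
    raise ValueError("target is larger than the total frequency")


# Overall Band A: upper class boundaries 10, 20, ..., 120 and cumulative frequencies.
upper_bounds = list(range(10, 130, 10))
cumulative = [0, 0, 1, 5, 8, 9, 16, 20, 28, 34, 40, 42]
n = cumulative[-1]

lower_q = value_at_cumulative(upper_bounds, cumulative, n / 4)
median = value_at_cumulative(upper_bounds, cumulative, n / 2)
upper_q = value_at_cumulative(upper_bounds, cumulative, 3 * n / 4)
iqr = upper_q - lower_q

print("Lower quartile:", round(lower_q, 2))
print("Median:", round(median, 2))
print("Upper quartile:", round(upper_q, 2))
print("Interquartile range:", round(iqr, 2))
```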
Tier 3-5
In tier 3-5 Band A have performed slightly better than Band B. I say this because Band A has an interquartile range of 52, which is only 2 more than Band B's interquartile range of 50. So in this tier there is not a lot between the two bands; they are basically on an even footing.
Tier 4-6
In tier 4-6 Band B have totally outperformed Band A. There is a massive difference of 14 between the two bands' interquartile ranges: Band A's interquartile range is a minute 6, compared to Band B's interquartile range of 20.
Tier 5-7
In this tier there is again not a lot in it when contrasting the interquartile ranges of the two bands. In fact the difference is only 1, in favour of Band A: Band A has an interquartile range of 17 compared to Band B's 16.
Overall for both Bands
When comparing the overall interquartile ranges for the two bands it is difficult to separate them, as Band A and Band B both have an interquartile range of 29. So the cumulative frequency graph and box plots for the overall scores suggest that both bands performed as well as each other.
If you were to look at the individual tiers, Band A had the upper hand on Band B: Band A had a better interquartile range than Band B in two of the three tiers.
Mathematics Coursework Balwant Singh Sandhu 10PT