Statistics Coursework
Tesheen Moosa
Introduction
I have been asked to examine the students’ attendance figures from all year groups (7, 8, 9, 10 and 11) at Hamilton Community College. I will investigate whether the age of the students affects their attendance at school, and whether attendance in turn affects their learning and exam results. To start my research, the school gave me the attendance figures for every year group for the 2003 – 2004 academic year. I will begin processing the data (attendance figures) by reducing the amount I have to handle, using the method of stratified sampling. I will use 20% of the attendance figures from each year, which is an amount I am comfortable working with. A scientific calculator is used to select the attendance figures at random, so that the new set of statistics is not biased by my conscious decisions. Using the new set of data, I will collate the figures in frequency tables (to display all of the frequency distributions) to make interpretation and analysis easier.
Secondly, after collating the data, I will display the new set of data in graphs, diagrams and charts so that it is easier to compare and study the figures. From these, I will calculate the central tendency for each year group (mean and median) and the dispersion of each year group by calculating the quartiles (upper quartile, lower quartile and interquartile range). The interquartile range also ensures that the comparison focuses on the middle 50% of the data, which is less affected by extreme values.
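As a minimal sketch of these measures, the Python below computes the mean, median, quartiles and interquartile range. The five values are only an illustrative subset (the Year 7 attendance figures at positions 1, 8, 14, 15 and 16 of the numbered list), not the full sample, and the variable names are my own. Quartile conventions differ between methods, so quartiles worked out by hand from a cumulative frequency graph may not match these exactly.

```python
import statistics

# Illustrative subset only: five Year 7 attendance percentages taken from
# the numbered list (positions 1, 8, 14, 15 and 16).
attendance = [47.47, 72.22, 76.98, 78.57, 78.57]

mean = statistics.mean(attendance)
median = statistics.median(attendance)

# quantiles(n=4) returns three cut points: lower quartile, median, upper quartile.
lower_q, _, upper_q = statistics.quantiles(attendance, n=4)
iqr = upper_q - lower_q  # spread of the middle 50% of the data

print(f"mean={mean:.2f} median={median:.2f}")
print(f"lower quartile={lower_q:.2f} upper quartile={upper_q:.2f} IQR={iqr:.2f}")
```

The single very low figure (47.47) pulls the mean well below the median, which is one reason to report the median and interquartile range alongside the mean.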
81 | 87.11 | 122 | 92.63 | 163 | 95.79 | 204 | 98.16 |
41 | 79.47 | 82 | 87.37 | 123 | 92.89 | 164 | 95.79 | 205 | 98.42 |
Stratified Sampling
As you can see, the data set is too large and would take a huge amount of time to process and analyse, so I have decided to make it much smaller using the method of stratified sampling. Stratified sampling is used when there is a large amount of data that needs to be processed. The data are first categorised (in this case each attendance figure belongs to its year group), then a random sample is chosen from each category (I used a scientific calculator). The size of each sample is in proportion to the size of its category within the whole data set. The proportion must be the same for all categories (in this case I chose 20% of the attendance figures from each year) so that the investigation is fair.
To make the data smaller and easier to process, I stratified the data by putting it in order within each year. Then I used the RANDOM button on the scientific calculator to pick a smaller new set of data from the original data. Below is how I did it:
Firstly, I counted the number of percentage figures in each year group and entered the count for one year group into the calculator. This number is important because it tells the calculator how big the range of numbers is, e.g. 0 – 100 or 0 – 1000, so the calculator will not give figures higher than the maximum number entered.
I used the 2nd function button to enter the RANDOM (Ran#) mode. The calculator screen should then show the number of data items for that year group followed by ‘Ran#’.
E.g. there are 246 percentage figures for Year 9 (which also means that there are 246 students in Year 9).
So, I enter the number 246 into the calculator and use the 2nd function button to press the RANDOM button (Ran#).
The calculator screen should then show the following:
246Ran#
After that, all I did was press the equals button (=) repeatedly, recording the number that came up on the screen each time.
I ignored any number that had already come up once and recorded the next new number produced by the calculator instead.
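The calculator procedure can be imitated in Python. This is only a sketch under assumptions: it treats each press of the equals button as giving a whole number from 1 to 246, and the function name `draw_unique_positions` is my own. The sample size of 50 used in the example is the Year 9 figure worked out later in this section.

```python
import random


def draw_unique_positions(population_size, sample_size, rng=random):
    """Draw distinct positions from 1..population_size, skipping repeats."""
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        position = rng.randint(1, population_size)  # inclusive at both ends
        if position in seen:
            continue  # a repeat: ignore it and draw again
        seen.add(position)
        chosen.append(position)
    return chosen


# Year 9: 246 students, 50 positions wanted.
year9_positions = draw_unique_positions(246, 50)
print(sorted(year9_positions))
```

Python's built-in `random.sample(range(1, 247), 50)` gives distinct positions in one call; the loop is written out here only because it mirrors the "ignore repeats and press again" method.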
I only record as much data as I need. Since I am only using 20% of the original data, I work out 20% of the original amount of data.
E.g. in Year 9 there are 246 students (the amount of data) and I only want 20% of that amount, so I take 20% of 246.
20% x 246 = 49.2
Because ‘49.2’ is not a whole number, I have to round it. I cannot use 49.2 because I am taking 20% of 246 students (this is discrete quantitative data) and you can’t have 49.2 students. I did not round down to 49, because the sample would then fall slightly short of 20%, so I rounded up to 50 instead. That way the sample is never smaller than 20% of the year group.
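The same rounding-up rule can be checked for every year group with a short sketch. The group sizes are the ones quoted with the random-number lists in this coursework; the function name `sample_size` is my own.

```python
def sample_size(group_size, percent=20):
    """Return percent of group_size, rounded up to a whole student."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (group_size * percent + 99) // 100


year_sizes = {
    "Year 7": 179,
    "Year 8": 235,
    "Year 9": 246,
    "Year 10": 242,
    "Year 11": 243,
}

for year, size in year_sizes.items():
    print(year, size, sample_size(size))
```

Year 8 needs no rounding because 20% of 235 is exactly 47.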
So, I will record 50 different random numbers from the calculator and use the numbered list of attendance percentage figures for Year 9 to look up the new set of data.
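Looking up the sample from the numbered list works like this in a short Python sketch. The Year 9 list is not reproduced in this extract, so the example uses the first 16 entries of the Year 7 numbered list and the first five random positions recorded for Year 7; the variable names are my own.

```python
# First 16 entries of the numbered Year 7 attendance list (position -> percentage).
numbered_list = {
    1: 47.47, 2: 56.52, 3: 59.79, 4: 61.64,
    5: 65.87, 6: 66.14, 7: 67.36, 8: 72.22,
    9: 74.6, 10: 74.6, 11: 75, 12: 75.66,
    13: 76.05, 14: 76.98, 15: 78.57, 16: 78.57,
}

# First five random positions recorded for Year 7.
random_positions = [1, 8, 14, 15, 16]

# The sample is the attendance figure found at each random position.
sample = [numbered_list[position] for position in random_positions]
print(sample)
```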
Below are the lists of random numbers produced by the calculator for each year.
Year 7 (20% x 179 = 35.8, rounded up to 36)
1 | 8 | 14 | 15 | 16 | 24 | 26 | 29 | 34 | 36 |
37 | 45 | 56 | 64 | 66 | 67 | 68 | 71 | 73 | 83 |
87 | 89 | 103 | 112 | 123 | 127 | 136 | 143 | 153 | 158 |
162 | 163 | 164 | 170 | 171 | 179 |
Repeated numbers: 56 and 103
Numbers replacing repeated numbers: 153 and 16
Year 8 (20% x 235 = 47)
14 | 15 | 17 | 20 | 22 | 28 | 33 | 35 | 39 | 41 |
46 | 47 | 54 | 60 | 61 | 63 | 64 | 67 | 69 | 75 |
76 | 78 | 79 | 81 | 88 | 90 | 91 | 93 | 94 | 96 |
97 | 120 | 122 | 128 | 140 | 141 | 145 | 176 | 180 | 192 |
196 | 201 | 205 | 222 | 223 | 220 | 233 |
Repeated numbers: 90, 90 and 14
Numbers replacing repeated numbers: 141, 176 and 94
Year 9 (20% x 246 = 49.2, rounded up to 50)
5 | 8 | 10 | 15 | 18 | 25 | 27 | 28 | 34 | 35 |
37 | 40 | 43 | 44 | 45 | 50 | 53 | 56 | 59 | 60 |
72 | 79 | 95 | 96 | 100 | 129 | 130 | 140 | 145 | 148 |
151 | 153 | 158 | 164 | 166 | 167 | 168 | 169 | 178 | 181 |
183 | 193 | 195 | 202 | 213 | 215 | 234 | 240 | 241 | 242 |
Repeated numbers: 50, 79, 100, 27 and 166
Numbers replacing repeated numbers: 164, 130, 59, 28 and 151
Year 10 (20% x 242 = 48.4, rounded up to 49)
3 | 4 | 8 | 12 | 16 | 22 | 24 | 36 | 44 | 46 |
49 | 52 | 58 | 62 | 76 | 79 | 83 | 85 | 91 | 95 |
98 | 107 | 114 | 117 | 121 | 126 | 127 | 128 | 136 | 144 |
145 | 146 | 153 | 157 | 158 | 162 | 173 | 177 | 185 | 187 |
195 | 211 | 221 | 225 | 231 | 234 | 240 | 241 | 242 |
Repeated numbers: 225, 4, 145, 79 and 146
Numbers replacing repeated numbers: 24, 83, 22, 85 and 114
Year 11 (20% x 243 = 48.6, rounded up to 49)
4 | 5 | 6 | 7 | 30 | 32 | 33 | 35 | 41 | 46 |
52 | 64 | 69 | 70 | 71 | 75 | 77 | 80 | 87 | 92 |
95 | 96 | 102 | 105 | 109 | 118 | 120 | 121 | 122 | 123 |
131 | 132 | 136 | 138 | 140 | 150 | 151 | 165 | 173 | 180 |
188 | 189 | 203 | 206 | 208 | 213 | 215 | 237 | 238 |
Repeated numbers: 203, 80 and 203
Numbers replacing repeated numbers: 46, 118 and 180
Below are the attendance figures of all the students at Hamilton Community College that were given to me. The highlighted cells with the bold numbers are the new set of data (the stratified sample).
Year 7
1 | 47.47 | 46 | 87.3 | 91 | 93.39 | 136 | 96.56 |
2 | 56.52 | 47 | 87.83 | 92 | 93.65 | 137 | 96.83 |
3 | 59.79 | 48 | 87.83 | 93 | 93.92 | 138 | 96.83 |
4 | 61.64 | 49 | 87.83 | 94 | 93.92 | 139 | 97.09 |
5 | 65.87 | 50 | 88.1 | 95 | 93.92 | 140 | 97.11 |
6 | 66.14 | 51 | 88.1 | 96 | 93.92 | 141 | 97.14 |
7 | 67.36 | 52 | 88.36 | 97 | 93.92 | 142 | 97.24 |
8 | 72.22 | 53 | 88.36 | 98 | 94.18 | 143 | 97.35 |
9 | 74.6 | 54 | 88.48 | 99 | 94.18 | 144 | 97.35 |
10 | 74.6 | 55 | 88.62 | 100 | 94.44 | 145 | 97.35 |
11 | 75 | 56 | 88.67 | 101 | 94.64 | 146 | 97.62 |
12 | 75.66 | 57 | 88.89 | 102 | 94.67 | 147 | 97.62 |
13 | 76.05 | 58 | 89.09 | 103 | 94.68 | 148 | 97.88 |
14 | 76.98 | 59 | 89.15 | 104 | 94.71 | 149 | 97.88 |
15 | 78.57 | 60 | 89.15 | 105 | 94.97 | 150 | 97.88 |
16 | 78.57 | 61 | 89.17 | 106 | 94.97 | 151 | 97.88 |
17 | 80.16 | 62 | 89.68 | 107 | 94.97 | 152 | 98.15 |
18 | 80.95 | 63 | 89.68 | 108 | 95 | 153 | 98.15 |
19 | 81.48 | 64 | 89.68 | 109 | 95.24 | 154 | 98.15 |
20 | 81.75 | 65 | 89.68 | 110 | 95.24 | 155 | 98.15 |
21 | 81.87 | 66 | 89.68 | 111 | 95.49 | 156 | 98.16 |
22 | 82.01 | 67 | 89.93 | 112 | 95.5 | 157 | 98.41 |
23 | 82.54 | 68 | 89.95 | 113 | 95.5 | 158 | 98.41 |
24 | 82.54 | 69 | 90.21 | 114 | 95.77 | 159 | 98.41 |
25 | 82.8 | 70 | 90.21 | 115 | 95.77 | 160 | 98.52 |
26 | 83.33 | 71 | 90.48 | 116 | 95.77 | 161 | 98.68 |
27 | 83.33 | 72 | 90.48 | 117 | 95.77 | 162 | 98.68 |
28 | 84.04 | 73 | 90.52 | 118 | 95.77 | 163 | 98.94 |
29 | 84.39 | 74 | 90.74 | 119 | 95.77 | 164 | 98.94 |
30 | 84.66 | 75 | 90.74 | 120 | 95.77 | 165 | 98.94 |
31 | 85.92 | 76 | 90.74 | 121 | 96.03 | 166 | 98.99 |
32 | 85.09 | 77 | 90.76 | 122 | 96.03 | 167 | 99.21 |
33 | 85.19 | 78 | 91.53 | 123 | 96.03 | 168 | 99.47 |
34 | 85.45 | 79 | 91.54 | 124 | 96.12 | 169 | 99.47 |
35 | 85.71 | 80 | 92.06 | 125 | 96.3 | 170 | 99.47 |
36 | 85.98 | 81 | 92.06 | 126 | 96.3 | 171 | 99.47 |
37 | 86.24 | 82 | 92.31 | 127 | 96.3 | 172 | 99.74 |
38 | 86.24 | 83 | 92.33 | 128 | 96.3 | 173 | 99.74 |
39 | 86.51 | 84 | 92.59 | 129 | 96.3 | 174 | 100 |
40 | 86.51 | 85 | 92.59 | 130 | 96.34 | 175 | 100 |
41 | 86.77 | 86 | 92.59 | 131 | 96.56 | 176 | 100 |
42 | 86.96 | 87 | 92.86 | 132 | 96.56 | 177 | 100 |
43 | 87.04 | 88 | 92.94 | 133 | 96.56 | 178 | 100 |
44 | 87.04 | 89 | 93.12 | 134 | 96.56 | 179 | 100 |
45 | 87.3 | 90 | 93.15 | 135 | 96.56 |
Year 8
1 | 16.14 | 48 | 84.66 | 95 | 90.21 | 142 | 94.97 | 189 | 97.62 |
Conclusion
As you can see, my hypothesis here is proven right. However, I do believe that other factors could affect the students’ attendance, for example the environment they live in.
Hypothesis 2
There is a relationship between the attendance of the students and their exam results. Students who come to school often or every day to learn tend to improve and have much better exam results than those who don’t. For this hypothesis I do not have solid proof, but while calculating the Spearman’s rank correlation coefficient I was able to spot that the highest value added goes with the students who have a full 100% attendance figure.
Evaluation
In conclusion to this investigation, one thing that I really learned is that calculating standard deviation takes a lot of patience. If I had the chance to do this coursework again, I would like to use the full data to make the results more precise, and possibly use other graphs and diagrams to display the data. I would also like to link all of the results in a much clearer way.