# Statistics Coursework

Tesheen Moosa

Introduction

I have been asked to examine the students’ attendance figures for all year groups (7, 8, 9, 10 and 11) at Hamilton Community College. I will investigate whether the age of the students affects their attendance at school, and whether attendance in turn affects their learning and exam results. To start my research, the school gave me the attendance figures for every year group for the 2003 – 2004 academic year. I will first reduce the amount of data that I have to process by using stratified sampling, taking 20% of the attendance figures from each year. I will use a scientific calculator to select the attendance figures at random, so that the new set of data isn’t biased or affected by my own conscious decisions. I will then collate the sampled data in frequency tables (to display the frequency distributions) so that it is easy to interpret and analyse.

Secondly, after collating the data, I will display the new set of data in graphs, diagrams and charts so that it is easier for me to compare and study the figures. From these I will calculate the central tendency for each year group (mean and median) and the dispersion of each year group by calculating the quartiles (lower quartile, upper quartile and interquartile range). Using the interquartile range also ensures that the figures I compare come from the middle 50% of the data, so the comparison is not distorted by extreme values.
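The measures named above can be sketched in a few lines of Python. This is only an illustration, not the method used in the coursework: the eight values are the first eight Year 7 attendance percentages from the numbered list later in this document, and the quartile convention (the default of `statistics.quantiles`) is an assumption, since other conventions give slightly different quartiles.

```python
import statistics

# Illustrative values only: the first eight Year 7 attendance percentages
# from the numbered list later in this document (already in ascending order).
attendance = [47.47, 56.52, 59.79, 61.64, 65.87, 66.14, 67.36, 72.22]

# Central tendency.
mean = statistics.mean(attendance)
median = statistics.median(attendance)

# Dispersion: quantiles(..., n=4) returns the three quartile cut points.
# The default "exclusive" method is an assumption; taking the median of
# each half of the data is another common convention.
lower_q, _, upper_q = statistics.quantiles(attendance, n=4)
iqr = upper_q - lower_q

print(f"mean={mean:.2f} median={median:.2f} IQR={iqr:.2f}")
```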

81 | 87.11 | 122 | 92.63 | 163 | 95.79 | 204 | 98.16 |

41 | 79.47 | 82 | 87.37 | 123 | 92.89 | 164 | 95.79 | 205 | 98.42 |

Stratified Sampling

As you can see, the data set is too large and would take a huge amount of time to process and analyse, so I have decided to make it much smaller using stratified sampling. Stratified sampling is used when there is a large amount of data that needs to be processed. The data is first divided into categories (in this case each attendance figure belongs to a year group), and then a random sample is chosen from each category (I used a scientific calculator). The size of each sample is in proportion to the size of its category within the whole data set. The proportion must be the same for all categories (in this case I have chosen 20% of the attendance figures from each year) so that the investigation is fair.

To make the data smaller and easier to process, I stratified it by putting it in order within each year. Then I used the RANDOM button on the scientific calculator to pick a smaller set of data from the original. Below is how I did it:

Firstly, I counted the number of percentage figures in each year group and entered the count for one year group into the calculator. This number is important because it tells the calculator how big the range of numbers is, e.g. 0 – 100 or 0 – 1000, so the calculator will not give figures higher than the maximum number entered.

I used the 2nd function button to enter the RANDOM (Ran#) mode. The calculator screen should then show the number of figures for that year group followed by ‘Ran#’.

E.g. there are 246 percentage figures for Year 9 (which also means that there are 246 students in Year 9).

So, I will enter the number 246 into the calculator and use the 2nd function button to select the RANDOM function (Ran#).

Therefore, the calculator screen should show the information below:

246Ran#

After that, all I did was press the equals button (=) repeatedly, recording the number that came up on the screen each time.

I ignored any number that had already come up once and wrote down the next number that the calculator produced instead.

I will only record as many numbers as I need. Since I will only be using 20% of the original data, I will take 20% of the original number of figures.

E.g. in Year 9 there are 246 students (the amount of data) and I only want 20% of that amount, so I will take 20% of 246.

20% x 246 = 49.2

Because ‘49.2’ is not a whole number, I have to round it. I cannot use 49.2 because I am taking 20% of 246 students (the number of students is discrete quantitative data) and you can’t have 49.2 students. I won’t round it down to 49, because then 0.2 of a student would be missing from the sample, so I will round it up to 50 instead. That way no one is missing.
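The same round-up rule can be checked for every year group with a short Python sketch. The year sizes are the ones quoted in this coursework. The function name `sample_size` is made up for illustration, and the integer ceiling division is just one way of always rounding up.

```python
def sample_size(group_size, percent=20):
    """Return percent% of group_size, always rounded up to a whole student."""
    # Integer ceiling division: avoids floating-point rounding surprises.
    return (group_size * percent + 99) // 100


# Number of attendance figures in each year group, as quoted in the text.
year_sizes = {"Year 7": 179, "Year 8": 235, "Year 9": 246,
              "Year 10": 242, "Year 11": 243}

sample_sizes = {year: sample_size(n) for year, n in year_sizes.items()}
print(sample_sizes)
# → {'Year 7': 36, 'Year 8': 47, 'Year 9': 50, 'Year 10': 49, 'Year 11': 49}
```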

So, I will record 50 different random numbers from the calculator and use them to pick figures from the numbered list of attendance percentages for Year 9. These figures form the new set of data.
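The button-pressing routine described above (take the next random number, ignore it if it has already come up, stop once enough numbers are recorded) can be imitated in Python. This is a sketch under stated assumptions rather than a copy of what the calculator does: `random.randint` stands in for one press of the equals button, the function name `draw_positions` is made up, and the seed 2004 is arbitrary and only there to make the run repeatable.

```python
import random


def draw_positions(population_size, sample_size, rng):
    """Draw distinct positions in 1..population_size, skipping repeats."""
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        # Stand-in for one press of "=" on the calculator (assumption).
        number = rng.randint(1, population_size)
        if number in seen:
            continue  # a repeat: ignore it and take the next number
        seen.add(number)
        chosen.append(number)
    return chosen


# Year 9: 246 figures, 50 of them wanted (20% rounded up).
positions = draw_positions(246, 50, random.Random(2004))
print(sorted(positions))
```

Python's `random.sample(range(1, 247), 50)` would give a sample without repeats in one call; the loop is written out here only because it mirrors the manual steps.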

Below are the lists of random numbers produced by the calculator for each year.

Year 7 (20% x 179 = 35.8, rounded up to 36)

1 | 8 | 14 | 15 | 16 | 24 | 26 | 29 | 34 | 36 |

37 | 45 | 56 | 64 | 66 | 67 | 68 | 71 | 73 | 83 |

87 | 89 | 103 | 112 | 123 | 127 | 136 | 143 | 153 | 158 |

162 | 163 | 164 | 170 | 171 | 179 |

Repeated numbers: 56 and 103

Numbers replacing repeated numbers: 153 and 16

Year 8 (20% x 235 = 47)

14 | 15 | 17 | 20 | 22 | 28 | 33 | 35 | 39 | 41 |

46 | 47 | 54 | 60 | 61 | 63 | 64 | 67 | 69 | 75 |

76 | 78 | 79 | 81 | 88 | 90 | 91 | 93 | 94 | 96 |

97 | 120 | 122 | 128 | 140 | 141 | 145 | 176 | 180 | 192 |

196 | 201 | 205 | 222 | 223 | 220 | 233 |

Repeated numbers: 90, 90 and 14

Numbers replacing repeated numbers: 141, 176 and 94

Year 9 (20% x 246 = 49.2, rounded up to 50)

5 | 8 | 10 | 15 | 18 | 25 | 27 | 28 | 34 | 35 |

37 | 40 | 43 | 44 | 45 | 50 | 53 | 56 | 59 | 60 |

72 | 79 | 95 | 96 | 100 | 129 | 130 | 140 | 145 | 148 |

151 | 153 | 158 | 164 | 166 | 167 | 168 | 169 | 178 | 181 |

183 | 193 | 195 | 202 | 213 | 215 | 234 | 240 | 241 | 242 |

Repeated numbers: 50, 79, 100, 27 and 166

Numbers replacing repeated numbers: 164, 130, 59, 28 and 151

Year 10 (20% x 242 = 48.4, rounded up to 49)

3 | 4 | 8 | 12 | 16 | 22 | 24 | 36 | 44 | 46 |

49 | 52 | 58 | 62 | 76 | 79 | 83 | 85 | 91 | 95 |

98 | 107 | 114 | 117 | 121 | 126 | 127 | 128 | 136 | 144 |

145 | 146 | 153 | 157 | 158 | 162 | 173 | 177 | 185 | 187 |

195 | 211 | 221 | 225 | 231 | 234 | 240 | 241 | 242 |

Repeated numbers: 225, 4, 145, 79 and 146

Numbers replacing repeated numbers: 24, 83, 22, 85 and 114

Year 11 (20% x 243 = 48.6, rounded up to 49)

4 | 5 | 6 | 7 | 30 | 32 | 33 | 35 | 41 | 46 |

52 | 64 | 69 | 70 | 71 | 75 | 77 | 80 | 87 | 92 |

95 | 96 | 102 | 105 | 109 | 118 | 120 | 121 | 122 | 123 |

131 | 132 | 136 | 138 | 140 | 150 | 151 | 165 | 173 | 180 |

188 | 189 | 203 | 206 | 208 | 213 | 215 | 237 | 238 |

Repeated numbers: 203, 80 and 203

Numbers replacing repeated numbers: 46, 118 and 180
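A list of random numbers like the ones above can be checked quickly in Python: it should contain the right count, no number twice, and nothing outside the numbered list. The sketch below uses the Year 7 list exactly as recorded. The helper name `check_sample` is made up for illustration.

```python
# Year 7 random numbers as recorded above.
year7_positions = [
    1, 8, 14, 15, 16, 24, 26, 29, 34, 36,
    37, 45, 56, 64, 66, 67, 68, 71, 73, 83,
    87, 89, 103, 112, 123, 127, 136, 143, 153, 158,
    162, 163, 164, 170, 171, 179,
]


def check_sample(positions, group_size, expected_count):
    """Return a list of problems found in a list of sampled positions."""
    problems = []
    if len(positions) != expected_count:
        problems.append(
            f"expected {expected_count} positions, got {len(positions)}")
    if len(set(positions)) != len(positions):
        problems.append("some positions appear more than once")
    if any(p < 1 or p > group_size for p in positions):
        problems.append("some positions fall outside the numbered list")
    return problems


# Year 7 has 179 figures and needs 36 of them.
year7_problems = check_sample(year7_positions, 179, 36)
print(year7_problems)  # an empty list means no problems were found
```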

Below are the attendance figures of all the students in Hamilton Community College that were given to me. The highlighted cells with the bold numbers are the new set of data (the stratified sample).

Year 7

1 | 47.47 | 46 | 87.3 | 91 | 93.39 | 136 | 96.56 |

2 | 56.52 | 47 | 87.83 | 92 | 93.65 | 137 | 96.83 |

3 | 59.79 | 48 | 87.83 | 93 | 93.92 | 138 | 96.83 |

4 | 61.64 | 49 | 87.83 | 94 | 93.92 | 139 | 97.09 |

5 | 65.87 | 50 | 88.1 | 95 | 93.92 | 140 | 97.11 |

6 | 66.14 | 51 | 88.1 | 96 | 93.92 | 141 | 97.14 |

7 | 67.36 | 52 | 88.36 | 97 | 93.92 | 142 | 97.24 |

8 | 72.22 | 53 | 88.36 | 98 | 94.18 | 143 | 97.35 |

9 | 74.6 | 54 | 88.48 | 99 | 94.18 | 144 | 97.35 |

10 | 74.6 | 55 | 88.62 | 100 | 94.44 | 145 | 97.35 |

11 | 75 | 56 | 88.67 | 101 | 94.64 | 146 | 97.62 |

12 | 75.66 | 57 | 88.89 | 102 | 94.67 | 147 | 97.62 |

13 | 76.05 | 58 | 89.09 | 103 | 94.68 | 148 | 97.88 |

14 | 76.98 | 59 | 89.15 | 104 | 94.71 | 149 | 97.88 |

15 | 78.57 | 60 | 89.15 | 105 | 94.97 | 150 | 97.88 |

16 | 78.57 | 61 | 89.17 | 106 | 94.97 | 151 | 97.88 |

17 | 80.16 | 62 | 89.68 | 107 | 94.97 | 152 | 98.15 |

18 | 80.95 | 63 | 89.68 | 108 | 95 | 153 | 98.15 |

19 | 81.48 | 64 | 89.68 | 109 | 95.24 | 154 | 98.15 |

20 | 81.75 | 65 | 89.68 | 110 | 95.24 | 155 | 98.15 |

21 | 81.87 | 66 | 89.68 | 111 | 95.49 | 156 | 98.16 |

22 | 82.01 | 67 | 89.93 | 112 | 95.5 | 157 | 98.41 |

23 | 82.54 | 68 | 89.95 | 113 | 95.5 | 158 | 98.41 |

24 | 82.54 | 69 | 90.21 | 114 | 95.77 | 159 | 98.41 |

25 | 82.8 | 70 | 90.21 | 115 | 95.77 | 160 | 98.52 |

26 | 83.33 | 71 | 90.48 | 116 | 95.77 | 161 | 98.68 |

27 | 83.33 | 72 | 90.48 | 117 | 95.77 | 162 | 98.68 |

28 | 84.04 | 73 | 90.52 | 118 | 95.77 | 163 | 98.94 |

29 | 84.39 | 74 | 90.74 | 119 | 95.77 | 164 | 98.94 |

30 | 84.66 | 75 | 90.74 | 120 | 95.77 | 165 | 98.94 |

31 | 85.92 | 76 | 90.74 | 121 | 96.03 | 166 | 98.99 |

32 | 85.09 | 77 | 90.76 | 122 | 96.03 | 167 | 99.21 |

33 | 85.19 | 78 | 91.53 | 123 | 96.03 | 168 | 99.47 |

34 | 85.45 | 79 | 91.54 | 124 | 96.12 | 169 | 99.47 |

35 | 85.71 | 80 | 92.06 | 125 | 96.3 | 170 | 99.47 |

36 | 85.98 | 81 | 92.06 | 126 | 96.3 | 171 | 99.47 |

37 | 86.24 | 82 | 92.31 | 127 | 96.3 | 172 | 99.74 |

38 | 86.24 | 83 | 92.33 | 128 | 96.3 | 173 | 99.74 |

39 | 86.51 | 84 | 92.59 | 129 | 96.3 | 174 | 100 |

40 | 86.51 | 85 | 92.59 | 130 | 96.34 | 175 | 100 |

41 | 86.77 | 86 | 92.59 | 131 | 96.56 | 176 | 100 |

42 | 86.96 | 87 | 92.86 | 132 | 96.56 | 177 | 100 |

43 | 87.04 | 88 | 92.94 | 133 | 96.56 | 178 | 100 |

44 | 87.04 | 89 | 93.12 | 134 | 96.56 | 179 | 100 |

45 | 87.3 | 90 | 93.15 | 135 | 96.56 |
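Picking the sample out of the numbered list is a simple lookup by position. As a small illustration, the Python sketch below holds only the first sixteen rows of the Year 7 list above and looks up the five Year 7 random numbers that fall within them (1, 8, 14, 15 and 16). The variable names are made up for this example.

```python
# First sixteen rows of the Year 7 numbered list, keyed by position.
year7_figures = {
    1: 47.47, 2: 56.52, 3: 59.79, 4: 61.64, 5: 65.87, 6: 66.14,
    7: 67.36, 8: 72.22, 9: 74.6, 10: 74.6, 11: 75, 12: 75.66,
    13: 76.05, 14: 76.98, 15: 78.57, 16: 78.57,
}

# Year 7 random numbers that fall within those sixteen rows.
sampled_positions = [1, 8, 14, 15, 16]

# Look up the attendance figure stored at each sampled position.
sampled_figures = [year7_figures[p] for p in sampled_positions]
print(sampled_figures)
# → [47.47, 72.22, 76.98, 78.57, 78.57]
```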

Year 8

1 | 16.14 | 48 | 84.66 | 95 | 90.21 | 142 | 94.97 | 189 | 97.62 |

Conclusion
As you can see, my hypothesis here is proven right. However, I do believe that there are other factors that could affect the students’ attendance, for example the environment they live in.

Hypothesis 2

There is a relationship between the attendance of the students and their exam results. Students who come to school often or every day to learn tend to improve and have much better exam results than those who don’t. For this hypothesis, even though I do not have solid proof, I was able to spot during the calculation of Spearman’s rank correlation coefficient that the highest value added goes with the students who have a full 100% attendance figure.

Evaluation

In conclusion to this investigation, one thing that I really learned is that calculating standard deviation takes a lot of patience. If I had the chance to do this coursework again, I would like to use the full data to make it more precise, and possibly use other graphs and diagrams to display the data. I would also like to link all of the results in a much clearer way.

This student-written piece of work is one of many that can be found in our AS and A Level Probability & Statistics section.