I am going to design and then carry out an experiment to test people's reaction times, and therefore test my initial hypothesis.
Mathematics Statistics Coursework
Initial Hypothesis: Some people have faster reaction times than others
To design my investigation, I first need to carry out a preliminary test to see what variables there are and how I will control them. To test reaction times, I dropped a special piece of card (the ‘ruler’), with numbers along it recording how many hundredths of a second it takes for the person to catch it. From the information gained by this test, I set my main rules.
However, I noticed that some people did not catch the ruler at all, completely missing it. I decided that if this happened, a reading of 30 would be taken as the recorded time. This is because the time could not simply be recorded as ‘ - ’, as this is unplottable. Nor could ‘0’ be put down as the recorded time, as this is the fastest time in which you can catch the ruler and would imply that the participant has a very fast reaction time as opposed to a very slow one.
Now that I have set the rules, I can make a more precise hypothesis, relevant to who I will test this on. I have decided to test the reaction times of girls at Withington, between and including years 7 and 11. This will therefore be 325 girls aged 11-16. I chose this as I think this is an appropriate sample for my original hypothesis. It is easy to collect data for, and a good way to test this very general hypothesis.
The data was collected by forms and then combined into year groups. This then makes one large set of data, as the population is too big for the data to be collected individually. As these rules have been followed, the results should hopefully be accurate.
The results from the tests backed up my original hypothesis, as many different reaction times were collected from all the different girls. I can now move on to make a more precise hypothesis…
… Girls between 11 and 16 have faster reaction times in the morning than in the afternoon.
This is an appropriate hypothesis to make based on the sample I took for the entire population (325 girls aged 11-16 at WGS).
Sampling my collected data
I will sample my data further as there are too many people included. Interpreting all of the data would take too much time, and the resulting graphs would be too cluttered. I will make my sample a size of 60 pupils, as I feel this will provide me with enough data to reach accurate conclusions and test my hypothesis, but not so much data that all my graphs are cluttered.
I have chosen to do a stratified sample. This is because I wish to have a range of pupils, and if I just did a random sample, I could end up with mostly older girls, or mostly younger. First I will find out how many girls’ data will be needed from each year group. I will then give each girl in each year group a number, and use the random number generator separately for each year group to determine which pieces of data I will use.
Year group 1 (years 7-8): 80/325 x 60 = 14.769…
Year group 2 (years 8-9): 79/325 x 60 = 14.584…
Year group 3 (years 9-10): 83/325 x 60 = 15.323…
Year group 4 (years 10-11): 83/325 x 60 = 15.323…
As I cannot take part of a set of data, I will have to round these numbers to the nearest whole numbers. This means that 15 sets of data will be needed from each year group.
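The stratified sampling steps above can be sketched in a few lines of Python. This is only an illustration: the group sizes and the sample size of 60 come from the calculations above, while the function names, the fixed seed and the use of Python's random module in place of a calculator's random number generator are my own assumptions.

```python
import random

# Year-group sizes and overall sample size, as given in the text.
group_sizes = {
    "Year group 1": 80,
    "Year group 2": 79,
    "Year group 3": 83,
    "Year group 4": 83,
}
sample_size = 60


def allocate(sizes, total_sample):
    """Share the sample between groups in proportion to their size,
    rounding each share to the nearest whole pupil."""
    population = sum(sizes.values())
    return {name: round(n / population * total_sample) for name, n in sizes.items()}


def pick_pupils(group_size, how_many, rng):
    """Pick pupil numbers from 1..group_size at random, with no repeats."""
    return sorted(rng.sample(range(1, group_size + 1), how_many))


allocation = allocate(group_sizes, sample_size)

# The seed is an assumption, used only so that the example is repeatable.
rng = random.Random(1)
chosen = {
    name: pick_pupils(group_sizes[name], how_many, rng)
    for name, how_many in allocation.items()
}

for name in group_sizes:
    print(name, allocation[name], chosen[name])
```

Rounding each share separately happens to give a total of exactly 60 here. With other group sizes the rounded shares might not add up to the intended sample size, and one group would then need adjusting by hand.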
Interpreting the data
There are many different aspects of the data I could use to make graphs and to interpret the data in order to support or disprove my hypothesis. However, I have chosen to use the median and the best time for each set of data, with both dominant and non-dominant hands, in both am and pm. I chose the median because it is not affected by the very slow or very fast times, so it gives the most typical time for the pupil and hopefully cuts out the effect of the main outliers. The best time will give me an idea of how good each of the girls is when she concentrates and tries her hardest.
First of all I will create a table of all the medians and best times from which I can make my graphs.
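As a small illustration of how one row of such a table could be worked out, the sketch below finds the median and best (lowest) time for each set of attempts. The reaction times are made up for the example and are not my real data; the function name and the layout of the dictionary are also my own assumptions.

```python
from statistics import median


def summarise(times):
    """Return the median and the best (fastest, so lowest) time of one set."""
    return {"median": median(times), "best": min(times)}


# Hypothetical attempts for one pupil, in hundredths of a second.
# A missed catch is recorded as 30, following the rule set earlier.
attempts = {
    ("dominant", "am"): [17, 15, 30, 16, 18],
    ("non-dominant", "am"): [19, 21, 18, 22, 20],
    ("dominant", "pm"): [20, 18, 19, 23, 21],
    ("non-dominant", "pm"): [24, 22, 30, 21, 23],
}

table = {key: summarise(times) for key, times in attempts.items()}

for (hand, session), row in table.items():
    print(hand, session, row["median"], row["best"])
```

In the first set, the missed catch (30) does not move the median, which stays at 17. This is the reason for choosing the median rather than the mean.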
Using Autograph, I created box and whisker diagrams, making it easier to compare my data. These are an easy way to compare data as you can easily see where the bulk of the reaction times are, and how fast they are, as the middle 50% of the times will be in the ‘box’. The whiskers will show me how far the remaining times stretch, as the most typical times will all be in the box, and the few which stretch outside the interquartile range are less common times. The further away from the boxes that the whiskers stretch, the further away from the typical times the extreme values are.
I collected together both the non dominant and dominant results for in the morning, and then again for in the afternoon. I then plotted these results.
I have noticed a few things about this box and whisker diagram. Firstly, the diagram supports my hypothesis that the participants would have faster reaction times in the morning than in the afternoon. The interquartile ranges of the two boxplots are similar in size, but the box of the ‘am’ boxplot (the blue diagram at the top) is further towards the left than that of the ‘pm’ boxplot (the yellow diagram at the bottom). This shows that, on average, the girls had faster reaction times in the morning than in the afternoon. However, the whisker on the ‘pm’ boxplot stretches further to the left. This tells me that to get a better idea of whether or not my hypothesis is accurate, I need to check for and eliminate any outliers. To work out which values are outliers, I must find the interquartile range and multiply it by 1.5 (call this value f ). Outliers are values that lie more than f to the left of the lower quartile or more than f to the right of the upper quartile.
The lower quartile is 15.5 and the upper quartile is 20, so the interquartile range is 20 - 15.5 = 4.5
f is equal to 4.5 x 1.5 which is 6.75
Therefore any reaction time below 8.75 (15.5 - 6.75) or above 26.75 (20 + 6.75) is an outlier.
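The outlier calculation above can be checked with a short Python sketch. The quartiles (15.5 and 20) and the multiplier of 1.5 are taken from the text; the function names and the list of example times are made up for illustration and are not my real results.

```python
def outlier_fences(lower_quartile, upper_quartile, multiplier=1.5):
    """Return the lowest and highest values that still count as non-outliers."""
    interquartile_range = upper_quartile - lower_quartile
    f = multiplier * interquartile_range
    return lower_quartile - f, upper_quartile + f


def split_outliers(times, low, high):
    """Separate times into those inside the fences and those outside them."""
    kept = [t for t in times if low <= t <= high]
    outliers = [t for t in times if t < low or t > high]
    return kept, outliers


low, high = outlier_fences(15.5, 20)
print(low, high)  # 8.75 26.75

# Hypothetical reaction times in hundredths of a second.
example_times = [9, 14, 16, 18, 21, 27, 30]
kept, outliers = split_outliers(example_times, low, high)
print(kept)      # [9, 14, 16, 18, 21]
print(outliers)  # [27, 30]
```

Note that a missed catch, recorded as 30, always falls above the upper limit of 26.75, so with these quartiles every missed catch is treated as an outlier.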
I will now eliminate the outliers. There were in fact no outliers in the ‘am’ boxplot but several in the ‘pm’ boxplot. This is my new boxplot.
Here I can again see that the box of the ‘am’ boxplot is further to the left, supporting my hypothesis. However, this time I notice that the ‘am’ boxplot has a smaller interquartile range than the 'pm' boxplot. This means that the middle half of the 'am' times is more tightly grouped, while the 'pm' times are more spread out. I notice that whereas the 'am' boxplot is nearly symmetrical (perhaps with a slight positive skew), the 'pm' boxplot has quite a pronounced positive skew. I therefore expect the histograms to show the same shapes.
This first histogram is for 'am'
It is almost perfectly symmetrical, and appears to follow a normal distribution. However, to check that it is normally distributed, I must use the standard deviation (s.d.). Standard deviations are measured either side of the mean.
About 68% of the data must be within one s.d. either side of the mean
About 95% of the data must be within two s.d. either side of the mean
About 99.7% of the data must be within three s.d. either side of the mean
As all three of these requirements have been met, I can conclude that my 'am' results are of normal distribution.
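The three checks above can be sketched in Python as follows. This is only a rough check against the percentages expected for a normal distribution, not a formal test of normality. The function name and the example times are my own assumptions, and I have assumed the population standard deviation (pstdev) is the one wanted.

```python
from statistics import mean, pstdev


def share_within(times, k):
    """Fraction of the times lying within k standard deviations of the mean."""
    m = mean(times)
    s = pstdev(times)  # population standard deviation (an assumption)
    inside = [t for t in times if abs(t - m) <= k * s]
    return len(inside) / len(times)


# Percentages expected for a normal distribution (approximate).
expected = {1: 0.68, 2: 0.95, 3: 0.997}

# Hypothetical reaction times in hundredths of a second.
example_times = [10, 12, 14, 16, 18]

for k, target in expected.items():
    share = share_within(example_times, k)
    print(k, "s.d.:", round(share * 100, 1), "% (expected about", target * 100, "%)")
```

With a small sample the percentages can only take a few possible values, so they should be compared with the expected figures approximately rather than exactly.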
This is my histogram for the 'pm' results. I can see that, just as the boxplot was, this is positively skewed. This means that most of the data is bunched at the lower values, with a long tail stretching towards the higher values (i.e. longer times, as it took some participants longer to catch the ruler), and therefore a number of people were noticeably slower in this afternoon histogram. This is not a normal distribution, as 95% of the data does not lie within 2 s.d. either side of the mean.
I can conclude that my hypothesis has been proven, and that ‘Girls between 11 and 16 at Withington Girls School DO have faster reaction times in the morning than in the afternoon.’
As a small extension, I decided to test three other girls outside our school aged between 11 and 16. This will give an indication of whether the data I collected was representative of the entire population of girls aged 11-16. I predict that the results should be around the same, as which school you attend should not affect whether or not your reaction times are quicker in the morning than in the afternoon. The results were as follows:
These results show that girls aged 11-16 have quicker reaction times in the morning than in the afternoon. However, to really prove this, I would need to take reaction times from many more girls aged 11-16 across the country.
If I had had more time, I would have looked into whether or not the same hypothesis could be proved in males aged 11-16. I would also have seen if age made a difference, for example whether people aged 71-76 would have slower reaction times than those aged 11-16. I could have looked at all sorts of aspects (age, gender, environment etc.) and seen how these altered reaction times. The variables I listed at the beginning could have been altered to see how this affected reaction times.
I might also have taken certain measures to ensure my data was more accurate. For example I could have…
- taken a larger sample size: in a larger sample, trends would have been easier to identify
- made the participants repeat the experiment more than 5 times for each of the dominant and non-dominant hands in both am and pm
- used a computerized device to measure the reaction times: results such as some participants taking 0 hundredths of a second are clearly not possible and therefore inaccurate, showing how easily human error can occur