Rogue values are implausible estimates that do not follow the natural pattern of the other results. In my investigation, a rogue value is any value less than 1.00m or greater than 2.00m (values of exactly 1.00m and 2.00m are accepted). I will not include these values as they could greatly affect my final calculations, such as the mean and standard deviation.
I am now going to calculate and record my fifty numbers for each year group. Below are my two tables of lengths.
In my Year Seven data set I found two rogue values (11.75m and 12.13m). I disregarded these and chose new values. I found no rogue values in my Year Eleven data set.
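The rule for rogue values, and the step of choosing new values to replace them, can be written as a short program. This is only a minimal sketch in Python: the list of estimates is made up for illustration (apart from the two rogue values 11.75m and 12.13m found above), and the function names are my own.

```python
import random

LOWER, UPPER = 1.00, 2.00  # acceptable range in metres, boundaries included


def is_rogue(estimate):
    """Return True if the estimate is below 1.00m or above 2.00m."""
    return estimate < LOWER or estimate > UPPER


def sample_without_rogues(population, size, rng):
    """Pick `size` estimates at random, discarding any rogue values.

    The estimates are put in a random order and read off one by one;
    a rogue value is skipped and the next estimate is taken instead.
    """
    shuffled = list(population)
    rng.shuffle(shuffled)
    kept = [x for x in shuffled if not is_rogue(x)]
    if len(kept) < size:
        raise ValueError("not enough acceptable estimates")
    return kept[:size]


# Hypothetical estimates, for illustration only.
estimates = [1.50, 11.75, 1.62, 12.13, 1.30, 1.75, 2.00, 1.00, 0.99, 2.01]
sample = sample_without_rogues(estimates, 5, random.Random(0))
print(sample)
```

In the real investigation the sample size would be fifty and the list would hold every estimate from the year group.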
Accuracy and Reliability
When the experiment involving the bamboo stick was carried out, there were many factors that could have affected the data collected. I do not know precise details about the experiment, such as where and how it took place. If some pupils saw the stick in a large room and others saw it in a smaller room, this may have affected how long the stick appeared to them. Also, if some saw it from farther away than others, it could have seemed smaller to them. Ideally, the experiment should have taken place in a classroom and all pupils should have seen the stick from the same distance. Also, each pupil should have seen it for the same amount of time and the stick should have been held at the same height and angle (horizontally).
Some pupils may have chosen to give a silly answer on purpose, presenting me with rogue values. There may also have been typing errors or some other error in the recording of the data. I must also acknowledge the fact that I decided on the boundaries for what I would count as acceptable values. I counted rogue values as under 1.00m and over 2.00m, but if I had raised or lowered the boundaries my mean and standard deviation calculations may have been affected, if only slightly.
As I have only taken a sample of fifty pupils’ estimates, I may be losing some accuracy. Also, I am going to group the data into classes and then do my calculations using the mid-points of these classes, which may alter my final results slightly.
Frequency Table
As the sample of fifty is still quite large, I will now put these results into a grouped frequency table. This will help when calculating the mean and standard deviation of both sets of data, and also in the construction of my histograms.
The results of my mean and standard deviation calculations will be given to two decimal places. This is because the bamboo stick was also measured, and estimated, to two decimal places.
Year 7 Grouped Frequency Table
Year 11 Grouped Frequency Table
Grouped Mean
Using my grouped frequency tables above, I will now give estimates for the means of Years Seven and Eleven; each is known as an estimate because I have used mid-points rather than the actual values. The mean is a type of average. In support of my hypothesis, I predict that Year Eleven’s mean will be closer to 1.58m (the correct length of the stick) than Year Seven’s mean.
Year 7
Mean = Σfx / Σf
= 74.7 / 50
= 1.494 = 1.49m (to two decimal places)
Year 11
Mean = Σfx / Σf
= 78.1 / 50
= 1.562 = 1.56m (to two decimal places)
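The grouped mean calculation can be checked with a short program. This is a minimal sketch in Python: the totals Σfx and Σf are the ones from my calculations above, while the function names are my own and the small two-class table in the comment is only there to show how Σfx is built up from mid-points.

```python
ACTUAL_LENGTH = 1.58  # correct length of the stick in metres


def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data as sum(f * x) / sum(f)."""
    sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    sum_f = sum(frequencies)
    return sum_fx / sum_f


def mean_from_totals(sum_fx, sum_f):
    """Same calculation when the totals have already been worked out."""
    return sum_fx / sum_f


# Example with a made-up two-class table: grouped_mean([1.45, 1.55], [1, 3])

year7_mean = round(mean_from_totals(74.7, 50), 2)
year11_mean = round(mean_from_totals(78.1, 50), 2)

# Distance of each rounded mean from the actual length, in whole centimetres.
year7_error_cm = round(abs(ACTUAL_LENGTH - year7_mean) * 100)
year11_error_cm = round(abs(ACTUAL_LENGTH - year11_mean) * 100)

print(year7_mean, year7_error_cm)    # 1.49 9
print(year11_mean, year11_error_cm)  # 1.56 2
```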
My results are as I expected them to be. The mean of Year Seven’s results is 9cm away from the actual length, whereas Year Eleven’s mean is only 2cm away from 1.58m. This supports my hypothesis, as it shows Year Eleven’s mean to be closer to 1.58m, suggesting that their estimates were more accurate than Year Seven’s.
I will now calculate the standard deviation, to provide further evidence in support of my hypothesis.
Standard Deviation
Standard deviation is a measure of the spread of the data about the mean. I expect that the standard deviation of Year Eleven’s results will be significantly smaller, as more of their year group will have estimated well, as stated in my hypothesis. The standard deviation results should further support my hypothesis. Below is the formula I will use to calculate standard deviation:
S.D. = √( Σfx² / Σf − (Σfx / Σf)² )
Year 7
S.D. = √( 114.885 / 50 − (74.7 / 50)² )
= 0.25624…
= 0.26
Year 11
S.D. = √( 123.965 / 50 − (78.1 / 50)² )
= 0.19863…
= 0.20
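The standard deviation calculation can be checked in the same way. This is a minimal sketch in Python that uses the totals Σfx², Σfx and Σf from my working above; the function name is my own.

```python
import math


def grouped_sd(sum_fx2, sum_fx, sum_f):
    """Standard deviation of grouped data from its totals.

    Uses the square root of (mean of the squares) minus (square of the mean).
    """
    mean = sum_fx / sum_f
    variance = sum_fx2 / sum_f - mean ** 2
    return math.sqrt(variance)


year7_sd = grouped_sd(114.885, 74.7, 50)
year11_sd = grouped_sd(123.965, 78.1, 50)

print(f"{year7_sd:.2f}")   # 0.26
print(f"{year11_sd:.2f}")  # 0.20
```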
My results are as expected. Year Seven’s standard deviation was 0.26, larger than Year Eleven’s 0.20. This shows that Year Seven has a more spread out set of data, so Year Eleven’s estimates were more consistent, which supports my hypothesis.
Histograms
A histogram is a type of chart. Like a bar chart, it represents data with bars. However, a histogram shows frequency density on the y-axis, that is, the frequency per unit of class width. The frequency of a class is therefore given by the area of its bar, not its height. In some histograms, bars can be of different widths according to the class widths along the x-axis. In my histograms, all the class widths will be 0.1m.
I will use my two histograms to find out the percentage of Year Sevens and Year Elevens that have estimated within 6cm either side of the correct 1.58m measurement. I predict that 50% of Year Elevens and 30% of Year Sevens will have estimated a length between 1.52m and 1.64m.
Because of my predictions above, I expect Year Eleven’s histogram to show quite a low dispersion of results: the centre bars will be a lot higher than the outer bars. I expect the opposite in my Year Seven histogram, which should show a high dispersion of data, with medium or small-sized bars for every class. The modal bars on both histograms will include the actual length of 1.58m.
Histograms are drawn using a frequency density table. This table lists the frequency, class-width and frequency density of all data. Below is the formula for frequency density:
Frequency Density = Frequency / Class Width
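The frequency density formula can be shown with a short example. This is a minimal sketch in Python; the function name is my own, and the frequency of 13 is not taken from a table but worked backwards from a frequency density of 130 and a class width of 0.1m, so it should be treated as an assumption.

```python
def frequency_density(frequency, class_width):
    """Frequency density = frequency divided by class width."""
    return frequency / class_width


# Assumed example: 13 pupils in a class that is 0.1m wide.
density = frequency_density(13, 0.1)
print(round(density, 1))  # 130.0
```

Because all of my classes are 0.1m wide, every frequency density is simply ten times the frequency.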
Year 7 Frequency Density Table
Year 11 Frequency Density Table
My histograms for Year Seven and Year Eleven are on the next page.
Area of Histograms
I am now going to find the percentage of estimates that were within 6cm of the actual length of 1.58m, that is, between 1.52m and 1.64m. To do this, I first have to calculate the areas of the parts of the bars that fall inside this range. The range covers a width of 0.08m in the first bar and 0.04m in the second. I will multiply each of these widths by the frequency density of its bar, read from the y-axis.
Year 7 Areas –
Area A = 0.08 × 130 = 10.4
Area B = 0.04 × 40 = 1.6
Total = 10.4 + 1.6 = 12 (pupils)
Year 11 Areas –
Area A = 0.08 × 150 = 12
Area B = 0.04 × 120 = 4.8
Total = 12 + 4.8 = 16.8 (pupils)
My ‘Total’ values represent how many pupils estimated within 6cm either side of the correct length of 1.58m. I can now use these values to calculate the percentage of correct estimates within 6cm either side.
Year 7 Percentage –
12 / 50 × 100 = 24%
Year 11 Percentage –
16.8 / 50 × 100 = 33.6%
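The area and percentage calculations can be combined in one short program. This is a minimal sketch in Python: the frequency densities are the ones read from my histograms, but the class boundaries of 1.5m to 1.6m and 1.6m to 1.7m are an assumption based on the widths 0.08 and 0.04 used above, and the function names are my own. The method also assumes that estimates are spread evenly within each class, which is why a total such as 16.8 pupils need not be a whole number.

```python
SAMPLE_SIZE = 50
RANGE_LOW, RANGE_HIGH = 1.52, 1.64  # 6cm either side of 1.58m


def overlap(low, high, class_low, class_high):
    """Width of the part of a class that lies inside the range low..high."""
    return max(0.0, min(high, class_high) - max(low, class_low))


def pupils_in_range(bars, low, high):
    """Add up width x frequency density for the part of each bar in range.

    `bars` is a list of (class_low, class_high, frequency_density).
    """
    return sum(overlap(low, high, a, b) * fd for a, b, fd in bars)


# Assumed class boundaries; frequency densities as read from the histograms.
year7_bars = [(1.5, 1.6, 130), (1.6, 1.7, 40)]
year11_bars = [(1.5, 1.6, 150), (1.6, 1.7, 120)]

year7_total = pupils_in_range(year7_bars, RANGE_LOW, RANGE_HIGH)
year11_total = pupils_in_range(year11_bars, RANGE_LOW, RANGE_HIGH)

year7_percent = year7_total / SAMPLE_SIZE * 100
year11_percent = year11_total / SAMPLE_SIZE * 100

print(round(year7_total, 1), round(year7_percent, 1))    # 12.0 24.0
print(round(year11_total, 1), round(year11_percent, 1))  # 16.8 33.6
```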
My results support my hypothesis in that they show Year Eleven made better estimates than Year Seven. However, my percentage predictions were not exactly correct. The actual percentage of Year Sevens who guessed within my range was 24%, which is 6 percentage points less than predicted, and the actual percentage of Year Elevens was 33.6%, which is 16.4 percentage points less than predicted. Even so, Year Elevens’ estimates were still better than Year Sevens’, as predicted.
My histograms look like I expected them to. Year Eleven had a low dispersion of results. The actual length of the stick (1.58m) was included in the modal group on Year Eleven’s histogram. Year Seven’s histogram shape was also as I expected. The outer bars tended to be small and the modal bar did include 1.58m but not the mean of 1.49m.
My Conclusion
After doing my three types of calculations, I have provided enough evidence to support my hypothesis. I predicted that Year Eleven would have more accurate estimates than Year Seven. Below are the three calculations I carried out and their results:
The mean supports my hypothesis as it shows Year Eleven’s mean to be only 2cm away from the actual length of 1.58m, whereas Year Seven’s mean was 9cm away. This suggests that Year Eleven’s estimates were more accurate than Year Seven’s.
The standard deviation of Year Seven was bigger than that of Year Eleven by 0.06. This showed that Year Eleven had a smaller dispersion of data about the mean, meaning that their year was more consistent, as a whole, at estimation.
My histograms also supported my hypothesis, although my predicted percentages were not correct. The actual percentage of Year Sevens who guessed within my range was not 30%, as I predicted, but 24%, and the actual percentage of Year Elevens was 33.6%, which is 16.4 percentage points less than my prediction of 50%.
My Evaluation
Overall, my investigation went well and I encountered no problems. My mean and standard deviation calculations came out as expected, although my percentage predictions were too high.
However, the accuracy of my investigation could have been improved. For example, I chose to use a random sample of fifty pupils from each year instead of the full number of over 170. Also, when I calculated the mean and standard deviation I used mid-points as opposed to actual values for the fx and fx² values. When constructing my histogram using frequency densities, I grouped my frequencies together into classes in addition to using mid-points.
There are a number of ways I could improve my investigation into estimation skills, through the analysis of other factors. I could look at more than two year groups or age groups, such as Years 7-11 or primary schools. I could compare the estimates of adults or the elderly to those of young people. I could look at the effect of gender on estimation, by comparing the results of males to those of females. I could also look for similarities or differences between schools in several areas or countries.
Instead of looking at one stick length, my investigation could have looked at multiple sticks. This would help to show who was consistently good at estimation and who was guessing at random.
Finally, I could widen my range of results by looking at multiple areas of estimation. For example, I could investigate the estimation of area, volume or weight, as well as length. Again, this would show who is actually good at estimation, who is good at guessing and who is better at certain types of estimation. For example, I could see which age group is better at weight estimation and which is better at length estimation. I could collect several sets of results and compare them.