# "The lengths of lines are easier to guess than angles. Also, that year 11's will be more accurate at estimating."

Hisham Band                Maths Coursework – “Guesstimate”

In this investigation, three year groups (years 9, 10 and 11) were asked to estimate the lengths of some lines and the sizes of some angles. I am going to analyse the results that the pupils produced to try to prove or disprove the hypothesis:

“The lengths of lines are easier to guess than angles. Also, that year 11’s will be more accurate at estimating.”

I think this because people are more used to seeing lines than they are angles, so they could be better at estimating the length of lines. I think the year 11’s will be more accurate because they have done maths for longer than the year 9’s, so they have had more experience.

I will be using an example of one line and one angle, and the Year 9 and Year 11 estimates for them. This is secondary data, which was previously recorded during a survey to find out the estimates that the pupils gave. This data is continuous, as lengths and angles are measured rather than counted. As there are 117 year 9’s and 145 year 11’s, these numbers are too large to handle, so I will have to reduce the size of my sample. I will be using a stratified method to do this, as it keeps the results for the year groups in proportion to each other.

I am going to be sampling 60 people in total out of the year 9’s and year 11’s. This is a manageable amount, and it is large enough to represent the data from the two year groups accurately, as a smaller number might not show the difference in results suitably.

To choose my samples I am first going to add together the two total numbers of each year group, which is:

145 + 117 = 262  (Year 11 / Year 9)

Then I am going to do some calculations. For the year 11’s I am going to do:

(145 / 262) x 100 = 55.3

This means I need to have 55% of the sample of 60 from year 11’s results. 55% of 60 is 33, so I need 33 samples to be Year 11 samples.

For the Year 9’s I am going to do:

(117 / 262) x 100 = 44.7

This means that 45% of the sample of 60 needs to be Year 9 results. 45% of 60 is 27, so I need 27 Year 9 samples, which together with the 33 Year 11 samples gives the total of 60 samples.
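As a rough check, the proportional split worked out above could be sketched in Python. The function name `stratified_sizes` is hypothetical, and rounding each share to the nearest whole person is an assumption. It happens to give a total of 60 here, but rounding does not guarantee that for other group sizes.

```python
def stratified_sizes(group_sizes, sample_total):
    """Share a sample between groups in proportion to each group's size."""
    population = sum(group_sizes.values())
    return {
        # Assumption: each share is rounded to the nearest whole person.
        name: round(size / population * sample_total)
        for name, size in group_sizes.items()
    }


# Year group sizes taken from the coursework: 117 year 9's, 145 year 11's.
sizes = stratified_sizes({"Year 9": 117, "Year 11": 145}, 60)
print(sizes)
```

This gives 27 for Year 9 and 33 for Year 11, matching the hand calculation.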

To get these 60 samples from the 262 results I am going to use a random systematic method. For each year group I will use a random number generator to pick a starting position in that year's data. I will then count down the list from that position and use every nth piece of data, where n is a second random number between 0 and 10. (For the year 9’s this number was 7.)

Year 9 Random Number Generator

91 was the number the generator produced for the year 9’s, so I am going to use the 91st piece of data, and then every 7th piece of data after that, until I have my 27 pieces of data. When I reach the end of the list I carry on counting from the start. So, the numbers I am going to use are:

91, 98, 106, 113, 3, 10, 17, 24, 31, 38, 45, 52, 59, 66, 73, 80, 87, 94, 101, 108, 115, 5, 12, 19, 26, 33, 40

Year 11 Random Number Generator

127 was the number the generator produced, so I am going to start from the 127th piece of data and then use every 6th piece of data after that (as 6 was the number produced from the generator between 0 and 10), going back to the start of the list when I reach the end, until I have the 33 pieces of data I need for the year 11’s. So the pieces of data are:

127, 133, 139, 145, 6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 90, 96, 102, 108, 114, 120, 126, 132, 138, 144, 5, 11, 17, 23, 29
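The counting for the Year 11 list above could be sketched in Python as follows. The function name `systematic_positions` is hypothetical. It assumes positions are numbered from 1 and that counting wraps round to the start of the list after the last piece of data, which is what the list above appears to do.

```python
def systematic_positions(start, step, population, count):
    """Return `count` positions (numbered from 1), taking every `step`-th
    piece of data from `start` and wrapping round after the last one."""
    return [((start - 1 + k * step) % population) + 1 for k in range(count)]


# Year 11: start at 127, step of 6, 145 pupils, 33 samples needed.
year_11 = systematic_positions(start=127, step=6, population=145, count=33)
print(year_11)
```

With these values the sketch reproduces the 33 Year 11 positions listed above. It does not check for repeats, so with other starting points and steps a position could come up twice after wrapping round.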

Once I have collected my samples, I am going to draw some grouped frequency tables, which will also include frequency density. These tables are there so that I can find the mean from grouped data. Also, because the data is put into groups it is easier to handle. I will also use standard deviation to find the spread of the data about the mean.

Then, I am going to draw some histograms, using the frequency density (frequency divided by class width) from the grouped frequency tables. These will show how densely the estimates are packed into each group, so that groups of different widths can be compared fairly.
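A minimal Python sketch of the frequency density calculation is given here. The frequencies are the Year 9 line 2 frequencies used in the standard deviation working later in this coursework. The class boundaries are an assumption, inferred from the midpoints used there (3.5, 4.25, 4.75, 5.5, 6.5 and 8), and the function name `frequency_densities` is hypothetical.

```python
def frequency_densities(classes):
    """classes: list of (lower, upper, frequency) tuples.
    Returns frequency divided by class width for each class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


# Class boundaries (cm) are ASSUMED from the midpoints; frequencies are
# the Year 9 line 2 frequencies.
year_9_line = [
    (3.0, 4.0, 3),
    (4.0, 4.5, 4),
    (4.5, 5.0, 5),
    (5.0, 6.0, 8),
    (6.0, 7.0, 7),
    (7.0, 9.0, 0),
]
densities = frequency_densities(year_9_line)
for (lower, upper, freq), density in zip(year_9_line, densities):
    print(f"{lower}-{upper} cm: frequency {freq}, frequency density {density}")
```

Under these assumed boundaries the 5 to 6 cm group has the highest frequency (8), but the narrower 4.5 to 5 cm group has the highest frequency density (10), which is why a histogram uses density rather than frequency for its bar heights.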

Next, I am going to draw some cumulative frequency tables and curves. From these I can read off the median and the lower and upper quartiles, and so find the inter-quartile range, which is the spread of the middle 50% of the estimates. Also, from the cumulative frequency curves I can draw some box plots. These will show the inter-quartile range, median and lower and upper quartiles in a more compact and easy to read way.
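The quartiles and inter-quartile range can also be found directly from a list of values, as in this Python sketch. It uses the standard library function `statistics.quantiles`, whose default method matches the usual (n + 1)/4 position rule. The estimates are made up for illustration and are not survey data, and the function name `quartile_summary` is hypothetical. Values read off a cumulative frequency curve are estimates from grouped data, so they may differ slightly from values found this way.

```python
import statistics


def quartile_summary(values):
    """Return the lower quartile, median, upper quartile and
    inter-quartile range of a list of values."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    return {"lower_quartile": q1, "median": q2, "upper_quartile": q3, "iqr": q3 - q1}


# Hypothetical estimates in cm, made up for illustration (NOT survey data).
example = [4.0, 4.5, 5.0, 5.0, 5.5, 6.0, 6.5]
summary = quartile_summary(example)
print(summary)
```

For these seven made-up values the lower quartile is the 2nd value (4.5), the median is the 4th (5.0) and the upper quartile is the 6th (6.0), so the inter-quartile range is 1.5.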

Then, I am going to draw some percentage error tables. These will show the error of the estimates and whether people estimated below or above the actual length of the line or size of the angle.
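A percentage error could be worked out as in the Python sketch below. The sign convention (negative for an underestimate, positive for an overestimate) and the function name `percentage_error` are assumptions. The actual sizes of 4.6 cm for line 2 and 33° for angle 6 are inferred from the differences quoted in the analysis of the means later in this coursework, and the mean estimates are used as the example inputs.

```python
def percentage_error(estimate, actual):
    """Signed percentage error: negative means an underestimate,
    positive means an overestimate."""
    return (estimate - actual) / actual * 100


# Year 9 mean estimates from this coursework; actual sizes are inferred.
line_error = percentage_error(5.21, 4.6)    # line 2, in cm
angle_error = percentage_error(45.70, 33)   # angle 6, in degrees
print(f"line 2: {line_error:.1f}%  angle 6: {angle_error:.1f}%")
```

Because the result is a percentage, errors for lines (in cm) and angles (in degrees) can be compared directly, which is what the hypothesis needs.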

I am then going to draw some scatter graphs showing the errors from the percentage error tables. From these you will be able to see whether people who guessed below the actual length of the line also guessed below the actual size of the angle.

Then I am going to draw stem and leaf diagrams for each year group. From these I will be able to find the median and mode. Stem and leaf diagrams show all the data in an easy to read way.
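As an illustration, a stem and leaf diagram for whole-number values could be built with the Python sketch below. The angle estimates are made up for illustration and are not survey data, and the function name `stem_and_leaf` is hypothetical. It assumes the tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole-number values by tens digit (stem), with the units
    digits (leaves) in ascending order."""
    rows = defaultdict(list)
    for value in sorted(values):
        rows[value // 10].append(value % 10)
    return dict(rows)


# Hypothetical angle estimates in degrees (NOT survey data).
example = [35, 40, 45, 45, 30, 50, 42, 38, 45]
diagram = stem_and_leaf(example)
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Because the leaves are in order, the median can be found by counting to the middle leaf (42 for these made-up values) and the mode is the leaf repeated most often on one stem (45).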

Finally, I am going to find Spearman’s rank correlation coefficient. This measures whether there is positive or negative correlation, and how strong it is, in the scatter graphs which I will then draw. These will show the estimate of the line for each individual person plotted against their estimate for the angle. From these scatter graphs you can also see whether or not anybody guessed exactly the correct size or length.
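Spearman's coefficient could be calculated as in the Python sketch below, which uses the usual formula 1 − 6Σd² / (n(n² − 1)), where d is the difference between a person's two ranks. The estimates are made up for illustration and are not survey data, and the function names are hypothetical. Tied values are given the average of their ranks, and with ties the formula is only an approximation.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman_rank(xs, ys):
    """Spearman's rank correlation using 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical estimates for five pupils (NOT survey data).
line_estimates = [4.0, 5.0, 5.5, 6.0, 4.5]   # cm
angle_estimates = [30, 45, 40, 50, 35]       # degrees
coefficient = spearman_rank(line_estimates, angle_estimates)
print(round(coefficient, 2))
```

A value close to +1 means pupils who gave larger estimates for the line also tended to give larger estimates for the angle, a value close to −1 means the opposite, and a value near 0 means little or no correlation.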

These things should help me prove or disprove my hypothesis.

I have recorded the estimates in a table, and the pieces of data I am using have been highlighted so that I know which they are. I am using ICT to do parts of my work, as spreadsheets can work things out extremely quickly, but I will also check the results and record how I would work things out.

First of all I am going to draw some grouped frequency tables, which also show frequency density. This will make the data easier to handle and will mean I can draw histograms and grouped frequency graphs. Also, from the frequency tables I am going to find the mean, and I am going to use standard deviation to find the spread of the data about the mean. If the spread is smaller, it means that the year group's estimates were closer to the mean value. An advantage of using standard deviation is that it uses all of the data.

This is a table to show the Year 9 estimates for the length of line 2.

The mean of the results from the above table is the total of the f × x column divided by the total frequency. This is 140.75/27 = 5.21cm. This is 0.61cm longer than the actual length of the line (4.6cm). This is not very much, which means the year 9’s were quite accurate in estimating the length of the line.

To find the spread of this data from the mean I am going to use the equation for standard deviation from grouped data, which is:

$$\sqrt{\frac{\sum f x^2}{\sum f} - (\text{mean})^2}$$

So, for the above table I would do:

$$\frac{(3 \times 3.5^2) + (4 \times 4.25^2) + (5 \times 4.75^2) + (8 \times 5.5^2) + (7 \times 6.5^2) + (0 \times 8^2)}{27}$$

This equals 28.13194. I will now subtract the mean² from this. The mean² is 5.21², which gives 27.1441, and subtracting this gives me 0.987844. I now find the square root of this answer, which is 0.9939. This is the spread of the data about the mean. This is quite a low spread.
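The working above could be checked with a short Python sketch. The midpoints and frequencies are the ones used in the calculation above, and the function name `grouped_mean_and_sd` is hypothetical. It is a sketch of the grouped-data formula, not the spreadsheet method used for the coursework.

```python
import math


def grouped_mean_and_sd(midpoints, frequencies):
    """Estimate the mean and standard deviation from grouped data,
    using sqrt(sum(f*x^2)/sum(f) - mean^2) for the spread."""
    total_f = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total_f
    mean_of_squares = sum(f * x ** 2 for x, f in zip(midpoints, frequencies)) / total_f
    return mean, math.sqrt(mean_of_squares - mean ** 2)


# Midpoints (cm) and frequencies from the Year 9 line 2 working.
mean, sd = grouped_mean_and_sd(
    [3.5, 4.25, 4.75, 5.5, 6.5, 8],
    [3, 4, 5, 8, 7, 0],
)
print(f"mean = {mean:.2f} cm, standard deviation = {sd:.2f} cm")
```

The sketch gives the same mean of 5.21cm. Its standard deviation comes out at about 0.98cm rather than 0.99cm, because it squares the unrounded mean instead of 5.21.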

This is a table to show the Year 9 estimates for the size of angle 6.

The mean of the above table is 1234/27 = 45.70°. This is 12.7° bigger than the actual angle (33°). This is quite a large amount, which means that the year 9’s were not very accurate in their estimates of angle 6. As a proportion of the actual size this is an error of about 38%, compared with about 13% for the line, so they were better at estimating the length of the line.

To find standard deviation from this I will do:

$$\frac{(1 \times 24.5^2) + (3 \times 32^2) + (7 \times 37^2) + (7 \times 42^2) + (4 \times 47^2) + (5 \times 74.5^2)}{27}$$

This gives the answer of 2303.4, from which I will now subtract 45.70² to give 214.91. I then have to find the square root of this answer. This gives 14.7. This spread is quite high, which means that the estimates given by the year 9’s for the size of angle 6 were widely spread out. You can also see that the year 9’s had a lower spread of data for the length of line 2.

This is a table to show the Year 11 estimates of the length of line 2.

The mean of the above table is 169/33 = 5.12cm. This is 0.52cm longer than the actual length of the line. This is a small difference, which shows the year 11’s were quite accurate in their estimates and were slightly better than the year 9’s, whose mean was 0.61cm too long.

To find standard deviation from this I will do:

$$\frac{(3 \times 3.5^2) + (9 \times 4.25^2) + (3 \times 4.75^2) + (14 \times 5.5^2) + (2 \times 6.5^2) + (2 \times 8^2)}{33}$$

This gives me an answer of 27.36, from which I will now subtract the mean², which is 5.12². This is 26.22, and subtracted from the previous calculation, the answer given is 1.14, which ...