Analyse a set of results and investigate the provided hypothesis.

Introduction

My name is Khalil Sayed-Hossen. I am a Year 10 student carrying out the “Guesstimate” coursework task. For this coursework I am going to analyse a set of results and investigate the hypothesis provided.

Plan

While producing this Guesstimate coursework, I will first investigate the hypothesis given: that people estimate the length of lines better than the size of angles. Once I have done this I will investigate hypotheses of my own. I will need to find a way of proving or disproving these hypotheses by analysing relevant data.

The data I will be using comes from a pooled set of results that members of my class collected and combined to form a broader, clearer set of results. To compare two sets of results they must be measured on a common scale. Since the line lengths were given in millimetres (mm) and the angle sizes in degrees (°), the raw results cannot be compared directly. To compare these two different types of data I will calculate the percentage error for each result. This is done by first calculating the error, i.e. the difference between each estimate and the actual length of the line or size of the angle, and then using the formula:

Error ÷ Correct × 100 = percentage error
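
The formula can be sketched in a few lines of Python. This is only an illustration: the function name `percentage_error` is my own, and the correct values of 45 mm for the line and 36° for the angle are the ones used later in this coursework.

```python
def percentage_error(guess, correct):
    """Return the signed percentage error of a guess against the correct value."""
    error = guess - correct  # positive = overestimate, negative = underestimate
    return error / correct * 100


# Correct values taken from this coursework: line = 45 mm, angle = 36 degrees.
print(round(percentage_error(40, 45), 2))  # line guess of 40 mm  -> -11.11
print(round(percentage_error(52, 36), 2))  # angle guess of 52 degrees -> 44.44
```

Because both results are percentages, a line estimate and an angle estimate can be compared on the same scale even though one was measured in mm and the other in degrees.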

Ways in which I can compare this data include looking at the mean of the results, the standard deviation, and scatter graphs. Scatter graphs are useful because, once the line of best fit has been drawn, we can then analyse the inter-quartile range. I will also use any other methods that become apparent during this coursework, and apply them when investigating my other hypotheses as well.

Middle

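| No. | line (mm) | age | gender | Line error (mm) | Line percentage errors (%) |
| --- | --- | --- | --- | --- | --- |
| 26 | 40 | 41 | F | -5 | -11.11111111 |
| 27 | 50 | 14 | M | 5 | 11.11111111 |
| 28 | 55 | 50 | M | 10 | 22.22222222 |
| 29 | 40 | 71 | F | -5 | -11.11111111 |
| 30 | 20 | 16 | F | -25 | -55.55555556 |
| 31 | 50 | 14 | M | 5 | 11.11111111 |
| 32 | 40 | 14 | M | -5 | -11.11111111 |
| 33 | 40 | 41 | F | -5 | -11.11111111 |
| 34 | 60 | 15 | M | 15 | 33.33333333 |
| 35 | 70 | 14 | M | 25 | 55.55555556 |
| 36 | 53.2 | 28 | M | 8.2 | 18.22222222 |
| 37 | 40 | 34 | F | -5 | -11.11111111 |
| 38 | 45 | 45 | F | 0 | 0 |
| 39 | 37 | 79 | F | -8 | -17.77777778 |
| 40 | 10 | 12 | F | -35 | -77.77777778 |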

I will start by investigating the line.

I first calculated the errors by subtracting the correct length of the line from each guess. Once I had calculated the errors I was able to use the percentage error formula:

Error ÷ Correct × 100 = percentage error

In Excel we do this in the percentage error column by dividing the first data point in the line error column by 45, the correct length of the line in mm, and then multiplying by 100 to find the percentage.

This gives the percentage error for the first data point. Because the formula is the same for every other data point in the column, we select the first cell and drag it down the column, and Excel works out the percentage error in each cell.
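
The same fill-down step can be sketched in Python, where one expression is applied to every value in a list. This is an illustration only: the variable names are my own, and the guesses are the fifteen line estimates listed in the table above.

```python
CORRECT_LINE_MM = 45  # correct length of the line used in this coursework

# Line guesses (mm) copied from the table above.
line_guesses = [40, 50, 55, 40, 20, 50, 40, 40, 60, 70, 53.2, 40, 45, 37, 10]

# Same two steps as the spreadsheet: error first, then error / correct * 100.
line_errors = [guess - CORRECT_LINE_MM for guess in line_guesses]
line_pct_errors = [error / CORRECT_LINE_MM * 100 for error in line_errors]

for guess, error, pct in zip(line_guesses, line_errors, line_pct_errors):
    print(f"{guess:>5} mm  error {error:>6.1f}  percentage error {pct:>7.2f}%")
```

Each list comprehension plays the role of one spreadsheet column, so changing `CORRECT_LINE_MM` to 36 and supplying the angle guesses would repeat the calculation for the angles.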

Calculating the percentage error for angle guesstimates

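| No. | angle (°) | age | gender | Angle error (°) | Angle percentage errors (%) |
| --- | --- | --- | --- | --- | --- |
| 1 | 30 | 78 | M | -6 | -16.66666667 |
| 2 | 52 | 12 | F | 16 | 44.44444444 |
| 3 | 43 | 45 | F | 7 | 19.44444444 |
| 4 | 45 | 14 | M | 9 | 25 |
| 5 | 40 | 46 | M | 4 | 11.11111111 |
| 6 | 50 | 14 | M | 14 | 38.88888889 |
| 7 | 45 | 17 | F | 9 | 25 |
| 8 | 40 | 45 | F | 4 | 11.11111111 |
| 9 | 32 | 44 | M | -4 | -11.11111111 |
| 10 | 30 | 14 | M | -6 | -16.66666667 |
| 11 | 70 | 47 | F | 34 | 94.44444444 |
| 12 | 40 | 15 | M | 4 | 11.11111111 |
| 13 | 36 | 14 | F | 0 | 0 |
| 14 | 35 | 61 | M | -1 | -2.777777778 |
| 15 | 40 | 45 | F | 4 | 11.11111111 |
| 16 | 30 | 41 | M | -6 | -16.66666667 |
| 17 | 40 | 46 | F | 4 | 11.11111111 |
| 18 | 40 | 16 | F | 4 | 11.11111111 |
| 19 | 38 | 36 | M | 2 | 5.555555556 |
| 20 | 45 | 32 | F | 9 | 25 |
| 21 | 40 | 66 | M | 4 | 11.11111111 |
| 22 | 35 | 34 | M | -1 | -2.777777778 |
| 23 | 35 | 34 | F | -1 | -2.777777778 |
| 24 | 40 | 62 | M | 4 | 11.11111111 |
| 25 | 35 | 46 | F | -1 | -2.777777778 |
| 26 | 40 | 41 | F | 4 | 11.11111111 |
| 27 | 45 | 14 | M | 9 | 25 |
| 28 | 45 | 50 | M | 9 | 25 |
| 29 | 9 | 71 | F | -27 | -75 |
| 30 | 45 | 16 | F | 9 | 25 |
| 31 | 45 | 14 | M | 9 | 25 |
| 32 | 50 | 14 | M | 14 | 38.88888889 |
| 33 | 45 | 41 | F | 9 | 25 |
| 34 | 50 | 15 | M | 14 | 38.88888889 |
| 35 | 75 | 14 | M | 39 | 108.3333333 |
| 36 | 47.2 | 28 | M | 11.2 | 31.11111111 |
| 37 | 35 | 34 | F | -1 | -2.777777778 |
| 38 | 45 | 45 | F | 9 | 25 |
| 39 | 45 | 79 | F | 9 | 25 |
| 40 | 45 | 12 | F | 9 | 25 |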

When calculating the percentage errors for the angle guesstimates, we repeat the process used for the line guesstimates, except that in this case we divide the errors by 36, as this was the correct size of the angle in degrees.

Now that I have calculated the percentage errors for every line and angle data point in my sample, I can proceed with my first method of proving or disproving the hypothesis: calculating the mean of the line percentage errors and the mean of the angle percentage errors, and then comparing the two.

 

Calculating the mean of the line percentage errors

| No. | Line percentage errors (%) | Line percentage errors with negatives made positive (%) |
| --- | --- | --- |
| 1 | -11.11111111 | 11.11111111 |
| 2 | -22.22222222 | 22.22222222 |
| 3 | 11.11111111 | 11.11111111 |
| 4 | 11.11111111 | 11.11111111 |
| 5 | 6.666666667 | 6.666666667 |
| 6 | 22.22222222 | 22.22222222 |
| 7 | -44.44444444 | 44.44444444 |
| 8 | -33.33333333 | 33.33333333 |
| 9 | -16.66666667 | 16.66666667 |
| 10 | 33.33333333 | 33.33333333 |
| 11 | 122.2222222 | 122.2222222 |
| 12 | 33.33333333 | 33.33333333 |
| 13 | -33.33333333 | 33.33333333 |
| 14 | 11.11111111 | 11.11111111 |
| 15 | 11.11111111 | 11.11111111 |
| 16 | 33.33333333 | 33.33333333 |
| 17 | -33.33333333 | 33.33333333 |
| 18 | -11.11111111 | 11.11111111 |
| 19 | 0 | 0 |
| 20 | -33.33333333 | 33.33333333 |
| 21 | 0 | 0 |
| 22 | 44.44444444 | 44.44444444 |
| 23 | 22.22222222 | 22.22222222 |
| 24 | 11.11111111 | 11.11111111 |
| 25 | -11.11111111 | 11.11111111 |
| 26 | -11.11111111 | 11.11111111 |
| 27 | 11.11111111 | 11.11111111 |
| 28 | 22.22222222 | 22.22222222 |
| 29 | -11.11111111 | 11.11111111 |
| 30 | -55.55555556 | 55.55555556 |
| 31 | 11.11111111 | 11.11111111 |
| 32 | -11.11111111 | 11.11111111 |
| 33 | -11.11111111 | 11.11111111 |
| 34 | 33.33333333 | 33.33333333 |
| 35 | 55.55555556 | 55.55555556 |
| 36 | 18.22222222 | 18.22222222 |
| 37 | -11.11111111 | 11.11111111 |
| 38 | 0 | 0 |
| 39 | -17.77777778 | 17.77777778 |
| 40 | -77.77777778 | 77.77777778 |

To calculate the mean percentage error we use the usual method for any mean: add up all the percentage errors and divide by the number of data points. Before we can do this, we need to make any negative percentage errors positive. If this is not done, the negative values will cancel out positive values when the data is added up. We do not want this, because we are only interested in how far each guess was from the correct value; whether the guess was too high or too low is insignificant.

Adding all percentage errors

To add the percentage errors we need to convert the negatives into positives, as said earlier. I did this in Excel by squaring each percentage, using the ^2 operator, and then taking the square root of the result. Once I had done this I added up all the percentage errors by highlighting all the data points in the percentage error column and using the ∑ (sum) function in Excel. This gave me the sum of all the percentage errors for the line and for the angle. The sum of the percentage errors for the line was 981.5555556% and for the angles 945%.
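
As a rough check of this step, the sketch below squares and square-roots each signed line percentage error, confirms that this gives the same result as taking the absolute value directly, and then adds the values up. The variable names are my own; the forty values are the signed line percentage errors listed in this coursework.

```python
import math

# Signed line percentage errors (%), as listed in this coursework.
signed = [
    -11.11111111, -22.22222222, 11.11111111, 11.11111111, 6.666666667,
    22.22222222, -44.44444444, -33.33333333, -16.66666667, 33.33333333,
    122.2222222, 33.33333333, -33.33333333, 11.11111111, 11.11111111,
    33.33333333, -33.33333333, -11.11111111, 0, -33.33333333,
    0, 44.44444444, 22.22222222, 11.11111111, -11.11111111,
    -11.11111111, 11.11111111, 22.22222222, -11.11111111, -55.55555556,
    11.11111111, -11.11111111, -11.11111111, 33.33333333, 55.55555556,
    18.22222222, -11.11111111, 0, -17.77777778, -77.77777778,
]

via_sqrt = [math.sqrt(x ** 2) for x in signed]  # square, then square root
via_abs = [abs(x) for x in signed]              # direct absolute value

total = sum(via_abs)
print(round(total, 2))  # about 981.56, the line total quoted above
```

Squaring followed by a square root is one way of removing the sign; an absolute-value function (`abs` here, `ABS` in a spreadsheet) does the same thing in a single step.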

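| No. | Line percentage errors (%) | Angle percentage errors (%) |
| --- | --- | --- |
| 1 | 11.11111111 | 16.66666667 |
| 2 | 22.22222222 | 44.44444444 |
| 3 | 11.11111111 | 19.44444444 |
| 4 | 11.11111111 | 25 |
| 5 | 6.666666667 | 11.11111111 |
| 6 | 22.22222222 | 38.88888889 |
| 7 | 44.44444444 | 25 |
| 8 | 33.33333333 | 11.11111111 |
| 9 | 16.66666667 | 11.11111111 |
| 10 | 33.33333333 | 16.66666667 |
| 11 | 122.2222222 | 94.44444444 |
| 12 | 33.33333333 | 11.11111111 |
| 13 | 33.33333333 | 0 |
| 14 | 11.11111111 | 2.777777778 |
| 15 | 11.11111111 | 11.11111111 |
| 16 | 33.33333333 | 16.66666667 |
| 17 | 33.33333333 | 11.11111111 |
| 18 | 11.11111111 | 11.11111111 |
| 19 | 0 | 5.555555556 |
| 20 | 33.33333333 | 25 |
| 21 | 0 | 11.11111111 |
| 22 | 44.44444444 | 2.777777778 |
| 23 | 22.22222222 | 2.777777778 |
| 24 | 11.11111111 | 11.11111111 |
| 25 | 11.11111111 | 2.777777778 |
| 26 | 11.11111111 | 11.11111111 |
| 27 | 11.11111111 | 25 |
| 28 | 22.22222222 | 25 |
| 29 | 11.11111111 | 75 |
| 30 | 55.55555556 | 25 |
| 31 | 11.11111111 | 25 |
| 32 | 11.11111111 | 38.88888889 |
| 33 | 11.11111111 | 25 |
| 34 | 33.33333333 | 38.88888889 |
| 35 | 55.55555556 | 108.3333333 |
| 36 | 18.22222222 | 31.11111111 |
| 37 | 11.11111111 | 2.777777778 |
| 38 | 0 | 25 |
| 39 | 17.77777778 | 25 |
| 40 | 77.77777778 | 25 |
| Mean | 24.53888889 | 23.625 |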

Finding the mean percentage error

What I did next was divide both sums by 40, as this was the number of data points. This gave 24.53888889% for the line and 23.625% for the angles, which are the mean percentage errors. These are the final two values in the table above.
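
The whole calculation of the mean can be sketched as below. The function name `mean_abs_percentage_error` is my own; the angle guesses are the forty values from the angle table in this coursework, and the line mean is worked out from the line total of 981.5555556 quoted earlier.

```python
CORRECT_ANGLE = 36  # correct size of the angle in degrees

# Angle guesses (degrees) for all forty data points.
angle_guesses = [
    30, 52, 43, 45, 40, 50, 45, 40, 32, 30,
    70, 40, 36, 35, 40, 30, 40, 40, 38, 45,
    40, 35, 35, 40, 35, 40, 45, 45, 9, 45,
    45, 50, 45, 50, 75, 47.2, 35, 45, 45, 45,
]


def mean_abs_percentage_error(guesses, correct):
    """Mean of the percentage errors after making every error positive."""
    pct = [abs(guess - correct) / correct * 100 for guess in guesses]
    return sum(pct) / len(pct)


angle_mean = mean_abs_percentage_error(angle_guesses, CORRECT_ANGLE)
line_mean = 981.5555556 / 40  # quoted line total divided by the number of data points

print(round(angle_mean, 3), round(line_mean, 3))  # 23.625 24.539
```

The two means differ by less than one percentage point, so the comparison is sensitive to a handful of extreme guesses.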

The hypothesis states that people estimate lines better than angles. Comparing the mean percentage errors, I have found that my results contradict the hypothesis: people tended to estimate the size of the angle slightly better than the length of the line. My assumption that people would estimate the size of the angle better than the length of the line, for reasons mentioned earlier, was supported by this investigation.

To make these findings more reliable I would have sampled a larger amount of data from a more extensive pool, as this would have decreased the effect that unreliable or biased data had on the mean.

I will now investigate other methods of proving or disproving the hypothesis.

Cumulative frequency

I could at this point have produced a frequency graph, but due to limited time I have decided to produce a cumulative frequency graph instead, as this is a clearer representation of the data and I will be able to deduce more information from it.

If we put the line and angle percentage errors into separate frequency tables, we can calculate cumulative frequencies. Once we have done this we can plot these values on a graph to form a cumulative frequency curve. This is useful because we can read off the median at the halfway point, and locate the upper and lower quartiles.

The upper quartile is the value 75% of the way through the data and the lower quartile is the value 25% of the way through. Knowing the upper and lower quartiles, we can calculate the inter-quartile range by subtracting the lower quartile from the upper quartile. The inter-quartile range covers the middle half of the data and shows how widely spread it is. If the inter-quartile range is small, the distribution is bunched together and shows more consistent results; if it is large, the distribution is spread out and shows a wider variation in results.

We can compare the line inter-quartile range with the angle inter-quartile range. Whichever is smaller indicates the more consistent set of estimates, since the percentage errors are less spread out.

Line percentage errors cumulative frequency table

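| Line percentage errors (%) | Frequency | Cumulative frequency | Upper limit |
| --- | --- | --- | --- |
| 0-10 | 4 | 4 | ≤ 10 |
| 11-20 | 17 | 21 | ≤ 20 |
| 21-30 | 5 | 26 | ≤ 30 |
| 31-40 | 8 | 34 | ≤ 40 |
| 41-50 | 2 | 36 | ≤ 50 |
| 51-60 | 2 | 38 | ≤ 60 |
| 61-70 | 0 | 38 | ≤ 70 |
| 71-80 | 1 | 39 | ≤ 80 |
| 81-90 | 0 | 39 | ≤ 90 |
| 91-100 | 0 | 39 | ≤ 100 |
| 101-110 | 0 | 39 | ≤ 110 |
| 111-120 | 0 | 39 | ≤ 120 |
| 121-130 | 1 | 40 | ≤ 130 |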
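
The cumulative frequency column is a running total of the frequency column, which can be sketched as below. The quartile estimates are only a rough sketch: they assume straight lines between the plotted points, whereas values read from a hand-drawn curve would differ a little. The helper name `value_at` is my own.

```python
from itertools import accumulate

upper_limits = list(range(10, 131, 10))  # 10, 20, ..., 130
frequencies = [4, 17, 5, 8, 2, 2, 0, 1, 0, 0, 0, 0, 1]

# Running total of the frequencies gives the cumulative frequency column.
cumulative = list(accumulate(frequencies))


def value_at(position, limits, cum_freq):
    """Estimate the value at a cumulative position by straight-line interpolation."""
    prev_limit, prev_cum = 0, 0
    for limit, cum in zip(limits, cum_freq):
        if position <= cum and cum > prev_cum:
            fraction = (position - prev_cum) / (cum - prev_cum)
            return prev_limit + fraction * (limit - prev_limit)
        prev_limit, prev_cum = limit, cum
    raise ValueError("position exceeds the total frequency")


total = cumulative[-1]
lower_q = value_at(total * 0.25, upper_limits, cumulative)
median = value_at(total * 0.5, upper_limits, cumulative)
upper_q = value_at(total * 0.75, upper_limits, cumulative)

print(cumulative)
print(lower_q, median, upper_q, upper_q - lower_q)
```

The last printed value is the estimated inter-quartile range; running the same code on the angle frequencies would give the figure to compare it against.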

Conclusion

√(∑(x − x̄)² ÷ n). I first had to work out the sum of the (x − x̄)² column, which was 13045.912. I then divided this number by 40, as this is the number of data points, to find the mean of the squared deviations, which was 326.14781. The final calculation needed to reach the standard deviation was to take the square root of this value, which returns the result to the original unit of measure, in this case percent.
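
The steps of this formula can be sketched in Python as follows. The function name `population_std_dev` and the small example data set are my own and purely illustrative; only the figures 13045.912 and 40 come from the calculation above.

```python
import math


def population_std_dev(values):
    """Square root of the mean squared deviation from the mean (dividing by n)."""
    n = len(values)
    mean = sum(values) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_sq_dev / n)


# Final two steps using the figures quoted above: divide by n, then square root.
sd_from_quoted_sum = math.sqrt(13045.912 / 40)
print(round(sd_from_quoted_sum, 1))  # 18.1

# Small illustrative data set (hypothetical values, not from the coursework).
print(population_std_dev([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

Dividing by n rather than n − 1 matches the formula used here, which treats the forty data points as the whole population rather than a sample.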

The standard deviation of the male line and angle estimates is 25.8% to 3 s.f.



Comparing data

From investigating my hypothesis, I have found from the mean of the percentage errors for male and female estimates that males were more accurate. When I investigated the percentage errors through standard deviation, I found that females were more consistent at estimating and that female estimates were closer to the mean than male estimates. But this is irrelevant, as the data still shows that males were more accurate: the standard deviation of the male estimates was 18.1% and the standard deviation of the female estimates was 25.8%, a difference of 7.7%. My findings contradict my hypothesis, and males were more accurate at estimating the lengths of lines and the sizes of angles.

Evaluation

I believe that I have investigated both hypotheses as far as I could in the time I was given. The conclusions I have come to were based upon the data pooled by my class. I believe that some of this data may have been unreliable due to errors. With a more extensive pool of data, my findings would have been more conclusive and a truer representation.

I have reached the end of my investigation. Had more time been available, I could have investigated another hypothesis, such as “Younger people estimate lines and angles better than older people”.

STATISTICAL COURSEWORK

GUESSTIMATE

COURSEWORK

Khalil Sayed-Hossen 10B

