  • Level: GCSE
  • Subject: Maths
  • Word count: 4816

statistics coursework


Introduction

Handling data - Statistics coursework

In this coursework, I will compare the weight of pupils with the number of hours of T.V they watch per week and observe the correlation between these two factors. I hypothesise that the amount of T.V watched per week affects weight: the more T.V watched per week, the higher the weight. I chose this line of enquiry because I think there may be a correlation between the two, as people who watch a lot of T.V will be less active, do less exercise and so weigh more. I will investigate whether this is true. As well as my original hypothesis, I will also look into the difference between genders: I will investigate whether boys or girls weigh more and whether boys or girls watch more T.V. I initially believe that boys weigh more than girls because, from my own knowledge, they are normally bigger and taller, which should lead to higher weights. I will find out whether this is true.

The following table shows the number of boys and girls in each year group at Mayfield High School.

Year group    Boys    Girls    Total
7             151     131      282
8             145     125      270
9             118     143      261
10            106     94       200
11            84      86       170

Stratified sampling

A stratified sample is a sample in which you have a proportional amount from each year so that it is a fair sample. From my data, I need to take a proportional amount from each year, boys and girls.

For example:

There are 145 boys in year 8. In total, there are 604 boys at Mayfield High. For my sample, I need 40 boys out of the 604. Therefore,

(145 ÷ 604) x 40 = 9.6 (to 1 d.p.). I will take 9 boys from year 8, which keeps the boys’ sample at 40 in total.

I have used this method for each year group, boys and girls.

Year group    Boys    Girls    Total
7             10      9        19
8             9       9        18
9             8       10       18
10            7       6        13
11            6       6        12

I will need samples of 40 boys and 40 girls.
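The proportional shares described above can be checked with a short Python sketch. The year-group counts come from the Mayfield High table; the function name `stratified_quotas` is only an illustrative choice. The sketch prints the unrounded shares, so the rounding decision stays visible: straightforward rounding of 9.6 would give 10 boys for year 8, whereas this coursework takes 9.

```python
# Year-group populations taken from the Mayfield High table in the coursework.
boys = {7: 151, 8: 145, 9: 118, 10: 106, 11: 84}
girls = {7: 131, 8: 125, 9: 143, 10: 94, 11: 86}


def stratified_quotas(population, sample_size):
    """Return the unrounded proportional share of the sample for each year group."""
    total = sum(population.values())
    return {year: count / total * sample_size for year, count in population.items()}


boys_quotas = stratified_quotas(boys, 40)
girls_quotas = stratified_quotas(girls, 40)

# Print the raw shares; how to round them is left to the person sampling.
for year in boys:
    print(f"Year {year}: boys {boys_quotas[year]:.2f}, girls {girls_quotas[year]:.2f}")
```

Because the rounded shares do not always add up to the intended sample size, it is worth checking the total after rounding and adjusting one year group if needed.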


Middle

Weight, w (kg)    Frequency    Mid interval value    Mid interval value x Frequency
30 < w ≤ 40       3            35                    3 x 35 = 105
40 < w ≤ 50       22           45                    22 x 45 = 990
50 < w ≤ 60       12           55                    12 x 55 = 660
60 < w ≤ 70       2            65                    2 x 65 = 130
70 < w ≤ 80       1            75                    1 x 75 = 75
80 < w ≤ 90       0            85                    0 x 85 = 0
Total             40                                 1960

Estimated mean = 1960 ÷ 40 = 49
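The estimated mean calculation above can be reproduced in a few lines of Python. This is a minimal sketch: the class boundaries and frequencies are copied from the table, and the function name `estimated_mean` is an illustrative assumption.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data from (lower, upper, frequency) rows."""
    # Each class is represented by its mid interval value, (lower + upper) / 2.
    total_fx = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    total_f = sum(freq for _, _, freq in classes)
    return total_fx / total_f


# Weight classes and frequencies copied from the table above.
weight_classes = [
    (30, 40, 3),
    (40, 50, 22),
    (50, 60, 12),
    (60, 70, 2),
    (70, 80, 1),
    (80, 90, 0),
]

print(estimated_mean(weight_classes))  # 1960 / 40
```

The result is only an estimate, because every pupil in a class is treated as if they weighed exactly the mid interval value.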

The estimated mean weight of the girls is lower than that of the boys, which suggests that boys generally weigh more than girls.

Median interval

Weight, w (kg)    Frequency    Cumulative frequency
30 < w ≤ 40       3            3
40 < w ≤ 50       22           25
50 < w ≤ 60       12           37
60 < w ≤ 70       2            39
70 < w ≤ 80       1            40
80 < w ≤ 90       0            40

40 ÷ 2 = 20

I can see from the table that the 20th value, and therefore the median, lies in the interval 40 < w ≤ 50.
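Finding the median interval from a cumulative frequency table can be sketched in Python as follows. The data is copied from the weight table above; the function name `median_class` is an illustrative assumption, and the position is taken as n ÷ 2, as in the calculation above.

```python
from itertools import accumulate


def median_class(classes):
    """Return the (lower, upper) bounds of the class containing the median."""
    freqs = [freq for _, _, freq in classes]
    position = sum(freqs) / 2  # the n / 2 position used in the coursework
    # Walk up the cumulative frequencies until the median position is reached.
    for (lower, upper, _), running_total in zip(classes, accumulate(freqs)):
        if running_total >= position:
            return lower, upper
    return None  # only reached if the table is empty


weight_classes = [
    (30, 40, 3),
    (40, 50, 22),
    (50, 60, 12),
    (60, 70, 2),
    (70, 80, 1),
    (80, 90, 0),
]

print(median_class(weight_classes))
```

For this table, using the (n + 1) ÷ 2 position instead would land in the same class, since the cumulative frequency jumps from 3 to 25 in one step.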

Mode

The most frequent weight lies in the interval 40 < w ≤ 50.

Range

72 – 30 = 42

Boys

Average hrs. T.V watched per week, h    Frequency    Mid interval value    Mid interval value x Frequency
0 < h ≤ 5                               1            2.5                   2.5 x 1 = 2.5
5 < h ≤ 10                              1            7.5                   7.5 x 1 = 7.5
10 < h ≤ 15                             13           12.5                  12.5 x 13 = 162.5
15 < h ≤ 20                             15           17.5                  17.5 x 15 = 262.5
20 < h ≤ 25                             3            22.5                  22.5 x 3 = 67.5
25 < h ≤ 30                             3            27.5                  27.5 x 3 = 82.5
30 < h ≤ 35                             2            32.5                  32.5 x 2 = 65
35 < h ≤ 40                             2            37.5                  37.5 x 2 = 75
Total                                   40                                 725

Estimated mean = 725 ÷ 40 = 18.125

Median interval

Average no. hrs T.V watched a week, h    Frequency    Cumulative frequency
0 < h ≤ 5                                1            1
5 < h ≤ 10                               1            2
10 < h ≤ 15                              13           15
15 < h ≤ 20                              15           30
20 < h ≤ 25                              3            33
25 < h ≤ 30                              3            36
30 < h ≤ 35                              2            38
35 < h ≤ 40                              2            40

(40 + 1) ÷ 2 = 20.5

I can see from the table that the median lies in the interval 15 < h ≤ 20.

Mode

The mode is the most common term (highest frequency). In this case, the modal interval is 15 < h ≤ 20 because it has the highest frequency.

Range

40 – 3 = 37

Girls

Average hrs. T.V watched per week, h    Frequency    Mid interval value    Mid interval value x Frequency
0 < h ≤ 5                               2            2.5                   2.5 x 2 = 5
5 < h ≤ 10                              5            7.5                   7.5 x 5 = 37.5
10 < h ≤ 15                             9            12.5                  12.5 x 9 = 112.5
15 < h ≤ 20                             8            17.5                  17.5 x 8 = 140
20 < h ≤ 25                             5            22.5                  22.5 x 5 = 112.5
25 < h ≤ 30                             6            27.5                  27.5 x 6 = 165
30 < h ≤ 35                             3            32.5                  32.5 x 3 = 97.5
35 < h ≤ 40                             2            37.5                  37.5 x 2 = 75
Total                                   40                                 745

Estimated mean = 745 ÷ 40 = 18.625

The girls’ estimated mean is slightly higher than the boys’ estimated mean (by 0.5 hours), which suggests that girls watch slightly more T.V than boys.

Median

Average no. hrs T.V watched a week, h    Frequency    Cumulative frequency
0 < h ≤ 5                                2            2
5 < h ≤ 10                               5            7
10 < h ≤ 15                              9            16
15 < h ≤ 20                              8            24
20 < h ≤ 25                              5            29
25 < h ≤ 30                              6            35
30 < h ≤ 35                              3            38
35 < h ≤ 40                              2            40

(40 + 1) ÷ 2 = 20.5

The median in this case lies in the interval 15 < h ≤ 20.

The boys and girls have the same median interval, which suggests that boys and girls watch similar amounts of television.

Mode

The mode is the term which has the highest frequency. In this case, the modal interval is 10 < h ≤ 15.

Range

40 – 3 = 37

Stem and leaf

Since the data is grouped into class intervals, it also makes sense to record it in a stem and leaf diagram. This will make it easier to read off the median values.

Boys’ weight

Stem (tens)    Leaf (units)                                  Frequency
3              5, 5                                          2
4              1, 1, 2, 5, 5, 5, 5, 5, 7, 7, 9, 9            12
5              0, 0, 1, 1, 2, 3, 4, 4, 4, 5, 5, 7, 7, 9      14
6              0, 0, 0, 0, 2, 3, 3, 4, 8, 9                  10
7              2                                             1
8              6                                             1

A stem and leaf diagram can be used to work out the median. In this case there are twenty numbers up to and including 53, and twenty numbers at 54 or above, therefore the median is:

(53 + 54) ÷ 2 = 107 ÷ 2 = 53.5
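Reading the median off a stem and leaf diagram, as done above for the boys’ weights, can be sketched in Python. The leaves are copied from the boys’ weight diagram; the helper name `expand` is an illustrative assumption, and `statistics.median` from the standard library averages the two middle values when the count is even.

```python
from statistics import median


def expand(stem_and_leaf):
    """Turn a {stem: [leaves]} diagram back into the list of original values."""
    return [stem * 10 + leaf for stem, leaves in stem_and_leaf.items() for leaf in leaves]


# Boys' weights (kg), copied from the stem and leaf diagram above.
boys_weight = {
    3: [5, 5],
    4: [1, 1, 2, 5, 5, 5, 5, 5, 7, 7, 9, 9],
    5: [0, 0, 1, 1, 2, 3, 4, 4, 4, 5, 5, 7, 7, 9],
    6: [0, 0, 0, 0, 2, 3, 3, 4, 8, 9],
    7: [2],
    8: [6],
}

values = expand(boys_weight)
print(len(values), median(values))
```

Counting the expanded values is also a quick way to check that the number of leaves matches the sample size of 40.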

Girls’ weight

Stem (tens)    Leaf (units)                                            Frequency
3              0, 2, 4                                                 3
4              5, 5, 5, 5, 5, 5, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8, 8, 9    18
5              0, 0, 0, 0, 2, 4, 4, 4, 4, 7, 7, 7, 8, 8, 8             14
6              0, 0, 2, 5                                              4
7              2                                                       1
8

In this case there are twenty numbers up to and including 48, and twenty numbers at 49 or above, therefore the median is:

(48 + 49) ÷ 2 = 97 ÷ 2 = 48.5

Hours of television watched by boys a week

Stem (tens)    Leaf (units)                                                         Frequency
0              3, 7                                                                 2
1              2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 9     22
2              0, 0, 0, 0, 0, 0, 1, 4, 5, 7                                         10
3              0, 0, 1, 2                                                           4
4              0, 0                                                                 2

In this case, there are twenty numbers up to and including 17, and twenty numbers at 17 or above, therefore the median is:

(17 + 17) ÷ 2 = 34 ÷ 2 = 17

Hours of television watched by girls a week

Stem (tens)    Leaf (units)                                          Frequency
0              3, 4, 7, 8, 8                                         5
1              0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4, 6, 6, 7, 7, 8, 8     17
2              0, 0, 1, 1, 2, 4, 4, 6, 6, 8, 8, 8, 8                 13
3              2, 5, 5, 9                                            4
4              0                                                     1

In this case, there are twenty numbers up to and including 17, and twenty numbers at 18 or above, therefore the median is:

(17 + 18) ÷ 2 = 35 ÷ 2 = 17.5

From these stem and leaf diagrams I can see that boys weigh more, as their median was 5 kg higher, and that girls watch slightly more T.V, as their median was 0.5 of an hour higher. This agrees with the information I gathered from my estimated means above.

Scatter diagrams

Now I will use scatter diagrams to observe the correlation of weight against hours of T.V watched a week. The correlation will show me what effect each factor has on the other. I would expect the weight of a child to be higher the more hours of T.V they watch. I predict this because, logically, a child who watches more T.V would do less exercise and would have a higher weight, since they would be sitting in front of a television instead of being active. Scatter diagrams are a very important part of my coursework.

[Figures: scatter diagrams of weight against hours of T.V watched a week]

The results I have gathered from these scatter graphs are very strange and not what I expected. The weak negative correlation shown by the line of best fit suggests that the less T.V the girls and boys of Mayfield High watch, the more they weigh. I would have expected the children who watch more T.V to weigh more, but this is not the case at Mayfield High. Maybe the children who watch a lot of T.V exercise whilst watching, or maybe the random sample I picked happened by chance to favour students who were naturally thin.

I will now look further into the correlation between hours of T.V watched and weight by using a technique called Spearman’s rank, to see more clearly the strength of the correlation, whether negative or positive. Spearman’s rank gives a number between 1 and –1, with 1 being a perfect positive correlation and –1 being a perfect negative correlation.

I can see from Spearman’s rank that there is a very weak negative correlation for both boys and girls, which agrees with my scatter graphs. The coefficients are –0.11 for boys and –0.15 for girls, which shows more clearly how weak the correlation is: it is so small that there is barely any correlation at all. A value of 0 means no correlation, and my results are close to this, so weight and T.V watching may be completely unrelated. More T.V watched definitely does not mean a heavier pupil. Taken at face value, Spearman’s rank would only suggest that the more T.V watched, the lighter the pupil, but since the values are so close to 0, I believe there is no real link between weight and the amount of T.V watched.
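For reference, Spearman’s rank can be computed with the standard formula 1 – 6Σd² ÷ (n(n² – 1)), where d is the difference between the two ranks of each pupil. The Python sketch below is a minimal illustration under two assumptions: the data contains no tied values (the simple formula is only exact in that case), and the pupils listed are hypothetical, not the coursework sample. The function names `ranks` and `spearman_rank` are also illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rank(xs, ys):
    """Spearman's rank coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical pupils for illustration only, NOT the Mayfield High sample.
tv_hours = [3, 12, 14, 17, 20, 27]
weights_kg = [60, 45, 52, 41, 55, 48]

print(round(spearman_rank(tv_hours, weights_kg), 2))
```

With real data that contains repeated values, tied ranks would need to be averaged before using this formula.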

Cumulative Frequency curves

A good way of representing this data on a diagram is to draw cumulative frequency curves. If the curves are drawn on the same axes, it is easier to compare the results.

When plotting cumulative frequency graphs, we plot the end point of the interval on the horizontal axis, against the cumulative frequency on the vertical axis.

Cumulative frequency can be a very powerful tool when comparing different sets of data.

Average no. Hrs of T.V watched a week

The following tables show the cumulative frequency for average number of hours of T.V watched a week, boys and girls.

Boys

Average no. hrs T.V watched a week, h    Frequency    Cumulative frequency
0 < h ≤ 5                                1            1
5 < h ≤ 10                               1            2
10 < h ≤ 15                              13           15
15 < h ≤ 20                              15           30
20 < h ≤ 25                              3            33
25 < h ≤ 30                              3            36
30 < h ≤ 35                              2            38
35 < h ≤ 40                              2            40

Girls

Average no. hrs T.V watched a week, h    Frequency    Cumulative frequency
0 < h ≤ 5                                2            2
5 < h ≤ 10                               5            7
10 < h ≤ 15                              9            16
15 < h ≤ 20                              8            24
20 < h ≤ 25                              5            29
25 < h ≤ 30                              6            35
30 < h ≤ 35                              3            38
35 < h ≤ 40                              2            40
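The points to plot for the two cumulative frequency curves can be generated from the frequency columns of the tables above. This is a minimal Python sketch; the function name `cumulative_points` is an illustrative assumption. Each point pairs the end point of an interval with the running total of frequencies, as described earlier.

```python
from itertools import accumulate

# End points of the intervals and the frequencies from the two tables above.
upper_bounds = [5, 10, 15, 20, 25, 30, 35, 40]
boys_freq = [1, 1, 13, 15, 3, 3, 2, 2]
girls_freq = [2, 5, 9, 8, 5, 6, 3, 2]


def cumulative_points(bounds, freqs):
    """Pair each interval end point with the cumulative frequency up to it."""
    return list(zip(bounds, accumulate(freqs)))


boys_points = cumulative_points(upper_bounds, boys_freq)
girls_points = cumulative_points(upper_bounds, girls_freq)

for (hours, boys_cf), (_, girls_cf) in zip(boys_points, girls_points):
    print(f"h <= {hours}: boys {boys_cf}, girls {girls_cf}")
```

Plotting both lists of points on the same axes gives the two curves to compare.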


Conclusion

I did a stem and leaf diagram, since the data was in grouped intervals. This made it easier for me to find the median.

The scatter graphs gave me some very strange, unexpected results. I drew scatter graphs so I could compare the boys’ and girls’ weights with the hours of T.V they watched a week. I predicted that the more T.V you watch, the more you weigh, because the less active you are, but my results showed that the more T.V watched, the less the pupils weighed. I knew this because of the negative correlation. When I looked further into this using Spearman’s rank, I found that there was only a very weak negative correlation between hours of T.V watched and weight. This told me that the two are probably unrelated, and that if they are related at all, it would only be that the more T.V you watch, the less you weigh, which contradicts my original hypothesis.

Cumulative frequency graphs allowed me to find the interquartile range, which is another way of looking at the spread of the data: the smaller the interquartile range, the less spread out the data. I also made box and whisker diagrams to represent this data and make it easier to compare the boys’ and girls’ results. These told me that boys generally weigh more and girls watch more T.V. I then used standard deviation, which gave me another way of comparing the spread of the data. I learnt that the girls’ weights are more spread out than the boys’, and that the boys’ T.V hours are more spread out than the girls’.

In conclusion, in my sample from Mayfield High, the more T.V pupils watch, the less they weigh, although this correlation is very weak. Boys are generally heavier than girls, and girls watch slightly more T.V than boys.


This student written piece of work is one of many that can be found in our GCSE Height and Weight of Pupils and other Mayfield High School investigations section.
