# Statistics coursework


Introduction

Handling data: Statistics coursework

In this coursework, I will compare the weight of pupils with the number of hours of TV they watch per week and observe the correlation between these two factors. I hypothesise that the amount of TV watched per week will affect weight: the more TV watched per week, the higher the weight. I chose this line of enquiry because I think there may be a correlation between the two, as people who watch a lot of TV will be less active, do less exercise and so weigh more. I will investigate whether this is true. As well as my original hypothesis, I will also look into the difference between genders: I will investigate whether boys or girls weigh more and whether boys or girls watch more TV. I initially believe that boys weigh more than girls, because from my own knowledge they are normally bigger and taller, which should lead to higher weights. I will find out whether this is true.

The following table shows the number of boys and girls in each year group at Mayfield High School.

Year group | Boys | Girls | Total |

7 | 151 | 131 | 282 |

8 | 145 | 125 | 270 |

9 | 118 | 143 | 261 |

10 | 106 | 94 | 200 |

11 | 84 | 86 | 170 |

## Stratified sampling

A stratified sample takes a number of pupils from each group in proportion to that group’s size in the whole population, so that the sample is fair. From my data, I need to take a proportional number of boys and of girls from each year group.

For example:

There are 145 boys in year 8 and 604 boys in Mayfield High in total. For my sample, I need 40 of the 604 boys. Therefore,

(145 ÷ 604) x 40 ≈ 9.6

I will take 9 boys from year 8, so that the boys’ sample totals exactly 40 (rounding 9.6 up to 10 would give 41 boys overall).

I have used this method for each year group, boys and girls.

Year group | Boys | Girls | Total |

7 | 10 | 9 | 19 |

8 | 9 | 9 | 18 |

9 | 8 | 10 | 18 |

10 | 7 | 6 | 13 |

11 | 6 | 6 | 12 |

I will need samples of 40 boys and 40 girls.
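The proportional calculation above can be sketched in Python. This is only an illustration: the function name `proportional_quotas` and the dictionary layout are my own assumptions, while the year-group sizes come from the first table. Note that rounding each quota to the nearest whole number does not always add up to the target of 40, so a small manual adjustment may still be needed.

```python
def proportional_quotas(group_sizes, sample_size):
    """Return the unrounded sample quota for each group (hypothetical helper)."""
    total = sum(group_sizes.values())
    return {group: size * sample_size / total for group, size in group_sizes.items()}


# Year-group sizes taken from the Mayfield High table.
boys = {7: 151, 8: 145, 9: 118, 10: 106, 11: 84}
girls = {7: 131, 8: 125, 9: 143, 10: 94, 11: 86}

for label, sizes in (("Boys", boys), ("Girls", girls)):
    quotas = proportional_quotas(sizes, 40)
    # Simple rounding; the total is printed so any mismatch with 40 is visible.
    rounded = {year: round(quota) for year, quota in quotas.items()}
    print(label, rounded, "total:", sum(rounded.values()))
```

Printing the total alongside the rounded quotas makes it easy to spot when one year group has to be adjusted by hand.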


Girls’ weight

Weight | Frequency | Mid interval value | Mid interval value x Frequency |

30 < w ≤ 40 | 3 | 35 | 3 x 35 = 105 |

40 < w ≤ 50 | 22 | 45 | 22 x 45 = 990 |

50 < w ≤ 60 | 12 | 55 | 12 x 55 = 660 |

60 < w ≤ 70 | 2 | 65 | 2 x 65 = 130 |

70 < w ≤ 80 | 1 | 75 | 1 x 75 = 75 |

80 < w ≤ 90 | 0 | 85 | 0 x 85 = 0 |

Total | 40 | | 1960 |

Estimated mean = 1960 ÷ 40 = 49

The estimated mean weight of the girls is lower than that of the boys, which suggests that boys generally weigh more than girls.
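The estimated mean from a grouped frequency table can be checked with a short Python sketch. The function name `estimated_mean` and the tuple layout `(lower, upper, frequency)` are assumptions made for illustration; the figures are the girls’ weight classes from the table.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data using class midpoints.

    classes is a list of (lower, upper, frequency) tuples.
    """
    total_frequency = sum(freq for _, _, freq in classes)
    # Each class contributes its midpoint multiplied by its frequency.
    weighted_sum = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    return weighted_sum / total_frequency


girls_weight = [
    (30, 40, 3),
    (40, 50, 22),
    (50, 60, 12),
    (60, 70, 2),
    (70, 80, 1),
    (80, 90, 0),
]

print(estimated_mean(girls_weight))  # 49.0
```

The result is only an estimate, because every value in a class is treated as if it sat at the midpoint of that class.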

### Median interval

Weight | frequency | Cumulative frequency |

30 < w ≤40 | 3 | 3 |

40 < w ≤50 | 22 | 25 |

50 < w ≤60 | 12 | 37 |

60 < w ≤70 | 2 | 39 |

70 < w ≤80 | 1 | 40 |

80 < w ≤ 90 | 0 | 40 |

40 ÷ 2 = 20

I can see from the table that the 20th value, and so the median, lies in the interval 40 < w ≤ 50.

Mode

The most frequent weights lie in the interval 40 < w ≤ 50.
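Finding the median interval and the modal interval from a frequency table can be sketched as follows. The helper names `median_class` and `modal_class` are hypothetical, and the sketch takes the median position as half the total frequency, as in the weight calculation above.

```python
from itertools import accumulate


def median_class(classes):
    """Return the (lower, upper) class in which the n/2-th value falls."""
    frequencies = [freq for _, _, freq in classes]
    half = sum(frequencies) / 2
    # Walk through the running totals until the halfway position is reached.
    for (lower, upper, _), running_total in zip(classes, accumulate(frequencies)):
        if running_total >= half:
            return (lower, upper)


def modal_class(classes):
    """Return the (lower, upper) class with the highest frequency."""
    lower, upper, _ = max(classes, key=lambda item: item[2])
    return (lower, upper)


girls_weight = [
    (30, 40, 3),
    (40, 50, 22),
    (50, 60, 12),
    (60, 70, 2),
    (70, 80, 1),
    (80, 90, 0),
]

print(median_class(girls_weight))  # (40, 50)
print(modal_class(girls_weight))   # (40, 50)
```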

Range

72 – 30 = 42

Boys

Average hrs. T.V watched per week | Frequency | Mid interval value | Mid interval value x Frequency |

0 < h ≤ 5 | 1 | 2.5 | 2.5 x 1 = 2.5 |

5 < h ≤ 10 | 1 | 7.5 | 7.5 x 1 = 7.5 |

10 < h ≤ 15 | 13 | 12.5 | 12.5 x 13 = 162.5 |

15 < h ≤ 20 | 15 | 17.5 | 17.5 x 15 = 262.5 |

20 < h ≤ 25 | 3 | 22.5 | 22.5 x 3 = 67.5 |

25 < h ≤ 30 | 3 | 27.5 | 27.5 x 3 = 82.5 |

30 < h ≤ 35 | 2 | 32.5 | 32.5 x 2 = 65 |

35 < h ≤ 40 | 2 | 37.5 | 37.5 x 2 = 75 |

Total | 40 | | 725 |

Estimated mean = 725 ÷ 40 = 18.125

### Median interval

Average no. hrs T.V watched a week | Frequency | Cumulative frequency |

0 < h ≤ 5 | 1 | 1 |

5 < h ≤ 10 | 1 | 2 |

10< h ≤ 15 | 13 | 15 |

15< h ≤ 20 | 15 | 30 |

20< h ≤ 25 | 3 | 33 |

25< h ≤ 30 | 3 | 36 |

30< h ≤ 35 | 2 | 38 |

35< h ≤ 40 | 2 | 40 |

(40 + 1) ÷ 2 = 20.5

I can see from the table that the median lies in the interval 15 < h ≤ 20.

Mode

The mode is the most common term (highest frequency). In this case, the modal interval is 15 < h ≤ 20 because it has the highest frequency.

### Range

40 – 3 = 37

## GIRLS

Average hrs. T.V watched per week | Frequency | Mid interval value | Mid interval value x Frequency |

0 < h ≤ 5 | 2 | 2.5 | 2.5 x 2 = 5 |

5 < h ≤ 10 | 5 | 7.5 | 7.5 x 5 = 37.5 |

10 < h ≤ 15 | 9 | 12.5 | 12.5 x 9 = 112.5 |

15< h ≤ 20 | 8 | 17.5 | 17.5 x 8 = 140 |

20< h ≤ 25 | 5 | 22.5 | 22.5 x 5 = 112.5 |

25< h ≤ 30 | 6 | 27.5 | 27.5 x 6 = 165 |

30< h ≤ 35 | 3 | 32.5 | 32.5 x 3 = 97.5 |

35< h ≤ 40 | 2 | 37.5 | 37.5 x 2 = 75 |

Total | 40 | | 745 |

Estimated mean = 745 ÷ 40 = 18.625

The girls’ estimated mean is slightly higher than the boys’ estimated mean (by 0.5 of an hour), which suggests that girls watch slightly more TV than boys.

Median

Average no. hrs T.V watched a week | Frequency | Cumulative frequency |

0 < h ≤ 5 | 2 | 2 |

5 < h ≤ 10 | 5 | 7 |

10< h ≤ 15 | 9 | 16 |

15< h ≤ 20 | 8 | 24 |

20< h ≤ 25 | 5 | 29 |

25< h ≤ 30 | 6 | 35 |

30< h ≤ 35 | 3 | 38 |

35< h ≤ 40 | 2 | 40 |

(40 + 1) ÷ 2 = 20.5

The median in this case lies in the interval 15 < h ≤ 20.

The boys and girls have the same median interval, which suggests that boys and girls watch similar amounts of television.

### Mode

The mode is the term that has the highest frequency. In this case, the modal interval is 10 < h ≤ 15.

### Range

40 – 3 = 37

### Stem and leaf

Since the data is grouped into class intervals, it also makes sense to record it in a stem and leaf diagram. This will make it easier to read off the median values.

Boys’ weight

Stem (tens) | Leaf (units) | Frequency |

3 | 5, 5 | 2 |

4 | 1, 1, 2, 5, 5, 5, 5, 5, 7, 7, 9, 9 | 12 |

5 | 0, 0, 1, 1, 2, 3, 4, 4, 4, 5, 5, 7, 7, 9 | 14 |

6 | 0, 0, 0, 0, 2, 3, 3, 4, 8, 9 | 10 |

7 | 2 | 1 |

8 | 6 | 1 |

A stem and leaf diagram can be used to work out the median. In this case there are twenty numbers up to and including 53, and twenty numbers at 54 or above therefore the median is:

(53 + 54) ÷ 2 = 107 ÷ 2 = 53.5
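Reading the median off a stem and leaf diagram can be checked with a short Python sketch. The dictionary layout and the helper name `expand` are assumptions for illustration; the leaves are copied from the boys’ weight diagram, and `statistics.median` from the standard library averages the two middle values when the count is even.

```python
from statistics import median


def expand(stem_and_leaf):
    """Turn a {stem: [leaves]} diagram back into a flat list of values."""
    return [stem * 10 + leaf for stem, leaves in stem_and_leaf.items() for leaf in leaves]


# Leaves copied from the boys' weight stem and leaf diagram.
boys_weight = {
    3: [5, 5],
    4: [1, 1, 2, 5, 5, 5, 5, 5, 7, 7, 9, 9],
    5: [0, 0, 1, 1, 2, 3, 4, 4, 4, 5, 5, 7, 7, 9],
    6: [0, 0, 0, 0, 2, 3, 3, 4, 8, 9],
    7: [2],
    8: [6],
}

values = expand(boys_weight)
print(len(values), median(values))  # 40 53.5
```

Counting the values first is a useful check that no leaf has been missed when copying the diagram.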

## Girls’ weight

Stem (tens) | Leaf (units) | Frequency |

3 | 0, 2, 4 | 3 |

4 | 5, 5, 5, 5, 5, 5, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8, 8, 9 | 18 |

5 | 0, 0, 0, 0, 2, 4, 4, 4, 4, 7, 7, 7, 8, 8, 8 | 14 |

6 | 0, 0, 2, 5 | 4 |

7 | 2 | 1 |

8 | | 0 |

In this case there are twenty numbers up to and including 48, and twenty numbers at 49 or above therefore the median is:

(48 + 49) ÷ 2 = 97 ÷ 2 = 48.5

## Hours of television watched by boys a week

Stem (tens) | Leaf (units) | Frequency |

0 | 3, 7 | 2 |

1 | 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 9 | 22 |

2 | 0, 0, 0, 0, 0, 0, 1, 4, 5, 7 | 10 |

3 | 0, 0, 1, 2 | 4 |

4 | 0, 0 | 2 |

In this case, there are twenty numbers up to and including 17, and twenty numbers at 17 or above therefore the median is:

(17 + 17) ÷ 2 = 34 ÷ 2 = 17

## Hours of television watched by girls a week

Stem (tens) | Leaf (units) | Frequency |

0 | 3, 4, 7, 8, 8 | 5 |

1 | 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4, 6, 6, 7, 7, 8, 8 | 17 |

2 | 0, 0, 1, 1, 2, 4, 4, 6, 6, 8, 8, 8, 8 | 13 |

3 | 2, 5, 5, 9 | 4 |

4 | 0 | 1 |

In this case, there are twenty numbers up to and including 17, and twenty numbers at 18 or above therefore the median is:

(17 + 18) ÷ 2 = 35 ÷ 2 = 17.5

From these stem and leaf diagrams I can see that boys weigh more, as their median was 5 kg higher, and that girls watch slightly more TV, as their median was 0.5 of an hour higher. This agrees with the information I gathered from my estimated means above.

## Scatter diagrams

Now I will use scatter diagrams. I will draw a scatter diagram and observe the correlation of weight against hours of TV watched a week. The correlation will show me what effect each factor has on the other. I would expect a child’s weight to be higher the more hours of TV they watch. I predict this because a child who watches more TV would logically do less exercise, as they would be sitting in front of a television instead of being active, and so would have a higher weight. Scatter diagrams are a very important part of my coursework.

The results from these scatter graphs are very strange, and I did not expect them. The weak negative correlation shown by the line of best fit suggests that the less TV the girls and boys of Mayfield High watch, the more they weigh. I would have expected the children who watch more TV to weigh more, but this is not the case at Mayfield High. Maybe the children who watch a lot of TV exercise while watching, or maybe the random sample I picked happened by chance to favour students who were naturally thin.

I will now look further into the correlation between hours of TV watched and weight by using a technique called Spearman’s rank, to see more clearly the strength of the correlation, whether negative or positive. Spearman’s rank gives a number between 1 and –1, with 1 being a perfect positive correlation and –1 being a perfect negative correlation.
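A minimal sketch of how Spearman’s rank can be calculated is given below, using the common formula 1 – 6Σd² ÷ (n(n² – 1)), where d is the difference between the two ranks of each pupil and n is the number of pupils. The function names and the five pairs of values are hypothetical and are not taken from the Mayfield sample. Tied values are given the average of the ranks they share, and the formula is only exact when there are no ties.

```python
def ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rank coefficient from the sum of squared rank differences."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical example data (hours of TV, weight in kg) for five pupils.
hours = [5, 12, 17, 20, 30]
weights = [52, 48, 50, 45, 47]
print(spearman(hours, weights))
```

A result near 0 from a calculation like this would indicate little or no monotonic relationship between the two variables.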

Spearman’s rank shows a very weak negative correlation for both boys and girls, which agrees with my scatter graphs. It also shows more clearly how weak the correlation is: the coefficients are –0.11 for boys and –0.15 for girls. A value of 0 means no correlation, and my results are close to this, so weight and the amount of TV watched may be completely unrelated. More TV watched certainly does not mean a higher weight. Going by Spearman’s rank alone, I could only say that the more TV watched, the lower the weight, but since the coefficients are so close to 0, I believe there is no real link between weight and the amount of TV watched.

Cumulative Frequency curves

A good way of representing this data on a diagram is to draw cumulative frequency curves. If the curves are drawn on the same axes, it is easier to compare the results.

When plotting cumulative frequency graphs, we plot the end point of the interval on the horizontal axis, against the cumulative frequency on the vertical axis.

Cumulative frequency can be a very powerful tool when comparing different sets of data.

## Average no. Hrs of T.V watched a week

The following tables show the cumulative frequency for average number of hours of T.V watched a week, boys and girls.

Boys

Average no. hrs T.V watched a week | Frequency | Cumulative frequency |

0 < h ≤ 5 | 1 | 1 |

5 < h ≤ 10 | 1 | 2 |

10< h ≤ 15 | 13 | 15 |

15< h ≤ 20 | 15 | 30 |

20< h ≤ 25 | 3 | 33 |

25< h ≤ 30 | 3 | 36 |

30< h ≤ 35 | 2 | 38 |

35< h ≤ 40 | 2 | 40 |

Girls

Average no. hrs T.V watched a week | Frequency | Cumulative frequency |

0 < h ≤ 5 | 2 | 2 |

5 < h ≤ 10 | 5 | 7 |

10< h ≤ 15 | 9 | 16 |

15< h ≤ 20 | 8 | 24 |

20< h ≤ 25 | 5 | 29 |

25< h ≤ 30 | 6 | 35 |

30< h ≤ 35 | 3 | 38 |

35< h ≤ 40 | 2 | 40 |
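The plotting points for the two cumulative frequency curves, and rough quartile estimates read from them, can be sketched in Python. The helper names are hypothetical, and the sketch assumes straight lines between plotted points rather than a smooth hand-drawn curve, so values read from a real graph may differ slightly.

```python
from itertools import accumulate


def cumulative_points(upper_bounds, frequencies, start=0):
    """Points (upper class boundary, cumulative frequency), starting at (start, 0)."""
    return [(start, 0)] + list(zip(upper_bounds, accumulate(frequencies)))


def read_off(points, target):
    """Estimate the value at a cumulative frequency by linear interpolation."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the range of cumulative frequencies")


# Upper class boundaries and frequencies from the two tables above.
upper_bounds = [5, 10, 15, 20, 25, 30, 35, 40]
boys = cumulative_points(upper_bounds, [1, 1, 13, 15, 3, 3, 2, 2])
girls = cumulative_points(upper_bounds, [2, 5, 9, 8, 5, 6, 3, 2])

for label, points in (("Boys", boys), ("Girls", girls)):
    # With 40 pupils, the quartiles sit at cumulative frequencies 10 and 30.
    lower_quartile = read_off(points, 10)
    upper_quartile = read_off(points, 30)
    print(label, "interquartile range estimate:", round(upper_quartile - lower_quartile, 1))
```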

Conclusion

I drew stem and leaf diagrams, since the data was in grouped intervals. This made it easier for me to find the median.

The scatter graphs gave me some very strange, unexpected results. I drew scatter graphs so that I could compare the boys’ and girls’ weights with the hours of TV they watched a week. I predicted that the more TV you watch, the more you weigh, because you are less active, but my results showed that the more TV watched, the less the pupils weighed. I knew this because of the negative correlation. When I looked further into this using Spearman’s rank, I found a very weak negative correlation between hours of TV watched and weight. This told me that the two are probably unrelated, but if they are related at all, it would only be that the more TV you watch, the less you weigh, which contradicts my original hypothesis.

Cumulative frequency graphs allowed me to find the interquartile range, which is another way of looking at the range. The smaller the range, the less spread out the data. I also made box and whisker diagrams to represent this data and make it easier to compare the boys’ and girls’ results. These told me that boys generally weigh more and girls watch more TV. I then used standard deviation, which gave me another way of looking at my information. When comparing the spreads, I learnt that girls have a greater spread in their weights than the boys and that boys have a greater spread in the amount of TV they watch.

In conclusion, at Mayfield High the more TV you watch, the less you weigh. Boys are generally heavier than girls, and girls watch slightly more TV than boys.
