# GCSE Statistics Coursework

### Introduction

The Daily Mail has a higher average circulation than the Daily Mirror. (Circulation is the number of sold, reduced-price and free copies of a title distributed on an average day over the stated period of time.) The Daily Mail's average circulation is 39.5% higher than the Daily Mirror's, which leads me to believe that it is the more popular newspaper by a large margin. A lead of almost 40% could not have been built up within a day, so I also believe that the Daily Mail has had a larger and more loyal readership than the Daily Mirror for a long period of time. I obtained my secondary data from http://nmauk.co.uk.

### Average Circulation: -

Daily Mail = 2,350,694

Daily Mirror = 1,684,660
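The 39.5% figure can be checked from the two circulation values above. This is only a small sketch of the arithmetic; the function name `percent_higher` is my own and is not part of the original working.

```python
# Average daily circulation figures quoted above.
daily_mail = 2_350_694
daily_mirror = 1_684_660


def percent_higher(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


difference = percent_higher(daily_mail, daily_mirror)
print(f"Daily Mail circulation is {difference:.1f}% higher than the Daily Mirror")
```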

### Hypothesis One: -

### “Daily Mail will have more adverts than the Daily Mirror.”

I believe the Daily Mail will have more adverts than the Daily Mirror because it is the more popular newspaper. Businesses would want to advertise in it because their adverts would get more exposure from its larger readership, which should also give the adverts a higher success rate. Customers who have had a good experience may also recommend it to their friends, who would in turn advertise in the Daily Mail. To test this, I will need to categorise the articles in each newspaper, which will tell me the total number of articles both including and excluding adverts. I can then use this data to draw a composite bar chart that clearly shows the proportion of each genre of article.

| No. of letters | Frequency | Cumulative Frequency |
| --- | --- | --- |
| 5 | 5 | 18 |
| 6 | 11 | 29 |

The 21.5th number lies in this category so this means that the median is six for the Daily Mirror.

Upper Quartile = (42 [Total amount of Numbers (n)] + 1) / 4 = 10.75 x 3 = 32.25th number.

| No. of letters | Frequency | Cumulative Frequency |
| --- | --- | --- |
| 1 | 0 | 0 |
| 2 | 1 | 1 |
| 3 | 7 | 8 |
| 4 | 5 | 13 |
| 5 | 5 | 18 |
| 6 | 11 | 29 |
| 7 | 6 | 35 |

The 32.25th number lies in this category because it is bigger than 29, which puts it in the category above, so this means that the upper quartile is seven for the Daily Mirror.

Interquartile range = 7 − 4 = 3

I will now use the interquartile range to search for outliers.

Outliers = 3 [Interquartile range] × 1.5 = 4.5 [n]

4 [Lower Quartile] − 4.5 [n] = −0.5

7 [Upper Quartile] + 4.5 [n] = 11.5

There are no outliers because there are no values which are outside this range.
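The quartile and outlier working for the Daily Mirror can be sketched in code. This is an illustration under some assumptions: the table in this extract stops at 7 letters, so the total n = 42 is passed in separately, and the helper names (`value_at_position`, `quartiles`) are my own.

```python
from itertools import accumulate

# Visible rows of the Daily Mirror table: number of letters -> frequency.
# Rows above 7 letters are not shown in this extract, so n is supplied by hand.
mirror_freq = {1: 0, 2: 1, 3: 7, 4: 5, 5: 5, 6: 11, 7: 6}


def value_at_position(freq: dict[int, int], position: float) -> int:
    """Return the first value whose cumulative frequency reaches the position."""
    values = sorted(freq)
    cumulative = accumulate(freq[v] for v in values)
    for value, running_total in zip(values, cumulative):
        if running_total >= position:
            return value
    raise ValueError("position lies beyond the rows supplied")


def quartiles(freq: dict[int, int], n: int) -> tuple[int, int, int]:
    """Lower quartile, median and upper quartile using the (n + 1) / 4 rule."""
    step = (n + 1) / 4
    return tuple(value_at_position(freq, k * step) for k in (1, 2, 3))


lq, median, uq = quartiles(mirror_freq, n=42)
iqr = uq - lq
lower_fence = lq - 1.5 * iqr  # values below this would be outliers
upper_fence = uq + 1.5 * iqr  # values above this would be outliers
print(lq, median, uq, iqr, lower_fence, upper_fence)
```

Any word length outside the two fences would count as an outlier under the 1.5 × interquartile range rule used above.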

I will now do the same for the Daily Mail:

Lower Quartile = (38 [Total amount of Numbers (n)] + 1) / 4 = 9.75th number. (I will round this up to 10.)

| No. of letters | Frequency | Cumulative Frequency |
| --- | --- | --- |
| 1 | 0 | 0 |
| 2 | 4 | 4 |
| 3 | 6 | 10 |

The 9.75th (10th) number lies in this category so this means that the lower quartile is three for the Daily Mail.

Median = (38 [Total amount of Numbers (n)] + 1) / 2 = 19.5th number. (I will round this up to 20.)

| No. of letters | Frequency | Cumulative Frequency |
| --- | --- | --- |
| 1 | 0 | 0 |
| 2 | 4 | 4 |
| 3 | 6 | 10 |
| 4 | 5 | 15 |
| 5 | 7 | 22 |

The 19.5th (20th) number lies in this category so this means that the median is five for the Daily Mail.

Upper Quartile = (38 [Total amount of Numbers (n)] + 1) / 4 = 9.75 x 3 = 29.25th number.

| No. of letters | Frequency | Cumulative Frequency |
| --- | --- | --- |
| 1 | 0 | 0 |
| 2 | 4 | 4 |
| 3 | 6 | 10 |
| 4 | 5 | 15 |
| 5 | 7 | 22 |
| 6 | 3 | 25 |
| 7 | 7 | 32 |

The 29.25th number lies in this category, so this means that the upper quartile is seven for the Daily Mail.

Neither box and whisker plot has any outliers; however, the Daily Mail has a larger interquartile range. The Daily Mail box and whisker plot is symmetrical, while the Daily Mirror box and whisker plot is negatively skewed, implying that most of its values are in the higher range. However, I will also need to draw a normal distribution diagram because it is more accurate. I can do this because there are no outliers to affect the normal distribution diagram.

Interquartile Range = 7 − 3 = 4

Outliers = 4 [Interquartile range] × 1.5 = 6 [n]

3 [Lower Quartile] − 6 [n] = −3

7 [Upper Quartile] + 6 [n] = 13

There are no outliers because there are no values which are outside this range.

I will now need to work out the mean and standard deviation using the values in the table which has the frequency for the word lengths.

I used my graphics calculator to calculate the mean and standard deviation for the Daily Mail and Daily Mirror word lengths. However, here is how I would do it by hand.

Formulas:

- Mean = $\sum fx / \sum f$
- Standard Deviation = $\sqrt{\sum fx^2 / \sum f - \text{Mean}^2}$
- or, $\sqrt{\sum f(x - \text{Mean})^2 / \sum f}$
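The formulas above can be sketched in code to show that both versions of the standard deviation give the same answer. The frequency table here is a small made-up example, not the newspaper data, because the full word-length tables are not shown in this extract. The function names are my own.

```python
from math import sqrt

# Hypothetical frequency table for illustration only: word length -> frequency.
example_freq = {2: 1, 3: 2, 4: 1}


def grouped_mean(freq: dict[int, int]) -> float:
    """Mean = sum of f*x divided by sum of f."""
    return sum(f * x for x, f in freq.items()) / sum(freq.values())


def sd_shortcut(freq: dict[int, int]) -> float:
    """Standard deviation = sqrt(sum of f*x^2 / sum of f - mean^2)."""
    total_f = sum(freq.values())
    sum_fx2 = sum(f * x**2 for x, f in freq.items())
    return sqrt(sum_fx2 / total_f - grouped_mean(freq) ** 2)


def sd_from_deviations(freq: dict[int, int]) -> float:
    """Standard deviation = sqrt(sum of f*(x - mean)^2 / sum of f)."""
    mean = grouped_mean(freq)
    total_f = sum(freq.values())
    return sqrt(sum(f * (x - mean) ** 2 for x, f in freq.items()) / total_f)


print(grouped_mean(example_freq), sd_shortcut(example_freq), sd_from_deviations(example_freq))
```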

Daily Mirror:

- Mean = 5.74
- Standard Deviation = 2.12

Daily Mail

- Mean = 5.34
- Standard Deviation = 2.31

I will now use the mean as the peak of my normal distribution diagram and the standard deviation to show where 95% of the data lies. To do this I must multiply the standard deviation by 2 and not by 3, because if I multiply it by 3 I will get negative numbers, which are impossible for word lengths. I will then take the value I obtained by multiplying the standard deviation by 2 and add it to, and subtract it from, the median.

Daily Mirror:

Mean = 5.74

Standard Deviation = 2.12

2.12 x 2 = 4.24

6 − 4.24 = 1.76

6 + 4.24 = 10.24

Daily Mail:

Mean = 5.34

Standard Deviation = 2.31

2.31 x 2 = 4.62

5 − 4.62 = 0.38

5 + 4.62 = 9.62

The Daily Mail clearly has a larger spread from the mean implying that the Daily Mail uses words with a larger range of length.
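For comparison, the same two-standard-deviation range can be centred on the mean instead of the median used in the working above, which is the usual convention for a normal distribution. This is a sketch under that assumption, using the means and standard deviations already quoted; the function name `two_sd_interval` is my own.

```python
def two_sd_interval(mean: float, sd: float) -> tuple[float, float]:
    """Range from mean - 2 sd to mean + 2 sd (roughly 95% of normal data)."""
    return mean - 2 * sd, mean + 2 * sd


# Means and standard deviations quoted above for each newspaper.
mirror_low, mirror_high = two_sd_interval(5.74, 2.12)
mail_low, mail_high = two_sd_interval(5.34, 2.31)

print(f"Daily Mirror: {mirror_low:.2f} to {mirror_high:.2f}")
print(f"Daily Mail:   {mail_low:.2f} to {mail_high:.2f}")
```

Centred either way, the Daily Mail range is the wider of the two, because its standard deviation is larger.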

### Hypothesis Three: -

### “The Daily Mail will have small headlines and a multitude of text.”

I believe the Daily Mail will have small headlines and a large amount of text, whereas the Daily Mirror will have large headlines and a small area of text. I am led to believe this because the Daily Mirror is aimed at people who are busy and do not have the time to read a large amount of text. The large headlines are needed to attract the busy customer's attention, and the brief text is designed to give people the general gist of the story, while the Daily Mail is designed to give readers detailed information on the story. I am going to take a sample of thirty articles, stratified by category so that the categories are in proportion. I will then measure the area of the headline and of the text for each article and plot the data on a scatter graph. I will work out the mean point so that I can plot the line of best fit through it, which will allow me to interpolate and extrapolate. Finally, I will work out Spearman's rank correlation coefficient for the headline and text areas, which will tell me the relationship between the two.
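The plan to draw a line of best fit through the mean point can be sketched as follows. Note the swap of technique: at GCSE the line is normally drawn by eye through the mean point, whereas this sketch uses a least-squares line, which always passes through the mean point. The five pairs are the first five Daily Mirror headline and text areas listed later in this document; the function name `best_fit_line` is my own.

```python
# First five Daily Mirror measurements from the Spearman's rank table below.
headline = [131.25, 10.85, 30.0, 92.25, 12.74]
text = [201.25, 46.50, 101.25, 162.50, 56.00]


def best_fit_line(xs: list[float], ys: list[float]):
    """Return slope, intercept and mean point of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (mean_x, mean_y)


slope, intercept, mean_point = best_fit_line(headline, text)


def predict_text_area(headline_area: float) -> float:
    """Read a text area off the line (interpolation or extrapolation)."""
    return slope * headline_area + intercept


print(mean_point, predict_text_area(50.0))
```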

### Hypothesis Three

I will now rank the areas of headline and text for Spearman's rank:

I am going to use stratified sampling to choose the articles whose headline and text areas I will measure.

### Stratified Sampling: -

I used stratified sampling to choose the articles whose areas I was going to measure for hypothesis three. I stratified the articles by genre and made sure each sample was in proportion to the total number of articles of that genre. I used stratified sampling because it would be representative of all the types of article, and the area of an article may differ depending on its type. For example, if a newspaper specialises in entertainment, its entertainment articles would be bigger. However, I did not include adverts in the sample because I do not consider them to be articles.

Daily Mail = 280 – 138(adverts) = 142

Daily Mirror = 326 – 152(adverts) = 174

Here is an example of how I put the articles in proportion:

38[no. of articles in that genre in the newspaper]/142[total no. of articles] = X [variable] / 30 [Sample Size]

Or, 38 x 30 / 142 = X

Or, 1140 / 142 = X

Or, X = 8.028, which rounds to 8
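The proportion calculation above can be written as a short function. This is a sketch of the same arithmetic; the function name `stratum_sample_size` is my own.

```python
def stratum_sample_size(stratum_count: int, population: int, sample_size: int) -> int:
    """Share of the sample given to one genre, rounded to a whole article."""
    return round(stratum_count * sample_size / population)


# Worked example from above: 38 articles in the genre, 142 articles in total,
# and a sample of 30 articles.
genre_sample = stratum_sample_size(38, 142, 30)
print(genre_sample)
```

Because each genre is rounded separately, the rounded values may not always add up to exactly 30, so the total should be checked after rounding.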

Daily Mirror:

Sport = 8

Crime = 4

Politics = 2

Celebrities = 5

Health = 1

Entertainment = 5

Social = 1

Finance = 1

Opinion = 2

Art = 1

Daily Mail

Sport = 6

Crime = 2

Politics = 5

Celebrities = 1

Health = 1

Entertainment = 6

Social = 4

Finance = 4

Opinions = 1

I will also use quota sampling to choose my sample size.

### Quota Sampling: -

I used quota sampling because I was asked by my teacher to choose a sample of at least 30 articles for hypothesis three. However I used stratified sampling to put the different genre of articles into proportion.

Finally, I will use convenience sampling to choose the articles whose headline and text areas I am going to measure.

### Convenience Sampling: -

I used convenience sampling to choose, from the array of articles I had, the ones whose headline and text areas I would measure. I did this because it was convenient and not time-consuming.

Daily Mirror Spearmans Rank: -

| Headline | Headline Rank | Text | Text Rank | D | D² |
| --- | --- | --- | --- | --- | --- |
| 131.25 | 26 | 201.25 | 26 | 0 | 0 |
| 10.85 | 5 | 46.50 | 5 | 0 | 0 |
| 30 | 11 | 101.25 | 16 | -5 | 25 |
| 92.25 | 20 | 162.50 | 23 | -3 | 9 |
| 12.74 | 6 | 56.00 | 7 | -1 | 1 |
| 96 | 22 | 162.00 | 22 | 0 | 0 |
| 155.04 | 28 | 181.25 | 24 | 4 | 16 |
| 231 | 29 | 198.00 | 25 | 4 | 16 |
| 132 | 27 | 216.00 | 27 | 0 | 0 |
| 86 | 18 | 52.50 | 6 | 12 | 144 |
| 25 | 9 | 33.25 | 3 | 6 | 36 |
| 6 | 1 | 21.70 | 1 | 0 | 0 |
| 45.5 | 13 | 80.85 | 11 | 2 | 4 |
| 92.45 | 21 | 90.00 | 13 | 8 | 64 |
| 9.8 | 4 | 30.00 | 2 | 2 | 4 |
| 105 | 24 | 448.00 | 30 | -6 | 36 |
| 47.5 | 14 | 135.00 | 21 | -7 | 49 |
| 89.25 | 19 | 85.50 | 12 | 7 | 49 |
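Once every D² value is known, Spearman's rank correlation coefficient is found with 1 − 6∑D² / (n(n² − 1)). The sketch below shows that step on a small made-up set of ranks, because only 18 of the 30 articles appear in this extract and the coefficient for the newspaper cannot be reproduced from them. The formula assumes there are no tied ranks, and the function name is my own.

```python
def spearman_from_ranks(rank_a: list[int], rank_b: list[int]) -> float:
    """Spearman's coefficient, 1 - 6 * sum(D^2) / (n * (n^2 - 1)), no ties."""
    n = len(rank_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d_squared / (n * (n**2 - 1))


# Hypothetical ranks for five articles (illustration only).
headline_rank = [1, 2, 3, 4, 5]
text_rank = [2, 1, 3, 5, 4]

coefficient = spearman_from_ranks(headline_rank, text_rank)
print(round(coefficient, 2))
```

A value close to 1 means large headlines go with large areas of text, a value close to −1 means large headlines go with small areas of text, and a value near 0 means there is little relationship.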

### Limitations

There are many things I would add, optimise and improve if I were to conduct this investigation again. Firstly, my data was not totally reliable because I only took one sample from one particular day. To gain a more accurate sample I would need to take further samples from a different day and week. I would then be able to compare the samples and average them to get a more accurate view. I could not do this because of a lack of time and because I was working by myself. Furthermore, I did not include the adverts when I was gathering samples for my areas of headline and text. I do not think the adverts are relevant because their content is made by third parties and is not directly influenced by the newspaper, although the editor of the newspaper still has some control over it. Also, my secondary data was slightly out of date; however, I doubt this had much impact on my work because the Daily Mail's average circulation is 39.5% higher than the Daily Mirror's. This implies that it already has a strong readership, and it is unlikely that people would change newspapers because people tend to stay with what they trust.

Mohammed Jubair Jalil 10AJN 03/02/06
