The Guardian vs. The Mirror
I am doing an investigation into the statistical differences between the daily tabloid newspapers, and the weekly broadsheet newspapers.
My overall hypothesis is that the daily tabloid papers, here represented by the Saturday edition of The Mirror, make an easier read than the more comprehensive broadsheets, here represented by the Guardian, a weekly broadsheet. To reach a conclusion, I plan to test three hypotheses in specific areas. I will use a range of sampling methods, and ways of presenting data, in order to form valid conclusions.
Planning
1 - My hypothesis is that the number of letters per word will be greater in the Guardian than in The Mirror.
Number of letters - I will count the number of letters in every fourth word.
In order to make my calculations accurate enough to reach a valid conclusion, I must collect a minimum of twenty pieces of data from each newspaper. I was planning to collect data from the fourth word in the first sentence on each page. However, if my second hypothesis is correct, then the sentences in the Guardian will be longer than those in The Mirror. This would corrupt the results, as some would be more accurate than others. So, I have decided to take the fourth and the eighth word from the first article on each page. The sections of each paper I have chosen are twenty-five pages long, so this will provide more than enough data to support any conclusion I reach, and should incorporate all sections of each newspaper.
I will display my results in a data frequency chart. Then I will use averages and histograms, to compare the results and draw my conclusion.
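As a rough illustration of this sampling step, the sketch below picks the fourth and eighth word of an article and counts their letters. The function name `sampled_word_lengths`, the example sentence, and the decision to ignore punctuation when counting are my own assumptions, not part of the method described above.

```python
def sampled_word_lengths(article_text, positions=(4, 8)):
    """Return the letter counts of the words at the given 1-based positions."""
    words = article_text.split()
    lengths = []
    for position in positions:
        # Skip positions that fall beyond the end of a short article.
        if position <= len(words):
            word = words[position - 1]
            # Count letters only, so punctuation does not inflate the length.
            lengths.append(sum(1 for ch in word if ch.isalpha()))
    return lengths


example = "The quick brown fox jumps over the lazy dog today."
print(sampled_word_lengths(example))  # letter counts for "fox" and "lazy"
```

Repeating this for the first article on each of the twenty-five pages would give up to fifty letter counts per newspaper.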
2 - My second hypothesis is that the number of words per sentence will be fewer in The Mirror than in the Guardian.
Number of words - I'll count the number of words in the first sentence, on each page.
In order to make my calculations accurate enough to reach a valid conclusion, I must collect a minimum of twenty pieces of data from each newspaper. The section I've chosen from each newspaper is twenty-five pages long, so I will collect data from the first sentence, in the first article of every page. This should incorporate all sections of both newspapers, and provide more than enough data to support any conclusion I reach.
I will display my results in a data frequency chart. Then I will use standard deviation, averages, histograms, box and whisker diagrams, and the quartile range, to compare the results and draw my conclusion.
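The spread measures listed above could be calculated along the following lines. This is only a sketch: the values in `sample` are made-up illustrative sentence lengths, not my collected data, and `statistics.quantiles` (Python 3.8 or later) follows one of several possible quartile conventions, so hand-worked quartiles may differ slightly.

```python
import statistics


def spread_summary(values):
    """Summarise centre and spread for a list of sentence lengths."""
    lower_q, median, upper_q = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "std_dev": statistics.pstdev(values),  # population standard deviation
        "lower_quartile": lower_q,
        "median": median,
        "upper_quartile": upper_q,
        "interquartile_range": upper_q - lower_q,
    }


# Hypothetical sentence lengths, for illustration only.
sample = [10, 12, 14, 16, 18, 20, 22]
for name, value in spread_summary(sample).items():
    print(f"{name}: {value}")
```

The five values needed for a box and whisker diagram are the minimum, the lower quartile, the median, the upper quartile and the maximum, so the same summary covers that diagram too.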
3 - My hypothesis is that the larger the number of words in the headline, the longer the article. I also believe that the number of words in the headline and/or article, will be greater in the Guardian than in The Mirror.
Number of words in the headline - All words will be included.
Length of article - The Guardian has a standard column width, so I could simply measure the length of the column with a ruler. However, The Mirror uses two different standard widths. I can't exclude columns of one width, as there may be a pattern to which articles have the wider column, and which articles have the narrower one. The Mirror is not separated into sections in the same way the Guardian is (e.g. finance, politics, sport), but the column width may be its equivalent way of sectioning off different forms of article. By excluding one width I may, therefore, be excluding a large part of the newspaper and, in the process, invalidating all my results and conclusions. So instead, I will use the average number of words per sentence, which I calculated whilst working on my second hypothesis, and then count the number of sentences in each article. Multiplying the number of sentences by the average number of words per sentence will give me a reasonably accurate estimate of the number of words per article.
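The estimate described above can be sketched as follows. The function name `estimate_article_words` is hypothetical; the two means (18 for The Mirror and 30.2 for the Guardian) are the values reported later in my results for the second hypothesis, and the sentence counts are the first article from each paper.

```python
def estimate_article_words(sentence_count, mean_words_per_sentence):
    """Estimate article length as sentences x mean words per sentence."""
    return round(sentence_count * mean_words_per_sentence)


MIRROR_MEAN = 18      # mean words per sentence found for The Mirror
GUARDIAN_MEAN = 30.2  # mean words per sentence found for the Guardian

print(estimate_article_words(21, MIRROR_MEAN))    # a 21-sentence Mirror article
print(estimate_article_words(53, GUARDIAN_MEAN))  # a 53-sentence Guardian article
```

This is only an estimate: it assumes every article in a paper has the same average sentence length as the first sentences sampled for the second hypothesis.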
In order to make my calculations accurate enough to have a valid conclusion, I must collect a minimum of twenty pieces of data from each paper. The section of each paper I have selected is twenty-five pages long, so I will use the first article on every page. This should incorporate all sections in the Guardian, and all the column widths used in The Mirror. This will provide more than enough data to support the conclusion I reach.
I will display my results in a scatter graph, and use a line of best fit, to see if there is a positive correlation between the number of words in the headline, and the article. I will also work out the mean ratio of number of headline words to number of words in article, for each newspaper, and use this to compare them. If my hypothesis is correct, we may be able to compare the length of the articles, simply by looking at the headline.
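A line of best fit and a correlation coefficient can also be calculated rather than judged by eye. The sketch below uses the standard least-squares formulas; the function name and the small data set are illustrative assumptions, not my collected results.

```python
from math import sqrt


def best_fit_and_correlation(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical headline lengths and article lengths, for illustration only.
headline_words = [1, 2, 3, 4]
article_words = [2, 4, 6, 8]
print(best_fit_and_correlation(headline_words, article_words))
```

A value of r close to +1 would support a positive correlation, while a value close to 0 would suggest that headline length says little about article length.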
Collecting Data
1 -
No. of Letters | Frequency of The Mirror | Total
1-2 | IIIII III | 8 x 2 = 16
3-4 | IIIII IIIII IIIII IIIII | 20 x 4 = 80
5-6 | IIIII IIIII III | 13 x 6 = 78
7-8 | IIIII IIII | 9 x 8 = 72
9-10 | | 0 x 10 = 0
11-12 | | 0 x 12 = 0
No. of Letters | Frequency of The Mirror | Cumulative Frequency
1-2 | IIIII III | 8 (+ 20)
3-4 | IIIII IIIII IIIII IIIII | 28 (+ 13)
5-6 | IIIII IIIII III | 41 (+ 9)
7-8 | IIIII IIII | 50 (+ 0)
9-10 | | 50 (+ 0)
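The running totals in a cumulative frequency column can be checked with a short sketch like this one, here using the Mirror frequencies for the classes 1-2 to 7-8. The function name `cumulative_frequencies` is my own.

```python
from itertools import accumulate


def cumulative_frequencies(frequencies):
    """Return the running total after each class."""
    return list(accumulate(frequencies))


mirror_frequencies = [8, 20, 13, 9]  # classes 1-2, 3-4, 5-6, 7-8
print(cumulative_frequencies(mirror_frequencies))
```

The final running total should always equal the number of words sampled, which is a quick way to catch an addition slip.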
No. of Letters | Frequency of the Guardian | Total
1-2 | IIIII | 5 x 2 = 10
3-4 | IIIII IIIII III | 13 x 4 = 52
5-6 | IIIII IIIII IIIII I | 16 x 6 = 96
7-8 | IIII | 4 x 8 = 32
9-10 | IIIII IIII | 9 x 10 = 90
11-12 | III | 3 x 12 = 36
No. of Letters | Frequency of the Guardian | Cumulative Frequency
1-2 | IIIII | 5 (+ 13)
3-4 | IIIII IIIII III | 18 (+ 16)
5-6 | IIIII IIIII IIIII I | 34 (+ 4)
7-8 | IIII | 38 (+ 9)
9-10 | IIIII IIII | 47 (+ 3)
11-12 | III | 50 (+ 0)
The information in these data frequency charts, and cumulative frequency charts can also be displayed in many different ways. First, I will use it to work out the mode, range and mean number of letters per word:
Mean for The Mirror:
Frequencies: 8 + 20 + 13 + 9 = 50
Totals: 16 + 80 + 78 + 72 = 246
246/50 = 4.92
Mean for the Guardian:
Frequencies: 5 + 13 + 16 + 4 + 9 + 3 = 50
Totals: 10 + 52 + 96 + 32 + 90 + 36 = 316
316/50 = 6.32
My hypothesis was correct. The mean number of letters per word in the Guardian is 1.4 more than the mean number of letters per word in the Mirror.
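The two means can be checked with a short sketch. The working above multiplies each frequency by the upper value of its class; using class midpoints instead is a common alternative, which I include here as an assumption of my own for comparison. The function name `grouped_mean` is hypothetical.

```python
def grouped_mean(class_values, frequencies):
    """Mean of grouped data: sum(value x frequency) / sum(frequency)."""
    total = sum(value * freq for value, freq in zip(class_values, frequencies))
    return total / sum(frequencies)


mirror_freq = [8, 20, 13, 9]          # classes 1-2, 3-4, 5-6, 7-8
guardian_freq = [5, 13, 16, 4, 9, 3]  # classes 1-2 up to 11-12

# Upper class values, as used in the working above.
mirror_mean = grouped_mean([2, 4, 6, 8], mirror_freq)
guardian_mean = grouped_mean([2, 4, 6, 8, 10, 12], guardian_freq)

# Class midpoints, an alternative convention (an assumption, not the working above).
mirror_mid = grouped_mean([1.5, 3.5, 5.5, 7.5], mirror_freq)
guardian_mid = grouped_mean([1.5, 3.5, 5.5, 7.5, 9.5, 11.5], guardian_freq)

print(round(mirror_mean, 2), round(guardian_mean, 2))
print(round(mirror_mid, 2), round(guardian_mid, 2))
```

With either convention the gap between the two papers comes out at 1.4 letters, because every class value shifts by the same half letter.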
Mode for the Mirror: 3-4
Mode for the Guardian: 5-6
This also supports my hypothesis, as the mode for the Guardian is higher than the mode for The Mirror. The mode for the Guardian is still fairly low, but that is because there are still many more words with 5-6 letters than there are with 12. However, if you look at the range:
Range for the Mirror: 8 - 1 = 7
Range for the Guardian: 12 - 1 = 11
You can see that the Guardian does have a much larger range, demonstrating that it does use the longer words I predicted it would, although it obviously still has to use average length words as well.
I will also use a histogram to back up the results shown above in a clearer format.
No. of Letters | Frequency of the Mirror | Frequency Density
1-2 | IIIII III | 8/1 = 8
3-4 | IIIII IIIII IIIII IIIII | 20/3 = 6.67 (2dp)
5-6 | IIIII IIIII III | 13/5 = 2.6
7-8 | IIIII IIII | 9/7 = 1.29 (2dp)
No. of Letters | Frequency of the Guardian | Frequency Density
1-2 | IIIII | 5/1 = 5
3-4 | IIIII IIIII III | 13/3 = 4.33 (2dp)
5-6 | IIIII IIIII IIIII I | 16/5 = 3.2
7-8 | IIII | 4/7 = 0.57 (2dp)
9-10 | IIIII IIII | 9/9 = 1
11-12 | III | 3/11 = 0.27 (2dp)
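The tables above divide each frequency by a class boundary. Frequency density is more usually defined as frequency divided by class width, so the sketch below shows that version for comparison. It assumes each class such as 3-4 holds two whole-number values, giving a width of 2; the function name `frequency_density` is my own.

```python
def frequency_density(classes, frequencies):
    """Return frequency / class width for each (low, high) class."""
    densities = {}
    for (low, high), freq in zip(classes, frequencies):
        width = high - low + 1  # whole-number counts: class 3-4 holds 2 values
        densities[f"{low}-{high}"] = freq / width
    return densities


mirror_classes = [(1, 2), (3, 4), (5, 6), (7, 8)]
mirror_frequencies = [8, 20, 13, 9]
print(frequency_density(mirror_classes, mirror_frequencies))
```

Because every class here has the same width, a histogram drawn this way has the same shape as a plain frequency chart, with 3-4 letters still the tallest bar for The Mirror.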
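[The Mirror Histogram: classes 1-2, 3-4, 5-6, 7-8 on the x-axis; frequency density on the y-axis, scaled from 0 to 6.7]
[The Guardian Histogram: classes 1-2, 3-4, 5-6, 7-8, 9-10, 11-12 on the x-axis; frequency density on the y-axis, scaled from 0 to 5]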
I conclude that my hypothesis was correct. The number of letters per word in the Guardian is greater than the number of letters per word in The Mirror. I think I tested it effectively and fairly, although it may have been useful to check the consistency of each result using standard deviation, or by working out the inter-quartile range. However, this information still clearly proves my hypothesis.
2 -
No. of Words | Frequency of the Mirror | Total
1-5 | I | 1 x 5 = 5
6-10 | III | 3 x 10 = 30
11-15 | IIII | 4 x 15 = 60
16-20 | IIIII IIIII IIII | 14 x 20 = 280
21-25 | III | 3 x 25 = 75
No. of Words | Frequency of the Guardian | Total
1-5 | | 0 x 5 = 0
6-10 | I | 1 x 10 = 10
11-15 | I | 1 x 15 = 15
16-20 | I | 1 x 20 = 20
21-25 | IIIII II | 7 x 25 = 175
26-30 | IIIII II | 7 x 30 = 210
31-35 | I | 1 x 35 = 35
36-40 | IIIII | 5 x 40 = 200
41-45 | II | 2 x 45 = 90
I will now use these figures in the data frequency chart, to work out the averages of each, and compare the two. The mean, mode and range:
The mean for the Mirror:
Frequencies: 1 + 3 + 4 + 14 + 3 = 25
Totals: 5 + 30 + 60 + 280 + 75 = 450
450/25 = 18
The mean for the Guardian:
Frequencies: 1 + 1 + 1 + 7 + 7 + 1 + 5 + 2 = 25
Totals: 10 + 15 + 20 + 175 + 210 + 35 + 200 + 90 = 755
755/25 = 30.2
My hypothesis was correct. The mean number of words per sentence in the Guardian was 12.2 more than the mean number of words per sentence in the Mirror.
The Mode for the Mirror: 16-20
The Mode for the Guardian: the classes 21-25 and 26-30 are joint modal classes, so I averaged their midpoints: (23 + 28)/2 = 25.5
This backs up my earlier results, supporting my hypothesis. The mode for the Guardian is larger than the mode for the Mirror, although the range demonstrates the true extent of the Guardian's sentences.
Range for the Mirror: 25 - 5 = 20
Range for the Guardian: 45 - 10 = 35
The Guardian has a much larger range. This is because there are always some exceptions, and there was one sentence which had only 6-10 words.
I will use a histogram to back up my results:
No. of Words | Frequency of the Mirror | Frequency Density
1-5 | I | 1/5 = 0.2
6-10 | III | 3/10 = 0.3
11-15 | IIII | 4/15 = 0.27 (2dp)
16-20 | IIIII IIIII IIII | 14/20 = 0.7
21-25 | III | 3/25 = 0.12
No. of Words | Frequency of the Guardian | Frequency Density
1-5 | | 0/5 = 0
6-10 | I | 1/10 = 0.1
11-15 | I | 1/15 = 0.07 (2dp)
16-20 | I | 1/20 = 0.05
21-25 | IIIII II | 7/25 = 0.28
26-30 | IIIII II | 7/30 = 0.23 (2dp)
31-35 | I | 1/35 = 0.03 (2dp)
36-40 | IIIII | 5/40 = 0.125
41-45 | II | 2/45 = 0.04 (2dp)
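[The Mirror Histogram: classes 1-5, 6-10, 11-15, 16-20, 21-25 on the x-axis; frequency density on the y-axis, scaled from 0 to 0.7]
[The Guardian Histogram: classes 1-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40, 41-45 on the x-axis; frequency density on the y-axis, scaled from 0 to 0.3]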
I conclude that my hypothesis was correct. The number of words per sentence in the Guardian was greater than the number of words per sentence in the Mirror. I think my investigation was fair, and the results were clear.
3 - The Mirror
No. of words per headline | No. of sentences per article | No. of words per article | Ratio | Decimal Ratio (3dp)
2 | 21 | 21 x 18 = 378 | 1:189 | 0.005
4 | 29 | 29 x 18 = 522 | 2:261 | 0.008
1 | 14 | 14 x 18 = 252 | 1:252 | 0.004
3 | 20 | 20 x 18 = 360 | 1:120 | 0.008
15 | 71 | 71 x 18 = 1278 | 5:426 | 0.012
5 | 13 | 13 x 18 = 234 | 5:234 | 0.021
3 | 32 | 32 x 18 = 576 | 3:32 | 0.094
7 | 24 | 24 x 18 = 432 | 7:24 | 0.292
5 | 15 | 15 x 18 = 270 | 5:15 | 0.333
8 | 11 | 11 x 18 = 198 | 8:11 | 0.727
4 | 124 | 124 x 18 = 2232 | 1:31 | 0.032
3 | 31 | 31 x 18 = 558 | 3:31 | 0.097
5 | 97 | 97 x 18 = 1746 | 5:97 | 0.052
1 | 24 | 24 x 18 = 432 | 1:24 | 0.042
1 | 19 | 19 x 18 = 342 | 1:19 | 0.053
3 | 27 | 27 x 18 = 486 | 1:9 | 0.111
9 | 66 | 66 x 18 = 1188 | 3:22 | 0.136
9 | 43 | 43 x 18 = 774 | 9:43 | 0.209
2 | 33 | 33 x 18 = 594 | 2:33 | 0.061
7 | 111 | 111 x 18 = 1998 | 7:111 | 0.063
9 | 20 | 20 x 18 = 360 | 9:20 | 0.45
2 | 15 | 15 x 18 = 270 | 2:15 | 0.133
7 | 11 | 11 x 18 = 198 | 7:11 | 0.636
8 | 24 | 24 x 18 = 432 | 1:3 | 0.333
3 | 37 | 37 x 18 = 666 | 3:37 | 0.081
The average number of words per sentence: 18
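The ratio and decimal ratio of headline words to article words can be worked out as sketched below. The helper name `headline_ratio` is hypothetical; the two examples are the first two Mirror articles, with 2 and 4 headline words and estimated lengths of 378 and 522 words.

```python
from math import gcd


def headline_ratio(headline_words, article_words):
    """Return the simplified ratio as text and the decimal ratio to 3dp."""
    divisor = gcd(headline_words, article_words)
    simplified = f"{headline_words // divisor}:{article_words // divisor}"
    decimal = round(headline_words / article_words, 3)
    return simplified, decimal


print(headline_ratio(2, 378))
print(headline_ratio(4, 522))
```

Averaging the decimal ratios for each newspaper then gives the mean ratio used to compare the two papers.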
The Guardian
The average number of words per sentence: 30.2
No. of words in headline | No. of sentences in article | No. of words per article (0dp) | Ratio | Decimal Ratio (3dp)
4 | 53 | 53 x 30.2 = 1601 | 4:1601 | 0.002
7 | 18 | 18 x 30.2 = 544 | 7:544 | 0.013
6 | 19 | 19 x 30.2 = 574 | 3:287 | 0.010
5 | 19 | 19 x 30.2 = 574 | 5:574 | 0.009
7 | 39 | 39 x 30.2 = 1178 | 7:1178 | 0.006
12 | 26 | 26 x 30.2 = 1565 | 12:1565 | 0.008
7 | 14 | 14 x 30.2 = 423 | 7:423 | 0.017
8 | 61 | 61 x 30.2 = 1842 | 4:921 | 0.004
8 | 24 | 24 x 30.2 = 725 | 8:725 | 0.011
6 | 17 | 17 x 30.2 = 513 | 6:513 | 0.012
8 | 27 | 27 x 30.2 = 815 | 8:815 | 0.010
11 | 35 | 35 x 30.2 = 1057 | 11:1057 | 0.010
4 | 51 | 51 x 30.2 = 1540 | 1:385 | 0.003
7 | 17 | 17 x 30.2 = 513 | 7:513 | 0.014
9 | 55 | 55 x 30.2 = 1661 | 9:1661 | 0.005
5 | 65 | 65 x 30.2 = 1963 | 5:1963 | 0.003
2 | 15 | 15 x 30.2 = 453 | 2:453 | 0.004
4 | 12 | 12 x 30.2 = 362 | 2:181 | 0.011
5 | 15 | 15 x 30.2 = 453 | 5:453 | 0.011
2 | 23 | 23 x 30.2 = 695 | 2:695 | 0.003
6 | 31 | 31 x 30.2 = 936 | 1:156 | 0.006
8 | 16 | 16 x 30.2 = 483 | 8:483 | 0.017
11 | 29 | 29 x 30.2 = 876 | 11:876 | 0.013
9 | 14 | 14 x 30.2 = 423 | 1:47 | 0.021
[Scatter graph: The Mirror, x-axis scaled from 0 to 12, y-axis scaled from 0 to 68]
[Scatter graph: The Guardian, x-axis scaled from 0 to 16, y-axis scaled from 0 to 130]
There doesn't appear to be any direct correlation between the number of words in the headline and the number of words in the article. I conclude that my hypothesis was incorrect and, therefore, I do not have the data to compare the two results.
My overall conclusion is that my original hypothesis was correct. The fact that the Guardian uses longer words, has longer sentences and, as you can see from my third investigation, has longer articles, shows that it is aimed at the more intelligent reader, who intends to find out more about the subject in context.
I think my investigation went well, although section three's results were disappointing. I think I investigated them in the quickest, fairest way possible, and displayed them clearly, in a number of ways, to represent the results.