The Times
I will now calculate the mean, mode, median and range of the word lengths. But first of all I am going to explain what these numbers are.
Mean:
To calculate the mean you have to multiply each word length by the number of times a word of that length occurs. For example, when words that are 7 letters long occur 5 times, you multiply 5 x 7. You have to do that for every word length, and then add the results up. In the case of the Sun the total is 1279.
Then you have to add up the total number of words, which in the case of the Sun is 301. Finally you divide the first total by the second, which gives 1279/301 = 4.25 (to two decimal places).
So the formula for the mean of discrete data is defined as:
Mean = sum of items/number of items
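The calculation above can be sketched in Python. This is only a minimal illustration: the frequency table `word_length_counts` is made up for the example and is not the real newspaper data, and the function name is my own choice. Only the two totals for the Sun (1279 and 301) come from the text.

```python
# Hypothetical frequency table: word length -> number of words of that length.
# These counts are invented for illustration, not taken from any newspaper.
word_length_counts = {1: 2, 2: 5, 3: 8, 4: 4, 7: 5}


def mean_from_frequencies(counts):
    """Return (sum of length x frequency) divided by (total frequency)."""
    total_letters = sum(length * freq for length, freq in counts.items())
    total_words = sum(counts.values())
    return total_letters / total_words


example_mean = mean_from_frequencies(word_length_counts)

# The same division using the two totals quoted for the Sun.
sun_mean = 1279 / 301

print(example_mean, round(sun_mean, 2))
```

Working from a frequency table avoids having to write every single word length out in one long list.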
Mode:
The mode of a set of data is the value which occurs most often.
So for example in the Sun the answer would be 3, because words with 3 letters occur most often, in this case 67 times.
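Finding the mode from a frequency table can be sketched like this. The table is again a made-up example, not the real data, and the function returns a list because two values could tie for the highest frequency.

```python
# Hypothetical frequency table: word length -> number of words of that length.
word_length_counts = {1: 2, 2: 5, 3: 8, 4: 4, 7: 5}


def mode_from_frequencies(counts):
    """Return every value whose frequency equals the highest frequency."""
    highest = max(counts.values())
    return [value for value, freq in counts.items() if freq == highest]


modes = mode_from_frequencies(word_length_counts)
print(modes)
```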
Median:
The median is the middle value when the data is arranged in order of size.
So you calculate the median by putting the values in order of increasing size. In the case of the word length for the Sun it would be:
3, 4, 7, 9, 17, 17, 19, 39, 58, 66, 67
There are 11 values, so the middle one is the sixth, which is 17.
If there is an even number of values in the data, then the median is the mean of the middle two values. In the case of the word length of the Daily Mail it would be: 0, 1, 1, 1, 2, 7, 12, 17, 26, 26, 30, 47, 60, 70
There are 14 values, so the middle two are 12 and 17. You add 12 and 17 and divide by 2, which equals 14.5.
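The two cases (odd and even number of values) can be checked with a short Python sketch. The two lists are the ones written out above. The helper `middle_value` is my own illustration, and it is compared with `statistics.median` from the Python standard library.

```python
from statistics import median

# The two ordered lists given in the text above.
sun_values = [3, 4, 7, 9, 17, 17, 19, 39, 58, 66, 67]
daily_mail_values = [0, 1, 1, 1, 2, 7, 12, 17, 26, 26, 30, 47, 60, 70]


def middle_value(values):
    """Median: the middle value, or the mean of the middle two values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: there is a single middle value.
        return ordered[mid]
    # Even number of values: average the two values around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


sun_median = middle_value(sun_values)
daily_mail_median = middle_value(daily_mail_values)
print(sun_median, daily_mail_median)
```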
Range:
The range is the difference between the smallest value and the biggest.
E.g. in the case of the sentence length of the Sun, the shortest sentence has just 1 word, while the longest sentence has 11 words. The difference between these 2 numbers is the range, in this case 10.
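As a small sketch, the range is just the biggest value minus the smallest. In the list below only the smallest (1) and biggest (11) values come from the text; the values in between are made up for illustration.

```python
# Hypothetical sentence lengths; only the minimum 1 and maximum 11
# are taken from the text, the other values are invented.
sentence_lengths = [1, 4, 6, 6, 9, 11]

data_range = max(sentence_lengths) - min(sentence_lengths)
print(data_range)
```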
Word length
Sentence Length
The second stage is to count the words in a sentence to get the sentence length.
Because the number of sentences for each individual length is so small, I am going to use class intervals, which give me larger frequencies.
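Grouping into class intervals could be sketched as follows. The class width of 5 (classes 1-5, 6-10, and so on) and the list of sentence lengths are assumptions made for this example only; the real intervals are the ones used in the tables.

```python
from collections import Counter


def group_into_classes(values, width):
    """Count how many values fall into each class 1..width, width+1..2*width, ..."""
    counts = Counter()
    for value in values:
        # Lower bound of the class that contains this value.
        lower = ((value - 1) // width) * width + 1
        counts[(lower, lower + width - 1)] += 1
    # Sort by class so the table reads from the smallest class upwards.
    return dict(sorted(counts.items()))


# Hypothetical sentence lengths (words per sentence).
sentence_lengths = [1, 3, 4, 7, 8, 8, 9, 12, 14, 21]
grouped = group_into_classes(sentence_lengths, 5)
print(grouped)
```

Each class now has a larger frequency than the single sentence lengths had on their own, but the exact lengths inside a class are no longer visible.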
The Sun
The Daily Mail
The Times
I am now going to calculate the mean, mode, median and range again, but this time for the sentence length.
Because the data is grouped into class intervals, the median cannot be found exactly, but I can at least say which group it lies in.
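Finding the group that contains the median can be sketched by adding up the frequencies until half of the total is reached. The grouped table below is a made-up example, and the function name `median_class` is my own.

```python
def median_class(grouped):
    """Return the class interval in which the running total reaches half the data.

    Assumes the classes in `grouped` are listed in increasing order.
    """
    halfway = sum(grouped.values()) / 2
    running_total = 0
    for interval, frequency in grouped.items():
        running_total += frequency
        if running_total >= halfway:
            return interval
    return None  # only reached if the table is empty


# Hypothetical grouped sentence lengths: (lower, upper) -> frequency.
grouped_lengths = {(1, 5): 3, (6, 10): 4, (11, 15): 2, (16, 20): 1}
print(median_class(grouped_lengths))
```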
Sentence length
Paragraph length
The third stage is to count how many sentences are in each paragraph. All 3 stages use the same newspaper articles.
The Sun
The Times
The Daily Mail
I will again calculate the mean, mode, median and range.
Paragraph length
But:
The paragraph length does not really help me in my investigation, because there is far too little data to make a real comparison between the different newspapers. It is therefore not useful for drawing cumulative frequency graphs and comparing them.
Cumulative frequency graphs
From the cumulative frequency curve I am able to find 3 vital statistics.
- Median: Go exactly halfway up the side scale, then across to the curve, then down, and read off from the bottom scale.
- Lower and upper quartiles: Go exactly ¼ and ¾ of the way up the side scale, then across to the curve, then down, and read off from the bottom scale.
- The inter-quartile range: The distance on the bottom scale between the lower and upper quartiles.
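The reading-off steps above can be imitated in Python. This is only a sketch under two assumptions: the plotted points are made up for the example, and the points are joined with straight lines (linear interpolation), whereas a hand-drawn curve is smooth, so values read from a real graph can differ slightly.

```python
def read_off(points, fraction):
    """Estimate the value where the cumulative frequency reaches
    `fraction` of the total, using straight lines between the points.

    `points` is a list of (value, cumulative frequency) pairs in
    increasing order, starting at a cumulative frequency of 0.
    """
    target = fraction * points[-1][1]
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            # Go "across" to the line segment and "down" to the bottom scale.
            return x0 + (target - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("fraction must be between 0 and 1")


# Hypothetical plotted points: (upper class boundary, cumulative frequency).
points = [(0, 0), (5, 4), (10, 12), (15, 18), (20, 20)]

median_value = read_off(points, 0.5)
lower_quartile = read_off(points, 0.25)
upper_quartile = read_off(points, 0.75)
inter_quartile_range = upper_quartile - lower_quartile
print(median_value, lower_quartile, upper_quartile, inter_quartile_range)
```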
I will then draw a Box and Whisker diagram for every graph, to compare my results.
The Sun – word length
Median: 3
Lower quartile: 1.9
Upper quartile: 4.8
Inter-quartile range: 2.9
The Daily Mail – word length
Median: 2.7
Lower quartile: 2.1
Upper quartile: 5.2
Inter-quartile range: 3.1
The Times – word length
Median: 3.5
Lower quartile: 2.2
Upper quartile: 5.5
Inter-quartile range: 3.3
Sentence Length
The Sun – sentence length
Median: 8.7
Lower quartile: 4.8
Upper quartile: 12.1
Inter-quartile range: 7.3
The Daily Mail – sentence length
Median: 7.4
Lower quartile: 4
Upper quartile: 9.8
Inter-quartile range: 5.8
The Times – sentence length
Median: 4.8
Lower quartile: 2.1
Upper quartile: 8.3
Inter-quartile range: 6.2
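The inter-quartile ranges listed above can be double checked by subtracting each lower quartile from its upper quartile. The quartile values in this sketch are the ones read from my graphs; the dictionary layout is just one possible way to hold them.

```python
# (newspaper, measure) -> (lower quartile, upper quartile), as listed above.
quartiles = {
    ("The Sun", "word length"): (1.9, 4.8),
    ("The Daily Mail", "word length"): (2.1, 5.2),
    ("The Times", "word length"): (2.2, 5.5),
    ("The Sun", "sentence length"): (4.8, 12.1),
    ("The Daily Mail", "sentence length"): (4, 9.8),
    ("The Times", "sentence length"): (2.1, 8.3),
}

# Inter-quartile range = upper quartile - lower quartile,
# rounded to one decimal place like the values read from the graphs.
inter_quartile_ranges = {
    key: round(upper - lower, 1) for key, (lower, upper) in quartiles.items()
}

for (paper, measure), iqr in inter_quartile_ranges.items():
    print(paper, measure, iqr)
```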
I will now cut out all the Box and Whisker diagrams to compare them with each other.
Word length:
The Sun
The Daily Mail
The Times
Sentence Length:
The Sun
The Daily Mail
The Times
I will first of all compare the three Box and Whisker diagrams for the word length, and then the other three for the sentence length:
Word Length:
The word length differs from newspaper to newspaper.
The Sun and the Daily Mail have quite similar Box and Whisker diagrams, but even there some differences can be identified.
The lower quartile of the Sun (1.9) is a little smaller than that of the Daily Mail (2.1), and the upper quartile is also smaller (4.8 compared with 5.2).
This results in a small difference in the inter-quartile range, which is 2.9 for the Sun and 3.1 for the Daily Mail. But the median of the Daily Mail (2.7) is smaller than that of the Sun (3).
The diagram for the Times looks different. The first thing that can be seen is that the Times has a higher upper quartile (5.5) than the 2 other newspapers, which supports my prediction that higher quality newspapers have a greater number of longer words.
The Times has the greatest inter-quartile range of all 3 newspapers (3.3), and its median (3.5) is also the highest.
Sentence Length:
The Sun has the highest median of all the newspapers (8.7). The second highest is that of the Daily Mail (7.4) and the smallest is that of the Times (4.8).
The Sun also has the greatest lower quartile and especially the greatest upper quartile, with 12.1. It also has the biggest inter-quartile range, with 7.3.
After the Sun, the Daily Mail has the second biggest lower and upper quartiles (4 and 9.8), but its inter-quartile range of 5.8 is the smallest of the three.
The Times has the smallest lower and upper quartiles. Its lower quartile, for example, is 2.1, compared with 4 for the Daily Mail, which is the second greatest. Its inter-quartile range of 6.2 lies between those of the other two newspapers.
Conclusion
At the beginning of the coursework I had some difficulties with my articles. It took me a long time to collect all the necessary data and I made mistakes, which forced me to double check every result.
I had some help from friends and teachers, who explained to me the differences between mean, mode, median and range.
I wanted to find out whether you can see the differences in quality between newspapers by looking at their word, sentence and paragraph length.
I was able to draw cumulative frequency tables and Box and Whisker diagrams to show my results. I did not draw graphs for the paragraph lengths, because I found that there is simply not enough data to make a comparison.
I predicted that the number of long words and paragraphs would decrease as the quality of the newspaper decreases. For word length this trend can be seen. The Times, which I predicted was the highest quality newspaper, has the highest median and upper quartile. The Daily Mail, which I predicted would be the middle quality paper, has the second highest upper quartile, and the Sun, which I predicted was the lowest quality paper, has the lowest. For sentence length, however, the Box and Whisker diagrams show the opposite: the Sun has the highest median and the Times the lowest.
But overall I found that it is not possible, or at least very difficult, to say how high the quality of a newspaper is by looking at this data.
That is because the number of words, sentences or paragraphs used depends primarily on the author and the story being written.
So you cannot really compare newspapers using this investigation.
If I did the coursework again with more time, I would investigate more than 3 newspapers of every quality to get a clearer picture. I would also investigate more than one article from each newspaper.