Word Length
Article 1- ‘Northern Rock Bank Crisis’
Standard deviation = Mean=
From these graphs you can see that in both newspapers 4 letter words are the most common. You can also see that the numbers of 1, 2 and 3 letter words are very similar, because however complex the language is, English still needs these short words for a passage to make sense. The best bars to compare in this graph are those for the 4 letter words to the 9 letter words. You can see that there are definitely more big words in the Broadsheet, as shown by the red line I have put on. The graphs and the data collected also show that the Broadsheet has the higher mean, telling us that it contains more big words than the Tabloid and again giving us a hint as to the complexity of its language. By comparing the standard deviations of word length, we can see that the Tabloid’s is smaller, which tells us that more of its values are close to the mean. This is evident on the graphs: the Tabloid’s values are clustered nearer the 5 and 6 bars, whereas the Broadsheet has values at 1 and 12. This shows that the Broadsheet has more words near the extremes, while the Tabloid may have a lower mean but its word length is more consistent. So the Broadsheet in this article might have more 12 letter words but it also has more 1 letter words, whereas the Tabloid has more medium-sized words.
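As a rough sketch of how the mean and standard deviation of word length can be worked out from a frequency table, the Python below may help. The function names and the two frequency tables are invented for illustration and are not the results from the articles. It uses the population form of the standard deviation (dividing by the number of words), which is an assumption about the formula used.

```python
from collections import Counter
from math import sqrt
import string


def length_frequencies(text):
    """Count how many words of each length appear in a passage."""
    words = (w.strip(string.punctuation) for w in text.split())
    return Counter(len(w) for w in words if w)


def mean_and_sd(freq):
    """Return (mean, standard deviation) for a table of length -> count."""
    n = sum(freq.values())
    mean = sum(length * count for length, count in freq.items()) / n
    variance = sum(count * (length - mean) ** 2
                   for length, count in freq.items()) / n
    return mean, sqrt(variance)


# Hypothetical frequency tables for 100 words from each paper.
tabloid = {3: 20, 4: 30, 5: 30, 6: 20}
broadsheet = {1: 10, 3: 20, 4: 30, 5: 20, 8: 10, 12: 10}

for name, table in (("Tabloid", tabloid), ("Broadsheet", broadsheet)):
    mean, sd = mean_and_sd(table)
    print(f"{name}: mean = {mean:.2f}, standard deviation = {sd:.2f}")
```

With these made-up tables the Broadsheet has both the higher mean and the larger standard deviation, which is the same pattern described for Article 1.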
These box and whisker plots again back up my earlier comments about the Broadsheet having more extreme word lengths. You can see on the two plots that the Broadsheet article contains both the smallest word and the largest word. From these plots you can see that the medians are the same, as are the upper and lower quartiles, showing that the middle 50% of the results (the interquartile range) is much the same in both. The other 50% of the data, outside the quartiles, is very different in the two newspapers, with the Broadsheet’s being much more spread out than the Tabloid’s.
This evidence so far supports my hypothesis, but at this stage there is insufficient evidence to prove it, as the graphs are so similar and the means are so close. At this point the Broadsheet has had more long words, which may give us an indication of the complexity of its language and the readability of the newspaper.
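The values a box and whisker plot is drawn from (smallest value, lower quartile, median, upper quartile, largest value) can be sketched in Python as below. This is only an illustration: the word lengths are invented, and the quartiles are taken as the medians of the lower and upper halves of the data, which is one of several conventions and may not match the method used for the plots.

```python
from statistics import median


def five_number_summary(lengths):
    """Return the five values needed to draw a box and whisker plot."""
    data = sorted(lengths)
    n = len(data)
    half = n // 2
    lower = data[:half]              # values below the median
    upper = data[half + (n % 2):]    # values above the median
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


# Hypothetical word lengths, not taken from the articles.
tabloid_lengths = [2, 3, 3, 4, 4, 5, 5, 6]
broadsheet_lengths = [1, 3, 3, 4, 4, 5, 5, 12]

for name, lengths in (("Tabloid", tabloid_lengths),
                      ("Broadsheet", broadsheet_lengths)):
    s = five_number_summary(lengths)
    iqr = s["q3"] - s["q1"]
    full_range = s["max"] - s["min"]
    print(name, s, "IQR =", iqr, "range =", full_range)
```

In this made-up example the two sets share the same median and quartiles, so the interquartile range is equal, but the Broadsheet set has the wider range because of its extreme values.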
Article 2- ‘Sir Alex Ferguson’s Attack’
This graph shows us that the Broadsheet article has fewer small words (1 to 6 letters) but more large words (7 to 12 letters) than the Tabloid article. As in the first article, the means are close but the Broadsheet’s is again higher, meaning that on average the Broadsheet had larger words, which is evident from the graphs. Unlike the first pair of articles, these two have much closer standard deviations, but even so the Tabloid’s is smaller, meaning that more of its words are close to its mean of 4.78 while the Broadsheet’s words are more spread out.
Again this evidence supports my hypothesis, as this article shows once more that the Broadsheet has larger words due to its more complicated language and the Tabloid has smaller words due to its simpler language, as it is aimed at the less educated reader.
Article 3- ‘Alonzo’s Future Uncertain’
These graphs again show the word lengths of the two articles. Both graphs show that their most frequently occurring word lengths are 2 and 3 letters, but this is just because every article needs these words to make sense. The means of these two articles are very close, but again the Broadsheet comes out on top, meaning that there are more long words in the Broadsheet, yet again backing up my hypothesis. The standard deviations are different from those of the four previous articles: in these two articles the Broadsheet has the smaller standard deviation, whereas in the previous four the Tabloid had the smaller one. So this time it is the Broadsheet’s values that are closer to the mean. You can also see on both graphs that the bar for 7 letter words is higher than usual. I think this is because the article is about ‘Alonzo’, a ‘Formula 1’ driver, so both articles are bound to contain many words like ‘McLaren’ and ‘Formula’, which cannot be avoided.
From this pair of box and whisker diagrams it is clear that the Broadsheet’s median is higher than the Tabloid’s. In contrast to the two previous box and whisker plots, the extremes are the same, with the smallest and biggest values on the Broadsheet matching those on the Tabloid. The upper and lower quartiles are also the same on both graphs, making the interquartile range the same.
This final graph shows the overall word length for all six articles. It compares the total of 300 words from the Broadsheet (100 words from each of its three articles) with the 300 words from the Tabloid. It is very clear that the Tabloid has more small words (1-6 letters) than the Broadsheet, and the Broadsheet has more big words (7-12 letters) than the Tabloid. It is also clear that both newspapers have more than twice as many 1-6 letter words as 7-12 letter words, as shorter words are more common in the English language.
From my results for these six articles I have proved my first hypothesis to be correct: in all three pairs of articles the Broadsheet has had the higher mean, meaning that on average it has used longer words than the Tabloid. This gives us an indication of the complexity of the language, as one would assume that the longer a word is, the more complicated it may seem.
% of page covered in picture and title
Article 1- ‘Northern Rock Bank Crisis’
% area of pictures = (area of pictures ÷ area of page) × 100
The same formula gives the % area of text: just swap the area of pictures for the area of text.
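The percentage calculation can be sketched in Python as follows. The function name and the areas in square centimetres are assumptions made up for this example, not measurements from the articles.

```python
def percentage_area(part_area, page_area):
    """Percentage of the page taken up by one part (pictures, title or text)."""
    if page_area <= 0:
        raise ValueError("area of page must be positive")
    return part_area * 100 / page_area


# Hypothetical measurements in square centimetres.
page_area = 500
picture_and_title_area = 230
text_area = page_area - picture_and_title_area

print("Pictures and title:", percentage_area(picture_and_title_area, page_area), "%")
print("Text:", percentage_area(text_area, page_area), "%")
```

Because the text and the pictures and title together make up the whole article, the two percentages should add up to 100, which is a useful check on the measurements.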
These pie charts show that the areas devoted to pictures and title and to the text itself are very different in a Tabloid and a Broadsheet article. The first chart shows the Broadsheet’s percentages: 54% of the area of the article is devoted to text and 46% to the picture and the title. This supports my hypothesis that the Broadsheet would have more area that is text and less that is picture and title, although the figures are too close to be definite yet. The second chart shows the Tabloid’s percentages, which again support my hypothesis, as the Tabloid article has more of its area covered by picture and title than by text. This is because of the reader the paper is aimed at: people who would rather be given the news through pictures than have to read through the text to find out what is going on.
Article 2- ‘Alex Ferguson’s Attack’
Again these pie charts support my hypothesis, as they show that the Broadsheet devotes more of its page to text than to pictures and title, while the Tabloid again devotes more of its page to pictures and big extravagant titles.
Article 3- ‘Alonzo’s Future Uncertain’
These graphs show again that the Tabloid devotes more of its page to pictures and big titles, in contrast to the Broadsheet, which covers more of its page in text.
Total Areas
These last two charts are very useful as they yet again prove my hypothesis, showing all three articles from each newspaper put together. They make it very clear that the Broadsheet has more space on its page devoted to text, as it is designed for the more educated reader, and that the Tabloid has more of its page devoted to pictures and big titles, as it is designed for the less educated.
By comparing these results we can see that the Broadsheet is harder to read: not only does it have more text, but as we discovered earlier it has longer, more complicated words, making the language more complex and challenging. I think that the mathematical process that best shows the comparison between the two types of newspaper is the mean word length, because it lets you assess which newspaper has on average the longer words, giving us an idea of the complexity of the language. The mathematical process that I think offers the least significance is the interquartile range, as with these results it did not show us anything that was not already made clear by the standard deviation.
The limitations of this project are that many of these claims are very stereotypical. When you say that Broadsheets are aimed at the more educated, older reader, you cannot prove this for sure unless you do an assessment of the age of the buyers. Also, a lot of this project is based on the assumption that if a word is longer then it is more complicated. Take a word like ‘beleaguered’, for example: it is a very posh word that an uneducated person would have trouble understanding. Then take a word like ‘championship’, which is longer, yet ‘beleaguered’ is clearly a lot more complex and harder to understand and is not your average everyday word.
In order to achieve more thorough results I would have liked to look at more articles, to get a wider range of results and make them more reliable. As it is, I only had enough time to assess three from each paper. A suitable extension could have been to count the number of words in each article, but three articles is not enough to prove or disprove a hypothesis about that; you would need a suitable sample size of about 30 articles from each paper to assess it properly. Another very good extension could have been to compare the age and education of the readers. This would tie in neatly with the rest of the project, as you could actually state who the newspaper is read by, giving an idea of the reading skills of those readers and from that assessing the readability of the newspapers themselves.
From my results in this project I have proved my earlier hypothesis that the Tabloid is easier to read than the Broadsheet, as it has less complex vocabulary and more pictures. This causes the Tabloid to appeal to a younger or less educated reader and the Broadsheet to appeal to an older, more educated reader.
By Joe Sharp UVP