The results I gather will provide evidence on which newspapers have high or low readability.
I will count the number of letters in each word as a measure of readability. For continuous data I will measure the size of the images on a page and find the percentage of the page that is used for images as opposed to text. A higher proportion of images suggests that a page is easier to read, as there are fewer words in the article.
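The two measurements described above could be sketched in Python as follows. This is only an illustration: the function names, the rule that apostrophes and hyphens do not count as letters, and the example figures are assumptions, not part of the investigation.

```python
import re

def word_lengths(text):
    """Return the number of letters in each word of the text.

    Assumption: apostrophes and hyphens join a word but are not
    counted as letters, so "cat's" counts as 4 letters.
    """
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)
    return [sum(ch.isalpha() for ch in word) for word in words]

def image_percentage(image_areas, page_area):
    """Return the percentage of the page area taken up by images.

    All areas must be in the same unit, for example square centimetres.
    """
    return 100 * sum(image_areas) / page_area

# Made-up example values, for illustration only.
lengths = word_lengths("The cat's well-known hat.")
share = image_percentage([150.0, 50.0], 800.0)
```

Here `lengths` lists the letter count of each word in turn, and `share` is the percentage of an 800 square centimetre page covered by two images of 150 and 50 square centimetres.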
The samples for the investigation were taken at random using the random number generator on a calculator. The random number defines which day the paper should be from, which page it should be from and which word to count. The preliminary results are shown below. The table does not show the results well, so a graph was added to make them easier to interpret.
My hypotheses, based on the preliminary results, are that the two “Broadsheets” have a higher readability level (that is, are more complex to read) than the “Tabloid”. I also believe that the “Times” is more complex than the “Independent”.
When I did the pre-test I found that the papers wrote differently for different topics. I therefore decided to divide each newspaper into categories. I took the three main categories of the papers: World News, Home News and Sport.
I also discovered that the papers devote different proportions of their pages to these categories. I took a collection of issues of each of the three papers I am studying and found the percentage of each newspaper that was written on each category. The exact percentages are shown below.
Finding the percentages of the three categories makes it possible to collect a stratified sample, which gives a better reflection of the newspaper as a whole. The number of results taken for each newspaper, shown above, is a stratified sample. To obtain it, I counted how many pages each newspaper had in total and then how many of those pages were used for each topic. From the number of pages that the three categories amount to, it is then possible to find how much each category contributes to the newspaper.
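The way a stratified sample shares out the results in proportion to the page counts can be sketched in Python. The page counts and the sample size of 50 below are made up for illustration and are not my actual figures; the function name and the rounding rule (largest remainders get the spare results) are also assumptions.

```python
def stratified_allocation(pages_by_category, sample_size):
    """Share a sample between categories in proportion to their page counts.

    Rounding rule (an assumption): round each share down, then give the
    leftover results to the categories with the largest remainders, so
    that the allocation always adds up to sample_size.
    """
    total_pages = sum(pages_by_category.values())
    exact = {cat: sample_size * pages / total_pages
             for cat, pages in pages_by_category.items()}
    allocation = {cat: int(share) for cat, share in exact.items()}
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda cat: exact[cat] - allocation[cat],
                          reverse=True)
    for cat in by_remainder[:leftover]:
        allocation[cat] += 1
    return allocation

# Hypothetical page counts for one newspaper.
pages = {"World News": 8, "Home News": 14, "Sport": 10}
allocation = stratified_allocation(pages, 50)
```

With these made-up page counts the 50 results are split in the same proportions as the pages, so the category with the most pages contributes the most words to the sample.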
Collecting, Processing & Representing Data
I collected my discrete data at random. I used the random number generator on a calculator, and each number defined the day, the page, the paragraph, the sentence and the word within the sentence. Take, for example, the number 315041022. The first digit, 3, defines the day, from 1 to 5 for Monday to Friday. The next two digits, 15, define the page, so the sample is taken from page 15. The next two digits, 04, give the paragraph. These are followed by the sentence number, also two digits, 10. The last two digits, 22, define the word within the sentence.
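The decoding of a random number can be sketched in Python. This assumes the nine digits always split as one digit for the day, then two digits each for the page, paragraph, sentence and word; the function name is made up for illustration.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

def decode_sample_code(code):
    """Split a nine-digit random number into the parts that locate one word.

    Assumed layout: D PP GG SS WW
    (day, page, paragraph, sentence, word within the sentence).
    """
    digits = str(code)
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError("expected exactly nine digits")
    day = int(digits[0])
    if not 1 <= day <= 5:
        raise ValueError("day digit must be 1 to 5 (Monday to Friday)")
    return {
        "day": DAYS[day - 1],
        "page": int(digits[1:3]),
        "paragraph": int(digits[3:5]),
        "sentence": int(digits[5:7]),
        "word": int(digits[7:9]),
    }

sample = decode_sample_code(315041022)
```

For the example number 315041022 this gives Wednesday, page 15, paragraph 4, sentence 10 and word 22. A number whose first digit is outside 1 to 5 is rejected, so a new random number would have to be drawn.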
The categories have been stratified so that it is possible to compare the papers overall using just these three categories. I have not taken any results from the front page of the three newspapers, nor from the television & radio guide. The front page mainly contains advertisements, pictures and titles, so it is not a reliable source from which readability can be measured. The television & radio guides are the same in all the newspapers and therefore would not show the differences between the three papers. Entire pages of adverts were also ignored, because they gave no data for word length and were not a reliable source from which continuous data could be deduced.
Although the results can be displayed as a table, as above, the way in which the data is spread is more easily seen in a graph, as shown on the next page.
Collecting the continuous data proved to be an extremely time-consuming task, and at times quite difficult, as the images were not in regularly shaped frames or boxes. For this reason I felt that only a small amount of continuous data was needed: while searching through the papers before the investigation I had found that most of the images were roughly the same size in area. It is already possible to see from the table above that the “Daily Mail” puts larger images in its pages. Although the frequency of images on a single page was about the same in all the papers, the “Daily Mail” relied on images a great deal more than on the written article.
To find the central tendency I have used the mean. I chose the mean because my results contain no rogue results (outliers) which would drastically affect it. Using the mean, I can show that the “Tabloid”, the “Daily Mail”, has a lower complexity for readability. Its words are shorter on average and it contains a far higher percentage of images on a page than the “Times” or the “Independent”, both “Broadsheets”. This does not surprise me, as it is what I predicted. Shown below is the average, using the mean, for all three newspapers.
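The mean word length can be worked out from a frequency table of word lengths, as in this Python sketch. The frequencies are made up for illustration and are not my results; the function name is also an assumption.

```python
def mean_from_frequencies(freq):
    """Mean word length from a table of {word length: number of words}."""
    total_words = sum(freq.values())
    if total_words == 0:
        raise ValueError("the frequency table is empty")
    total_letters = sum(length * count for length, count in freq.items())
    return total_letters / total_words

# Hypothetical frequency table: 3 words of 2 letters, 5 of 3 letters, etc.
example = {2: 3, 3: 5, 4: 6, 5: 4, 7: 2}
mean_length = mean_from_frequencies(example)
```

In this made-up table there are 20 words containing 79 letters in total, so the mean is 79 divided by 20. A single very long word would pull this figure upwards, which is why the mean is only suitable when there are no rogue results.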
It is possible to see that the two “Broadsheets” are very similar, which is to be expected; however, as I have mentioned above, there is a difference of almost two letters. To see the difference in the dispersion of word length between the newspapers, I drew cumulative frequency graphs and took the inter-quartile range. The results can be seen below. First the three papers are compared for World News, then for Home News and finally for Sport. Because I used a stratified sample, I have also combined all three sections and placed the three newspapers on a graph of overall word length across the papers.
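The inter-quartile range can also be estimated from cumulative frequencies by computer, as in this Python sketch. It assumes each quartile is the first word length at which the cumulative frequency reaches a quarter or three quarters of the total, which is a stepped approximation; values read from a smooth hand-drawn curve may differ slightly. The frequencies and function names are made up for illustration.

```python
def quartile_from_frequencies(freq, fraction):
    """Smallest word length whose cumulative frequency reaches the fraction.

    freq maps word length to number of words; fraction is 0.25 for the
    lower quartile and 0.75 for the upper quartile.
    """
    target = fraction * sum(freq.values())
    running_total = 0
    for length in sorted(freq):
        running_total += freq[length]
        if running_total >= target:
            return length
    raise ValueError("the frequency table is empty")

def interquartile_range(freq):
    """Upper quartile minus lower quartile of the word lengths."""
    upper = quartile_from_frequencies(freq, 0.75)
    lower = quartile_from_frequencies(freq, 0.25)
    return upper - lower

# Hypothetical frequency table of word lengths for one newspaper section.
example = {2: 3, 3: 5, 4: 6, 5: 4, 7: 2}
spread = interquartile_range(example)
```

With these made-up figures the cumulative frequencies are 3, 8, 14, 18 and 20, so the lower quartile (5th word) falls at 3 letters and the upper quartile (15th word) at 5 letters. A smaller inter-quartile range would mean the word lengths in that section are more tightly grouped.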