Pre-Test
The results from the pre-test:
The Times:
Average number of words per sentence = total number of words / 20
= 31.65
The Daily Mirror:
Average number of words per sentence = total number of words / 20
= 17.25
The Guardian:
Average number of words per sentence = total number of words / 20
= 32.05
The results from my pre-test show that the broadsheet paper has a longer average sentence length than the quality paper, but only by a small margin. The broadsheet has a longer average sentence length than the tabloid paper by a large margin. The tabloid paper had the shortest mean sentence length of all three papers.
The Guardian had an average sentence length of 32.05 words, which was very close to the result for the Times, 31.65. The Daily Mirror had an average sentence length of 17.25. In my opinion, these results will be reflected in the full results, with the tabloid paper having the shortest sentences. It is very difficult to predict which of the broadsheet and quality papers will have the longer sentence length.
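The pre-test calculation (total number of words divided by the number of sentences) can be sketched in Python. This is only an illustration: the function names are hypothetical, words are assumed to be separated by spaces, and the 20 word counts are made up for the example rather than taken from the pre-test.

```python
def count_words(sentence):
    """Count words by splitting on whitespace (a simplifying assumption)."""
    return len(sentence.split())


def mean_words_per_sentence(word_counts):
    """Total number of words divided by the number of sentences sampled."""
    if not word_counts:
        raise ValueError("at least one sentence is needed")
    return sum(word_counts) / len(word_counts)


# Hypothetical word counts for 20 sentences, NOT the pre-test data.
example_counts = [30, 34, 28, 41, 25, 33, 29, 36, 31, 27,
                  38, 32, 26, 35, 30, 29, 40, 33, 28, 35]

average = mean_words_per_sentence(example_counts)
print(f"Average number of words per sentence = {average:.2f}")
```

The same function works for any number of sentences, because it divides by the length of the list rather than by a fixed 20.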
Hypotheses
- The sentence length of the Quality and Broadsheet will be similar.
- The sentence length of the Quality and Broadsheet will be longer than that of the Tabloid.
- News category sentence length will be longer than sport in all three newspapers.
Details of Survey and Results
The Times:
The Daily Mirror:
The Guardian:
To find the number of sentences needed for each section, several steps were taken. Firstly, the total area of each of the four sections was found by drawing boxes around the text that was to be included. The area of each box was found by measuring one side and multiplying it by the other, giving the area in cm². Next, the totals for all the sections were added together to find the total for each paper. The total for each section was then divided by the total for the paper. The total number of sentences that I wanted to sample was then set at 65. Finally, the following equation was used to decide how many sentences from each section would be tested for sentence length in the stratified sample:
(Area of Section / Total Area of Newspaper) × Chosen number of sentences
For example, the total area of the news section of the Guardian was 8105.22cm² and the total area of the paper was 15483.66cm². The first divided by the second is 0.523 (to three decimal places). This multiplied by 65 is 34.025, but as it is impossible to sample 34.025 sentences, it was rounded to 34. This was done for the number of sentences in every section of all the newspapers, rounding up when necessary.
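The stratified sample calculation can be checked with a short Python sketch. The function names are hypothetical; the two areas and the sample size of 65 are the figures quoted in the text for the Guardian news section.

```python
def rectangle_area(width_cm, height_cm):
    """Area in cm^2 of one box drawn around a block of text."""
    return width_cm * height_cm


def sentences_for_section(section_area, total_area, sample_size=65):
    """Share of the sample given to one section, rounded to a whole sentence."""
    proportion = section_area / total_area
    return round(proportion * sample_size)


# Figures quoted in the text for the Guardian.
guardian_news_area = 8105.22    # cm^2
guardian_total_area = 15483.66  # cm^2

proportion = guardian_news_area / guardian_total_area
allocation = sentences_for_section(guardian_news_area, guardian_total_area)

print(f"Proportion of the paper taken up by news = {proportion:.3f}")
print(f"Sentences to sample from news = {allocation}")
```

This reproduces the 34 sentences found above. Two cautions apply if the sketch is reused: Python's `round` sends exact halves to the nearest even number, which can differ from rounding up by hand, and rounded allocations for the four sections will not always add up to exactly 65.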
Two separate tables were used to record the results of the sample. The first table recorded the sentence lengths found for each newspaper in a tally chart format. A separate tally chart was used for each section within the newspaper so that it would be easier to compare the sentence lengths of the different sections later on. For example:
The next table was grouped into classes and covered the entire paper. It also had an extra column for cumulative frequency, which kept a running total of the frequencies. This allowed me to draw a cumulative frequency curve, which in turn allowed me to make a box plot. A box plot shows the inter-quartile range and median as well as the positions of the lower and upper quartiles, which makes comparing the newspapers' sentence lengths a lot easier. For example:
Graphs and Box Plots
The following tables are tally charts showing the number of words found in each sentence of the stratified sample for all the papers. The different categories have been split into separate tables.
The Daily Mirror-News
The Daily Mirror-Entertainment
The Daily Mirror-Business
The Daily Mirror-Sport
The Times-News
The Times-Entertainment
The Times-Business
The Times-Sport
The Guardian-News
The Guardian-Entertainment
The Guardian-Business
The Guardian-Sport
The next set of tables has the cumulative frequency column added on:
The Times
The Guardian
The Daily Mirror
The cumulative frequency graphs and box plots for all three newspapers are on the next page.
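The step from a grouped frequency table to the quartiles can be sketched in Python. The class boundaries and frequencies below are hypothetical and are not the survey results. The sketch also assumes straight lines between the plotted points (a cumulative frequency polygon), so values read by eye from a smooth curve may differ slightly.

```python
from itertools import accumulate


def cumulative_frequencies(frequencies):
    """Running total of the frequencies, as in the extra table column."""
    return list(accumulate(frequencies))


def value_at(position, upper_bounds, cumulative, lowest_bound=0):
    """Read a sentence length off the cumulative frequency polygon."""
    prev_bound, prev_cum = lowest_bound, 0
    for bound, cum in zip(upper_bounds, cumulative):
        if position <= cum:
            # Linear interpolation inside the class that contains `position`.
            fraction = (position - prev_cum) / (cum - prev_cum)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cum = bound, cum
    raise ValueError("position is beyond the total frequency")


# Hypothetical grouped data for one paper (65 sentences in total).
upper_bounds = [10, 20, 30, 40, 50]   # upper boundary of each class, in words
frequencies = [5, 15, 25, 15, 5]

cumulative = cumulative_frequencies(frequencies)
total = cumulative[-1]

lower_quartile = value_at(total / 4, upper_bounds, cumulative)
median = value_at(total / 2, upper_bounds, cumulative)
upper_quartile = value_at(3 * total / 4, upper_bounds, cumulative)

print("Cumulative frequencies:", cumulative)
print("Lower quartile:", lower_quartile)
print("Median:", median)
print("Upper quartile:", upper_quartile)
print("Inter quartile range:", upper_quartile - lower_quartile)
```

The lower quartile, median and upper quartile are read at one quarter, one half and three quarters of the total frequency, which is the same reading taken from the graph when drawing a box plot.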
Comparisons
In this section the sentence lengths of all three newspapers will be compared using their median and inter quartile range. The median was used because it gives an average sentence length for the entire paper in question. The inter quartile range is not affected by extreme values and shows how dispersed the data is; a larger inter quartile range means that the data is more spread out. Using both makes sure the results are not distorted by very short or very long sentences, while the extreme values still count towards the average through the median.
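The claim that the median and inter quartile range are not affected by extreme values can be illustrated with a small Python sketch. The sentence lengths are hypothetical. The quartiles come from `statistics.quantiles` (Python 3.8 or later) with the "inclusive" method, which is only one of several quartile conventions and may not match values read from a cumulative frequency curve.

```python
import statistics


def median_and_iqr(lengths):
    """Return (median, inter quartile range) for a list of sentence lengths."""
    q1, q2, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
    return q2, q3 - q1


def value_range(lengths):
    """Full range: longest sentence minus shortest sentence."""
    return max(lengths) - min(lengths)


# Hypothetical sentence lengths, for illustration only.
sample = [12, 15, 18, 20, 22, 24, 26, 28, 31]
# The same sample with the longest sentence replaced by an extreme value.
with_extreme = sample[:-1] + [90]

print("Median and IQR, original:     ", median_and_iqr(sample))
print("Median and IQR, extreme value:", median_and_iqr(with_extreme))
print("Range, original:     ", value_range(sample))
print("Range, extreme value:", value_range(with_extreme))
```

In this example the median and inter quartile range stay the same when one sentence becomes extremely long, while the range changes a great deal.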
Firstly the Times will be examined. The box plots show that the Times has an inter quartile range of 17, so its data is quite spread out. The range of the paper is 65, which shows that the data is more representative of the total amount of sentences in the entire paper. Its skew is slightly positive, which means there are slightly more short sentences than long ones. The average sentence length of the paper is 27, which can be seen from the median on the box plots.
Next the Guardian will be examined. The box plots show that the Guardian has an inter quartile range of 14, so its data is rather consistent. The range of the paper is 60, which shows that the data is representative of the total amount of sentences in the entire paper. Its skew is very slightly negative, but only by a few values. This shows that the data is spread out quite evenly. The average sentence length of the paper is 28.5, which can be seen from the median on the box plots.
Next the Daily Mirror will be examined. The box plots show that the Daily Mirror has an inter quartile range of 15, so its data is evenly spread out. The range of the paper is only 40, which tells us that the data is more consistent throughout the paper. Its distribution is exactly symmetrical, which again tells us that the data is evenly distributed throughout the paper. The average sentence length of the paper is 22.5, which can be seen from the median on the box plots.
The Times and the Guardian have very similar lower quartiles, but the upper quartile is bigger on the Times than on the Guardian. This tells us that the data in the Times is more spread out, as it has a larger inter quartile range. The full ranges of the two papers are also very similar, which tells us that both samples were very representative of the average sentence length throughout the whole paper. The median of the Guardian is slightly bigger than that of the Times, which shows that the Guardian has a longer average sentence length. The Mirror, on the other hand, has a very small range, which tells us that the data is very consistent throughout the entire paper. Its inter quartile range is bigger than that of the Guardian, which suggests that its data is more spread out. Its distribution is perfectly symmetrical, which also suggests that the data is evenly distributed throughout the paper. Its median is smaller than that of the other two papers, which suggests that it has a lower average sentence length.
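The direction of skew can also be put into a number. The sketch below uses the quartile (Bowley) skewness, which is one common quartile-based measure and is an assumption here, not necessarily how the skew was judged from the box plots. The medians and inter quartile ranges are the ones given above, and the lower quartiles (19.5 for the Times, 21 for the Guardian) are the ones quoted where the hypotheses are reviewed. The upper quartiles are derived by adding the inter quartile range to the lower quartile.

```python
def quartile_skew(lower_q, median, upper_q):
    """Quartile (Bowley) skewness.

    Positive when the upper half of the box is longer than the lower half,
    negative when the lower half is longer, zero when symmetrical.
    """
    return (upper_q + lower_q - 2 * median) / (upper_q - lower_q)


# Lower quartile, median and inter quartile range as reported in this write-up.
papers = {
    "The Times": {"lower_q": 19.5, "median": 27.0, "iqr": 17.0},
    "The Guardian": {"lower_q": 21.0, "median": 28.5, "iqr": 14.0},
}

results = {}
for name, stats in papers.items():
    upper_q = stats["lower_q"] + stats["iqr"]  # derived, not read from the plot
    results[name] = quartile_skew(stats["lower_q"], stats["median"], upper_q)
    print(f"{name}: upper quartile = {upper_q}, quartile skew = {results[name]:+.2f}")
```

On these figures the Times comes out slightly positive and the Guardian very slightly negative, which agrees with the descriptions of the two box plots. The Daily Mirror is left out because its lower quartile is not quoted.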
One of the hypotheses stated that the news category sentence length will be longer than sport in all three newspapers. To test this, a separate investigation was carried out. The same methods were used as in the main investigation, but only the two categories sport and news were examined. The results from the stratified sample for sport were taken from all three newspapers and put together. The same was done for news. Cumulative frequency curves were then drawn up and box plots made.
The results:
Sport
News
Cumulative frequency curves and box plots are on the next page.
The box plot for the news shows that it has an inter quartile range of 15, so its data is evenly spread out. The range of the news is 45, which is quite small, so the data is very consistent throughout the section in all three of the papers. Its skew is very negative, which shows that there are more long sentences than short ones. The average sentence length of the news category is 30, which can be seen from the median on the box plots.
The box plot for sport has an inter quartile range of 20, so the data is again reasonably spread out. The range of the sport is 65, which is large and shows that the data is more representative, with both big and small values. Its distribution is perfectly symmetrical, which again gives the idea of the data being evenly distributed throughout the section for all three papers. The average sentence length of the sport category is 30, which can be seen from the median on the box plots.
The inter quartile range is larger for the sport section than for the news, but the skew of the news section is very negative. This tells us that the sport section has a variety of sentence lengths, while the news section has a majority of longer sentences. The medians of the two categories are exactly the same, as the extreme high values compensate for the majority of low values in the sport section. The range of the news section is much lower than that of the sport section, which has a range of 65 compared with 45 for news. This shows that the results for the news section are more consistent, whereas the sport section has more extreme high values.
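The comparison of the two categories can be laid out in a short Python sketch. The `Summary` class and the `compare` function are hypothetical helpers. The figures are the medians, inter quartile ranges and ranges quoted above for news and sport.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    median: float
    iqr: float
    value_range: float


# Figures read from the box plots, as quoted in the text.
news = Summary(median=30, iqr=15, value_range=45)
sport = Summary(median=30, iqr=20, value_range=65)


def compare(label, news_value, sport_value):
    """Say which category has the larger value for one statistic."""
    if news_value > sport_value:
        return f"{label}: news is larger ({news_value} vs {sport_value})"
    if news_value < sport_value:
        return f"{label}: sport is larger ({news_value} vs {sport_value})"
    return f"{label}: equal ({news_value})"


for label in ("median", "iqr", "value_range"):
    print(compare(label, getattr(news, label), getattr(sport, label)))

# The hypothesis said news sentences would be longer than sport sentences.
news_longer_on_average = news.median > sport.median
print("Hypothesis supported by the medians:", news_longer_on_average)
```

Because the two medians are equal, the check on the medians alone does not support the hypothesis, even though sport shows the larger spread.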
Were My Hypotheses Correct?
My first hypothesis stated that the sentence length of the quality and broadsheet would be similar. This was proven to be correct, as the lower quartiles of both papers were very similar, as were the medians. The lower quartile of the quality paper is 19.5 and the lower quartile of the broadsheet paper is 21. This shows that the average sentence length was similar. The Times, the quality paper, had a larger inter quartile range, but this just means the data was more dispersed throughout the paper. Its inter quartile range was 17, with the Guardian having an inter quartile range of only 14. The Times also had a higher overall range, 65, which again shows that its data was more dispersed than that of the Guardian, the broadsheet paper. Overall, even though its data was more dispersed, the quality paper had roughly the same sentence length as the broadsheet paper.
My second hypothesis stated that the sentence length of the quality paper and broadsheet paper will be longer than that of the tabloid paper. This was also proven to be correct, as the median of the tabloid paper was smaller than those of the other two papers. The median of the Times is 27 and the median of the Guardian is 28.5. The median of the tabloid, The Daily Mirror, was only 22.5. The inter quartile range of the tabloid was 15, whereas the inter quartile range of the broadsheet was smaller at only 14. This only shows that the data in the tabloid paper is more dispersed than in the broadsheet paper; the overall sentence length is still longer in the broadsheet.
The third hypothesis stated that news category sentence length will be longer than sport in all three newspapers. This hypothesis was not proven to be correct. The medians of both categories were exactly the same, which means that the average sentence length was the same. The medians of both the news and sport categories were 30. The skew of the news category is very negative, which shows that the news category actually had more long sentences than sport, whose distribution is perfectly symmetrical. Although the news category had longer sentence lengths with a negative skew, the extreme high values made up for the low values in the sport section, so the median was the same for both categories. The news category had a smaller inter quartile range of 15, with the sport having one of 20. This tells us that the data of the sport section was more dispersed throughout the section. The data of the news category was more consistent, with not many extreme high or low values.
There were many differences in readability, which were influenced by the topic. The quality and broadsheet papers both had similarly long sentence lengths, which generally means the paper is less readable. The tabloid paper had short sentence lengths on average, which generally means the paper is easier to read. The news section had the longest sentence length, with the entertainment section having the shortest.
There may have been some outside influences on the results that were gained. Firstly, there were the random numbers chosen by the calculator. Even though they were totally random, the calculator repeated the same numbers a number of times, which influenced the results. Secondly, in the sport category there was a major event taking place, as an F1 team had been accused of cheating. This meant that there was more writing in the sport section of all three papers than there would normally have been if the papers had been chosen on a different day.
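One way to avoid repeated random numbers would be to pick the sentence numbers without replacement. The sketch below is only a suggestion and is not the method used in this investigation: the function name is hypothetical, the section size of 200 sentences is made up, and 34 is the number of sentences found earlier for the Guardian news section.

```python
import random


def pick_sentence_numbers(sentences_in_section, how_many, seed=None):
    """Pick distinct sentence numbers (counting from 1) at random."""
    if how_many > sentences_in_section:
        raise ValueError("cannot pick more sentences than the section contains")
    rng = random.Random(seed)
    # random.sample never returns the same item twice.
    numbers = rng.sample(range(1, sentences_in_section + 1), how_many)
    return sorted(numbers)


# Hypothetical section of 200 sentences; 34 sentences are needed.
chosen = pick_sentence_numbers(200, 34, seed=1)
print(chosen)
print("Any repeats?", len(chosen) != len(set(chosen)))
```

Fixing the seed makes the same list come out each time, which allows the sample to be checked later; leaving it as `None` gives a different sample on each run.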
There were a number of limitations to the investigation, which mainly involved the size of the sample taken. The bigger the sample, the more accurate the results would have been. Taking a larger sample could be a way to improve the investigation and make it more accurate. Improving the investigation in this manner could allow a readability table to be produced for every newspaper according to its genre, which would allow people to choose which type of newspaper they would like to read.