Hypothesis:
I think that the average word length in a broadsheet newspaper will be greater than in a tabloid. I also think that the average sentence length will be greater in a broadsheet than in a tabloid. I think this is true because broadsheet newspapers are aimed at an audience of intellectuals and provide more informative news articles, whereas tabloids are aimed more at entertaining people and are similar to a magazine, full of gossip and shorter articles.
Brief Pilot:
I will begin with a brief pilot study so that I can recognise any problems that may occur when collecting my actual data. I will collect a small amount of data from a short article in any paper, so that I can spot these problems in advance and work out how to counter them.
My brief pilot data:
Problems that occurred were:
- Apostrophes e.g. Emily’s, It’s
- Hyphens
- Abbreviations
- Numbers written as digits e.g. 38
- Times e.g. 7.00pm
- Headlines
- Sub-headings
- Acronyms e.g. NATO
How I will counter these problems:
- If I come across apostrophes I will ignore the apostrophe and count the word as one word, e.g. can’t counts as one four-letter word.
- If I come across hyphens I will ignore them completely and count the hyphenated word as one word.
- Abbreviations I will ignore completely.
- If I come across numbers e.g. 38 I will not count it at all.
- If I come across times e.g. 8.00pm I will not count it at all.
- Headlines I will not count.
- Sub-headings I will not count.
- Acronyms I will not count.
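The counting rules above can be sketched in Python as a check on the hand count. This is only an illustration under stated assumptions: the abbreviation list is hypothetical and would need extending, an acronym is assumed to be any token of two or more capital letters, and headlines and sub-headings are assumed to have been left out of the text before it is passed in.

```python
import re

# Hypothetical abbreviation list (an assumption); extend it for the articles sampled.
ABBREVIATIONS = {"mr", "mrs", "dr", "etc", "eg", "ie"}

def word_lengths(text):
    """Return the length of each counted word in text, following the rules above."""
    lengths = []
    for token in text.split():
        # Numbers and times (e.g. 38, 7.00pm): skip any token containing a digit.
        if any(ch.isdigit() for ch in token):
            continue
        # Ignore apostrophes, hyphens and other punctuation; keep the letters only.
        letters = re.sub(r"[^A-Za-z]", "", token)
        if not letters:
            continue
        # Acronyms (assumed: two or more letters, all capitals) are not counted.
        if len(letters) > 1 and letters.isupper():
            continue
        # Abbreviations are not counted.
        if letters.lower() in ABBREVIATIONS:
            continue
        lengths.append(len(letters))
    return lengths

def sentence_length(sentence):
    """Number of counted words in one sentence."""
    return len(word_lengths(sentence))

print(word_lengths("It's a well-known NATO plan for 38 people at 7.00pm"))
# prints [3, 1, 9, 4, 3, 6, 2]
```

Here "It's" is counted as a three-letter word and "well-known" as one nine-letter word, while NATO, 38 and 7.00pm are skipped.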
Plan:
I will collect my data from a broadsheet paper, The Times, and from a tabloid paper, The Sun. I am going to select a sample of 200 words and 200 sentences from each paper, as I can’t count the number of words and sentences in the entire newspapers.
My samples will be unbiased because they are quite large and are representative of both newspapers. When I am collecting my samples for the number of words per sentence, I will select 50 sentences from each of 4 corresponding sections of each paper, e.g. articles with similar relevance, so that the sample is representative of each newspaper.
When I am collecting samples for the sentence length I will select 50 sentences from each of 4 corresponding sections of each paper, so that my data is representative of each newspaper as well as being unbiased.
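One way to take 50 sentences from each of 4 sections without choosing them by eye is to pick them at random. The plan above does not say how the sentences are selected, so random selection is an assumption here, and the function name and section names are made up for illustration.

```python
import random

def sample_sentences(sections, per_section=50, seed=None):
    """Pick per_section sentences at random from every section of one paper.

    sections maps a section name to the list of sentences typed up from it.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    sample = []
    for name, sentences in sections.items():
        if len(sentences) < per_section:
            raise ValueError(f"section {name!r} has fewer than {per_section} sentences")
        # random.sample chooses without replacement, so no sentence is used twice.
        sample.extend(rng.sample(sentences, per_section))
    return sample

# Placeholder sentences standing in for the real articles (not real data).
paper = {
    name: [f"{name} sentence {i}" for i in range(80)]
    for name in ["news", "sport", "business", "opinion"]
}
chosen = sample_sentences(paper, seed=1)
print(len(chosen))  # prints 200
```

Using the same four section names for both papers keeps the two samples of 200 comparable.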
I will record my data in tally charts, using grouped frequencies for the sentence lengths, and I will then calculate the mean for both the sentence length and the word length. I will then put my data for word length into a bar graph and my grouped data for sentence length into a histogram so that I can compare the distributions. I will also draw cumulative frequency diagrams in order to find the median and interquartile range (IQR), and use these to draw box plots. This will show me the average word and sentence length and how varied the results are, and it will let me compare the distributions easily.
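The calculations in this plan can be sketched as follows. The mean of grouped data is estimated from the class midpoints, and the median and quartiles are read off the cumulative frequency at n/2, n/4 and 3n/4, assuming a straight line within each class (the same idea as reading values from a cumulative frequency diagram). The function names and the grouped table are made up for illustration and are not my results.

```python
def grouped_mean(groups):
    """Estimate the mean from (lower, upper, frequency) classes using midpoints."""
    total = sum(f for _, _, f in groups)
    return sum((lo + hi) / 2 * f for lo, hi, f in groups) / total

def value_at(groups, position):
    """Read a value off the cumulative frequency, interpolating within a class."""
    cumulative = 0
    for lo, hi, f in groups:
        if f > 0 and cumulative + f >= position:
            return lo + (position - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("position is beyond the total frequency")

def median_and_iqr(groups):
    """Return the estimated median and interquartile range."""
    n = sum(f for _, _, f in groups)
    q1 = value_at(groups, n / 4)
    median = value_at(groups, n / 2)
    q3 = value_at(groups, 3 * n / 4)
    return median, q3 - q1

# Made-up grouped sentence lengths: (lower bound, upper bound, frequency).
example = [(0, 10, 10), (10, 20, 20), (20, 30, 10)]
print(grouped_mean(example))    # prints 15.0
print(median_and_iqr(example))  # prints (15.0, 10.0)
```

Because sentence lengths are whole numbers, treating each class as a continuous interval is an approximation, so these values are estimates in the same way that readings from a hand-drawn curve would be.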
Finally, I will use a summary table so that all my statistics can be shown together and compared.