Maths Coursework-statistics
Hypothesis one
I believe that the number of letters per word will be greater, on average, in a broadsheet newspaper than in a tabloid newspaper.
Hypothesis two
I also believe that the number of words per sentence will, on average, be greater in a broadsheet newspaper than in a tabloid newspaper.
These assumptions rest on the association of intelligence with broadsheet newspapers, and a certain lack of it with tabloids. The general belief is that broadsheet newspapers such as 'The Times' are aimed at the higher-earning, more intelligent reader, so you would expect a higher quality of English and therefore longer words and sentences. Tabloid newspapers are supposedly aimed at the lower-earning, less intelligent reader, so the quality of English would not be as good as that found in a broadsheet, and one would assume that the words and sentences would be shorter.
Plan
To test my hypotheses I would need to record the length of words and sentences from both a broadsheet and a tabloid newspaper. I decided to find an article on the same issue in each, in the belief that this would make the results fairer: as the writers would be covering the same topic, the differences between the two articles would be more apparent. To a certain degree this method is random sampling, except that I haven't been totally random in my choice of article. I decided to record the length of the first one hundred words from each article, and also the number of words in the first ten sentences of the same article. From these results I would then hopefully be able to draw conclusions with regard to my two hypotheses. To make the results fairer I would need to take the length of words from more articles, as this would give a more accurate representation of the whole newspaper. To really increase the accuracy of my findings I would need to take samples from many different broadsheet and tabloid newspapers, ideally every single one, in order to fully test my hypotheses. The same goes for my second hypothesis: the results would have been more accurate had I taken sentence lengths from other articles in the same paper and from a range of newspapers.
I also ignored any names or titles found in either article: if used in both they would be the same, and they are not really proper words anyway.
Results
To carry out my investigation I bought a copy of 'The Times' newspaper and a copy of 'The Daily Mail', a broadsheet and a tabloid respectively. I had originally planned to draw up tables listing each of the one hundred word lengths. But as I began this I realised how time-consuming it would be, so I decided to draw up frequency tables instead. You simply put the raw data into a tally chart, counting the number of times each length is recorded, and you are left with the frequency of each length in your results.
(Seen below)
Results for 'The Times' (word length and sentence length)
Length of word    Frequency
1                 3
2                 21
3                 15
4                 19
5                 11
6                 11
7                 8
8                 2
9                 6
10                1
11                2
12                0
13                1
Sentence number   Length
1                 24
2                 38
3                 7
4                 45
5                 37
6                 34
7                 26
8                 37
9                 35
10                32
Results from 'the Daily Mail' (word length and sentence length)
Length of word
Frequency
3
2
21
3
7
4
8
5
0
6
7
7
4
8
2
9
5
0
1
0
2
3
Sentence number
Length
22
2
32
3
1
4
29
5
39
6
29
7
7
8
1
9
5
0
3
After collecting the results there are a number of tasks to carry out so that I can compare the two sets of results, and more importantly compare them against the two hypotheses.
Averages
The first thing I did with the data was find the mean word length. Using my frequency tables, this is done by multiplying each word length by its frequency (the number of times it was found) and then adding these products together to give the total number of letters. Dividing that total by the number of words, in my case one hundred, gives the average.
* The Times' average word length was 458 / 100 = 4.58
* The Mail's average word length was 454 / 100 = 4.54
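The frequency-table method described above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_from_frequency_table` and the small `example` table are made up for the sketch, while the totals 458 and 454 are the ones quoted above.

```python
def mean_from_frequency_table(table):
    """Return the mean of a data set given as {value: frequency}."""
    total_count = sum(table.values())
    if total_count == 0:
        raise ValueError("frequency table is empty")
    # Multiply each value by its frequency, then add the products.
    total = sum(value * freq for value, freq in table.items())
    return total / total_count


# Hypothetical tally of ten word lengths (not taken from either newspaper).
example = {2: 3, 3: 4, 5: 2, 8: 1}
print(mean_from_frequency_table(example))  # 36 letters / 10 words = 3.6

# The totals quoted in the text: total letters divided by 100 words.
times_mean = 458 / 100
mail_mean = 454 / 100
print(times_mean, mail_mean, round(times_mean - mail_mean, 2))
```

Working from the table rather than the raw list gives the same answer, because each length is simply counted as many times as it occurred.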
The next average to find was the average sentence length in each of the two articles. This time all I needed to do was find the total number of words in the ten sentences for each newspaper, and then divide by the number of sentences, in this case ten.
* The Times' average sentence length was 315 / 10 = 31.5
* The Mail's average sentence length was 208 / 10 = 20.8
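As a check on the sentence averages, here is a short Python sketch. It takes the ten Times sentence lengths from the results table and, for the Mail, uses only the total of 208 words quoted above. The variable names are chosen for the sketch.

```python
from statistics import mean

# Words per sentence for the first ten Times sentences (from the results table).
times_sentence_lengths = [24, 38, 7, 45, 37, 34, 26, 37, 35, 32]

times_total = sum(times_sentence_lengths)   # total words in the ten sentences
times_mean = mean(times_sentence_lengths)   # total divided by ten

# For the Mail, work from the total quoted in the text.
mail_total = 208
mail_mean = mail_total / 10

difference = round(times_mean - mail_mean, 1)
print(times_total, times_mean, mail_mean, difference)
```

The Times total comes out at 315, which matches the figure used in the calculation above.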
Graphs
After completing the averages section, I went on to construct some graphs using Microsoft Excel. These are known as frequency graphs, as they plot the frequency of each recorded value, and putting two sets of relevant data onto the same graph makes it a very useful tool for comparing them. They show the peak, or the highest frequency, of your results and also the spread, which shows how far the results are from the mean (average): for example, whether most of the words are concentrated around the mean or spread out across the graph.
I decided to put the two sets of data, from The Times and the Daily Mail, together on the graphs, in order to be able to compare them easily.
Length of words-Frequency graph
As this is a frequency graph, I cannot take the results from the sentence lengths and put them into the same form, as they are not frequencies. And, as the sentence results were not collected in direct association with the word lengths, I cannot use a scatter graph to establish any relationship. So I decided to sort the data into order, from lowest to highest, and then, as with a scatter graph, plot the points onto a graph and draw a line of best fit through them. This line of best fit now means something different: a steep gradient shows that there isn't much consistency in the results, so the flatter the line, the more consistent the sentence lengths. As I have rearranged the data, the x-axis no longer shows the original sentence number; it just refers to the first to tenth sentences in order of size.
(Graph can be seen overleaf)
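The sort-and-fit idea described above can be sketched in Python as follows. This is an assumption about how the line might be calculated, not the method Excel was asked to use: it fits an ordinary least-squares line to the sorted lengths against their rank (1st to 10th), and the function name `best_fit_gradient` is made up for the sketch.

```python
def best_fit_gradient(values):
    """Gradient of the least-squares line through the sorted values vs. rank."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values to fit a line")
    ranks = range(1, n + 1)          # 1st, 2nd, ... nth in order of size
    mean_x = sum(ranks) / n
    mean_y = sum(ordered) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ranks, ordered))
    sxx = sum((x - mean_x) ** 2 for x in ranks)
    return sxy / sxx


# Times sentence lengths from the results table.
times_sentence_lengths = [24, 38, 7, 45, 37, 34, 26, 37, 35, 32]
print(best_fit_gradient(times_sentence_lengths))

# A perfectly consistent set of sentences gives a flat line.
print(best_fit_gradient([30, 30, 30, 30]))
```

A gradient of zero means every sentence is the same length; the further the lengths are spread out, the steeper the fitted line becomes.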
Analysis of my results
So what do my results show, and how much do they agree with my original hypotheses?
Averages (means)
The averages from the two sets of word-length results, for The Times and the Daily Mail, surprised me. In accordance with my first hypothesis I had expected the length of words on average to be greater in a broadsheet newspaper than in a tabloid. On average they were, but the margin was no gulf: the difference between the two averages was a mere 0.04 letters. This shows that the words in the tabloid article were not much shorter, and that in these articles at least the length of words was, on the whole, more or less the same. So the results agree with the first hypothesis, but only just: for the first hundred words in the two articles, on average the broadsheet newspaper used longer words than the tabloid newspaper.
After seeing the difference, or rather the lack of it, in word length, I then expected to see the same similarity between the two sets of results for sentence length. But in this case my original thoughts were true, and to the extent that I had imagined. This time the difference between the two mean lengths was 10.7 words, a much larger gap. This shows that, at least in these two sets of results, sentences are longer in the broadsheet than in the tabloid.
So both hypotheses were true, in these results at least.
Frequency graph
The frequency graph also indicates the lack of difference between the lengths of words. Both sets of results peak at the same point, two letters, and the majority of the other results, for both papers, are not spread too far from the mean lengths. This further backs up the similarity between the results. Still, the results agree with the hypothesis, though only just.
Length of sentence graph (previous page)
The length of sentence graph shows the following things. The sentence length on average is greater in the broadsheet paper than in the tabloid, a point supported by the mean lengths. The broadsheet newspaper has consistently longer sentences, shown by the less steep gradient of its line of best fit. The tabloid newspaper's sentence length is less consistent, as the steeper gradient of its line shows: there are some quite small sentences and some rather large ones, and so the tabloid has a smaller mean length of sentence. So this graph shows that the broadsheet newspaper has on average longer sentences, therefore supporting the second hypothesis.
Analysis of data
What types of data have I been working with in my investigation?
The data I have been using is quantitative and discrete. This means simply that I have been using numbers, and that each value is a whole-number count: a word cannot have four and a half letters, and a sentence cannot have ten and a half words. Only the means can take values in between.
Conclusion
There are two main things to consider at this point of my investigation:
* Are my results reliable enough for me to draw a firm conclusion from? And if so...
* Do my results support my original predictions (hypotheses)?
With regard to the reliability of my results, I would say that in some ways they are reliable and in others they are not. For the purpose of my investigation they are, as it matters little whether they are perfectly correct or not, but there aren't enough results to draw a firm conclusion for every broadsheet and tabloid newspaper, or even for the two I used. I chose only one article from each paper. Ideally I would have taken five or six articles, chosen at random to ensure that they weren't biased, and not from one broadsheet and one tabloid but from several. To be as sure as possible that the results are reliable, you should collect data from every single newspaper on sale, and take data from the papers every day for a week or a month, to limit the possibility that the results found on one day are simply down to chance. My results are reliable for analysing the length of the first hundred words and the first ten sentences in two articles from The Times and the Daily Mail on Monday the 27th October 2003. In that respect they are reliable.
So, assuming that my results are reliable enough to draw a conclusion, I would say this. The results for word length did support my prediction that on average the broadsheet newspaper uses longer words than the tabloid. One must always remember, though, that data proves nothing, so I cannot be sure that the results I have found are true to reality; they could simply be down to chance. As I said, the results I have are not really sufficient to draw a rock-solid conclusion regarding word length. To assess the strength of my evidence I could carry out a statistical significance test.
The same goes for the second hypothesis: the results are really not sufficient, but from them I am able to say that they supported my prediction that there are longer sentences in broadsheet newspapers. And as I had thought, there was a large gap between the broadsheet and the tabloid.
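The statistical test mentioned above was not carried out in this investigation, and it could take several forms. One possible choice, shown here purely as a sketch, is a one-sided permutation test: it shuffles the two samples together many times and counts how often chance alone produces a gap at least as large as the one observed. The function name `permutation_p_value`, the number of trials, the seed and the two toy samples are all assumptions made for the illustration, not newspaper data.

```python
import random


def permutation_p_value(sample_a, sample_b, trials=10_000, seed=0):
    """Estimate how often chance gives sample_a a mean at least this far above sample_b."""
    rng = random.Random(seed)            # fixed seed so the result is repeatable
    pooled = list(sample_a) + list(sample_b)
    size_a = len(sample_a)
    observed = sum(sample_a)
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        # The pooled total and group sizes are fixed, so comparing the sum of
        # the first group is equivalent to comparing the difference in means.
        if sum(pooled[:size_a]) >= observed:
            at_least_as_large += 1
    return at_least_as_large / trials


# Made-up samples, for illustration only.
longer = [30, 34, 29, 41, 36]
shorter = [18, 25, 12, 27, 20]
print(permutation_p_value(longer, shorter))
```

A small value would suggest that the gap is unlikely to be down to chance alone; a value near 1 would suggest there is no evidence of a difference.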
As far as it goes, my results are useful enough, at least for my investigation. My data suggests that there are longer words and longer sentences in broadsheet newspapers than in tabloid newspapers. So, for someone who likes to read longer words in longer sentences I would say, "Go read a broadsheet newspaper!"
Possible extensions
* If I were to test this question further I would take more results, from every broadsheet and tabloid newspaper on sale, and collect results every day or every other day for a month, to ensure greater reliability.
* I could compare the lengths of words and sentences in newspapers against those in magazines.
* I could test the length of words in a magazine against the length of words on the Internet.
* If I were to carry out a similar investigation in future I would collect more data, and try to ensure that my results are significant, so that I could draw a firmer conclusion.