Solutions
-
Tabloid: number of images on front page: 3
Image 1: 33.5 x 10.5 = 351.75
Image 2: 4.5 x 6.7 = 30.15
Image 3: 4.1 x 4 = 16.4
Total area: 351.75 + 30.15 + 16.40 = 398.3cm²
Broadsheet: number of images on front page: 3
Image 1: 19 x 8.5 = 161.5
Image 2: 3.5 x 3.6 = 12.6
Image 3: 12 x 5 = 60
Total area: 161.5 + 12.6 + 60.0 = 234.1cm²
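The two totals above can be checked with a short script. This is only a sketch: the side lengths are copied from the measurements above, while the names `tabloid`, `broadsheet` and `total_area` are made up for the illustration.

```python
# Side lengths of each front-page image in cm, copied from the measurements above.
tabloid = [(33.5, 10.5), (4.5, 6.7), (4.1, 4.0)]
broadsheet = [(19.0, 8.5), (3.5, 3.6), (12.0, 5.0)]


def total_area(images):
    """Add up side x side for every image and round to 2 decimal places."""
    return round(sum(a * b for a, b in images), 2)


tabloid_total = total_area(tabloid)
broadsheet_total = total_area(broadsheet)

print(f"Tabloid total image area: {tabloid_total} cm^2")
print(f"Broadsheet total image area: {broadsheet_total} cm^2")
```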
-
Broadsheet: 12.5 x 9.5 = 118.75cm²
Tabloid: 14.5 x 6.5 = 94.25
13 x 9.5 = 123.5
94.25 + 123.5 = 217.75cm²
-
Broadsheet: 9.3 x 13.2 = 122.76cm²
Tabloid: 4.3 x 4.8 = 20.64cm²
-
Broadsheet: Total pages: 18
Sports pages: 1/3
Fraction of paper: (1/3)/18
Tabloid: Total pages: 60
Sports pages: 39
Fraction of paper: 39/60
-
Tabloid: 1 page of international news: 1/60
Broadsheet: 5 pages of international news: 5/18
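To compare fractions with different denominators it helps to turn them into percentages. The sketch below does this for the page counts recorded above that are whole numbers; the variable names and the `as_percent` helper are invented for the illustration.

```python
from fractions import Fraction

# Page counts copied from the solutions above: (pages on the topic) / (total pages).
sport_tabloid = Fraction(39, 60)
international_tabloid = Fraction(1, 60)
international_broadsheet = Fraction(5, 18)


def as_percent(fraction):
    """Convert a fraction of the paper into a percentage, to 1 decimal place."""
    return round(float(fraction) * 100, 1)


print("Tabloid sport:", as_percent(sport_tabloid), "%")
print("Tabloid international news:", as_percent(international_tabloid), "%")
print("Broadsheet international news:", as_percent(international_broadsheet), "%")
```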
Readability
Hypothesis
I believe that the broadsheet newspaper will have a higher number of longer words than the tabloid and will therefore be harder to read.
To test this I will create two frequency tables of the number of letters in a word, one for each newspaper. The data will not be grouped. From these I will build cumulative frequency tables and then construct two cumulative frequency graphs. I will also create two grouped frequency tables in order to construct two histograms. I will analyse both the histograms and the cumulative frequency graphs in order to find out whether my hypothesis is correct.
To avoid bias I will collect a random sample of words. In order to do this I will use the RAN# button on my calculator. I will number each page in the tabloid from 1 to 60 and each page from the broadsheet 1 to 18. I will also number the first 200 words on each page 1 to 200.
I will press SHIFT RAN# to give me a random number between 0 and 1. I will then multiply this number by 60 or 18 (depending on which newspaper I am collecting data from) and round it to an integer to identify the page of the newspaper. I will press SHIFT RAN# again to give a second number between 0 and 1, multiply that by 200 and round it to an integer to identify the word on that page.
If this process selects a position (the same page and the same word number) that I have already sampled, I will ignore that result and repeat the process until a different position is selected. However, if the same word appears on a different page or in a different position on the same page, I will not ignore it.
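The sampling procedure described above can be sketched in code, with Python's `random.random()` standing in for the RAN# button. The function names are hypothetical, and one step is an assumption that the method does not state: if rounding gives 0, which matches no page or word, the sketch simply draws again.

```python
import random


def draw_position(total_pages, rng, words_per_page=200):
    """Pick one (page, word number) pair by multiplying and rounding."""
    while True:
        page = round(rng.random() * total_pages)
        word = round(rng.random() * words_per_page)
        # Assumption: a rounded value of 0 matches no page or word, so redraw.
        if page >= 1 and word >= 1:
            return page, word


def sample_positions(total_pages, sample_size=100, seed=None):
    """Collect sample_size different positions, ignoring repeated ones."""
    rng = random.Random(seed)
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        position = draw_position(total_pages, rng)
        if position in seen:
            continue  # same page and word number already picked: repeat
        seen.add(position)
        chosen.append(position)
    return chosen


tabloid_positions = sample_positions(60, seed=1)
broadsheet_positions = sample_positions(18, seed=1)
print(tabloid_positions[:5])
```

One thing the sketch makes visible is that rounding to the nearest integer gives the last page and the last word only half the chance of the others; rounding up instead would probably give every position equal weight, though that would be a change to the method described here.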
Number of letters in 100 words extracted from a broadsheet newspaper
Number of letters in 100 words extracted from a Tabloid newspaper
Analysis and Evaluation
In the first section of this project I came up with a few simple hypotheses and used simple calculations like finding area and working out percentages to prove them.
However for my readability hypothesis I came up with a more complex problem and therefore needed to perform more complex statistical calculations and procedures in order to verify my final hypothesis.
My readability hypothesis was that “the broadsheet newspaper will have a higher number of longer words than the tabloid newspaper and will therefore be harder to read.” To test whether this was true I needed to gather a sufficiently large random sample of words from each newspaper and compare the lengths of the words by counting the letters.
I collected 100 words from each newspaper, counted how many letters there were in each and put the data in two separate frequency tables (one for broadsheet and one for tabloid), from which I constructed two cumulative frequency tables so that I could draw two cumulative frequency graphs and box plot diagrams.
I also created a grouped frequency table for each newspaper so that I could construct a Histogram for each set of data. I did all of this so that I could make observations that would show whether my hypothesis is true, and also see whether I could make any improvements which would make my results more accurate.
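The step from a list of words to a frequency table and then a cumulative frequency table can be sketched as below. The word list is a hypothetical mini-sample made up for the illustration and is not the data collected for this project.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical mini-sample for illustration only; not the project's data.
words = ["the", "cat", "sat", "on", "a", "mat", "near", "parliament"]

# Count the letters in each word, then tally how often each length occurs.
lengths = [len(word) for word in words]
frequency = Counter(lengths)

# Ungrouped frequency table, ordered by number of letters.
letters = sorted(frequency)
counts = [frequency[n] for n in letters]

# Running total of the frequencies gives the cumulative frequency column.
cumulative = list(accumulate(counts))

print("Letters  Frequency  Cumulative frequency")
for n, f, cf in zip(letters, counts, cumulative):
    print(f"{n:>7}  {f:>9}  {cf:>20}")
```

The last cumulative frequency always equals the sample size, which is a quick check that no word has been missed.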
To avoid bias I used the RAN# button on my calculator to collect a completely random sample of words.
The cumulative frequency table for the words in the tabloid has a Median of 3.2, which indicates that a typical tabloid word is short. The Mode points the same way, as it is very similar to the Median: the Mode is 3, which shows that the most common word length in my random sample of tabloid words is 3 letters.
The Median of the broadsheet is 5.7, which indicates that a typical broadsheet word is longer.
Beyond this comparison of Medians, I have drawn on several other measures to support it.
For example, the Histogram shows the general spread of the data (whether it is positively or negatively skewed). The tabloid supports my hypothesis very well as it is positively skewed, which means the majority of the data is concentrated at the lower end, among the shorter words.
However, the broadsheet does not fit the theory as well. I had expected its Histogram to be skewed the opposite way, towards longer words; what I got was a graph that is only comparatively less skewed towards short words than the tabloid's.
The cumulative frequency graph also provided a lot of information. The gradient of the tabloid curve is much steeper at the beginning of the graph, showing that most of the data is concentrated at short word lengths. Further along the x-axis the line gets flatter. The lower quartile is 1.8 and the upper quartile is 5.6, giving an interquartile range of 3.8. The interquartile range for the tabloid is much less than that of the broadsheet, showing that tabloid word lengths are more consistently clustered around the Median.
The box plot diagram gives us another perspective on our data.
For example the Median is closer to the lower quartile in the tabloid than in the broadsheet. This was hypothesised, but a larger discrepancy was anticipated.
The Median of the broadsheet, although further away from the lower quartile than in the tabloid, is still less than I expected. I believe this is because there are large numbers of 3-4 letter words in all newspapers, regardless of whether they are broadsheet or tabloid, so the graph of letters per word will never be completely skewed towards longer words for any newspaper.
I have concluded that my tabloid data fitted my hypothesis very well, which suggests that tabloids have a larger number of words with few letters.
My broadsheet data did not fit my hypothesis ideally but nevertheless did support the main idea. I believe this to be because there are a large number of 3-letter words in all newspapers.