I could work out the percentage of the page devoted to advertisements, pictures, text and headlines and represent this data in the form of a pie chart or a bar chart.
Again I shall use a range of tabloid and broadsheet newspapers to allow me to come up with a general conclusion which may support my hypothesis. To make any calculations accurate enough to draw a valid conclusion I shall collect data from 7 pages of each newspaper. Therefore in total I shall collect data from 28 pages. I shall select these pages at random by using my calculator. I shall first count the number of pages in the newspaper (x) and then enter the following into my calculator:
x Ran#
For example, if “The Times” has 88 pages, I would enter “88 Ran#”. This would allow the calculator to select a number between 0 and 88 at random. I shall then use the first seven numbers taken from the calculator reading as the pages from which I shall collect the data.
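The calculator method above can be imitated on a computer. The following is only a sketch under my own assumptions: the function name pick_pages is made up for illustration, each reading is rounded up to the next whole page so that the pages run from 1 to x, and repeated pages are skipped. None of these details are stated in my plan.

```python
import math
import random


def pick_pages(total_pages, how_many=7, seed=None):
    """Imitate 'x Ran#': pick distinct random page numbers from 1 to total_pages."""
    if how_many > total_pages:
        raise ValueError("cannot pick more distinct pages than the newspaper has")
    rng = random.Random(seed)
    pages = []
    while len(pages) < how_many:
        reading = total_pages * rng.random()  # like 'x Ran#': a value from 0 up to x
        page = math.floor(reading) + 1        # assumption: round up to a whole page
        if page not in pages:                 # assumption: ignore repeated pages
            pages.append(page)
    return pages


# Example: seven pages from an 88-page newspaper
print(pick_pages(88))
```

Passing a seed makes the same pages come out each time, which would make it possible to check the sampling afterwards.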
The pages I shall collect data from will be similar in both types of newspaper, i.e. pages on sport from both newspapers will be compared. This will allow me to produce a valid conclusion, as it would be unfair to compare a section on sport with a section on current events, because the section on sport is likely to contain more pictures.
Once I have collected my results and calculated the percentages of text and pictures presented on the page for each newspaper I can present my results in the form of a pie chart. I can then collate my data from the newspapers to produce a pie chart showing the percentage of text and pictures on a page for Tabloid and Broadsheet Newspapers rather than specific newspapers.
Hypothesis 3:
♦ Tabloid Newspapers give an “easier read” than Broadsheet Newspapers (lower reading age)♦
The idea of a newspaper providing an “easier read” can be explained in the following way.
“Readability is concerned with the problem of matching between reader and text. An accomplished reader is likely to be bored by simple repetitive texts. A poor reader will soon become discouraged by texts that s/he finds too difficult to read fluently.
This is likely to happen when the text is:
- Poorly printed
- Contains complex sentence structures
- Long words or
- Too much material containing entirely new ideas”
(This information has been obtained from the website )
A factor that affects readability concerns the words and sentences chosen by the author. This factor is the most easily proven and so I shall use this when investigating my hypothesis.
The term ‘reading age’ is used to indicate the age of a reader who could just about understand the text. Therefore a piece of text with a lower reading age provides an easier read than text with a higher reading age.
In my investigation I shall attempt to calculate the reading ages of the different articles from both newspapers. This will be done by using calculations involving sentence length and the number of syllables. To calculate the reading age of the articles I shall use the “Gunning ‘FOG’ Readability Test”. I shall calculate the reading age of 10 articles from each newspaper. This provides a sample size of 40 in total. I shall select these articles at random by using the same method for random sampling used to collect data for hypothesis 2.
Gunning ‘FOG’ Readability test:
- I shall select samples of 100 words (i.e. use the first 100 words of the article)
- I shall calculate L, the average sentence length (number of words divided by the number of sentences)
- In each article I shall count the number of words with 3 or more syllables and then find N, the average number of these words per article.
The grade level needed to understand the material = (L + N) x 0.4
So the reading age = [(L+ N) x 0.4] + 5 years.
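The calculation above can be written as a short program. This is a minimal sketch, assuming the counts have already been made by hand from a 100-word sample; the function names and the example figures (5 sentences, 12 long words) are made up for illustration and are not results from my investigation.

```python
def fog_grade_level(words, sentences, long_words):
    """Grade level = (L + N) x 0.4, where L is the average sentence length
    and N is the number of words with 3 or more syllables in the sample."""
    average_sentence_length = words / sentences  # L
    return (average_sentence_length + long_words) * 0.4


def fog_reading_age(words, sentences, long_words):
    """Reading age in years = grade level + 5."""
    return fog_grade_level(words, sentences, long_words) + 5


# Hypothetical sample: 100 words, 5 sentences, 12 words of 3 or more syllables
# L = 100 / 5 = 20, so grade level = (20 + 12) x 0.4 = 12.8
print(round(fog_reading_age(100, 5, 12), 1))  # 17.8
```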
To gain accurate results I may also use another test to calculate the reading ages for the articles. A different test to calculate the reading age of a piece of writing involves reading points off a graph, which correspond to a reading age. I may use the following “Fry Readability Graph” to do so:
Fry Readability Graph:
- I shall select samples of 100 words (i.e. use the first 100 words of the article)
- I shall then find y, the average number of sentences per 100-word passage
- I shall find x, the average number of syllables per 100-word passage
The corresponding point of (x,y) can be located on the Fry graph (below) to determine the reading age, in years.
I may use both of these tests to calculate the reading age because they are both suitable for all ages, from infant to upper secondary.
When calculating the readability of each article/text I may use both of the tests described above. I can collect my data and present it in the form of a tally chart. This data can then be represented in the form of a histogram. This would be possible if the data were collected in groups (e.g. reading ages in groups: 10 – 14, 15 – 18 etc.). I could calculate the mean, median and mode reading ages of the articles in the newspaper as well as the quartile ranges. I could also calculate the dispersion/variation in the reading ages by calculating the standard deviation of the data using the same formula described in Hypothesis 1.
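The averages and the standard deviation mentioned above could be worked out as in the sketch below. The reading ages in the example are invented purely to show the calculation. I have also assumed the standard deviation that divides by the number of values (the population form); if the formula in Hypothesis 1 divides by one less than this, statistics.stdev would be used in place of statistics.pstdev.

```python
import statistics


def summarise_reading_ages(ages):
    """Return the three averages and the standard deviation of a list of reading ages."""
    return {
        "mean": statistics.mean(ages),
        "median": statistics.median(ages),
        "mode": statistics.mode(ages),
        "std_dev": statistics.pstdev(ages),  # assumption: population standard deviation
    }


# Invented reading ages for five articles, for illustration only
example_ages = [12, 13, 13, 15, 17]
summary = summarise_reading_ages(example_ages)
for name, value in summary.items():
    print(name, round(value, 2))
```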
I would finally distinguish which newspaper gives an “easier read”. This could be done considering a number of factors. The comparison of the mean reading age of the two newspapers could show which provides an “easier read”. It could be argued that the newspaper that provides less variation in the reading ages of articles provides an easier read as all articles are similar in vocabulary and style which appeals to a certain age group.
Once I have obtained all of the data and represented this in a number of ways I can interpret and evaluate the data collected to come up with valid conclusions that may support or contradict my hypothesis. I will then also be able to produce a general conclusion that may link all three hypotheses.
Data Collection and Presentation
Hypothesis 1:
I used the following table to collect the data relevant to investigating the first hypothesis:
Etc.
I then transferred the data into a different table to calculate the standard deviation of my results. The following tables show the data I collected concerning word length for each newspaper. They also include relevant calculations that I used to work out the standard deviation of the data. I collected the data from two articles written on similar topics from each newspaper. For example, I sampled 100 words of each newspaper from an article written on “the Iraq Crisis” and another 100 words from an article written on “Current National News/Events”. The results are all correct to two decimal places.
BROADSHEET NEWSPAPERS:
Sample 1
Sample 2
Sample 3
Sample 4
Therefore the mean word length for Broadsheet Newspapers = 5.49 (2dp)
The mean Standard Deviation of the word lengths = 7.57 (2dp)
The standard deviation of the word lengths indicates the amount of variation/dispersion from the mean of the word lengths. I shall compare the standard deviation of the word lengths from Broadsheet Newspapers with the Standard Deviation of word lengths from Tabloid Newspapers later in this investigation. This should help to show which type of newspaper has the least variation in word length.
TABLOID NEWSPAPERS:
Sample 1
Sample 2
Sample 3
Sample 4
Therefore the mean word length for Tabloid Newspapers = 4.33 (2dp)
The mean Standard Deviation of the word lengths = 4.57 (2dp)
I shall later compare these results with those obtained for Broadsheet Newspapers to determine which type of newspaper contains the least variation in word length.
Hypothesis 2:
When investigating my second hypothesis I collected my data in the following table:
Etc.
For each newspaper I collected data from 7 different pages. I then calculated the area of the page that consisted of text and the area of the page that consisted of pictures or advertisements. I converted these figures into percentages to show the percentage of the page consisting of text or pictures. Once I had collected all the data and worked out the percentages for each page of the newspapers, I calculated the mean percentages and areas of the pages devoted to text or pictures. I then produced a pie chart representing this information for each newspaper. Finally I calculated the mean percentages of pictures/text per page for Broadsheet and Tabloid newspapers and presented my results in the form of a pie chart. I also produced a bar chart, which compares the two sets of data for the different types of newspaper. It is much easier to notice any significant differences between the two sets of data if they are presented on the same graph/chart. I shall use this bar chart and the other results to come to any conclusions regarding my hypothesis.
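The percentage calculation described above can be sketched as follows. The page size and picture sizes in the example are invented, and the function names are my own. The sketch treats everything that is not a picture or advertisement as text, which is an assumption made to keep the example short.

```python
def page_percentages(page_width, page_height, picture_sizes):
    """Return (text %, picture %) for one page.

    picture_sizes is a list of (width, height) pairs in cm for each picture
    or advertisement. Assumption: the rest of the page counts as text.
    """
    page_area = page_width * page_height
    picture_area = sum(width * height for width, height in picture_sizes)
    picture_percent = 100 * picture_area / page_area
    text_percent = 100 - picture_percent
    return text_percent, picture_percent


def mean(values):
    """Mean of a list of percentages, e.g. over the 7 sampled pages."""
    return sum(values) / len(values)


# Invented page: 30 cm x 40 cm with two pictures of 10 x 12 cm and 15 x 8 cm
text, pictures = page_percentages(30, 40, [(10, 12), (15, 8)])
print(text, pictures)  # 80.0 20.0
```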
In the following section I shall present the tables in which my data was collected and any calculations used to work out the averages and percentages:
I have also separated my data in to Tabloid and Broadsheet Newspapers:
Broadsheet Newspapers
The following pie charts are presentations of the mean values of the data collected. There are two pie charts:
Fig. 1 – A pie chart representing data collected from “The Times”
Fig. 2 – A pie chart representing data collected from “The Guardian”
Tabloid Newspapers
The following pie charts are presentations of the mean values of the data collected. There are two pie charts:
Fig. 3 – A pie chart representing data collected from “The Daily Mirror”
Fig. 4 – A pie chart representing data collected from “The Daily Mail”
I also calculated the mean percentage of pictures/text per page for the two types of newspapers (tabloid and Broadsheet). I calculated this by collating the results from the two tabloid newspapers and the two broadsheet newspapers. I then calculated the mean percentages of these sets of data. My results are shown in the following table:
I used these results to form three charts. I produced a pie chart to show the percentage of text/pictures per page for Tabloid Newspapers, and a similar pie chart for Broadsheet Newspapers. Finally, I produced a bar chart that presents both sets of data for Tabloid and Broadsheet newspapers on the same graph. This should make any differences much easier to see, and I shall use all three charts to compare my results and draw valid conclusions regarding my hypothesis when interpreting and evaluating the data.
The following charts are:
Fig. 5 – A pie chart representing data concerning Broadsheet Newspapers.
Fig. 6 – A pie chart representing data concerning Tabloid Newspapers.
Fig. 7 – A bar chart comparing results from both Tabloid and Broadsheet Newspapers.
Hypothesis 3:
For my third hypothesis I collected the data in the following table:
Etc.
In this table:
- L represents the average sentence length
- N represents the average number of words with 3 or more syllables per article.
I sampled 10 different articles from each newspaper at random when collecting my data. My results are as follows:
BROADSHEET NEWSPAPERS
TABLOID NEWSPAPERS
I then collated the data from the newspapers to produce a table that represented data for Tabloid Newspapers and Broadsheet Newspapers rather than the specific newspapers. I produced the following table and used the data to calculate the mean Reading age for Broadsheet Newspapers and the mean Reading Age for Tabloid Newspapers.
I cannot produce any graphs that are relevant to investigating my hypothesis and so I shall use the above table and results in the interpreting and evaluating section of my investigation to draw any conclusions with regards to my hypothesis.
Interpretation and Evaluation of Data
Hypothesis 1:
I can compare the two sets of data obtained for Broadsheet and Tabloid newspapers in many ways:
- Comparing Standard Deviation
- Comparing the three averages (mean, median and mode)
- Comparing Histograms and Frequency Polygons
- Comparing Cumulative Frequency Graphs (quartile ranges etc.)
However, not all of these comparisons are relevant to investigating my hypothesis. As I am investigating the variation in word length in the newspapers, the standard deviation of the sets of data is the most useful to compare. The standard deviation is a measure of how far the data is spread from the mean, so it provides a way of comparing the dispersion of the two sets of data with each other. The type of newspaper with the lower standard deviation has the lower variation in word length. My results show that Broadsheet Newspapers have a standard deviation of 7.57, which is greater than the standard deviation of word lengths for Tabloid Newspapers, which is 4.57. There is a clear difference of 3.00 between the two values. The Tabloid Newspapers have the lower standard deviation, which means that their word lengths deviate by a smaller amount from the mean than those of the Broadsheet Newspapers, where the spread of word lengths is much greater. These results therefore support my hypothesis that there is less variation in word length in Tabloid Newspapers than in Broadsheet Newspapers.
Comparing the three averages, histograms, frequency polygons and cumulative frequency graphs would not have been very useful when investigating my hypothesis. I could have used the quartile ranges from cumulative frequency graphs as they help to show the spread of the data. However, they do not show the dispersion of the data as specifically as the standard deviation does. My initial intention was to calculate the three averages and quartile ranges of the data as well as the standard deviation. I also planned to plot Cumulative Frequency Graphs and Frequency Polygons from my data. However, I realised that most of this was unnecessary, as it would not help in investigating the hypothesis.
If I were to carry out the same investigation again I would prefer to collect data from a larger sample. This could involve collecting data from every Broadsheet and Tabloid Newspaper. I could not do this, as I did not have enough time to carry out the investigation in such detail. I would also sample more than the first 100 words of each article. If I carried out the investigation like this I would most likely obtain much more accurate results which could prove or disprove my hypothesis. As I only collected data from a small sample, my results are not accurate enough to prove or disprove my hypothesis. I can only say whether they support my hypothesis or not. If I carried out the same investigation again I could collect enough data to allow me to produce cumulative frequency graphs, frequency polygons, or histograms if the data were grouped. However, I would use these to investigate word lengths and not variation in word lengths. I could even investigate a similar hypothesis such as “Articles in Broadsheet Newspapers contain longer sentences than in Tabloid Newspapers”. I could also investigate further by comparing my results from newspapers with data collected from magazines. However, all of this would be time-consuming.
Hypothesis 2:
In the pie charts I produced, Fig. 1 and Fig. 2 are specific to the two newspapers, “The Times” and “The Guardian”. Therefore these are specifically useful when investigating my hypothesis. The same is true of Fig. 3 and Fig. 4. I made these pie charts to present the data from each newspaper. They help to give an overall idea of the amount of space per page used up by pictures and text. More useful diagrams are Fig. 5, Fig. 6 and Fig. 7. The first two of these are pie charts showing the mean amount of space per page consisting of text and pictures for Broadsheet and Tabloid newspapers. There is obviously a huge difference between the proportions of text and pictures presented on a page in Tabloid and Broadsheet Newspapers. The two pie charts show the data specific to each type of newspaper, whereas the final bar chart presents both sets of data from Tabloid and Broadsheet Newspapers on the same graph. This makes it much easier to notice the difference between the two types of paper in the mean amount of space used by text and pictures per page. Using Fig. 7, it is clear that Broadsheet newspapers have a higher percentage of text per page than Tabloid Newspapers. The difference in the percentages of text is large, at around 42%.
From my results it seems as though tabloid newspapers are split almost evenly, with almost half of the newspaper consisting of pictures and the other half of text, whereas Broadsheet Newspapers contain almost four times as much text as pictures across the whole newspaper. This is probably because Broadsheet Newspapers are directed towards a professional or business-class audience and so articles contain more specific detail and all the facts about the subject or event. This is different in tabloid newspapers, as they contain more generalised articles which people can easily “dip into” and which contain basic facts and terms, as they are meant for a wider general audience. As can be seen, my data supports my hypothesis that the percentage of text per page is much greater in Broadsheet Newspapers than in Tabloid Newspapers.
When investigating this hypothesis I encountered a few problems. Whilst planning how I would collect the data I did not take into account the fact that on each page there is a blank border. I decided to use the second method I described in my plan to calculate the percentage of text per page. This meant that any space taken up by the border or other blank areas on the page was counted as being part of the text. The newspapers each had borders of a different thickness and so my data and results may not be very accurate. If I were to investigate the same hypothesis again, the area of the page from which I would collect the data would not include the border. As I would not be including the border when collecting data, I would be able to gain more accurate results. To ensure my results are accurate I would also use a larger sample when collecting data. As I did not have much time in this investigation I chose to collect data from 7 pages from each newspaper. My results and conclusions would be more accurate if I were able to use a larger sample of around 30 pages per newspaper. I could even sample the whole newspaper, although this would be time-consuming.
It could also be argued that there was an element of bias when I collected my results. I chose the pages to sample through random sampling. This meant that I could, by chance, have sampled more pages from the sport section of a newspaper than from any other section. The sport section is likely to contain more pictures, which would therefore affect my results. A fairer way of collecting the data would be through stratified sampling. This form of sampling would take into account the whole population (the whole newspaper). I would select a number of pages from each section of the newspaper in proportion to the size of that section, and do the same for the other newspapers. The data from each newspaper would then be in proportion, so results and conclusions are likely to be more accurate and could help to prove or disprove my hypothesis.
Throughout the investigation most continuous data for hypothesis 2 was subject to estimation. As the data I collected (i.e. the length and height of the pictures in order to calculate their area) is obtained by measurement, it is important to consider how accurate the information is. In measuring anything I was limited in accuracy by the equipment available and my own human limitations. For example, in hypothesis 2 the continuous data representing the area taken up by pictures was measured using a ruler only accurate to the nearest centimetre. This means that the true value could be up to 0.5cm above the measurement or up to 0.5cm below it. Where smaller measurements had to be taken, this error is a large fraction of the measurement, so it can be seen how significant the estimation can be. If I were to carry out the same investigation again, I would use a more accurate ruler and would decide in advance which parts of the newspaper are considered to be a part of the picture or text.
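The stratified sampling described above could be planned with a short program like the one below. It is a sketch only: the section names and page counts are invented, and the way leftover pages are shared out (giving them to the sections with the largest remainders) is my own assumption about how to deal with fractions of a page.

```python
def stratified_allocation(section_sizes, sample_size):
    """Share sample_size pages between sections in proportion to section size.

    section_sizes maps a section name to its number of pages.
    Returns a dictionary mapping each section name to the pages to sample.
    """
    total = sum(section_sizes.values())
    exact = {name: sample_size * size / total for name, size in section_sizes.items()}
    allocation = {name: int(share) for name, share in exact.items()}  # round down
    leftover = sample_size - sum(allocation.values())
    # Assumption: hand out the remaining pages to the largest remainders first
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Invented newspaper with 80 pages in three sections, sampling 10 pages
print(stratified_allocation({"news": 40, "sport": 24, "business": 16}, 10))
# {'news': 5, 'sport': 3, 'business': 2}
```

Within each section the pages themselves would still be chosen at random, so the method keeps the fairness of random sampling while stopping any one section from dominating the sample.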
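The effect of the 0.5cm error on an area can be shown with a small calculation. The sketch below uses an invented picture measuring 4cm by 3cm; the function name is my own. It finds the smallest and largest areas the picture could really have if each side is only correct to the nearest centimetre.

```python
def area_bounds(length, height, max_error=0.5):
    """Return (lowest, measured, highest) possible area for a measured rectangle.

    max_error is half of the smallest division on the ruler (0.5 cm here).
    """
    lowest = (length - max_error) * (height - max_error)
    measured = length * height
    highest = (length + max_error) * (height + max_error)
    return lowest, measured, highest


# Invented picture measured as 4 cm by 3 cm
print(area_bounds(4, 3))  # (8.75, 12, 15.75)
```

In this example a picture recorded as 12 square centimetres could really be anywhere from 8.75 to 15.75 square centimetres, which shows how much the error matters for small pictures.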
Hypothesis 3:
From the data I collected concerning reading ages it is clear that Tabloid Newspapers have a lower reading age than Broadsheet Newspapers. There is a difference of 5 years in the average reading age between the two types of newspaper. The readability test takes into account the sentence construction, word lengths and the number of syllables in each word. This means that text with a lower reading age will most likely have shorter sentences with shorter words and will consist of more words with fewer than 3 syllables. A lower reading age would therefore provide an “easier read”. I can therefore say that my results support my hypothesis that Tabloid Newspapers give an “easier read” than Broadsheet Newspapers.
When collecting and presenting my data I encountered a few problems. Firstly, the process of collecting the data was quite time-consuming, as it involved finding the word lengths, sentence lengths and number of syllables per word. I only used a sample of 10 articles from each newspaper (20 for each type of newspaper). This may be too small to draw a valid and accurate conclusion. I would prefer to use a larger sample, possibly of around 50 articles from each newspaper, if I carried out the investigation again. This would be time-consuming, but it would make my results more accurate and allow me to derive more valid conclusions with regards to my hypothesis.
Once I had collected my data I was not quite sure how to present it. The only useful way to present the data seemed to be in the form of a table. If I carried out the investigation on a larger scale I could produce frequency tables for the reading ages and then I could plot frequency polygons, cumulative frequency graphs or bar charts. However not all of these would be relevant when investigating the hypothesis.
When collecting my data I also found it quite difficult to calculate the reading ages using the “Gunning FOG Readability Test”, as it was the first time I had used it. I had also planned to use the “Fry Readability Graph” to collect data. If I had calculated the reading ages of the same articles using both methods I would have been more likely to get an accurate reading age as a result. However, as I did not have enough time to do this, I was only able to use one of the tests.
It could also be argued that the newspaper that provides less variation in the reading ages of articles provides an easier read as all articles are similar in vocabulary and style which appeals to a certain age group. I could investigate the hypothesis further by calculating the standard deviation of the reading ages of articles from a certain newspaper. The newspaper that has a lower standard deviation of reading ages would show less variation in the reading ages and so it could be argued that it gives an “easier read”. When investigating the hypothesis again I would have to be more specific as to what an “easier read” is.
Conclusions:
There is less variation in word length in articles from tabloid newspapers than in articles from broadsheet newspapers. This also suggests that tabloid newspapers provide an “easier read” than broadsheet newspapers, as less variation in word length would suggest a lower reading age (in terms of readability) with less variation in reading ages. Less variation in the reading ages of the articles provides an “easier read”, as all the articles are similar in vocabulary and style, which appeals to a certain age group. The tabloid newspapers also contain more pictures (more space is taken up by pictures) than broadsheet newspapers. Less text is presented on a page in a tabloid newspaper than in a broadsheet newspaper because broadsheet newspapers are written in more specific detail for a business-class audience.
It is justifiable to say that a simple method could be employed to improve the results of the investigations into all of the hypotheses. The accuracy and reliability of any conclusions would be helped if more data were collected from the newspapers.