Data Handling Project


Timothy Howard – Maths Coursework


I have investigated the word length of newspapers in order to draw a conclusion about the type of reader. My investigation centres on the hypothesis, “More intelligent people read broadsheets.” This statement is stereotypical; however, before it can be tested, another hypothesis has to be tested first: “Broadsheet newspapers have longer words than tabloid newspapers.” What I want to find out is the length of words in a broadsheet newspaper and the length of words in a tabloid newspaper. For the hypothesis to be proved or disproved, results and data are required, which must be collected and presented to show whether the hypothesis is correct. I have also used two other hypotheses: “The news section of every newspaper will have longer words than the other sections of that newspaper,” and “Broadsheets give a higher proportion of the newspaper to news articles than tabloid newspapers.” These hypotheses will all contribute to the investigation.

Firstly, the investigation requires newspapers, as the data is collected from them. The newspapers need to differ: there is no point in an investigation with two similar tabloid newspapers, as the results would be expected to be too similar. I have therefore chosen newspapers which are different. My broadsheet is “The Herald Tribune”, a traditional, long-established broadsheet newspaper. My tabloid is “The Sun”, which is very popular and one of the first tabloid newspapers. I have also decided to include a third newspaper, as it could act as a control between the other two; it would make my investigation more balanced and therefore more accurate. But which newspaper should I choose? I could choose another tabloid, to compare different tabloids, or another broadsheet. Both ideas are useful, but neither is fair to the investigation, as its purpose is to compare a broadsheet with a tabloid. I decided to use “The Daily Mail”, which would describe itself as a quality tabloid. It is set out as a tabloid, with A3-size pages, and has no separate sections as a broadsheet does; however, unlike red-top tabloids (e.g. The Sun), it would be expected to have longer words.

The investigation samples word length in the different newspapers. To do this fairly, the sample must be the same size for each newspaper, as it would be unfair to sample only 10 words in one newspaper and 1000 words in another. The sample also needs to be large enough to be accurate, so I decided to sample 100 words per newspaper. However, I then considered the variation in word lengths within one newspaper: a sport section, for example, covers different topics from the news or business section. I therefore decided to separate each newspaper, page by page, into different sections under the headings:

  • News
  • Entertainment
  • Business
  • Sport
  • Other (Advertisements)

These sections will later allow me to analyse word lengths in particular sections and to compare them with those of the other newspapers. This is necessary to prove or disprove my hypothesis, “The news section of every newspaper will have longer words than the other sections of that newspaper.”
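The measurement behind these comparisons, the length of each sampled word and the mean for a section, could be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and it assumes that punctuation attached to a word (such as a trailing comma) is not counted as part of its length, which is a choice the investigation itself does not state.

```python
import string
from statistics import mean


def word_length(word):
    """Count the characters in a word, ignoring punctuation at either end.

    Assumption: leading/trailing punctuation is not part of the word.
    """
    return len(word.strip(string.punctuation))


def mean_word_length(words):
    """Return the mean length of a list of sampled words."""
    return mean(word_length(w) for w in words)


# Hypothetical sample of words, used only to show the calculation.
sample = ["The", "government", "announced,", "new", "plans."]
print(mean_word_length(sample))
```

The same function can be applied to the sample from each section, so that the news section of one newspaper can be compared with its other sections and with the news section of another newspaper.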

As I have chosen to sample 100 words per newspaper, the words must not be sampled in a biased way. I cannot just choose the first few words in each article, or the first word on each line, as this practice could distort the results because the words would not be randomly selected. To keep the investigation unbiased, I must select the data randomly, which I will do using my calculator's “RANDOM” function. It is a simple method of ensuring unbiased data collection. To obtain random numbers you have to:

  1. Enter the number you wish the data to be selected from. So if you entered “50”, the results would be anything from 0 to 50.
  2. Press the “2nd F” button. (Note: this may be different on other calculators.)
  3. Press the “RANDOM” button, which is the same button as “7” but has a different function once “2nd F” has been pressed.
  4. Then press “=” to obtain a randomly selected number between 0 and whatever number you entered.
  5. You can keep pressing the “=” button to get more random numbers in the same range.
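
The calculator steps above could be mirrored in code. The following is a minimal sketch in Python, not the method used in the investigation: it assumes the words on a page have been numbered from 1 upwards, and that no word should be picked twice (the calculator may repeat a number, in which case it would simply be pressed again). The function name and the optional seed parameter are hypothetical.

```python
import random


def pick_word_positions(total_words, sample_size, seed=None):
    """Pick distinct random word positions between 1 and total_words.

    Assumption: words are numbered 1..total_words and each may be
    chosen at most once. A seed can be given to repeat the same draw.
    """
    rng = random.Random(seed)
    positions = rng.sample(range(1, total_words + 1), sample_size)
    return sorted(positions)


# Example: choose 5 of the 50 words on a page.
print(pick_word_positions(50, 5))
```

Each number returned is the position of a word to measure, so counting along the page to that position plays the same role as reading the number off the calculator display.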

Before I collect the data from the newspapers, I must work out how many words I should collect from each section, since from my planning I know that I need to collect data from separate sections. I first counted how many pages there are in the newspaper altogether; for the Herald Tribune there were 18.
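
One plausible way to turn page counts into a number of words per section is to share the 100 words out in proportion to the pages each section occupies. The sketch below, in Python, shows that arithmetic under stated assumptions: only the total of 18 pages comes from the investigation, while the page count for each section is hypothetical and chosen purely so that the total is 18. The function name is also hypothetical. Leftover words after rounding down go to the sections with the largest fractional shares, so the total is always exactly 100.

```python
def allocate_sample(pages_per_section, sample_size=100):
    """Share sample_size words between sections in proportion to pages."""
    total_pages = sum(pages_per_section.values())
    # Exact (fractional) share of the sample for each section.
    exact = {s: sample_size * p / total_pages
             for s, p in pages_per_section.items()}
    # Round every share down first.
    allocation = {s: int(share) for s, share in exact.items()}
    # Hand out the remaining words by largest fractional part.
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - allocation[s],
                          reverse=True)
    for section in by_remainder[:leftover]:
        allocation[section] += 1
    return allocation


# Hypothetical page counts summing to the 18 pages of the Herald Tribune.
pages = {"News": 8, "Entertainment": 2, "Business": 4, "Sport": 3, "Other": 1}
print(allocate_sample(pages))
# {'News': 44, 'Entertainment': 11, 'Business': 22, 'Sport': 17, 'Other': 6}
```

For example, a section covering 8 of the 18 pages has an exact share of 100 × 8 ÷ 18 ≈ 44.4 words, which becomes 44 words in the sample.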

        

...
