Data Handling

Maths Coursework:

Data Handling Task

Task:

My task is to use statistical techniques to analyse the differences between different types of print media; for this I have chosen a tabloid newspaper (The Sun) and a broadsheet newspaper (The Times).

I have chosen these two types because of the differences between them. For example, they have different audiences: The Times is a more informative newspaper than The Sun, which is more for entertainment. I also think that The Times will have more complex language and content than The Sun. Of these differences, I think I will be able to measure how complex the language is by measuring the length of the words and sentences in both newspapers.

Hypothesis:

I think that the average word length will be greater in a broadsheet paper than in a tabloid. I also think that the average sentence length will be greater in a broadsheet than in a tabloid. I think this is true because broadsheet newspapers are aimed at an audience of intellectuals and provide more informative news articles, whereas tabloids are aimed more at entertaining people and are similar to a magazine, full of gossip and shorter, briefer articles.

Brief Pilot:

I will begin with a brief pilot study so that I can identify any problems that may occur when I collect my actual data. I will collect a small amount of data from a short article in any paper, so that if problems do occur I will know how to counter them.

My brief pilot data:

Problems that occurred were:

Apostrophes, e.g. Emily’s, It’s
Hyphens
Abbreviations
Numbers written as digits, e.g. 38
Times, e.g. 7.00pm
Headlines
Sub-headings
Acronyms e.g. NATO

How I will counter these problems:

If I come across an apostrophe I will ignore it and count the word as a single word, e.g. can’t counts as one four-letter word.
If I come across a hyphen I will ignore it and count the hyphenated term as one word.
Abbreviations I will ignore completely.
If I come across numbers e.g. 38 I will not count it at all.
If I come across times e.g. 8.00pm I will not count it at all.
Headlines I will not count.
Sub-headings I will not count.
Acronyms I will not count.
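
To apply these rules in the same way every time, the counting could be done with a short program. The sketch below is only an illustration and not part of my original method. The function name word_lengths is made up, and it assumes that an acronym is any token of two or more letters written entirely in capitals and that anything containing a digit is a number or a time. Headlines and sub-headings would still have to be left out by hand before the text is passed in. Tokens with full stops inside them (such as e.g.) are skipped, but other abbreviations would need removing by hand too.

```python
import re

# Punctuation that can surround a word but is not part of it.
EDGE_PUNCTUATION = ".,;:!?\"'()[]“”‘’"

def word_lengths(text):
    """Return the letter count of every word kept by the counting rules."""
    lengths = []
    for token in text.split():
        token = token.strip(EDGE_PUNCTUATION)
        # Numbers and times (assumed: any token containing a digit) are not counted.
        if any(ch.isdigit() for ch in token):
            continue
        # Acronyms (assumed: two or more letters, all capitals) are not counted.
        if len(token) >= 2 and token.isupper():
            continue
        # Apostrophes and hyphens are ignored, so the token stays one word.
        letters = re.sub(r"['’\-]", "", token)
        # Anything left that is not purely letters (e.g. "e.g") is skipped.
        if letters.isalpha():
            lengths.append(len(letters))
    return lengths
```

With this sketch, can’t is counted as one four-letter word and well-known as one nine-letter word, while 38, 7.00pm and NATO are not counted at all.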

Plan:

I will collect my data from a broadsheet paper, The Times, and from a tabloid paper, The Sun. I am going to select a sample of 200 words and 200 sentences from each paper, as I can’t count the number of words and sentences in the entire newspapers.

My samples will be unbiased because I am taking samples of quite a large size, and they will be representative of both newspapers because, when I collect my samples for the number of words per sentence, I will select 50 sentences from each of 4 corresponding sections of the papers (e.g. articles of similar relevance).

When I am collecting samples for the sentence length I will select 50 sentences from each of 4 corresponding sections of the papers, so that my data is representative of each newspaper as well as being unbiased.
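
My plan does not say how the 50 sentences are picked within each section. One possible way, shown below purely as an assumption, is to choose them at random so that I cannot favour long or short sentences. The function name sample_sentences, its parameters and the fixed seed are all made up for this sketch.

```python
import random

def sample_sentences(sections, per_section=50, seed=0):
    """Pick `per_section` sentences at random from each section.

    `sections` is a list of lists, one list of sentences per section.
    Random selection is an assumption; the plan only fixes the sample sizes.
    """
    rng = random.Random(seed)  # fixed seed so the same selection can be repeated
    sample = []
    for sentences in sections:
        # A section with fewer sentences than requested contributes all of them.
        k = min(per_section, len(sentences))
        sample.extend(rng.sample(sentences, k))
    return sample
```

Calling this with the 4 sections of one newspaper would give up to 200 sentences, 50 from each section, with no sentence chosen twice.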

I will record my data in tally charts, using grouped frequencies for the sentence lengths, and I will then calculate the mean for both the sentence length and the word length. I will put my data for word length into a bar graph and my grouped data for sentence length into a histogram so that I can compare the distributions. I will also draw cumulative frequency diagrams to find the median and interquartile range (IQR), and use these to draw box plots. This will show me the average word and sentence length and how varied the results are, so I will be able to compare the distributions easily.
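
The calculations in this plan can be sketched in code as a check on the hand-drawn diagrams. This is only a sketch under assumptions: the function names are made up, the mean is estimated from group midpoints, and the median and quartiles are estimated by reading along straight lines between the points of the cumulative frequency curve. The example groups are invented numbers for illustration and are not data from either newspaper.

```python
def grouped_mean(groups):
    """Estimate the mean from (lower, upper, frequency) groups using midpoints."""
    total = sum(f for _, _, f in groups)
    return sum((lo + hi) / 2 * f for lo, hi, f in groups) / total

def grouped_quantile(groups, q):
    """Estimate a quantile (0 < q < 1) by linear interpolation along
    the cumulative frequency curve."""
    total = sum(f for _, _, f in groups)
    target = q * total  # the cumulative frequency to read across from
    cumulative = 0
    for lo, hi, f in groups:
        if f > 0 and cumulative + f >= target:
            # Read off the value part-way through this group.
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    return groups[-1][1]

# Invented example: sentence lengths grouped as 0-10, 10-20 and 20-30 words.
example_groups = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
mean = grouped_mean(example_groups)
median = grouped_quantile(example_groups, 0.5)
iqr = grouped_quantile(example_groups, 0.75) - grouped_quantile(example_groups, 0.25)
```

The median and the two quartiles found this way are the same values a box plot is drawn from, so they could also be copied into the summary table.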

Finally I will use a summary table so that all my statistics can be shown together and can be compared.
