- Taking the first/last word of every sentence.
- Taking the second/third … word in every line.
- Picking a word at random by closing your eyes and putting a finger on one.
- Using the random sampler on your calculator.
I have chosen to pick out every fourth word in each article because it is a simple method and, looking at my articles (next page), it should give me a good variety of words. My sample will consist of 50 words, as this is a suitable amount and will hopefully give me accurate results.
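The every-fourth-word method described above could be sketched in Python as follows. This is only an illustration: the function name `every_nth_word` and the example text are made up, and the text is not taken from either article.

```python
import re

def every_nth_word(text, n=4, sample_size=50):
    """Return every n-th word of the text, stopping at sample_size words."""
    # Treat runs of letters (and apostrophes) as words; this is an assumption
    # about what counts as a word, not a rule from the investigation.
    words = re.findall(r"[A-Za-z']+", text)
    # words[n - 1::n] picks the 4th, 8th, 12th, ... word when n is 4
    return words[n - 1::n][:sample_size]

# Made-up text, not from either newspaper
example_text = "one two three four five six seven eight nine ten eleven twelve"
sample = every_nth_word(example_text)
print(sample)  # ['four', 'eight', 'twelve']
```

With a real article, the text would need to be long enough (at least 200 words) to give the full 50-word sample.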
I will need to compare the mean word length for each article to see how long the words are on average. I will do this by counting the letters in each word, adding the counts together and then dividing the total by the number of words (50), which gives me the mean.
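As a small worked example of this calculation, the sketch below uses four made-up words rather than my real sample of 50. The function name `mean_word_length` is just for illustration, and it assumes that only letters are counted (not apostrophes or hyphens).

```python
def mean_word_length(words):
    """Mean number of letters per word in a list of words."""
    # Count letters only, so "don't" counts as 4 letters (an assumption)
    total_letters = sum(sum(1 for ch in word if ch.isalpha()) for word in words)
    return total_letters / len(words)

# Made-up words, not from either article
example_words = ["the", "market", "is", "open"]
# 3 + 6 + 2 + 4 = 15 letters, and 15 / 4 = 3.75
print(mean_word_length(example_words))  # 3.75
```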
I will also find the median and the inter-quartile range to see how spread out the data is in each article, and then compare them.
I will present my findings with a variety of statistical diagrams such as bar charts, cumulative frequency diagrams and box and whisker diagrams, which will show how the data varies between the two articles.
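The median and inter-quartile range can also be worked out directly from a list of word lengths. The sketch below uses made-up lengths and takes the quartiles as the medians of the lower and upper halves of the sorted data. This is one common convention, and values read off a cumulative frequency curve may differ slightly.

```python
from statistics import median

def quartile_summary(lengths):
    """Return (Q1, median, Q3, IQR) using the median-of-halves convention."""
    data = sorted(lengths)
    n = len(data)
    lower_half = data[: n // 2]        # values below the middle
    upper_half = data[(n + 1) // 2 :]  # values above the middle
    q1 = median(lower_half)
    q3 = median(upper_half)
    return q1, median(data), q3, q3 - q1

# Made-up word lengths, not my real results
example_lengths = [2, 3, 3, 4, 5, 6, 7, 9]
print(quartile_summary(example_lengths))  # (3.0, 4.5, 6.5, 3.5)
```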
These are the articles I have chosen:
Broadsheet – The Independent
Tabloid – Yorkshire Evening Post
After collecting every fourth word from the two articles, I can see from my table of words the range of word lengths found in each article.
Broadsheet words collected:-
Tabloid words collected:-
Having worked out the mean, I can say that on average my broadsheet article has a longer word length than the tabloid one. If we were to round the means to the nearest whole number, we’d get a difference of 1.
I will now present my data on a bar chart, as you can see on the next page. This will enable me to compare the word lengths of the two articles and work out a mode for each.
Broadsheet
Tabloid
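The frequencies behind a bar chart like these, and the mode, can be tallied as in the sketch below. The word lengths are made up for illustration and are not my real data.

```python
from collections import Counter

# Made-up word lengths, not from either article
example_lengths = [2, 3, 3, 4, 3, 5]

# Count how many words have each length
frequencies = Counter(example_lengths)

# The mode is the length with the highest frequency
mode = frequencies.most_common(1)[0][0]

# Print a simple text bar chart, one row per word length
for length in sorted(frequencies):
    print(length, "#" * frequencies[length])
print("Mode =", mode)  # Mode = 3
```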
I will now draw my cumulative frequency curves and work out the median and the inter-quartile range. This will enable me to draw a box and whisker diagram, which shows how the data is spread out.
Broadsheet
Median = 3.8
Tabloid
Median = 2.9
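The running totals plotted on a cumulative frequency curve can be built up as in this sketch. The word lengths are made up, and the variable names are only for illustration.

```python
from collections import Counter
from itertools import accumulate

# Made-up word lengths, not my real results
example_lengths = [2, 3, 3, 4, 3, 5]

frequencies = Counter(example_lengths)
lengths_in_order = sorted(frequencies)

# Running total of the frequencies, from the shortest length upwards
running_totals = list(accumulate(frequencies[length] for length in lengths_in_order))

for length, total in zip(lengths_in_order, running_totals):
    print(f"length <= {length}: {total} words")

# The median is read across from half of the final running total
print("Halfway point:", running_totals[-1] / 2)  # Halfway point: 3.0
```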
I have also marked the lowest and highest values, which show the range of word lengths in each article.