


Maths statistics Coursework


I am currently a lower sixth student at college studying A/S mathematics and I am required to produce a piece of statistical coursework. I will be doing S1 Task D: Authorship from the AQA syllabus.  


I am required to collect data from two populations with a view to estimating population parameters, e.g. μ and σ. This should involve taking a random sample as well as calculating and comparing confidence intervals. I will investigate whether it is possible to gain information about the authorship of a text using statistical measures.
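
As a rough illustration of what calculating and comparing confidence intervals could look like, here is a minimal Python sketch. It assumes a 95% confidence level (z = 1.96) and a sample large enough for the normal approximation to be reasonable. The function names are hypothetical and are not part of the coursework.

```python
import math


def confidence_interval(mean, sd, n, z=1.96):
    """Approximate confidence interval for a population mean."""
    # Assumption: the sample is large enough for the normal approximation,
    # and z = 1.96 corresponds to a 95% confidence level.
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


def intervals_overlap(a, b):
    """Return True if two (low, high) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]
```

If the intervals for the two books do not overlap, that would suggest the two population means really are different.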

There are two types of data: qualitative and quantitative. Qualitative data is descriptive, such as eye colour or make of car. Quantitative data is numerical and can be separated into discrete and continuous data. Discrete data involves counting, such as the number of people in a family or marks in an exam. Continuous data involves measuring and can take any value within a range, such as height or the speed of cars. For my investigation I will compare the word lengths (how many letters are in a word) within two populations of book types, one being a children's book and the other a book for adult or more mature readers. The data that I collect will therefore be discrete quantitative data. When I have collected my data I will examine it using various statistical methods, such as the arithmetic mean.
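
The data collection described above could be sketched in Python as follows. This is only an illustration: it assumes that a word is a run of letters (with any apostrophe not counted as a letter), and the function names and the optional seed are hypothetical choices rather than anything specified in the task.

```python
import random
import re


def word_lengths(text):
    """Return the number of letters in each word of `text`, in order."""
    # Assumption: a word is a run of letters, optionally with internal
    # apostrophes; the apostrophe itself is not counted as a letter.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    return [sum(ch.isalpha() for ch in word) for word in words]


def random_sample(lengths, size, seed=None):
    """Draw `size` word lengths at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(lengths, size)
```

Because every value returned is a whole-number count of letters, the data is discrete rather than continuous.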


My hypothesis is that the children's book will have a mean word length much shorter than that of the adult book. I expect to find this in my results because the adult book will use longer and more complex words, as this will suit the more mature audience.



Mean

$\bar{x} = \frac{\sum x}{n}$

The mathematical average of a range of numbers (calculated by dividing the sum total of all the items in the range by the total number of items in the range).

Variance and standard deviation (St. dev)

$s^2 = \frac{\sum x^2}{n} - \bar{x}^2$

A measure of dispersion of a set of data points around their mean value. The mathematical expectation of the squared deviations from the mean. The square root of the variance is the standard deviation.

With my sample I am now able to calculate the mean and standard deviation of both categories. As described above, the mean is simply the sum of the word lengths divided by the size of the sample:

Adult book: $\bar{x} = \frac{282}{50} = 5.64$

Children's book: $\bar{x} = \frac{245}{50} = 4.90$


Using these means I can now calculate the standard deviation of each sample. As described above, the variance is a measurement used to show the dispersal of numbers around the mean, and the standard deviation is simply the square root of the variance.

Adult book

$s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2 = \frac{1874}{50} - 5.64^2 = 5.67$, so $s = 2.38$


Children's book

$s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2 = \frac{1463}{50} - 4.90^2 = 5.25$, so $s = 2.29$
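
The hand calculations above can be re-checked with a short Python sketch. It uses only the totals quoted in the working (Σfx, Σfx² and a sample size of 50). The function name `summary_stats` is a hypothetical choice for illustration.

```python
import math


def summary_stats(sum_fx, sum_fx2, n):
    """Mean, variance and standard deviation from the summary totals."""
    mean = sum_fx / n
    # Variance as the mean of the squares minus the square of the mean.
    variance = sum_fx2 / n - mean ** 2
    return mean, variance, math.sqrt(variance)


# Totals taken from the working above; both samples contain 50 words.
for label, sum_fx, sum_fx2 in [("Adult", 282, 1874), ("Children's", 245, 1463)]:
    mean, variance, sd = summary_stats(sum_fx, sum_fx2, 50)
    print(f"{label}: mean={mean:.2f} variance={variance:.2f} sd={sd:.2f}")
```

To two decimal places this agrees with the figures worked out by hand for both books.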


Now that I have calculated the mean and standard deviations by hand, I will use formulae in a spreadsheet package to double-check their accuracy. The spreadsheet shows the organized data. To add up all of the values in any given column I simply typed “SUM(C1:C12)” into a cell, and the computer automatically adds up all of the values from cell C1 to C12. As you can see from the spreadsheet, the formula used to calculate the mean is the total of the “fx” column divided by the total of the “f” column. For the variance I divided the sum of the “fx²” column by the total of the “f” column and then subtracted the mean². I have given my results correct to two decimal places, rather than just one, for improved accuracy.
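
The same column totals could also be produced outside a spreadsheet. The Python sketch below mirrors the “f”, “fx” and “fx²” columns. The frequency table in it is hypothetical and for illustration only (it is not the coursework data), and the function name is an assumption.

```python
import math


def frequency_table_stats(table):
    """Mean, variance and standard deviation from {word length: frequency}."""
    total_f = sum(table.values())                         # total of the "f" column
    total_fx = sum(x * f for x, f in table.items())       # total of the "fx" column
    total_fx2 = sum(x * x * f for x, f in table.items())  # total of the "fx²" column
    mean = total_fx / total_f
    variance = total_fx2 / total_f - mean ** 2
    return mean, variance, math.sqrt(variance)


# Hypothetical frequency table for illustration only; not the coursework data.
example = {1: 2, 2: 3, 3: 5}
mean, variance, sd = frequency_table_stats(example)
```

Each total corresponds to one “SUM” cell in the spreadsheet, so the two approaches should give the same answers for the same table.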



Would the length of words in an adult book and a children's book be a good indication of authorship? If I were to extend my statistical investigation, I would investigate whether there is a stronger relationship between the number of words in a sentence of an adult book and of a children's book. I would expect to find that the adult book has a larger mean sentence length than the children's book, considering the attention span of children and the consistent use of pictures in a children's book.
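
A possible starting point for this extension is sketched below in Python. It assumes that a sentence ends at a full stop, question mark or exclamation mark, which is a simplification (abbreviations such as “e.g.” would be split wrongly). The function names are hypothetical.

```python
import re


def sentence_lengths(text):
    """Return the number of words in each sentence of `text`."""
    lengths = []
    # Assumption: sentences end at '.', '?' or '!'.
    for part in re.split(r"[.!?]+", text):
        words = re.findall(r"[A-Za-z']+", part)
        if words:
            lengths.append(len(words))
    return lengths


def mean_sentence_length(text):
    """Mean number of words per sentence, or 0.0 if there are no sentences."""
    lengths = sentence_lengths(text)
    return sum(lengths) / len(lengths) if lengths else 0.0
```

The mean sentence lengths of the two books could then be compared in the same way as the mean word lengths.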

Another way of extending this investigation, to see whether it is possible to gain information about the authorship of a text using statistical measures, would be to investigate whether there are more columns in a story from a broadsheet newspaper (The Independent) than in one from a tabloid (The Sun). This is because it is often said that a more intelligent, more mature reader tends to read the broadsheet as opposed to the tabloid, and as the audiences are different, the length of the stories (number of columns) may be different.

Research has shown that J.K Rowling, the author of the “Harry Potter” series of books, has sold millions of copies of her books. As extra work I could investigate why this is. I could compare her word lengths with those of a less successful author, for example S.B Chapman, who infamously wrote “Fog”. I could see whether it is possible to gain information about the authorship of a text using statistical measures. Maybe J.K Rowling's books have more words in a sentence than those of the less famous author.
