
Maths statistics Coursework


I am currently a lower sixth student at college studying A/S mathematics and I am required to produce a piece of statistical coursework. I will be doing S1 Task D: Authorship from the AQA syllabus.  


I am required to collect data from two populations with a view to estimating population parameters, e.g. μ and σ. This should involve taking a random sample as well as calculating and comparing confidence intervals. I will investigate whether it is possible to gain information about authorship of a text using statistical measures.

There are two types of data, qualitative and quantitative. Qualitative data is non-numerical data such as colour of eyes, make of car, etc. Quantitative data is numerical data, and can be separated into discrete and continuous data. Discrete data involves counting and is data such as the number of people in a family, marks in an exam, etc. Continuous data involves measuring and is data that can take any value within a range, such as height or the speed of cars. For my investigation I will compare the word lengths (how many letters in a word) within two populations of book types, one being a children's book and the other a book for adult/more mature readers. Therefore the data that I collect will be discrete quantitative data. When I have collected my data I will examine it and use various statistical methods such as arithmetic means.
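As an illustration of why word length is discrete data, the counting step could be sketched in Python as below. This is only a sketch under stated assumptions: the function name `word_lengths`, the rule that only letters are counted (so an apostrophe is ignored), and the sample sentence are all hypothetical and are not taken from either book.

```python
import re

def word_lengths(text):
    """Return the number of letters in each word of the text (hypothetical helper)."""
    # Assumption: a word is a run of letters, optionally with internal apostrophes
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    # Count letters only, so the apostrophe in "children's" is not counted
    return [sum(ch.isalpha() for ch in word) for word in words]

# Made-up sentence, used only to show the counting rule
sample = "The cat sat on the children's mat."
lengths = word_lengths(sample)
print(lengths)
```

Every value produced is a whole number obtained by counting, which is what makes the data discrete rather than continuous.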


My hypothesis is that the children's book will have a mean word length much shorter than that of the adult book. I expect to find this in my results because the adult book will use longer and more complex words, as this will suit the more mature audience.

[...]


Mean: $\bar{x} = \frac{\sum x}{n}$

The mathematical average of a range of numbers (calculated by dividing the sum total of all the items in the range by the total number of items in the range).

Variance: $s^2 = \frac{\sum x^2}{n} - \bar{x}^2$

A measure of dispersion of a set of data points around their mean value: the mean of the squared deviations from the mean.

St. dev: $s = \sqrt{s^2}$

The square root of the variance is the standard deviation.

With my sample I am now able to calculate the mean and standard deviation of both categories. As described above, the mean is simply the sum of the word lengths divided by the size of the sample:

Adult book: $\bar{x} = \frac{282}{50} = 5.64$

Children's book: $\bar{x} = \frac{245}{50} = 4.90$


Using these means I can now calculate the standard deviation of each sample. As described above, the variance is a measurement used to show the dispersion of the values around the mean. The standard deviation is simply the square root of the variance.

Adult book

$s^2 = \frac{\sum fx^2}{n} - \bar{x}^2 = \frac{1874}{50} - 5.64^2 = 5.67$, so $s = 2.38$


Children's book

$s^2 = \frac{\sum fx^2}{n} - \bar{x}^2 = \frac{1463}{50} - 4.90^2 = 5.25$, so $s = 2.29$
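The hand calculations above can be re-checked with a short Python sketch. It uses only the totals quoted in the text (282 and 1874 for the adult book, 245 and 1463 for the children's book, with a sample size of 50 each). The function name `summary_from_totals` is hypothetical, and the sketch assumes the same divide-by-n variance formula that the coursework uses, not the n − 1 version.

```python
from math import sqrt

def summary_from_totals(sum_fx, sum_fx2, n):
    """Mean, variance and standard deviation from column totals (divide-by-n formula)."""
    mean = sum_fx / n                    # total of fx divided by sample size
    variance = sum_fx2 / n - mean ** 2   # mean of the squares minus square of the mean
    return mean, variance, sqrt(variance)

# Totals as quoted in the coursework, sample size 50 for each book
adult = summary_from_totals(282, 1874, 50)
child = summary_from_totals(245, 1463, 50)

for label, (mean, variance, sd) in (("Adult", adult), ("Children's", child)):
    print(f"{label}: mean={mean:.2f}, variance={variance:.2f}, sd={sd:.2f}")
```

Rounded to two decimal places, the values agree with the figures worked out by hand.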


Now that I have calculated the means and standard deviations by hand, I will use formulae in a spreadsheet package to double-check their accuracy. The spreadsheet shows the organized data. To add up all of the values in any given column I simply typed “SUM(C1:C12)” into a cell, and the computer automatically added up all of the values from cell C1 to C12. As you can see from the spreadsheet, the formula used to calculate the mean is the total of the “fx” column divided by the total of the “f” column. For the variance I divided the total of the “fx²” column by the total of the “f” column and then subtracted the mean squared. I have given my results correct to two decimal places, rather than just one, for improved accuracy.
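The spreadsheet layout described here (an “f” column, an “fx” column and an “fx²” column, each summed) could be mirrored in Python roughly as follows. This is a sketch only: the function name `frequency_table_stats` is hypothetical, and the small frequency table is made up for illustration, since the actual table is not reproduced in this extract.

```python
from math import sqrt

def frequency_table_stats(table):
    """table maps each word length x to its frequency f (hypothetical helper)."""
    total_f = sum(table.values())                         # total of the "f" column
    total_fx = sum(x * f for x, f in table.items())       # total of the "fx" column
    total_fx2 = sum(x * x * f for x, f in table.items())  # total of the "fx²" column
    mean = total_fx / total_f
    variance = total_fx2 / total_f - mean ** 2            # divide-by-n formula, as in the text
    return mean, variance, sqrt(variance)

# Made-up table (NOT the coursework data): word length -> frequency
example = {1: 2, 2: 3, 3: 5}
mean, variance, sd = frequency_table_stats(example)
print(f"mean={mean:.2f}, variance={variance:.2f}, sd={sd:.2f}")
```

Each column total plays the same role as a SUM cell in the spreadsheet, so the two methods should agree on any table they are both given.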

[...]


Would the difference in word length between an adult book and a children's book be a good indication of authorship? If I were to extend my statistical investigation, I would investigate whether there is a stronger relationship between the numbers of words in a sentence in an adult book and in a children's book. I would expect to find that the adult book has a larger mean sentence length than the children's book, considering the attention span of children and the consistent use of pictures in a children's book.

Another way of extending this investigation, to see whether it is possible to gain information about authorship of a text using statistical measures, would be to investigate whether there are more columns in a story from a broadsheet newspaper (The Independent) than in one from a tabloid (The Sun). This is because it is proven that a more intelligent, more mature reader tends to read the broadsheet as opposed to the tabloid, and as the audiences are different, maybe the lengths of stories (numbers of columns) are different too.

There has been research carried out that shows J.K. Rowling, the author of the “Harry Potter” series of books, has sold millions of copies of her books. As extra work I could investigate why this is. I could compare the word lengths with those of a less successful author, perhaps S.B. Chapman, who infamously wrote “Fog”, to see if it is possible to gain information about authorship of a text using statistical measures. Maybe J.K. Rowling's books have more words in a sentence than those of the less famous author.
