# My hypothesis is that the children's book will have a mean word length much shorter than that of the adult book.


Maths statistics Coursework

Introduction

I am currently a lower sixth student at college studying AS Mathematics, and I am required to produce a piece of statistical coursework. I will be doing S1 Task D: Authorship from the AQA syllabus.

Aim

I am required to collect data from two populations with a view to estimating population parameters, e.g. μ and σ. This should involve taking a random sample as well as calculating and comparing confidence intervals. I will investigate whether it is possible to gain information about authorship of a text using statistical measures.

There are two types of data, qualitative and quantitative. Qualitative data is data such as colour of eyes, make of car, etc. Quantitative data is numerical data, and can be separated into discrete and continuous data. Discrete data involves counting and is data such as the number of people in a family, marks in an exam, etc. Continuous data involves measuring and is data that takes a range of values, such as height or the speed of cars. For my investigation I will compare the word lengths (how many letters there are in a word) within two populations of book types, one being a children's book and the other a book for adult or more mature readers. Therefore the data that I collect will be discrete quantitative data. When I have collected my data I will examine it and use various statistical methods such as arithmetic means.
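As an illustration of how word lengths could be collected and sampled, here is a minimal Python sketch. The function names `word_lengths` and `random_sample`, the rule that a word is a run of letters with optional apostrophes, and the short passage are all assumptions made for this example; they are not the method or the data used in this coursework.

```python
import random
import re


def word_lengths(text):
    """Return the number of letters in each word of `text`."""
    # Assumption: a word is a run of letters, possibly with internal
    # apostrophes (so "children's" is one word); only letters are counted.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    return [sum(ch.isalpha() for ch in word) for word in words]


def random_sample(values, size, seed=None):
    """Draw a simple random sample of `size` values without replacement."""
    rng = random.Random(seed)
    return rng.sample(values, size)


# Hypothetical passage, used only to show the mechanics.
passage = "The cat sat on the mat while the children's teacher read aloud."
lengths = word_lengths(passage)
sample = random_sample(lengths, size=5, seed=1)
```

Because each result is a whole number of letters, the values in `lengths` are discrete quantitative data, as described above.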

Hypothesis

My hypothesis is that the children's book will have a mean word length much shorter than that of the adult book. I expect to find this in my results because the adult book will use longer and more complex words, as these will suit the more mature audience.


| Var. | Mean | Standard deviation | Variance |
| --- | --- | --- | --- |
| Sample | x̄ | s | s² |
| Population | μ | σ | σ² |

Mean

$$\bar{x} = \frac{\sum x}{n}$$

The mathematical average of a range of numbers (calculated by dividing the sum total of all the items in the range by the total number of items in the range).

Variance

$$s^2 = \frac{\sum x^2}{n} - \bar{x}^2$$

A measure of the dispersion of a set of data points around their mean value: the mathematical expectation of the squared deviations from the mean. The square root of the variance is the standard deviation.
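The two definitions above can be sketched in Python as follows. This is only an illustration: the function names are my own, and the list `example` contains made-up word lengths rather than data from either book. The variance uses the same form as the formula above, the mean of the squares minus the square of the mean, with divisor n.

```python
import math


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def variance(values):
    """Variance as the mean of the squares minus the square of the mean."""
    x_bar = mean(values)
    return sum(x * x for x in values) / len(values) - x_bar ** 2


def standard_deviation(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))


# Hypothetical word lengths, not taken from either book.
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(example), variance(example), standard_deviation(example))
```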

With my sample I am now able to calculate the mean and standard deviation for both books. As described above, the mean is simply the sum of the word lengths divided by the size of the sample:

Adult book

$$\bar{x} = \frac{282}{50} = 5.64$$

Children's book

$$\bar{x} = \frac{245}{50} = 4.90$$

Using these means I can now calculate the standard deviation of each sample. As described above, the variance measures the dispersion of the values around the mean, and the standard deviation is simply the square root of the variance.

Adult book

$$s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2 = \frac{1874}{50} - 5.64^2 = 5.67, \qquad s = 2.38$$

Children's book

$$s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2 = \frac{1463}{50} - 4.90^2 = 5.25, \qquad s = 2.29$$
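The hand calculations above can be checked from the quoted totals (n = 50, Σx and Σfx²) with a short Python sketch. As a possible next step towards the confidence intervals mentioned in the aim, it also forms an approximate 95% interval x̄ ± 1.96·s/√n for each book. The interval rests on assumptions that are mine and not stated in the working: that a sample of 50 is large enough for a normal approximation, that z = 1.96 is the chosen level, and that the sample standard deviation (with divisor n, as above) can stand in for σ. The function names are hypothetical.

```python
import math


def summary_from_totals(n, sum_x, sum_x2):
    """Mean, variance and standard deviation from n, Σx and Σx²."""
    x_bar = sum_x / n
    var = sum_x2 / n - x_bar ** 2
    return x_bar, var, math.sqrt(var)


def approx_confidence_interval(x_bar, s, n, z=1.96):
    """Approximate interval x̄ ± z·s/√n (z = 1.96 assumes a 95% level)."""
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Totals quoted in the working above.
adult_mean, adult_var, adult_s = summary_from_totals(50, 282, 1874)
child_mean, child_var, child_s = summary_from_totals(50, 245, 1463)

# Assumption: the sample standard deviation is used in place of σ.
adult_ci = approx_confidence_interval(adult_mean, adult_s, 50)
child_ci = approx_confidence_interval(child_mean, child_s, 50)

# Two intervals overlap if each one starts before the other ends.
intervals_overlap = adult_ci[0] <= child_ci[1] and child_ci[0] <= adult_ci[1]

print(f"Adult: mean={adult_mean:.2f}, variance={adult_var:.2f}, s={adult_s:.2f}")
print(f"Child: mean={child_mean:.2f}, variance={child_var:.2f}, s={child_s:.2f}")
print(f"Adult interval: ({adult_ci[0]:.2f}, {adult_ci[1]:.2f})")
print(f"Child interval: ({child_ci[0]:.2f}, {child_ci[1]:.2f})")
```

Comparing the two intervals, for example through `intervals_overlap`, is one way of judging how far apart the two population means are likely to be.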

Now that I have calculated the means and standard deviations by hand, I will use formulae in a spreadsheet package to double-check their accuracy. The spreadsheet shows the organized data. To add up all of the values in a given column I simply typed “SUM(C1:C12)” into a cell, and the computer automatically added up all of the values from cell C1 to C12. As you can see from the spreadsheet, the formula used to calculate the mean is the total of the “fx” column divided by the total of the “f” column. For the variance I divided the sum of the “fx²” column by the total of the “f” column and then subtracted the square of the mean. I have given my results correct to two decimal places, rather than just one, for improved accuracy.
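The spreadsheet working described above can be mirrored in Python. The frequency table below is hypothetical and only shows the mechanics of the “f”, “fx” and “fx²” columns; it is not the data from either book, and the variable names are my own.

```python
import math

# Hypothetical frequency table: word length x -> frequency f.
# These counts are illustrative only, not the data from either book.
frequencies = {1: 2, 2: 5, 3: 8, 4: 10, 5: 6, 6: 4, 7: 3, 8: 2}

# Column totals, like SUM(...) over each spreadsheet column.
total_f = sum(frequencies.values())                         # Σf
total_fx = sum(x * f for x, f in frequencies.items())       # Σfx
total_fx2 = sum(x * x * f for x, f in frequencies.items())  # Σfx²

mean = total_fx / total_f                   # Σfx / Σf
variance = total_fx2 / total_f - mean ** 2  # Σfx² / Σf − mean²
std_dev = math.sqrt(variance)

print(f"mean={mean:.2f}, variance={variance:.2f}, s={std_dev:.2f}")
```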

Conclusion

Would the length of words in an adult book and a children's book be a good indication of authorship? If I were to extend my statistical investigation, I would investigate whether there is a stronger relationship between the number of words in a sentence in an adult book and in a children's book. I would expect to find that the adult book has a larger mean sentence length than the children's book, considering the attention span of children and the consistent use of pictures in a children's book.

Another way of extending this investigation, to see whether it is possible to gain information about authorship of a text using statistical measures, would be to compare newspapers. I would investigate whether a story in a broadsheet newspaper (The Independent) has more columns than a story in a tabloid (The Sun). This is because it is often said that a more intelligent, more mature reader tends to read a broadsheet as opposed to a tabloid, and as the audiences are different, maybe the lengths of the stories (numbers of columns) are different too.

Research has shown that J.K. Rowling, the author of the collection of books called “Harry Potter”, has sold millions of copies of her books. As extra work I could investigate why this is. I could compare her word lengths with those of a less successful author, perhaps S.B. Chapman, who infamously wrote “Fog”. I could then see whether it is possible to gain information about authorship of a text using statistical measures. Maybe J.K. Rowling's books have more words in a sentence than those of the less famous author.
