Maths Statistics Coursework

Introduction

The aim of this coursework is to find out which of two books I have selected is more complex, by comparing their sentence lengths and word lengths. I have chosen two children's books: ‘The Wizard of Oz’ and ‘The Patchwork Cat’.

To find out which book is more complex, I will use random sampling: first to select a page from the book, then to choose a sentence on that page and record the number of words in it, and finally to choose a word and record the number of letters in it. I will repeat this 50 times, giving 50 sentences and 50 words.

Sampling

For this investigation I will use random sampling. Random sampling is a method which gives each member of a group an equal chance of being chosen. In other words, the sample is selected at random, rather like picking numbers out of a hat. Today computers can be used to produce a random list of numbers, which is then used as the basis for selecting a sample. The main advantage of random sampling is that bias cannot be introduced when choosing the sample. However, it assumes that all members of the group are the same, which is not always the case. A small sample chosen in this way may not have the characteristics of the population, so a very large sample would have to be taken to make sure it was representative. It would be very costly and time-consuming for firms to draw up a list of the whole population and then contact and interview them.

One method sometimes used to reduce the time taken to select a random sample is to choose every tenth or twentieth name on a list. This is known as systematic sampling. It is, however, less random, because once the starting point is fixed every other choice is already decided.
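The difference between the two methods can be sketched in a few lines of Python. This is only an illustration: the function names, the list of 100 page numbers and the seed are made up for the example and are not part of the coursework.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Pick k different members, each with an equal chance of being chosen."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, step, seed=None):
    """Pick every step-th member, starting from a random position in the first step."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random starting point: 0 .. step-1
    return population[start::step]

# Hypothetical population: page numbers 1 to 100.
pages = list(range(1, 101))

random_pages = simple_random_sample(pages, 10, seed=1)
every_tenth = systematic_sample(pages, 10, seed=1)

print("Simple random sample:", sorted(random_pages))
print("Systematic sample:   ", every_tenth)
```

In the systematic sample only the starting page is random, so the chosen pages are always exactly ten apart.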

To take a random sample with a calculator, you need the random number function, which is marked on the keypad as:

                     SHIFT Ran#

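Ran# gives a random decimal between 0 and 1, so multiplying it by a total gives a random position within that total once the answer is rounded to a whole number. So, to get each sample, you would key in: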

SHIFT Ran# × number of pages

SHIFT Ran# × number of lines

SHIFT Ran# × number of words
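The same three key presses can be imitated on a computer. The sketch below is a rough equivalent, not the method actually used in the coursework: `random()` stands in for Ran#, the rounding rule (drop the decimal part and add 1, so the answer runs from 1 up to the total) is an assumption, and the page, line and word counts are invented for the example.

```python
import random

def pick(total, rng):
    """Imitate 'Ran# x total': scale a random decimal in [0, 1) to a whole number 1..total."""
    return int(rng.random() * total) + 1

def draw_location(num_pages, lines_per_page, words_per_line, rng):
    """Return a random (page, line, word) position, one random number for each."""
    page = pick(num_pages, rng)
    line = pick(lines_per_page, rng)
    word = pick(words_per_line, rng)
    return page, line, word

# Hypothetical book layout, not the real counts for either book.
NUM_PAGES = 150
LINES_PER_PAGE = 30
WORDS_PER_LINE = 12

rng = random.Random()
# Repeat 50 times, as in the sampling plan.
locations = [draw_location(NUM_PAGES, LINES_PER_PAGE, WORDS_PER_LINE, rng)
             for _ in range(50)]

for page, line, word in locations[:5]:
    print(f"page {page}, line {line}, word {word}")
```

In practice the number of lines on a page and the number of words on a line vary, so the totals would need to be counted afresh for each page and line that is chosen.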

Confidence intervals

A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data.

If independent samples are taken repeatedly from the same population, and a confidence interval is calculated for each sample, then a certain percentage (the confidence level) of the intervals will include the unknown population parameter. Confidence intervals are usually calculated so that this percentage is 95%, but we can also produce 90%, 99%, 99.9% (or any other level) confidence intervals for the unknown parameter.
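As an illustration, a confidence interval for a mean can be worked out as follows. This is a minimal sketch under stated assumptions: it uses the normal approximation (sample mean plus or minus z times the standard error), which is reasonable for a sample of about 50, and the sentence lengths listed are invented, not results from either book. The function name is also made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    centre = mean(sample)
    # z-value leaving (1 - level) / 2 in each tail, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(sample) / sqrt(n)
    return centre - half_width, centre + half_width

# Invented sentence lengths (words per sentence), for illustration only.
sentence_lengths = [12, 8, 15, 22, 9, 17, 11, 14, 20, 7,
                    13, 18, 10, 16, 25, 9, 12, 19, 14, 11]

low95, high95 = mean_confidence_interval(sentence_lengths, level=0.95)
low99, high99 = mean_confidence_interval(sentence_lengths, level=0.99)

print(f"95% interval: {low95:.2f} to {high95:.2f}")
print(f"99% interval: {low99:.2f} to {high99:.2f}")
```

Asking for a higher confidence level gives a wider interval from the same data.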

The width of the confidence interval gives us some idea about ...
