# The aim of this investigation is to gain statistical information to show authorship of a text.

Introduction

AS Mathematics (AQA): Statistics Coursework

DESIGN

Introduction: The aim of this investigation is to gain statistical information to show authorship of a text. I will use two pieces of text to investigate authorship. For the investigation to be valid, the two texts should differ in theme. By theme, I mean they should differ in a broad way, for example in genre or in the age of their intended readers. I had a number of texts to compare, but I decided to use one adult text and one children's text, as this should give a more obvious variation and a clearer expectation.

For both populations I will calculate the mean of the distribution, then the variance and standard deviation, using the unbiased estimator in each case. I will also calculate the standard error and confidence intervals for both populations. My data will be presented in frequency distribution tables, which show the trends that a frequency distribution graph would display. Normal distribution diagrams will also be used to represent the confidence intervals.

Population: In a statistical enquiry, you often need information about a particular group. This group is known as the POPULATION, and it may be small, large or infinite. The population for my investigation is all the words in each separate book.

Sampling: Sampling is the selection of individual members of a population. The advantage of taking a sample is that it is cheaper and quicker, the results are easier to analyse, and it is appropriate for this type of investigation. Unfortunately, sampling has some disadvantages that are difficult to avoid: the results may include natural variation or bias, so they may not be representative of the whole population, in which case the results are meaningless. ...
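The statistics named in the design (mean, unbiased variance, standard deviation, standard error and a confidence interval) could be computed along the following lines. This is only a minimal sketch, not the method used in the coursework: the function name `summarise`, the 95% confidence level and the use of a normal critical value are assumptions, and the ten input values are illustrative (they match the lengths of the first ten words in the children's raw data).

```python
import math
from statistics import NormalDist, mean, variance


def summarise(word_lengths, confidence=0.95):
    """Return summary statistics for a sample of word lengths."""
    n = len(word_lengths)
    sample_mean = mean(word_lengths)
    # statistics.variance divides by n - 1, i.e. the unbiased estimator.
    unbiased_variance = variance(word_lengths)
    std_dev = math.sqrt(unbiased_variance)
    # Standard error of the sample mean.
    std_error = std_dev / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    # Assumption: the sample is large enough for a normal approximation.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * std_error
    return {
        "n": n,
        "mean": sample_mean,
        "variance": unbiased_variance,
        "std_dev": std_dev,
        "std_error": std_error,
        "interval": (sample_mean - half_width, sample_mean + half_width),
    }


# Illustrative input only: ten word lengths.
sample = [3, 2, 3, 5, 7, 3, 3, 5, 3, 4]
summary = summarise(sample)
print(summary)
```

With only ten values the normal critical value is a rough approximation; it becomes more reasonable for larger samples, such as the 50 words numbered in the raw data table.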

Middle

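...

| No. of letters (x) | Tally | Frequency (f) |
|---|---|---|
| ... | ... I I I | 7 |
| 7 | I I I | 3 |
| 8 | I I I I | 4 |
| 9 | I I I I I I I | 7 |
| 10 | I I | 2 |
| 11 | I | 1 |
| 12 | I | 1 |

The distribution is not normal, and I will discuss how a certain theorem accounts for this.

Raw Data for Children's Text:

| Word no. | Page | Word | Word length |
|---|---|---|---|
| 1 | 24 | mum | 3 |
| 2 | 8 | to | 2 |
| 3 | 7 | was | 3 |
| 4 | 29 | their | 5 |
| 5 | 31 | trouble | 7 |
| 6 | 3 | she | 3 |
| 7 | 18 | was | 3 |
| 8 | 45 | there | 5 |
| 9 | 5 | the | 3 |
| 10 | 19 | very | 4 |
| 11 | 37 | a | 1 |
| 12 | 38 | he | 2 |
| 13 | 20 | it | 2 |
| 14 | 45 | to | 2 |
| 15 | 26 | and | 3 |
| 16 | 15 | eggs | 4 |
| 17 | 25 | chris | 5 |
| 18 | 30 | friends | 7 |
| 19 | 35 | archie | 6 |
| 20 | 33 | to | 2 |
| 21 | 40 | yellow | 6 |
| 22 | 2 | hands | 5 |
| 23 | 10 | out | 3 |
| 24 | 43 | house | 5 |
| 25 | 6 | on | 2 |
| 26 | 42 | jacket | 6 |
| 27 | 14 | was | 3 |
| 28 | 38 | oh | 1 |
| 29 | 17 | said | 4 |
| 30 | 25 | there | 5 |
| 31 | 35 | for | 3 |
| 32 | 37 | chris | 5 |
| 33 | 36 | cat | 3 |
| 34 | 10 | coops | 5 |
| 35 | 43 | half | 4 |
| 36 | 40 | of | 2 |
| 37 | 16 | the | 3 |
| 38 | 15 | place | 5 |
| 39 | 33 | bring | 5 |
| 40 | 19 | six | 3 |
| 41 | 45 | picture | 7 |
| 42 | 46 | lots | 4 |
| 43 | 27 | sing | 4 |
| 44 | 41 | down | 4 |
| 45 | 4 | glass | 5 |
| 46 | 36 | the | 3 |
| 47 | 23 | too | 3 |
| 48 | 37 | it | 2 |
| 49 | 14 | the | 3 |
| 50 | 26 | want | 4 |

Frequency Distribution table and graph for Children's Text:

| No. of letters (x) | Tally | Frequency (f) |
|---|---|---|
| 1 | I I | 2 |
| 2 | I I I I I I I I | 8 |
| 3 | I I I I I I I I I I I I I I I | 15 |
| 4 | I I I I I I I I | 8 |
| 5 | I I I I I I I I I I I | 11 |
| 6 | I I I | 3 |
| 7 | I ... | |

...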
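A tally like the one above can be built directly from the sampled words. The sketch below is one possible way to do it, not the procedure used in the coursework: the helper name `frequency_table` is hypothetical, and only the first ten words of the children's raw data are used to keep the example short.

```python
from collections import Counter

# First ten sampled words from the children's raw data.
words = ["mum", "to", "was", "their", "trouble",
         "she", "was", "there", "the", "very"]


def frequency_table(words):
    """Map each word length x to its frequency f, ordered by x."""
    counts = Counter(len(word) for word in words)
    return dict(sorted(counts.items()))


table = frequency_table(words)

# Print one row per word length: x, tally marks, frequency.
for length, freq in table.items():
    print(f"{length:>2} | {'I ' * freq:<12}| {freq}")

# Mean word length computed from the frequency table: sum(x * f) / sum(f).
total = sum(table.values())
mean_length = sum(x * f for x, f in table.items()) / total
print(f"n = {total}, mean word length = {mean_length:.2f}")
```

Working from the frequency table rather than the raw list mirrors how the mean would be found by hand from the x and f columns.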

This student written piece of work is one of many that can be found in our AS and A Level Probability & Statistics section.
