• Level: GCSE
• Subject: Maths
• Word count: 6958

# Statistically comparing books


Introduction

Rebecca Nielson

Statistics Coursework

I am going to statistically compare two different books: The Order of the Phoenix, written by J. K. Rowling, and Nicholas Nickleby, written by Charles Dickens. I will take a sample of words and sentences from each book to look for similarities or differences between the two.

The Order of the Phoenix

This is the fifth book in the Harry Potter series, which tells the tale of Harry’s fifth year back at Hogwarts School of Witchcraft and Wizardry. The series so far has already made author J. K. Rowling into a multi-millionaire.

Nicholas Nickleby

This is the third book written by Charles Dickens. It tells the story of Mr Nickleby, who dies penniless, leaving his wife, daughter and son to fend for themselves. Nicholas, the son, soon finds his own way out of his family's desperate situation through perseverance and good fortune.

I have chosen these books because they are both stories about a boy’s childhood, but they were written in different centuries. From my data I hope to see whether there is a difference in the way these two books are written.

There are different features of a novel that I could use to compare the books statistically; I am only going to use word length, sentence length and the number of syllables per word.

This should be enough information to see whether there is a difference between the two books.

Hypotheses

• I think that Nicholas Nickleby will have a higher mean word length than The Order of the Phoenix, because it is aimed at an adult audience, so it will contain longer words, which are usually harder to read.
• I also think that there will be a good positive correlation between mean sentence length and reading age in both books. This is because longer sentences normally contain longer words, which are probably harder to read, so the reading age should get higher as sentence length increases.
• I think The Order of the Phoenix will have a smaller standard deviation in sentence length than Nicholas Nickleby. This is because Nicholas Nickleby is a book aimed at adults, so it will have more difficult sentences, which will cause more variation in length.
• I think the mean syllables per word will be very similar in both books. Even though they are aimed at different audiences, both will still contain mostly one- and two-syllable words. If there is a difference I think it will only be small, with Nicholas Nickleby having a slightly higher mean for syllables per word.
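The three statistics named in these hypotheses (mean, standard deviation and correlation) can be sketched in a few lines of Python. This is only an illustration: the lists are made-up values rather than data from either book, `pearson_r` is a helper name chosen here, and the population form of the standard deviation is assumed.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Product-moment correlation coefficient for two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up illustrative values, NOT data from either book.
word_lengths = [3, 5, 7, 4, 6]
sentence_lengths = [8, 12, 20, 25, 35]
reading_ages = [9, 10, 12, 13, 16]

print("mean word length:", mean(word_lengths))
print("standard deviation of sentence length:", pstdev(sentence_lengths))
print("correlation, sentence length vs reading age:",
      pearson_r(sentence_lengths, reading_ages))
```

A correlation close to +1 would support the second hypothesis, and a smaller standard deviation for one book would mean its sentence lengths vary less.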
Middle

(Reading age in U.S. school years; I will convert it to years old by adding 6 to get the actual age), the characters per word and words per sentence, to compare blocks of text from each book against each other and also against the original sample.
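The conversion from a U.S. school grade to a reading age can be sketched as below. It assumes the grade in question is the Flesch-Kincaid Grade Level that Word lists in its readability statistics; the formula is the standard published one, the function names are chosen here for illustration, and the counts in the example are invented.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula (U.S. school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def reading_age(us_grade, offset=6):
    """Convert a U.S. school grade to an age by adding 6, as described in the text."""
    return us_grade + offset


# Invented counts for illustration: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(grade, 2), round(reading_age(grade), 2))
```

Longer sentences and more syllables per word both push the grade, and so the reading age, upwards.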

I searched the internet for the full text of both books; unfortunately I couldn’t find any available resources. This meant I had to find 5 blocks of text (at least 100 words) from each book. I used the RAND function on my calculator to select a page at random, and then I selected a block of text from that page.
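Picking pages with a random number between 0 and 1, as a calculator's RAND key gives, might look like the sketch below. The function name `pick_pages` and the page count of 800 are assumptions for illustration, not figures from either book.

```python
import random


def pick_pages(last_page, how_many, seed=None):
    """Pick distinct page numbers by scaling a random number in [0, 1)."""
    if how_many > last_page:
        raise ValueError("cannot pick more pages than the book has")
    rng = random.Random(seed)
    pages = set()
    while len(pages) < how_many:
        # rng.random() plays the role of the calculator's RAND key.
        pages.add(int(rng.random() * last_page) + 1)
    return sorted(pages)


# Hypothetical book of 800 pages, 5 blocks of text needed.
print(pick_pages(last_page=800, how_many=5, seed=1))
```

Multiplying by the number of pages and adding 1 gives every page from 1 to the last page an equal chance of selection; repeats are simply drawn again.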

I typed all 5 blocks from Nicholas Nickleby into Microsoft Word. The Readability Statistics showed:

This would mean the reading age for Nicholas Nickleby would be:

I typed all 5 blocks from Order of the Phoenix into Microsoft Word. The Readability Statistics showed:

This would mean the reading age for Order of the Phoenix would be:

These Readability Statistics show how different this sample is from the first sample I took from the two books.

In the second sample (blocks of text) the mean word length for Nicholas Nickleby was 4.4, and in the first sample (words on their own) the mean word length was 6.13. That is a big difference in word length. Also, in Nicholas Nickleby the estimated mean sentence length for the first sample was 30.55 and in the second sample it was 21.6, which is also a big difference. I think this shows the first sample was completely wrong and gave values far higher than the actual amounts.

In the second sample the mean words per sentence in Order of the Phoenix was 15 and for the first sample the mean words per sentence was 19.17.
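The size of each gap between the two samples can be put on a common scale by expressing it as a percentage of the first-sample value. The sketch below uses only the figures quoted above; the helper name `percent_change` is chosen here for illustration.

```python
def percent_change(first, second):
    """Change from the first sample to the second, as a percentage of the first."""
    return (second - first) / first * 100


# (first sample, second sample) pairs quoted in the text.
comparisons = {
    "Nicholas Nickleby, mean word length": (6.13, 4.4),
    "Nicholas Nickleby, mean sentence length": (30.55, 21.6),
    "Order of the Phoenix, mean sentence length": (19.17, 15),
}

for label, (first, second) in comparisons.items():
    change = percent_change(first, second)
    print(f"{label}: {first} -> {second} ({change:+.1f}%)")
```

A negative percentage means the second sample gave a lower value than the first.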

Conclusion

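| Word | Letters | Syllables | Page | Chapter |
|---|---|---|---|---|
| room | 4 | 1 | 436 | 23 |
| inside | 6 | 2 | 457 | 24 |
| Grimmauld | 9 | 2 | 459 | 24 |
| done | 4 | 1 | 486 | 25 |
| voices | 6 | 2 | 488 | 25 |
| necks | 5 | 1 | 506 | 26 |
| nearly | 6 | 2 | 510 | 26 |
| when | 4 | 1 | 531 | 27 |
| observe | 7 | 2 | 529 | 27 |
| newspaper | 9 | 3 | 556 | 28 |
| that | 4 | 1 | 555 | 28 |
| pumpkin | 7 | 2 | 574 | 29 |
| frowning | 8 | 2 | 575 | 29 |
| dully | 5 | 2 | 604 | 30 |
| edged | 5 | 1 | 598 | 30 |
| furious | 7 | 3 | 626 | 31 |
| nose | 4 | 1 | 623 | 31 |
| extra | 5 | 2 | 644 | 32 |
| transferred | 11 | 2 | 651 | 32 |
| sixth-years | 11 | 2 | 668 | 33 |
| broken | 6 | 2 | 669 | 33 |
| Grawp's | 7 | 1 | 678 | 34 |
| were | 4 | 1 | 677 | 34 |
| hindquarters | 12 | 3 | 693 | 35 |
| mysteries | 9 | 3 | 696 | 35 |
| poking | 6 | 2 | 716 | 36 |
| forehead | 8 | 2 | 719 | 36 |
| strike | 6 | 1 | 729 | 37 |
| bluntly | 7 | 2 | 730 | 37 |
| occasions | 9 | 3 | 747 | 38 |
| improving | 9 | 3 | | |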

Sentence Length for Nicholas Nickleby

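| Page | Chapter | Amount of words | Amount of words over 6 letters |
|---|---|---|---|
| 5 | 1 | 35 | 8 |
| 13 | 2 | 98 | 32 |
| 28 | 3 | 9 | 2 |
| 36 | 4 | 43 | 15 |
| 45 | 5 | 5 | 2 |
| 53 | 6 | 11 | 1 |
| 77 | 7 | 8 | 0 |
| 85 | 7 | 34 | 7 |
| 93 | 8 | 8 | 3 |
| 110 | 9 | 30 | 6 |
| 117 | 10 | 19 | 5 |
| 127 | 11 | 26 | 5 |
| 136 | 11 | 18 | 4 |
| 139 | 12 | 53 | 15 |
| 148 | 13 | 25 | 7 |
| 164 | 14 | 6 | 1 |
| 175 | 15 | 3 | 0 |
| 192 | 16 | 10 | 3 |
| 207 | 17 | 11 | 5 |
| 219 | 18 | 31 | 10 |
| 238 | 19 | 12 | 5 |
| 248 | 20 | 8 | 0 |
| 257 | 21 | 49 | 9 |
| 269 | 22 | 7 | 1 |
| 287 | 23 | 11 | 3 |
| 305 | 24 | 84 | 34 |
| 326 | 25 | 86 | 23 |
| 335 | 26 | 3 | 0 |
| 346 | 27 | 25 | 9 |
| 349 | 27 | 32 | 9 |
| 358 | 28 | 29 | 7 |
| 379 | 29 | 20 | 5 |
| 384 | 29 | 18 | 4 |
| 393 | 30 | 63 | 34 |
| 402 | 31 | 83 | 23 |
| 411 | 32 | 3 | 0 |
| 421 | 33 | 16 | 7 |
| 427 | 34 | 48 | 14 |
| 449 | 35 | 29 | 7 |
| 456 | 35 | 4 | 1 |
| 468 | 36 | 24 | 5 |
| 471 | 37 | 6 | 2 |
| 483 | 38 | 14 | 4 |
| 505 | 39 | 74 | 24 |
| 506 | 39 | 62 | 16 |
| 515 | 40 | 67 | 18 |
| 530 | 41 | 51 | 23 |
| 534 | 41 | 93 | 31 |
| 546 | 42 | 34 | 12 |
| 553 | 43 | 4 | 0 |
| 563 | 44 | 2 | 0 |
| 589 | 45 | 24 | 9 |
| 608 | 46 | 34 | 5 |
| 625 | 47 | 14 | 5 |
| 635 | 48 | 7 | 2 |
| 637 | 49 | 62 | 14 |
| 544 | 49 | 16 | 4 |
| 649 | 50 | 77 | 35 |
| 664 | 51 | 3 | 1 |
| 677 | 51 | 13 | 1 |
| 684 | 52 | 9 | 2 |
| 698 | 53 | 36 | 14 |
| 708 | 54 | 7 | 4 |
| 724 | 54 | 70 | 39 |
| 727 | 55 | 48 | 23 |
| 743 | 56 | 3 | 1 |
| 756 | 57 | 6 | 2 |
| 765 | 58 | 15 | 7 |
| 771 | 59 | 43 | 18 |
| 775 | 59 | 91 | 40 |
| 783 | 60 | 7 | 2 |
| 798 | 61 | 34 | 24 |
| 801 | 62 | 38 | 13 |
| 807 | 63 | 87 | 31 |
| 812 | 64 | 3 | 0 |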
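As a rough illustration of how the sentence data for Nicholas Nickleby could be summarised, the sketch below takes only the first ten sentences listed (pages 5 to 110) and works out the mean sentence length, its standard deviation and the share of words over 6 letters. It assumes the population form of the standard deviation and is not the calculation used for the estimated mean quoted earlier.

```python
from statistics import mean, pstdev

# First ten sentences listed for Nicholas Nickleby:
# (amount of words, amount of words over 6 letters).
rows = [(35, 8), (98, 32), (9, 2), (43, 15), (5, 2),
        (11, 1), (8, 0), (34, 7), (8, 3), (30, 6)]

lengths = [words for words, _ in rows]
long_words = sum(over_six for _, over_six in rows)

mean_length = mean(lengths)                 # average words per sentence
spread = pstdev(lengths)                    # population standard deviation
share_long = long_words / sum(lengths)      # fraction of words over 6 letters

print(f"mean sentence length: {mean_length:.1f}")
print(f"standard deviation:   {spread:.1f}")
print(f"words over 6 letters: {share_long:.1%}")
```

Running the same steps on the Order of the Phoenix sentences would give figures that can be compared directly against the hypotheses.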

Sentence length for Order of the Phoenix

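| Page | Chapter | Amount of words | Amount of words over 6 letters |
|---|---|---|---|
| 13 | 1 | 8 | 0 |
| 14 | 1 | 49 | 9 |
| 29 | 2 | 7 | 1 |
| 33 | 2 | 11 | 3 |
| 46 | 3 | 32 | 6 |
| 55 | 3 | 14 | 4 |
| 60 | 4 | 7 | 1 |
| 63 | 4 | 13 | 4 |
| 77 | 5 | 21 | 6 |
| 89 | 5 | 8 | 2 |
| 101 | 6 | 27 | 6 |
| 107 | 6 | 14 | 3 |
| 114 | 7 | 4 | 0 |
| 115 | 7 | 13 | 2 |
| 136 | 8 | 28 | 5 |
| 138 | 8 | 5 | 1 |
| 141 | 9 | 17 | 3 |
| 147 | 9 | 33 | 6 |
| 164 | 10 | 10 | 2 |
| 180 | 10 | 41 | 7 |
| 185 | 11 | 35 | 5 |
| 189 | 11 | 5 | 0 |
| 200 | 12 | 12 | 3 |
| 207 | 12 | 8 | 1 |
| 239 | 13 | 3 | 3 |
| 243 | 13 | 25 | 5 |
| 255 | 14 | 11 | 2 |
| 261 | 14 | 24 | 4 |
| 278 | 15 | 50 | 11 |
| 285 | 15 | 9 | 4 |
| 301 | 16 | 8 | 3 |
| 317 | 17 | 30 | 6 |
| 330 | 17 | 19 | 5 |
| 341 | 18 | 26 | 5 |
| 347 | 18 | 18 | 4 |
| 352 | 19 | 5 | 1 |
| 365 | 19 | 39 | 10 |
| 374 | 20 | 13 | 4 |
| 383 | 20 | 8 | 2 |
| 392 | 21 | 41 | 11 |
| 393 | 21 | 4 | 2 |
| 419 | 22 | 20 | 5 |
| 427 | 22 | 18 | 4 |
| 439 | 23 | 15 | 5 |
| 443 | 23 | 32 | 9 |
| 457 | 24 | 29 | 7 |
| 475 | 24 | 6 | 2 |
| 487 | 25 | 19 | 4 |
| 493 | 25 | 8 | 3 |
| 506 | 26 | 7 | 3 |
| 510 | 26 | 22 | 5 |
| 530 | 27 | 41 | 8 |
| 534 | 27 | 31 | 6 |
| 552 | 28 | 10 | 2 |
| 553 | 28 | 17 | 3 |
| 581 | 29 | 4 | 0 |
| 585 | 29 | 12 | 2 |
| 617 | 30 | 26 | 3 |
| 619 | 30 | 13 | 4 |
| 627 | 31 | 35 | 6 |
| 631 | 31 | 41 | 6 |
| 646 | 32 | 29 | 5 |
| 660 | 32 | 9 | 2 |
| 663 | 33 | 18 | 4 |
| 672 | 33 | 53 | 15 |
| 683 | 34 | 25 | 7 |
| 687 | 34 | 6 | 1 |
| 690 | 35 | 3 | 0 |
| 692 | 35 | 10 | 3 |
| 712 | 36 | 11 | 5 |
| 717 | 36 | 31 | 10 |
| 740 | 37 | 12 | 5 |
| 743 | 37 | 44 | 8 |
| 753 | 38 | 5 | 2 |
| 759 | 38 | 21 | |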

