Maths Handling Data Coursework: Mayfield High School

Jonathan Nisner

For this statistics coursework I will investigate the heights and weights of students in years 7 to 11 at Mayfield High School. I will look for a trend in the data to see whether taller students tend to weigh more. My null hypothesis is that there is no correlation between height and weight, and my alternative hypothesis is that there is a strong positive correlation between them. I will then compare the heights of boys and girls in year 7, do the same for year 11, and compare the two sets of results.

I am carrying out this investigation because I want to know whether students in the older years should be separated from the younger students. To carry it out I need the heights and weights of all the students at Mayfield High School in years 7 to 11 inclusive. Instead of collecting the data myself, I can find it on an exam board website. This data should be reliable because it is provided by the exam board and is based on real students. However, it is secondary data rather than primary data, since I am not collecting it myself, so it may be unreliable: the students may not have been measured or weighed on the day and may have guessed their measurements instead. Height and weight are continuous data, so only some graphs and calculations are appropriate.

I have decided to select a sample of 100 students because all 1183 students would be too many to work with. With such a wide range of results I would not be able to compare the data well, and I would not be able to fit all the information on one graph or one box plot, which would make the comparison even more difficult. I will be able to cope better with the heights and weights of 100 students.

To get a sample of 100 students, I will take a stratified sample of each year group, and then of each gender within the year groups. I will use stratified sampling because it gives a fair sample in which each group is represented in proportion to its share of the whole school.

Once I know how many students to take from each group, I will use random sampling to choose which students I will take my data from. This is a fair way of sampling because no biased decision is made about whose data is picked.

The following data is what I will use in order to take a stratified sample of 100 students:

Year Group    Number of Boys    Number of Girls    Total
7             151               131                282
8             145               125                270
9             118               143                261
10            106               94                 200
11            84                86                 170
Total         604               579                1183

To take a stratified sample of the year groups I will do the following calculations:

Year Group    Calculation          Total (answer)
7             (282/1183) x 100     = 24
8             (270/1183) x 100     = 23
9             (261/1183) x 100     = 22
10            (200/1183) x 100     = 17
11            (170/1183) x 100     = 14
Total                              = 100
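
The year-group calculation can also be checked with a short Python sketch. This is only an illustration of the arithmetic above; the names `year_totals`, `stratum_size` and `year_sample` are my own and are not part of the coursework data.

```python
# Year-group totals as used in the calculations above.
year_totals = {7: 282, 8: 270, 9: 261, 10: 200, 11: 170}
SAMPLE_SIZE = 100

school_total = sum(year_totals.values())  # whole-school total


def stratum_size(group_total, population, sample_size):
    """Proportional share of the sample, rounded to a whole student."""
    return round(group_total / population * sample_size)


year_sample = {
    year: stratum_size(total, school_total, SAMPLE_SIZE)
    for year, total in year_totals.items()
}

for year, size in year_sample.items():
    print(f"Year {year}: {size} students")
print("Total sampled:", sum(year_sample.values()))
```

Rounding each group separately does not always add up to exactly the intended sample size, so the final total is worth printing as a check.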

To stratify the year groups into gender I will do the following calculations:

Year Group    Gender    Calculation       Total (answer)
7             Boys      (151/282) x 24    = 13
7             Girls     (131/282) x 24    = 11
8             Boys      (145/270) x 23    = 12
8             Girls     (125/270) x 23    = 11
9             Boys      (118/261) x 22    = 10
9             Girls     (143/261) x 22    = 12
10            Boys      (106/200) x 17    = 9
10            Girls     (94/200) x 17     = 8
11            Boys      (84/170) x 14     = 7
11            Girls     (86/170) x 14     = 7

I have rounded each answer to the nearest whole number because the number of people is discrete data: a fraction of a person is impossible.
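
The gender split within each year group can be sketched in the same way. The dictionary layout and variable names below are assumptions made for illustration; the numbers are the ones used in the calculations above.

```python
# (boys, girls) in each year group, as used in the calculations above.
counts = {
    7: (151, 131),
    8: (145, 125),
    9: (118, 143),
    10: (106, 94),
    11: (84, 86),
}
# Sample size already allocated to each year group.
year_sample = {7: 24, 8: 23, 9: 22, 10: 17, 11: 14}

gender_sample = {}
for year, (boys, girls) in counts.items():
    year_total = boys + girls
    # Each gender gets its proportional share of the year's sample.
    gender_sample[(year, "boys")] = round(boys / year_total * year_sample[year])
    gender_sample[(year, "girls")] = round(girls / year_total * year_sample[year])

total_sampled = sum(gender_sample.values())
for (year, gender), size in gender_sample.items():
    print(f"Year {year} {gender}: {size}")
print("Total sampled:", total_sampled)
```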

Now that I know how many students I should take data from, I need to choose which students to use. To make this as fair as possible, I will choose students at random. I will do this by separating the students into year groups and then by gender. Then, for each group, I will go down the list and record the height and weight of the student whose position in the list matches the number that comes up on the calculator when I do these calculations:

For year 7 boys: RAN# x 151 and press = 13 times.

For year 7 girls: RAN# x 131 and press = 11 times.

For year 8 boys: RAN# x 145 and press = 12 times.

For year 8 girls: RAN# x 125 and press = 11 times.

For year 9 boys: RAN# x 118 and press = 10 times.

For year 9 girls: RAN# x 143 and press = 12 times.

For year 10 boys: RAN# x 106 and press = 9 times.

For year 10 girls: RAN# x 94 and press = 8 times.

For year 11 boys: RAN# x 84 and press = 7 times.

For year 11 girls: RAN# x 86 and press = 7 times.
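
The same selection could be done on a computer instead of a calculator. The sketch below is not the calculator method itself: it uses Python's `random.sample`, which never picks the same list position twice, whereas pressing = repeatedly on RAN# could repeat a number. The names `strata` and `pick_positions` and the fixed seed are my own assumptions, used only so the example is repeatable.

```python
import random

# (group size, number of students to pick) for each year group and gender.
strata = {
    (7, "boys"): (151, 13), (7, "girls"): (131, 11),
    (8, "boys"): (145, 12), (8, "girls"): (125, 11),
    (9, "boys"): (118, 10), (9, "girls"): (143, 12),
    (10, "boys"): (106, 9), (10, "girls"): (94, 8),
    (11, "boys"): (84, 7), (11, "girls"): (86, 7),
}


def pick_positions(group_size, how_many, rng):
    """Pick distinct list positions (1 to group_size) without repeats."""
    return sorted(rng.sample(range(1, group_size + 1), how_many))


rng = random.Random(2024)  # fixed seed: an assumption, for repeatability only
chosen = {
    group: pick_positions(size, how_many, rng)
    for group, (size, how_many) in strata.items()
}

for (year, gender), positions in chosen.items():
    print(f"Year {year} {gender}: positions {positions}")
```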

To be certain that I had the correct number of students, I added the answers to the above calculations up to check that I had 100 students.

I will collect the information from the chosen students and make a scatter diagram on the computer showing the correlation between the heights and weights of the sample of 100 students from Mayfield High School. I will use a scatter diagram because it shows the correlation between two variables well and the pattern can be seen at first sight. I will add the line of best fit and its equation. The data is on the next page.
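
As a rough illustration of what the computer does when it adds a line of best fit, here is a minimal least-squares sketch that also gives the correlation coefficient r. The function name `line_of_best_fit` is my own, and the four demo points are made up purely to show the calculation; they are not taken from the Mayfield sample.

```python
from math import sqrt


def line_of_best_fit(xs, ys):
    """Return (gradient, intercept, r) for the least-squares line y = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    r = sxy / sqrt(sxx * syy)  # correlation coefficient, between -1 and 1
    return gradient, intercept, r


# Made-up illustrative points, NOT the real sample.
demo_weights_kg = [40, 50, 60, 70]
demo_heights_cm = [150, 160, 165, 175]

gradient, intercept, r = line_of_best_fit(demo_weights_kg, demo_heights_cm)
print(f"y = {gradient:.4f}x + {intercept:.2f}, r = {r:.3f}")
```

A value of r close to 1 would support the alternative hypothesis of strong positive correlation, while a value close to 0 would support the null hypothesis.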

The scatter diagram shows a fairly strong positive correlation, which is supported by the line of best fit. Its equation is y = 0.5078x + 136.04, where x is the weight in kilograms and y is the height in centimetres. The gradient is positive: for every extra kilogram of weight, the predicted height increases by 0.5078 cm. I will now use the equation of the line of best fit to estimate the height of a student whose weight is 65 kg.

y = mx + c

y = 0.5078x + 136.04

y = 0.5078(65) + 136.04

y = 169.047

y ≈ 169 cm
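
The same substitution can be written as a small Python function, which makes it easy to try other weights. The gradient and intercept come from the equation of the line of best fit above; the function name `predict_height_cm` is my own.

```python
GRADIENT = 0.5078    # cm of height per kg of weight, from the line of best fit
INTERCEPT = 136.04   # cm


def predict_height_cm(weight_kg):
    """Estimate height from weight using y = mx + c."""
    return GRADIENT * weight_kg + INTERCEPT


height = predict_height_cm(65)
print(f"Predicted height: {height:.3f} cm, or about {round(height)} cm")
```

The estimate is only sensible for weights inside the range covered by the sample, because the line of best fit says nothing reliable about students lighter or heavier than those measured.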

My scatter diagram shows two obvious anomalies, which could be due to ...
