Data Handling Project


Maths Coursework – Mayfield High School – Mrs Ferguson

By Alexandra Mullan

For this project, I have chosen to investigate height and weight. The main reason for this is that the data for height and weight are continuous, unlike eye colour, hair colour and KS2 results, which are discrete or qualitative. I can therefore put the information I find into cumulative frequency tables, box plots, histograms and scatter graphs.

Plan

Before I begin my statistical inquiry I need a hypothesis to examine. A hypothesis is used as a basis for further investigation, and my first hypothesis is going to be as follows:

“The taller a pupil is, the more they are going to weigh.”

I will use scatter graphs to compare my data and look for correlations, and I will also calculate the standard deviation. I will use histograms and box-and-whisker plots to investigate further the differences in weight between year groups and between boys and girls, so I will need to find the median, the quartiles and the range.
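The median and quartiles needed for a box-and-whisker plot can be found as in the minimal sketch below. The weights are invented for illustration and are not taken from the Mayfield data, and the function name five_number_summary is hypothetical. Quartile conventions differ slightly, so values worked out by hand may not match exactly.

```python
import statistics

# Hypothetical weights in kg, invented for illustration only.
weights_kg = [38, 41, 44, 45, 47, 48, 50, 52, 54, 57, 60]

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    # method="inclusive" treats the smallest and largest values as the
    # 0th and 100th percentiles; other conventions give slightly
    # different quartiles.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

summary = five_number_summary(weights_kg)
interquartile_range = summary[3] - summary[1]
print("Five-number summary:", summary)
print("Interquartile range:", interquartile_range)
```

These five values are what the box and whiskers are drawn from, and the interquartile range gives the width of the box.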

I could do this investigation using the data for all 1183 pupils; however, this would be extremely time-consuming. I am therefore going to take a stratified random sample. If I make my sample large enough, I am confident that it will give a good indication of the results for the school as a whole, because it should cover most of the range of heights and weights.

By sampling, I simplify my calculations and graphs dramatically. However, I must make sure that the sample I take is unbiased; otherwise my results will be unreliable.

Before I begin my study, I will create a scatter graph comparing the height and weight of randomly chosen pupils so that I can see whether the hypothesis is worth investigating. If the correlation is positive, then I know that it is possible for my hypothesis to be true. I predict that there is a positive correlation between height and weight.

I have collected data for a sample of 50 pupils, and I have avoided bias by choosing them completely at random. I have also found the mean height and the mean weight; this will be useful because I will then be able to draw a line of best fit, as the line of best fit always passes through the mean point (mean height, mean weight).
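The claim that the line of best fit passes through the mean point can be checked with a short calculation. The sketch below assumes the line is the least-squares line (a line drawn by eye is only approximately the same), and the data and the function name least_squares_line are hypothetical, not taken from the project.

```python
# Hypothetical (height in m, weight in kg) pairs, invented for illustration.
heights_m = [1.45, 1.52, 1.58, 1.63, 1.70, 1.75]
weights_kg = [40, 45, 47, 52, 58, 63]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = m*x + c."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    gradient = sxy / sxx
    # Choosing the intercept this way forces the line through (x_bar, y_bar).
    intercept = y_bar - gradient * x_bar
    return gradient, intercept

gradient, intercept = least_squares_line(heights_m, weights_kg)
mean_height, mean_weight = mean(heights_m), mean(weights_kg)

# The weight predicted at the mean height should equal the mean weight.
predicted_at_mean = gradient * mean_height + intercept
print("Passes through mean point:", abs(predicted_at_mean - mean_weight) < 1e-9)
```

Plotting the mean point first and drawing the line through it is the hand-drawn equivalent of this calculation.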

Scatter Graph Table

Now I will construct a scatter graph with the data that I have collected to see whether my hypothesis is worth investigating or not. If the correlation is positive, then I know that it is possible for my hypothesis to be true.

Scatter Graph

Conclusion

The scatter graph clearly shows a positive correlation between height and weight. This means that my hypothesis is worth pursuing, because the graph suggests that the taller a person is, the more they are likely to weigh. I also believe that I have found an anomalous result, which I have labelled.
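The strength of a positive correlation can also be expressed as a number. The sketch below uses Pearson's correlation coefficient r, which is one common measure and is an assumption here, since the project judges the correlation from the graph. The data are invented for illustration.

```python
import math

# Hypothetical (height in cm, weight in kg) pairs, invented for illustration.
heights_cm = [145, 152, 158, 163, 170, 175]
weights_kg = [40, 45, 47, 52, 58, 63]

def pearson_r(xs, ys):
    """Pearson's correlation coefficient, between -1 and +1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```

A value of r close to +1 indicates a strong positive correlation, a value close to 0 indicates little or no linear correlation, and a value close to -1 indicates a strong negative correlation.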

Plan 2

Now that my sample supports my first hypothesis, and I have seen that generally the taller a person is, the more they are likely to weigh, I can investigate a new hypothesis relating to these results. My second hypothesis is as follows:

“Boys at Mayfield High School are taller and weigh more on average than girls.”

I will need to use the data for Mayfield High School pupils in Years 7 to 11, because this gives a wider sampling range and should give sufficient, unbiased results. The total number of students in the school is 1183. Here is a table that I have produced, which contains the number of boys and girls in each year.

Table

The table is a two-way table because it shows two variables (year group and gender) at the same time, which makes the data easier to read at a glance.


I will use stratified sampling to investigate my new hypothesis because it improves the sample by reducing sampling error.

The variables for the sample are gender and age, so I had to take separate samples for boys and girls and vary the number of pupils sampled from each year to keep the sample unbiased. I did this because the year groups have different numbers of pupils, and it would be unfair to take the same number of pupils from each year group. Therefore, stratified sampling will be helpful due to the fact that there are different numbers ...
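The idea of sampling each year group and gender in proportion to its size can be sketched as follows. The pupil counts and the sample size of 50 are invented for illustration and are not the figures from the Mayfield table, and the function name proportional_allocation is hypothetical. Rounding is handled by giving any leftover places to the groups with the largest fractional parts, which is one reasonable choice among several.

```python
import random

# Hypothetical numbers of pupils per (year, gender) group, NOT the real table.
strata_sizes = {
    ("Year 7", "boys"): 120, ("Year 7", "girls"): 110,
    ("Year 8", "boys"): 110, ("Year 8", "girls"): 100,
    ("Year 9", "boys"): 100, ("Year 9", "girls"): 110,
    ("Year 10", "boys"): 90, ("Year 10", "girls"): 90,
    ("Year 11", "boys"): 80, ("Year 11", "girls"): 90,
}
sample_size = 50  # assumed total sample size

def proportional_allocation(sizes, total_sample):
    """Share total_sample between groups in proportion to their sizes."""
    population = sum(sizes.values())
    exact = {k: total_sample * n / population for k, n in sizes.items()}
    allocation = {k: int(v) for k, v in exact.items()}  # round down first
    shortfall = total_sample - sum(allocation.values())
    # Hand the remaining places to the groups with the largest remainders.
    by_remainder = sorted(exact, key=lambda k: exact[k] - allocation[k],
                          reverse=True)
    for k in by_remainder[:shortfall]:
        allocation[k] += 1
    return allocation

allocation = proportional_allocation(strata_sizes, sample_size)

# Pick that many pupils at random, without repeats, within each group.
rng = random.Random(0)  # fixed seed so the example is repeatable
chosen = {k: rng.sample(range(strata_sizes[k]), allocation[k])
          for k in strata_sizes}

for group, count in allocation.items():
    print(group, count)
```

Each group's share is (number in group ÷ total number of pupils) × sample size, so larger groups contribute more pupils to the sample.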
