Maths Statistics Coursework - relationship between the weight and height

Matthew Eden

Planning

Introduction:

In this investigation, I aim to find out about the relationship between weight and height, and the relationship between age and height.

Hypotheses:

  • As the height increases, the weight increases
  • As the age increases, the height increases
  • Boys have a higher average height than girls

Justification:

I think that as someone’s height increases, their weight increases. I predict this from basic knowledge of the human body: as the body grows, its muscles, bones and fat increase in size, and this makes the body weigh more. I will present this data in scatter graphs for boys, girls and both combined. I will create these graphs in Microsoft Excel because it reduces the chance of human error and lets me change the scales or make other amendments easily at any time. Excel also allows me to group the data before presenting it in scatter graphs. This will help me compare and contrast the heights and weights of the two genders. To measure how strong the correlation is, I will use the product-moment correlation coefficient, which is a calculation that gives the correlation of the data as a single value.
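As an illustration of the product-moment correlation coefficient, here is a minimal Python sketch of the standard formula r = Sxy ÷ √(Sxx × Syy). It is not the Excel calculation used in this investigation, the function name `pmcc` is my own, and the heights and weights are made-up values rather than figures from the Mayfield database.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product-moment correlation coefficient r for paired values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from each mean
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Made-up heights (m) and weights (kg), for illustration only
heights = [1.50, 1.55, 1.60, 1.65, 1.70]
weights = [40, 45, 47, 52, 60]
r = pmcc(heights, weights)
print(round(r, 2))
```

A value of r close to 1 means a strong positive correlation, a value close to -1 means a strong negative correlation, and a value close to 0 means little or no linear correlation.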

I predict that as students get older, their height increases, for both boys and girls. This is because children grow as they get older. I will plot this data in a scatter graph of age against height, which should show a correlation that I can interpret.

Also, I think that boys will on average be taller than girls in every year group in the school. Generally, girls are shorter and more petite than boys, which gives them a lower height. I will present this data in stem and leaf plots, which will help me find the mean, median, mode, range, lower quartile, upper quartile and inter-quartile range. These will give me enough data to construct box and whisker plots, which I will draw by hand. I will also calculate the standard deviation, which measures the spread of the data, to help me compare the heights of the boys and girls.
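The summary values needed for a box and whisker plot can also be checked on a computer. The sketch below uses Python's `statistics` module on made-up heights in centimetres, not values from the database. The function name `summarise` is my own. Quartile conventions differ between textbooks and software, so the quartiles may not exactly match ones found by hand from a stem and leaf plot.

```python
import statistics

def summarise(heights):
    """Return the averages and measures of spread for a list of heights."""
    # "inclusive" treats the list as the whole data set; other conventions exist
    q1, q2, q3 = statistics.quantiles(heights, n=4, method="inclusive")
    return {
        "mean": statistics.mean(heights),
        "median": q2,
        "lower_quartile": q1,
        "upper_quartile": q3,
        "iqr": q3 - q1,
        "range": max(heights) - min(heights),
        "std_dev": statistics.pstdev(heights),  # population standard deviation
    }

# Made-up heights in cm, for illustration only
boys_cm = [150, 155, 160, 162, 165, 170, 175]
girls_cm = [148, 150, 152, 155, 158, 160, 163]

boys_summary = summarise(boys_cm)
girls_summary = summarise(girls_cm)
for name, value in boys_summary.items():
    print(name, round(value, 2), round(girls_summary[name], 2))
```

Comparing the two medians shows which group is taller on average, while the inter-quartile range and standard deviation show which group's heights are more spread out.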

Data:

Here is the secondary data from the Mayfield High School.

The total number of students in Mayfield High School is 1183.

Data Collection and Sampling:

In this investigation, I will sample 50 students because this is a substantial sample size and a round figure. From the database I have been provided with, I will need to sample the girls’ and boys’ heights, weights, ages, year groups and gender because this will enable me to prove or disprove my three hypotheses. The data has already been collected by someone else, so it is secondary data. I will not be using primary data because collecting it is expensive and I don’t have enough time to go out and collect the data from the high school. The data that the other people collected has been put into a database or spreadsheet for me to sample from. The secondary data is useful because the database contains 1183 students, which is large enough to give a very accurate representation of the population.

I will sample only the fields that I have mentioned (gender, age, height, weight and year group) because these are the only ones I need in my investigation to test my hypotheses. The data that I will plot on graphs and use in calculations is the heights, weights and ages, which are all quantitative, continuous data. The other two fields, year group and gender, are there to help me tell which group each height, weight and age belongs to. These two fields are used as categories rather than measurements: gender is qualitative, which means that it is not numerical, and year group takes only discrete values. Heights and weights are quantitative and are continuous which means that the numbers are not

 

I will record my results in tables which show the heights and weights for girls and boys in each year group separately. This will make it easier to graph and interpret the results and to draw conclusions about my hypotheses. I have used stratified sampling to find out how many students I have to sample from each year group. Stratified sampling allows my sample to reflect the proportions that exist in the population. Also, the more students I sample, the more reliable the results are likely to be.

I am planning to do the calculations in a variety of ways. I will calculate the averages using the statistical functions on a calculator. I will use Microsoft Excel to calculate the product-moment correlation coefficient (PMCC) and the standard deviations for the boys’ and girls’ heights. Also, when calculating the averages of the heights for the boys and girls, I will use a spreadsheet function, which is much faster and easier and avoids human error. I am hoping the calculations will show me some of the relationships in the data, which will help me draw conclusions and refer back to my hypotheses. I also hope that the calculations will help me to create tables and diagrams such as the stem and leaf plots and box and whisker plots.
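To show how a stem and leaf plot is built from a list of heights, here is a short Python sketch. It assumes the heights have been converted to whole centimetres, so that the stem is the number of tens and the leaf is the units digit. The heights are made up for illustration and the function names are my own.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole-number values by stem (tens) with sorted leaves (units)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)

def format_plot(stems):
    """Lay the plot out as text, one stem per line."""
    lines = []
    for stem in sorted(stems):
        leaves = " ".join(str(leaf) for leaf in stems[stem])
        lines.append(f"{stem} | {leaves}")
    return "\n".join(lines)

# Made-up heights in whole cm, for illustration only
heights_cm = [152, 147, 160, 165, 158, 171, 149, 163]
plot = stem_and_leaf(heights_cm)
print(format_plot(plot))
print("Key: 14 | 7 means 147 cm")
```

Because the leaves are in order, the median and quartiles can be read straight off the plot by counting along to the middle value and the quarter positions.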

To work out the total number of boys and girls that will be sampled for each year group, I do these calculations:

(Total number of girls or boys in year group ÷ Total number of students in school) × Number of samples

= total samples for either boys or girls in a particular year group

Year 7 Boys: 151 ÷ 1183 × 50 = 6.38 ≈ 6 boys

Year 7 Girls: 131 ÷ 1183 × 50 = 5.54 ≈ 6 girls

Year 8 Boys: 145 ÷ 1183 × 50 = 6.13 ≈ 6 boys

Year 8 Girls: 125 ÷ 1183 × 50 = 5.28 ≈ 5 girls

Year 9 Boys: 118 ÷ 1183 × 50 = 4.99 ≈ 5 boys

Year 9 Girls: 143 ÷ 1183 × 50 = 6.04 ≈ 6 girls

Year 10 Boys: 106 ÷ 1183 × 50 = 4.48 ≈ 4 boys

Year 10 Girls: 94 ÷ 1183 × 50 = 3.97 ≈ 4 girls

Year 11 Boys: 84 ÷ 1183 × 50 = 3.55 ≈ 4 boys

Year 11 Girls: 86 ÷ 1183 × 50 = 3.63 ≈ 4 girls
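The stratified calculations above can be checked with a short Python sketch. The group sizes, the school total of 1183 and the sample size of 50 are taken from the calculations above. The names `group_sizes` and `stratified_counts` are my own. Rounding each group separately happens to give exactly 50 here, but that is not guaranteed for other data.

```python
# Number of students in each group, taken from the calculations above
group_sizes = {
    ("Year 7", "boys"): 151, ("Year 7", "girls"): 131,
    ("Year 8", "boys"): 145, ("Year 8", "girls"): 125,
    ("Year 9", "boys"): 118, ("Year 9", "girls"): 143,
    ("Year 10", "boys"): 106, ("Year 10", "girls"): 94,
    ("Year 11", "boys"): 84, ("Year 11", "girls"): 86,
}
TOTAL_STUDENTS = 1183
SAMPLE_SIZE = 50

def stratified_counts(sizes, total, sample_size):
    """Share the sample between groups in proportion to their size."""
    return {
        group: round(size / total * sample_size)
        for group, size in sizes.items()
    }

counts = stratified_counts(group_sizes, TOTAL_STUDENTS, SAMPLE_SIZE)
for (year, gender), count in counts.items():
    print(year, gender, count)
print("Total sampled:", sum(counts.values()))
```

The check at the end matters: if the rounded numbers did not add up to 50, one group would have to be adjusted by hand.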

Now that I have calculated the number of boys and girls to sample from each year group, I can sample randomly from my database, which holds the details of every boy and girl at Mayfield High School from years 7 to 11. Random sampling is useful because it gives an unbiased sample, so the relationships I analyse and interpret in the data should be representative of the school.

To do this I use the random number generator on a calculator. I multiply the random number it gives by the total number of boys or girls in that particular year group. For example, for year 7 girls I multiply the random number by 131, because this is the total number of girls in year 7. Each time, I find the entry in the database whose number matches that product and highlight it, because that is the student I have chosen to sample. I do this 50 times in total because there are 50 samples. I expect to find extreme values or anomalies in the data because people are not ‘perfect’: they don’t always give the correct value and they can make mistakes when entering the data. If I come across an extreme value or anomaly while sampling, I will include it anyway, because I can then explain how it arose and take it into account when judging my hypotheses. To keep the investigation as fair as possible, I will keep every value that I sample randomly, even if it could be faulty. This keeps the investigation unbiased because I am not choosing a replacement value myself.
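The random sampling step can be sketched in Python, with `random.random()` standing in for the calculator's random number key. Two details are my own assumptions: the product is rounded down and 1 is added so that the result is a row number from 1 to the group size, and a row that has already been picked is skipped so the same student is not sampled twice. The function name `pick_rows` is also my own.

```python
import random

def pick_rows(group_size, how_many, rng=random):
    """Pick distinct row numbers from 1 to group_size at random."""
    if how_many > group_size:
        raise ValueError("cannot pick more rows than the group contains")
    chosen = []
    while len(chosen) < how_many:
        # Random number between 0 and 1, multiplied by the group size
        row = int(rng.random() * group_size) + 1
        if row not in chosen:  # assumption: repeats are skipped
            chosen.append(row)
    return chosen

# Example: 6 girls from the 131 girls in year 7
year7_girls = pick_rows(131, 6, random.Random(1))
print(year7_girls)
```

Passing a seeded `random.Random` makes the example repeatable. Leaving the third argument out gives a different sample each time, which is closer to pressing the calculator key.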

Here are all the samples that I obtained by random sampling from this high school. I need to group the data into several tables because this will help me graph it.

In total, I sampled 50 students, and the sample was stratified so that each year group from 7 to 11, for both boys and girls, is represented by a proportionate number of samples. This will now allow me to create graphs to present the data, so that I can spot any trends, relationships or anomalies in the data for height, weight, year group, age and gender.

I will present all the averages and totals for the boys and girls (means, modes, medians, ranges, quartiles and deviations) in tables and charts so that I can look at them anytime in ...
