AS statistics coursework - correlation coefficient between height and weight in year 11 boys and girls

Antony Georgiou

Statistics Coursework

Aim

The aim of this investigation is to discover whether there is a link between two variables and whether they are dependent on or independent of each other. To obtain reliable results I will need to collect suitable data, to which I can apply statistical methods to calculate and analyse correlation coefficients and regression lines, taking into account any anomalies that may affect the correlation coefficients and regression.

In this investigation I am going to look into whether or not there is a connection between the height and weight of year 11 boys and girls. I chose height and weight as my variables because I feel that they will have a strong correlation and should be dependent on each other, i.e. the taller a person is, the more they should weigh. I also chose gender as a background variable to see whether it influences the result. The population from which I shall gather my sample is the boys and girls in year 11 at Wilnecote High School. I will gather data on 35 boys and 35 girls chosen at random to give me a set of data that represents the whole year group.

To pick people from the year at random I will get a list of every boy and every girl in year 11 on separate sheets and assign a number to each name (1-135 for boys and 1-124 for girls). To choose which students will be used I will use the random function on my calculator: as there are 135 boys, I press 135, Ran, and then the equals button 35 times, taking note of each number. I will then ask the chosen students whether I can weigh them and measure their height. I will repeat this process for the girls, picking 35 out of the 124 in the year.
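The selection step described above can also be sketched in code. This is only an illustration, not the calculator method itself: it assumes Python's standard `random` module, and the function name `pick_sample` and the seed values are hypothetical. It samples without replacement, so no list number can be picked twice, whereas pressing the calculator's random key repeatedly could in principle repeat a number.

```python
import random


def pick_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct list numbers from 1..population_size."""
    rng = random.Random(seed)  # a seed makes the selection repeatable
    # rng.sample draws without replacement, so every number is unique
    chosen = rng.sample(range(1, population_size + 1), sample_size)
    return sorted(chosen)


# 135 boys and 124 girls on the year 11 lists, 35 chosen from each
boys = pick_sample(135, 35, seed=1)
girls = pick_sample(124, 35, seed=2)

print("Boys' list numbers: ", boys)
print("Girls' list numbers:", girls)
```

Each number printed corresponds to one name on the numbered list for that gender.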

Data Collection

After I had gathered the names of the 35 boys and 35 girls in my sample, I arranged to go to one of the year 11 assemblies. I read out their names at the end and asked them to stay behind afterwards. Once I had my sample students together I told them what I was doing and why (coursework), then asked them if there were any problems with me taking their weight and height; there were no complaints. I asked them all to remove their shoes and any clothing other than their uniform, to minimise differences caused by the weight of clothing and the height of shoe heels. To measure weight I used bathroom scales and measured to the nearest kg (presuming that the scales were accurate and that uniform, not including shoes, was the only clothing worn). To measure height I used a tape measure Blu-Tacked to the wall and measured to the nearest cm, minimising error by standing on a stool so that each reading was as accurate as possible under the conditions (presuming that they all stood with their shoes off, feet together, feet and back against the wall, and that they weren’t slouching or tip-toeing). The weighing and measuring were carried out in the same way for every student, ensuring a fair test. As I took the data I put it into tables, which later the same day I entered into two data sheets in Excel (one for boys and one for girls).

Product moment correlation coefficient

All correlation coefficients lie between -1 and 1, as shown in the diagrams above. However, they are rarely exactly -1, 0 or 1, as these values would indicate a perfect negative correlation, a set of data with no linear correlation, or a perfect positive correlation respectively (with a value of 0 the data may still be related in a non-linear way; the points may lie on a quadratic curve, for example). A strong positive correlation would be around 0.7 to 0.9 and a strong negative correlation around -0.7 to -0.9. “r” is the symbol used to indicate the product moment correlation coefficient.
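To make these properties concrete, here is a minimal sketch of how r could be computed, assuming the standard definition r = Sxy / sqrt(Sxx * Syy). The function name `pmcc` and the heights and weights are made up purely for illustration; they are not the data collected for this coursework.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heights (cm) and weights (kg), for illustration only
heights_cm = [150, 160, 170, 180]
weights_kg = [45, 52, 60, 66]
print("r for the made-up data:", round(pmcc(heights_cm, weights_kg), 3))

# Points lying exactly on a quadratic curve: clearly related, yet r is 0
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
print("r for y = x^2:", pmcc(xs, ys))
```

The second example shows why a value of 0 only rules out a linear pattern: the points follow a curve exactly, but the positive and negative products cancel.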

I will work out “r” between the height variable (x) and the weight variable (y) for the year 11 boys and girls, which will give me a measure of linear correlation. Afterwards I will interpret the two r values obtained and analyse them to give reasons for the strength of each correlation and why they may differ. The linear equation that best describes the relationship between x and y can be found by linear regression. The means by which I shall do this are as follows.
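As a rough illustration of a line of best fit between x and y, the sketch below assumes the usual least squares regression line of y on x, y = a + bx, with b = Sxy / Sxx and a = mean(y) - b * mean(x). The function name `regression_line` and the data are hypothetical and are not taken from the sample collected here.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the x values must not all be the same")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # gradient: change in y per unit change in x
    a = mean_y - b * mean_x  # intercept: the line passes through the mean point
    return a, b


# Hypothetical heights (cm) and weights (kg), for illustration only
heights_cm = [150, 160, 170, 180]
weights_kg = [45, 52, 60, 66]
a, b = regression_line(heights_cm, weights_kg)
print(f"weight = {a:.2f} + {b:.3f} * height")
```

The gradient b estimates how many kilograms weight changes for each extra centimetre of height in the data supplied.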

I shall use the following equations later in my coursework to help find the correlation values.

...