"There ought to be a relationship between the respective heights and weights of the students in the sample." I will conduct an investigation to prove this statement.

Name: Emily Capes Candidate Number: 5085

Data Handling Project

I will investigate the following statement:

“There ought to be a relationship between the respective heights and weights of the students in the sample.” I will conduct an investigation to prove this statement.

Task description

The extensive data we were given about the pupils at Mayfield High presented many possible comparisons that could reveal a relationship worth investigating. The most obvious is the relationship between height and weight, a comparison that is likely to show a relationship. Because of this reliability, I have chosen to investigate height and weight instead of a more questionable comparison, for example the relationship between IQ and hours of television watched per week.

Hypothesis

I predict that as the heights of the pupils increase, their weights will increase at a proportional rate. This hypothesis is based on the basic knowledge that an increase in weight accompanies an increase in height.

This investigation could be extended indefinitely, but because of time constraints I will have to keep it focused.

To begin the investigation, I calculated several averages, as I thought these would help me recognise obvious trends and would also be useful for other mathematical tasks later in the investigation. I found each average (the mean) by adding together all of the relevant values and then dividing by the number of values.
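The averaging step described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example heights are hypothetical and are not taken from the Mayfield High sample.

```python
def mean(values):
    """Add all the values together, then divide by how many there are."""
    if not values:
        raise ValueError("cannot average an empty list")
    return sum(values) / len(values)


# Hypothetical heights in metres, for illustration only (not from the sample).
example_heights_m = [1.50, 1.60, 1.70]
example_mean_height = mean(example_heights_m)
print(f"Mean height: {example_mean_height:.2f} m")
```

The same function would work unchanged for weights, or for the heights of a single year group or gender.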

The general observation from the table, which should be supported by the scatter graphs I intend to draw, is that as height increases, weight increases at a proportionate rate, supporting my prediction. However, averages can be unreliable because they are distorted by extreme results, so drawing conclusions from these averages alone would be unwise. I will therefore draw a series of scatter graphs to illustrate the data collected and make any correlation clearer to see. I chose scatter graphs because they let me see whether there are any possible links or relationships between the heights and weights of pupils.

The picture for the whole school depends on whether the data is viewed with or without the anomalies (results that do not fit the general pattern of the data). Because the only ‘anomalies’ recorded in my sample were pupils who were extremely ‘large’ compared to the rest of the sample, I chose to include these individuals. Although considerably larger than the rest of the sample, these individuals were in proportion and fitted closely to the line of best fit.

After completing the scatter graphs, I drew a line of best fit on each, passing through the mean point of the plotted data, to highlight the correlation in the results.
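The lines of best fit in this project were drawn by hand through the mean point. As a rough cross-check, a line can also be calculated. The sketch below uses the least-squares method, which is a swapped-in technique and not the method used on the graphs, and which always produces a line passing through the mean point. The function name and the data values are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x about its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are the same, so the gradient is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    # Choosing the intercept this way forces the line through (mean_x, mean_y).
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


# Hypothetical heights (m) and weights (kg), for illustration only.
heights_m = [1.50, 1.60, 1.70, 1.80]
weights_kg = [45.0, 50.0, 57.0, 62.0]
m, c = least_squares_line(heights_m, weights_kg)
print(f"weight = {m:.1f} * height + ({c:.1f})")
```

A line drawn by eye will not match the calculated one exactly, but if both pass through the mean point and the gradients are similar, the hand-drawn line is a reasonable fit.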

Finding equations from my scatter graph

Although my line of best fit can give a good approximation of a pupil’s height or weight when one value is known, simply by reading the values off the graph, I needed to find an equation for the line so that I could estimate someone’s statistics from a single known value without having the graph to hand.

To find the equation of the line of best fit, I knew it would be in the form 'y = mx + c', where y = weight of pupil, m = gradient, x = height of pupil and c = y-intercept. Due to time constraints I could only find the equation of one line of best fit: the one for the scatter graph showing the combined heights and weights of the whole sample of school children. I chose this particular graph because, although not as accurate as finding separate equations for boys, girls or individual year groups would have been, it provides a rough guideline for all of the pupils in the school combined, and I would not expect the equations for separate year groups to differ greatly.
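One way to obtain m and c from a drawn line is to read two points off it, divide the change in weight by the change in height to get the gradient, and then substitute one point into y = mx + c to find c. The sketch below assumes this two-point approach. The function names and the two points are hypothetical and are not read from the actual graph.

```python
def line_from_two_points(p1, p2):
    """Return (m, c) for the straight line y = mx + c through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points must have different x values")
    m = (y2 - y1) / (x2 - x1)  # gradient: change in y over change in x
    c = y1 - m * x1            # rearranged from y1 = m * x1 + c
    return m, c


def predict_weight(height_m, m, c):
    """Estimate weight (y) from a known height (x) using y = mx + c."""
    return m * height_m + c


def predict_height(weight_kg, m, c):
    """Estimate height (x) from a known weight (y) by rearranging to x = (y - c) / m."""
    if m == 0:
        raise ValueError("a horizontal line cannot be used to estimate height")
    return (weight_kg - c) / m


# Two hypothetical points (height in m, weight in kg), for illustration only.
point_a = (1.50, 45.0)
point_b = (1.70, 55.0)
gradient, intercept = line_from_two_points(point_a, point_b)
print(f"Estimated weight at 1.60 m: {predict_weight(1.60, gradient, intercept):.1f} kg")
```

Choosing two points that are far apart on the line reduces the effect of small reading errors on the gradient.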