

MAYFIELD HIGH SCHOOL

Introduction

For my investigation I’m going to use data from Mayfield High School. The data set covers 1183 pupils, split into 604 boys and 579 girls from year groups 7-11. It includes many fields for each pupil: first name, surname, height, weight, age, month of birth, KS2 results etc. I have chosen to investigate the relationship between height and weight, as I expect them to be clearly related and to influence each other.

I will investigate the following:

  • The relationship between height and weight, and how it varies with gender.
  • The height-to-weight ratio in terms of body mass index.
  • How height and weight compare between the year groups.

Where the data can be found

The data can be found on the school computers (located on the “W” drive, under MY COMPUTERS)

I’ve chosen these sources of information because…

I have chosen these sources of information because the data is continuous, so it covers a wide range of values and lets me make more predictions and hypotheses. The data is also quantitative, so I can carry out calculations to take my investigation further.

The data is reliable because:

This data is reliable because it has been taken from real people, so the values are real and not made up. (Anomalies are taken into account and are simply eliminated.)

Prediction/hypothesis

  • I predict that there will be a positive correlation between height and weight in the whole school. I predict that there will be a positive correlation between height and weight in each year group but there will be a poorer positive correlation as age increases. Girls will tend to have a better positive correlation than boys.
  • Average boys’ weight will be larger than the average girls’ weight.
  • People who walk will tend to have a low body mass index that fits their “age frame” (between 20 and 25) compared to people who take any other form of transport. People who travel by bicycle will have a lower body mass index (i.e. they’ll tend to be in the underweight region for BMI) than people who take any other form of transport, including walking.
  • The spread of boys’ weight will be larger than the spread of girls’ weight.

I should collect data on the height and weight of 60 pupils. I will choose a certain number of pupils from each year group, and an equal number of boys and girls from each year group. This means that my overall sample should contain 30 boys and 30 girls.

Data

In order to make my data sample fair and unbiased I will take a random stratified sample, which means I will take a number of boys and girls from each year group in proportion to that group’s share of the whole school.

The data may contain some anomalies (e.g. a Year 11 pupil might be recorded as less than 1 metre tall or as weighing less than 25 kg), in which case I will simply eliminate that pupil from the sample and select another of the same sex and year group at random.
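The replacement rule described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout, the function names `is_anomaly` and `draw_valid`, and the pupil records are all hypothetical, and the thresholds are simply the example values of 1 metre and 25 kg quoted in the text.

```python
import random

# Assumed thresholds, taken from the example values in the text.
MIN_HEIGHT_M = 1.0
MIN_WEIGHT_KG = 25.0


def is_anomaly(pupil):
    """Return True if a record looks implausible (too short or too light)."""
    return pupil["height_m"] < MIN_HEIGHT_M or pupil["weight_kg"] < MIN_WEIGHT_KG


def draw_valid(stratum, k, rng=random):
    """Pick up to k pupils at random from one stratum (same sex and year).

    Anomalous records are skipped, so the next randomly ordered pupil
    from the same stratum takes their place.
    """
    pool = list(stratum)
    rng.shuffle(pool)
    chosen = []
    for pupil in pool:
        if len(chosen) == k:
            break
        if not is_anomaly(pupil):
            chosen.append(pupil)
    return chosen


# Hypothetical records for illustration only (not real Mayfield pupils).
year11_boys = [
    {"height_m": 1.72, "weight_kg": 63},
    {"height_m": 0.95, "weight_kg": 60},  # anomalous height
    {"height_m": 1.80, "weight_kg": 20},  # anomalous weight
    {"height_m": 1.65, "weight_kg": 55},
]
sample = draw_valid(year11_boys, 2, random.Random(0))
```

Shuffling the stratum once and skipping anomalous records has the same effect as drawing a pupil, rejecting it if it is anomalous, and drawing again from the same sex and year group.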

I will now investigate my hypotheses using an assortment of:

  • Graphs (bar, line, cumulative frequency graphs, histograms)
  • Diagrams (box and whisker etc.)
  • Calculations

Investigating my hypothesis even further

There will be a positive correlation between height and weight. I will collect the height and weight of 30 boys and 30 girls and plot a graph, with height on the x-axis and weight on the y-axis. I will plot the points, draw a line of best fit and form its equation (y=mx+c). With this I can estimate the approximate height or weight of other pupils in the group. If the gradient of the line is positive, this will support my hypothesis.

The average boys’ weight will be larger than the average girls’ weight.

I will calculate the mode, median and mean of the weight data of the 60 pupils, both grouped and ungrouped. If the averages for the boys are higher than the averages for the girls, then my hypothesis will be supported. Histograms can be drawn to show where the weights are concentrated.
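As a rough illustration of the ungrouped calculation, the three averages could be worked out in Python with the standard `statistics` module. The weights below are made up for the example and are not taken from the Mayfield sample; the function name `averages` is also an assumption.

```python
from statistics import mean, median, multimode

# Hypothetical weights in kg, for illustration only.
boys_kg = [45, 50, 50, 62, 68]
girls_kg = [40, 44, 48, 48, 55]


def averages(weights):
    """Return the mean, median and mode(s) of a list of weights."""
    return {
        "mean": mean(weights),
        "median": median(weights),
        "mode": multimode(weights),  # a list, since data can have several modes
    }


boys_avg = averages(boys_kg)
girls_avg = averages(girls_kg)

# The hypothesis is supported for this made-up data if the boys' mean is larger.
boys_heavier = boys_avg["mean"] > girls_avg["mean"]
```

For grouped data the mean would instead be estimated from class midpoints and frequencies, which this sketch does not cover.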

My third hypothesis is about body mass index. Here I will compare the BMI of pupils with the mode of transport they take. I will draw several bar charts, including a bar chart that compares the mean BMI I calculated for the different groups of people (i.e. pupils grouped by the mode of transport they take). I will also draw a bar chart to show the percentage of people in each group who have a BMI in the required range. I will also carry out some calculations to show where the BMI values are concentrated and compare them to the table of suggested BMI values.
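The BMI comparison could be sketched as below. The formula used is the standard definition, BMI = weight in kg divided by the square of height in metres, which is not quoted in the text above. The pupil records, the field names and the function names are hypothetical, and the 20 to 25 band is the one mentioned in the prediction.

```python
from collections import defaultdict
from statistics import mean


def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2


def bmi_by_transport(pupils, low=20, high=25):
    """Return {transport: (mean BMI, % of pupils with BMI in [low, high])}."""
    groups = defaultdict(list)
    for p in pupils:
        groups[p["transport"]].append(bmi(p["weight_kg"], p["height_m"]))
    summary = {}
    for transport, values in groups.items():
        in_band = sum(1 for v in values if low <= v <= high)
        summary[transport] = (mean(values), 100 * in_band / len(values))
    return summary


# Hypothetical pupils, for illustration only.
pupils = [
    {"transport": "walk", "height_m": 1.60, "weight_kg": 56.32},
    {"transport": "walk", "height_m": 1.50, "weight_kg": 40.5},
    {"transport": "bus", "height_m": 1.70, "weight_kg": 63.58},
]
summary = bmi_by_transport(pupils)
```

The two numbers returned for each group correspond to the two bar charts described above: one for the mean BMI and one for the percentage of pupils inside the band.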

My final hypothesis states that boys will have a larger spread of weight than girls. For this hypothesis I will use several measures of spread, including standard deviation, variance, interquartile range/semi-interquartile range, range and several percentage calculations. I will also draw box and whisker plots to make the comparison clearer.
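A minimal sketch of these measures of spread, using Python's `statistics` module, is given below. The weights are invented for the example. Two choices here are assumptions: the population forms of variance and standard deviation are used, and the quartiles come from the default "exclusive" method of `statistics.quantiles`, which may differ slightly from the quartile method taught in class.

```python
from statistics import pstdev, pvariance, quantiles


def spread(weights):
    """Return several measures of spread for a list of weights in kg."""
    q1, _, q3 = quantiles(weights, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    return {
        "range": max(weights) - min(weights),
        "iqr": iqr,
        "semi_iqr": iqr / 2,
        "variance": pvariance(weights),
        "sd": pstdev(weights),
    }


# Hypothetical weights in kg, for illustration only.
boys_kg = [40, 50, 60, 70, 80]
girls_kg = [45, 50, 55, 60, 65]

boys_spread = spread(boys_kg)
girls_spread = spread(girls_kg)

# For this made-up data every measure is larger for the boys.
boys_more_spread = all(boys_spread[k] > girls_spread[k] for k in boys_spread)
```

Comparing several measures at once is useful because the range alone can be distorted by a single extreme weight, whereas the interquartile range and standard deviation are less affected.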

Sample size: 60 pupils

Stratifying My Data

I will need to take a stratified sample because there are different numbers of pupils in each year, so if I took the same number of pupils from each year group my results would be biased. I will divide the school into ten groups: first by year group, and then by gender within each year group.

To do this I will take the number of pupils in a group, e.g. male students in Year 7 (151 boys), divide this number by 1183 and multiply the result by the number of pupils I’m sampling (60 pupils).

(151/1183) × 60 = 7.658 (rounded to 8)
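The same proportional calculation can be written as a short Python sketch. Only the Year 7 boys figure (151 out of 1183, with a sample of 60) comes from the text; the function name `stratum_share` is an assumption made for the example.

```python
def stratum_share(stratum_count, school_total, sample_size):
    """Number of pupils to sample from one stratum, before rounding."""
    return stratum_count / school_total * sample_size


# Year 7 boys: 151 pupils out of 1183, with an overall sample of 60.
share = stratum_share(151, 1183, 60)

# Round to the nearest whole pupil.
pupils_to_pick = round(share)
```

Here `share` evaluates to about 7.66, which rounds to 8, the number of Year 7 boys used in the Year 7 scatter graph. Because each of the ten groups is rounded separately, the rounded figures should be checked to make sure they still add up to 60.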

The final plan conclusion

Once I have collected the information required from the place stated in the plan, I will pick 60 data points by random stratified sampling and will then do the following:

  1. Plot several different graphs for the 60 data points
  2. Explain each graph
  3. Relate each graph to the hypothesis
  4. Draw three other graphs on random data points
  5. Use a graph to draw a conclusion
  6. Conclude the hypothesis

I will repeat this for each hypothesis and reach an investigation conclusion.

Copy of stratified sample of data I will be using in my project

MY FIRST HYPOTHESIS

There will be a positive correlation between height and weight. I will collect the height and weight of 30 boys and 30 girls and plot a graph, with height on the x-axis and weight on the y-axis. I will plot the points, draw a line of best fit and form its equation (y=mx+c). With this I can estimate the approximate height or weight of other pupils in the group. If the gradient of the line is positive, this will support my hypothesis.

Firstly I’m going to plot a scatter graph for the whole school to show that there is a positive relationship between height and weight, which would support my hypothesis. I will also look at the correlation coefficient to back up my hypothesis.

To do this I will use a program called ‘Autograph’ to draw my scatter graphs, and I will insert a regression line (of order 1, i.e. a straight line), which will show the gradient and the correlation between height and weight.
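For readers without Autograph, a straight regression line and correlation coefficient could be computed by hand as in the sketch below. It assumes that the order 1 regression line is an ordinary least-squares fit of weight on height, which the text does not confirm. The data points and the function name `fit_line` are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares line y = m*x + c and correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx                # gradient of the line of best fit
    c = mean_y - m * mean_x      # y-intercept
    r = sxy / sqrt(sxx * syy)    # product-moment correlation coefficient
    return m, c, r


# Hypothetical heights (m) and weights (kg), for illustration only.
heights = [1.4, 1.5, 1.6, 1.7, 1.8]
weights = [40, 46, 50, 58, 61]
m, c, r = fit_line(heights, weights)
```

A positive gradient `m` goes together with a positive `r`, and the closer `r` is to 1, the more tightly the points cluster around the line.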

 

SCATTER GRAPH FOR WHOLE SCHOOL

This graph shows the spread of pupils’ weight against height throughout the whole school (I chose 30 boys and 30 girls as my sample). As the graph shows, the heights are spread from 1.38m to 1.8m on the x-axis, whilst the weights range from 33kg to 92kg on the y-axis. The equation of the line on the graph is y=48.16x–24.17. From the gradient (48.16) I can tell that there is a positive relationship between height and weight. The correlation coefficient for the whole school, which is 0.1141, tells me that there is a positive correlation between height and weight, although it is not very strong. This supports my hypothesis, though only weakly.
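As a worked example of using this equation for estimates, the whole-school line y=48.16x–24.17 can be applied in either direction. The function names below are assumptions made for the sketch, and because the correlation is weak the estimates should be treated as very rough.

```python
# Gradient and intercept of the whole-school line of best fit (from the graph).
GRADIENT = 48.16
INTERCEPT = -24.17


def estimate_weight(height_m):
    """Estimate weight in kg from height in metres using y = mx + c."""
    return GRADIENT * height_m + INTERCEPT


def estimate_height(weight_kg):
    """Estimate height in metres by rearranging the line: x = (y - c) / m."""
    return (weight_kg - INTERCEPT) / GRADIENT


weight_at_160cm = estimate_weight(1.60)
height_at_60kg = estimate_height(60)
```

For a pupil 1.60m tall the line gives 48.16 × 1.60 – 24.17 = 52.886, so about 53kg, and for a pupil weighing 60kg it gives a height of about 1.75m.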

Yr 7 scatter graph

For this scatter graph I chose 8 boys and 7 girls, giving me a total of 15 data points. As the graph shows, the heights are spread from 1.36m to 1.73m on the x-axis, whilst the weights range from 35kg to 90kg on the y-axis. There is 1 noticeable outlier (a point that is some distance away from the regression line), though I have not removed it, as such a value is possible. The line of best fit has the equation y=42.0x-19.52. The correlation coefficient for this scatter graph is 0.3821, which shows that there is a positive correlation between height and weight; it is not that strong, but it is reasonable as it is not close to 0.

Yr 8 scatter graph

For this scatter graph I’ve chosen 7 girls and 7 boys, making a total of 14 data points. As the graph shows, the heights are spread from 1.38m to 1.75m on the x-axis, whilst the weights range from 35kg to 61kg on the y-axis. The line of best fit has the equation y=47.41x-27.44. The correlation coefficient for this scatter graph is 0.5097, which shows that there is a moderate positive correlation between height and weight, supporting my hypothesis.


Yr 9 scatter graph

For this scatter graph I’ve chosen 7 girls and 6 boys, making a total of 13 data points. As the graph shows, the heights are spread from 1.5m to 1.76m on the x-axis, whilst the weights range from 41kg to 72kg on the y-axis. The line of best fit has the equation y=11.38x+42.08. The correlation coefficient for this scatter graph is 0.1141, which shows that there is a poor positive correlation between height and weight, which helps support my hypothesis that there will ...
