Investigate the following fields; Weight and Height.

Adam Sheldrake GCSE Maths Coursework Mrs Thorpe

Year 11 Gault Weight and height investigation. 10/10/03

Introduction

My name is Adam Sheldrake and I attend Lytchett Minster School in Dorset, England. It is a mixed secondary school for pupils aged 11 to 17, plus a sixth form with students aged 17 to 19. The school is in the countryside near several small towns and villages, and most of the students in the area come to this school.

Preliminary Investigation

I have decided to investigate the following fields: weight and height. I chose these two values because I am curious whether there is any connection between them, as I know many people my age who are around my weight but a lot shorter than me. Both fields are forms of continuous data, in contrast to discrete data such as gender or favourite sport. Discrete data usually occurs where there are only a certain number of values, or when we are counting something (using whole numbers).

Continuous data makes up the rest of numerical data. This is a type of data that is usually associated with some sort of physical measurement, in this case, height and weight.

Data is discrete if there are only a finite number of values possible, or if there is a space on the number line between each two possible values. This form of data does not provide much scope for comparison and analysis. I have therefore chosen to investigate a type of data that is not limited to a fixed set of values, so that I can make detailed comparisons and draw conclusions.

I predict that there will be a correlation between weight and height, because as you grow taller your body mass increases: you have more bone and so on to make up your body. However, it would be interesting to compare the weights of people of different ages who are the same height, as body maturity may make the older person heavier due to developed muscles. It may also be intriguing to discover the relationship between males and females of the same age.

Within my investigation I am concerned with gender difference, as on examining my data it is apparent that the values for each sex are poles apart: values for males are higher and values for females are lower. I believe this will become more apparent as my investigation proceeds.

Within my preliminary investigation I will analyse a sample of random subjects. These subjects will be chosen methodically by taking thirty from the boys and thirty from the girls.

I have chosen to take a sample of 30 boys and 30 girls because 30 out of the total number of each gender is about 5%. I consider this a good starting point for my pre-test: it will produce data quickly and give me a good idea of how to handle the information, so that I can improve my testing for a more detailed analysis later on.

Percentage sample of boys

=30/604 x 100

=4.9669%

Percentage sample of girls

=30/579 x 100

=5.1813%
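The two percentages above can be re-checked with a short Python sketch. The function name `sample_percentage` is only an illustrative choice and is not part of the original coursework; the figures 30, 604 and 579 are the ones quoted above.

```python
def sample_percentage(sample_size, population):
    """Return the sample size as a percentage of the population."""
    return sample_size / population * 100


# Figures taken from the calculation above: 30 pupils out of 604 boys and 579 girls.
boys_pct = sample_percentage(30, 604)
girls_pct = sample_percentage(30, 579)

print(f"Boys:  {boys_pct:.4f}%")
print(f"Girls: {girls_pct:.4f}%")
```

Both results come out close to 5%, which is the proportion the pre-test was aiming for.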

I used quota sampling to collect my data, simply choosing students from the corresponding gender and year groups as they walked past me. This may not be the fairest way of obtaining my data, but I believe that it was just as random as it would have been if I had catalogued every single student and picked them out at random.

I used this method to generate a sample of 30 boys and girls:

Girls

Boys

Handling the Data

I also examined the accuracy of the data. As I cannot verify the truthfulness of my data, I will take it at face value, but I have ignored any data not recorded within 2 decimal places, weights outside the range 30-100kg and heights outside 1-2 metres. I have decided to eliminate these values as I consider them abnormal and likely to bias my results. I can justify this by examining BMI, as I am aware that anyone with a body mass index lower than 18 or higher than 35 is not typical. This applies for height ...
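The screening rules described above could be sketched in Python as follows. This is only a rough illustration: the function name `is_plausible` and the four example records are hypothetical and are not taken from my sample, and BMI is calculated in the usual way as weight in kilograms divided by the square of height in metres.

```python
def is_plausible(height_m, weight_kg):
    """Apply the limits stated above: height 1-2 m, weight 30-100 kg, BMI 18-35."""
    if not 1.0 <= height_m <= 2.0:
        return False
    if not 30.0 <= weight_kg <= 100.0:
        return False
    bmi = weight_kg / height_m ** 2  # body mass index in kg per square metre
    return 18.0 <= bmi <= 35.0


# Hypothetical records as (height in m, weight in kg); NOT the coursework data.
records = [(1.62, 50.0), (1.75, 110.0), (4.65, 45.0), (1.80, 40.0)]

kept = [record for record in records if is_plausible(*record)]
print(kept)
```

In this made-up list only the first record passes all three checks; the others fail on weight, height and BMI respectively.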

Obtaining evidence for analysis

My analysis will concern the differences between boys and girls. I am aware that from the age of 11 both genders go through a form of change and, depending on their personal chemistry, a change in weight and height is one of the major factors within adolescence.

By obtaining data that I can tabulate, I can carry out a series of comparisons and draw conclusions from my data.

There are a range of statistical calculations I can make use of.

I am aware I can obtain the following, for both genders and both height and weight:

Frequency distribution;
Mean;
Mean deviation;
Standard deviation;
Median;
Range;
Inter-quartile range;
Central tendency;
Measure of dispersion or spread;
Distribution.

Mode = Value that occurs most in a data set. Not a very useful measure of central tendency.

Median = Middle value from a set of ranked observations. Useful for highlighting the typical value of a data set.

Mean = Sum of a set of observations divided by the number of observations in a data set; the most widely used measure of central tendency. It can also be calculated as a weighted mean for grouped data.

Standard Deviation = Measures the amount of spread or variability in a data set, and so how typical of the whole distribution the mean actually is. The larger the standard deviation, the greater the spread of observations and the less typical the mean.

Standard deviation or variance is an absolute measure when used to compare locations or regions.

Mean = A measure of central tendency calculated by dividing the sum of the scores in a distribution by the number of scores in the distribution. This value best reflects the typical score of a data set when there are few outliers and/or the dataset is generally symmetrical.

Box plot = Summary plot based on the median, quartiles, and extreme values. The box represents the inter-quartile range, which contains the middle 50% of values. The whiskers represent the range; they extend from the box to the highest and lowest values, excluding outliers. A line across the box indicates the median.

Skewness = Measures the degree to which data values are evenly or unevenly distributed on either side of the mean. If a majority of the values in a data set fall below the mean, then data are positively skewed with the tail of the histogram falling to the right. If a majority of the values fall above the mean, then data are negatively skewed and the tail of the histogram will fall to the left.

As I have a limited amount of time for this investigation, I will consider the importance of each of the above before I carry any of them out.

Histogram to compare the Heights for boys and girls

Histogram = Graphic representation of grouped data along two axes

I have chosen to draw a histogram of the heights of the boys and girls within my sample. This requires grouping the data for both boys and girls so that I can work out the frequency and frequency density of each class. The class widths need to be chosen so that each class has a sensible frequency and the histogram shows a clear distribution.

I have created a table below which displays the frequency distribution and frequency density for my samples.

Frequency = frequency density x class width interval
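Rearranging the formula above gives frequency density = frequency / class width, which is the height of each bar on the histogram. The sketch below shows this for a few unequal classes. The class boundaries and frequencies are hypothetical values chosen for illustration, not my tabulated results, and `frequency_density` is an assumed helper name.

```python
def frequency_density(frequency, lower, upper):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)


# Hypothetical height classes as (lower bound in m, upper bound in m, frequency).
classes = [(1.40, 1.50, 3), (1.50, 1.60, 8), (1.60, 1.65, 7), (1.65, 1.80, 12)]

rows = []
for lower, upper, freq in classes:
    density = frequency_density(freq, lower, upper)
    rows.append((lower, upper, freq, density))
    print(f"{lower:.2f}-{upper:.2f} m: frequency {freq}, density {density:.1f}")
```

Because the bar height is the density and not the raw frequency, a narrow class such as 1.60-1.65 m can have a taller bar than a wide class containing more pupils.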

Boys' heights (m)

Girls' heights (m)

Boys' weights (kg)

Girls' weights (kg)

The above sets of data were used in figure 1.1, which concerns the heights of each gender. I have plotted my histograms so that boys and girls appear on the same graph and I can make direct comparisons. A histogram lets me take in the distribution more quickly than if I were simply to examine the above tables: a graph shows the information clearly, and features such as a lack of symmetry, or skew, can be identified.

What are the mean and standard deviation?

I wish to calculate the means and standard deviations using raw data so that I can obtain additional statistics for my comparison.

I used my calculator to obtain the mean and standard deviation for both genders, using my data for heights and then weights.
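The same statistics can be obtained with Python's standard `statistics` module instead of a calculator. The heights below are hypothetical values for illustration only. I do not know for certain which form of standard deviation my calculator reports, so the sketch shows both the population form (`pstdev`, dividing by n) and the sample form (`stdev`, dividing by n - 1).

```python
import statistics

# Hypothetical heights in metres; NOT the coursework sample.
heights = [1.52, 1.60, 1.63, 1.65, 1.70, 1.75]

mean_height = statistics.mean(heights)
sd_population = statistics.pstdev(heights)  # divides by n
sd_sample = statistics.stdev(heights)       # divides by n - 1

print(f"Mean: {mean_height:.3f} m")
print(f"Population SD: {sd_population:.3f} m")
print(f"Sample SD: {sd_sample:.3f} m")
```

The sample form is always slightly larger than the population form for the same data, so it matters that the same form is used for boys and girls when comparing them.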

Height

Weight

Cumulative curves of weights of both genders

Within the above tables of frequency and frequency density for both boys' and girls' weights I have also calculated the cumulative frequency, so that I could draw cumulative frequency curves of these results (figure 1.2).

I can use my graph to analyse the relationship between the data for boys and girls. The two cumulative curves are plotted on the same graph so that I can carry out a better analysis.

Median weight (kg):

Boys = 55

Girls = 48.5

I was able to obtain the median values for each sex easily as my results from my sample were put into a table within a spreadsheet and so I was able to arrange the data in ascending weight order and pinpoint the middle value.

Once the data is arranged in order of ascending weight, I can read off the range:

Weight range (kg):

Boys = 35-93

Girls = 33-72

Obtaining the lower quartile, upper quartile and inter-quartile range

I was able to obtain these results by locating the 25th, 50th and 75th percentiles on my graph and reading off the corresponding values. This was made simpler by the fact that weight is a continuous variable, so the curve is a continuous approximation of the distribution of values.
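Reading a percentile off a cumulative frequency graph amounts to interpolating between the plotted points. The sketch below does this with straight-line interpolation, which assumes the points are joined by straight lines instead of a smooth curve. The class boundaries and cumulative frequencies are hypothetical, and `read_off` is an illustrative name.

```python
def read_off(boundaries, cumulative, fraction):
    """Estimate the value below which the given fraction of the data lies.

    boundaries: class boundaries (one more entry than there are classes)
    cumulative: cumulative frequency at each upper class boundary
    """
    target = fraction * cumulative[-1]
    previous = 0
    for lower, upper, cum in zip(boundaries, boundaries[1:], cumulative):
        if target <= cum:
            share = (target - previous) / (cum - previous)
            return lower + share * (upper - lower)
        previous = cum
    return boundaries[-1]


# Hypothetical weight classes in kg with cumulative frequencies for 30 pupils.
boundaries = [30, 40, 50, 60, 70, 80]
cumulative = [4, 14, 23, 28, 30]

lower_quartile = read_off(boundaries, cumulative, 0.25)
median = read_off(boundaries, cumulative, 0.50)
upper_quartile = read_off(boundaries, cumulative, 0.75)
iqr = upper_quartile - lower_quartile

print(f"LQ {lower_quartile:.1f}, median {median:.1f}, UQ {upper_quartile:.1f}, IQR {iqr:.1f}")
```

The inter-quartile range is simply the upper quartile minus the lower quartile, so it describes the spread of the middle 50% of the pupils.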

Scatter diagrams of weight against height for boys and girls

This graph enables me to look at any possible correlation between the two variables, height and weight. I can deduce the degree of correlation from how closely the plotted points follow the line of best fit. (Figure 1.3)

Summary of findings from preliminary investigation

My results lead me to believe that, in general terms, the central tendency for boys' heights is within the range 1.60-1.65 metres, and for girls within both ranges 1.60-1.65 and 1.70-1.75 metres. The standard deviations for boys, for both weight and height, suggest that the boys' values vary more than the girls', meaning their mean is less typical. It appears that the girls vary less in height. Interestingly, the boys' and girls' mean heights are the same and their mean weights do not differ greatly.

These values indicate that a typical weight for boys within Lytchett Minster School, aged from 11-16, is greater than for girls within the same parameters.

As the standard deviations show, the boys' values are more spread out and have a wider range than the girls'.

These results indicate that, in general, both sexes follow the same trend in weight until they reach 50kg. The evidence suggests that 15% of boys are above 68kg and 15% of girls are above 57kg, a difference of 11kg. I can also conclude that 50% of boys are above the median weight of 55kg and 73% of girls are above the median weight of 48.5kg. The plotted points on my scatter graphs for both sexes show weight against height. Both sexes demonstrated a lack of correlation, and the deviation from the line of best fit illustrates a wide spread, especially for boys.

Analysis

Using my provisional conclusions, I have collected some issues I feel need further investigation. My histogram indicated that both sexes follow a trend of increasing height until their weight exceeds around 50kg. There could therefore be a point where boys' weights and heights exceed girls', or where girls start to gain weight at a more proportional rate and reach a steady weight before boys.

My original approach to sampling pupils involved quota sampling of 5% from each sex and 10% of the whole school.

I would like to take a more detailed look at the school, this time increasing the percentage by a further 10%, giving an overall school sample of 20%.

Considering my proposed hypothesis, I am going to conduct a survey across year groups rather than the school as a whole, as I do not believe a further look at the generalised patterns of height and weight for boys and girls will take my theory any further.

I am going to examine year groups 7 and 11 so that I can pinpoint whether in fact:

Girls appear to mature at a steadier rate than boys after a certain point, and finish their growth spurt before boys reach their full potential.

I have chosen these two year groups as a comparison as they are at each end of my available age range and I believe will produce the most promising results.

Depending on the outcome of this search, I will consider whether it is necessary for me to look at the year groups in more detail.

To ensure my testing is fair and candidates have an equal chance of being picked, I have decided to use stratified random sampling, dividing the school up into years and genders. This basically means taking a number from each group in proportion to that group's share of the total.

The sample is taken from the boys and girls of the combined year groups 7 and 11, which have 282 and 170 pupils respectively.

Total number of pupils = 282 + 170 = 452

From these I wish to take a sample of 120 pupils in total, which is about 10% of the whole school.

Therefore the following calculations correspond to my sample:

Number of boys from year 7 = 151/452 x 120 = 40

Number of girls from year 7 = 131/452 x 120 = 35

Number of boys from year 11 = 84/452 x 120 = 22

Number of girls from year 11 = 86/452 x 120 = 23

This is then used to find 40 random pupils from year 7 boys, 35 random pupils from year 7 girls and so on.
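The stratified allocation above can be reproduced with the short sketch below, which rounds each group's share of 120 to the nearest whole pupil and then draws that many pupil numbers at random. The group sizes and the total of 120 are the figures used above. Numbering the pupils in each group from 1 upwards is an assumption made for illustration, as is the use of Python's `random.sample` in place of a calculator.

```python
import random

# Group sizes as used in the calculations above.
strata = {
    "Year 7 boys": 151,
    "Year 7 girls": 131,
    "Year 11 boys": 84,
    "Year 11 girls": 86,
}
sample_size = 120
total = sum(strata.values())

# Each group's share of the sample, rounded to the nearest whole pupil.
allocation = {
    group: round(count / total * sample_size) for group, count in strata.items()
}

# Assumption: pupils in each group are numbered 1..count on a class list.
chosen = {
    group: random.sample(range(1, strata[group] + 1), allocation[group])
    for group in strata
}

for group, size in allocation.items():
    print(f"{group}: {size} pupils")
```

Rounding each share separately does not always add back up to the intended total, so it is worth checking the sum; here the four rounded values do add up to 120.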

Year 7 Boys Heights and Weights:

Year 7 girls Weights and Heights:

Year 11 Boys Weights and Heights:

Year 11 Girls Weights and Heights:

Median weight (kg):

Year 7 Boys = 46

Year 7 Girls = 38

Year 11 Boys = 56

Year 11 Girls =48

Year 7 Height

Year 7 Weight

Year 11 Height

Year 11 Weight

The graphs below are not drawn to the same height scale. This is because the results were otherwise either too close together, making the separate points hard to see, or spread out over a wide range of heights.

The above scatter diagrams indicate to me what to expect when I compile my scatter diagrams in more detail.

Equation for line of best fit

y = mx + c (m = gradient and c = y intercept).

To find the equation for the line of best fit on any graph you need to find the gradient of the line and the y-intercept, so that you can substitute them into the general equation of a straight line, y = mx + c. I therefore need to find them for my graphs.

For example, if the gradient on my graph was 40, I could substitute this into my equation: y = 40x + c

Because my line does not reach the y-axis on my graph, I have to work out where it would cross, so I need to substitute in a pair of values read off my graph. I have chosen:

x = 1.4

y = 42

So I need to replace x and y to find c:

42 = 40 x 1.4 + c

42 = 56 + c ...
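The substitution method above rearranges y = mx + c into c = y - mx. A minimal sketch of that step is given below, using the example gradient of 40 and the example point (1.4, 42) from the working above. The function names `intercept` and `predict_weight` are illustrative only, and the gradient is the example value, not one measured from my graphs.

```python
def intercept(gradient, x, y):
    """Rearrange y = m*x + c to give c = y - m*x for a known point on the line."""
    return y - gradient * x


m = 40                      # example gradient from the text above
c = intercept(m, 1.4, 42)   # point read off the graph: x = 1.4 m, y = 42 kg


def predict_weight(height_m):
    """Estimate weight in kg from height in m using the line y = m*x + c."""
    return m * height_m + c


print(f"c = {c:.1f}")
print(f"Predicted weight at 1.6 m: {predict_weight(1.6):.1f} kg")
```

Once m and c are both known, the line can be used to estimate a weight for any height within the range covered by the graph.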

Calculating correlation coefficients

To make accurate comparisons of the two sets of data for each sex and age, I will use Spearman's rank correlation coefficient, which will show me how closely related height is to weight. Because I have to do the ranking four times, I have decided to use only a sub-sample of ten random people for each of boys and girls in years 7 and 11. I will choose the people for my sample using the random number generator on my calculator; however, my calculator gives numbers to the second decimal place, so I will round up to the nearest whole number. After I have worked out the squared differences between the ranks (d²), I will use the following equation to calculate the correlation coefficient:

1 - 6Σd² / (n(n² - 1))

Year 7 boys

The sum of d² is 71.5

1 – 6 x 71.5/(10(10²-1)) = 0.57

Year 7 Girls

The sum of d² is 93.50

1 – 6 x 93.50/(10(10²-1)) = 0.43

Year 11 Boys

The sum of d² is 5.50

1 – 6 x 5.50/(10(10²-1)) = 0.97

Year 11 Girls

The sum of d² is 117

1 – 6 x 117/(10(10²-1)) = 0.29
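The four calculations above can be re-checked with a short sketch that applies the same formula to each sum of d² with n = 10. The function name `spearman_from_d2` is an illustrative choice. One caution: this simplified formula is strictly exact only when no two people share a rank, and sums ending in .5 suggest that some ranks were tied, so the values should be treated as approximate.

```python
def spearman_from_d2(sum_d2, n):
    """Spearman's rank coefficient from the sum of squared rank differences."""
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Sums of d squared quoted above, each from a sub-sample of 10 pupils.
sums_of_d2 = {
    "Year 7 boys": 71.5,
    "Year 7 girls": 93.5,
    "Year 11 boys": 5.5,
    "Year 11 girls": 117,
}

coefficients = {
    group: spearman_from_d2(total, 10) for group, total in sums_of_d2.items()
}

for group, value in coefficients.items():
    print(f"{group}: {value:.2f}")
```

A value near 1 means the tallest pupils in the sub-sample are also the heaviest, while a value near 0 means the two rankings have little to do with each other.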

Conclusions

I can tell from these scatter graphs, first of all, that the girls have a poorer correlation than the boys. This is supported by my Spearman's rank results, where the correlation coefficient is much greater for boys than for girls. The girls fall within a narrower height bracket, whereas the boys cover a wider height bracket yet have the better correlation. In both of the graphs the majority of people are in the 1.6m to 1.8m range. The boys' heights, however, continue beyond 1.8m, unlike the girls'.

As in year 7, the girls' weights are not spread out, and the majority are still within the 40kg to 60kg range. They are now slightly more spread out than before, when nearly all were close to the 40kg line, but they remain very compact. The boys' weights are spread mainly over the 40kg to 80kg range. This shows that while the girls are fairly uniform in weight, the boys are a lot more varied. As in my first prediction, the boys are heavier than the girls by year 11.

I can tell from the year 11 cumulative frequency graph for weight that the boys are heavier, because the curve ends later for the boys than for the girls. The boys also have a greater inter-quartile range, which I can tell from the shape of the curves; the girls have a tight distribution. I can also tell from the year 11 cumulative frequency graph for height that the boys are taller, because many more boys are still on the graph even after the tallest girl has been accounted for.

Evaluation

I think that my investigation has been a success in supporting my original hypothesis. However, I think that I should have made my original samples a little larger, and that my sub-sample for Spearman's rank was too small: with so few people, the correlation would have varied greatly if one result had been different. Apart from that, I think I have shown that females tend to be of similar weight but with varied heights, giving them poor correlation, while males are spread over a wide range of heights and weights, with the two strongly related.
