# The heights of 16-18 year old young adults vary between males and females. My prediction is that the majority of males are taller than females.

Statistics coursework

The heights of 16-18 year old young adults vary between males and females.  My prediction is that the majority of males are taller than females.

I have decided to take a sample of 100 young adults aged 16-18 from Havering Sixth Form College and measure their heights with a tape measure.  My two populations will be males and females, and I will obtain 50 male heights and 50 female heights.  I have chosen a sample of 100 because a small sample may give inaccurate results, whereas a much larger sample may be impractical and time-consuming.  I therefore feel that a sample size of 100 will give an efficient set of results from which a meaningful conclusion can be drawn.

To ensure that the data I collected was not biased, I did not take each person's apparent height into account when choosing whom to measure, so that the selection was even for both populations and within both populations.  Also, because height is not affected by whether you are athletic, how much you eat or who your friends are, the data cannot be biased in those respects.  I also did not collect my data somewhere such as a modelling studio, where nearly everyone would be over 6 ft, which would have made my data collection biased and unfair.  Overall, the data I collected was not biased and was suitable to use for my investigation.

I decided to investigate the difference in heights between males and females aged 16-18 by collecting data from students attending Havering Sixth Form College and then working out the mean and variance of both populations, so that I could work out confidence intervals.  To work out the mean and variance I had to draw up tables for both populations.  The tables show X as the height in inches, which ranged from 60 to 73 for females and from 62 to 76 for males.  Then, from my raw data, I worked out the frequency (ƒ) of each X to see how many males and how many females were in each height category.  To make sure that I had 50 in each population, I added up the total (Σ) of the ƒ column and checked that it came to 50.

This allowed me to work out X times ƒ (xƒ) for each row.  Because I was producing my tables in Excel, I was able to insert the formula =B5*D5, which did the calculation for me.  I could then work out the total of the xƒ column, which I use later in my calculations to work out the mean and variance for both populations.  I then had to work out X squared times ƒ (x²ƒ), so I inserted the formula =B5^2*D5 and then worked out the Σ of that column.
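The table-building steps described above can be sketched in Python as a cross-check on the spreadsheet. This is only an illustration: the heights in `raw_heights` are made up (they are not the coursework data), and the function names `frequency_table` and `column_totals` are hypothetical helpers introduced here.

```python
from collections import Counter

# Hypothetical raw heights in inches, invented for illustration only.
raw_heights = [64, 66, 66, 67, 67, 67, 68, 68, 70, 71]


def frequency_table(heights):
    """Return one row per distinct height: (x, f, x*f, x^2*f), sorted by x."""
    counts = Counter(heights)
    return [(x, f, x * f, x * x * f) for x, f in sorted(counts.items())]


def column_totals(rows):
    """Return the column sums (sum of f, sum of x*f, sum of x^2*f)."""
    sum_f = sum(row[1] for row in rows)
    sum_xf = sum(row[2] for row in rows)
    sum_x2f = sum(row[3] for row in rows)
    return sum_f, sum_xf, sum_x2f


rows = frequency_table(raw_heights)
sum_f, sum_xf, sum_x2f = column_totals(rows)

for row in rows:
    print(row)
print("Totals:", sum_f, sum_xf, sum_x2f)
```

The first total plays the same role as the check that the ƒ column adds up to 50: it should equal the number of people measured.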

To work out the mean for each population I had to divide the Σ of X times ƒ by the Σ of ƒ, so inserting the formula =E20/D20 calculated the mean.

I have chosen to use this formula because my data is in the form of a frequency table.
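Written in symbols, the mean of a frequency table is usually given as below, where $x$ is a height and $f$ is its frequency. The notation $\bar{x}$ is the standard textbook symbol for the sample mean and is an addition here, not something taken from the spreadsheet.

$$
\bar{x} = \frac{\sum x f}{\sum f}
$$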

To obtain the variance of my data for both populations I had to divide the Σ of x²ƒ by the Σ of ƒ and then subtract the mean squared.  To work out the variance of both populations on Excel I ...
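As a rough sketch of how the variance and a confidence interval for the mean could be checked outside Excel, the Python below uses the same variance formula (Σ of x²ƒ divided by Σ of ƒ, minus the mean squared). Several things here are assumptions rather than the method used in the coursework: the table values are invented, the function names are hypothetical, and the interval uses a Normal approximation with z = 1.96 for a 95% level.

```python
import math


def grouped_mean_variance(table):
    """table is a list of (x, f) pairs. Returns (n, mean, variance)."""
    n = sum(f for _, f in table)
    sum_xf = sum(x * f for x, f in table)
    sum_x2f = sum(x * x * f for x, f in table)
    mean = sum_xf / n
    # Variance as described in the text: sum(x^2 f) / sum(f) - mean^2
    variance = sum_x2f / n - mean ** 2
    return n, mean, variance


def confidence_interval(n, mean, variance, z=1.96):
    """Approximate interval for the mean; z = 1.96 is an assumed 95% level."""
    half_width = z * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


# Hypothetical (height in inches, frequency) pairs, invented for illustration.
table = [(64, 1), (66, 2), (67, 3), (68, 2), (70, 1), (71, 1)]

n, mean, variance = grouped_mean_variance(table)
low, high = confidence_interval(n, mean, variance)
print(f"n = {n}, mean = {mean:.2f}, variance = {variance:.2f}")
print(f"Approximate 95% interval for the mean: ({low:.2f}, {high:.2f})")
```

The ten made-up values only keep the example short; the Normal approximation is more reasonable for samples of 50 such as those collected here. Some textbooks multiply this variance by n/(n - 1) to get an unbiased estimate before building the interval, which would widen it slightly.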

#### Here's what a teacher thought of this essay

An excellent piece of work showing good use of Normal distribution, the central limit theorem and confidence intervals. 5 stars