Math Coursework-Mayfield High Data Handling

Isaac Wong 10H Math                                                                                   5/9/2007

Introduction

In this investigation, I will use the information in the Mayfield High database to carry out a statistical analysis that tests my hypothesis. The database holds records for a total of 1183 students and is a secondary data source for my investigation. I will use statistical techniques such as standard deviation, scatter graphs and box-and-whisker diagrams. I will carry out three investigations, each of which should contribute evidence towards my main hypothesis. I will use two year groups in this investigation, Year 8 and Year 10. I have chosen them because they represent the middle section of the school community, and because students of that age are very unlikely to have stopped growing.

Hypothesis:

The year group an individual belongs to has a strong effect on their weight (the higher the year group a student is in, the heavier he/she is).

Sub-hypotheses:

  1. Students in Year 8 are generally lighter than students in Year 10 (main hypothesis)
  2. The weights of the two genders are similarly distributed
  3. The taller the individual, the heavier s/he will be

The first sub-hypothesis is the one that will give me the main evidence for my main hypothesis. I think it is correct because, as a person gets older, their body changes physically and continues to grow, so their weight continues to increase. Since the year groups I am sampling from are teenagers, they are at an age of growth, so their growth should not have stopped.

The second sub-hypothesis concerns gender, which may be a factor that determines weight. I do not have much confidence that it is, so I will not claim that it has a large effect, but I will investigate it to make sure.

The third sub-hypothesis seems quite obvious: the taller a person is, the heavier that person is likely to be. It will provide strong evidence on which is the main factor that determines weight, height or year group.

Planning

Each sub-hypothesis will be investigated and presented in a different form:

  1. I plan to take a random sample of 60 students for each year (Year 8 and Year 10). I will use the random number generator to select 30 males and 30 females from each of those two years. I will use these data to draw box-and-whisker diagrams for my first sub-hypothesis.

  2. For the second sub-hypothesis, I will partly reuse the data already obtained for sub-hypothesis 1. I will obtain another 30 males and another 30 females (15 males from Year 8, 15 males from Year 10, 15 females from Year 8 and 15 females from Year 10) and then separate all the data by gender. As a result, I will have 30 male records in Year 8 and 30 male records in Year 10. I will randomly choose 15 of the 30 records from Year 8 and 15 of the 30 from Year 10, so the final data set for the second sub-hypothesis graph will contain 30 records (males only). I will do the same for females. I will present these data in a frequency polygon to show the distribution, with the male and female data on the same graph but drawn with different lines, so that I can easily see the difference between the two distributions. A further 60 people will then be randomly selected from the group already used in the histograms to create a general distribution for males and females combined. This will be compared against the distribution for each gender to see whether there is a shift and whether gender has an effect on weight.

  3. For the third sub-hypothesis, I will use stratified sampling to obtain 120 records from Year 8 and Year 10. The results will be shown as a scatter diagram of weight against height to show the correlation between the two.
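The stratified sample of 120 in the plan above has to be shared between groups in proportion to their sizes. The plan does not say which strata are used or how large they are, so the following is only a minimal sketch: it assumes the strata are year group and gender, and the stratum sizes are made-up placeholders rather than figures from the database. The function name `stratified_allocation` is hypothetical, and the rounding rule (round down, then give spare places to the largest fractions) is one reasonable choice, not necessarily the one used in this coursework.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Share sample_size between strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    # Exact (possibly fractional) share for each stratum.
    exact = {name: sample_size * size / total
             for name, size in strata_sizes.items()}
    # Round every share down first...
    allocation = {name: int(share) for name, share in exact.items()}
    # ...then give the leftover places to the largest fractional parts,
    # so the allocations always add up to sample_size.
    leftover = sample_size - sum(allocation.values())
    by_fraction = sorted(exact,
                         key=lambda name: exact[name] - allocation[name],
                         reverse=True)
    for name in by_fraction[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical stratum sizes (placeholders, NOT taken from the database).
example_strata = {
    "Year 8 male": 145,
    "Year 8 female": 125,
    "Year 10 male": 106,
    "Year 10 female": 94,
}
example_allocation = stratified_allocation(example_strata, 120)
print(example_allocation)
```

Each stratum's share is (stratum size ÷ total size) × 120. Because these shares are rarely whole numbers, some rounding rule is needed to make the group samples add up to exactly 120.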

Formulas/Statistical Skills Used

Sampling

Random Sampling:

Random sampling is a method of selecting data in a completely unbiased way. The calculator has a random number generator which generates a number between 0.001 and 1.000. I will take the generated number and multiply it by the total number of samples ...
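The calculator method described above can be imitated in a few lines of Python. The text is cut off before it says how the product is turned into a record number, so this sketch assumes the product is rounded up to the next whole number and that repeated numbers are drawn again. The function name `pick_records` and the group size of 270 are hypothetical.

```python
import random


def pick_records(population_size, sample_size, seed=None):
    """Pick distinct record numbers (1..population_size) calculator-style."""
    # A three-decimal generator has only 1000 possible outputs, so it can
    # reach at most 1000 different records.
    if sample_size > min(population_size, 1000):
        raise ValueError("sample_size is larger than the reachable records")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < sample_size:
        # k / 1000 plays the role of the calculator's number 0.001..1.000.
        k = rng.randint(1, 1000)
        # Assumed rule: multiply by the population size and round UP.
        # Integer arithmetic avoids floating-point rounding surprises.
        record = (k * population_size + 999) // 1000
        chosen.add(record)  # a repeated record is simply drawn again
    return sorted(chosen)


# Hypothetical example: 30 records from a group of 270 students.
example_sample = pick_records(270, 30, seed=1)
print(example_sample)
```

One limitation of this method is that a generator with only three decimal places cannot reach every record in a list of more than 1000, so it suits a single year group better than the whole database of 1183 students.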
