Statistics Coursework


I am going to use the data from Mayfield High School to investigate the following hypotheses.

Hypothesis

1st Hypothesis – For my first hypothesis I will investigate the relationship between the number of hours of TV watched per week by the pupils and their IQ. I am going to use the columns “IQ” and “Average number of hours TV watched per week” taken from the Mayfield High datasheet. I think that there will be a relationship between them and will attempt to reveal it.

2nd Hypothesis – For my second hypothesis I will investigate the relationship between “Average number of TV hours watched per week” and “weight (kg)”. I think that there will not be any major relationship between them, as they will not affect each other greatly.

I will present my analysis and results in graphs and tables, and explain the results using the correlation shown in the graphs and the patterns in the figures.

I will select a number of pupils to base my data on. I will use stratified sampling to work out the correct number of male and female pupils needed to make the investigation fair, and random sampling to choose the individual pupils.

Stratified Sampling

I do not want to use all of the data in the database for my analysis, so I will take a sample of the pupils in the school. I would like to take about 10% of the total. I will use stratified sampling so that the sample contains males and females in the same proportion as the school, which makes it fair.

The total number of pupils at the school is 813, so 10% is 81.3, which I round down to 81.

The overall ratio of boys to girls in the school is 414:399.

Now I will need to do my sampling.

Males = (414 × 81) ÷ 813 = 41 (to the nearest whole number)

Females = (399 × 81) ÷ 813 = 40 (to the nearest whole number)
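The stratified calculation above can be checked with a short Python sketch. The function name stratified_sizes and the dictionary layout are only illustrative choices, not part of the Mayfield datasheet; the pupil counts are the ones quoted above.

```python
def stratified_sizes(strata, sample_size):
    """Share out sample_size between the groups in proportion to their size.

    Each share is rounded to the nearest whole pupil. With other numbers the
    rounded shares might not add up exactly to sample_size, so the total
    should always be checked afterwards.
    """
    total = sum(strata.values())
    return {name: round(count * sample_size / total)
            for name, count in strata.items()}


school = {"Males": 414, "Females": 399}   # counts quoted in the text
total_pupils = sum(school.values())       # 813
sample_size = total_pupils // 10          # 10% rounded down, i.e. 81

sizes = stratified_sizes(school, sample_size)
print(sizes)
print("Total sampled:", sum(sizes.values()))
```

Here the two rounded shares (41 and 40) add up to the intended 81, so no adjustment is needed.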

Random Sampling

Now that I know how many pupils I need, I have to select them. To do this I will use random sampling. I will take random samples until I have 81. I can do this in Excel using the following formula: =ROUND(RAND()*120,0).
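As an alternative to the spreadsheet formula, the same selection could be sketched in Python. This assumes the male and female pupils are each numbered from 1 upwards in their own list, which is an assumption about how the datasheet is laid out; the function name pick_random_rows and the seed values are hypothetical. Unlike a rounded random number, random.sample never picks the same row twice.

```python
import random


def pick_random_rows(row_count, how_many, seed=None):
    """Pick how_many different row numbers from 1..row_count at random."""
    rng = random.Random(seed)  # a seed only makes the choice repeatable
    return sorted(rng.sample(range(1, row_count + 1), how_many))


# Stratum sizes and sample sizes taken from the stratified sampling section.
male_rows = pick_random_rows(414, 41, seed=1)
female_rows = pick_random_rows(399, 40, seed=2)

print("Male rows:", male_rows)
print("Female rows:", female_rows)
```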

Once I have gathered the samples I am ready to start analyzing them.

Analysis

Hypothesis 1 Males

The first thing I need to do in my analysis is to analyze my graphs, which are the source of the investigation. I have created scatter graphs to show the relationship between the two sets of data for my first hypothesis. I have separated them into male and female graphs because the sample is split into males and females.

First male scatter graph:

This first graph presented a bit of a problem. There was an anomalous result that affected the trend line and the scale of the graph. I decided to create a new graph that didn’t include that one piece of data, which made it easier to analyze the rest of the data.
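The effect of one anomalous point on a trend line can be illustrated with a small Python sketch. The figures below are made up purely for illustration and are not taken from the Mayfield data. The rule used to drop the anomaly (more TV hours than the 168 hours in a week) is also only an assumption, since the point that was actually removed is not listed here.

```python
from math import sqrt


def trend_line(xs, ys):
    """Return (gradient, intercept, r) for a least-squares trend line.

    r is the product-moment correlation coefficient, between -1 and 1.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    r = sxy / sqrt(sxx * syy)
    return gradient, intercept, r


# Hypothetical (TV hours per week, IQ) pairs; the last one is the anomaly.
pupils = [(10, 100), (14, 104), (20, 98), (21, 110),
          (25, 95), (30, 102), (170, 101)]

# Illustrative rule: nobody can watch more than 168 hours of TV in a week.
kept = [(tv, iq) for tv, iq in pupils if tv <= 168]

with_anomaly = trend_line(*zip(*pupils))
without_anomaly = trend_line(*zip(*kept))

print("With anomaly:    gradient %.3f, r %.3f" % (with_anomaly[0], with_anomaly[2]))
print("Without anomaly: gradient %.3f, r %.3f" % (without_anomaly[0], without_anomaly[2]))
```

Comparing the two printed lines shows how much a single extreme value can change the gradient and the correlation, which is why the second graph is easier to interpret.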


Second male scatter graph:

This graph showed the data much more clearly and I could then start analyzing it. There is no clear correlation between the two sets of data. This means that it is unlikely that there is a relationship between IQ and average number of TV hours watched per week. In this case it may be that my hypothesis is incorrect. There is only a very slight gradient on the trend line that leans towards a negative correlation, but the gradient is not steep enough to draw any conclusions about the relationship between the two ...
