Factors influencing girls' athletic performance throughout secondary school.

Introduction

Athletics data has been collected for a number of years at Colchester County High School.

Colchester County High School is a selective school for girls in the Colchester district, so it is not representative of the whole population.  On entry to the school, pupils are placed into forms on the basis of the musical, sporting and academic talent they showed at primary school.  In theory, this means that all the forms resulting from one selective test should be equal in sporting ability.

However, this is not to say that they would be equal in athletics, because in primary school most pupils play sports such as netball, hockey, tennis and rounders.  Even primary schools that do some athletics tend to do the more common events, such as the 100m run and long jump.  Most primary schools do not teach athletic events such as the 1500m or discus.  Girls who are good at sport are not necessarily good at athletics, and vice versa.  Also, girls whose schools do teach athletics are clearly privileged.

This data is available to the pupils through the maths and sports departments.  The data includes times for running various distances and measurements for long jump, high jump and triple jump.  It also includes the distances that the girls can throw the rounders ball, discus and shot.

This data is to be treated as though it were primary data, as it is from a reliable source.  The physical education staff record the data, and sometimes the data is collected by the pupils themselves.  

Although the data is from a reliable source, it is essential to recognise that human error can be a factor in this data.  The data could have been “measured” inaccurately or recorded mistakenly.  There are several scenarios which would make the data faulty.

There is a very large amount of data, so it seems sensible to take samples in a way that eliminates as much bias as possible, and to try to obtain a sample that will enable me to draw conclusions about the whole school, year or class that I have chosen to investigate.

Hypothesis One – THE BETTER ONE IS AT SHOT PUT, THE BETTER ONE IS AT DISCUS

I hypothesise that the farther one can throw the shot, the farther one can throw the discus.  To test my hypothesis, I am using data from 7H (1998), 7K (1999), 7A (2000) and 7P (2001).  I would eventually like to develop this hypothesis to compare the classes in the events of shot put and discus, in an attempt to determine which Year 7 class was the best at these activities.

To ensure that the samples chosen are fair and representative of each form, I will take a stratified sample.  This means that I will divide the pupils into strata (forms) and then choose a random sample from each stratum.  The size of each sample is in proportion to the size of that stratum within the population.  As I am drawing a scatter graph, an appropriate total sample size is 20 – 30, as it is enough to show a relationship, but not so many that the graph is overcrowded or looks messy.

To work out the stratified sample, we need to know the year size and the size of each form (each form has 26 or 27 pupils).

Then, to work out the sample size we need to divide the class size by the sum of the classes and multiply by our total sample size (26):

7K        :        (26 ÷ 106)   x   26   =   6.4   (where 106 is the total of the four class sizes, 26 + 26 + 27 + 27)

As 6.4 is not an appropriate sample size, because you cannot have 6.4 people, we round up or down accordingly.  This needs to be done for every form until we know the sample size for each one.
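The calculation described above can be sketched in a few lines of Python.  This is only an illustration: the form sizes in `form_sizes` are an assumption inferred from the class sizes of 26 and 27 mentioned in the text, and the function name `stratified_allocation` is made up for this example.

```python
# Form sizes are ASSUMED for illustration (classes of 26 or 27 pupils).
form_sizes = {"7H": 26, "7K": 26, "7A": 27, "7P": 27}
total_sample = 26  # total sample size used for the scatter graph


def stratified_allocation(sizes, total):
    """Share out `total` samples in proportion to each form's size."""
    population = sum(sizes.values())
    # Exact (unrounded) share for each form: class size / year size * total
    exact = {form: n / population * total for form, n in sizes.items()}
    # You cannot sample a fraction of a person, so round to the nearest whole
    rounded = {form: round(share) for form, share in exact.items()}
    return exact, rounded


exact, rounded = stratified_allocation(form_sizes, total_sample)
for form in form_sizes:
    print(f"{form}: {exact[form]:.1f} rounds to {rounded[form]}")
```

With these assumed sizes the rounded samples add up to 26 again.  In general, rounding each form separately does not guarantee this, so the total should always be checked afterwards.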

Then, using a calculator, we choose 6 random samples from each of 7K and 7H, and 7 random samples from each of 7P and 7A.  To do this, we enter the class size (26 or 27) into the calculator, and then push the “RAN” button.  When we press equals, a random number up to 26 or 27 will appear, and whichever piece of data corresponds to that number will be one of the 6 or 7 samples from that stratum.
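The same random selection could be done on a computer instead of a calculator.  The sketch below is a stand-in for the calculator method, not the method actually used: it assumes the pupils in a form are numbered 1 up to the class size, and the function name `pick_sample` is hypothetical.

```python
import random


def pick_sample(class_size, sample_size, seed=None):
    """Pick `sample_size` different pupil numbers from 1..class_size."""
    rng = random.Random(seed)  # a seed makes the choice repeatable
    # random.sample never picks the same number twice
    return sorted(rng.sample(range(1, class_size + 1), sample_size))


# Example: 6 pupils from a form of 26, and 7 pupils from a form of 27
print(pick_sample(26, 6))
print(pick_sample(27, 7))
```

One difference from the calculator is that `random.sample` cannot give the same pupil twice, whereas a calculator might repeat a number, in which case the button would need to be pressed again.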

This method ensures that I have a fair proportion of data from each stratum (Year 7 form).  I chose this method because of the fair representation it gives.  Cluster sampling is of limited use with class sizes this small, as it would be too biased.  I also attempted systematic sampling, where every nth piece of data is used, but I experienced many problems with this.  For example, if I took every 4th piece of data, sometimes the data I should have used was inadequate because it was incomplete.  This meant that I had to choose the next piece of data, which messed ...
