• Level: GCSE
• Subject: Maths
• Word count: 3338

# GCSE STATISTICS/Data Handling Coursework 2008

Extracts from this document...

Introduction

GCSE: Data Handling Coursework

Introduction

For this data handling project, I shall use data from the Athletics data spreadsheet: results from track and field events, together with the mass and height of pupils in years 7 to 11. The subjects are all boys from one school. The sample contains a large amount of data, including times for the 100m, 200m, 400m, 800m and 1500m, results for field events such as long jump, triple jump, shot, javelin and discus, a bleep test result, and each pupil's height and mass. The data should be reliable; however, I shall check for anomalous records and discard any from my sample.

I shall make three hypotheses based upon this data. I shall then show how I will test these hypotheses in my plan to prove or disprove them.

Hypotheses

The bleep test, one of the events within the data, is an indication of aerobic respiration; it is a test of endurance and also of fitness. I think that fitness and health are related, and that a person's BMI (body mass index) can be a good representation of health, although it sometimes fails to take into account people with high muscle-to-fat ratios. I therefore think that people with a BMI in the “healthy” 20-24 bracket will have a better bleep test score than those outside it.
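As a rough illustration of this hypothesis, BMI is a person's mass divided by the square of their height. The sketch below assumes mass in kilograms and height in metres (the units used in the spreadsheet are not stated here). The function names and the example pupil are hypothetical; only the 20-24 bracket comes from the hypothesis itself.

```python
def bmi(mass_kg: float, height_m: float) -> float:
    """Body mass index: mass (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return mass_kg / height_m ** 2


def in_healthy_bracket(value: float, low: float = 20.0, high: float = 24.0) -> bool:
    """True if a BMI falls inside the 20-24 bracket used in the hypothesis."""
    return low <= value <= high


# Hypothetical pupil for illustration only, not a record from the spreadsheet.
example = bmi(56.0, 1.60)
print(round(example, 1), in_healthy_bracket(example))
```

Each pupil could be sorted into "inside" or "outside" the bracket this way before the bleep test scores of the two groups are compared.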

Middle

In the data, Lower Quartile = 5.5

Median = 6.15

Upper Quartile = 6.925

Therefore the interquartile range is 1.425.

1.5 × 1.425 = 2.1375

5.5 – 1.425 = 4.075, so the data highlighted blue are lower outliers.

6.925 + 1.425 = 8.35, so the data highlighted red are upper outliers.

3: 0.8

4: 0.2 0.4 0.7 0.7 0.8 0.8 0.95

5: 0 0 0 0.2 0.3 0.3 0.3 0.4 0.4 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.6 0.6 0.7 0.7 0.7 0.75 0.8 0.9

6: 0 0 0 0 0.1 0.1 0.1 0.1 0.2 0.4 0.4 0.4 0.4 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.6 0.6 0.6 0.6 0.75 0.8 0.8 0.9 0.9

7: 0 0 0.2 0.3 0.3 0.4 0.5 0.5 0.5 0.5

8: 0 0 0 0 0.2 0.3 0.3 0.5 0.7

9: 0.5
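A minimal sketch of how the quartiles quoted above could be reproduced from the stem-and-leaf rows. It assumes each leaf is the decimal part added to its stem (so `3: 0.8` is read as 3.8). It uses Python's `statistics.quantiles` with the `exclusive` method, which places the quartiles at positions (n+1)/4, (n+1)/2 and 3(n+1)/4. That method is an assumption that agrees with the figures quoted, not necessarily the way they were originally found. `ROWS` and `expand` are illustrative names.

```python
from statistics import quantiles

# Stem -> leaves, copied from the stem-and-leaf diagram above.
ROWS = {
    3: "0.8",
    4: "0.2 0.4 0.7 0.7 0.8 0.8 0.95",
    5: "0 0 0 0.2 0.3 0.3 0.3 0.4 0.4 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 "
       "0.6 0.6 0.7 0.7 0.7 0.75 0.8 0.9",
    6: "0 0 0 0 0.1 0.1 0.1 0.1 0.2 0.4 0.4 0.4 0.4 0.5 0.5 0.5 0.5 0.5 "
       "0.5 0.5 0.6 0.6 0.6 0.6 0.75 0.8 0.8 0.9 0.9",
    7: "0 0 0.2 0.3 0.3 0.4 0.5 0.5 0.5 0.5",
    8: "0 0 0 0 0.2 0.3 0.3 0.5 0.7",
    9: "0.5",
}


def expand(rows):
    """Turn stem-and-leaf rows into a sorted list of plain values."""
    values = []
    for stem, leaves in rows.items():
        for leaf in leaves.split():
            # Rounding removes floating-point noise such as 6.8999999.
            values.append(round(stem + float(leaf), 2))
    return sorted(values)


values = expand(ROWS)
q1, median, q3 = quantiles(values, n=4, method="exclusive")
print(len(values), round(q1, 4), round(median, 4), round(q3, 4))
```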

I discarded these data then looked at the yellow year 8 diagram.

In the data, Lower Quartile = 5.25

Median = 6

Upper Quartile = 6.5625

Therefore the interquartile range is 1.3125.

1.5 × 1.3125 = 1.96875

5.25 – 1.96875 = 3.28125, therefore there are no lower outliers.

6.5625 + 1.96875 = 8.53125, and also no upper outliers.

3: 0.8

4: 0 0 0.5 0.5 0.5 0.6 0.75 0.75 0.8

5: 0 0 0 0 0 0 0 0 0.2 0.25 0.25 0.25 0.3 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.6

6: 0 0 0 0 0 0 0 0 0 0 0 0.1 0.3 0.4 0.4 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.75 0.75 0.8

7: 0 0 0 0 0 0 0 0.1 0.2 0.3 0.5 0.5 0.75

8: 0 0.3 0.4 0.5
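The outlier check in the year 8 working can be sketched as a small function. This is an illustration under stated assumptions, with hypothetical names (`outlier_fences`, `find_outliers`): the fences sit `k` interquartile ranges beyond the quartiles, with `k = 1.5` as in the year 8 working. The year 9 and year 7 working offsets the quartiles by the interquartile range itself, which corresponds to `k = 1` in this sketch.

```python
def outlier_fences(lower_quartile, upper_quartile, k=1.5):
    """Return (lower fence, upper fence), k interquartile ranges beyond the quartiles."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - k * iqr, upper_quartile + k * iqr


def find_outliers(values, lower_quartile, upper_quartile, k=1.5):
    """Return the values lying strictly outside the fences."""
    low, high = outlier_fences(lower_quartile, upper_quartile, k)
    return [v for v in values if v < low or v > high]


# Year 8 quartiles quoted above.
low, high = outlier_fences(5.25, 6.5625)
print(low, high)

# Smallest and largest values in the year 8 stem-and-leaf diagram.
print(find_outliers([3.8, 8.5], 5.25, 6.5625))
```

Only the smallest and largest values need checking to decide whether any outliers exist, since every other value lies between them.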

Finally I looked at the orange year 7 diagram as before.

In the data, Lower Quartile = 4

Median = 5

Upper Quartile = 6

Therefore the interquartile range is 2.

1.5 × 2 = 3

4 – 2 = 2, so there are no lower outliers.

6 + 2 = 8, so the data highlighted red is an upper outlier.

2: 0.75

3: 0 0 0 0 0 0.25 0.5 0.5 0.5 0.5 0.5 0.5 0.5

4: 0 0 0 0 0 0 0 0 0 0 0 0.1 0.2 0.25 0.5 0.5 0.5 0.5 0.7 0.75 0.75

5: 0 0 0 0 0 0 0 0 0 0 0 0.2 0.25 0.25 0.5 0.5 0.5 0.5 0.6 0.75 0.8

6: 0 0 0 0 0 0 0 0 0 0.1 0.1 0.1 0.2 0.2 0.5 0.5 0.75

7: 0 0 0 0 0 0 0.3 0.5

8: 0.3

Having then discarded all of the outlying data I created the box and whisker diagrams again.

From the orange year 7 box to the yellow year 8 box there is a definite increase in the median. The interquartile range is smaller in year 8, so the data are more closely grouped, and grouped towards a longer throw. The highest and lowest values are also both greater in year 8 than in year 7. Neither data set is particularly skewed.

The difference between years 8 and 9 is not as conclusive. The quartiles are all slightly higher, but the highest value drops. This, however, is due to the difference in the weight of the shot thrown: it increases between years 8 and 9, but not between years 7 and 8.

Conclusion

For the data to be normally distributed: approximately 68% of the data lies within one standard deviation of the mean

i.e. 68% lies within μ ± σ

Similarly, 95% lies within μ ± 2σ

And 99.7% lies within μ ± 3σ

For year 7: Standard deviation = 2.1952
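The coverage rule above can be checked on any list of values by counting how many fall within k standard deviations of the mean. The sketch below is illustrative only: `coverage` is a hypothetical helper, the sample is made up and is not the year 7 data, and the population standard deviation (`statistics.pstdev`) is assumed because the text does not say which form was used.

```python
from statistics import mean, pstdev


def coverage(values, k):
    """Fraction of values lying within k standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    inside = sum(1 for v in values if abs(v - mu) <= k * sigma)
    return inside / len(values)


# Made-up sample for illustration only: mean 5, standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
for k in (1, 2, 3):
    print(k, coverage(sample, k))
```

Fractions close to 0.68, 0.95 and 0.997 would support a normal distribution; values far from these suggest the data are not normally distributed.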

Evaluation

Hypothesis 2 was the only hypothesis to be proven correct; however, I was able to analyse why hypothesis 3 was incorrect, and also to look at links between the distribution and age.

Hypothesis 1 was incorrect; however, it was the least likely to be proven right, as BMI is a simple indicator of something that is often too complicated to be shown in such a categorical way.

Overall the project had mixed results, but I was able to draw conclusions from all three hypotheses, which is a strong positive. I tried to make hypotheses that were not certain to be true, as there would be no point in stating obvious points only to prove them correct, so it is understandable that the whole project did not go completely smoothly.

To improve the investigation I would use a wider variety of results if possible, as there are obvious limitations with the data I used for this project. It comes from only one school, and only from boys. Few pupils have complete records, and many pieces of data are missing. I could, for example, use a national database with much more data, including results for both genders, to reduce the risk of anomalous graphs and to make the project more reliable and valid.
