
# Bivariate Data Exploration

Extracts from this document...

Introduction

Maths Coursework                Tim Durden

STATISTICS 2:

Bivariate Data Exploration

Aim:

The aim of this investigation is to see whether there is a correlation between the engine size of a car and the insurance group in which it is placed.

Introduction:

There is an ever-increasing public demand for value-for-money products and services, especially in the car, shopping and clothing markets. This matters even more to students: unless they are particularly affluent, everything they buy can easily add to their debt, given their reliance on student loans. Cars are very often an essential means of transport for students, so it is important for a student to get the best deal on their car.

However, insurance companies and car dealers are very much aware of the student situation and have classified certain cars as ‘student cars’. These include models from Peugeot (106, 306), Renault (Clio), Citroen (Saxo) and Vauxhall (Nova), to name but a few.

These cars all seem to have relatively small engines, commonly ranging from 900 to 1800cc, and are all placed in relatively low insurance groups (and therefore have lower insurance costs). This may not be the case for all cars, however, especially those with larger engines.

This investigation will examine data from a range of cars, varying

Middle

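| Engine size (litres) | Insurance group |
|---|---|
| | 6 |
| 1.6 | 9 |
| 1.4 | 10 |
| 1.8 | 15 |
| 1.4 | 5 |
| 2 | 12 |
| 2.5 | 16 |
| 1.2 | 3 |
| 2.5 | 15 |
| 0.95 | 3 |
| 1.6 | 3 |
| 1.2 | 3 |
| 1.2 | 4 |
| 1.4 | 8 |
| 2 | 11 |
| 1.8 | 11 |
| 1.8 | 10 |
| 1.5 | 7 |
| 0.9 | 2 |
| 1.1 | 5 |
| 1.6 | 10 |
| 1.1 | 4 |
| 1.2 | 4 |
| 1.4 | 5 |
| 1.6 | 6 |
| 1.4 | 7 |
| 1.4 | 5 |
| 1.6 | 10 |
| 1.3 | 5 |
| 1.3 | 6 |
| 1 | 3 |
| 2 | 11 |
| 1.1 | 4 |
| 1.4 | 4 |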

Modelling Procedures:

The data could now be examined for correlation. The first step was to draw a scatter diagram, with engine size on the x-axis and insurance group on the y-axis.
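The strength of the relationship shown on a scatter diagram can also be summarised as a single number. The following is a minimal sketch rather than the author's own working. It assumes that the values in the extract above alternate between engine size (in litres) and insurance group, and it uses only the 33 complete pairs. It also assumes Pearson's product-moment correlation coefficient as the measure, which the extract does not name. The function name `pearson_r` is illustrative.

```python
from math import sqrt

# (engine size in litres, insurance group) pairs read from the extract,
# ASSUMING the values alternate between the two variables.
pairs = [
    (1.6, 9), (1.4, 10), (1.8, 15), (1.4, 5), (2, 12), (2.5, 16),
    (1.2, 3), (2.5, 15), (0.95, 3), (1.6, 3), (1.2, 3), (1.2, 4),
    (1.4, 8), (2, 11), (1.8, 11), (1.8, 10), (1.5, 7), (0.9, 2),
    (1.1, 5), (1.6, 10), (1.1, 4), (1.2, 4), (1.4, 5), (1.6, 6),
    (1.4, 7), (1.4, 5), (1.6, 10), (1.3, 5), (1.3, 6), (1, 3),
    (2, 11), (1.1, 4), (1.4, 4),
]


def pearson_r(data):
    """Return Pearson's correlation coefficient for a list of (x, y) pairs."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    s_xx = sum((x - mean_x) ** 2 for x, _ in data)
    s_yy = sum((y - mean_y) ** 2 for _, y in data)
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r(pairs)
print(f"r = {r:.3f} from {len(pairs)} pairs")
```

A value of r close to +1 indicates strong positive linear correlation, a value close to -1 strong negative linear correlation, and a value near 0 little or no linear correlation.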

The following graph was

Conclusion

Accuracy & Refinements:

Firstly, the sample of 50 data pairs was selected using random numbers generated by a calculator. Whilst this method does produce random-looking numbers, they are generated by a formula and so may not be completely random. A better approach would have been to take a systematic sample from the parent population: once the data had been ordered by a variable (e.g. insurance group), every 2nd or 4th data pair would be selected by counting through the sampling frame.
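The systematic sampling described above can be sketched as follows. This illustrates the idea under stated assumptions and is not the procedure actually used in the investigation. The sampling frame is a handful of records chosen for illustration, and the names `systematic_sample`, `frame`, `step` and `start` are hypothetical.

```python
def systematic_sample(frame, step, start=0, key=None):
    """Order the sampling frame (optionally by `key`), then take every `step`-th item."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    ordered = sorted(frame, key=key) if key is not None else list(frame)
    # Slicing with a stride counts through the ordered frame.
    return ordered[start::step]


# Hypothetical sampling frame: (engine size in litres, insurance group).
frame = [(1.4, 5), (2.0, 12), (1.1, 4), (1.8, 10), (1.2, 3), (1.6, 9)]

# Order by insurance group, then select every 2nd record.
sample = systematic_sample(frame, step=2, key=lambda car: car[1])
print(sample)
```

In practice the starting position is usually chosen at random between 0 and `step - 1`, so that every record in the frame has the same chance of being selected.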

Secondly, if a larger sample had been collected, the estimate of the correlation would have been more reliable. There would be more points to plot, and so the observed correlation would be more representative of the entire population (e.g. a sample of 500 cars out of 50,000 in Essex), even if the larger sample contained more outliers.

Thirdly, it was felt that using ‘secondary’ data gave rise to bias and errors in data collection. If the data had been ‘primary’, that is, collected by the researchers themselves, it might have been more accurate. In this investigation, because the company was selling cars, there may have been some bias in which cars it chose to buy and sell. Cars of a poor standard would not have been purchased for resale.
