Statistics. The purpose of this coursework is to investigate the comparative relationships between the depreciation of a car's price, in relation to the factors that affect it.

Introduction

Statistics: Analysis of used cars database

Introduction:

The purpose of this coursework is to investigate how the depreciation of a car's price relates to the factors that affect it. The factors I wish to investigate are the age and mileage of a car, as these are the easiest to compare with depreciation. To do this, I shall use random sampling. I shall give a number of hypotheses, each claiming that a particular factor has a noticeable effect on depreciation, and I shall attempt to validate them using the data given to me in Excel. I work in terms of percentage depreciation so that depreciation can be compared fairly across every car in my sample, whatever its original price. Here are the hypotheses and questions:

<< Hypothesis 1 >>

The older the car, the greater the percentage depreciation of the price. I believe this because an older car has usually travelled further, so essential parts may wear down and stop the car from working to its optimum standard. After a certain mileage, the car's fuel costs may also begin to increase, as its reduced efficiency uses more fuel per mile.

As a rule, the following data values are needed to calculate how the depreciation of a car's value changes with more or less mileage:

  • Sale price (no miles attached)
  • Mileage

Mileage will affect the percentage depreciation of the car's original price, so no other variables should be included in the data needed to prove or refute this hypothesis.
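
As a rough illustration of the percentage depreciation that the hypotheses rely on, the sketch below assumes the usual definition: the fall in price divided by the price when new, multiplied by 100. The coursework does not state its formula, so this definition and the function name `percentage_depreciation` are assumptions. The two prices are those listed for the Rover 623 GSi in the sample.

```python
def percentage_depreciation(price_new: float, price_used: float) -> float:
    """Return the fall in price as a percentage of the price when new.

    Assumed formula (not stated in the coursework):
    (price_new - price_used) / price_new * 100
    """
    if price_new <= 0:
        raise ValueError("price_new must be positive")
    return (price_new - price_used) / price_new * 100


# Rover 623 GSi from the sample: 24086 pounds new, 2975 pounds second hand.
rover = percentage_depreciation(24086, 2975)
print(f"Rover 623 GSi: {rover:.1f}% depreciation")
```

Working in percentages means that an expensive car and a cheap car can be placed on the same scale.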

Middle

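| No. | Make | Model | Price When New (£) | Price Second Hand (£) | Age (years) | Mileage | Number of Owners |
|---|---|---|---|---|---|---|---|
| … | … | Beetle | 14950 | 13500 | 1 | 6500 | 1 |
| 75 | Rover | 623 GSi | 24086 | 2975 | 6 | 96000 | 2 |
| 76 | Suzuki | Vitara | 10800 | 2995 | 8 | 50000 | 2 |
| 77 | Mercedes | AvantGarde | 17915 | 11750 | 2 | 17000 | 1 |
| 78 | Audi | 80 | 17683 | 3995 | 7 | 103000 | 2 |
| 79 | Volkswagen | Polo | 9960 | 7550 | 1 | 5000 | 1 |
| 80 | Ford | Escort | 13183 | 3495 | 7 | 43000 | 2 |
| 81 | Ford | Mondeo | 17780 | 7995 | 4 | 30000 | 1 |
| 82 | Mazda | Pegasus | 10420 | 2495 | 7 | 50000 | 3 |
| 83 | Rover | 416i | 14486 | 3685 | 6 | 64000 | 1 |
| 84 | Vauxhall | Corsa | 7840 | 4976 | 4 | 21000 | 2 |
| 85 | Vauxhall | Corsa | 7440 | 3495 | 6 | 55000 | 2 |
| 86 | Ford | Fiesta | 6590 | 1664 | 10 | 37000 | 3 |
| 87 | Nissan | Primera | | 2574 | 9 | 49000 | 2 |
| 88 | Citroen | Xantia | 14065 | | 8 | 49000 | 1 |
| 89 | Peugeot | Graduate | 7600 | 2497 | 8 | 71000 | 2 |
| 90 | Peugeot | 306 | 12350 | 3995 | 6 | 71000 | 2 |
| 91 | Fiat | Punto | 7518 | 3769 | 4 | 38000 | 2 |
| 92 | Volkswagen | Polo | 8710 | 4693 | 5 | 50000 | 2 |
| 93 | Vauxhall | Calibra | 18675 | 6995 | 6 | 63000 | 2 |
| 94 | Rover | Metro | 5495 | 1995 | 7 | 52000 | 2 |
| 95 | Rolls Royce | Silver Spirit | 94651 | 14735 | 9 | 70000 | 2 |
| 96 | Ford | Escort | 15405 | 3995 | 5 | 57000 | 2 |
| 97 | Vauxhall | Astra | 9795 | 3191 | 6 | 43000 | 2 |
| 98 | Renault | 19 | 11695 | 2748 | 6 | 52000 | 2 |
| 99 | Ford | Escort | 9995 | 2995 | 6 | 64000 | 2 |
| 100 | Vauxhall | Vectra | 13435 | | 5 | 52000 | 2 |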

The sample is in random order, so that it shows the general trend in the data and my results are not biased. However, I have 4 pieces of missing data. I will deal with this problem by removing the cars whose data are missing. To find out whether there are any outliers, I will use the standard deviation to find upper and lower bounds. (The upper quartile is the value below which 75% of the data lie, and the lower quartile is the value below which 25% of the data lie.) The formulae for the bounds in terms of standard deviation are as follows:

Upper Bound = Mean + 2 × Standard Deviation

Lower Bound = Mean - 2 × Standard Deviation
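
The two bounds above can be sketched in a few lines. This is a minimal illustration and not the spreadsheet calculation itself: it assumes the sample standard deviation (`statistics.stdev`), since the coursework does not say which form was used, and the function names `outlier_bounds` and `find_outliers` are hypothetical. The prices are the second-hand prices of ten cars from the sample.

```python
from statistics import mean, stdev


def outlier_bounds(values, k=2.0):
    """Return (lower, upper) = mean -/+ k * standard deviation.

    Assumption: the sample standard deviation is used; the coursework
    does not say which form was calculated.
    """
    centre = mean(values)
    spread = stdev(values)
    return centre - k * spread, centre + k * spread


def find_outliers(values, k=2.0):
    """List the values that lie outside the two bounds."""
    lower, upper = outlier_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


# Second-hand prices (in pounds) of ten cars from the sample.
used_prices = [13500, 2975, 2995, 11750, 3995, 7550, 3495, 7995, 2495, 3685]
lower, upper = outlier_bounds(used_prices)
print(f"lower bound: {lower:.0f}, upper bound: {upper:.0f}")
print("outliers:", find_outliers(used_prices))
```

Any value that `find_outliers` returns lies more than two standard deviations from the mean and would be a candidate for removal.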

There is a value outside the upper bound in the column concerning the Porsche. It is approximately £6,000 higher than the upper bound, so it is an outlier. However, the effect is not drastic and will not distort my results. If I identify any huge outliers I will remove them, though this will not have much of an impact. Because of the 4 pieces of missing data, I will need to delete the rows for the cars concerned, as a missing value would distort an average. One example is the Lexus: with no mileage recorded, it is impractical for me to include it in my investigation, because it cannot be used for my third hypothesis. Once I remove all the other cars which lack data, I have a remaining sample of 47.
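
Removing the incomplete rows could be sketched as follows. The `Car` type and the `is_complete` helper are hypothetical names introduced for illustration. The Xantia row lists only one price in the data; treating its second-hand price as the missing one is an assumption made purely for this example.

```python
from typing import NamedTuple, Optional


class Car(NamedTuple):
    make: str
    model: str
    price_new: Optional[int]
    price_used: Optional[int]
    age: Optional[int]
    mileage: Optional[int]
    owners: Optional[int]


def is_complete(car: Car) -> bool:
    """True when no field of the row is missing (None)."""
    return all(value is not None for value in car)


cars = [
    Car("Ford", "Fiesta", 6590, 1664, 10, 37000, 3),
    # Assumption: the single price listed for the Xantia is the price when new.
    Car("Citroen", "Xantia", 14065, None, 8, 49000, 1),
    Car("Fiat", "Punto", 7518, 3769, 4, 38000, 2),
]

# Keep only the rows in which every value is present.
complete_cars = [car for car in cars if is_complete(car)]
print(f"kept {len(complete_cars)} of {len(cars)} rows")
```

Dropping the whole row, rather than a single cell, keeps every remaining car usable for every hypothesis.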

I have constructed a table to show the range of each set of data and to see how the data compare with each other.

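| Data | Highest Value | Lowest Value | Range |
|---|---|---|---|
| Price When New | £170,841 | £5,495 | £165,346 |
| Price Second Hand | £37,995 | £1,995 | £36,000 |
| Age | 10 years | 1 year | 9 years |
| Mileage | 103,000 miles | 2,000 miles | 101,000 miles |
| Number of Owners | 3 | 1 | 2 |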
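
Each range in the table is simply the highest value minus the lowest. A small sketch, with the hypothetical helper name `value_range`, using the ages of thirteen cars from the sample:

```python
def value_range(values):
    """Return (highest, lowest, range) for a list of numbers."""
    highest, lowest = max(values), min(values)
    return highest, lowest, highest - lowest


# Ages (in years) of thirteen cars from the sample.
ages = [1, 6, 8, 2, 7, 1, 7, 4, 7, 6, 4, 6, 10]
highest, lowest, spread = value_range(ages)
print(f"highest {highest}, lowest {lowest}, range {spread}")
```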

Conclusion

If I had the chance to do this investigation again, I would probably investigate some new questions. For example, the third scatter graph, of number of previous owners against mileage, did not give me strong results: it showed only a weak correlation and no clear trend. In its place, I would test a new theory: 'the older a car, the greater the mileage it will have gained.' This would be an improvement, because comparing mileage with age rather than with the number of owners would let me see how the mileage built up over time. I have also spotted a flaw in the original graph: some people may have owned a car for only a very short time, selling it shortly after buying it, so the number of owners says little about how far the car has been driven. With age included, I can see how far the car has travelled in relation to time, rather than in relation to the number of people who drove it. I believe this hypothesis would show a strong correlation and give me more reliable results.

Perhaps, if I had more time, I would use multiple regression to see how the different influential factors affect each other, rather than only how they affect depreciation.
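
The proposed theory that older cars have gained more mileage could be checked with a correlation coefficient. The sketch below uses Pearson's product-moment correlation coefficient as one common measure. The coursework does not name a method for this, so the choice and the function name `pearson_r` are assumptions. The data are the ages and mileages of ten cars from the sample.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists.

    Raises ZeroDivisionError if either list has no variation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Age (years) and mileage of ten cars from the sample.
ages = [1, 6, 8, 2, 7, 1, 7, 4, 7, 6]
mileages = [6500, 96000, 50000, 17000, 103000, 5000, 43000, 30000, 50000, 64000]
r = pearson_r(ages, mileages)
print(f"r = {r:.2f}")
```

A value of r close to 1 would support the theory, while a value close to 0 would suggest little linear relationship between age and mileage.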

This student written piece of work is one of many that can be found in our AS and A Level Probability & Statistics section.