# Test which factors affect the price of a second hand car using a variety of different statistical techniques

Introduction

I am going to test which factors affect the price of a second hand car using a variety of different statistical techniques. From what I discover using these techniques I will conclude which factors are the most important.

I will examine only cars within the reach of a normal family, i.e. not luxury cars. I will also exclude classic cars as these are unlikely to perform like normal second hand cars.

I was given a set of data on the prices paid for second hand cars. I chose ten of the factors recorded in it and surveyed the parents of my class to form initial assumptions about which factors influenced them most when buying a second hand car. The survey asked them to choose the 3 most important and the 3 least important factors from the list of ten. Using the results of this survey, shown on page … , I selected three hypotheses to test. I will test these using the given data set. If I need more samples to test the hypotheses more conclusively, I will obtain data from other sources.

The types of information in the columns represent the factors which can easily be defined when buying a car. There may be other factors that affect a person buying a car but cannot be measured.

Middle

If the data is skewed then I will make box plots for each age group. To do this I will need to calculate the lower quartile, median and upper quartile. I would expect the median not to move much in relation to the lower quartile.
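The quartiles needed for a box plot could be calculated along these lines. This is only a sketch: the function name `five_number_summary` and the example prices are made up for illustration and are not taken from the data set, and Python's "inclusive" quartile method may give slightly different values from the method taught in class.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points that split the data
    # into quarters: lower quartile, median and upper quartile.
    lower_q, median, upper_q = statistics.quantiles(
        ordered, n=4, method="inclusive"
    )
    return ordered[0], lower_q, median, upper_q, ordered[-1]


# Hypothetical second hand prices (in pounds) for one age group.
example_prices = [1500, 2200, 2500, 3100, 3600, 4200, 5000, 6400, 9800]
summary = five_number_summary(example_prices)
print(summary)
```

Comparing the distance from the lower quartile to the median with the distance from the median to the upper quartile gives a quick indication of which way the data is skewed.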

Collect, Process and Represent Data

Hypothesis 1

The cars I have excluded are 13, 54, 56, 71, 72, 73 and 95, as they all have original prices of over £25,000. I have also excluded number 29 because it does not have an original price. This is probably because the car is so old (at 15 years old it is the oldest car) that its original price could not be found. I have also not shown numbers 69, 74 and 79 on the graph as they do not fit my scale. I will examine them individually after the graph to see if they fit the pattern.

As can be seen from the graph, there is a positive correlation, albeit quite a weak one and with several anomalies, mostly among cars which are particularly expensive. This could be because people who buy a nicer model of car want to buy it new rather than second hand. The three cars above the £20,000 mark included on this graph are all Rovers, so the peculiarity of these results could also be due to something about people who buy Rovers.

Conclusion

Therefore its equation is y = -6x + 64

The gradient of the first line is (y6 - y5) ÷ (x6 - x5) = (16 - 21) ÷ (10.667 - 7.667) = -5 ÷ 3 = -1.667

Its y-intercept is 34

Therefore its equation is y = -1.667x + 34
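The gradient calculation above can be checked with a short sketch. The helper name `line_through` is hypothetical; the two points are the ones used in the working above. The intercept here is calculated from the gradient and one of the points rather than read from the graph, so it is only expected to agree with 34 after rounding.

```python
def line_through(point_a, point_b):
    """Return (gradient, y_intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = point_a, point_b
    gradient = (y2 - y1) / (x2 - x1)
    # Rearranging y = m*x + c gives c = y - m*x for any point on the line.
    y_intercept = y1 - gradient * x1
    return gradient, y_intercept


gradient, y_intercept = line_through((7.667, 21), (10.667, 16))
print(round(gradient, 3), round(y_intercept))
```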

These equations can be used as follows to find the percentage of its original price that a second hand car can be expected to keep at a given age. Let a represent age in years.

| Age | Expected % of Original Price |
|---|---|
| 0 < a ≤ 2 years | -12a + 75.5 |
| 2 < a ≤ 7 years | -6a + 64 |
| a > 7 years | -1.667a + 34 |
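The three equations can be combined into one function that chooses the right line for a given age. This is a minimal sketch using the gradients and intercepts stated above. The function name `expected_percentage` is my own, and rejecting ages of zero or below is an assumption based on the first range starting at 0 < a.

```python
def expected_percentage(age):
    """Expected % of original price for a car of the given age in years."""
    if age <= 0:
        # The first range starts at 0 < a, so other ages are not covered.
        raise ValueError("age must be greater than 0")
    if age <= 2:
        return -12 * age + 75.5
    if age <= 7:
        return -6 * age + 64
    return -1.667 * age + 34


# One example age from each of the three ranges.
for example_age in (1, 5, 10):
    print(example_age, round(expected_percentage(example_age), 2))
```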

If I want to negate the effect of age on a car's price, I can remove the depreciation that age causes from my data. To do this I will divide the actual percentage of original price of the car by the expected percentage of original price for a car of its age. This gives me a value which shows how the car has lost value in relation to what I would expect: a value above 1 means the car has kept more of its value than expected for its age, and a value below 1 means it has kept less. It also means I can analyse other factors without the results being distorted by age.

Therefore to find the % of original price of a car discounting the effect of age I use this table. Let a represent age and o represent the actual percentage of original price.

| Age | % of Original Price Discounting Age |
|---|---|
| 0 < a ≤ 2 years | o ÷ (-12a + 75.5) |
| 2 < a ≤ 7 years | o ÷ (-6a + 64) |
| a > 7 years | o ÷ (-1.667a + 34) |

I can use these formulae to work out the percentage of Original Price Discounting Age for all of the cars.
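Applying the table to a car could look like the sketch below. The function names and the example car (5 years old, selling for 40% of its original price) are made up for illustration and are not from the data set. The check that the expected percentage is positive is my own addition, because the third line falls below zero for very old cars and the division would then be meaningless.

```python
def expected_percentage(age):
    """Expected % of original price for a car of the given age in years."""
    if age <= 0:
        raise ValueError("age must be greater than 0")
    if age <= 2:
        return -12 * age + 75.5
    if age <= 7:
        return -6 * age + 64
    return -1.667 * age + 34


def discount_age(actual_percentage, age):
    """Divide the actual % of original price (o) by the expected % for that age."""
    expected = expected_percentage(age)
    if expected <= 0:
        # Assumed safeguard: the straight-line model no longer applies here.
        raise ValueError("expected percentage is not positive at this age")
    return actual_percentage / expected


# Hypothetical car: 5 years old, sold for 40% of its original price.
ratio = discount_age(40, 5)
print(round(ratio, 3))
```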

