The aim of this project is to investigate which factors influence the cost of second-hand cars. The makes investigated are Ford, Peugeot, Renault and Vauxhall.

Write Up

To aid our research, we were given a bank of secondary data on 199 cars, with nine variables recorded for each car: colour, engine size, fuel type (petrol or diesel), year of manufacture, mileage, cost, preliminary cost, make and model. We later added an age column for each car.
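As a minimal sketch of how the age column can be derived from the year of manufacture, the snippet below subtracts each car's year from a reference year. The field names, the two example cars and the reference year `SURVEY_YEAR` are all assumptions made for illustration; the write-up does not state which year the data was collected in.

```python
# Hypothetical records: field names and values are made up for illustration.
SURVEY_YEAR = 2003  # assumed reference year, not stated in the write-up

cars = [
    {"make": "Ford", "model": "Fiesta", "year": 1998},
    {"make": "Vauxhall", "model": "Vectra", "year": 2001},
]


def add_age(records, survey_year):
    """Return copies of the records with an 'age' field in whole years."""
    return [{**car, "age": survey_year - car["year"]} for car in records]


cars_with_age = add_age(cars, SURVEY_YEAR)
for car in cars_with_age:
    print(car["make"], car["model"], car["age"])
```

Building new records rather than editing the originals keeps the given data untouched, which makes it easier to check the added column against the source.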

We were asked which features make the most difference to the cost. My predicted ranking, from most to least important, is as follows:

FACTOR                  IMPORTANCE                RANGE

  1. Age                most important factor     1yr-16yrs
  2. Mileage                                      1,200-150,000
  3. Cost                                         375-132,500
  4. Petrol/Diesel                                Petrol/Diesel
  5. Engine size                                  1,000-2,500
  6. Year                                         1989-2002
  7. Model                                        106-Vectra
  8. Make                                         Ford-Vauxhall
  9. Preliminary Cost                             6,000-20,020
  10. Colour            least important factor    Black-Yellow

Looking through the data, I can see that it is not perfect. Many fields are missing, and I had to remove one rogue entry, a Renault Laguna: from my general knowledge of cars, I knew its price was far too high to be correct.
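The rogue entry here was spotted from general knowledge. A more systematic check, shown only as an alternative sketch and not as the method used in this project, is to flag any price lying more than 1.5 times the interquartile range beyond the quartiles. The prices below are made up for illustration and are not taken from the data bank.

```python
import statistics


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Made-up prices in pounds, with one deliberately implausible entry.
prices = [1500, 2995, 3495, 4200, 5995, 6500, 7995, 95000]
suspects = iqr_outliers(prices)
print(suspects)
```

A flagged value is only a candidate for removal; whether it is a genuine error still needs a judgement like the one made above.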

I will draw scatter graphs, histograms and cumulative frequency curves (for cost compared with age, first for the whole population and then for individual makes) to try to identify any correlations (patterns) in the cost distribution.
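A scatter graph shows correlation by eye. One way to put a number on it, offered here as a sketch and not as part of the original plan, is the product-moment correlation coefficient, which runs from -1 (perfect negative correlation) to +1 (perfect positive correlation). The ages and costs below are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up sample: age in years against cost in pounds.
ages = [1, 3, 5, 8, 12]
costs = [9500, 7200, 5100, 2900, 900]
r = pearson_r(ages, costs)
print(round(r, 2))
```

A value close to -1 for age against cost would support the prediction that older cars cost less.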

Fortunately, the median is barely affected by extreme values, so the box plots will be fine, as will the cumulative frequency curves.
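A small made-up example illustrates why the median copes with extreme values better than the mean. The prices are invented, and the extreme value is deliberately exaggerated.

```python
import statistics

# Made-up prices in pounds.
typical = [1500, 2995, 3495, 4200, 5995]
with_rogue = typical + [95000]  # same prices plus one extreme value

# How far each average moves when the extreme value is added.
median_shift = statistics.median(with_rogue) - statistics.median(typical)
mean_shift = statistics.mean(with_rogue) - statistics.mean(typical)

print("median moves by", median_shift)
print("mean moves by", round(mean_shift, 2))
```

The median moves only as far as the next value in the ordered list, while the mean is dragged a long way towards the extreme value.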

Most variables would affect the cost, but by how much remains to be seen.

I am now satisfied that results from my sample will represent the population as a whole.

1st Hypothesis

I have said ...
