

Maths Statistics Coursework

I have been given the task of finding out what affects the price of a used car, using a spreadsheet of data on one hundred cars. For each car the spreadsheet recorded the following (see Spreadsheet 1):

Make              Model                Price When New
Used Price        Age                  Colour
Engine Size       Fuel Type            MPG
Mileage           Service              Owners
Length of MOT     Tax (Months left)    Insurance Group
Doors (Amount)    Style                Central Locking
Seats             Gearbox              Air Conditioning
Airbags

Looking at those categories, I immediately omitted colour, fuel type, service, doors, style, central locking, seats, gearbox, air conditioning and airbags. I omitted these because each either has a small range or is recorded in words, so it would be hard to show on a graph and would give me little evidence of what affects a used car's price.

E.g. Colour: it is recorded in words, so I cannot produce a scatter graph from it.

        Seats: it has a range of only 2-5, so it would produce poor scatter graphs and it would be hard to find a direct relationship.

Then from the remaining categories I picked age, insurance group, MPG, mileage and of course used price, as this is what I was investigating. It then dawned on me that I could use the depreciation, that is, the price when new minus the used price. This could perhaps give a more accurate look at the data, as some cars depreciate quicker than others. Looking further into it, I decided against this as it would take longer and time was of the essence, but it could be added as an extension at the end.
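The depreciation idea can be written as a short calculation. This is only a sketch: the function names are my own, the figures in the usage lines are made up for illustration and are not from the spreadsheet, and the percentage version is an extra assumption (one possible way to compare cars with very different new prices), not something worked out in this coursework.

```python
def depreciation(price_when_new: float, used_price: float) -> float:
    """Value lost by the car: price when new minus used price."""
    return price_when_new - used_price


def percentage_depreciation(price_when_new: float, used_price: float) -> float:
    """ASSUMPTION (not in the coursework): loss as a percentage of the new price."""
    return 100 * depreciation(price_when_new, used_price) / price_when_new


# Illustrative figures only, not taken from Spreadsheet 1.
print(depreciation(10000, 6000))             # value lost in pounds
print(percentage_depreciation(10000, 6000))  # value lost as a percentage
```

Using the percentage rather than the raw loss would stop expensive cars from dominating the comparison, but either could be plotted against age or mileage in the same way as used price.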

Reasons Why

  • Age: has a large range, and it would be interesting to see what sort of relationship there is.
  • Insurance Group: again, a wide range.
  • MPG: has quite a large range, and the data could be grouped for use on a cumulative frequency graph.
  • Mileage: has a huge range and definitely affects used price, but it would be interesting to see exactly how much.

Sample

I was given 100 cars, but investigating all of them would be very time consuming, so I had to bring that number down. In the end I chose a sample of 40 cars, as it is a round number, lower than 100 but still big enough to give a fair representation of the data supplied.


Sampling Method

Now that I have decided how big my sample needs to be, I have to decide how I will sample. There are two main methods, random and stratified. Eventually I want to try both, but for now I will use a random sample. To do this I will use the random number function on my calculator.

I pressed the random number button and a number with 3 decimal places was displayed. I then took the first 2 digits after the decimal point and used them to choose a car for my sample. If a number was repeated I ignored it and chose again.
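The same procedure could be imitated on a computer. The sketch below is not the calculator method itself: the function names are hypothetical, Python's random module stands in for the calculator's random key, and it assumes the cars are numbered 1 to 100 with the digits 00 standing for car 100, which the method above does not actually specify.

```python
import random

POPULATION = 100   # cars in the spreadsheet
SAMPLE_SIZE = 40   # cars wanted in the sample


def calculator_random(rng: random.Random) -> float:
    """Stand-in for the calculator key: a number from 0.000 to 0.999."""
    return rng.randrange(1000) / 1000


def first_two_digits(value: float) -> int:
    """First two digits after the decimal point, e.g. 0.472 gives 47."""
    return int(f"{value:.3f}"[2:4])


def draw_sample(seed=None) -> list:
    """Keep drawing random numbers until SAMPLE_SIZE different cars are chosen."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < SAMPLE_SIZE:
        number = first_two_digits(calculator_random(rng))
        # ASSUMPTION: 00 is treated as car 100 so that every car can be picked.
        car = POPULATION if number == 0 else number
        if car in chosen:
            continue  # repeated number: ignore it and draw again
        chosen.append(car)
    return chosen


sample = draw_sample(seed=1)
print(sorted(sample))  # the 40 car numbers in the sample
```

Fixing the seed only makes the run repeatable for checking; leaving it as None gives a different sample each time, which is closer to pressing the button on the calculator.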

EG.

Random ...
