
Maths Statistics Coursework

I have been given the task of finding what affects the price of a used car, using a spreadsheet of data on a hundred cars. The data recorded for each car were: (See Spreadsheet 1)

Make                    Model                   Price When New
Used Price              Age                     Colour
Engine Size             Fuel Type               MPG
Mileage                 Service                 Owners
Length of MOT           Tax (Months left)       Insurance Group
Doors (Amount)          Style                   Central Locking
Seats                   Gearbox                 Air Conditioning


Immediately from looking at those categories I omitted colour, fuel type, service, doors, style, central locking, seats, gearbox, air conditioning and airbags. I omitted this data because it either has a small range or consists of words; such data would be hard to show on graphs and would give me little evidence of what affects a used car's price.

E.g. Colour: Cannot produce a scatter graph as it uses words.

        Seats: Has a range of 2-5, which would produce a poor scatter graph on which it would be hard to find a direct relationship.

Then from the remaining categories I picked age, insurance group, MPG, mileage and of course used price, as this is what I was investigating. It then dawned on me that I could use the depreciation, that is, the price when new minus the used price. This could perhaps give a more accurate look at the data, as some cars depreciate quicker than others. Looking further into

...read more.
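The depreciation described above (price when new minus used price) can be sketched in a few lines. This is only an illustration: the function names and the example prices are made up, and the percentage version is an extra that the coursework does not mention, added because it makes cheap and expensive cars easier to compare.

```python
def depreciation(price_new, price_used):
    """Amount of value lost: price when new minus used price."""
    return price_new - price_used


def percent_depreciation(price_new, price_used):
    """Value lost as a percentage of the price when new (assumed extra, not in the coursework)."""
    return 100 * (price_new - price_used) / price_new


# Hypothetical car: cost 12,000 when new and sells for 7,500 used.
lost = depreciation(12000, 7500)
lost_percent = percent_depreciation(12000, 7500)
print(lost, lost_percent)
```

Using the percentage rather than the raw amount would stop an expensive car from looking as if it depreciates quickly simply because it cost more to begin with.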


The mileage groups were:   0-5000

With these sorted, I took 40% at random from each group and ended up with the sample in Spreadsheet 3. I ensured the selection was random by drawing numbers out of a hat, each number corresponding to one car. I noted each number drawn and placed it back in, so that the chance of drawing any single number was equal each time and didn’t change. If I drew the same number twice I simply ignored it, placed it back in and redrew.       (See Spreadsheet 3)

When actually counted, the sample contains 41 cars rather than 40. As 40 and 41 are very close, I left the sample as it was rather than tamper with it, which could have made the results biased.
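The sampling method above can be sketched in code. Drawing from a hat, replacing the number and ignoring repeats picks distinct cars with equal chance, which is what `random.sample` does. The group sizes and the group labels other than 0-5000 are hypothetical, since the real ones are not shown here; they are chosen only to show one possible way that rounding 40% within each group can give 41 cars instead of 40.

```python
import random


def stratum_size(group_size, percent):
    """Number of cars to draw from one group, rounding half up."""
    return (group_size * percent + 50) // 100


def stratified_sample(groups, percent, seed=None):
    """Draw the given percentage at random from every group, without repeats."""
    rng = random.Random(seed)
    return {
        label: rng.sample(cars, stratum_size(len(cars), percent))
        for label, cars in groups.items()
    }


# Hypothetical group sizes adding up to 100 cars (not the real data).
sizes = {"0-5000": 4, "group B": 9, "group C": 14, "group D": 23, "group E": 50}

# Number the cars 1 to 100 and deal them into the groups in order.
groups = {}
next_id = 1
for label, size in sizes.items():
    groups[label] = list(range(next_id, next_id + size))
    next_id += size

sample = stratified_sample(groups, percent=40, seed=1)
total = sum(len(cars) for cars in sample.values())
print(total)
```

With these made-up sizes the groups contribute 2, 4, 6, 9 and 20 cars, so the total is 41 even though 40% of 100 is 40.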

From this data I then drew scatter graphs, just as before.


  • Age: I believe that there will be a strong negative correlation, as there was before, but as this is supposedly a more reliable sample it should be more evident.
  • MPG: I believe there will be a strong negative correlation, as there was before, but it should be more evident because the sample is more reliable.
  • Mileage: there should be a strong negative correlation, for the reasons given above.
  • Insurance group: there should be a strong positive correlation, for the reasons given above.

See graphs 5, 6, 7 and 8.
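The strength of each correlation predicted above is judged by eye from the scatter graphs. One way to put a number on it is the product moment correlation coefficient, which runs from -1 (perfect negative) through 0 (none) to +1 (perfect positive). The sketch below is an optional extra, not part of the original method, and the ages and prices in it are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ages (years) and used prices (pounds), not taken from the spreadsheet.
ages = [1, 2, 3, 4, 5, 6]
prices = [9000, 8000, 6500, 5500, 4000, 3500]

r = pearson_r(ages, prices)
print(round(r, 2))
```

A value close to -1, as these invented figures give, would back up a claim of strong negative correlation; a value near 0 would match a graph with points all over the place.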

Conclusions on Stratified Sampling.

As you can see, some very strange results came up.

  • Age showed the very strong negative correlation that I said there would be.
  • MPG showed a strong negative correlation as well, as I said it would.
  • Mileage proved very strange. The data fell into two groups, one showing high mileage and low price and the other low mileage and low price. From this I can deduce that mileage is a limiting factor of used price.
  • Insurance group showed no correlation, with the data scattered all over the graph. This suggests that the result from my random sample was perhaps a fluke and that insurance group in fact has little or no relationship with used price.
...read more.


From the individual graphs you can see that the majority of the cars are in the 20,000 to 60,000 mile range in both the random and stratified samples. Standard deviation could perhaps tell me which sample is more accurate, so that could be an extension to the work done. I mentioned a bell-shaped graph before. By this I mean one that rises slowly to a peak and then falls away, with the majority of the data in the middle and little or no data at the highest and lowest ends.

However, the histograms did not give me any explanation for the strangely shaped stratified scatter graph. Further investigation into this could prove interesting.
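The standard deviation extension mentioned above could be started along these lines. The mileages are invented, not taken from either sample, and the function name is only for illustration. Standard deviation measures how spread out the values are around the mean, so on its own it shows which sample is more tightly grouped; comparing each sample's figures with those of all one hundred cars would give a better idea of which sample represents the data more closely.

```python
import statistics


def spread_summary(mileages):
    """Return the mean and the sample standard deviation of a list of mileages."""
    return statistics.mean(mileages), statistics.stdev(mileages)


# Hypothetical mileages for the two samples (not the real spreadsheet values).
random_sample = [12000, 25000, 38000, 41000, 57000, 90000]
stratified_sample = [22000, 30000, 35000, 42000, 48000, 55000]

random_mean, random_sd = spread_summary(random_sample)
strat_mean, strat_sd = spread_summary(stratified_sample)

print(round(random_mean), round(random_sd))
print(round(strat_mean), round(strat_sd))
```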

Overall Conclusion

From all the work carried out above you can clearly see that many different things affect used car prices, some more than others. You could say that the different categories are limiting factors and that a combination of these results in the depreciation of a car's price.

As a further investigation I would look into the strange scatter graph produced by my stratified mileage sample. Perhaps by using standard deviation or other methods of representing data I could find out why it is so peculiar. I could also look at how one category affects another, such as engine size and mileage or engine size and MPG, and find a relationship between those. There are many more aspects that I could have considered; however, from the work I’ve done there are things that are certainly clear.

...read more.

This student written piece of work is one of many that can be found in our AS and A Level Probability & Statistics section.
