


Maths Statistics Coursework

I have been given the task of finding what affects the price of a used car, using a spreadsheet of data on a hundred cars. The data recorded for each car were (see Spreadsheet 1):

Make              Model               Price When New
Used Price        Age                 Colour
Engine Size       Fuel Type           MPG
Mileage           Service             Owners
Length of MOT     Tax (Months left)   Insurance Group
Doors (Amount)    Style               Central Locking
Seats             Gearbox             Air Conditioning


Immediately from looking at those categories I omitted colour, fuel, service, doors, style, central locking, seats, gearbox, air conditioning and airbags. I omitted this data because it either has a narrow range or consists of words; these would be hard to show on graphs and would give me little evidence of what affects a used car's price.

E.g. Colour: Cannot be plotted on a scatter graph as the values are words.

        Seats: Has a range of only 2-5, so it would produce poor scatter graphs on which it would be hard to find a direct relationship.

Then from the remaining categories I picked age, insurance group, MPG, mileage and of course used price, as this is what I was investigating. It then dawned on me that I could use the depreciation, that is, the price when new minus the used price. This could perhaps give a more accurate look at the data, as some cars depreciate quicker than others. Looking further into

...read more.
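The depreciation figure described above (price when new minus used price) can be sketched in a few lines of Python. The car names and prices below are hypothetical and are not taken from the spreadsheet; the percentage version is an extra assumption that the original coursework does not state.

```python
def depreciation(price_new: float, price_used: float) -> float:
    """Value lost: the price when new minus the used price."""
    return price_new - price_used


def percent_depreciation(price_new: float, price_used: float) -> float:
    """Depreciation as a percentage of the price when new (an added assumption)."""
    return 100 * depreciation(price_new, price_used) / price_new


# Hypothetical cars: (name, price when new, used price)
cars = [("Car A", 10000.0, 4000.0), ("Car B", 20000.0, 14000.0)]

for name, new, used in cars:
    print(name, depreciation(new, used), percent_depreciation(new, used))
```

In this made-up example both cars lose the same amount of money, but the cheaper car loses a much larger share of its original price, which is one way of seeing that some cars depreciate quicker than others.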


The mileage groups were: 0-5000






With the cars sorted into these groups, I took 40% at random from each group (see Spreadsheet 3). To make the selection random I drew numbers out of a hat, each number corresponding to a car. I noted each number drawn and placed it back in the hat, so that the chance of drawing any one number stayed equal throughout. If I drew the same number twice, I ignored it, placed it back in and drew again.

Counting them, the sample actually contains 41 cars rather than 40. As 40 and 41 are very close, I left the sample as it was rather than tamper with it, which could have introduced bias.
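The sampling procedure above can be sketched in Python as follows. The group labels and sizes are hypothetical, since the real mileage groups are not all listed here, and `stratified_sample` is an illustrative name. Drawing from a hat and redrawing any duplicate amounts to sampling without replacement, which is what `random.Random.sample` does. The sketch also shows one possible reason, which the original does not confirm, for ending up with 41 cars: rounding 40% of each group separately need not add up to exactly 40.

```python
import math
import random


def stratified_sample(groups, fraction=0.4, seed=None):
    """Pick `fraction` of each group at random, without repeats."""
    rng = random.Random(seed)
    sample = {}
    for label, car_numbers in groups.items():
        # Round half up so the per-group sample size is a whole number of cars.
        k = math.floor(fraction * len(car_numbers) + 0.5)
        sample[label] = rng.sample(car_numbers, k)
    return sample


# Hypothetical groups of car numbers; the sizes add up to 100 cars.
sizes = {"group 1": 9, "group 2": 21, "group 3": 34, "group 4": 24, "group 5": 12}
groups, start = {}, 1
for label, size in sizes.items():
    groups[label] = list(range(start, start + size))
    start += size

sample = stratified_sample(groups, fraction=0.4, seed=1)
total = sum(len(chosen) for chosen in sample.values())
print(total)  # 41 with these hypothetical group sizes
```

With these made-up sizes the per-group counts are 4, 8, 14, 10 and 5, which total 41 even though 40% of 100 is 40.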

From this data I then drew scatter graphs, just as before.


  • Age: I believe there will be a strong negative correlation, as there was before, but it should be more evident as this is supposedly a more reliable sample.
  • MPG: I believe there will be a strong negative correlation, as there was before, but it should be more evident because the sample is more reliable.
  • Mileage: should have a strong negative correlation, for the reasons above.
  • Insurance group: should have a strong positive correlation, for the reasons mentioned above.

See graphs 5, 6, 7 and 8.
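The strength of the correlations predicted above is judged by eye from the scatter graphs. As a possible supplement, which the original coursework does not use, the product-moment correlation coefficient r puts a number on it: values near -1 indicate a strong negative correlation, values near +1 a strong positive one, and values near 0 little or no linear relationship. The ages and prices below are hypothetical, and `pearson_r` is an illustrative name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: age in years and used price in pounds.
ages = [1, 2, 3, 4, 5]
prices = [9000, 8000, 6500, 5000, 4500]

r = pearson_r(ages, prices)
print(round(r, 2))  # close to -1, i.e. a strong negative correlation
```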

Conclusions on Stratified Sampling.

As you can see, some very strange results came up.

  • Age showed the very strong negative correlation that I predicted.
  • MPG showed a strong negative correlation, also as I predicted.
  • Mileage proved very odd. The data fell into two groups: one showing high mileage and low price, the other low mileage and low price. From this I deduce that mileage is a limiting factor on used price.
  • Insurance group showed no correlation, with the data all over the place, so perhaps the result from my random sample was a fluke and insurance group in fact has little or no relationship with used price.
...read more.


From the individual graphs you can see that the majority of the cars fall in the 20,000 to 60,000 mile range in both the random and stratified samples. Standard deviation could perhaps tell me which sample is more accurate, so that could be an extension to this work. I mentioned a bell-shaped graph before. By this I mean one that rises slowly to a peak and then falls away, with the majority of the data in the middle and little or no data in the highest and lowest areas.

However, from the histograms I did not find any explanation for the oddly shaped stratified mileage scatter graph. Further investigation into this could prove interesting.
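As a sketch of the standard deviation extension suggested above, the mean and sample standard deviation of the mileages in each sample could be compared using Python's `statistics` module. The mileages below are hypothetical, and `summarise` is an illustrative name. Note that standard deviation measures how spread out the data are about the mean; on its own it does not prove that one sample is more accurate than the other.

```python
from statistics import mean, stdev


def summarise(mileages):
    """Return the mean and sample standard deviation of a list of mileages."""
    return {"mean": mean(mileages), "sd": stdev(mileages)}


# Hypothetical mileages for the two samples.
random_sample = [22000, 35000, 41000, 58000, 64000]
stratified_sample = [4000, 28000, 39000, 52000, 90000]

for name, data in [("random", random_sample), ("stratified", stratified_sample)]:
    summary = summarise(data)
    print(name, round(summary["mean"]), round(summary["sd"]))
```

A larger standard deviation in one sample would show that its mileages are more widely spread, which might be worth checking against the two clusters seen in the stratified scatter graph.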

Overall Conclusion

From all the work carried out above you can clearly see that many different things affect used car prices, some more than others. You could say that the different categories are limiting factors and that a combination of these results in the depreciation of a car's price.

As a further investigation I would look into the strange scatter graph produced by my stratified mileage sample. Perhaps by using standard deviation or other methods of representing data I could find out why it is so peculiar. I could also look at how one category affects another, such as engine size and mileage or engine size and MPG, and find a relationship between those. There are many more aspects that I could have considered; however, from the work I have done some things are certainly clear.

...read more.

This student written piece of work is one of many that can be found in our AS and A Level Probability & Statistics section.
