Used second hand cars
Introduction:
I have been provided with a set of data covering the factors that affect the second-hand prices of one hundred listed cars, such as age, mileage, number of owners, insurance group and MPG, to name a few. A spreadsheet of this information appears on the following pages.
My task is to relate the second-hand prices to a number of variables, and to show how and state why these variables are important. I must then interpret my results and finally draw conclusions from them.
Many problems of a statistical nature are made clearer when they are put in the form of a hypothesis. A hypothesis is generally considered to be a statement which may be true, but for which no proof has yet been found. I am going to choose no fewer than three factors which I feel have an effect on the prices of second-hand cars. To start my project I have come up with three relevant questions which I believe could help me with my investigation. The questions I have decided to investigate are:
- Does age have a large impact on the second-hand value?
- What about make? Would you rather spend more money on a stylish car than buy a newer car which is cheaper and less stylish?
- How many people care about the mileage? Is a high mileage a disadvantage?
From the database given to me I will randomly choose 30 cars and separate them from the other cars, recording their age, mileage and make.
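A random sample like this could be drawn with a short Python sketch. This is only an illustration: it assumes the cars are numbered 1 to 100, and the function name `choose_sample` and the `seed` parameter are my own and not part of the original data.

```python
import random

def choose_sample(car_numbers, sample_size=30, seed=None):
    """Pick sample_size distinct car numbers at random (no car is chosen twice)."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return sorted(rng.sample(list(car_numbers), sample_size))

# Assumption: the hundred cars in the database are numbered 1 to 100.
sample = choose_sample(range(1, 101), sample_size=30, seed=1)
print(sample)
```

Fixing the seed means someone repeating the investigation would get the same 30 cars; leaving it as `None` gives a fresh sample each time.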
Hypotheses:
I predict that age, make and mileage are three important factors that affect second-hand prices.
*AGE: The older the car, the lower its second-hand price.
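Car No: | Second-hand price | Rank 1 | Age | Rank 2 | Difference (d) | d² |
... | ... | 16 | 4 | 14.5 | -2.5 | 6.25 |
86 | 1664 | 28 | 10 | 1 | 22 | 484 |
92 | 4693 | 17 | 5 | 10.5 | 2.5 | 6.25 |
96 | 3995 | 20 | 5 | 10.5 | 5.5 | 30.25 |
98 | 2748 | 24 | 6 | 8.5 | 10.5 | 110.25 |
Total second-hand price = 209203
∑d² = 4284.20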
Formula:
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}$$
$$= 1 - \frac{6 \times 4284.20}{30(900-1)}$$
$$= 1 - \frac{25705.2}{26970}$$
$$= 1 - 0.9531$$
$$= 0.0469$$
$$= 0.05 \text{ (2 d.p.)}$$
You can compare two sets of rankings using Spearman's coefficient of rank correlation.
You use the formula
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}$$
d is the difference between the two rankings of one item of data. n is the number of items of data.
ρ is Spearman's coefficient of rank correlation.
The value of ρ will always be between -1 and +1.
-1: rankings in reverse order (strong negative correlation)
between -1 and 0: weak negative correlation
0: no correlation
between 0 and +1: weak positive correlation
+1: same ranking (strong positive correlation)
The Spearman's rank coefficient for the data is 0.05, which shows that there is almost no correlation between the age of a car and its second-hand price. This seems to contradict my hypothesis. However, I feel that this result is influenced by the prices of some of the cars when new.
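The Spearman's rank calculation described above could be checked with a short Python sketch. The function names `average_ranks` and `spearman_rho` are my own, and the ages and prices in the example are made-up values, not taken from my sample. Tied values share the mean of their ranks (as with the rank 14.5 in my tables), in which case the formula gives only an approximation.

```python
def average_ranks(values, descending=True):
    """Rank values (1 = largest by default); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical example: price falls steadily as age rises.
ages = [1, 2, 3, 4, 5]
prices = [9000, 7000, 6000, 4000, 2000]
print(spearman_rho(ages, prices))
```

In this made-up example the two rankings are in exactly reverse order, so the coefficient comes out at -1.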
Display: Cumulative frequency
Age of the cars | Frequency | Age of the cars (x) | Cumulative frequency |
1 | 2 | <1 | 2 |
2 | 5 | <2 | 7 |
3 | 4 | <3 | 11 |
4 | 6 | <4 | 17 |
5 | 2 | <5 | 19 |
6 | 4 | <6 | 23 |
7 | 2 | <7 | 25 |
8 | 4 | <8 | 29 |
9 | 0 | <9 | 29 |
10 | 1 | <10 | 30 |
In this cumulative frequency table the data shows that the mode is 4. This supports my averages calculation. It suggests that buyers prefer second-hand cars that are still relatively young. The cumulative frequency diagram shows how the cumulative frequency changes as the data value increases. The cumulative frequency is shown on the vertical axis and the data is shown on the horizontal axis on a continuous scale. I have drawn the cumulative frequency curve on graph paper on the next page.
I have used the cumulative frequency curve to find the upper quartile, median, lower quartile and interquartile range in order to draw a box and whisker diagram.
Display: Box and whisker diagram
To get an estimate of the median:
- Divide the total cumulative frequency by 2.
- Find this point on the cumulative frequency axis.
- Draw a line across to the curve and down to the horizontal axis.
- Read off the estimate of the median.
To get an estimate of the lower quartile:
- Divide the total cumulative frequency by 4.
- Find this point on the cumulative frequency axis.
- Draw a line across to the curve and down to the horizontal axis.
- Read off the lower quartile.
To get an estimate of the upper quartile:
- Divide the total cumulative frequency by 4 and multiply by 3.
- Find this point on the cumulative frequency axis.
- Draw a line across to the curve and down to the horizontal axis.
- Read off the upper quartile.
The interquartile range is the upper quartile minus the lower quartile.
The box plot shows the median, lower quartile, upper quartile and interquartile range, found using the cumulative frequency curve for the age of the selected cars.
Median | 3.7 |
Lower quartile | 2.1 |
Upper quartile | 5.8 |
Interquartile range | 5.8 - 2.1 = 3.7 |
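The steps for reading the median and quartiles off the curve can be approximated in Python. This is a rough sketch only: it joins the cumulative frequency points with straight lines rather than a smooth curve, it assumes the curve starts at (0, 0), and the function name `estimate_from_cumulative` is my own.

```python
from itertools import accumulate

def estimate_from_cumulative(upper_bounds, cumulative, target):
    """Interpolate the data value at which the cumulative frequency reaches target."""
    prev_x, prev_cf = 0, 0  # assumption: the curve starts at (0, 0)
    for x, cf in zip(upper_bounds, cumulative):
        if cf >= target:
            # Move along the straight line between the two neighbouring points.
            return prev_x + (target - prev_cf) / (cf - prev_cf) * (x - prev_x)
        prev_x, prev_cf = x, cf
    raise ValueError("target is larger than the total frequency")

ages = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
frequencies = [2, 5, 4, 6, 2, 4, 2, 4, 0, 1]
cumulative = list(accumulate(frequencies))  # running total of the frequencies
total = cumulative[-1]

median = estimate_from_cumulative(ages, cumulative, total / 2)
lower_quartile = estimate_from_cumulative(ages, cumulative, total / 4)
upper_quartile = estimate_from_cumulative(ages, cumulative, 3 * total / 4)
print(median, lower_quartile, upper_quartile, upper_quartile - lower_quartile)
```

With straight lines the estimates are roughly 3.7, 2.1 and 5.9, which is close to the values read from the hand-drawn curve.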
The median is nearly the same as the median obtained in the averages calculation. The interquartile range happens to have the same value as the median. The box and whisker diagram, also known as a box plot, is drawn on the back of the cumulative frequency curve.
The box and whisker diagram has a positive skew. The median is not in the middle of the diagram. It is closer to the lower quartile.
Median - Lower Quartile < Upper Quartile - Median
M - LQ < UQ - M
Percentage depreciation:
Car No: | Make | Price when new | Second hand price | Difference | Age | Percentage depreciation |
2 | Mercedes | 16000 | 7999 | 8001 | 1 | 23.75043328 |
4 | Vauxhall | 8785 | 1595 | 7190 | 4 | 53.96160558 |
6 | Renault | 7875 | 1495 | 6380 | 4 | 63.26965467 |
10 | Vauxhall | 8748 | 1995 | 6753 | 4 | 40.79200592 |
14 | Vauxhall | 9105 | 2300 | 6805 | 2 | 43.87640449 |
19 | Rover | 12125 | 4295 | 7830 | 6 | 72.06683351 |
24 | Fiat | 11800 | 4700 | 7100 | 8 | 78.21969697 |
27 | Toyota | 8680 | 3200 | 5480 | 2 | 45.68884058 |
32 | Rover | 14505 | 8800 | 5705 | 8 | 82.22686879 |
37 | Renault | 13230 | 8250 | 4980 | 8 | 70.6401766 |
40 | Fiat | 13183 | 3495 | 9688 | 7 | 81.86653772 |
43 | Ford | 17780 | 7995 | 9785 | 7 | 77.19478738 |
47 | Vauxhall | 6590 | 1664 | 4926 | 6 | 78.8937409 |
50 | Daewoo | 15405 | 3995 | 11410 | 3 | 53.85826772 |
52 | Ford | 7310 | 1050 | 6260 | 3 | 64.57731959 |
54 | Bentley | 9995 | 2995 | 7000 | 8 | 77.76002248 |
57 | Ford | 16000 | 7999 | 8001 | 4 | 63.13364055 |
62 | Peugeot | 8785 | 1595 | 7190 | 3 | 58.533309481 |
65 | Ford | 7875 | 1495 | 6380 | 3 | 37.64172336 |
66 | Peugeot | 8748 | 1995 | 6753 | 1 | 17.80821918 |
70 | Fiat | 9105 | 2300 | 6805 | 2 | 53.79278446 |
72 | Mercedes | 12125 | 4295 | 7830 | 2 | 33.77483444 |
73 | Porsche | 11800 | 4700 | 7100 | 6 | 40.9152902 |
77 | Mercedes | 8680 | 3200 | 5480 | 2 | 34.41250349 |
81 | Ford | 14505 | 8800 | 5705 | 4 | 55.03374578 |
84 | Vauxhall | 13230 | 8250 | 4980 | 4 | 36.53061224 |
86 | Ford | 13183 | 3495 | 9688 | 10 | 74.74962064 |
92 | Volkswagen | 17780 | 7995 | 9785 | 5 | 46.11940299 |
96 | Ford | 6590 | 1664 | 4926 | 5 | 74.06686141 |
98 | Renault | 15405 | 3995 | 11410 | 6 | 76.50277897 |
I have found the percentage depreciation of each car using:
$$\text{Percentage depreciation} = \frac{\text{Price when new} - \text{Second-hand price}}{\text{Price when new}} \times 100$$
This will help me to clarify the relationship between the depreciation in price and the age of the car.
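The percentage depreciation formula could be written as a small Python function. The function name `percentage_depreciation` and the prices in the example are made up for illustration and do not come from my table.

```python
def percentage_depreciation(price_new, price_second_hand):
    """Loss in value as a percentage of the price when new."""
    if price_new <= 0:
        raise ValueError("price when new must be positive")
    return (price_new - price_second_hand) * 100 / price_new

# Hypothetical car: cost 10000 when new and sells for 4000 second hand.
example = percentage_depreciation(10000, 4000)
print(example)  # 60.0
```

A car that has lost 6000 of its 10000 new price has depreciated by 60 percent.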
Using the percentage depreciation, I have calculated the four-point moving averages for this data.
The results follow. The graph is drawn on graph paper on the next page.
Moving Averages
45.44443818
50.47506767
55.00137465
58.73888522
59.96283519
69.55045127
69.19378704
70.10549723
77.98209262
77.14881065
72.95333343
68.6310289
68.77233767
64.83231259
66.00101936
59.2671203
44.27916948
41.94395545
35.75439036
40.72385315
41.03409348
41.72303793
50.18162054
43.97569235
57.86662432
67.859666
Moving averages are averages worked out for a given number of items of data as you work through the data.
A three-point moving average uses three items of data at a time.
A four-point moving average uses four items of data, and so on.
I have decided to use four-point moving averages.
The moving averages fluctuate from point to point, so the results look fairly random. However, the trend line suggests a negative trend overall. This suggests that there would be a decrease in frequency with an increase in age.
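The idea of a moving average described above can be sketched in Python. The function name `moving_averages` is my own, and the numbers in the example are made up rather than taken from my percentage depreciation table.

```python
def moving_averages(values, window=4):
    """Average each run of `window` consecutive values, moving along one item at a time."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical data: five values give two four-point moving averages.
print(moving_averages([2, 4, 6, 8, 10], window=4))  # [5.0, 7.0]
```

A list of n values gives n - 3 four-point moving averages, because each average needs four items of data.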
Standard deviation:
The standard deviation, s, of a set of data is given by the formula:
$$s = \sqrt{\frac{\sum f x^2}{\sum f} - \bar{x}^2}$$
The higher the standard deviation, the more spread out the data is. The above formula gives the same result as the other formula but is much easier to work with, especially when the mean is not a whole number.
The other formula is:
$$s = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f}}$$
I have decided to find the standard deviation of the age of my 30 cars.
Age (x) | Frequency (f) | fx | x² | f x² |
1 | 2 | 2 | 1 | 2 |
2 | 5 | 10 | 4 | 20 |
3 | 4 | 12 | 9 | 36 |
4 | 6 | 24 | 16 | 96 |
5 | 2 | 10 | 25 | 50 |
6 | 4 | 24 | 36 | 144 |
7 | 2 | 14 | 49 | 98 |
8 | 4 | 32 | 64 | 256 |
9 | 0 | 0 | 81 | 0 |
10 | 1 | 10 | 100 | 100 |
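Total | 30 | 138 | | 802 |
Mean: $\bar{x} = \frac{\sum fx}{\sum f} = \frac{138}{30} = 4.6$
The formula becomes
$$s = \sqrt{\frac{802}{30} - 4.6^2} = \sqrt{26.7333 - 21.16} = \sqrt{5.5733}$$
Standard deviation = 2.3608
= 2.36 (3 s.f.)
The mean age of the second-hand cars is 4.6 and the standard deviation is 2.36. This indicates that the ages are quite widely spread about the mean.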
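The totals in the frequency table and the standard deviation could be checked with a short Python sketch. The function name `mean_and_sd` is my own. It uses the frequencies from the table above and divides by the total frequency, which is an assumption about the version of the formula intended.

```python
from math import sqrt

def mean_and_sd(values, freqs):
    """Mean and standard deviation from a frequency table (dividing by the total frequency)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    mean = sum_fx / n
    sd = sqrt(sum_fx2 / n - mean ** 2)
    return n, sum_fx, sum_fx2, mean, sd

ages = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
frequencies = [2, 5, 4, 6, 2, 4, 2, 4, 0, 1]

n, sum_fx, sum_fx2, mean, sd = mean_and_sd(ages, frequencies)
print(n, sum_fx, sum_fx2, mean, round(sd, 2))
```

Adding up the columns in code is a useful check on the totals in the table, since a slip in one total carries through to the standard deviation.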
I have used Spearman's rank to find out whether there is a relationship between the percentage depreciation and the second-hand price of the cars.
Spearman’s rank:
Car No: | Second-hand price | Rank 1 | Percentage depreciation | Rank 2 | Difference (d) | d² |
2 | 10999 | 5 | 23.75043328 | 2 | 3 | 9 |
4 | 6595 | 11 | 53.96160558 | 14 | -3 | 9 |
6 | 4999 | 13 | 63.26965467 | 18 | -5 | 25 |
10 | 7499 | 9 | 40.79200592 | 7 | 2 | 4 |
14 | 4995 | 14.5 | 43.87640449 | 9 | 5.5 | 30.25 |
19 | 3795 | 21 | 72.06683351 | 21 | 0 | 0 |
24 | 1495 | 30 | 78.21969697 | 27 | 3 | 9 |
27 | 7495 | 10 | 45.68884058 | 10 | 0 | 0 |
32 | 1700 | 27 | 82.22686879 | 30 | -3 | 9 |
37 | 1995 | 25.5 | 70.6401766 | 20 | 5.5 | 30.25 |
40 | 1500 | 29 | 81.86653772 | 29 | 0 | 0 |
43 | 1995 | 25.5 | 77.19478738 | 25 | 0.5 | 0.25 |
47 | 2900 | 23 | 78.8937409 | 28 | -5 | 25 |
50 | 4395 | 18 | 53.85826772 | 13 | 5 | 25 |
52 | 4295 | 19 | 64.57731959 | 19 | 0 | 0 |
54 | 37995 | 1 | 77.76002248 | 26 | -25 | 625 |
57 | 3200 | 22 | 63.13364055 | 17 | 5 | 25 |
62 | 5795 | 12 | 58.533309481 | 16 | -4 | 16 |
65 | 8250 | 6 | 37.64172336 | 6 | 0 | 0 |
66 | 7500 | 8 | 17.80821918 | 1 | 7 | 49 |
70 | 4995 | 14.5 | 53.79278446 | 12 | 2.5 | 6.25 |
72 | 17500 | 3 | 33.77483444 | 3 | 0 | 0 |
73 | 19495 | 2 | 40.9152902 | 8 | -6 | 36 |
77 | 11750 | 4 | 34.41250349 | 4 | 0 | 0 |
81 | 7995 | 7 | 55.03374578 | 15 | -8 | 64 |
84 | 4976 | 16 | 36.53061224 | 5 | 11 | 121 |
86 | 1664 | 28 | 74.74962064 | 23 | 5 | 25 |
92 | ...
Conclusion
As I mentioned in my introduction, I believe that my sample was large enough to represent the population fairly. The price of a car decreases as its mileage increases, and vice versa. The more reputable and prestigious the make of the car, the more expensive it is. I have shown clearly that the older the car, the lower its second-hand price, and the lower the mileage, the higher the second-hand car's value. If someone else carried out the same investigation, the chance of his or her findings matching my results is about 50 to 60 percent. This is because of other factors, such as the size of the sample, the sampling method, the hypotheses and the time provided. If I were to do the investigation again, the things I would do differently are:
I would also compare my results with those of my fellow students to see whether there is any match. I think that if someone else were to read my report it would be fairly easy to understand, as I have shown all the calculations step by step and given a brief description of each. I have restated my hypotheses throughout and explained my graphs and what they show. I do not think I have included any irrelevant statistical calculations, irrelevant statistical diagrams or inappropriate conclusions. Therefore, anyone reading my coursework would not find any misleading information.