Representation & Interpretation of Data
Hypothesis 1 - I predict that as the age of the car increases, the percentage of depreciation in the value of the car increases.
For this hypothesis I have split the cars into 6 groups (strata): 1-2 years, 3-4 years, 5-6 years, 7-8 years, 9-10 years and 11+ years old. These age groups cover the whole population. The following pages show the data from the database in the 6 groups. Each group lists the car number from the database, followed by the information relevant to the hypothesis: the make and model of the car, its price when bought new and its price when sold on second hand. The last column shows the depreciation in the car’s value as a percentage, calculated using the formula: (difference in price / original price) X 100. To do the stratified sample I need to pick a proportional number of cars from each group. Out of the 100 cars in the database I want a sample of 30 cars, so to work out how many cars to pick from each group I divide the number of cars in the group by the total population, which is 100, and multiply by 30.
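As a minimal sketch, the depreciation formula can be written as a short Python function. The function name and the prices in the example are made up for illustration; they are not taken from the database.

```python
def depreciation_percent(price_new, price_used):
    """Depreciation as a percentage of the original (new) price."""
    # (difference in price / original price) x 100, multiplying first
    # so that whole-number prices give tidy results.
    return (price_new - price_used) * 100 / price_new


# Hypothetical car: 8000 when new, 2000 when sold second hand.
print(depreciation_percent(8000, 2000))  # 75.0
```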
1-2 Years
19/100 X 30 = 5.7 ≈ 6
Therefore from this group I will take a random sample of 6 cars.
3-4 Years
20/100 X 30 = 6
5-6 Years
27/100 X 30 = 8.1 ≈ 8
7-8 Years
24/100 X 30 = 7.2 ≈ 7
9-10 Years
8/100 X 30 = 2.4 ≈ 2
11+ Years
2/100 X 30 = 0.6 ≈ 1
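The allocation above can be checked with a short Python sketch. The group sizes are the ones quoted in the calculations; the function name `allocate` and the dictionary layout are only illustrative choices.

```python
import math

# Number of cars in each age group, as quoted in the calculations above.
AGE_GROUPS = {
    "1-2 years": 19,
    "3-4 years": 20,
    "5-6 years": 27,
    "7-8 years": 24,
    "9-10 years": 8,
    "11+ years": 2,
}


def allocate(group_sizes, sample_size):
    """Share the sample out in proportion to the size of each group."""
    total = sum(group_sizes.values())
    allocation = {}
    for name, size in group_sizes.items():
        share = size * sample_size / total           # e.g. 19 * 30 / 100 = 5.7
        allocation[name] = math.floor(share + 0.5)   # nearest whole car
    return allocation


allocation = allocate(AGE_GROUPS, 30)
print(allocation)
print(sum(allocation.values()))  # 30
```

Rounding each group separately does not always add up to the intended sample size, so it is worth checking the total as done in the last line.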
Now I know how many cars I need to take from each group and so a random sample using a random number generator is needed.
By generating a random number on a calculator and multiplying it by the total number of cars in the group, I get a number between 1 and the total number of cars in the group. I repeat this as many times as stated above for each group. For example, in the first group, I generate a random number using the ‘RAN#’ button and multiply it by 19 to get 5.871 (6 to the nearest whole number), so I choose the 6th car in the group. This is repeated 5 more times because I need a sample of 6 from this group.
1-2 Years
The 6th, 10th, 4th, 9th, 8th and 16th cars need to be picked from this group.
3-4 Years
The 17th, 11th, 19th, 1st, 6th and 8th cars.
5-6 Years
The 21st, 11th, 1st, 18th, 15th, 23rd, 19th and 5th cars.
7-8 Years
The 20th, 14th, 17th, 23rd, 22nd, 15th and 16th cars.
9-10 Years
The 8th and 4th cars.
11+ Years
The 1st car.
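The calculator method used for these picks can be imitated in Python as a rough sketch. Here `random.random()` stands in for the ‘RAN#’ button, and the seed is arbitrary. Skipping a result of 0 and skipping repeated positions are my assumptions, since the text does not say how those cases were handled.

```python
import random


def pick_positions(group_size, how_many, rng):
    """Pick distinct positions in a group by scaling random numbers."""
    chosen = []
    while len(chosen) < how_many:
        # Random number between 0 and 1, scaled up to the group size
        # and rounded to the nearest whole number.
        position = round(rng.random() * group_size)
        # Assumption: 0 and repeated positions are thrown away and redrawn.
        if position == 0 or position in chosen:
            continue
        chosen.append(position)
    return chosen


rng = random.Random(1)  # arbitrary seed so the run can be repeated
picks = pick_positions(19, 6, rng)  # 6 cars from the 19 in the 1-2 years group
print(picks)
```

Python’s `random.sample(range(1, 20), 6)` would give 6 distinct positions more directly; the version above is written to mirror the calculator steps.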
The following cars have been included in the sample and will be used to find the relationship between age and value of depreciation.
From the sample I have worked out the average value of depreciation for the cars in the sample. This will then allow me to plot the information onto a scatter diagram.
The graph shown above shows a very clear upward trend. There is a strong positive correlation between the value of depreciation and the age of the cars in years. The graph shows that the older the car, the bigger the difference between its price when new and its price second hand. For an older car, age has more significance in determining the second-hand price. A car that is only one or two years old is still relatively new, so many other factors are taken into account as well. An older car has generally done more miles, as shown by the graph below. This graph shows the general relationship between age and mileage: as the age increases, the number of miles done by the car generally increases. In such a large sample of 10 cars, there are a few exceptions and anomalies that do not follow this rule. This may be because some people use their car a lot more than the average person does.
Using the general equation of the line, I can predict the value of depreciation of a car from its age. The equation of this line is y = 11x + 29. I will now pick a car at random that was not used in the stratified sample and see whether it fits the graph and the equation. I have taken car number 22, which is the Rover 114Sli. This car is 6 years old and has depreciated in value by 71%. On the graph, the red dotted line shows a predicted depreciation value of 68%. It is only out by 3 percentage points. So the equation of the straight line is a reasonable general equation for the relationship between the age of the car and the value of depreciation.
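The check on car number 22 can be written out as a small sketch using only the two figures quoted in the text (71% actual, 68% read from the line). The function names are hypothetical. The sketch separates the gap in percentage points from the gap as a fraction of the actual value, which are easy to confuse.

```python
def points_out(actual, predicted):
    """Gap between actual and predicted depreciation, in percentage points."""
    return abs(actual - predicted)


def relative_error_percent(actual, predicted):
    """The same gap expressed as a percentage of the actual value."""
    return abs(actual - predicted) / actual * 100


# Car number 22: actual depreciation 71%, value read from the line 68%.
print(points_out(71, 68))                        # 3
print(round(relative_error_percent(71, 68), 1))  # 4.2
```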
I think that age is a major factor when determining the price of a second-hand car. This is because older cars are more likely to be faulty or to break down, as all the parts of the car were made some time ago. Also, the age means that the car has been used much more, so it is more likely to be in need of repair or to have worn parts. This causes the price to decrease. I think that as the car’s age increases the value of depreciation is higher, because age has more significance when the car is older. A car that is 1 year old is still relatively new, would not have done many miles and would contain many modern features, so the age of the car is only one of many factors that affect its second-hand price. A car that is 20 years old would have many features that are old, worn or broken, so all of these aspects are insignificant. Therefore, age is the main aspect to consider for older cars.
The hypothesis I stated above is correct. There is a general increase in the value of depreciation as the age of the car increases. The reasons for this are stated in the paragraph above.
I think that the investigation of this hypothesis could have been improved to gain more reliable and accurate results. Firstly, a larger stratified sample would have given me more reliable results, because with only 30 cars there is a greater chance that anomalous results affect the outcome. Secondly, instead of using groups spanning 2 years, I should have made one group for each age, i.e. group one should have been cars that are one year old, and so on. This would have made the results more accurate. However, despite this I think that my results are accurate and reliable enough to draw a good conclusion.
Hypothesis 2 – I predict that as the mileage of any car increases, the percentage value for its depreciation increases.
From the database all the cars have been split into 10 groups and a proportional number of cars will be randomly picked from each group. The groups are: 1000-10999 miles, 11000-20999 miles, 21000-30999 miles, 31000-40999 miles, 41000-50999 miles, 51000-60999 miles, 61000-70999 miles, 71000-80999 miles, 81000-90999 miles and 91000+ miles. These mileage groups cover the whole population. The following pages show the data from the database in the 10 groups. Each group lists the car number from the database, followed by the information relevant to the hypothesis: the make and model of the car and the mileage done by the car. To do the stratified sample I need to pick a proportional number of cars from each group, in the same way as for the first hypothesis. Again I want a sample of 30 cars, so to work out how many cars to pick from each group I divide the number of cars in the group by the total population, which is 100, and multiply by 30.
1000-10999 Miles
7/100 X 30 = 2.1 ≈ 2
Therefore from this group I will take a random sample of 2 cars.
11000-20999 Miles
11/100 X 30 = 3.3 ≈ 3
21000-30999 Miles
11/100 X 30 = 3.3 ≈ 3
31000-40999 Miles
13/100 X 30 = 3.9 ≈ 4
41000-50999 Miles
21/100 X 30 = 6.3 ≈ 6
51000-60999 Miles
15/100 X 30 = 4.5 ≈ 5
61000-70999 Miles
9/100 X 30 = 2.7 ≈ 3
71000-80999 Miles
7/100 X 30 = 2.1 ≈ 2
81000-90999 Miles
2/100 X 30 = 0.6 ≈ 1
91000+ Miles
4/100 X 30 = 1.2 ≈ 1
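One detail in the calculations above is that 4.5 is rounded up to 5. As a sketch of how this could be reproduced in Python: the built-in `round` rounds halves to the nearest even number, so the example uses whole-number arithmetic to round halves up instead. The helper name `round_half_up_share` is hypothetical; the group sizes are the ones quoted above.

```python
# Number of cars in each mileage group, in the order listed above.
MILEAGE_GROUP_SIZES = [7, 11, 11, 13, 21, 15, 9, 7, 2, 4]


def round_half_up_share(size, total, sample_size):
    """Proportional share of the sample, with halves rounded up."""
    # Same as floor(size * sample_size / total + 1/2), but using whole
    # numbers only so that no floating-point error can creep in.
    return (2 * size * sample_size + total) // (2 * total)


total = sum(MILEAGE_GROUP_SIZES)  # 100 cars
allocation = [round_half_up_share(s, total, 30) for s in MILEAGE_GROUP_SIZES]
print(allocation)       # [2, 3, 3, 4, 6, 5, 3, 2, 1, 1]
print(sum(allocation))  # 30
print(round(4.5))       # 4, which is why the built-in round is avoided here
```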
1000-10999 Miles
The 6th and 2nd cars need to be picked for the random sample.
21000-30999 Miles
The 4th, 8th and 2nd cars.
31000-40999 Miles
The 2nd, 1st, 7th and 10th cars.
41000-50999 Miles
The 2nd, 10th, 9th, 6th, 16th and 19th cars.
51000-60999 Miles
The 12th, 3rd, 7th, 11th and 1st cars.
61000-70999 Miles
The 5th, 1st and 8th cars.
71000-80999 Miles
The 4th and 1st cars.
81000-90999 Miles
The 2nd car.
91000+ Miles
The 2nd car.
The following cars have been included in the sample and will be used to find the relationship between miles done and the value of depreciation. I have used the value of depreciation for each car from hypothesis 1. From the sample I have worked out the average value of depreciation for the cars in the sample. This will then allow me to plot the information onto a scatter diagram.
This graph is very similar to the one shown in the hypothesis above. It shows that there is a general increase in the value of depreciation when the mileage of the car increases. There is a relatively strong positive correlation between the value of depreciation and the mileage done by the car. The second graph shown above as part of hypothesis 1 shows that the older the car, the more miles it has done, and this graph shows that the more miles done by the car, the higher the value of depreciation.
The equation of this graph is y = 6x + 31. This is very similar to the equation of the graph with age, which was y = 11x + 29. However, the gradient is nearly half of the gradient for the graph with age. Using this equation I can predict the value of depreciation of a car from its mileage. I will now pick a car at random that was not used in the stratified sample and see whether it fits the graph and the equation. I have taken car number 83, which is the Rover 416i. This car is 6 years old and has depreciated in value by 75%. On the graph, the red dotted line shows that the depreciation value should be 73%. It is only out by 2 percentage points. Again, as in hypothesis 1, the equation of the straight line is a reasonable general equation for the relationship between the mileage of the car and the value of depreciation.
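The comparison of the two gradients, and the check on car number 83, can be worked through in a few lines of Python. This is only a sketch using the figures quoted in the text. It assumes the two graphs use comparable horizontal scales, which the text does not state, and the variable names are illustrative.

```python
# Gradients taken from the two equations quoted in the text.
age_gradient = 11      # from y = 11x + 29
mileage_gradient = 6   # from y = 6x + 31

# How large the mileage gradient is compared with the age gradient.
ratio = mileage_gradient / age_gradient
print(round(ratio, 2))  # 0.55, a little over half

# Car number 83: actual depreciation 75%, value read from the line 73%.
gap_in_points = abs(75 - 73)
print(gap_in_points)  # 2
```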
My prediction for this hypothesis was correct because there is a general increase in the value of depreciation as the mileage of the car increases.
I think that, like age, the miles done by the car is a major factor when determining the price of a second-hand car. The graph above of the age of the car against the mileage shows that as the age of the car increases, the mileage increases as well. This is because the longer people have the car, the more miles they are likely to do in it. Therefore, older cars have a higher mileage. Mileage is as important as age in determining the price of the car when second-hand. The more the car has been used, the more likely it is to need servicing or to break down, because the more you use something and the older it gets, the less reliable it is. Therefore, its price will be very low, a large decrease from when it was bought new. A car that is only a year old will have done only a few miles, so it is likely that other factors will be significant in determining its price. If a car is 20 years old, it will have done many miles, which decreases the value of the car. Here the mileage is as important as the age.
I think that the investigation of this hypothesis could also have been improved to gain more reliable and accurate results. The stratified sample for this hypothesis was better than the first because there were more strata. This makes the results more reliable. However, again, a larger sample would have made the results even more reliable, as there is a chance of anomalous results.
From the analysis above we can see that the age and the mileage are very similar and have a close relationship with each other. They both strongly determine the price of a second-hand car, and they have more significance when the car is very old. To measure how strong the relationship between the two sets of data is, I will work out the product moment correlation coefficient.
This graph shows the relationship between the age and the mileage and how closely they are linked with each other. As the graph shows, the gradient of each line is different. The equations of the graphs (y = 11x + 29 for age, and y = 6x + 31 for mileage) show that the gradient for age is nearly double that for mileage. This suggests that age has more of an effect on the value of depreciation. This may be due to the fact that the mileage can be changed. Many second-hand car dealers reduce the recorded mileage of a car and sell it for a slightly higher price. This may be illegal, but it is still done to make more money, as people want cars with lower mileages. You cannot change the age of the car, so the price of a second-hand car is influenced more by its age.
The following shows a screenshot of Microsoft® Excel©. This shows how I worked out the Product Moment Correlation Coefficient for the two sets of data. The two sets of data are the two sets of values of depreciation – one for mileage and one for age. Cell C2 shows the formula that I used to work out the PMCC. Cell C3 shows the result.
0.9112938249 = 0.91 (2sf.). 0.91 is very close to +1. This shows that the two sets of data have a near-perfect positive correlation. The nearer the value is to 0, the weaker the correlation is. Here, however, we can see that the data have a high positive correlation.
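For reference, the product moment correlation coefficient can be computed by hand as in the sketch below. The text does not say which spreadsheet formula was used; Excel’s CORREL function computes the same quantity. The function name `pmcc` and the two short lists of depreciation values are made up for illustration and are not the values from the screenshot.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical depreciation values (%), one list per hypothesis.
by_age = [40, 51, 62, 70, 85]
by_mileage = [38, 55, 60, 74, 83]
print(pmcc(by_age, by_mileage))

# Rounding the coefficient quoted in the text to 2 significant figures.
print(round(0.9112938249, 2))  # 0.91
```

A result of +1 means a perfect positive straight-line relationship, -1 a perfect negative one, and values near 0 mean little or no linear relationship.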
Hypothesis 3 – I predict that saloons will have the lowest depreciation value.
For this hypothesis, the first task is to split the whole database into 3 groups – Saloons, Hatchbacks and others. The following pages show the population split into the three groups. Under each heading I have listed only the data that is relevant – car number, make, model, price when new and second-hand, style and depreciation value. As you can see from the table, there are clearly more hatchbacks than any other style, but this should not cause any bias or unreliable results. The pages after this show the data for the three groups split into 5 bands for the value of depreciation – 0-20%, 21-40%, 41-60%, 61-80% and 81-100%. For this I only need to know the depreciation value and nothing else, as I already know the style. For each style I can then count how many cars are in each band (the frequency) and plot this information on a histogram. The table shown after this shows the frequencies for each group.
This table shows how many cars from each style have a value of depreciation under each of the depreciation value groups. This will enable me to construct 3 histograms, from which I can obtain 3 frequency polygons to compare.
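Counting the cars in each depreciation band can be sketched in Python as follows. The band limits are the ones given in the text; the function name and the list of depreciation values are made up for illustration. Rounding each value to a whole percent before banding is my assumption, since the bands as written leave gaps between, for example, 20% and 21%.

```python
# Depreciation bands (%), as listed in the text.
BANDS = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]


def band_frequencies(depreciations):
    """Count how many depreciation values fall into each band."""
    counts = {f"{low}-{high}%": 0 for low, high in BANDS}
    for value in depreciations:
        whole = round(value)  # assumption: band on the whole-number percent
        for low, high in BANDS:
            if low <= whole <= high:
                counts[f"{low}-{high}%"] += 1
                break
    return counts


# Hypothetical depreciation values for one style of car.
example = [15, 35, 47, 52, 58, 66, 71, 88]
print(band_frequencies(example))
# {'0-20%': 1, '21-40%': 1, '41-60%': 3, '61-80%': 2, '81-100%': 1}
```

Running this once per style gives the three rows of frequencies needed for the histograms and frequency polygons.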
The following pages show the graphs and analysis of this hypothesis.
The first graph shown here shows the value of depreciation for hatchbacks and the number of cars that are in each group. This graph doesn’t really show anything so far, as there is no general trend.
The graph below shows the value of depreciation for saloons and the number of cars that are in each group.
The third graph shows the value of depreciation of the other styles of cars against the number of cars that are in each group.
From the polygons shown above, we can see that other cars and hatchbacks have most of their frequencies concentrated towards the middle percentages of depreciation. For saloons the frequencies are more evenly spread. There is no perfect trend in any of the graphs; however, the hypothesis I stated above is correct because the saloon has the lowest depreciation. I predicted that out of the three styles of car, saloons would have the lowest depreciation value. This is due to this style of car being very modern and attractive compared to the other styles. Based only on this aspect of the car, a saloon will depreciate least. Any other style would have a greater influence on the depreciation value.
I think that the investigation of this hypothesis would have been improved if I had grouped the data in a more effective way than just in bands of 20%. This would have made the results more accurate.
From this project I have seen that there are two major influences that strongly affect the price of a second-hand car. These are the age of the car and how many miles the car has done in its lifetime. These two influences have a close relationship, shown by the product moment correlation coefficient. I have shown that they have a heavier influence on the depreciation value for older cars than for younger cars. In this project, the main aim was to find out what the major influences were that affected the price of a second-hand car. Although the style of the car was very important, the main factors were age and mileage.