These are my comparative pie charts:
These pie charts show that the used prices of Mercedes cars are higher than the used prices of Ford cars.
This suggests that the make does affect the used price of a car, although I have only tested two makes and used different models of each. To make my results more reliable, I would have to draw pie charts for other makes of car.
Hypothesis 2
"The larger the mileage the cheaper the used price of a car".
I'm going to investigate whether the second-hand price of a car is affected by its mileage.
I am going to use 25 cars to produce a scatter diagram. My scatter diagram will have mileage on the x axis and the used price on the y axis, because the price is the variable being affected.
I will use only Fords so that my data isn't biased by different makes of car. When selecting my data I decided not to use all of the Fords because I didn't want my scatter diagram to be too busy and hard to read, so I used random sampling, which gives every car an equal chance of being used. I did this by putting the numbers of all the cars into a hat and picking out 25.
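The hat method described above can be sketched in Python. This is only an illustration: the function name `pick_sample`, the total of 60 Fords and the seed are assumptions made for the example, not figures from my spreadsheet.

```python
import random

def pick_sample(car_numbers, sample_size, seed=None):
    """Pick sample_size different cars, each equally likely to be chosen."""
    rng = random.Random(seed)  # the seed is optional; it only makes the draw repeatable
    return rng.sample(list(car_numbers), sample_size)

# Hypothetical: 60 Fords numbered 1 to 60 (the real total is not stated here).
ford_numbers = range(1, 61)
sample = pick_sample(ford_numbers, 25, seed=1)
print(sorted(sample))
```

Like picking numbers out of a hat, `random.sample` draws without replacement, so no car can be chosen twice.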
I predict that my scatter diagram will have a negative correlation because I think the lower the mileage the higher the cost.
My scatter diagram shows:
- No correlation at all
- There are many points around the 10,000-mile mark
- The most expensive car has the lowest mileage
- The second cheapest car has the same mileage as the most expensive car
This scatter diagram does not back up my hypothesis.
To gather more accurate results, I will look at another make of car.
I am going to use 25 cars to produce a scatter diagram. My scatter diagram will have mileage on the x axis and the used price on the y axis, because the price is the variable being affected.
I will use only Mercedes so that my data isn't biased by different makes of car. When selecting my data I decided not to use all of the Mercedes because I didn't want my scatter diagram to be too busy or hard to read, so I used random sampling, which gives every car an equal chance of being used. I did this by putting the numbers of all the cars into a hat and picking out 25.
I predict that my scatter diagram will have a negative correlation because I think the lower the mileage the higher the cost.
I am not going to draw a line of best fit on either of these graphs because neither of them shows any correlation.
My scatter diagram shows:
- There is no correlation at all
- There are many points between 10,000 and 20,000 miles
These graphs do not prove that the used price of a car gets cheaper if the mileage is higher. To be more accurate, I would have to look at other makes of car and sample using only one model.
Hypothesis 3
"The older the car is, the less expensive the used price".
I'm going to investigate whether the second-hand price of a car is affected by its age.
I am going to use 25 cars to produce a scatter diagram. My scatter diagram will have age on the x axis and the used price on the y axis, because the price is the variable being affected.
I will use only Fords so that my data isn't biased by different makes of car. When selecting my data I decided not to use all of the Fords because I didn't want my scatter diagram to be too busy and hard to read, so I used random sampling, which gives every car an equal chance of being used. I did this by using the same method that I used in hypothesis 2.
I predict that my scatter diagram will have a negative correlation because I think the younger the car the higher the cost.
I drew my line of best fit through the mean point $(\bar{x}, \bar{y})$. I worked out that the mean age was 7.92 years and the mean price was £3479, so my line of best fit goes through the point (7.92, 3479).
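Working out the mean point can be sketched as below. The five ages and prices are made up for the illustration and are not my real sample, and `mean_point` is just a name chosen for this example.

```python
def mean_point(ages, prices):
    """Return (mean age, mean price), the point the line of best fit passes through."""
    if len(ages) != len(prices) or not ages:
        raise ValueError("ages and prices must be non-empty and the same length")
    return sum(ages) / len(ages), sum(prices) / len(prices)

# Hypothetical sample of five cars (not the real data).
ages = [2, 5, 8, 10, 15]
prices = [7000, 5200, 3400, 2500, 900]

x_bar, y_bar = mean_point(ages, prices)
print(x_bar, y_bar)  # 8.0 3800.0
```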
My scatter diagram shows:
- There is a negative correlation
- It is a weak/moderate correlation
- There are no obvious anomalous results
I can use the line of best fit to make predictions, e.g. a Ford aged 10 would cost approximately £3000. However, the line of best fit has limitations: as it continues it reaches £0 and even goes into negatives, which would mean that if I were to sell a car at the age of 12 I would be paying somebody to take it. Realistically, that just wouldn't happen.
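The prediction and its limitation can be shown with a short sketch. The gradient of -500 and intercept of 6000 are hypothetical values chosen to keep the arithmetic simple; they are not read from my graph, so the prices printed are not my real predictions.

```python
def predict_price(age, gradient, intercept):
    """Price predicted by the straight line y = gradient * age + intercept."""
    return gradient * age + intercept

def age_when_price_hits_zero(gradient, intercept):
    """Age at which a falling line crosses the x axis (predicted price of zero)."""
    if gradient >= 0:
        raise ValueError("the line must slope downwards to reach zero")
    return -intercept / gradient

# Hypothetical line: price falls by 500 for each year of age, starting from 6000.
GRADIENT = -500.0
INTERCEPT = 6000.0

print(predict_price(10, GRADIENT, INTERCEPT))         # 1000.0
print(age_when_price_hits_zero(GRADIENT, INTERCEPT))  # 12.0
print(predict_price(14, GRADIENT, INTERCEPT))         # -1000.0, an impossible price
```

Any prediction beyond the age where the line crosses zero is meaningless, so the line should only be trusted within the range of ages in the sample.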
To gather more accurate results, I will look at another make of car.
I used 25 Mercedes, selected with the same random sampling technique as in hypothesis 2.
I predict that my scatter diagram will have a negative correlation because I think the lower the age the higher the cost.
To put a line of best fit onto this graph I used the mean point $(\bar{x}, \bar{y})$, as I did with the graph for Ford cars.
The mean age this time was 7.64 years and the mean price was £14,982.60.
This graph shows:
- There is a negative correlation
- It is a very weak correlation
- There aren't any obvious anomalous results
Again, I can use the line of best fit to make predictions, e.g. a Mercedes aged 10 would cost approximately £9000. The limitations of the line of best fit are just the same: according to the line, if I were to buy an 11-year-old car it would be free.
Both of these graphs suggest that the used price of a car is affected by its age. These graphs could also be affected by what model the cars are, but the model doesn't seem to have as large an impact here as it appeared to have with the mileage.
I will now investigate the correlation further by calculating Spearman's rank correlation coefficient (SRCC).
Ford cars:
$$\text{SRCC} = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{24{,}580.8}{13{,}800} = -0.78$$
Mercedes cars:
$$\text{SRCC} = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{24{,}292.5}{15{,}600} = -0.56$$
Using Spearman's rank correlation coefficient, I have found that the correlation between age and used price for Ford cars is -0.78, which is a fairly strong negative correlation, whereas for Mercedes cars it is -0.56, which is a medium/strong negative correlation and weaker than the one for Fords.
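The SRCC calculation can be sketched in Python as below. The six ages and prices are hypothetical and are not taken from my samples, and the function names are chosen only for this example. Tied values are given the mean of the ranks they share, which is a common convention; with ties, this simple formula is only approximate.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upwards; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """SRCC = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference in ranks."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical sample of six cars (not the real data).
ages = [1, 3, 4, 6, 9, 12]
prices = [9000, 7500, 8000, 4000, 4500, 1500]
print(round(spearman(ages, prices), 2))  # -0.89
```

Here the sum of d² is 66, so SRCC = 1 - 396/210, which is about -0.89, a strong negative correlation for this made-up data.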
To add to and improve my results, I will now work out the equation of each line of best fit using y = mx + c.
For ‘scatter diagram to show how age affects the used price of a Ford car’:
$$m = \frac{5}{-2} = -2.5, \qquad c = 2.5$$
Therefore $y = -\frac{5}{2}x + 2.5$
For ‘scatter diagram to show how age affects the used price of a Mercedes car’:
$$m = \frac{2}{-1} = -2, \qquad c = 6$$
Therefore $y = -2x + 6$
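Finding the gradient and intercept from two points read off a line of best fit can be sketched as follows. The two points used are hypothetical, not the ones I read from my graphs, and `line_through` is a name made up for this example.

```python
def line_through(p1, p2):
    """Return (m, c) for the straight line y = m*x + c through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the points must have different x values")
    m = (y2 - y1) / (x2 - x1)  # gradient = change in y divided by change in x
    c = y1 - m * x1            # intercept, from rearranging y1 = m*x1 + c
    return m, c

# Hypothetical points read off a line of best fit.
m, c = line_through((2, 9), (6, 1))
print(f"y = {m}x + {c}")  # y = -2.0x + 13.0
```

A negative value of m matches a negative correlation: the price goes down as the age goes up.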
To conclude, my first hypothesis was “The make/type of a car does affect the used price”. I produced comparative pie charts for two makes and found that the used prices of the Mercedes cars were higher than those of the Ford cars, which backs up my hypothesis, although I only tested two makes.
My second hypothesis was "The larger the mileage the cheaper the used price of a car". I took random samples and produced scatter diagrams, and found that in the samples I took the scatter diagrams did not back up my hypothesis, although this could be improved by controlling more variables, such as the model.
My final hypothesis was "The older the car is, the less expensive the used price". I took more random samples and produced scatter diagrams, and I also used SRCC to find the strength of the correlation. I found that my hypothesis was right.
If I were to redo my coursework to make it better, I would use primary data, as the spreadsheet I was given could be out of date and isn't as reliable as primary data. I would also use a better sampling technique, such as quota sampling, because that would make it a fairer test.
I could also improve my work by using larger samples, which would make my results more precise and reliable and would help me to investigate a larger population.