- Car Number
- New Price (£)
- Old Price (£)
- Mileage
I selected the new price and the old price to see how the mileage has affected the prices of each car and the difference between them.
To investigate the relationship between mileage and car price, I am going to plot the data on a scatter diagram and see whether the correlation supports my hypothesis.
This scatter diagram shows a slight negative correlation between mileage and car price. However, there also appear to be some outliers in the car price, so I am going to identify them using the following calculations:
LQ: Lower Quartile: (40 / 4) * 1 = 10th number = £2310
UQ: Upper Quartile: (40 / 4) * 3 = 30th number = £8500
IQR: Interquartile Range: UQ - LQ = £8500 - £2310 = £6190
IQR * 1.5 = £6190 * 1.5 = £9285
LQ - £9285 = £2310 - £9285 = -£6975
UQ + £9285 = £8500 + £9285 = £17785
Therefore, any value below -£6975 or above £17785 is an outlier.
Outliers: £19345 and £25810
The outliers are shown in red in the scatter diagram.
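The boundary calculation above can be checked with a short Python sketch. The function names `iqr_fences` and `find_outliers` are made up for illustration, and the short list of prices contains only the four values quoted in the text, not the full data sheet.

```python
def iqr_fences(lq, uq, k=1.5):
    """Return the (lower, upper) outlier boundaries for the given quartiles."""
    iqr = uq - lq              # interquartile range
    margin = k * iqr           # 1.5 * IQR
    return lq - margin, uq + margin


def find_outliers(values, lq, uq):
    """Return the values that fall outside the boundaries."""
    lower, upper = iqr_fences(lq, uq)
    return [v for v in values if v < lower or v > upper]


# Quartiles quoted in the text for the car price
lower, upper = iqr_fences(2310, 8500)
print(lower, upper)  # -6975.0 17785.0

# Only the values quoted in the text (assumption: not the full data set)
prices = [2310, 8500, 19345, 25810]
print(find_outliers(prices, 2310, 8500))  # [19345, 25810]
```

Because a price cannot be negative, only the upper boundary can flag anything in this data.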
Next, I am going to check for outliers in the mileage by following the same process:
LQ: Lower Quartile: (40 / 4) * 1 = 10th number = 12000
UQ: Upper Quartile: (40 / 4) * 3 = 30th number = 49000
IQR: Interquartile Range: UQ - LQ = 49000 - 12000 = 37000
IQR * 1.5 = 37000 * 1.5 = 55500
LQ - 55500 = 12000 - 55500 = -43500
UQ + 55500 = 49000 + 55500 = 104500
Therefore, any value below -43500 or above 104500 is an outlier. There are no outliers in this data.
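The "10th number" and "30th number" rule used for the quartiles can be sketched in Python as follows. The function name `quartiles_by_position` and the list of 40 mileages are hypothetical, made up only to show the positions being picked, and the sketch assumes the number of values divides exactly by 4.

```python
def quartiles_by_position(values):
    """Pick the (n/4)th and (3n/4)th values of the sorted data, counting from 1."""
    ordered = sorted(values)
    n = len(ordered)
    lq_pos = n // 4            # 40 values -> 10th value
    uq_pos = (3 * n) // 4      # 40 values -> 30th value
    return ordered[lq_pos - 1], ordered[uq_pos - 1]


# Hypothetical mileages: 1000, 2000, ..., 40000 (not the real data sheet)
mileages = [i * 1000 for i in range(1, 41)]
print(quartiles_by_position(mileages))  # (10000, 30000)

# Mileage quartiles quoted in the text
lq, uq = 12000, 49000
margin = 1.5 * (uq - lq)
print(lq - margin, uq + margin)  # -43500.0 104500.0
```

Other conventions for locating quartiles exist and can give slightly different values, so the boundaries depend on the rule chosen.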
Having identified the outliers in the car price, I am going to plot another scatter diagram.
The outliers are highlighted in a different colour.
The scatter diagram above shows a stronger negative correlation than the previous scatter diagram, which included the outliers.
You can see that the outliers affected the line of best fit. I am now going to calculate Spearman’s rank correlation coefficient to measure how strong the negative correlation is.
Using the data above, I am going to calculate Spearman’s coefficient of rank correlation using this formula:
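The standard form of Spearman’s coefficient, which is assumed to be the one meant here, is given below. In it, d is the difference between a car’s mileage rank and its price rank, and n is the number of cars. This form assumes there are no tied ranks.

```latex
r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}
```

A value close to +1 indicates a strong positive correlation, a value close to -1 a strong negative correlation, and a value close to 0 little or no correlation.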
After calculating Spearman’s coefficient of rank correlation, my result shows a strong positive correlation.
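As a cross-check on the hand calculation, the rank correlation can be sketched in Python. The function names `ranks` and `spearman` and the five mileage and price pairs are hypothetical, made up for illustration, and the sketch assumes no two values are tied.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman's coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical data: price falls every time mileage rises
mileage = [10000, 25000, 40000, 60000, 80000]
price = [9000, 7500, 6000, 4200, 2500]
print(spearman(mileage, price))  # -1.0
```

Both variables are ranked in the same direction here (rank 1 is the smallest value). If one variable is ranked smallest-first and the other largest-first, the sign of the coefficient flips, so the ranking direction is worth checking whenever the coefficient and the scatter diagram disagree.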
My hypothesis is: the car price decreases when the mileage increases.
The scatter diagram without the outliers shows a negative correlation. However, when I calculated Spearman’s coefficient of rank correlation, the result contradicted the scatter diagram.
Therefore, the results from the scatter diagram support my hypothesis. On the other hand, the result from Spearman’s coefficient of rank correlation does not support my hypothesis.
Improvements:
- Use a cumulative frequency diagram as well as a scatter diagram when finding the outliers, Spearman’s coefficient of rank correlation etc.
- Use a cumulative frequency diagram to find the lower quartile, upper quartile and interquartile range more accurately.
- The data sheet would be more reliable and appropriate if it had no missing data; the secondary data provided by the teacher had some values missing.