I have chosen Ford and Mercedes for this coursework because they occupy different positions in the market: Mercedes is in the higher tier and Ford is in the lower tier. Therefore, I can compare the rate of price depreciation between the two brands and test my second hypothesis.
In order to choose my sample of 38 cars for each make (Ford and Mercedes), I needed a fair sampling method that helps me to avoid bias. I used cluster sampling because the data is already in groups, which are the makes of cars. Consequently, I could randomly pick two groups for my investigation. I could have chosen systematic sampling or stratified sampling, but I did not use either of them. If I had chosen systematic sampling, I would not have got enough cars for the sample.
Mercedes- mileage
Mercedes- secondhand price
I decided to group my data because the spread of mileage and price is too big. In addition, the values all fall into different groups. The advantage of grouping is that the data is much easier to handle. However, the downside is that I no longer know the exact value for each car. Therefore, when I calculate the mean and median, I will not use this grouped data; I will use the actual numbers instead to produce an accurate result.
I put this grouped data in pie charts so that it is much easier to see the differences in price and mileage. For example, I can easily identify that 46% of the Mercedes cars have a mileage of 20000 or less and 38% of the Mercedes second hand prices are from GBP 10001 to 15000.
Moreover, I also put the grouped data in bar graphs. This is because a bar graph shows the records in column form, which makes comparison easy and saves me time when making quick comparisons of this large data set. Lastly, the pie chart shows percentages but not the exact number of cars, so the bar chart is still useful in this coursework.
According to the bar graphs, 17 out of 38 Mercedes cars have a mileage of 20000 or less and 14 out of 38 Mercedes second hand prices are between GBP 10001 and 15000.
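The grouping step described above can be sketched in Python. This is only an illustration: the mileages are made-up values rather than the coursework sample, and the class width of 20000 and the helper name mileage_class are assumptions chosen to mirror the "20000 or less" style of group.

```python
from collections import Counter

# Hypothetical mileages for illustration only (not the coursework sample).
mileages = [5000, 12000, 18000, 20000, 20001, 35000, 41000, 62000, 88000, 100300]


def mileage_class(miles, width=20000):
    """Return the label of the class interval that contains `miles`.

    Classes are '0-20000', '20001-40000', and so on, so a boundary value
    such as 20000 falls in the lower class.
    """
    index = max(miles - 1, 0) // width
    lower = index * width + (1 if index else 0)
    upper = (index + 1) * width
    return f"{lower}-{upper}"


# Frequency of each class (what a bar graph would show).
counts = Counter(mileage_class(m) for m in mileages)

# Share of each class as a percentage (what a pie chart would show).
percentages = {label: 100 * n / len(mileages) for label, n in counts.items()}

for label, n in counts.items():
    print(label, n, f"{percentages[label]:.0f}%")
```

The counts correspond to the bar graph and the percentages to the pie chart, which is why the two charts complement each other.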
My second hypothesis asks whether Mercedes depreciates slower in value than Ford. To test it, I will have to calculate the percentage decrease in price.
Mercedes -year group
Ford- mileage
As you can see, I grouped the data in the same way as I did for Mercedes. The only difference is that the groups are different. This is because the maximum mileage for Ford is 71000, whereas the maximum mileage for Mercedes is 100300. Therefore, there is no point in using the same grouping system for Ford, as some of the groups would be empty. The reason I chose to group the data is that it is much easier and quicker to analyse for Ford. However, the downside is that the exact mileage of each car is lost once the data is grouped.
From the pie chart, we can clearly see that 57% of the second hand Ford cars have a mileage of only 10000 or less. In comparison, 54% of the Mercedes cars have a mileage of at least 20001. There is a huge difference in mileage between the two brands, and this makes it harder for me to prove my second hypothesis, as mileage is one of the main factors that affect the second hand price of cars.
Ford- secondhand price
From my bar chart, the range of Ford second hand prices is much smaller than that of Mercedes. The range for Ford is GBP 14000 and the range for Mercedes is GBP 35000, so the range for Mercedes is 150% higher than the range for Ford.
This is the formula I used to work out how much higher the range is for Mercedes than for Ford: percentage difference = (Mercedes range - Ford range) ÷ Ford range × 100 = (35000 - 14000) ÷ 14000 × 100 = 150%.
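As a quick check of the 150% figure, the percentage difference between the two ranges can be worked out in a few lines of Python. The function name percent_higher is an assumption made for this sketch; the two range values are the ones quoted above.

```python
def percent_higher(larger, smaller):
    """How much higher `larger` is than `smaller`, as a percentage of `smaller`."""
    return (larger - smaller) / smaller * 100


ford_range = 14000      # range of Ford second hand prices (GBP)
mercedes_range = 35000  # range of Mercedes second hand prices (GBP)

print(percent_higher(mercedes_range, ford_range))  # 150.0
```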
Ford and Mercedes’ mean and median
Mercedes-mileage: Median=39000
Mercedes-price (GBP): Median=12665
Ford-mileage: Median=10000
Ford-price (GBP): Mean=4823.5, Median=4425
I worked out the mean of my data because it is usually the most representative average, as it uses all the data. However, outliers may distort it. In addition, I have also worked out the median, which is not distorted by outliers. On the other hand, it is not always a good representation of the data since it only uses the middle value. As a result, I will use it with the range and interquartile range to make it more useful.
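The effect of an outlier on the mean and the median can be illustrated with a short Python sketch. The prices below are hypothetical values chosen for the example, not the coursework data.

```python
from statistics import mean, median

# Hypothetical second hand prices in GBP (not the coursework sample).
prices = [3000, 4000, 4500, 5000, 6000]

# The same prices with one unusually expensive car added.
with_outlier = prices + [34000]

print("mean without outlier:  ", mean(prices))
print("median without outlier:", median(prices))
print("mean with outlier:     ", round(mean(with_outlier), 2))
print("median with outlier:   ", median(with_outlier))
```

In this example the single extreme price pulls the mean up by several thousand pounds, while the median moves only slightly.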
Mercedes and Ford interquartile range and range
[Interquartile range and range calculations for Mercedes-mileage, Mercedes-price (GBP), Ford-mileage and Ford-price (GBP)]
I calculated the range to see the spread of the data. In addition, I calculated the interquartile range, since outliers can easily affect the range, which is only the difference between the highest and lowest values. The interquartile range, however, tells me the range of the middle 50% of the data, where most of the information is. I did not choose to calculate the interdecile range because its function is the same as the interquartile range, except that it covers the middle 80% of the data.
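The range and interquartile range can be sketched in Python as follows. The mileages are hypothetical, and the quartile method used here (the "inclusive" method of statistics.quantiles) is an assumption; other methods of finding quartiles can give slightly different values.

```python
from statistics import quantiles

# Hypothetical mileages (not the coursework sample); 71000 acts as an extreme value.
mileages = [1000, 5000, 8000, 10000, 12000, 15000, 20000, 25000, 71000]


def spread(values):
    """Return (range, interquartile range) of the values."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return max(values) - min(values), q3 - q1


data_range, iqr = spread(mileages)
print("range:", data_range)
print("interquartile range:", iqr)
```

Here the one extreme mileage makes the range far larger than the interquartile range, which only reflects the middle 50% of the data.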
Ford-Secondhand price standard deviation
=Average (1055:1670)
=4823.5
=STDEV (1055:2670)
=3301.73
Mercedes-Secondhand price standard deviation
=Average (11395:11255)
=13673.02632
=STDEV (11395:11255)
=7501.06763
The standard deviations of the prices of Ford and Mercedes are both large, which shows that the data is very spread out around the mean.
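The same two spreadsheet calculations can be sketched in Python. The prices are hypothetical; statistics.stdev is used because, like the STDEV spreadsheet function, it gives the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, stdev

# Hypothetical second hand prices in GBP (not the coursework sample).
prices = [2000, 4000, 4000, 4000, 5000, 5000, 7000, 9000]

average = mean(prices)   # same role as =Average(...) in the spreadsheet
spread = stdev(prices)   # sample standard deviation, same role as =STDEV(...)

print("mean:", average)
print("standard deviation:", round(spread, 2))
```

Comparing the standard deviation with the mean gives a sense of how spread out the prices are relative to a typical price.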
In order to prove or disprove my first hypothesis, I will use Spearman’s rank correlation coefficient and scatter graphs to see the correlation between mileage and price for the Ford and Mercedes cars in my data.
Mercedes- Spearman’s rank correlation coefficient
Ford- Spearman’s rank correlation coefficient
Ford-second hand price and mileage
Mercedes-second hand price and mileage
Spearman’s rank correlation coefficients for Mercedes and Ford validate the relationship between second hand price and mileage because they give the same answer: as the mileage increases, the second hand price of the car decreases. The moderate negative correlation of -0.63131 for Ford and the fairly weak negative correlation for Mercedes both show that as the mileage increases, the price decreases.
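For reference, Spearman's rank correlation coefficient can be sketched in Python using the formula 1 - 6Σd² ÷ (n(n² - 1)), where d is the difference between the two ranks of each car. The data below is hypothetical, and this simple version assumes there are no tied values; with ties, the ranks would need to be averaged.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation coefficient for paired data without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical paired data (not the coursework sample).
mileage = [5000, 12000, 20000, 31000, 40000, 71000]
price = [9000, 8500, 6000, 6500, 3000, 2500]

print(round(spearman(mileage, price), 4))
```

A value close to -1 means that the cars with higher mileage almost always have lower prices; a value close to 0 means there is little relationship between the two rankings.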
In addition, the scatter graphs for Mercedes and Ford both show negative correlation. The points are widely scattered about a straight line, which is evidence of weak linear correlation.
On the scatter graph for Mercedes, the point (40000, 34000) is an outlier. A car that has run 40000 miles and is worth about GBP 34000 does not make sense, because a car with a mileage of 20000 is worth only about GBP 2600. Moreover, this point is far away from the line of best fit. These are the reasons why I believe it is an outlier.
Price depreciation
To prove or disprove my second hypothesis, which is that Mercedes depreciates slower in value than Ford, I will have to work out the depreciation rate of the cars. In order to find the depreciation rate, I will use the following formula: percentage depreciation = (price when new - second hand price) ÷ price when new × 100.
However, the formula above only provides how much the car has depreciated over a period and I would like to work out how much it depreciates each year. Therefore, I will use the following formula.
For example, a car was worth GBP 26040 when new and its second hand price is GBP 11395. Using the first formula, (26040 - 11395) ÷ 26040 × 100 = 56.24%, so the price has depreciated by 56.24% over 5 years.
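The worked example can be checked with a short Python sketch. The total depreciation follows directly from the figures in the example. The yearly figure is only an assumption: it divides the total evenly by the age of the car (a simple average), which may not be the formula used in this coursework. The function names are also made up for this sketch.

```python
def total_depreciation(new_price, used_price):
    """Percentage of the new price that has been lost."""
    return (new_price - used_price) / new_price * 100


def yearly_depreciation(new_price, used_price, age_years):
    """Average percentage lost per year.

    ASSUMPTION: the total is split evenly over the years (simple average),
    which may differ from the coursework's own yearly formula.
    """
    return total_depreciation(new_price, used_price) / age_years


total = total_depreciation(26040, 11395)
print(round(total, 2))  # 56.24

per_year = yearly_depreciation(26040, 11395, 5)
print(round(per_year, 2))
```

A compound (year on year) rate would give a different yearly figure from the simple average, so the choice of method matters when comparing cars of different ages.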
Mercedes
Ford
In order to compare the depreciation rate per year of Mercedes and Ford, I will use the arithmetic mean. Here is the formula I used to find this out: =average (E2:E39). The average depreciation rate per year is 11.39% for Ford and 10.46% for Mercedes. Therefore, Mercedes depreciates slower than Ford, which proves my second hypothesis.
Conclusion
In conclusion, I have proved both of my hypotheses: that Mercedes depreciates slower in value than Ford, and that as mileage increases, the price decreases.
I found a negative correlation between mileage and price using the correlation coefficient and scatter graphs, which supports the relationship. My graphs show that as the mileage increases, the price decreases. Furthermore, the sample size was representative since I chose 38 cars for each make, and I chose two different brands, Mercedes and Ford. Mercedes and Ford are representative of the entire car market since one of them is in the higher tier and the other is in the economy tier. From the depreciation rates, I have proved my second hypothesis, which means that prestigious cars depreciate slower than ordinary cars.
Evaluation
In my opinion, I could improve a few things in this coursework. For instance, I could gather some primary data, since I would know how it was obtained and how accurate it is. This would help to improve accuracy and avoid bias. However, I did not use primary data because collecting it is time consuming. As a result, I used secondary data only, which has both positive and negative aspects. The reason I chose to use secondary data is that the exam board provides it, so the reliability of the data is good. Additionally, I could introduce cars from other makes to my investigation to make it more representative and reliable.
Moreover, I should consider the color of cars as well, because color is a major factor that affects the appearance of a car. This factor affects the price of second hand cars directly, since some colors may be more popular than others; for example, black is more popular than pink. This would be worth investigating because it is interesting and the result might be different. Overall, I am satisfied with the work I have done, since I have proven my hypotheses without bias and without outliers affecting the investigation.