My First Hypothesis
As I have already explained why I am conducting this investigation, I started straight away by investigating the mileage of the car against the price. I think that the mileage has a lot to do with the price of a car, and I expect a strong negative correlation because, in my experience of cars, the higher the mileage, the lower the price. At first I thought that a car’s price depended solely on its mileage.
My first step was to produce a scatter graph in Excel, using a graph type that would join the dots up. The first thing I noticed about the graph was that there was no clear correlation between the price and the mileage. The spread of the data looked very uneven, and the graph gave me no way of showing that my hypothesis was correct.
I decided to use Spearman’s rank correlation coefficient. This formula is good for measuring correlation without drawing a graph, because it ranks the data and compares the ranks rather than the raw values.
I ranked the data according to the mileage, then I ranked the price. I performed the calculation and my end result confirmed my suspicions. The Spearman’s coefficient was -0.275. The closer the answer is to zero, the weaker the correlation. This answer shows that the correlation is negative but weak, so it seems that the mileage doesn’t have much to do with the price.
I was very surprised that the mileage didn’t have much to do with the price. I am going to investigate other areas that might tell me what affects the price a lot.
As this hypothesis has shown that the mileage doesn’t have much of an effect on the price, my second hypothesis might be more helpful.
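The ranking and calculation described above can be sketched in Python. This is only an illustration: the mileage and price values are made up and are not the data from my investigation, and the function names `ranks` and `spearman` are my own. It uses the usual formula 1 - 6Σd² / (n(n² - 1)), which is exact only when there are no tied ranks.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of equal values starting at sorted position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical example values, NOT the cars from the investigation.
mileage = [12000, 45000, 30000, 8000, 60000]
price = [9500, 4000, 7000, 6500, 5000]

print(round(spearman(mileage, price), 3))  # -0.6
```

An answer near -1 or +1 would mean a strong correlation, while an answer near 0, like my -0.275, means a weak one.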
My Second Hypothesis
I am now going to investigate if the age of the car has any effect on the price.
I believe that the car’s age will have more effect on the price than the mileage did, and that this investigation will support my hypothesis more strongly.
Again I started off by drawing a scatter graph. I did this because I feel it is a good way to start and it shows the spread of the data easily, just by looking at it.
I expected the graph to show good correlation. Again I was very surprised to see that the graph showed no correlation. So I decided to calculate Spearman’s coefficient again to make sure.
The Spearman’s coefficient showed no correlation between the price and the age, agreeing with the graph.
The standard deviation shows the spread of the data, so I wanted to find out what the spread of each variable was like. The age had a low standard deviation of 1.7, which means that the ages are very close together. However, the standard deviation for the price was £3,814.71. This is very high, showing me that the prices are spread very widely indeed.
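The standard deviation calculation can be sketched as below. The ages and prices here are made-up illustration values, not my real data, and I am assuming the population form of the formula (dividing by n), which Python’s `statistics.pstdev` also uses.

```python
import math
import statistics


def standard_deviation(values):
    """Population standard deviation: square root of the mean squared deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


# Hypothetical example values, NOT the cars from the investigation.
ages = [1, 2, 3, 4, 5]
prices = [9500, 4000, 7000, 6500, 5000]

print(standard_deviation(ages))
print(standard_deviation(prices))

# The hand-written function should agree with the standard library.
print(math.isclose(standard_deviation(prices), statistics.pstdev(prices)))
```

The two answers are in different units (one in the units of age, one in pounds), so each one describes the spread of its own variable only.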
My Third Hypothesis
As there seems to be no correlation between price and either mileage or age, I am going to see if the make has anything to do with the price.
Which make is the most expensive, and which is the cheapest?
When I first arranged the cars in alphabetical order by make, my first impression was that Vauxhall was the most expensive and Citroen was the cheapest. But then I noticed that the ages and mileages were very spread out. So I decided to test my earlier hypotheses again, but within each make of car. I feel this will give me a more accurate result.
The graphs produced for each make do not show any correlation between mileage and price or between age and price. I originally thought these would give me a better breakdown of what was going on and why I wasn’t getting any significant results. I don’t think there would be any point in calculating a standard deviation for all of them, as I can easily see from the graphs that there is little or no correlation.
I am going to sort the data into age groups, as I feel this lets me break the data down more easily and analyse it better. After looking at the graphs for the breakdown of price against age, I can see why I am getting these results. The reason is that the price ranges overlap. For example, the range for Age 1 is £6,000 to £15,500, for Age 2 it is £5,000 to £16,000, for Age 3 it is £4,000 to £14,000, and so on. The graphs point this out as well. This is quite surprising and revealing, because I had thought I was getting all of my results wrong.
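The overlap between neighbouring age groups can be checked with a short sketch. It uses only the three price ranges quoted above; the function name `overlap` is my own.

```python
# Price ranges (lowest, highest) in pounds for each age group, as quoted in the text.
price_ranges = {
    1: (6000, 15500),
    2: (5000, 16000),
    3: (4000, 14000),
}


def overlap(a, b):
    """Return the shared (low, high) part of two ranges, or None if they do not overlap."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low <= high else None


ages = sorted(price_ranges)
for younger, older in zip(ages, ages[1:]):
    shared = overlap(price_ranges[younger], price_ranges[older])
    print(f"Age {younger} and Age {older} share prices in the range {shared}")
```

Because almost the whole price range of one age group also appears in the next, knowing the age of a car narrows down its price very little, which fits the weak correlation I found.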
The same pattern happens with the mileage. As the graphs for mileage against price show, as you go up in bands of 10,000 miles the price varies a lot within each band, so the price ranges overlap here as well. This is why the scatter graph and the standard deviation turn out the way they do. I have marked on each graph where it starts to overlap the one preceding it.
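Grouping the cars into bands of 10,000 miles and finding the price range in each band could be sketched like this. The list of cars is made up for illustration and is not my real data, and the function name `price_range_by_band` is my own.

```python
from collections import defaultdict


def price_range_by_band(cars, band_width=10000):
    """Group (mileage, price) pairs into mileage bands and return each band's (min, max) price."""
    bands = defaultdict(list)
    for mileage, price in cars:
        # Integer division puts 0-9,999 miles in band 0, 10,000-19,999 in band 1, and so on.
        bands[mileage // band_width].append(price)
    return {
        band * band_width: (min(prices), max(prices))
        for band, prices in sorted(bands.items())
    }


# Hypothetical (mileage, price) pairs, NOT the cars from the investigation.
cars = [(4000, 12000), (9000, 7000), (15000, 13000), (18000, 5500), (27000, 9000)]

for start, (lowest, highest) in price_range_by_band(cars).items():
    print(f"{start} to {start + 9999} miles: £{lowest} to £{highest}")
```

If the price ranges printed for neighbouring bands cover much of the same ground, the bands overlap in the way I marked on my graphs.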
Evaluation
Now that I have analysed all the data, my results seem inconclusive. None of the standard deviations or graphs has given a clear indication of what affects the price the most. The only thing I can put my finger on after this investigation is that the make of car seems to have the biggest effect, because many of the Vauxhalls are more expensive than the other makes of car. If I were to carry this out further, I would investigate other factors such as the engine size, condition, type and model of the car. I think these other factors, together with mileage, age and make, all contribute to the price of the car.