Gary's Car Sales.

My task for this coursework is to statistically analyse the data given to me regarding Gary's used car sales. I shall begin by looking at the data, which is given below.

The first question that occurred to me was whether there is any correlation between a car's age and the difference between its price when new and its resale price. If there was a correlation, I expected it to be positive, showing that as the age goes up, the drop in price between when the car is new and when it is re-sold increases. Below is a table of the price differences and the ages.

If you look at graph #1, you will see that I was wrong: there was no correlation between the price difference and the age. After seeing this, I thought that if I expressed the price difference as a percentage of the original price, I could plot that in a scatter diagram to look for a correlation between the percentage depreciation and the age. I found the depreciation with the equation

D = ((o − p) / o) × 100

where D = the depreciation (as a percentage), o = the original price and p = the price when used.
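As a rough sketch, the depreciation calculation could be written in Python as below. It assumes that the depreciation means the price drop expressed as a percentage of the original price, as described in the text. The function name and the example prices are hypothetical and are not taken from Gary's data.

```python
def depreciation_percent(original_price: float, used_price: float) -> float:
    """Price drop as a percentage of the original price (assumed meaning of D)."""
    if original_price <= 0:
        raise ValueError("original_price must be positive")
    return (original_price - used_price) / original_price * 100


# Hypothetical prices: both cars lose a different amount of money,
# but the percentage puts them on the same scale.
print(round(depreciation_percent(10000, 4000), 2))  # 60.0
print(round(depreciation_percent(5000, 4000), 2))   # 20.0
```

The first car loses £6000 and the second only £1000, yet it is the percentages (60% and 20%) that can be compared fairly between cars with different original prices.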

Looking at graph #2, which shows the scatter diagram of the depreciation versus the age, you can see a strong positive correlation where graph #1 showed none. This is because the price difference depends on the original price. If the original price is high, the price difference may be large in comparison to the others, but expressing the price difference as a percentage of the original price puts all the results on a common scale rather than on a scale that applies to one car only. A line of best fit on graph #2 allows us to estimate the depreciation of a car from its age. The equation of the line can also be used to find an estimate.

The equation of a line takes the form 'y=mx+c'.

m=gradient= y step/x step.

We can find the gradient by taking 2 points from the line.

(1.2,30) (7,72)

m = (y2 − y1)/(x2 − x1) = (72 − 30)/(7 − 1.2) = 42/5.8 = 7.24 to 2dp

y=7.24x+c. We can find c by using one of our points.

Using (1.2, 30): 30 = 7.24 × 1.2 + c, so c = 30 − 7.24 × 1.2 = 21.31 to 2dp

To check this, I will use the other point.

7 × 7.24 + 21.31 = 71.99, which is very close to 72, the value I read from the graph.
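The working above can be checked with a short Python sketch. It uses the two points read from the line of best fit, (1.2, 30) and (7, 72); the function names are only illustrative.

```python
def line_through(p1, p2):
    """Return the gradient m and intercept c of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no gradient")
    m = (y2 - y1) / (x2 - x1)
    c = y1 - m * x1
    return m, c


def estimate_depreciation(age, m, c):
    """Estimate the percentage depreciation for a given age using y = mx + c."""
    return m * age + c


m, c = line_through((1.2, 30), (7, 72))
print(round(m, 2), round(c, 2))                   # 7.24 21.31
print(round(estimate_depreciation(4, m, c), 2))   # 50.28
```

With unrounded values of m and c the line passes through (7, 72) exactly; the 71.99 in the check above comes from rounding m and c to 2dp first.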

Looking at the results in the table, I saw that for the 1 year old cars, the depreciations were in the region of 20% - 30%, while for the 2 year old cars they were in the region of 25% - 40%. If the rate of depreciation were constant, the 2 year old cars would have lost about twice as much as the 1 year old cars, so this showed me that the rate might not be constant. To test this, I found the mean depreciation per year for all the cars by dividing the mean depreciation by the mean age, and got 11.65%. I then found the individual depreciation per year for each car. Below is a table of the results.

As you can see from the table, the values of the annual depreciation are not close to the mean. The newer cars depreciate more per year than the older cars. Graph #3 shows that the results do not follow a straight line but a downward curve. The slope of the curve shows that the value of a car drops more in its first three years than at any other time.
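The comparison between each car's annual depreciation and the overall figure could be sketched in Python as follows. The (age, depreciation) pairs are hypothetical values chosen only to show the pattern; they are not the cars from Gary's table.

```python
def depreciation_per_year(cars):
    """Divide each car's percentage depreciation by its age in years."""
    return [depreciation / age for age, depreciation in cars]


def mean(values):
    return sum(values) / len(values)


# Hypothetical (age in years, depreciation in %) pairs.
sample_cars = [(1, 25.0), (2, 36.0), (4, 52.0), (8, 80.0)]

per_year = depreciation_per_year(sample_cars)
overall = mean([d for _, d in sample_cars]) / mean([a for a, _ in sample_cars])

print(per_year)            # [25.0, 18.0, 13.0, 10.0]
print(round(overall, 2))   # 12.87
```

In this made-up sample the yearly figure falls as the age rises, which is the same downward curve described for graph #3.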

The next thing I tried was to look at the prices on their own. I split the prices into ranges of £1000, going from 0<x≤1000 to 10000<x≤11000, and made a cumulative frequency table, which is below.
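A cumulative frequency table of this kind could be built with a sketch like the one below. The class width of 1000 and the top value of 11000 come from the ranges described above; the list of prices is hypothetical.

```python
def cumulative_frequency(prices, width=1000, top=11000):
    """Return (upper_bound, cumulative_count) pairs for the classes 0 < x <= upper_bound."""
    table = []
    for upper in range(width, top + 1, width):
        count = sum(1 for price in prices if 0 < price <= upper)
        table.append((upper, count))
    return table


# Hypothetical prices in pounds, not Gary's data.
sample_prices = [950, 1000, 2400, 3995, 4500, 6200, 10500]

for upper, count in cumulative_frequency(sample_prices):
    print(f"x <= {upper}: {count}")
```

A price that sits exactly on a boundary, such as £1000, is counted in the lower class, matching the 0<x≤1000 style of range.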

I found the standard deviation of the prices using the formula:

standard deviation = √(n∑x² − (∑x)²) / n

As the working out is long, I shall just give the answer, which is

standard deviation = £2440.20
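Since the working is not shown, here is a hedged sketch of how a standard deviation can be computed from n, ∑x and ∑x². It assumes the population form, the square root of n∑x² − (∑x)² divided by n; the text does not say whether the population or the sample version was used, so both library versions are printed for comparison. The three prices are hypothetical.

```python
import math
import statistics


def shortcut_sd(values):
    """Population standard deviation from n, the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt(n * sum_x2 - sum_x ** 2) / n


# Hypothetical prices in pounds, not Gary's data.
sample_prices = [2000, 4000, 6000]

print(round(shortcut_sd(sample_prices), 2))        # 1632.99
print(round(statistics.pstdev(sample_prices), 2))  # 1632.99 (population version)
print(round(statistics.stdev(sample_prices), 2))   # 2000.0 (sample version)
```

The shortcut agrees with `statistics.pstdev`, while `statistics.stdev` divides by n − 1 and so gives a slightly larger answer.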

I used the standard deviation in conjunction with the mean price, which was £4556.39, to find an estimate for the inter-quartile range. I calculated the mean plus and minus 1 standard deviation, which gave me a range close to the inter-quartile range I was going to obtain from a cumulative frequency graph.

Range = (4556.39-2440.2) to (4556.39+2440.2) = 2116.19 to 6996.59

Graph #4 shows a cumulative frequency graph.

As you can see from that graph, the inter-quartile range runs from £2400 to £6200, so its ends lie an average of £540.20 inside the range set by the mean and standard deviation.
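The arithmetic behind this comparison can be reproduced with a short sketch. The mean, the standard deviation and the two quartile values are the figures quoted in the text; the function names are only illustrative.

```python
def majority_range(mean, sd):
    """Range from one standard deviation below the mean to one above it."""
    return mean - sd, mean + sd


def average_inset(inner, outer):
    """Average distance by which the ends of `inner` sit inside the ends of `outer`."""
    lower_gap = inner[0] - outer[0]
    upper_gap = outer[1] - inner[1]
    return (lower_gap + upper_gap) / 2


majority = majority_range(4556.39, 2440.20)
quartiles = (2400, 6200)

print(tuple(round(v, 2) for v in majority))          # (2116.19, 6996.59)
print(round(average_inset(quartiles, majority), 2))  # 540.2
```

The lower quartile is £283.81 above the bottom of the range and the upper quartile is £796.59 below the top, which averages to £540.20.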

Looking at the first table, you can see that exactly ⅔ of the prices are within this range. We can call the range set by the mean and the standard deviation the majority; I shall be using this range rather than the inter-quartile range. The remaining 12 cars (car numbers 6, 7, 9, 10, 16, 17, 18, 26, 28, 34, 35, and 36) can therefore be used to look at the factor which most determines unusually high or low prices, as these cars will probably contain extremes of at least one of the factors of age, mileage, engine size, or price when new. If a large proportion of these cars have values for one factor outside its majority, then that factor could have had a bearing on the price, pushing it outside the majority. Below is a table of the majority ranges of the different factors.

Of the twelve extraordinary prices,

· 75% were outside the majority age,

· 75% were outside the majority mileage,

· 33% were outside the majority engine size,

· and 33% were outside the majority price when new.
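Percentages like those in the list above could be counted with a sketch such as this one. The ages and the majority age range used here are hypothetical, since the table of majority ranges is not reproduced in this text.

```python
def percent_outside(values, low, high):
    """Percentage of values lying below `low` or above `high`."""
    if not values:
        raise ValueError("values must not be empty")
    outside = sum(1 for v in values if v < low or v > high)
    return outside / len(values) * 100


# Hypothetical ages (in years) of unusually priced cars,
# and a hypothetical majority age range of 2.5 to 7.0 years.
unusual_car_ages = [1, 1, 2, 9, 10, 5, 4, 11]

print(percent_outside(unusual_car_ages, 2.5, 7.0))  # 75.0
```

Running the same function once for each factor (age, mileage, engine size and price when new) gives one percentage per factor, which can then be compared.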

To be clearer about which factor influenced the price most, we should look at the same factors again, but this time for the cars whose depreciation lies outside its majority range. The standard deviation of the depreciation is 21.77 and the mean is 55.97, making the majority range 34.20 to 77.74. There are 15 cars with depreciation values outside the majority range.

Of these 15 values,

· 80% were outside the majority age,

· 73% were outside the majority mileage,

· and 33% were outside the majority engine size.

One of the cars, car 25, relied on its engine size to make it an unusually priced car, as its mileage and age were both close to the mean values. The size of the engine gave it an unusually high depreciation.

All the above data suggest that the age and the mileage are the two most prominent deciding factors in the price and depreciation of a car, with the size of the engine entering into the equation if there is nothing out of the ordinary about the age and mileage.

In conclusion, I have discovered the following facts about the data provided:

· I found that there is no correlation between the price difference and the age, using a scatter diagram.

· I found that there is a strong positive correlation between the depreciation and the age, using another scatter diagram.

· I found a way to predict a percentage loss using a line of best fit and the line's equation.

· I found, using a scatter diagram, that a car decreases in value more in its first three years than at any other time.

· I found the main contributing factors and one backup factor to the price using the standard deviation, mean, inter-quartile range and majority range for the prices.