Used car prices.



Introduction

During this coursework I will be working from the data that has been given to me. The work is based on analysing this data and representing it in different forms. The table below is the given data that is to be interpreted:

In this coursework, I will be investigating the influences on the price of a second hand car. I have been given set data on the used car prices from a database, which I will study and will be able to give firm reasoning on the factors affecting the price of these second hand cars.

Problem Specification

I wish to find which factors have influenced the second-hand price. In order to do this, I am going to compare the prices against the other variables in the data to discover a link between them.

The database contains information about some used cars. Many different makes of cars are included. Use the information to investigate what influenced the price of second hand cars.

The title of my investigation is ‘Used car prices’. As this title suggests, I am going to investigate a real-life situation using statistics. I will use the data sheet given to me by my mathematics teacher. With the help of this data I will make a hypothesis and draw conclusions.

Plan

In this coursework I am measuring the same thing for more than one set of data. In this case I will be comparing the old and new prices of different makes of car. I will find the mean, median, mode and range for certain makes of car from the data I have been given. I will then represent this data with the help of pie charts, bar charts, scatter graphs, etc.

I will also be comparing the cars with each other. For example, I will collate the old and new prices of three or more models and compare them with each other, looking for which one is suitable to buy. These results will be displayed on pie charts and other types of graph.

With the help of cumulative frequency graphs I will estimate the median, upper quartile, lower quartile and the interquartile range (IQR). I will draw a cumulative frequency graph and make these estimates for one particular factor, or possibly more than one.

I will also describe the three types of correlation: positive correlation, negative correlation and no correlation. I will illustrate all three with the help of the data given. This will help to show me which car is worth buying.

 

[Figure: three example scatter graphs showing positive correlation, negative correlation and no correlation]

Finally, to achieve a top grade I will draw different graphs from different sets of data, as before, and with the help of these graphs I will find the gradient and the intercept on the y-axis of each line of best fit. I will also write statements so that the graphs can be understood properly. Standard deviations will also be calculated in this coursework.

Aim

My aim in this assignment is to find out what influences the prices of second-hand cars.

I will be able to estimate values that are not given by using the line of best fit, and compare these estimates with the actual second-hand prices.
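As an illustration of how a line of best fit can be used to estimate a price, here is a minimal sketch. It uses a least-squares line rather than a line drawn by eye, which is an assumption on my part, and the ages and prices are hypothetical values made up for the example, not figures from the database. The names `best_fit_line` and `estimate_price` are also just illustrative.

```python
# Hypothetical (age in years, second-hand price in pounds) pairs.
# These are NOT values from the coursework database.
ages = [1, 2, 3, 4, 5, 6]
prices = [9000, 8000, 7000, 6000, 5000, 4000]


def best_fit_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


gradient, intercept = best_fit_line(ages, prices)


def estimate_price(age):
    """Estimate the price of a car of the given age from the fitted line."""
    return gradient * age + intercept


print(gradient, intercept, estimate_price(7))
```

A negative gradient here would correspond to negative correlation: the price falls as the age increases.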

I am doing this so that I can find out which car is the least expensive, which is the most expensive, and whether each is worth buying.

Tally chart

From looking at the data presented to me, I hope that I will be able to see the relationship between the mileage and the age of the cars.

I predict that an older car will have a higher mileage. I think that this is because an older car will have been driven more than a new car and will therefore have a higher mileage. I also predict that there will be hardly any cars with an extremely high mileage (e.g. 90000-150000). I will then compare all the histograms produced by visual observation and see which mileage range contains the most cars.

The following table shows the data on Ford cars only:

Frequency table to show the number of cars in each mileage group- Ford

This shows that my hypothesis about the mileage was correct: there are hardly any cars with a very high mileage, and there are only a few with low mileages.
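The grouping of mileages into a frequency table can be sketched as follows. The mileages listed are hypothetical values chosen for illustration, and the class width of 10000 miles is an assumption based on the ranges mentioned in the text; `frequency_table` is an illustrative name.

```python
# Hypothetical mileages, for illustration only (not the Ford data).
mileages = [12000, 25000, 31000, 34000, 38000, 47000, 52000, 95000]
CLASS_WIDTH = 10000  # assumed class width in miles


def frequency_table(values, width):
    """Count how many values fall in each class lower <= v < lower + width."""
    counts = {}
    for v in values:
        lower = (v // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


table = frequency_table(mileages, CLASS_WIDTH)
for lower, freq in table.items():
    print(f"{lower} <= m < {lower + CLASS_WIDTH}: {freq}")
```

The class with the largest count is the tallest bar of the histogram when all classes have the same width.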

The following histogram shows the mileage data.

I will now follow the same steps as before to investigate the mileage of another make of car. I will be examining Fiat. The following table shows the data on Fiat cars only:

Frequency table to show the number of cars in each mileage group- Fiat

I have shown the above data on the graph below. This shows that my prediction was correct, as there are hardly any cars with a very high mileage and there are only a few with low mileages.

 

I will now produce another histogram, this time with the mileage data for Vauxhall cars.

The following table shows the data on Vauxhall cars only:

I have shown the above data on the histogram below. It clearly shows that a high number of the cars have a mileage in the range 20000-30000.

Conclusion on the histograms drawn

After conducting this part of my investigation I have observed which mileage range contains the most cars for each make. I have produced three histograms for three different makes. All these histograms represent the mileage data only.

By visual observation, most of the Ford cars have a mileage between 30000 and 40000. The histogram for the Fiat cars shows the same pattern. In both makes, most of the cars lie in the range of 30000 to 40000 miles.

By contrast, the Vauxhall data and histogram show that most of those cars lie in the range 20000-30000.

This could be due to many different reasons. In my opinion, the cars with the highest mileage are likely to belong to someone who drives very often or over long distances. This could be a salesman who drives from one state to another. Therefore the Fiats and Fords may belong to people who drive their cars for extensive distances.

The Vauxhalls, on the other hand, may belong to people who do not drive large distances, such as a person who works in Central London and lives in East London and drives to and from work every weekday.

A higher mileage will also bring the second-hand price down. The Vauxhalls, which have a lower mileage, would therefore be expected to have a higher second-hand price than the Fiats and the Fords. However, the second-hand price can also be affected by other variables, such as the condition or the engine size.

Therefore I cannot yet make a statement about which car is most suitable to buy. I shall now investigate further into the different variables that affect the second-hand price.

Pie chart

The following table shows the number of cars of each make, for comparison purposes:

Make        Number of cars
Fiat        10
Peugeot     5
Rover       12
Vauxhall    13
Ford        16
Total       56

Degrees to be plotted on pie chart = (360° ÷ Total number of cars) × Frequency

360° ÷ 56 ≈ 6.43° per car

  • Degrees for Fiat = (360° ÷ 56) × 10 = 64.29°
  • Degrees for Peugeot = (360° ÷ 56) × 5 = 32.14°
  • Degrees for Rover = (360° ÷ 56) × 12 = 77.14°
  • Degrees for Vauxhall = (360° ÷ 56) × 13 = 83.57°
  • Degrees for Ford = (360° ÷ 56) × 16 = 102.86°
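The angle calculation above can be checked with a short sketch. The frequencies are the ones used in the list above; the variable names are illustrative. Angles are rounded to two decimal places, so the last digit may differ slightly from figures that were truncated by hand.

```python
# Number of cars of each make, as used in the calculations above.
counts = {"Fiat": 10, "Peugeot": 5, "Rover": 12, "Vauxhall": 13, "Ford": 16}

total = sum(counts.values())   # total number of cars
degrees_per_car = 360 / total  # a full circle shared equally between all cars

# Angle of each sector, rounded to two decimal places.
angles = {make: round(n * degrees_per_car, 2) for make, n in counts.items()}

for make, angle in angles.items():
    print(f"{make}: {angle} degrees")
```

A useful check is that the unrounded angles add up to 360 degrees, since the sectors must fill the whole circle.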

I intend to display the number of cars of each make on a pie chart. This will show how the total number of cars is divided between the makes.

The pie chart represents the data that is accessible from the above table:

By looking at the graph above it can be seen that there are more Fords than any other make of car. There are very few Peugeots out of the total of fifty-six cars. I have also represented this as a bar graph on the next page, to show clearly the proportion of each make of car out of 56:

From these types of graph a lot more information can be obtained quickly and easily. For example, from the above graph it is easier and quicker to read off the number of cars of each make. This saves time, because there is no need to look through the table given on the first two pages.

Doughnut chart

Now a doughnut chart will represent the same data. This is similar to a pie chart, but it can contain more than one data series:

 

It is now known that the data can be represented in many ways.

Mean, mode, median and range – Ford  

I will now find the mean, median, mode and range of the prices for three makes of car. First I shall investigate the mean, median, mode and range for Ford.

I shall first find the mode, which for grouped data is known as the modal class. The modal class is the class with the highest frequency. By looking at the frequency column of the table above, it can easily be seen that 5 is the highest frequency. Therefore the modal class is £8k ≤ x < £10k. This shows me that this is the most frequent price range. It is one measure of the central tendency of the Ford prices.

Secondly I shall calculate the range. This will show me the difference between the lowest and the highest price of this make of car. The range can be calculated as follows:

Range = Highest Price – Lowest Price

Range = £18k – £4k = £14k

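For grouped data, the mean can be estimated by multiplying each frequency by the mid-interval value of its class, adding up these products, and dividing by the total frequency.

This can be written as the following:

Mean = ∑ (Frequency × Mid-interval) ÷ ∑ Frequency

Mean = (21 + 45 + 11 + 39 + 30 + 34) ÷ 16

Mean = 180 ÷ 16

Mean = £11.25k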
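The arithmetic for the mean can be checked with a short sketch. The six products (frequency × mid-interval) and the total frequency of 16 are taken from the calculation above; the variable names are illustrative.

```python
# Frequency x mid-interval products for the Ford price classes (in thousands of pounds).
products = [21, 45, 11, 39, 30, 34]
total_frequency = 16  # number of Ford cars

# Estimated mean price in thousands of pounds.
mean_price_k = sum(products) / total_frequency

print(f"Sum of products = {sum(products)}")
print(f"Mean = {mean_price_k}k pounds")
```

Note that the whole sum must be divided by 16, not just the last term, which is why the brackets matter when the calculation is written out by hand.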

             

The median is the middle value of a collection of numbers. To find it directly, the values must first be arranged from smallest to largest. If there were a large amount of data, it would be very time-consuming to put every value in ascending order. Therefore an easier method can be used: the position of the median is found from the total frequency, and the cumulative frequency column is then used to locate it. The formula for the position of the median is shown below:

Median position = (n + 1) ÷ 2

Where n is equal to the highest cumulative frequency

Median position = (16 + 1) ÷ 2

Median position = 17 ÷ 2

Median position = 8.5

Once this position has been found, it is compared with the cumulative frequency column. The position 8.5 lies between the cumulative frequencies 8 and 9, so the median lies in the class 10 ≤ x < 12.


Therefore, the median class is £10k ≤ x < £12k
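The step of locating the median class from the cumulative frequencies can be sketched as follows. The original frequency table is not reproduced here, so the class boundaries and frequencies below are an assumption: they are inferred from the six products in the mean calculation, taking the classes to be £2k wide. The function name `median_class` is illustrative.

```python
# ASSUMPTION: (lower, upper, frequency) for each Ford price class in thousands
# of pounds, inferred from the products 21, 45, 11, 39, 30, 34 (e.g. 45 = 5 x 9).
classes = [(6, 8, 3), (8, 10, 5), (10, 12, 1), (12, 14, 3), (14, 16, 2), (16, 18, 2)]


def median_class(classes):
    """Return the median position and the class whose cumulative frequency reaches it."""
    n = sum(freq for _, _, freq in classes)
    position = (n + 1) / 2
    cumulative = 0
    for lower, upper, freq in classes:
        cumulative += freq
        if cumulative >= position:
            return position, (lower, upper)
    raise ValueError("no classes given")


position, (lower, upper) = median_class(classes)
print(f"Median position {position} lies in the class {lower} <= x < {upper}")
```

With these assumed frequencies the cumulative totals are 3, 8, 9, 12, 14 and 16, so the position 8.5 is first reached in the third class.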

Lower Quartile, Upper Quartile and Inter Quartile Range – Ford  

To make my estimates more accurate I will find the lower quartile, upper quartile and interquartile range.

Median position = (n + 1) ÷ 2

Median position = (16 + 1) ÷ 2

Median position = 8.5
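The same positional approach extends to the quartiles. As a sketch, assuming the common school convention that the lower and upper quartiles sit at the (n + 1) ÷ 4 and 3(n + 1) ÷ 4 positions (other conventions, such as n ÷ 4, also exist), the positions to look up on the cumulative frequency axis would be:

```python
n = 16  # total number of Ford cars (highest cumulative frequency)

# Positions on the cumulative frequency axis, using the (n + 1) convention.
lower_quartile_position = (n + 1) / 4
median_position = (n + 1) / 2
upper_quartile_position = 3 * (n + 1) / 4

print(lower_quartile_position, median_position, upper_quartile_position)
```

The interquartile range is then the difference between the two prices read off the x-axis at the upper and lower quartile positions.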

I located 8.5 on the y-axis (cumulative frequency) of the graph on the next page, drew a horizontal line across until it touched the plotted line, and then dropped a perpendicular line down to the x-axis. The median is the value that is read off on the x-axis. This has ...
