Used Cars - find which factors will influence the price of a second hand car and in what way

Maths Statistics – Used Cars

Pilot Study

Aim

The aim of the coursework is to find which factors influence the price of a second hand car, and in what way. I believe that the price at which second hand cars are sold is dependent upon several factors, and that certain factors will have a much larger effect on the price than others. In my investigation I am going to choose the two most popular manufacturers of cars (that is, the two makes with the most cars in the data provided by Edexcel) from a tally chart of all of the cars. I am doing this because the first thing people look at when they are buying a car is the make, and I believe this will also affect the second hand price. Different makes have different prices and depreciate at different rates each year, so some cars hold their value because of their make. In addition, certain cars have a very good reputation for reliability while others do not. Some cars also have a higher social status than others; for example, many people would prefer a Mercedes over a Ford.

Before developing a hypothesis, I first explored the data and information needed for my statistics coursework. Prior to this, I was given a candidate sheet listing factors that I might like to consider. The factors are:

  • Price
  • Age
  • Mileage
  • Cost when new
  • Engine size
  • Colour
  • Make
  • Fuel type
  • Estate/saloon/hatchback

I then began investigating whether second hand prices change with different factors of a car. Another aspect to investigate is whether these factors affect the depreciation of a car from its new price to its used price. I will have to reflect on the following questions, the first being the main one:

  • What affects the price of a Used Car?
  • What is my population?
  • What data do I need to collect?
  • What sampling method am I going to use?
  • How large a sample do I need?
  • What method of data collection am I going to use?
  • How am I going to record the data I collect?

As secondary data was provided on the Edexcel website, I used that data to my advantage and began with a simple hypothesis.

Factors Elaborated / Example Hypotheses

Age: The older the car, the cheaper it is; however, this may not hold as well when comparing prestigious cars to standard cars.

Mileage: The lower the mileage, the higher the price of the car.

Cost when new: A standard car may depreciate vastly from its cost when new once it is used, but premium cars behave differently.

Engine Size: As engines differ in size or capacity, I expect that the smaller the engine, the less expensive the car will be, as less petrol would be needed.

Colour: Colour preferences differ between regions. In Middle Eastern countries a ‘white’ car is known to be quite costly, whereas in European countries white is known to be rather cheap. To sum up, I believe cars with rich colours are more expensive.

Make: The make of a car is one of the most important factors when considering the depreciation of a used car. As I have previously stated, standard cars are affected to a great extent by other factors, whereas premium or prestigious cars are hardly affected by some of them. Therefore, the higher the rank of a used car's make, the higher the price.

All the variables above are ratio variables (they are measured as numbers) except the make of the car and the colour, which are nominal variables.

Hypothesis

First, I will try to find any relationships between the current price of the car and the other variables such as age, mileage, engine size and make. After finding the relationships, I will then work through the data in order, allowing me to eliminate weak relationships and find stronger relationships between specific variables, and eventually I will try to find a general relationship between them. From this, I will draw conclusions about how the price decreases and what affects this.

As secondary data was easy to collect and access, I began with a simple hypothesis for my pilot study: ‘the higher the mileage, the lower the price of the used car’. The drawback of secondary data is that it may not be exactly correct; data may also be missing or out of date. I decided not to use primary data because collecting it can be very time consuming, although its advantage is that the data is accurate most of the time. When a population is sufficiently small, the entire population can be included in the study. However, since the data provided was too much to use in full, I will have to take a sample of my data. I will carefully choose a sample that can be used to represent the population: it needs to be large enough to reflect the characteristics of the population from which it is drawn, but small enough to be manageable. I took a sample of approximately 15% of my data, which is 30 cars. We use samples because it would take too long to investigate every piece of data in the database, so we investigate only a portion of the population rather than carrying out a census of all of it. It is important to clearly define the target population.

Sampling Methods

Sampling methods are classified as either probability or non-probability. In probability samples, each member of the population has a known non-zero probability of being selected. Probability methods include random sampling, systematic sampling, and stratified sampling. In non-probability sampling, members are selected from the population in some non-random manner. These include convenience sampling and quota sampling. The advantage of probability sampling is that sampling error can be calculated. Sampling error is the degree to which a sample might differ from the population.

Random sampling is the purest form of probability sampling. Each member of the population has an equal and known chance of being selected. When populations are very large, it is often difficult or impossible to identify every member of the population, so the pool of members available for selection may become biased.
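To make the idea concrete, here is a minimal sketch of random sampling without replacement. The list of 204 row numbers and the sample size of 30 are illustrative assumptions, and the function name is hypothetical; this is not the method actually used in the coursework.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, sample_size)

# Assumed for illustration: 204 cars stored in spreadsheet rows 2 to 205.
car_rows = list(range(2, 206))
pilot_sample = simple_random_sample(car_rows, 30, seed=1)
print(sorted(pilot_sample))
```

Because `random.Random.sample` draws without replacement, no car can appear in the sample twice.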

Systematic sampling is often used instead of random sampling. It is also called an ‘Nth’ name selection technique. After the required sample size has been calculated, every ‘Nth’ record is selected from a list of population members. As long as the list does not contain any hidden order, this sampling method is as good as the random sampling method. Its only advantage over the random sampling technique is simplicity. Systematic sampling is frequently used to select a specified number of records from a computer file (e.g. the Edexcel database).
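The ‘every Nth record’ idea can be sketched as follows. The function name, the random starting point and the 204-record list are assumptions made for illustration only.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every Nth record, starting from a random point in the first N."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    step = len(population) // sample_size  # the 'N' in 'every Nth record'
    rng = random.Random(seed)
    start = rng.randrange(step)            # random start within the first step
    return [population[start + i * step] for i in range(sample_size)]

# Assumed for illustration: 204 records, sample of 30, so every 6th record.
records = list(range(2, 206))
chosen = systematic_sample(records, 30, seed=3)
print(chosen)
```

If the list has a hidden order that repeats every 6 records, this sample would inherit that pattern, which is the weakness described above.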

Stratified sampling is a commonly used probability method that can be superior to random sampling because it reduces sampling error. The researcher first identifies the relevant strata and their actual representation in the population. Random sampling is then used to select a sufficient number of subjects from each stratum. Stratified sampling is often used when one or more of the strata in the population have a low incidence relative to the other strata.
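A minimal sketch of stratified sampling with proportional allocation is given below. The makes, the counts per make and the sample size are invented purely for illustration, and the way leftover places are shared out (largest fractional share first) is one reasonable choice among several.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Randomly sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    # Exact proportional share of the sample for each stratum.
    shares = {name: len(members) * sample_size / total
              for name, members in strata.items()}
    # Round each share down, then hand leftover places to the
    # strata with the largest fractional parts.
    quotas = {name: int(share) for name, share in shares.items()}
    leftover = sample_size - sum(quotas.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - quotas[n], reverse=True)
    for name in by_remainder[:leftover]:
        quotas[name] += 1
    return {name: rng.sample(strata[name], quotas[name]) for name in strata}

# Hypothetical strata: car IDs grouped by make (counts are made up).
cars_by_make = {
    "Ford": list(range(0, 100)),
    "Vauxhall": list(range(100, 160)),
    "Rover": list(range(160, 200)),
}
sample = stratified_sample(cars_by_make, 20, seed=5)
sizes = {make: len(ids) for make, ids in sample.items()}
print(sizes)
```

With these made-up counts, each make fills the same fraction of the sample as it does of the population.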

Convenience sampling is used in exploratory research where the researcher is interested in getting an inexpensive approximation of the truth. As the name implies, the sample members are selected because they are convenient. This non-probability method is often used during preliminary research efforts to get a rough estimate of the results, without incurring the cost or time required to select a random sample.

Quota sampling is the non-probability equivalent of stratified sampling. Like stratified sampling, the researcher first identifies the strata and their proportions as they are represented in the population. Then convenience or judgment sampling is used to select the required number of subjects from each stratum. This differs from stratified sampling, where the strata are filled by random sampling.

As shown above, I have examined each sampling method in detail. Since random sampling gives every car an equal chance of being selected, I decided to use it for my pilot study. Furthermore, as the pilot study works with only a small part of my data, it will be easy to avoid drawing incorrect conclusions. For my main study I will not use this sampling method, since I will be working with a large population of data. Moreover, quota and convenience sampling select members in a non-random manner, so they may distort my final conclusions and invalidate my hypothesis. That leaves a choice between systematic and stratified sampling. As I will be focusing on a more complex hypothesis for my main study, I decided to use stratified sampling because it reduces the sampling error.

Method

The data I had at first was only a sample of the used car population in the country, taken from recent adverts and reputable guides to the motor trade. For the pilot study, I deleted the unnecessary factors and left the ones I was going to work on. To pick my sample I used the formula =INT(RAND()*204)+2.
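For readers unfamiliar with the spreadsheet formula, the sketch below mimics =INT(RAND()*204)+2 in Python. It assumes, as the formula suggests but the text does not state, that the cars occupy 204 rows starting at row 2 with a header in row 1; the function name is hypothetical.

```python
import random

def spreadsheet_row_pick(rng, row_count=204, first_row=2):
    """Mimic =INT(RAND()*204)+2: a random whole row number from 2 to 205."""
    # rng.random() is in [0, 1), like RAND(); int() rounds a
    # non-negative value down, like INT().
    return int(rng.random() * row_count) + first_row

rng = random.Random(0)
picks = [spreadsheet_row_pick(rng) for _ in range(30)]
distinct_rows = sorted(set(picks))
print(picks)
print(len(distinct_rows), "distinct rows out of", len(picks), "picks")
```

Each call is independent, so the same row can come up more than once; any repeats would need to be redrawn to reach 30 different cars.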

I clicked on the cell again and moved the cursor to the bottom right of the cell until it changed to a black cross. I dragged ...
