• Join over 1.2 million students every month
  • Accelerate your learning by 29%
  • Unlimited access from just £6.99 per month
  • Level: GCSE
  • Subject: Maths
  • Word count: 8351

Used Cars - find which factors will influence the price of a second hand car and in what way

Extracts from this document...

Introduction

Maths Statistics – Used Cars

Pilot Study

Aim

The aim of the coursework is to find which factors influence the price of a second hand car, and in what way. I believe that the price at which second hand cars are sold is dependent upon several factors, and that certain factors will have a much larger effect on the price than others. In my investigation I am going to choose the two most popular manufacturers from a tally chart of all of the cars (that is, the two makes with the most cars in the data provided by Edexcel), because the first thing people look at when they are buying a car is the make, and I believe this will also affect the second hand price. Different makes have different prices and depreciate at different rates every year, resulting in some cars holding their value because of their make. In addition, certain cars have a very good reputation for being reliable while others do not. Also, some cars have a higher social status than others; for example, people would prefer a Mercedes over a Ford.

Before developing a hypothesis, I first explored and tried to identify any data, information or details needed for my statistics coursework. Before this, I was given a candidate sheet that listed factors which I might like to consider. The factors are:

  • Price
  • Age
  • Mileage
  • Cost when new
  • Engine size
  • Colour
  • Make
  • Fuel type
  • Estate/saloon/hatchback

I then began investigating whether second hand prices change due to different factors of a car. Whether these factors affect the depreciation of a car from its new price to its used price is another aspect to investigate. I will have to reflect on the following questions, the main one being the first:

  • What affects the price of a Used Car?
  • What is my population?
  • What data do I need to collect?
  • What sampling method am I going to use?
  • How large a sample do I need?
  • What method of data collection am I going to use?
  • How am I going to record the data I collect?

As secondary data was provided from the Edexcel website, I used that data to my advantage and then began with my simple hypothesis.

Factors Elaborated / Example Hypotheses

Age: The older the car, the cheaper it is; however, this may not apply as strongly when comparing prestigious cars to standard cars.

Mileage: The lower the mileage, the higher the price of the car.

Cost when new: The cost of a standard car when new may depreciate vastly when it is used, but with premium cars it differs.

Engine Size: As engines differ in size or capacity, I consider that the smaller the size of the engine, the less expensive the car will be as less petrol would be needed.

Colour: Colour preferences differ between regions. In Middle Eastern countries a white car is known to be quite costly, whereas in European countries white is known to be rather cheap. To sum up, I believe cars with rich colours are more expensive.

Make: The make of a car is one of the most important factors when considering the depreciation of a used car. As I have previously stated, standard cars are affected to a great extent by other factors, whereas premium or prestigious cars are practically unaffected by some factors. Therefore, the more prestigious the make of a used car, the higher its price.

All the variables above are ratio variables (numbers are used) except the make of the car and the colour, which are nominal variables (categories rather than numbers).

Hypothesis

First, I will try to find any relationships between the current price of the car and the other variables such as age, mileage, engine size and make. After finding the relationships, I will work through the data in order, allowing me to eliminate weak relationships and to find stronger relationships between specific variables; eventually I will try to find a general relationship between them. From this, I will draw conclusions about how the price decreases and what affects this.

As secondary data was easy to collect and access, I began with a simple hypothesis for my pilot study: ‘the higher the mileage, the lower the price of the used car’. The drawback of secondary data is that it may not be exactly correct; data may also be missing or out of date. I decided not to use primary data because collecting it can be very time consuming, although its advantage is that the data is accurate most of the time. When a population is sufficiently small, the entire population can be included in the study (a census). However, since the data provided was too large to use in full, I will have to take a sample. I will carefully choose a sample that can be used to represent the population: it needs to be large enough to represent the population, but small enough to be manageable, and it should reflect the characteristics of the population from which it is drawn. I took a sample of approximately 15% of my data, which is 30 cars. We use samples because it would take too long to investigate every piece of data in the database, so we only investigate a portion of the population rather than carrying out a census. It is important to clearly define the target population.

Sampling Methods

Sampling methods are classified as either probability or non-probability. In probability samples, each member of the population has a known non-zero probability of being selected. Probability methods include random sampling, systematic sampling, and stratified sampling. In non-probability sampling, members are selected from the population in some non-random manner. These include convenience sampling and quota sampling. The advantage of probability sampling is that sampling error can be calculated. Sampling error is the degree to which a sample might differ from the population.

Random sampling is the purest form of probability sampling. Each member of the population has an equal and known chance of being selected. When there are very large populations, it is often difficult or impossible to identify every member of the population, so data may become biased.

Systematic sampling is often used instead of random sampling. It is also called an ‘Nth’ name selection technique. After the required sample size has been calculated, every ‘Nth’ record is selected from a list of population members. As long as the list does not contain any hidden order, this sampling method is as good as the random sampling method. Its only advantage over the random sampling technique is simplicity. Systematic sampling is frequently used to select a specified number of records from a computer file. (E.g. Edexcel Database)
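The ‘Nth’ selection described above can be sketched in Python. This is only an illustration: the function name `systematic_sample`, the population of 204 numbered records and the sample size of 30 are assumptions made for the example, not details taken from the Edexcel database.

```python
import random

def systematic_sample(records, sample_size, rng=None):
    """Pick every k-th record after a random start, where k = len(records) // sample_size."""
    rng = rng or random.Random()
    step = len(records) // sample_size
    if step < 1:
        raise ValueError("sample_size is larger than the population")
    start = rng.randrange(step)  # random starting offset in 0..step-1
    return records[start::step][:sample_size]

# Hypothetical car numbers 1..204 (assumed population size).
car_numbers = list(range(1, 205))
sample = systematic_sample(car_numbers, 30, random.Random(1))
print(sample)
```

With 204 records and a sample of 30, the step is 6, so the sketch takes every 6th car number after a random start between 1 and 6.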

Stratified sampling is a commonly used probability method that is superior to random sampling because it reduces sampling error. The researcher first identifies the relevant strata and their actual representation in the population. Random sampling is then used to select a sufficient number of subjects from each stratum. Stratified sampling is often used when one or more of the strata in the population have a low incidence relative to the other strata.
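A minimal sketch of stratified sampling with proportional allocation, written in Python. The function name `stratified_sample` and the population of 60 Ford and 40 Mercedes records are invented for the illustration; they are not counts from the real data.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, sample_size, rng=None):
    """Draw a random sample from each stratum in proportion to its size."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    sample = []
    for members in strata.values():
        # Proportional allocation; rounding can make the total differ
        # slightly from sample_size when the shares are not whole numbers.
        quota = round(sample_size * len(members) / len(records))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample

# Hypothetical population: 60 Ford and 40 Mercedes records (invented counts).
population = [("Ford", i) for i in range(60)] + [("Mercedes", i) for i in range(40)]
sample = stratified_sample(population, lambda car: car[0], 10, random.Random(1))
counts = {make: sum(1 for m, _ in sample if m == make) for make in ("Ford", "Mercedes")}
print(counts)  # {'Ford': 6, 'Mercedes': 4}
```

Each make contributes to the sample in the same proportion as it appears in the population, which is what distinguishes this from a simple random sample of 10 cars.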

Convenience sampling is used in exploratory research where the researcher is interested in getting an inexpensive approximation of the truth. As the name implies, the sample is selected because they are convenient. This non-probability method is often used during preliminary research efforts to get a gross estimate of the results, without incurring the cost or time required to select a random sample.

Quota sampling is the non-probability equivalent of stratified sampling. Like stratified sampling, the researcher first identifies the strata and their proportions as they are represented in the population. Then convenience or judgment sampling is used to select the required number of subjects from each stratum. This differs from stratified sampling, where the strata are filled by random sampling.

As you can see above, I have analysed each sampling method in detail. Since random sampling gives every car an equal chance of being selected, I decided to use it for my pilot study. Furthermore, as this is my pilot study I will only be working with a small part of my data, so it will be easier to avoid incorrect conclusions. For my main study I will not use this sampling method, since I will be working with a larger amount of data. Quota and convenience sampling are non-probability methods, so they select cars in a non-random way; they could distort my final conclusions and invalidate my hypothesis. That leaves a choice between systematic and stratified sampling. As I will be focusing on a more complex hypothesis for my main study, I decided to use stratified sampling because it reduces the sampling error.

Source: http://en.wikipedia.org/wiki/Sampling_(statistics)

Method

The data I had at first was only a sample of the used car population in the country, taken from recent adverts and reputable guides to the motor trade. For the pilot study, I deleted the unnecessary factors and left the ones which I was to work upon. For my sample I used the spreadsheet formula =INT(RAND()*204)+2, which returns a whole number from 2 to 205.

image20.png

I clicked on the cell again and moved the cursor to the bottom right of the cell until it changed to a black cross. I dragged down until I reached the bottom of the data.

image30.png

Here is the data with the random sample numbers on the side.

image41.png

Selected data:

image44.png

As you can see, I have selected a random sample from my database. I chose the first 30 numbers that my random sample formula gave me, then noted and replaced any errors among the records it had chosen.
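The spreadsheet step can be imitated in Python as a rough sketch. The constants below are read off the formula =INT(RAND()*204)+2; the function names are made up for the illustration, and the second function is an alternative (sampling without replacement) rather than the method used in this study.

```python
import random

FIRST_ROW = 2   # first data row, as in the +2 of the spreadsheet formula
N_ROWS = 204    # number of rows, as in the *204 of the spreadsheet formula

def draw_rows(count, rng=None):
    """Mimic =INT(RAND()*204)+2 filled down `count` cells; repeats are possible."""
    rng = rng or random.Random()
    return [int(rng.random() * N_ROWS) + FIRST_ROW for _ in range(count)]

def draw_unique_rows(count, rng=None):
    """Alternative: sample without replacement, so no row can repeat."""
    rng = rng or random.Random()
    return sorted(rng.sample(range(FIRST_ROW, FIRST_ROW + N_ROWS), count))

rows = draw_rows(30, random.Random(0))
repeats = sorted({r for r in rows if rows.count(r) > 1})
unique_rows = draw_unique_rows(30, random.Random(0))
print(sorted(rows))
print("repeated rows:", repeats)
print(unique_rows)
```

Because each cell of the formula is drawn independently, the same row can come up more than once, which is why repeated numbers have to be checked for and replaced.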

Here is the original random sample of numbers and their data that I had (In ascending order):

Random Numbers

2, 2, 4, 11, 12, 15, 16, 18, 23, 24, 34, 43, 70, 71, 75, 79, 92, 100, 107, 114, 120, 120, 140, 146, 148, 156, 166, 169, 177, 190

Random Data

Car no.  Make        Model           Price Used  Mileage
2        Mercedes    E-Class 2000    11395       12000
2        Mercedes    E-Class 2000    11395       12000
4        Rover       25              2970        50000
11       Nissan      Micra           860         28000
12       Fiat        Bravo           1885        51000
15       Mercedes    C-Class 93-01   (missing)   90000
16       Ford        Ka              2090        10000
18       Rover       Mini            1190        12000
23       Honda       Prelude         1810        6000
24       BMW         3-Series 91-99  12825       68000
34       Mazda       121             1620        55000
43       Mazda       Demio           1920        71000
70       Daihatsu    Sirion          4915        17500
71       Mercedes    Cab E-Class     10920       9500
75       Ford        Mondeo 96-00    3335        22000
79       Subaru      Forester        4550        50000
92       Mitsubishi  Carisma         1385        71000
100      Nissan      100 NX          1005        43000
107      Mercedes    SL-Class 89-02  19260       12000
114      Fiat        Bravo           1125        90000
120      Nissan      Almera          9075        90000
120      Nissan      Almera          9075        90000
140      Fiat        Stilo           4900        60000
146      Toyota      Previa          10700       12000
148      Chrysler    Grand Voyager   6690        15000
156      Land Rover  Range Rover     7735        12000
166      Mercedes    M-Class         25810       19000
169      Ford        Explorer        4715        10000
177      Mercedes    A-Class         12320       80000
190      Ford        Escort          1225        10000

As you can see, I have flagged the data which the random sample had repeated (car numbers 2 and 120). Moreover, I have also highlighted a very significant error which may affect my results: one record had an empty cell, with no used price. This is the problem with secondary data; some mistakes may occur in the data, and those records need to be left out. Here are the details of car No. 15.

Car no.  Make        Model           Price Used  Mileage
15       Mercedes    C-Class 93-01   Missing     90000

As records with mistakes cannot be included in the sample, because they would cause an error and an anomaly in my graph, I decided to remove the car numbers that had errors or were repeated. I replaced them by choosing new random numbers myself, using the calculator.

The scientific calculator has a random number function, Ran#, which can be used to generate the random numbers. The command generates a random number larger than zero and less than one. Over many uses, the random numbers produced are spread evenly over the whole interval from zero to one.

The following calculator command will be used to generate the random numbers I need for my sample:

  • Enter 204Ran# to generate a random number between 0 and 204

Middle

$$s = \sqrt{1022834768 \div 30}$$

s = 5839.048918

s = 5839.05 (2 d.p.)

I also worked out the standard deviation for the mileage. Here is what I did to get it:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

$$s = \sqrt{21813883000 \div 30}$$

s = 26965.33763

s = 26965.34 (2 d.p.)
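The two standard deviations can be checked with a short Python sketch. It assumes the population form of the formula (dividing by n = 30, as in the working above) and uses the 30 used prices and mileages listed in the sample table that includes the age column; the function name `population_sd` is just a label for this illustration.

```python
import math

# Used prices and mileages of the 30 sampled cars, in car-number order.
prices = [11395, 2970, 860, 1885, 2090, 1190, 1155, 1810, 12825, 3585,
          1620, 1920, 4915, 10920, 3335, 4550, 1385, 1005, 19260, 1125,
          6145, 9075, 4900, 10700, 6690, 7735, 25810, 4715, 12320, 1225]
mileages = [12000, 50000, 28000, 51000, 10000, 12000, 10000, 6000, 68000, 20000,
            55000, 71000, 17500, 9500, 22000, 50000, 71000, 43000, 12000, 90000,
            19900, 90000, 60000, 12000, 15000, 12000, 19000, 10000, 80000, 10000]

def population_sd(values):
    """Standard deviation dividing by n: sqrt(sum((x - mean)^2) / n)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / n)

print(round(population_sd(prices), 2))    # 5839.05
print(round(population_sd(mileages), 2))  # 26965.34
```

Both values agree with the hand working to two decimal places. Note that a standard deviation describes how spread out one variable is; it does not by itself measure how strongly two variables are related.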

Since the relationship between mileage and price used seemed to be weak, and the standard deviations show that both variables are widely spread, I have decided to change the variable I compare with price used from mileage to age. I assume that the higher the age of the car, the lower the price.

I have adjusted my data and have added the ages of all the chosen cars. Here is the data that I am going to use:

Car no.  Make        Model           Price Used  Mileage  Age
2        Mercedes    E-Class 2000    11395       12000    7
4        Rover       25              2970        50000    8
11       Nissan      Micra           860         28000    12
12       Fiat        Bravo           1885        51000    9
16       Ford        Ka              2090        10000    10
18       Rover       Mini            1190        12000    12
19       Volvo       440             1155        10000    12
23       Honda       Prelude         1810        6000     12
24       BMW         3-Series 91-99  12825       68000    6
31       Skoda       Fabia           3585        20000    7
34       Mazda       121             1620        55000    10
43       Mazda       Demio           1920        71000    9
70       Daihatsu    Sirion          4915        17500    5
71       Mercedes    Cab E-Class     10920       9500     10
75       Ford        Mondeo 96-00    3335        22000    8
79       Subaru      Forester        4550        50000    10
92       Mitsubishi  Carisma         1385        71000    12
100      Nissan      100 NX          1005        43000    12
107      Mercedes    SL-Class 89-02  19260       12000    9
114      Fiat        Bravo           1125        90000    12
116      BMW         5-Series 1996   6145        19900    10
120      Nissan      Almera          9075        90000    3
140      Fiat        Stilo           4900        60000    5
146      Toyota      Previa          10700       12000    7
148      Chrysler    Grand Voyager   6690        15000    10
156      Land Rover  Range Rover     7735        12000    12
166      Mercedes    M-Class         25810       19000    5
169      Ford        Explorer        4715        10000    10
177      Mercedes    A-Class         12320       80000    3
190      Ford        Escort          1225        10000    11

Below there is a scatter graph showing the results that I got with the data I have shown above.

image06.png

image23.png

The line of best fit shows a medium negative correlation between the data which shows that as the age of a car increases the value of the car decreases. This is shown in this section of the graph.

image24.png

This supports my assumption, although the anomalies may have affected the results in my graph.

I have worked out the correlation coefficient:

r = -0.555694

r² = 0.3087958216

   = 30.87958216%

The correlation coefficient (product moment correlation) of the data for age and price used is -0.556 (3 d.p.). According to the scale, this indicates a moderate negative correlation. Because r is squared, r² is positive: about 31% of the variation in price used is accounted for by age.
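As a check on the coefficient, here is a minimal Python sketch of the product moment correlation, using the ages and used prices of the 30 cars in the sample table. The function name `pearson_r` is an assumption made for this illustration.

```python
import math

# Ages and used prices of the 30 sampled cars, in car-number order.
ages = [7, 8, 12, 9, 10, 12, 12, 12, 6, 7,
        10, 9, 5, 10, 8, 10, 12, 12, 9, 12,
        10, 3, 5, 7, 10, 12, 5, 10, 3, 11]
prices = [11395, 2970, 860, 1885, 2090, 1190, 1155, 1810, 12825, 3585,
          1620, 1920, 4915, 10920, 3335, 4550, 1385, 1005, 19260, 1125,
          6145, 9075, 4900, 10700, 6690, 7735, 25810, 4715, 12320, 1225]

def pearson_r(xs, ys):
    """Product moment correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(ages, prices)
r_squared = r ** 2  # always between 0 and 1, never negative
print(round(r, 4), round(r_squared, 4))
```

The sketch gives r close to -0.5557 and r² close to 0.3088, in line with the values above.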

I have expanded the working out of the correlation by using Spearman’s rank.


Conclusion

make’ because it played a huge part in my data and showed a close relationship throughout.

Evaluation

To evaluate, I believe that the data I used was only a portion of the population that exists. Moreover, it was secondary data, which had both negative and positive aspects. I think that if I had more time I would have looked for primary data to supplement my secondary data, which would make my results even more reliable than they are. I would also increase the number of cars in my main study to about 100-150 to make my results more accurate, and include other makes of car, such as Renault and Jaguar. I would have to make sure that I had equal numbers of each make for my investigation so that it does not become biased.

The thing I found easy was collecting the data, since it was already provided by Edexcel. I found it difficult to attempt to collect primary data, since it is very time consuming. Another difficult aspect of the study was sorting the data and choosing a sample without biasing the results. This led on to analysing the results, which was quite hard, since the formulas and methods were complicated to use for some data.

Next time, I believe that I should consider the factor of ‘colour’ because:

Colour: Colour preferences differ between regions. In Middle Eastern countries a white car is known to be quite costly, whereas in European countries white is known to be rather cheap. To sum up, I believe cars with rich colours are more expensive.

This would be a great thing to investigate, since it is interesting and I would like to see the results that I obtain. Overall I feel that my results support my hypotheses, and I am not surprised by the results I obtained.


This student written piece of work is one of many that can be found in our GCSE Gary's (and other) Car Sales section.
