• Level: GCSE
• Subject: Maths
• Word count: 8351

# Used Cars - find which factors will influence the price of a second hand car and in what way

Extracts from this document...

Introduction

Maths Statistics – Used Cars

Pilot Study

Aim

The aim of the coursework is to find which factors influence the price of a second hand car, and in what way. I believe that the price at which second hand cars are sold depends on several factors, and that certain factors will have a much larger effect on the price than others. In my investigation I am going to choose the two most popular manufacturers (that is, the two makes with the most cars in the data provided by Edexcel) from a tally chart of all of the cars. I am doing this because the first thing people look at when they are buying a car is the make, and I believe this will also affect the second hand price. Different makes have different prices and depreciate at different rates each year, so some cars hold their value because of their make. In addition, certain cars have a very good reputation for being reliable while others do not. Some cars also have a higher social status than others; for example, many people would prefer a Mercedes to a Ford.

Before developing a hypothesis, I first explored the data, information and details needed for my statistics coursework. Before this, I was given a candidate sheet listing factors which I might like to consider. The factors are:

• Price
• Age
• Mileage
• Cost when new
• Engine size
• Colour
• Make
• Fuel type
• Estate/saloon/hatchback

I then began investigating whether second hand prices change due to different factors of a car. A further aspect to investigate is whether these factors affect the depreciation of a car from its new price to its used price. I will have to reflect on the following questions, the main one being the first:

• What affects the price of a Used Car?
• What is my population?
• What data do I need to collect?
• What sampling method am I going to use?
• How large a sample do I need?
• What method of data collection am I going to use?
• How am I going to record the data I collect?

As secondary data was provided from the Edexcel website, I used that data to my advantage and then began with my simple hypothesis.

Factors Elaborated / Example Hypotheses

Age: The older the car, the cheaper it is; however, this may not apply as well when comparing prestigious cars to standard cars.

Mileage: The lower the mileage, the higher the price of the car.

Cost when new: The value of a standard car may fall a long way from its cost when new once it is used, but premium cars may behave differently.

Engine Size: As engines differ in size or capacity, I consider that the smaller the size of the engine, the less expensive the car will be as less petrol would be needed.

Colour: Colour preferences differ between regions. In Middle Eastern countries, a ‘white’ car is known to be quite costly. However, in European countries white is known to be rather cheap. To sum up, I believe cars in rich colours are more expensive.

Make: The make of a car is one of the most important factors when considering the depreciation of a used car. As I have previously stated, standard cars are affected to a great extent by other factors, whereas premium or prestigious cars are practically unaffected by some of them. Therefore, the higher the rank of a used car's make, the higher the price.

All the variables above are ratio variables (they are measured with numbers) except the make and the colour of the car, which are nominal variables (they are categories).

Hypothesis

First, I will try to find any relationships between the current price of the car and the other variables such as age, mileage, engine size and make. After finding the relationships, I will process the data further, allowing me to eliminate weak relationships and identify the stronger ones between specific variables, and eventually I will try to find a general relationship between them. From this, I will draw conclusions about how the price decreases and what affects this.

As secondary data was easy to collect and access, I began with a simple hypothesis for my pilot study: ‘the higher the mileage, the lower the price of the used car’. The drawback of secondary data is that it may not be exactly correct; data may also be missing or out of date. I decided not to use primary data because collecting it can be very time consuming, although its advantage is that the data is accurate most of the time. When a population is sufficiently small, the entire population can be included in the study. However, since the data provided was too much to use in full, I will have to apply sampling to my data. I will carefully choose a sample to represent the population. It needs to be large enough to represent the population, but small enough to be manageable, and it should reflect the characteristics of the population from which it is drawn. I took a sample of approximately 15% of my data, which is 30 cars. We use samples because it would take too long to investigate every piece of data in the database (that would be a census), so we only investigate a portion of the population. It is important to clearly define the target population.

Sampling Methods

Sampling methods are classified as either probability or non-probability. In probability samples, each member of the population has a known non-zero probability of being selected. Probability methods include random sampling, systematic sampling, and stratified sampling. In non-probability sampling, members are selected from the population in some non-random manner. These include convenience sampling and quota sampling. The advantage of probability sampling is that sampling error can be calculated. Sampling error is the degree to which a sample might differ from the population.

Random sampling is the purest form of probability sampling. Each member of the population has an equal and known chance of being selected. When there are very large populations, it is often difficult or impossible to identify every member of the population, so the sample may become biased.

Systematic sampling is often used instead of random sampling. It is also called an ‘Nth’ name selection technique. After the required sample size has been calculated, every ‘Nth’ record is selected from a list of population members. As long as the list does not contain any hidden order, this sampling method is as good as the random sampling method. Its only advantage over the random sampling technique is simplicity. Systematic sampling is frequently used to select a specified number of records from a computer file (e.g. the Edexcel database).
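The ‘every Nth record’ idea can be sketched in a few lines of Python. This is only an illustration and not part of the coursework method: the function name `systematic_sample`, the random starting point and the list of car numbers 1 to 204 are all assumptions made for the example.

```python
import random

def systematic_sample(records, sample_size, seed=None):
    """Pick every Nth record, starting from a random position in the first N."""
    step = len(records) // sample_size           # the 'N' in 'every Nth record'
    if step < 1:
        raise ValueError("sample_size is larger than the number of records")
    start = random.Random(seed).randrange(step)  # assumed: random start in 0..step-1
    return records[start::step][:sample_size]

# Hypothetical population: car numbers 1 to 204.
car_numbers = list(range(1, 205))
sample = systematic_sample(car_numbers, 30, seed=0)
print(sample)
```

With 204 records and a sample of 30, the step works out as 6, so the sketch takes every 6th car number after the starting point.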

Stratified sampling is a commonly used probability method that can be superior to random sampling because it reduces sampling error. The researcher first identifies the relevant strata and their actual representation in the population. Random sampling is then used to select a sufficient number of subjects from each stratum. Stratified sampling is often used when one or more of the strata in the population have a low incidence relative to the other strata.
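A rough sketch of stratified sampling with proportional allocation is given here, taking the make of the car as the stratum. The car list, the function name `stratified_sample` and the 60/40 split between two makes are hypothetical values chosen only to show the idea.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, stratum_of, sample_size, seed=None):
    """Randomly sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = []
    for members in strata.values():
        # Proportional quota; rounding can make the total differ slightly
        # from sample_size when the shares are not whole numbers.
        quota = round(sample_size * len(members) / len(records))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: 60 Fords and 40 Mercedes, identified by a number.
cars = [("Ford", n) for n in range(60)] + [("Mercedes", n) for n in range(40)]
sample = stratified_sample(cars, lambda car: car[0], 10, seed=0)
print(Counter(make for make, _ in sample))
```

Because Ford makes up 60% of this made-up population, 6 of the 10 sampled cars are Fords and 4 are Mercedes.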

Convenience sampling is used in exploratory research where the researcher is interested in getting an inexpensive approximation of the truth. As the name implies, the members of the sample are selected because they are convenient. This non-probability method is often used during preliminary research efforts to get a rough estimate of the results, without incurring the cost or time required to select a random sample.

Quota sampling is the non-probability equivalent of stratified sampling. As in stratified sampling, the researcher first identifies the strata and their proportions as they are represented in the population. Then convenience or judgment sampling is used to select the required number of subjects from each stratum. This differs from stratified sampling, where the strata are filled by random sampling.

As you can see above, I have analysed each sampling method in detail. Since random sampling gives every car an equal chance of being selected, I decided to choose it for my pilot study. Furthermore, as this is my pilot study I will only be working with a small part of my data, so it will be easier to avoid incorrect conclusions. For my main study I will not use this sampling method, since I will be working with a larger amount of data. Quota and convenience sampling are non-probability methods, so they may distort my final conclusions and invalidate my hypothesis. That leaves a choice between systematic and stratified sampling. As I will be focusing on a more complex hypothesis for my main study, I decided to use stratified sampling because it reduces the sampling error.

Source:http://en.wikipedia.org/wiki/Sampling_(statistics)

Method

The data I had at first was only a sample of the used car population in the country, taken from recent adverts and reputable guides to the motor trade. For the pilot study, I deleted the unnecessary factors and left the ones I was going to work on. To generate my sample I used the spreadsheet formula =INT(RAND()*204)+2.

I clicked on the cell again and moved the cursor to the bottom right of the cell until it changed to a black cross. I dragged down until I reached the bottom of the data.
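For readers without a spreadsheet to hand, the effect of =INT(RAND()*204)+2 can be imitated in Python. This is a sketch rather than the method actually used: the names `random_row` and `draw_rows` are made up, and reading the +2 as "skip the heading row" is an assumption.

```python
import random

FIRST_ROW = 2    # assumed: row 1 of the spreadsheet holds the column headings
MULTIPLIER = 204 # the multiplier used in the spreadsheet formula

def random_row(rng):
    """Mimic =INT(RAND()*204)+2: a whole number from 2 to 205 inclusive."""
    return int(rng.random() * MULTIPLIER) + FIRST_ROW

def draw_rows(count, seed=None):
    """Draw `count` row numbers, like dragging the formula down `count` cells."""
    rng = random.Random(seed)
    return sorted(random_row(rng) for _ in range(count))

sample_rows = draw_rows(30, seed=1)
print(sample_rows)
```

As with the spreadsheet formula, nothing stops the same number from coming up twice, so repeated values have to be spotted and replaced afterwards.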

Here is the data with the random sample numbers on the side.

Selected data:

As you can see, I have selected a random sample from my database. I chose the first 30 numbers that my random sample formula gave me, then noted and replaced any errors among them.

Here is the original random sample of numbers and their data that I had (In ascending order):

Random Numbers

2 2 4 11 12 15 16 18 23 24 34 43 70 71 75 79 92 100 107 114 120 120 140 146 148 156 166 169 177 190

Random Data

| Car no. | Make | Model | Price Used | Mileage |
|---|---|---|---|---|
| 2 | Mercedes | E-Class 2000 | 11395 | 12000 |
| 2 | Mercedes | E-Class 2000 | 11395 | 12000 |
| 4 | Rover | 25 | 2970 | 50000 |
| 11 | Nissan | Micra | 860 | 28000 |
| 12 | Fiat | Bravo | 1885 | 51000 |
| 15 | Mercedes | C-Class 93-01 | | 90000 |
| 16 | Ford | Ka | 2090 | 10000 |
| 18 | Rover | Mini | 1190 | 12000 |
| 23 | Honda | Prelude | 1810 | 6000 |
| 24 | BMW | 3-Series 91-99 | 12825 | 68000 |
| 34 | Mazda | 121 | 1620 | 55000 |
| 43 | Mazda | Demio | 1920 | 71000 |
| 70 | Daihatsu | Sirion | 4915 | 17500 |
| 71 | Mercedes | Cab E-Class | 10920 | 9500 |
| 75 | Ford | Mondeo 96-00 | 3335 | 22000 |
| 79 | Subaru | Forester | 4550 | 50000 |
| 92 | Mitsubishi | Carisma | 1385 | 71000 |
| 100 | Nissan | 100 NX | 1005 | 43000 |
| 107 | Mercedes | SL-Class 89-02 | 19260 | 12000 |
| 114 | Fiat | Bravo | 1125 | 90000 |
| 120 | Nissan | Almera | 9075 | 90000 |
| 120 | Nissan | Almera | 9075 | 90000 |
| 140 | Fiat | Stilo | 4900 | 60000 |
| 146 | Toyota | Previa | 10700 | 12000 |
| 148 | Chrysler | Grand Voyager | 6690 | 15000 |
| 156 | Land Rover | Range Rover | 7735 | 12000 |
| 166 | Mercedes | M-Class | 25810 | 19000 |
| 169 | Ford | Explorer | 4715 | 10000 |
| 177 | Mercedes | A-Class | 12320 | 80000 |
| 190 | Ford | Escort | 1225 | 10000 |

As you can see, I have marked the data which the random sample repeated. I have also highlighted a very significant error which may affect my results: this record had an empty cell, with no used price given. This is the problem with secondary data: some mistakes may occur in the data, and those records need to be ignored. Here are the details of car No. 15.

| Car no. | Make | Model | Price Used | Mileage |
|---|---|---|---|---|
| 15 | Mercedes | C-Class 93-01 | Missing | 90000 |

As mistakes cannot be included in the sample, because they would cause an error and an anomaly in my graph, I decided to get rid of the car numbers that had errors. I replaced them by choosing new random numbers myself, using the calculator.

The scientific calculator has a random number function, Ran#, which can be used to generate the random numbers. The command generates a random number larger than zero and less than one. Over many uses, the random numbers produced are spread evenly over the whole interval from zero to one.

The following calculator command will be used to generate the random numbers I need for my sample:

• Enter 204Ran# to generate a random number between 0 and 204
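The replacement step can be sketched as follows. The 30 numbers are the original random sample listed earlier, and car 15 is the record with the missing price. How the decimal from 204Ran# was turned into a whole car number is not stated, so rounding it up to a value from 1 to 204 is an assumption, as is the helper name `replace_invalid`.

```python
import random

def replace_invalid(rows, invalid, population=204, seed=None):
    """Drop repeats and invalid car numbers, then redraw until the sample is full."""
    rng = random.Random(seed)
    target = len(rows)
    kept = []
    for row in rows:
        if row not in invalid and row not in kept:
            kept.append(row)
    while len(kept) < target:
        # Like 204Ran#, then rounded up to a car number 1..204 (assumed rounding).
        candidate = int(population * rng.random()) + 1
        if candidate not in invalid and candidate not in kept:
            kept.append(candidate)
    return sorted(kept)

# The original random sample, including the repeats (2 and 120) and car 15.
drawn = [2, 2, 4, 11, 12, 15, 16, 18, 23, 24, 34, 43, 70, 71, 75,
         79, 92, 100, 107, 114, 120, 120, 140, 146, 148, 156, 166, 169, 177, 190]
final_rows = replace_invalid(drawn, invalid={15}, seed=3)
print(final_rows)
```

Two repeats and one missing price mean that three new car numbers are needed to bring the sample back up to 30.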

Middle

$$s = \sqrt{\frac{1022834768}{30}}$$

$$s = 5839.048918$$

$$s = 5839.05 \text{ (2 d.p.)}$$

I also worked out the standard deviation for the mileage; here is what I did to get the standard deviation;

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n}}$$

$$s = \sqrt{\frac{21813883000}{30}}$$

$$s = 26965.33763$$

$$s = 26965.34 \text{ (2 d.p.)}$$
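The standard deviation working can be checked with a short Python sketch. It divides by n, not n - 1, to match the working. The mileage and price figures are copied from the 30-car table in this write-up that also lists ages; treating that table as the data behind the working is an assumption, although the totals agree. The function name `population_sd` is made up for the example.

```python
import math

def population_sd(values):
    """Standard deviation with n (not n - 1) in the denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / n)

# Mileage and used price for the 30 sampled cars, in table order.
mileage = [12000, 50000, 28000, 51000, 10000, 12000, 10000, 6000, 68000, 20000,
           55000, 71000, 17500, 9500, 22000, 50000, 71000, 43000, 12000, 90000,
           19900, 90000, 60000, 12000, 15000, 12000, 19000, 10000, 80000, 10000]
price_used = [11395, 2970, 860, 1885, 2090, 1190, 1155, 1810, 12825, 3585,
              1620, 1920, 4915, 10920, 3335, 4550, 1385, 1005, 19260, 1125,
              6145, 9075, 4900, 10700, 6690, 7735, 25810, 4715, 12320, 1225]

print(round(population_sd(mileage), 2))
print(round(population_sd(price_used), 2))
```

Both standard deviations are large compared with the means (about 34,530 miles and about 5,970 for price), which shows how widely spread the sample is.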

Since my data seemed to have only a weak correlation, judging from the standard deviations, I have decided to change my variables from mileage and price used to age and price used. I assume that the higher the age of the car, the lower the price.

I have adjusted my data and have added the ages of all the chosen cars. Here is the data that I am going to use:

| Car no. | Make | Model | Price Used | Mileage | Age |
|---|---|---|---|---|---|
| 2 | Mercedes | E-Class 2000 | 11395 | 12000 | 7 |
| 4 | Rover | 25 | 2970 | 50000 | 8 |
| 11 | Nissan | Micra | 860 | 28000 | 12 |
| 12 | Fiat | Bravo | 1885 | 51000 | 9 |
| 16 | Ford | Ka | 2090 | 10000 | 10 |
| 18 | Rover | Mini | 1190 | 12000 | 12 |
| 19 | Volvo | 440 | 1155 | 10000 | 12 |
| 23 | Honda | Prelude | 1810 | 6000 | 12 |
| 24 | BMW | 3-Series 91-99 | 12825 | 68000 | 6 |
| 31 | Skoda | Fabia | 3585 | 20000 | 7 |
| 34 | Mazda | 121 | 1620 | 55000 | 10 |
| 43 | Mazda | Demio | 1920 | 71000 | 9 |
| 70 | Daihatsu | Sirion | 4915 | 17500 | 5 |
| 71 | Mercedes | Cab E-Class | 10920 | 9500 | 10 |
| 75 | Ford | Mondeo 96-00 | 3335 | 22000 | 8 |
| 79 | Subaru | Forester | 4550 | 50000 | 10 |
| 92 | Mitsubishi | Carisma | 1385 | 71000 | 12 |
| 100 | Nissan | 100 NX | 1005 | 43000 | 12 |
| 107 | Mercedes | SL-Class 89-02 | 19260 | 12000 | 9 |
| 114 | Fiat | Bravo | 1125 | 90000 | 12 |
| 116 | BMW | 5-Series 1996 | 6145 | 19900 | 10 |
| 120 | Nissan | Almera | 9075 | 90000 | 3 |
| 140 | Fiat | Stilo | 4900 | 60000 | 5 |
| 146 | Toyota | Previa | 10700 | 12000 | 7 |
| 148 | Chrysler | Grand Voyager | 6690 | 15000 | 10 |
| 156 | Land Rover | Range Rover | 7735 | 12000 | 12 |
| 166 | Mercedes | M-Class | 25810 | 19000 | 5 |
| 169 | Ford | Explorer | 4715 | 10000 | 10 |
| 177 | Mercedes | A-Class | 12320 | 80000 | 3 |
| 190 | Ford | Escort | 1225 | 10000 | 11 |

Below there is a scatter graph showing the results that I got with the data I have shown above.

The line of best fit shows a medium negative correlation between the data which shows that as the age of a car increases the value of the car decreases. This is shown in this section of the graph.

This supports my theory/assumption; however, the anomalies may have affected the results in my graph.

I have worked out the correlation coefficient;

$$r = -0.555694$$

$$r^2 = 0.3087958216$$

$$r^2 \approx 30.88\%$$

The correlation coefficient (product moment correlation) of the data for age and price used is -0.556 (3 d.p.). So according to the scale, this indicates a medium negative correlation.
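The product moment correlation can be reproduced from the age and price columns of the 30-car table. The sketch below uses the usual formula, r = Sxy / sqrt(Sxx × Syy); the function name `pearson_r` is chosen only for the example.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Age (years) and used price for the 30 sampled cars, in table order.
age = [7, 8, 12, 9, 10, 12, 12, 12, 6, 7, 10, 9, 5, 10, 8,
       10, 12, 12, 9, 12, 10, 3, 5, 7, 10, 12, 5, 10, 3, 11]
price_used = [11395, 2970, 860, 1885, 2090, 1190, 1155, 1810, 12825, 3585,
              1620, 1920, 4915, 10920, 3335, 4550, 1385, 1005, 19260, 1125,
              6145, 9075, 4900, 10700, 6690, 7735, 25810, 4715, 12320, 1225]

r = pearson_r(age, price_used)
print(round(r, 4), round(r * r, 4))
```

On this data r comes out at about -0.5557. Squaring it gives about 0.3088, which is positive as a square must be, so roughly 31% of the variation in price is associated with age.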

I have expanded the working out of the correlation by using Spearman’s rank.
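Spearman's rank correlation replaces each value with its rank and then measures how well the two sets of ranks agree. A minimal sketch is given here, assuming tied values share the average of the ranks they would occupy. The function names and the five example cars are hypothetical and are not taken from the coursework data.

```python
import math

def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Product moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rank correlation: the product moment correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical example: five cars with two tied ages.
example_age = [3, 5, 5, 7, 10]
example_price = [9000, 5000, 4800, 6000, 2000]
print(round(spearman_rho(example_age, example_price), 3))
```

Many cars in the sample share the same age, so there are a lot of tied ranks. In that case the shortcut formula 1 - 6Σd² / (n(n² - 1)) is only approximate, which is why the sketch correlates the ranks directly.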

Conclusion

make’ because it played a huge part in my data and showed a close relationship throughout.

Evaluation

To evaluate, I believe that the data I used was only a portion of the population that exists. Moreover, it was secondary data, which has both negative and positive aspects. If I had more time, I would have collected primary data to supplement my secondary data, which would make my results even more reliable than they are. I would also increase the number of cars in my main study to about 100-150 to make my results more accurate, and include other makes of car, such as Renault and Jaguar. I would have to make sure that I had equal numbers of each make in my investigation so that it does not become biased.

The thing I found easy was collecting the data, since it was already provided by Edexcel. Collecting primary data would have been difficult, since it is very time consuming. Another difficult aspect of the study was sorting the data and choosing a sample without biasing the results. This led on to analysing the results, which was quite hard, since the formulas and methods were complicated to use for some data.

Next time, I believe that I should consider the factor of ‘colour’ because:

Colour: Colour preferences differ between regions. In Middle Eastern countries, a ‘white’ car is known to be quite costly. However, in European countries white is known to be rather cheap. To sum up, I believe cars in rich colours are more expensive.

This would be a great thing to investigate, since it is interesting and I would like to see the results that I obtain. Overall I feel that I have shown my hypotheses to be correct, and I am not surprised by the result I obtained.

This student written piece of work is one of many that can be found in our GCSE Gary's (and other) Car Sales section.
