Maths coursework - car prices

Contents

INTRODUCTION

SAMPLING

HYPOTHESIS

First hypothesis

Second hypothesis

Third hypothesis

Fourth hypothesis

METHOD ON REMOVING OUTLIERS

PLAN

GRAPHS AND ANALYSIS

SCATTER DIAGRAMS

PRICES VS. AGE

Table of predicted age of car from the line of best fit

PRICES VS. AGE WITHOUT OUTLIERS

Table of predicted age of car from the line of best fit

PRICE VS. MILEAGE

Table of predicted mileage of car from the line of best fit

PRICE VS. MILEAGE WITHOUT OUTLIERS

Table of predicted mileage of car from the line of best fit

MILEAGE VS. AGE WITHOUT OUTLIERS

DEPRECIATION

Box plot 1 with outliers

Box plot 1b without outliers

ANALYSIS

CORRELATIONS

Graph 1

Graph 1b

Graph 2

Graph 2b

OUTLIERS OBSERVATIONS

Observations between graph with outliers and without outliers

Formula of outliers

Cars being outliers

Identified outliers with reasons

Removal on outliers

EQUATION OF BEST FIT

HYPOTHESIS CONCLUSIONS

First hypothesis

Second hypothesis

Coursework Improvements

Introduction

In my coursework I have to investigate what influences the prices of second-hand cars, using the sample of second-hand car prices I am taking. In the coursework you can choose from 20 factors, but I am going to choose only 2 because I consider that these 2 affect the prices of second-hand cars the most. The first factor is age. I believe the age of a second-hand car affects its price because of the condition of the car: a car that is 10 years old is more likely to need repairs and its parts are more likely to start failing, so the price is set lower. The second factor is mileage. I believe that how far a car has travelled affects its condition and therefore its price: when a car has travelled over 50,000 miles its parts start to wear out, so I think that the more mileage a car has done, the lower its price.

Sampling

Out of the 204 cars in the data I have chosen a sample of 110, because I believe a larger sample gives more evidence for my hypotheses and makes it more likely that my results are accurate. For example, if I had chosen 50 cars, which is not a wide range, my results would be less accurate, whereas 110 cars is a wider range that gives me more data to compare. To get from the 204 cars to my sample of 110, I used a calculator and typed in the keys (Input 204 x Ran), which gave me a random number. If the number on the calculator was shown to one decimal place I rounded it to the nearest whole number, e.g. 34.1 rounded to 34, and I then highlighted that car in the table as one that is staying. I repeated this until I had the sample size I required for my data.
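The same sampling idea can be sketched in Python. This is only an illustration and not the calculator method itself: the function name `pick_sample`, the seed, and the choice to skip row 0 and repeated numbers are assumptions, because the text above does not say how those cases were handled.

```python
import random

def pick_sample(population_size=204, sample_size=110, seed=None):
    """Return the sorted row numbers (1 to population_size) kept in the sample."""
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < sample_size:
        # Same idea as the calculator keys: 204 x Ran, rounded to a whole number.
        row = round(population_size * rng.random())
        # Assumption: row 0 does not exist, so it is skipped.
        # Repeated numbers are ignored automatically because 'chosen' is a set.
        if 1 <= row <= population_size:
            chosen.add(row)
    return sorted(chosen)

rows = pick_sample(seed=1)
print(len(rows))  # 110
```

Because of the rounding, the last row (204) is picked slightly less often than the others, so this is close to, but not exactly, an equal chance for every car.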

Hypothesis

In this section I will explain my different hypotheses, and in the conclusion I will show whether each hypothesis is correct.

First hypothesis

My first hypothesis is:

H0. The price of a second-hand car is not related to how old the car is.

H1. There is a negative correlation between age and price, i.e. as the age of the car increases, the price of the car decreases.

I will test this hypothesis by drawing a scatter diagram of 'price now' against the 'age of the car'. I will then obtain the line of best fit through the points of price against age and take its equation from the trendline of my scatter graph. I will make predictions from the line of best fit, and then draw a conclusion from the scatter graph, referring back to my predictions.
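As a rough sketch of what the line of best fit does, the Python below fits a straight line to some points and uses it for predictions. It assumes the spreadsheet trendline is an ordinary least-squares line, which the coursework does not state. The ages and prices are made-up illustration values, not cars from my sample, and the function names are hypothetical.

```python
def best_fit_line(xs, ys):
    """Least-squares line y = m*x + c through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # gradient
    c = mean_y - m * mean_x  # y-intercept
    return m, c

def predict_x(y, m, c):
    """Read the line backwards: which x gives this y?"""
    return (y - c) / m

# Made-up illustration data, NOT taken from the coursework sample.
ages = [1, 2, 3, 4, 5]
prices = [9000, 8000, 7000, 6000, 5000]

m, c = best_fit_line(ages, prices)
print(m, c)                    # -1000.0 10000.0
print(predict_x(6000, m, c))   # 4.0
```

A negative gradient `m` is what the alternative hypothesis expects, because it means the price falls as the age goes up.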

Second hypothesis

My second hypothesis is:

H0. There is no correlation between the price of a car and how far it has travelled in mileage.

H1. There is a negative correlation between price and mileage, i.e. the more mileage a car has travelled, the lower the price of the car.

I will test this hypothesis by drawing a scatter diagram of 'price now' against the 'mileage of the car'. I will then obtain the line of best fit through the points of price against mileage and take its equation from the trendline of my scatter graph. I will make predictions from the line of best fit, and then draw a conclusion from the scatter graph, referring back to my predictions.

Third hypothesis

My third hypothesis is:

H0. The percentage rate of depreciation is constant from one year to the next.

H1. The rate of depreciation is greatest in the first year and then decreases from year to year.

I will test this hypothesis by drawing a scatter diagram of 'age' against the 'depreciation of the car'. I will then obtain the line of best fit through the points of age against depreciation and make predictions from it. I will then draw a conclusion from the scatter graph, referring back to my predictions.
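To show what a percentage rate of depreciation means, here is a small Python sketch. It assumes each car has a 'price when new' as well as its 'price now', which is an assumption about the data columns, and the function names and example figures are made up for illustration.

```python
def total_depreciation_percent(price_new, price_now):
    """Percentage of the original price that has been lost so far."""
    return (price_new - price_now) / price_new * 100

def constant_yearly_rate_percent(price_new, price_now, age_years):
    """The single yearly percentage rate that would take price_new down to
    price_now in age_years, if the rate really were constant (as in H0)."""
    return (1 - (price_now / price_new) ** (1 / age_years)) * 100

# Made-up example: a car bought for 10000 that is worth 6400 after 2 years.
print(round(total_depreciation_percent(10000, 6400), 2))
print(round(constant_yearly_rate_percent(10000, 6400, 2), 2))
```

If H0 were true, cars of different ages would give roughly the same constant yearly rate. If H1 is true, younger cars should give a higher rate than older cars.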

Fourth hypothesis

My fourth hypothesis is:

H0. There is no correlation between mileage and age.

H1. There is a positive correlation between mileage and age.

I will test this hypothesis by drawing a scatter diagram of 'mileage' against 'age'. I will then obtain the line of best fit through the points of mileage against age and make predictions from it. I will then draw a conclusion from the scatter graph, referring back to my predictions.
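One way to put a number on how strong a correlation is would be the product-moment correlation coefficient. The coursework does not say this measure is used, so the Python below is only a sketch of the idea, with made-up ages and mileages and a hypothetical function name.

```python
import math

def correlation(xs, ys):
    """Product-moment correlation coefficient r, between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustration data, NOT taken from the coursework sample.
ages = [1, 2, 3, 4, 5]
mileages = [12000, 19000, 33000, 41000, 48000]

r = correlation(ages, mileages)
print(round(r, 2))
```

A value of r close to 1 means a strong positive correlation, close to -1 means a strong negative correlation, and close to 0 means little or no linear correlation.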

Method on removing outliers

To remove the outliers from my second-hand car data I will use quartiles to work out the outlier limits, and then remove any second-hand cars that are greater than the upper limit or less than the lower limit. Before I calculate, I will first check whether there are likely to be any outliers by determining what type of distribution I have in the spreadsheet I have created: a normal distribution, a positive skew or a negative skew. I will then type in the greater than formula, which is (Q3+1.5*IQR). This formula gives the limit for the large outliers, and if any car goes over this limit I will delete it, e.g. if the greater than limit is 400 and one of my cars is 500, I will delete that car because it is too large a value. Also, for a negative skew, I will type in the less than formula, which is (Q1-1.5*IQR). This formula gives the limit for the small outliers, and if any of my second-hand cars is below this limit I will delete it from the spreadsheet, e.g. if the less than limit is -1 and one of my cars is -2, I will delete that car because it is too small a value.
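The quartile method above can be sketched in Python as follows. The lower limit is taken as Q1-1.5*IQR, which is the usual form of the less than formula. The `method="inclusive"` setting is an assumption about how the spreadsheet works out quartiles, and the prices and function names are made up for illustration.

```python
import statistics

def outlier_limits(values):
    """Return (lower limit, upper limit) using Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    # Assumption: quartiles are interpolated the 'inclusive' way.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def remove_outliers(values):
    """Keep only the values that lie inside the two limits."""
    low, high = outlier_limits(values)
    return [v for v in values if low <= v <= high]

# Made-up illustration data, NOT taken from the coursework sample.
prices = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 50000]

print(outlier_limits(prices))   # (-3000.0, 13000.0)
print(remove_outliers(prices))  # the 50000 car is removed
```

Here Q1 is 3000 and Q3 is 7000, so the IQR is 4000 and the limits are -3000 and 13000. Only the 50000 car falls outside them.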

Plan

In my coursework I want to see which factor affects the price of the car the most, i.e. which factor values a car at a lower price. I'm first going to write up my hypotheses and list the factors I have chosen. I'm then going to make scatter graphs with outliers and the same graphs without outliers, and compare the graphs with the outliers against those without. After that I'm going to show the line of best fit and make predictions from it for the price values given, which are £6000, ...