
Plan

The aim of this experiment is:

To observe how differences from car to car affect the second hand value, e.g. colour, make, mileage, engine size, length of MOT and the number of seats.
To do this I will display the data in a wide range of graphs and charts from which I will make comparisons.
To select the data in the first place I will use a range of sampling methods:
Systematic sampling: the data is selected at regular intervals, e.g. 10% of the data is selected by taking every tenth value. For this method to work the data must be arranged in an unbiased way, in no particular order (random).
Attribute sampling: the data is chosen according to a completely different factor, e.g. if I want to select the data for mileage I may use red and blue cars, as colour should not affect mileage in any way. This has one drawback: sometimes the other variable may have an effect on the data without you knowing. It is still a good sampling method for me to use, as I have lots of sets of data which otherwise would not be used.
Stratified sampling: the data is put into sub groups and each sub group appears in the sample in proportion to its size, e.g. if there are 3 times as many diesel cars as petrol cars there should be 3 times as many diesel cars in the sample.
Random sampling: every set of data has an equal chance of being used. To do this the data values could be drawn out of a hat, or each given a number and a number then selected at random.
Quota sampling: the data used has to be from a certain sub group, e.g. Vauxhall.
Cluster sampling: the population is divided into small groups called clusters, then one or more of these clusters are selected.
Stratified random sampling: the data is separated into appropriate categories called strata, e.g. by mileage. I then find out what percentage of all the data is in each stratum and select a random sample from each stratum in proportion to its size.
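Three of the sampling methods described above (systematic, random and stratified random) can be sketched in Python as follows. This is only an illustration under stated assumptions: the list of cars, the field names `id` and `fuel`, and the function names are all made up for the example and are not taken from the real data set.

```python
import random

def systematic_sample(data, step=10, start=0):
    """Take every `step`-th record; step=10 gives roughly a 10% sample."""
    return data[start::step]

def simple_random_sample(data, n, rng):
    """Pick n records so that every record has an equal chance of selection."""
    return rng.sample(data, n)

def stratified_random_sample(data, key, n, rng):
    """Split the data into strata by `key`, then draw a random sample from
    each stratum in proportion to its size. Rounding means the total can
    occasionally differ from n by one or two."""
    strata = {}
    for record in data:
        strata.setdefault(key(record), []).append(record)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(data))
        sample.extend(rng.sample(members, min(share, len(members))))
    return sample

# Hypothetical population: 30 diesel cars and 10 petrol cars (3 to 1).
cars = [{"id": i, "fuel": "diesel" if i < 30 else "petrol"} for i in range(40)]
rng = random.Random(1)  # fixed seed so the example is repeatable

every_tenth = systematic_sample(cars, step=10)
picked = simple_random_sample(cars, 8, rng)
stratified = stratified_random_sample(cars, lambda car: car["fuel"], 8, rng)

diesel_count = sum(1 for car in stratified if car["fuel"] == "diesel")
petrol_count = sum(1 for car in stratified if car["fuel"] == "petrol")
print(len(every_tenth), len(picked), diesel_count, petrol_count)
```

With 3 times as many diesel cars as petrol cars in the population, the stratified sample keeps the same 3 to 1 ratio, which is the property the stratified method is meant to guarantee.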
When the data has been chosen, by whichever method, it will be placed in a table and then represented on a graph. The graphs I will use are a scatter graph and a cumulative frequency curve.
On my scatter graph I will draw the line of best fit going through the mean point, and work out the least squares regression line, the standard deviation and Spearman's rank correlation coefficient. All these methods will be explained further in my explanation page.
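As a rough sketch of the calculations named above, the Python below works out a least squares regression line, a standard deviation and Spearman's rank correlation coefficient. The mileage and price figures are invented for illustration only. The function names are my own, the ranking assumes there are no tied values, and the standard deviation used is the population form (`pstdev`), which is an assumption about which version is intended.

```python
from statistics import mean, pstdev

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line of y on x.
    The line always passes through the mean point (mean of x, mean of y)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def ranks(values):
    """Rank 1 is the smallest value. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman's rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Made-up figures: mileage in thousands of miles, price in thousands of pounds.
mileage = [10, 20, 30, 40, 50]
price = [9.0, 7.5, 6.5, 4.0, 3.0]

slope, intercept = least_squares_line(mileage, price)
spread = pstdev(mileage)       # population standard deviation of the mileages
rho = spearman(mileage, price)
print(slope, intercept, spread, rho)
```

A coefficient close to -1 would mean that price falls steadily as mileage rises, while a value close to 0 would mean there is little or no link between the two.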
From the cumulative frequency curve the median and the upper and lower quartiles will be worked out. These values will then be displayed in box plots and all the data will be analysed (see the explanation for further details).

Explanation

Cumulative frequency curve: the frequency tells you how often a particular result was obtained. The cumulative frequency tells you how often a result was obtained which is less than (<) or less than or equal to (≤) a given value. It is found by adding the frequencies together one at a time to give a running total. I will take a random sample of 20 sets of data from the 0-3 years old range, and 20 from each of the 3-6 years old and 6-9 years old ranges. These sets of data will be shown on 3 separate graphs, and from each one I will work out the median. This is done by finding the halfway point of the total cumulative frequency on the y-axis and drawing a straight line horizontally across to the curve. From where it meets the curve, a line is drawn vertically down until it touches the x-axis; the % value decrease at this point is marked off and noted as the median. The lower quartile is found in the same way starting a quarter of the way up the y-axis, and the upper quartile starting three quarters of the way up the y-axis.

Box plots: these are used to show how the data is distributed. A box plot portrays the median and the upper and lower quartiles, as well as either end of the scale. I will take the median, lower and upper quartile values from my cumulative frequency curves and place them on 3 separate box plots. From these I will make links and comparisons between the 3 of them.
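The running total and the reading-off of the median and quartiles can be sketched in Python as below. The class boundaries and frequencies are invented for a sample of 20 cars and are not real results. The points are joined with straight lines, so the values are an approximation to what would be read from a smooth hand-drawn curve, and the function name `read_off` is my own.

```python
from itertools import accumulate

def read_off(bounds, cum_freqs, target):
    """Find the x-axis value at which the cumulative frequency reaches `target`.
    `bounds` holds the class boundaries (one more entry than `cum_freqs`);
    neighbouring points are joined with straight lines."""
    points = list(zip(bounds, [0] + cum_freqs))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (target - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("target is outside the range of the cumulative frequencies")

# Hypothetical grouped data: % value decrease for a sample of 20 cars.
bounds = [0, 10, 20, 30, 40, 50]   # class boundaries (% value decrease)
freqs = [2, 4, 8, 4, 2]            # number of cars in each class

cum_freqs = list(accumulate(freqs))  # running total of the frequencies
total = cum_freqs[-1]

lower_quartile = read_off(bounds, cum_freqs, total / 4)
median = read_off(bounds, cum_freqs, total / 2)
upper_quartile = read_off(bounds, cum_freqs, 3 * total / 4)

# Values needed for a box plot: the two ends of the scale plus the three above.
box_plot_values = (bounds[0], lower_quartile, median, upper_quartile, bounds[-1])
print(cum_freqs, box_plot_values)
```

Repeating this for each of the three age ranges would give the three sets of values needed for the three box plots.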

Scatter graph: for this I will take a sample from each band of second hand values. Of all the data, 67% is between 0-5k, 24% between 5-10k, 6% between 10-15k and 3% over 15k. I am going to use an overall sample of 30 sets of data. To work out how many of them should be between 0-5k I multiply 30 by 0.67, for 5-10k I multiply 30 by 0.24, etc. This works out 67% of 30, 24% of 30 and so on, therefore giving me a stratified sample adding up to 30. I will use a random sample of 20 values between 0-5k, 7 values between 5-10k, 2 ...