Driving test

GCSE coursework Aaron Patterson 11A

Introduction

I have been given the assignment of investigating how students perform in their driving test. I will look at different factors and see what effect they have on the number of mistakes the driver makes. The data set contains 240 records, each divided into the categories shown below, which I will analyse.

Driver: The driver's name is not given; instead, it is replaced by a letter. This hides the identity of the student.

Gender: Shows if the driver is male or female. I could use the data to find out if males are better drivers than females.

Number of Lessons: I will investigate the number of lessons needed to pass the test. There is a wide range in the data. This could be because some students have extra lessons from friends, parents, etc. Also, some people may lack self-assurance, which could affect their concentration while driving.

Number of Mistakes: A pupil fails the test if they make more than 15 mistakes.

Instructor: Instructor names have been replaced by the letters A, B, C and D to hide their identities. From the data it appears that instructors A and B have more students than C and D. This could be because instructors A and B are more popular, e.g. they may be better teachers than C and D.

Time of Test: The timing of a test may have an effect on how a student performs, e.g. rush hour traffic between 9 and 5.

Day of test: The day on which the student takes their test can affect their performance, e.g. the roads are busier from Monday to Friday.

Firstly, I will tally up the number of male and female pupils for each instructor.
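A tally like this could also be done on a computer. The short Python sketch below shows one possible way; the records in it are made up for illustration and are not taken from the real data set, and the name tally_by_group is only an assumption.

```python
from collections import Counter

# Made-up records for illustration: (instructor, gender).
# The real data set has 240 records.
records = [
    ("A", "M"), ("A", "F"), ("A", "F"),
    ("B", "M"), ("B", "M"),
    ("C", "F"),
    ("D", "M"), ("D", "F"),
]

def tally_by_group(rows):
    """Count how many pupils fall into each (instructor, gender) group."""
    return Counter(rows)

tally = tally_by_group(records)
for (instructor, gender), count in sorted(tally.items()):
    print(f"Instructor {instructor}, {gender}: {count}")
```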

Because the data set is large, I will take a sample of the data to make the calculations easier to manage. The sample should represent the complete data set, so I will take a sample of 60 (a quarter of 240). I will ensure that the proportion from each instructor and of each gender is the same as in the complete data set.

I will use random numbers to choose the records for the sample. The data will be stratified by gender (male or female) and instructor (A, B, C and D).

If one of the records that I choose is incomplete, I will choose the nearest complete record, ensuring that the instructor and gender are the same.
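The stratified sampling described above can be sketched in Python as follows. This is only a rough illustration under some assumptions: the records and the field names ("driver", "instructor", "gender") are made up, the function name stratified_sample is hypothetical, and the sketch does not cover swapping an incomplete record for its nearest neighbour.

```python
import random
from collections import defaultdict

def stratified_sample(records, fraction, seed=None):
    """Pick the same fraction of records from every (instructor, gender) group."""
    rng = random.Random(seed)

    # Sort the records into groups (strata) by instructor and gender.
    groups = defaultdict(list)
    for record in records:
        groups[(record["instructor"], record["gender"])].append(record)

    sample = []
    for key in sorted(groups):
        members = groups[key]
        # Round to a whole number of pupils; because of rounding, the
        # total can end up slightly different from the target size.
        size = round(len(members) * fraction)
        sample.extend(rng.sample(members, size))
    return sample

# Made-up data: 8 pupils in each of the 8 groups (64 records in total).
records = [
    {"driver": f"{instructor}{gender}{i}", "instructor": instructor, "gender": gender}
    for instructor in "ABCD"
    for gender in "MF"
    for i in range(8)
]

# Take a quarter of each group, as in the investigation.
sample = stratified_sample(records, 0.25, seed=1)
print(len(sample), "records chosen")
```

Taking the same fraction from every group is what keeps the proportions of each instructor and gender the same in the sample as in the complete data.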

Checking the Data

I will now check that my sample is a good representation of my full data set. I will compare the two data sets using box plots, which make the differences between two sets of data easy to see. Box plots show the quartiles, median, maximum and minimum values, so I will be able to see how closely my sample matches the full data. I will draw two sets of box plots for:

Number of lessons
Number of mistakes

I will then compare the sample with the complete set of data to make sure that it is representative.
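The five values that a box plot shows can also be worked out in Python, as in the sketch below. The numbers used are made up and are not the real lesson counts. The quartiles come from the standard library function statistics.quantiles with method="inclusive"; this is an assumption, and other quartile methods (including ones taught at GCSE) can give slightly different values for small data sets.

```python
import statistics

def five_number_summary(values):
    """Return the minimum, lower quartile, median, upper quartile and maximum."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points between the quarters.
    lower_q, median, upper_q = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "lower quartile": lower_q,
        "median": median,
        "upper quartile": upper_q,
        "max": ordered[-1],
    }

# Made-up lesson counts, for illustration only.
full_data = [10, 12, 15, 18, 20, 22, 25, 30, 40]
sample_data = [12, 15, 20, 25, 30]

for name, data in (("complete", full_data), ("sample", sample_data)):
    print(name, five_number_summary(data))
```

If the five values for the sample are close to those for the complete data, the two box plots will look similar.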

Box plot (a) - Number of Lessons

The table below compares the figures for the complete data and the sample, taken from the “number of lessons” box plot.

All the results are very close, so I am satisfied that this is a good match.

Box plot (b) - Number of Mistakes

Again the results are very close.

I am confident that the sample matches up to the complete data. I will now continue with the investigation.

Hypothesis 1

I will investigate the hypothesis:

“The more lessons taken by a pupil, the fewer mistakes they make in the test.”

I predict that this hypothesis is correct. I think this because the more you practise something, generally the better you become at it, hence the saying “practice makes perfect”. Therefore I anticipate that there is a negative ...