# Maths driving test

Introduction

The hypothesis I am testing is:

- “The number of mistakes a candidate makes during their driving test is affected by the number of one hour lessons that they have had.”

In this report I will investigate what affects the number of mistakes a candidate makes during his or her driving test.

Expectations

- I expect that the number of mistakes made will be affected by the number of lessons taken. I think this because the more lessons a driver has, the more they learn. More lessons could also give the driver more confidence in driving.

Exceptions

There are some factors which could affect this hypothesis.

- The driving instructor: this could affect the number of mistakes, as an instructor may not be very good. Some instructors also have better teaching methods than others, and this may make a candidate learn more quickly.
- Gender of the candidate: the results could depend on whether the candidate is male or female. It has been stated that boys and girls perform differently, and that boys often do better in practical tests than girls do. This may mean that males won’t need as many one hour lessons as females.
- Any extra practice with siblings or parents: some candidates may have used their spare time for extra driving practice. The data provided doesn’t state whether anyone has had extra practice, and this could affect the number of mistakes made, because those candidates will have had more driving practice than the rest.
- Weather conditions: if a candidate takes their test in sunny weather, glare from the sun may sometimes prevent the candidate from seeing properly, which could lead to mistakes. Also, if it snows or the roads are icy, this may result in an accident.
- Natural learners: some people are more natural drivers than others and pick up driving quite quickly, which may result in fewer lessons being taken. On the other hand, some candidates may be a lot slower and not take in what they have been taught in their lessons as quickly. The natural learners will do a lot better, as they have understood and remembered what they have been taught.
- The day and time a candidate takes their test: this may affect the number of mistakes made, as the candidate could take their test during rush hour or when there is hardly any traffic. If the test is taken during rush hour, the heavy traffic may make the candidate more nervous; they could also be afraid of causing an accident and therefore make more mistakes. However, if the candidate takes their test at a time when there is not much traffic, they will be less stressed and may make fewer mistakes.
- Area that the test is taken in: if the candidate takes their test in a place that they are quite familiar with, they may make fewer mistakes as they know their way around. However, if the candidate takes their test in a place that they are not familiar with, they may make more mistakes.

DATA: The data I am using to test my hypothesis is secondary data. It consists of the gender of the candidate, the number of one hour lessons that the candidate has had, the number of minor mistakes the candidate made in their driving test, the instructor they had before taking the test, and the time and date of their driving test. Most of the data is quantitative (e.g. the time of day, the number of one hour lessons and the number of mistakes), while gender and instructor are qualitative. The quantitative data includes both discrete and continuous data. The time of day is continuous, as candidates can be tested at any time of day, and the number of mistakes and the number of lessons are discrete, as they take only whole-number values and not decimals. This is secondary data because it has been obtained from another source, e.g. the internet. As it is secondary data, it may contain anomalies, outliers and missing data which could have an impact on the results. As some parts of the data may be incorrect, I will need to clean it up and delete any data which is of no use to me before making any assumptions about the hypothesis.

Missing data: several rows have been deleted because some of their information is missing from the spreadsheet. I deleted these rows as they aren’t reliable for my project. In one example, the number of minor mistakes isn’t present. This row wouldn’t have been useful, as it couldn’t help to show whether my hypothesis is true or false: the hypothesis relies on the “number of minor mistakes made” for a full investigation. The entire row has to be deleted in order to keep further investigations accurate.

- I also found an outlier: a candidate who had taken only 10 lessons and made 1 mistake. However, I have decided not to delete this from the data, as it is an example of an ‘extreme’ that is still plausible, so it tells me something important about candidates and driving tests: some candidates learn faster than others.

- I didn’t find any anomalies.
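The cleaning rule described above (delete any row with a missing value, but keep plausible extremes) can be sketched in a few lines of Python. The rows and field names below are invented for illustration and are not taken from the real spreadsheet.

```python
# Hypothetical rows: the field names and values are assumptions, not the real data.
candidates = [
    {"gender": "F", "lessons": 25, "mistakes": 12, "instructor": "A"},
    {"gender": "M", "lessons": 10, "mistakes": 1, "instructor": "B"},     # plausible extreme: kept
    {"gender": "M", "lessons": 30, "mistakes": None, "instructor": "C"},  # missing value: removed
]

REQUIRED_FIELDS = ("gender", "lessons", "mistakes", "instructor")


def is_complete(row):
    """Return True when every required field is present and not None."""
    return all(row.get(field) is not None for field in REQUIRED_FIELDS)


# Keep only the complete rows; outliers are deliberately left in.
cleaned = [row for row in candidates if is_complete(row)]
removed = len(candidates) - len(cleaned)
print(f"kept {len(cleaned)} rows, removed {removed}")
```

Removing whole rows keeps every remaining candidate usable for every graph, at the cost of a slightly smaller data set.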

I did a test sample in order to make sure that the hypothesis I am testing is worth investigating. This also makes sure that the rest of the work I do will be worthwhile, and tells me whether there is any relationship between the number of mistakes and the number of lessons taken. I took the data of the first ten candidates from each instructor and put this into the graph, so there is a sample of 40 candidates in total. I chose 40 as it is a suitable size for a test sample.

- The graph shows that there is a negative correlation with a coefficient of -0.5905. This is a moderate correlation between the number of one hour lessons and the number of mistakes made: it isn’t really strong, but it shows a clear relationship. This means that my hypothesis is worth testing, and it also suggests that other factors can have an effect on the number of mistakes made during a driving test. These other factors are the exceptions.
- Therefore, the hypothesis is likely to hold: the more one hour lessons you have, the fewer mistakes you are likely to make. To establish the correlation more reliably I will need to study the data further.
- I put in a line of best fit as there seems to be some kind of correlation between the points. The equation of the line of best fit is y = -0.6124x + 29.58, where x is the number of one hour lessons and y is the number of minor mistakes.
- The gradient is -0.6124, which tells you that for every extra lesson taken, the predicted number of mistakes goes down by 0.6124.
- The y-intercept is 29.58, which would mean that a candidate with no lessons makes 29.58 mistakes. This isn’t a valid prediction, as you wouldn’t consider taking your driving test without any lessons beforehand.
- The expectation that I wrote down was supported by the test sample: the more one hour lessons you have, the fewer mistakes you make. To get a more accurate correlation, I need to analyse the data further and take other factors into account later.
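A correlation coefficient and a least-squares line of best fit like the ones quoted above can be calculated from first principles. The sketch below is one way of doing it and is not necessarily how the figures in this report were produced; the `lessons` and `mistakes` lists are made-up values, not the 40 candidates in the test sample.

```python
from math import sqrt


def pearson_and_fit(xs, ys):
    """Return (r, gradient, intercept) for the least-squares line y = gradient*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)               # Pearson correlation coefficient
    gradient = sxy / sxx                    # change in y for each extra unit of x
    intercept = mean_y - gradient * mean_x  # the line passes through (mean_x, mean_y)
    return r, gradient, intercept


# Hypothetical sample: number of one hour lessons and number of minor mistakes.
lessons = [10, 15, 20, 25, 30, 35]
mistakes = [25, 18, 20, 12, 14, 6]

r, gradient, intercept = pearson_and_fit(lessons, mistakes)
print(f"r = {r:.4f}, line: y = {gradient:.4f}x + {intercept:.2f}")
```

A negative `r` and a negative `gradient` both indicate that more lessons go with fewer mistakes; the closer `r` is to -1, the more tightly the points follow the line.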

Hypothesis: “The number of mistakes a candidate makes during their driving test is affected by the number of one hour lessons that they have had.”

I will need to take a stratified sample to see whether or not the number of one hour lessons affects the number of mistakes. I am only taking a sample of 100, as it would be too time consuming to use the whole population. I still need a fairly large portion of the population in order to get an accurate representation, in case I coincidentally choose a section of the data that has a particularly strong or weak relationship, and 100 is a suitable amount of data to test. Using a stratified sample ensures that the sample is proportional, so that none of the instructors is under-represented or over-represented, which makes it fairer. In order to get the correct number of samples from each instructor I will need to do the calculation:

Number of samples for an instructor = (number of candidates for that instructor ÷ total number of candidates) × 100

- Total number of candidates for all the instructors was 227.
- Total number of candidates for A was 60
- Total number of candidates for B was 93
- Total number of candidates for C was 24
- Total number of candidates for D was 50

- Instructor A: 60 ÷ 227 × 100 = 26.43, so there will be 26 samples from A.
- Instructor B: 93 ÷ 227 × 100 = 40.97, so there will be 41 samples from B.
- Instructor C: 24 ÷ 227 × 100 = 10.57, so there will be 11 samples from C.
- Instructor D: 50 ÷ 227 × 100 = 22.03, so there will be 22 samples from D.
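The allocation calculation can be checked with a short script. The candidate counts come from the totals listed above. The random selection of row numbers within each instructor is an assumption about how the individual candidates might then be picked, since the report does not say how this was done.

```python
import random

# Candidates per instructor, as listed in the report.
candidates_per_instructor = {"A": 60, "B": 93, "C": 24, "D": 50}
SAMPLE_SIZE = 100


def allocate(counts, sample_size):
    """Share the sample between the strata in proportion to their size."""
    total = sum(counts.values())
    return {name: round(n / total * sample_size) for name, n in counts.items()}


allocation = allocate(candidates_per_instructor, SAMPLE_SIZE)
print(allocation, "total:", sum(allocation.values()))

# Assumed next step: choose that many row numbers at random within each stratum.
rng = random.Random(0)  # fixed seed so the choice can be repeated
chosen_rows = {
    name: sorted(rng.sample(range(1, candidates_per_instructor[name] + 1), k))
    for name, k in allocation.items()
}
```

Here the rounded allocations happen to add up to exactly 100. With other counts, rounding each stratum separately can leave the total one above or below the target, so the total is worth checking.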

[Stem and leaf diagram: number of one hour lessons]

Key: 2 | 3 means that the number of lessons is 23

- The range of the number of lessons was 32, which shows that the results are varied and probably depend on more than one factor.
- The average was 22.7, so a typical candidate took about 23 one hour lessons, a number that most candidates could easily achieve.

Number of minor mistakes

| Stem | Leaves |
|---|---|
| 0 | 6 4 2 5 4 5 9 4 2 1 3 1 5 6 3 1 3 5 4 9 8 8 1 3 9 5 6 5 |
| 1 | 6 9 7 3 3 9 0 5 9 9 5 6 5 9 5 5 0 7 9 7 2 2 1 5 9 0 3 0 1 5 2 4 9 1 2 9 0 4 4 4 4 0 6 |
| 2 | 7 7 5 4 8 1 4 5 7 6 3 8 2 6 4 4 3 1 2 8 7 1 |
| 3 | 6 3 2 0 2 3 0 |

Key: 3 | 6 means that the number of minor mistakes is 36

- The range of the number of minor mistakes was 35, which shows that the results are varied and probably depend on more than one factor.
- The average of the 100 values in the diagram was 15.21, so a typical candidate made about 15 minor mistakes.
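The summary figures for the number of minor mistakes can be recomputed directly from the stem and leaf diagram. The sketch below types in the leaves exactly as they are printed in the diagram and rebuilds each value as stem × 10 + leaf; the variable names are only illustrative.

```python
# Leaves copied from the "Number of minor mistakes" stem and leaf diagram.
stem_and_leaf = {
    0: "6 4 2 5 4 5 9 4 2 1 3 1 5 6 3 1 3 5 4 9 8 8 1 3 9 5 6 5",
    1: "6 9 7 3 3 9 0 5 9 9 5 6 5 9 5 5 0 7 9 7 2 2 1 5 9 0 3 0 1 5 2 4 9 1 2 9 0 4 4 4 4 0 6",
    2: "7 7 5 4 8 1 4 5 7 6 3 8 2 6 4 4 3 1 2 8 7 1",
    3: "6 3 2 0 2 3 0",
}

# Rebuild every data value from its stem (tens) and leaf (units).
values = [
    stem * 10 + int(leaf)
    for stem, leaves in stem_and_leaf.items()
    for leaf in leaves.split()
]

count = len(values)
value_range = max(values) - min(values)
mean = sum(values) / count

print(f"n = {count}, range = {value_range}, mean = {mean:.2f}")
```

Counting the values is a quick check that the diagram really holds the whole sample of 100 before the range and mean are read off.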

I drew these stem and leaf diagrams as they were the easiest way to order the data and look at each variable separately, before drawing any conclusions about the relationship between the number of lessons and the number of mistakes.

The graph of the results found from the sample has a line of best fit with equation y = -0.3468x + 22.97, where x is the number of one hour lessons and y is the number of minor mistakes.

- I put a line of best fit in as some of the results looked like they followed a trend, and I also wanted to check whether the sample supported the hypothesis.
- The gradient of the line of best fit was -0.3468, which shows that for every extra lesson taken, the predicted number of mistakes decreases by 0.3468. Since the points do not lie perfectly on the line, there are people who do not follow the trend.
- The y-intercept is 22.97, which would mean that a candidate with no lessons makes 22.97 mistakes. This isn’t a valid prediction, as you wouldn’t take your driving test if you hadn’t taken any lessons.
- The correlation coefficient is -0.3538, which is a negative and fairly weak correlation. This weak correlation doesn’t give me much confidence in the hypothesis, but it could be affected by the exceptions listed previously, which could have an impact on how many mistakes you make.
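To see what the sample line of best fit and correlation coefficient mean in practice, the quoted figures can be put into a short calculation. The gradient, intercept and coefficient are taken from the report; the lesson counts used for the predictions are arbitrary examples, and the function name is only illustrative.

```python
# Figures quoted in the report for the sample of 100 candidates.
GRADIENT = -0.3468
INTERCEPT = 22.97
R = -0.3538


def predicted_mistakes(lessons):
    """Predicted number of minor mistakes from the line of best fit."""
    return GRADIENT * lessons + INTERCEPT


# Arbitrary example lesson counts, chosen only to show the trend.
for lessons in (10, 20, 30, 40):
    print(f"{lessons} lessons -> about {predicted_mistakes(lessons):.1f} mistakes")

# r squared: the share of the variation in mistakes that the straight line accounts for.
r_squared = R ** 2
print(f"r squared = {r_squared:.3f}")
```

Squaring the coefficient gives roughly 0.125, so the straight-line relationship with the number of lessons accounts for only about an eighth of the variation in mistakes. That fits the idea that the exceptions also play a part.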

Conclusion

One of the limitations of the project was that there were not enough candidates for Instructor C, which limited the data. The results may therefore not be fully reliable, but the data seems relatively reliable and not too biased.

Throughout the investigation, males seem to be the better drivers. Referring back to the original hypothesis, the number of lessons does affect the number of mistakes made, but this relationship is also affected by gender and instructor. Males seem to make fewer mistakes and appear to be unaffected by instructor. Females make more mistakes, so gender affects performance, and for females the instructor also has an effect on how well they perform in their driving test. Overall, if I were one of the candidates I would choose Instructor C, mainly because gender doesn’t appear to affect the number of mistakes for that instructor, and also because both the number of lessons and the number of mistakes are quite low.
