This box and whisker diagram makes it clear who the better drivers are!
From my diagrams and graphs it is clear to see that the 30 random men are better drivers than the 30 random women. Choosing 30 people from each gender randomly gives me a good amount of information to draw accurate results. I am not dealing with too many pieces of information and I can draw a conclusion for this hypothesis.
It would have been interesting to see what my histograms would have looked like if I had used all 240 pieces of data. I believe they would have been very similar to my results, as there is a large difference in the mean values of males and females. For this reason, I do not believe that the females would have turned out to be the better drivers.
The piece of data I believe to be most valuable was the difference in the mean values between males and females. On average, males made 5.3 fewer mistakes than their female counterparts. This shows us that men are better drivers than women.
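The comparison of mean values can be checked with a short calculation. The sketch below is illustrative only: the two lists are made-up numbers rather than the real sample, and the function name `mean_difference` is a hypothetical helper.

```python
from statistics import mean

def mean_difference(group_a, group_b):
    """Return mean(group_b) - mean(group_a)."""
    return mean(group_b) - mean(group_a)

# Made-up mistake counts, for illustration only (not the real sample).
example_males = [10, 12, 14]
example_females = [15, 18, 21]

# A positive answer means the first group made fewer mistakes on average.
gap = mean_difference(example_males, example_females)
print(f"Males made {gap} fewer mistakes on average")
```

With the real data, the two lists would each hold the 30 mistake counts for one gender.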
HYPOTHESIS TWO – DRIVERS WILL MAKE FEWER MISTAKES IN THEIR TEST IF THEY HAVE MORE LESSONS
Now that we have found that being male makes you better at driving, we can see whether a person’s driving can be improved by the number of lessons they take. We will see whether males are affected by the number of lessons they receive, and whether females are affected.
To do this we will take a stratified sample. This means that instead of selecting my drivers at random from the whole list, I will choose how many drivers to take from each instructor. The reason I am doing this is to take drivers from all 4 different instructors so I will get a range of results. However, each instructor gave lessons to a different number of people, so taking an equal number of people from each instructor would not be appropriate. Instead I took a number of people in proportion to how many drivers the instructor took. For example, for males instructor C only took 16 people for their test, whereas instructor B took around 60. For my 30 males and 30 females the number of people for each instructor was as follows:
- Instructor A – 8 people
- Instructor B – 12 people
- Instructor C – 5 people
- Instructor D – 5 people
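The proportional split above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the exact method used: the instructor counts are hypothetical (chosen only so that the result matches the 8, 12, 5, 5 split, since the real counts are not listed here), the function names are my own, and it assumes drivers are picked at random within each instructor's group.

```python
import random

def proportional_allocation(group_sizes, sample_size):
    """Share sample_size between groups in proportion to their sizes."""
    total = sum(group_sizes.values())
    exact = {g: sample_size * n / total for g, n in group_sizes.items()}
    alloc = {g: int(x) for g, x in exact.items()}  # round each share down
    # Hand out any places lost to rounding down, largest remainder first.
    leftover = sample_size - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc

def stratified_sample(drivers_by_group, sample_size, seed=0):
    """Pick drivers at random within each group, using the proportional shares."""
    rng = random.Random(seed)
    sizes = {g: len(d) for g, d in drivers_by_group.items()}
    alloc = proportional_allocation(sizes, sample_size)
    return {g: rng.sample(drivers_by_group[g], alloc[g]) for g in drivers_by_group}

# Hypothetical numbers of drivers per instructor (NOT the real counts).
hypothetical_counts = {"A": 32, "B": 48, "C": 20, "D": 20}
allocation = proportional_allocation(hypothetical_counts, 30)
print(allocation)
```

Rounding every share down and then giving the spare places to the largest remainders keeps the total at exactly 30, which simple rounding does not always do.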
I constructed two scatter graphs, with number of mistakes along the y axis and number of lessons along the x axis: one scatter graph for males and one for females.
Male scatter graph
Female scatter graph
In the male scatter graph we can clearly see that the line of best fit slopes downwards, showing a strong negative correlation. We see that when more lessons are taken, fewer mistakes are made in the test. We can also see that there are a large number of points in the bottom right hand corner of the graph. This tells us that when a male driver takes between 30 and 40 lessons he will make fewer than 10 mistakes in his test; this is true for all the points in the 30 to 40 lesson region. This shows that my hypothesis, that drivers can improve their driving with more lessons, is correct for males.
For the females it is a different story. There seems to be no correlation in my results, as there are points scattered everywhere. One would expect that the more lessons a driver takes, the better a driver they should become. However, my line of best fit shows that in fact, for the 30 females used in this hypothesis, the more lessons they take the worse drivers they become! This is unusual and does not make much sense.
I believe that the 30 females used in my test do not show the truth. I believe if I were to take another set of 30 females I would get entirely different results. I believe that the data I used for females was not reliable and my graph shows this.
Overall, my hypothesis is half correct. I said that all drivers would get a better result in their test with more lessons. This is true for male drivers but not for females. However, I think that if I had taken all 120 drivers from each gender I would have had different results. I believe that my results for males are reliable but those for females are unreliable. A repeat of this investigation would have made my results more accurate.
HYPOTHESIS THREE – DRIVING INSTRUCTORS PREFER MALE DRIVERS TO FEMALE DRIVERS
My third hypothesis will show if the instructor makes a difference in how many mistakes a driver makes and if the instructor prefers taking males or females for driving lessons.
For this hypothesis I will be using 30 male drivers from each instructor and 30 female drivers for each instructor. Unfortunately each instructor does not take 30 drivers of each gender. For the instructors who do not have 30 drivers I will use as many drivers as they have. For example instructor C only takes 18 males for their test, not 30. So I will use as many as possible, all 18.
INSTRUCTOR A
MALE DRIVERS FEMALE DRIVERS
Here are the data I will be using for my graphs for Instructor A. I will construct a scatter graph for each driving instructor showing mistakes against lessons, one graph for males and one for females. This will show me if the instructor prefers to take male drivers, and whether he is more effective at teaching males.
We have a complete set of data here with no missing values, which helps us to make more accurate conclusions. There are 29 pieces of data for males and 30 for females.
Remember that the number of mistakes is on the y axis and the number of lessons is on the x axis.
Here we can see that instructor A treats female and male drivers the same. The lines of best fit go in the same direction at almost the same angle. The best fit line seems to go through a number of points, showing us that it is accurate.
The correlation coefficient is -0.401 for females and -0.629 for males. This shows us that the lines are very similar, meaning that the instructor teaches men and women equally.
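For reference, a correlation coefficient like the ones quoted above can be worked out by hand. The sketch below assumes the software reports the Pearson product-moment correlation coefficient (the text does not say which measure it uses), and the lessons and mistakes in it are made-up values, not Instructor A's data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)

# Made-up (lessons, mistakes) pairs, for illustration only.
example_lessons = [10, 15, 20, 25, 30, 35]
example_mistakes = [30, 26, 24, 15, 12, 8]
r = pearson_r(example_lessons, example_mistakes)
print(round(r, 3))
```

A value near -1 means the points lie close to a downward-sloping line, a value near 0 means little or no linear pattern, and a value near +1 means the points lie close to an upward-sloping line.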
INSTRUCTOR B
MALE DRIVERS FEMALE DRIVERS
I do not have complete data for instructor B. If I receive reliable results I may be able to predict by using my best fit line what the missing values are. If my results are not reliable I will draw incorrect conclusions.
I have the full 30 males and females for instructor B.
MALE DRIVERS – INSTRUCTOR B
FEMALE DRIVERS – INSTRUCTOR B
Here we can see that for male drivers there is a strong negative correlation: the more lessons a driver takes, the fewer mistakes he makes. This is very true for Instructor B, as most of the points lie along the best fit line. This tells us that this instructor teaches males well.
However, for the females there seems to be no correlation whatsoever. The best fit line does not move in the direction I expected but in the opposite direction. This is similar to the graph we encountered in hypothesis two, where there was no correlation for female drivers.
From this graph we can clearly see that Instructor B prefers to drive with male drivers. This is because there is a strong correlation for them but no correlation for female drivers. This tells us that if you were a female, driving with Instructor B would not be helpful to you.
There is such a strong negative correlation for male drivers with instructor B that I will be able to fill in data that was missing during the investigation.
Here is the graph with the results. Using the blue line of best fit I will be able to predict what the missing values were by reading off the number of mistakes (y axis) and seeing where it hits my line of best fit.
The missing values are filled in with bold.
I could only use this method for the males using Instructor B because there was no correlation for the females, so any values read off their line would have been false results.
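Reading a missing value off the line of best fit can also be done as a calculation. This is a minimal sketch, assuming the software fits an ordinary least-squares line (which is not confirmed here); the data points and the function names are hypothetical and are not Instructor B's results.

```python
def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are the same, so no line can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_mistakes(lessons, slope, intercept):
    """Read up from the x axis (lessons) to the line."""
    return slope * lessons + intercept

def predict_lessons(mistakes, slope, intercept):
    """Read across from the y axis (mistakes) to the line; needs slope != 0."""
    return (mistakes - intercept) / slope

# Made-up points lying exactly on a line, for illustration only.
example_lessons = [10, 20, 30]
example_mistakes = [30, 20, 10]
slope, intercept = fit_line(example_lessons, example_mistakes)
print(predict_lessons(15, slope, intercept))
```

A prediction made this way is only as trustworthy as the correlation behind it, which is why the method is not applied to the female data.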
The results for Instructor B support my hypothesis that he prefers male drivers to females.
INSTRUCTOR C
MALE DRIVERS FEMALE DRIVERS
Instructor C conducted the fewest tests out of all the instructors. I do not have enough data to draw reliable conclusions. I would need more data to have complete results that would accurately show who the instructor prefers to take tests with.
MALE DRIVERS – INSTRUCTOR C
FEMALE DRIVERS – INSTRUCTOR C
From these two scatter diagrams we can see that instructor C treats both genders of driver equally. For both diagrams there is a strong negative correlation, and this was what I expected in hypothesis 2. Here, for both graphs, only the people who took between 30 and 40 lessons made under 10 mistakes. This shows that the instructor is not as good a teacher as the others but treats both genders equally. This does not follow my hypothesis that the instructors give males fewer mistakes than females.
INSTRUCTOR D
MALE DRIVERS FEMALE DRIVERS
I do not have many drivers that used instructor D either but I do have more than C. I will be able to deduce some accurate conclusions from my graphs.
MALE DRIVERS – INSTRUCTOR D
FEMALE DRIVERS – INSTRUCTOR D
Instructor D is the least effective instructor. There are few points below the 10 mistakes line on my graph. Most people for instructor D, both male and female, make over 25 mistakes.
For the males the line of best fit does not move along where the points are situated. The points are scattered around the graph and without the line of best fit seem to show little correlation. There is a small group on the male graph who make 25 – 30 mistakes and take 25-30 lessons. This shows that the instructor is inconsistent in how he treats males.
He treats females slightly worse but he is very consistent in how he treats them. All the points lie close to the best fit line and there is a strong negative correlation. This shows us that he treats both genders fairly similarly but treats the females slightly more harshly. This fits my hypothesis.
I believe that my scatter graphs are not easy to interpret and comparing the male and female results is difficult. This is why I have constructed box and whisker diagrams to make my results clearer and see if my hypothesis is correct.
Remember that the number of mistakes is on the x axis for these box and whisker diagrams.
MALE DRIVERS
This box and whisker diagram shows how each instructor treated the males. Instructor A gives the lowest average number of mistakes, with 10, and instructor D gives the most, with an average of 24. Instructor C gives a wider range of results than any other instructor, and his inter-quartile range is the largest. These instructors generally prefer males and none of them give any driver over 35 mistakes in their test.
FEMALE DRIVERS
When I compare the female box and whisker to the male box and whisker, the first thing that catches my eye is that, for every instructor, the average number of mistakes for females is above the average number of mistakes for males. This shows me that every instructor gives females more mistakes than they do males.
Instructor C is the least consistent with his results, as his inter-quartile ranges are the largest. He gives a wide range of results compared to instructors A and D, who have the majority of their drivers in the same area.
The box and whisker diagrams are very helpful in exploring my hypothesis. I said that instructors prefer male drivers to females and I have proven this with my analysis.
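The values that a box and whisker diagram is drawn from (minimum, lower quartile, median, upper quartile and maximum) can be calculated as in the sketch below. It assumes the quartiles are found as the medians of the lower and upper halves of the sorted data; software may use a slightly different quartile rule, so its figures could differ a little. The mistake counts are made up for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile and maximum."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd number of values the middle one is left out of both halves.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }

def interquartile_range(values):
    """Upper quartile minus lower quartile: the width of the box."""
    summary = five_number_summary(values)
    return summary["q3"] - summary["q1"]

# Made-up mistake counts for one instructor, for illustration only.
example_mistakes = [4, 8, 10, 12, 15, 20, 31]
summary = five_number_summary(example_mistakes)
print(summary, interquartile_range(example_mistakes))
```

Comparing the inter-quartile range for each instructor gives a number to back up which instructor is the least consistent.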
CONCLUSION
My three hypotheses are linked together. In my first hypothesis I said that male drivers were better than females. I found this to be correct. I then asked if the number of lessons would affect the number of mistakes in the test. I found that it affected males but not females. I believed this to be unreliable data, and a way of making it more accurate would have been to repeat the investigation. My third hypothesis asked whether the instructor made a difference to the number of mistakes you made and whether the instructor preferred taking male drivers to females. I found that this was correct and that different instructors marked differently.
I would say that my investigation was a successful one and my hypotheses were correct. The software I used was accurate and gave me reliable results. Overall male drivers were better than females!