Drawing a cumulative frequency graph and a histogram will enable me to see whether my sample is normally distributed. This will show me how accurately the findings of my investigation can be applied to the population as a whole.
My Prediction
I think that as the Key Stage 2 results increase, the IQ of the student will increase as well. In other words, I believe that there will be a positive correlation between the two, as both sets of information are supposed to be measures of a person’s intelligence.
I think that boys will be proven to be cleverer than girls. I will show this by drawing a cumulative frequency graph. I believe that the results for the girls will be more towards the left of the graph, whereas the results for the boys will be more towards the right. Each cumulative frequency line will be a different colour. On the same page I will draw a bar by the side that will be coloured in according to which line is more towards the right.
The Pilot Test
I have conducted two pilot tests: one for Key Stage 3 and the other for Key Stage 4. These pilot tests will show me the relationship between the pupils’ IQ and their Key Stage results. I have conducted the tests using all of the Mayfield High data that I was given. To show each pilot test I have presented it as a scatter graph using Microsoft Excel, so that I can easily see the correlation between the two.
I can see from both pilot tests that there is a positive correlation, showing me that there is a relationship between the IQ and the Maths Score. This relationship is that the higher the Key Stage 2 score, the higher the IQ. From the graphs I can also see that there are a few points that do not follow this pattern. These points are called outliers.
For both of these pilot tests I found the average IQ results, took the Key Stage 2 results and made another scatter graph. Here are the scores that I used to construct the scatter graphs.
The reason that I took the average IQ was so that I could see how the average IQ compared with the first graph, where I only used IQ and not the average IQ. Using the averages gives me a good indication of where the points should lie on the graph.
By looking at these graphs I can easily see where most of the results should lie, according to where the averages lie on the graph.
The Sampling Techniques
I was given a large amount of data, covering 1183 pupils. Because this is too much to analyse, I am going to break it down into a smaller sample. From the given data I will choose roughly 10%, so I will choose 120 pupils (10% of 1183 is about 118, so 120 is a convenient round number). I can pick this sample by using different sampling methods. These methods are:
Stratified Sampling
I will use stratified sampling because it is the best method to use when you have a large amount of data. A stratified sample is one in which the population is divided into groups called strata and a random sample is selected from each stratum. The sample is picked so that each stratum appears in the same proportion as in the overall data, so that a correct analysis can be made from the data.
Systematic Sampling
This is a method which picks data at regular intervals. For example, you could take every tenth pupil in the list. This type of sampling is unbiased and does not need the data to be in any particular order. However, when using this type of sampling you have to be careful, as the sample can sometimes represent one part of the data but not others.
For example, if I were to choose from a list which was in order of age between 5 and 15 years, the sample would be unrepresentative if I ended up with thirty 10-year-olds and no pupils from any other age group. Even though you have to be careful when using this method, it is quicker than random sampling.
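As a minimal sketch of the idea, assuming the pupils are simply held in a numbered list, taking every tenth pupil could look like this in Python. The function name systematic_sample and the list of pupil numbers are made up for illustration and are not part of the Mayfield data.

```python
import random


def systematic_sample(items, step, start=None):
    """Take every `step`-th item from the list.

    If no starting position is given, one is picked at random from the
    first `step` positions so that every item has a chance of selection.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if start is None:
        start = random.randrange(step)
    return items[start::step]


# Placeholder pupil numbers 1..100 (illustrative only).
pupils = list(range(1, 101))
print(systematic_sample(pupils, step=10, start=4))
```

Because the starting position is the only random choice, the whole sample is fixed once it has been picked, which is why an ordered list can lead to an unrepresentative sample.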
Quota Sampling
Quota sampling is a technique that brings in the control of quotas; that is to say, the person whose task it is to compile the sample is given a series of quotas to fill. This kind of sample is often used in market research surveys, where each interviewer is given definite instructions about which members of the public he or she will ask.
Cluster Sampling
This is when the pupils are put into small, similar clusters. A random sample of clusters is chosen and every item in each chosen cluster is surveyed. Using a large number of small clusters minimises the risk of the sample being unrepresentative. An advantage of this type of sampling is that it can be quick; the disadvantage is that it may not represent every pupil who goes to the school.
Random Sampling
Another sampling method that I will use is random sampling. Random sampling is a method which involves numbering the data that is going to be used. You then pick the sample by using random numbers, which are generated by a computer. This is a good method as it is unbiased: every piece of data has a fair and equal chance of being selected. However, random sampling can be very time consuming.
I will use this data to compare the IQ results with the Key Stage 2 results, to see if there is a correlation or pattern between the two results. To represent my findings I will use a scatter graph – to see if there is a correlation, a histogram – to see whether the data is normally distributed, a cumulative frequency graph – to see whether boys or girls are more intelligent, and a box plot – as this will allow me to quickly compare the two sets of data.
I will use a scatter graph because I want to see the relationship between the IQ and the Key Stage 2 results. If my prediction is correct there should be a positive correlation – showing that the higher the IQ results the higher the Key Stage 2 results will be. By using a scatter graph I should be able to draw a line of best fit, which I can then use to find an equation, which will help me find a result for either the IQ or the Key Stage 2 score if I know what the other result is.
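To illustrate how an equation can be found from a line of best fit, here is a minimal Python sketch. It assumes the line is the least-squares line, which is one common definition; a line drawn by eye may give a slightly different equation. The function names and the four data points are made up for illustration.

```python
def line_of_best_fit(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = gradient*x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are the same, so no line can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def predict_y(x, gradient, intercept):
    """Use the line to estimate y from a known x."""
    return gradient * x + intercept


def predict_x(y, gradient, intercept):
    """Rearrange the line to estimate x from a known y."""
    return (y - intercept) / gradient


# Made-up points lying on the line y = 2x + 1 (illustrative only).
gradient, intercept = line_of_best_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(gradient, intercept)
print(predict_y(5, gradient, intercept), predict_x(11, gradient, intercept))
```

Having both predict_y and predict_x reflects the aim of finding either result when the other one is known.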
Picking My Sample
Having explained the different sampling methods that I could use, I feel that stratified sampling would be the most suitable way to obtain an unbiased sample and to make sure that the sample accurately reflects the original data. I will pick my sample of 120 pupils by stratifying the data by gender and age. I have chosen gender as a variable because it links to one of my hypotheses, and I also feel that gender can affect my results.
Firstly I will gather all the data for gender in each year group into a pivot table:
From this table I can see the number of girls and boys in each year group and the total number of girls and boys as a grand total. I then worked out the stratum size for the boys and for the girls by doing the following sum:
As you can see, the table shows that I should choose 61 boys and 59 girls. However, as these two numbers are roughly equal, I am going to change them slightly by taking away one boy and adding one girl, so that I choose 60 of each from the year groups as a whole. This will make comparisons between the two genders fairer and a little easier later on. To select the number of each gender in each year I will use the following method:
Divide the sample size of 60 by the total number of boys/girls in the original data and then multiply this number by the total of that gender in each year group.
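The rule above can be sketched in Python as follows. This is only an illustration: the function name stratum_sample_sizes is made up, and the year group counts in the example are invented placeholders, not the real Mayfield figures.

```python
def stratum_sample_sizes(stratum_counts, sample_size):
    """Share `sample_size` between strata in proportion to their sizes.

    Each share is sample_size / total * count, rounded to a whole pupil.
    Because of rounding, the shares may not add up exactly to sample_size,
    so a small manual adjustment can be needed afterwards.
    """
    total = sum(stratum_counts.values())
    if total == 0:
        raise ValueError("there must be at least one pupil in the data")
    return {
        name: round(sample_size * count / total)
        for name, count in stratum_counts.items()
    }


# Invented counts for one gender in three year groups (placeholders only).
example_counts = {"Year A": 100, "Year B": 60, "Year C": 40}
print(stratum_sample_sizes(example_counts, sample_size=20))
```

Note that Python's round() sends exact halves to the nearest even number, which can differ from the rounding used on a calculator or in a spreadsheet.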
I will show you the sample size for year 7:
I will then use this same calculation to compile a stratified sample from each year group. To do this I used a spreadsheet program to help me with the formulas and also to speed the process up. This is how I did the calculations to help me choose my actual sample size:
This is the formula that I used to obtain my answers for each year group:
From the male and female sample sizes in each year group, I will use random sampling to pick my sample from the given data. For random sampling I will number the data that is going to be used. I am then going to pick the sample by using random numbers generated with the random number button on the calculator (Ran#). This is a good method as it is unbiased: every piece of data has a fair and equal chance of being selected.
For example, the random sample for the year 7 girls was obtained like this:
- Firstly I arranged the year 7 girls in number form, 1-131.
- Then I pressed Ran# × 131 = 55.02 ≈ 55.
- I then chose the girl numbered 55 as the first piece of information from my data.
- I then repeated this process until I had 13 year 7 girls.
- I also repeated this for each gender in each of the other year groups.
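The steps above can be imitated in Python with a short sketch. It assumes that the calculator's Ran# button gives a decimal between 0 and 1, which random() also does; the function name pick_random_numbers is made up, and skipping repeated numbers and the number 0 is my own assumption about how such cases would be handled.

```python
import random


def pick_random_numbers(population_size, how_many, rng=None):
    """Pick `how_many` different pupil numbers between 1 and `population_size`.

    Each number is made like the calculator method: a random decimal
    between 0 and 1 is multiplied by the population size and rounded.
    Zero and numbers that were already chosen are skipped.
    """
    if how_many > population_size:
        raise ValueError("cannot pick more pupils than there are in the list")
    rng = rng or random.Random()
    chosen = []
    while len(chosen) < how_many:
        number = round(rng.random() * population_size)
        if number < 1 or number in chosen:
            continue
        chosen.append(number)
    return chosen


# 13 pupils from a list numbered 1-131, as in the year 7 girls example.
print(pick_random_numbers(131, 13))
```

With rounding, the highest number is slightly less likely to come up than the others; random.sample(range(1, 132), 13) from the standard library avoids this and gives every number exactly the same chance.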
My Sample
Once I had my sample I noticed that one of the sets of data did not contain an IQ number. So I went back to the whole data that I was given and chose another pupil with the same results as the pupil that I had chosen before. The pupil that I have now used is indicated below: it is underlined and in bold. Below is the revised sample that I shall be using for my investigation:
Now that I have selected my data I can continue with my investigation.
I will now draw two scatter graphs. The first graph will compare the average Key Stage 2 results with the IQ.
I can see from this graph that there is a positive correlation, showing me that there is a relationship between the IQ and the average Key Stage 2 scores. This relationship is that the higher the Key Stage 2 score, the higher the IQ. From the graph I can also see that there are a few points that do not follow this pattern. These points are called outliers. The equation of the line is: y = 12.08x + 49.081.
For the second graph I took the average IQ and the average Key Stage 2 score and made another scatter graph. Here are the scores that I used to construct the scatter graph:
The reason that I took the average IQ was so that I could see how the average IQ compared with the first graph, where I only used IQ and not the average IQ. Using the averages gives me a good indication of where the points should lie on the graph.
By looking at these graphs I can easily see where most of the results should lie, according to where the averages lie on the graph. The equation of the line is: y = 14.41x + 42.684.
Cumulative Frequency Graph for both Girls and Boys
By looking at the cumulative frequency graph I can see that there is not much difference between the girls’ frequency and the boys’ frequency. However, it seems that the girls’ line is more towards the right, taking the lead. Nevertheless, at some points both genders seem to be equal. Because this is difficult to see, I have drawn a bar which will allow me to see which gender is cleverer than the other. At the bottom of the graph I also drew two box plots, one for each of the genders. I can easily see that the girls’ box plot tends to be longer than the boys’ box plot. By looking at the bar I can see that the girls are cleverer than the boys, proving that my sub-hypothesis is correct.
The cumulative frequency graph shows that my prediction, that girls are more intelligent than boys, was correct for my sample. This can be seen as the cumulative frequency graph for the girls was to the right of that for the boys for almost the whole range of the graphs, although there were two areas where the two graphs seemed to mirror one another. Having said this, the graphs also show:
- The girls’ and boys’ lower quartiles were the same.
- The girls’ median was slightly higher, which shows that the girls are slightly more intelligent.
- The girls’ upper quartile was slightly higher, which shows that girls are more intelligent.
- The boys’ interquartile range was slightly lower, which shows that the IQs of the boys showed a little less variation, which means their results are a little more consistent.
- The range of the boys’ results (41) was lower than that of the girls (54), which would seem to show that the girls’ results included more extreme values (outliers).
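The measures in the list above can also be calculated directly from the raw scores rather than read from a graph. The Python sketch below shows one way of doing this; the function name spread_summary and the five IQ values are made up for illustration, and quartiles calculated this way may differ slightly from ones read off a cumulative frequency curve.

```python
import statistics


def spread_summary(values):
    """Return the quartiles, interquartile range and range of a list of scores."""
    lower_q, median, upper_q = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "lower quartile": lower_q,
        "median": median,
        "upper quartile": upper_q,
        "interquartile range": upper_q - lower_q,
        "range": max(values) - min(values),
    }


# Placeholder IQ scores (not taken from my sample).
print(spread_summary([90, 95, 100, 105, 110]))
```

Running the same function on the boys' scores and on the girls' scores would give two summaries that can be compared line by line.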
Standard Deviation
I have also worked out the average IQ for the boys and for the girls, as well as the standard deviation and the percentage deviation. By doing this I can compare their IQs more accurately.
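A minimal Python sketch of these calculations is given below. It assumes that the percentage deviation means the standard deviation written as a percentage of the mean, and that the population form of the standard deviation is wanted; both are assumptions, as is the function name mean_and_deviation. The scores are placeholders.

```python
import statistics


def mean_and_deviation(values):
    """Return (mean, standard deviation, percentage deviation).

    The percentage deviation is taken here to be
    100 * standard deviation / mean (an assumption).
    """
    mean = statistics.fmean(values)
    standard_deviation = statistics.pstdev(values)
    percentage_deviation = 100 * standard_deviation / mean
    return mean, standard_deviation, percentage_deviation


# Placeholder scores (illustrative only).
print(mean_and_deviation([2, 4, 4, 4, 5, 5, 7, 9]))
```

If the sample form of the standard deviation is preferred, statistics.stdev can be used in place of statistics.pstdev.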
Average KS2 vs. IQ graph for Girls
For this graph I took the average KS2 so that I could compare the average KS2 with the actual KS2 scores. The average KS2 score was compiled by adding together each pupil’s scores from English, Maths and Science and then dividing that result by 3. By looking at the graph I can see that there is a positive correlation. Because the correlation is positive, it shows me that the higher the IQ, the higher the average KS2. The equation of the line is: y = 13.119x + 48.24.
Average KS2 vs. IQ for Boys
For this graph I took the average KS2 so that I could compare the average KS2 with the actual KS2 scores. The average KS2 score was compiled by adding together each pupil’s scores from English, Maths and Science and then dividing that result by 3. By looking at the graph I can see that there is a positive correlation. Because the correlation is positive, it shows me that the higher the IQ, the higher the average KS2. The equation of the line is: y = 12.673x + 50.008.
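The two equations for the girls and the boys can be compared with a short Python sketch. It assumes that x is the average KS2 level and y is the IQ, which the text does not state outright; the function names and the KS2 levels 3, 4 and 5 are chosen only for illustration.

```python
# (gradient, intercept) pairs taken from the two equations of the lines.
GIRLS_LINE = (13.119, 48.24)
BOYS_LINE = (12.673, 50.008)


def predicted_iq(average_ks2, line):
    """Estimate the IQ from an average KS2 level using y = gradient*x + intercept."""
    gradient, intercept = line
    return gradient * average_ks2 + intercept


def crossing_point(line_a, line_b):
    """Return the x value at which two straight lines give the same y."""
    (m1, c1), (m2, c2) = line_a, line_b
    if m1 == m2:
        raise ValueError("parallel lines never cross")
    return (c2 - c1) / (m1 - m2)


for level in (3, 4, 5):
    girls = predicted_iq(level, GIRLS_LINE)
    boys = predicted_iq(level, BOYS_LINE)
    print(level, round(girls, 1), round(boys, 1))

print(round(crossing_point(GIRLS_LINE, BOYS_LINE), 2))
```

Under these assumptions the two lines cross at an average KS2 level of about 3.96, so the lines predict very similar IQs for boys and girls around level 4.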
Comparison Bar Chart
This is the information that I used to create the bar chart:
By looking at the graph I can see that for each of the levels the subject frequencies are quite close.
Histogram
This is the information that I used to draw my histogram.
By looking at the graph I can see that
Conclusion
Although, using my sample, I have shown that girls are more intelligent than boys, there are some results which make me think that my findings are not that reliable. To start, I found that girls have a higher average IQ and that their cumulative frequency graph is to the right of that for the boys; both would seem to show that girls are more intelligent. However, this difference is very small and one or two results could well change these findings. On the other hand, the results for the boys show less variation (with a lower interquartile range, a lower range and a lower standard deviation), and this would seem to show that the boys’ results are more consistent. The findings overall are therefore very mixed and neither strongly prove nor disprove my prediction.
There are also several other limitations that need to be taken into account when looking at my results. These are shown below:
- We were given this data by the exam board (i.e. it is secondary data) and therefore have no idea how accurate the data is. As a result, everything we have found out may be completely wrong.
- When choosing my sample I found that one piece of data needed to be replaced as it was anomalous. In a normal school this would probably be caused by a typographical error by the admin staff. The data may well contain many more of these errors, which again would make my findings unreliable.
The only way to improve this investigation would be to obtain sets of primary data and go through the whole investigation again.
MATHS coursework: Mayfield Page