I think that, in general, Investigation 1 went successfully. Apart from the anomalous result, there was nothing wrong with it that could have been corrected. The only way for me to remove the anomaly would have been to omit the result, but doing so would have undermined my random sampling method because I would then be choosing my results. If I were to repeat this investigation, I would like to use a larger sample; this may allow me to explain the anomaly better.
Investigation 2
Plan
Investigation 2: I plan to investigate whether there is a difference in average height between Australian and British Males.
Hypotheses
Null Hypothesis: There will not be a difference between the Median height of Australian men and that of British men.
Alternate Hypothesis: The median height of Australian men will be higher than that of British men.
In this investigation I am going to examine the average height of men in Britain and Australia. However, there are three different types of average: mean, median and mode. In this investigation I will be examining the median, because it is only the mid-point of the data and so is not distorted by anomalies.
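The point about the median discounting anomalies can be illustrated with a short Python sketch. The heights below are hypothetical values made up purely for illustration; they are not taken from the census data.

```python
from statistics import mean, median

# Hypothetical heights in cm, invented for illustration only.
typical = [150, 155, 160, 165, 170]
# The same list with the tallest value replaced by an anomalous one.
with_anomaly = [150, 155, 160, 165, 200]

# The median is the middle value, so the anomaly does not move it.
print(median(typical), median(with_anomaly))

# The mean uses every value, so the anomaly pulls it upwards.
print(mean(typical), mean(with_anomaly))
```

In this made-up example the median is the same for both lists, while the mean rises when the anomalous value is included.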
To try to prove or disprove these hypotheses I will construct cumulative frequency graphs. Above the cumulative frequency graphs I will draw box plots, as they will summarise the data effectively. A box plot shows the median, the lower and upper quartiles, and the lowest and highest values.
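As a rough sketch, the five values that a box plot shows could be calculated in Python as follows. The function name `five_number_summary` and the example heights are hypothetical, and the "inclusive" quartile convention is an assumption; quartiles read off a cumulative frequency graph by hand may differ slightly.

```python
from statistics import quantiles  # available from Python 3.8

def five_number_summary(heights):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    ordered = sorted(heights)
    # method="inclusive" is one common quartile convention (an assumption
    # here); other conventions can give slightly different quartiles.
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

# Hypothetical heights in cm, invented for illustration only.
example = [150, 155, 160, 165, 170]
lowest, q1, med, q3, highest = five_number_summary(example)

# The inter-quartile range is the width of the box in the box plot.
iqr = q3 - q1
print(lowest, q1, med, q3, highest, iqr)
```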
In this investigation I will use the census from school. It helps in this case because it allows me to get information from a large sample of subjects from countries around the world, which would otherwise be unfeasible. I will get 200 data items from British males and 200 data items from Australian males. If I plotted all 200 data entries for each country, I would not be in control of what is plotted and the graph would look very complicated. For these reasons I plan to take a sample of 40 people. The data is taken from a range of 7 – 17 year olds, so if I used simple random sampling there is a possibility that all of the British data would come from the younger years and all of the Australian data from the older years. This would make the investigation unfair because the data would be taken from different ages. For this investigation I am therefore going to use stratified sampling. This means that I will take a sample from most of the ages and have the same number of people from each age in both data sets.
To obtain the number of subjects I should use from each age group, I calculated the proportion of subjects in each age group and multiplied it by 40, the desired total in my final group. For example, 54 subjects out of 200 were aged thirteen, so for my selection of 40 subjects I needed to select 11 thirteen-year-olds:
(54/200) x 40 = 10.8
I rounded each result to the nearest integer. Once this was done, only 39 subjects were accounted for, so I changed the 5 to a 6 in the 11-year-old category, because it had come to 5.2 and there was a big difference between the UK and the Australian populations. To choose which people are selected from those present in each age group, I will use random sampling.
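The calculation for each age group can be sketched in Python as below. The function names `stratum_size` and `pick_from_stratum` are hypothetical, and only the thirteen-year-old figures (54 out of 200, sample of 40) come from the text; the other age-group counts are not reproduced here.

```python
import math
import random

def stratum_size(count_in_age_group, population_size, sample_size):
    """Number to sample from one age group, rounded to the nearest integer."""
    exact = count_in_age_group * sample_size / population_size
    return math.floor(exact + 0.5)  # round to nearest, halves rounded up

def pick_from_stratum(members, size, seed=None):
    """Randomly choose `size` members of one age group, without repeats."""
    return random.Random(seed).sample(members, size)

# The thirteen-year-old example from the text: (54/200) x 40 = 10.8
print(stratum_size(54, 200, 40))
```

Because each age group is rounded separately, the rounded numbers do not always add up to exactly 40, which is why one category had to be adjusted by hand.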
Stratified Data Tables
Cumulative Frequency Tables
UK
Australia
Investigation 2
Analysis
I have now drawn up my cumulative frequency graphs for both Australian and British Males. My original hypotheses were:
Null Hypothesis: There will not be a difference between the Median height of Australian men and that of British men.
Alternate Hypothesis: The Median height of Australian men will be higher than that of British men.
The cumulative frequency graphs suggest that, on average, Australian males are taller than British males. I can tell this because the median value is 158cm for Britain but 161cm for Australia. This supports my Alternate Hypothesis.
The shortest person was British and the tallest was Australian. However, the inter-quartile range was smaller in the British population. The advantage of the inter-quartile range is that it ignores wild and uncharacteristic results. The larger Australian inter-quartile range means that the Australian population had more variation in height and deviated more from the median than the British population. This result tells me that the Australian population may just have a few very tall people, skewing the results. If the mean were calculated without the few very tall people, the British population might turn out to be taller.
I think that in general my Investigation 2 went successfully. The way that the graphs and box plots were presented allowed easy reading and an easy comparison of results. The stratified sampling of the data was a fairer method than random sampling because it ensured that each age group was represented equally in both populations. If I were to repeat this investigation, I would like to calculate the mean and standard deviation; this would give support to my theory that the Australians had a few tall people, which influenced the results.
Investigation 3
Plan
Investigation 3: I plan to investigate whether there is a difference in average height between UK Males and UK Females.
Hypotheses
Null Hypothesis: There will not be a difference between the Mean height of UK Males and that of UK Females.
Alternate Hypothesis: The mean height of UK Males will be higher than that of UK Females.
In this investigation I am again going to examine average height, as in Investigation 2. However, this time I will use the mean to investigate the heights of the groups. I am examining the mean because it takes all of the data entries into account and I believe it is a clearer average to understand.
To try to prove or disprove these hypotheses I will construct histograms. Each histogram will plot frequency density against class interval. On the histograms I will use unequal class widths because the frequencies in some class intervals are very small and in others large.
In this investigation I will use the census from school again. It helps in this case because it allows me to get information from a wide variety of males and females of all different ages and from different areas of the country. I will use 200 data items from UK Males and 200 data items from UK Females. If I plotted all 200 data entries for each group, I would be plotting different numbers of people of each age in each group, and the frequency density would be very high in each case, giving a misleading graph. For these reasons I plan to take a sample of 40 people. The data is taken from a range of 7 – 17 year olds, so if I used simple random sampling there is a possibility that all of the Male Data would come from the younger years and all of the Female Data from the older years. This would make the investigation unfair because the data would be taken from different ages. For this investigation I am going to use stratified sampling again. This means that I will take a sample from most of the ages and have the same number of people from each age in both data sets.
I worked out the number to sample from each age group by multiplying the proportion of males in that age group by 40 and rounding to the nearest integer. To choose which people are selected from those present in each age group, I will use random sampling.
Stratified Data Tables
Frequency Density Tables
Frequency Density = Frequency of Class Interval / Width of Class Interval
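As a small sketch of this formula in Python (the function name `frequency_density` is hypothetical), using the two classes mentioned in the analysis. It assumes that the frequencies 25 and 19 used in the mean calculation belong to the 160 ≤ h < 170 and 170 ≤ h < 180 classes, each of width 10.

```python
def frequency_density(frequency, class_width):
    """Frequency density = frequency of class interval / width of class interval."""
    return frequency / class_width

# UK Females, 160 <= h < 170: assumed frequency 25, width 10
print(frequency_density(25, 10))

# UK Males, 170 <= h < 180: assumed frequency 19, width 10
print(frequency_density(19, 10))
```

Under that assumption the two results agree with the frequency densities of 2.5 and 1.9 quoted in the analysis.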
The Mean
To work out the mean from a frequency density table, you find the mid-point of each class interval and multiply it by the frequency of that interval. You do this for all of the class intervals and add the answers up. This total is then divided by the total frequency, which in this case is 40.
UK Males
(129.5x1) + (149.5x13) + (164.5x6) + (174.5x19) + (184.5x1) = 6560
6560 / 40 = mean
Mean = 164cm
UK Females
(129.5x2) + (149.5x12) + (164.5x25) + (174.5x1) = 6340
6340 / 40 = Mean
Mean = 158.5cm
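The two calculations above can be checked with a short Python sketch. The function name `grouped_mean` is hypothetical; the mid-points and frequencies are the ones used in the working above.

```python
def grouped_mean(midpoints_and_frequencies):
    """Estimate the mean from (mid-point, frequency) pairs."""
    total = sum(mid * freq for mid, freq in midpoints_and_frequencies)
    count = sum(freq for _, freq in midpoints_and_frequencies)
    return total / count

# (mid-point in cm, frequency) pairs taken from the working above
uk_males = [(129.5, 1), (149.5, 13), (164.5, 6), (174.5, 19), (184.5, 1)]
uk_females = [(129.5, 2), (149.5, 12), (164.5, 25), (174.5, 1)]

print(grouped_mean(uk_males))    # 6560 / 40 = 164.0
print(grouped_mean(uk_females))  # 6340 / 40 = 158.5
```

Because every height in a class is treated as if it were at the mid-point, this gives an estimate of the mean rather than the exact value.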
Investigation 3
Analysis
I have now drawn my histograms of height for both UK males and UK females. My original hypotheses were:
Null Hypothesis: There will not be a difference between the Mean height of UK Males and that of UK Females.
Alternate Hypothesis: The Mean height of UK Males will be higher than that of UK Females.
The histograms suggest that the average height of UK Males is greater than that of UK Females. I can tell this because the mean height for UK males is 164cm, whereas the mean height for UK females is 158.5cm. This is quite a large difference, and it supports the alternate hypothesis.
For females the highest frequency density was in the 160 ≤ h < 170 category, where it was 2.5. This is the modal class for UK females. For males the highest frequency density was in the 170 ≤ h < 180 category, where it was 1.9. This is the modal class for UK males. This is another indication that UK males are on average taller than UK females.
I think that in general Investigation 3 went successfully. Using a histogram allowed me to use groups with different interval widths, because in a histogram it is the area of a bar, not its height, that shows the size of the group. The stratified sampling of the data was a much fairer method than random sampling because it ensured that each age group was represented equally in both populations.