- It is a simple way to sort the data initially.
- It immediately gives a rough idea of the mean with only a small amount of calculation, and the exact median can also be found with little calculation.
- We can see the rough shape a graph would take.
The first stem and leaf diagram will be unordered. I will then draw a second diagram with the data in order.
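As an illustration, an ordered stem and leaf diagram can be built with a few lines of Python. This is only a sketch: the heights are made-up values in centimetres, not the survey data, and the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole-number values into stems (tens) and ordered leaves (units)."""
    stems = defaultdict(list)
    # Sorting first means every row of leaves comes out in order.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)


# Hypothetical heights in cm, for illustration only.
heights = [152, 147, 160, 155, 149, 163, 158, 151]
diagram = stem_and_leaf(heights)

for stem in sorted(diagram):
    leaves = " ".join(str(leaf) for leaf in diagram[stem])
    print(f"{stem:>3} | {leaves}")
```

Each printed row shows one stem followed by its leaves, so the longest row marks where most of the data lies.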
Year 7 Data
Males Females
47% population 53% population
Straight away it is possible to see that, on average, males are slightly taller than females in this age group. This is a simple observation.
I have now sorted my data.
Year 11 Data
Males Females
47% population 53% population
Males are clearly taller than females in Year 11.
Year 12 Data
Males Females
47% population 53% population
Sorted data
Drawing Conclusions from the data
As the box and whisker diagrams that accompany this work show, the data is easy to compare visually. We can draw several conclusions from the diagrams when they are laid out side by side. The simplest observation is that males are, on average, taller than females in the years we have surveyed. We can also see that, on average, boys experience the height changes of puberty earlier and have finished growing by Year 11: the male means for Years 11 and 12 are almost identical, whereas the means for the Year 11 and 12 girls are still increasing. With more data across more age groups we could estimate a mean age of puberty. With the amount of data available to us we can say that females grow later in their teenage years.
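A box and whisker diagram is drawn from five figures: the minimum, lower quartile, median, upper quartile and maximum. The sketch below shows one way to work these out, assuming the quartiles are taken as the medians of the lower and upper halves of the ordered data (other quartile conventions give slightly different answers). The heights are hypothetical, not the survey data, and `five_number_summary` is an illustrative name.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    # When n is odd, the middle value is left out of both halves.
    upper_half = data[half + n % 2:]
    return (
        data[0],
        statistics.median(lower_half),
        statistics.median(data),
        statistics.median(upper_half),
        data[-1],
    )


# Hypothetical heights in cm, for illustration only.
heights = [147, 149, 151, 152, 155, 158, 160, 163]
summary = five_number_summary(heights)
print(summary)
```

Working out one summary per group (for example males and females in a single year) gives the figures needed to draw the boxes side by side.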
Hypothesis B Hand span and Height are directly related
To see if these two are directly related I will use Spearman's rank correlation coefficient. I will take data from one year group and use the coefficient to see whether hand span and height are related, that is, whether they are ranked closely together.
I will use the following formula to find out whether there is any correlation between the two sets of rankings.
$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

Where n is the number of pairs of data and d is the difference between the two ranks in each pair.
The result of the calculation is 0.61, which is an encouraging result when the sample size is taken into account. The minimum value at which a rank correlation is likely to be genuine for this sample is 0.46, so 0.61 is well above it. Therefore in this case there is a clear link between the rankings of the two sets of data. This supports hypothesis B by showing a clear link between hand span and height. We can therefore say quite confidently that all bodies are roughly within set proportions.
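The rank calculation can be sketched in Python as follows. The hand span and height values are hypothetical and are not the survey data, so the answer will not be 0.61; the function names `ranks` and `spearman` are also illustrative. Tied values are given their average rank, in which case this formula is only an approximation.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rank coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical measurements in cm, for illustration only.
hand_span = [18.0, 20.5, 19.0, 22.0, 17.5, 21.0]
height = [150, 165, 162, 170, 148, 160]

rho = spearman(hand_span, height)
print(round(rho, 2))
```

A value close to 1 means the two rankings agree closely, a value close to 0 means there is little agreement, and a value close to -1 means the rankings are reversed.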
Hypothesis C Heart rate and age are unrelated.
To find out whether this hypothesis is true I will work out the mean heart rate for each year. I think heart rate and age are unrelated because I believe that heart rate is directly related to the physical fitness of the subject.
I will work out the mean and standard deviation for each year group I am working with. I will then be able to appreciate whether we have a normal spread across the data.
Year 7
The total is 143.43. I need to divide this by the number of sets of data, which is twenty.
Then I need to square root the answer, 13124.55.
This gives us the standard deviation figure of 25.5.
Standard Deviation 25.5
Mean 67.5
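The working for the mean and standard deviation can be sketched in Python. This assumes the total referred to is the sum of the squared differences from the mean, and that it is divided by the number of values (the population form of the standard deviation). The pulse rates are hypothetical, not the Year 7 data, and `mean_and_sd` is an illustrative name.

```python
import math


def mean_and_sd(values):
    """Return the mean and the population standard deviation of the values."""
    n = len(values)
    mean = sum(values) / n
    # Step 1: total the squared differences from the mean.
    total = sum((value - mean) ** 2 for value in values)
    # Step 2: divide the total by the number of values.
    variance = total / n
    # Step 3: take the square root.
    return mean, math.sqrt(variance)


# Hypothetical pulse rates in beats per minute, for illustration only.
pulse = [60, 72, 65, 80, 58, 70, 75, 68]
mean, sd = mean_and_sd(pulse)
print(round(mean, 1), round(sd, 1))
```

A spreadsheet function that divides by n - 1 instead of n (the sample form) will give a slightly larger figure for the same data.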
Year 11
At this stage I also realised I could save myself a lot of time by creating a spreadsheet in Microsoft Excel that would provide me with an answer for Standard Deviation by just entering the original data.
Year 12
Do my results fit into a normal distribution?
We know that there are certain facts surrounding a normal distribution and standard deviation. This diagram represents a normal spread. I have decided to use my Year 11 results as my example and compare them with a normal distribution. Below is how my data fits into a normal distribution.
My data fits well into a normal distribution. Only one result out of the twenty, 5%, falls outside plus or minus two standard deviations of the mean. 75% of the results, a little over the expected 68%, fall inside plus or minus one standard deviation, which is also pleasing.
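This kind of check can be sketched in Python by counting the share of results that fall within one and two standard deviations of the mean. In a normal distribution roughly 68% of values lie within one standard deviation and roughly 95% within two. The pulse rates are hypothetical, not the Year 11 data, and `share_within` is an illustrative name.

```python
import math


def share_within(values, k):
    """Return the fraction of values within k standard deviations of the mean."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (dividing by n).
    sd = math.sqrt(sum((value - mean) ** 2 for value in values) / n)
    inside = [value for value in values if abs(value - mean) <= k * sd]
    return len(inside) / n


# Hypothetical pulse rates in beats per minute, for illustration only.
pulse = [60, 72, 65, 80, 58, 70, 75, 68]

within_one = share_within(pulse, 1)
within_two = share_within(pulse, 2)
print(f"Within 1 SD: {within_one:.1%}, within 2 SD: {within_two:.1%}")
```

With a sample this small the shares can only move in large steps, so a close match to 68% and 95% should not be expected.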
Drawing Conclusions
The high standard deviation for the Year 7s may indicate either that their ability to carry out a seemingly simple task is hampered by their level of maturity, or that there is a genuinely wide range in the answers because the class included both athletes and less athletic "couch potato" characters. The closeness of the two standard deviation results for Years 11 and 12 is interesting because of how far apart the two respective means are. The difference in the means would suggest that over the course of the year from Year 11 to Year 12 the students' health deteriorates at quite a rate. This may suggest that the stress of so many exams is taking its toll on the student population. If this is not the case, then it highlights a problem with my statistical methods; for example, the sample is not big enough for the results I am trying to correlate.
In response to my original hypothesis, the results would suggest that age and pulse rate are related. But I think we need to take into account the spread of the ages we are looking at. Although there is an increase of sorts across all three year groups, the age gap between the first two groups is large yet the difference in their results is relatively minor, whereas the age gap between the Year 11 and Year 12 data is just a single year and we see a vast increase. When we see this on a graph with the means plotted, the absurdity of the results is all too apparent. In future, if repeating this, I would therefore use a bigger sample to find better results, because a larger sample better represents the full population. I would therefore safely dismiss any relationship between age and pulse rate over the ages we have covered.