Gulliver's theory - introduction

“… they measured my right thumb, and desired no more; for by a mathematical computation, that twice round the thumb is once round the wrist, and so on to the neck and the waist …”

(Extract from Gulliver’s Travels by Jonathan Swift)

Investigate using any valid statistical method.

Aim

The aim of this coursework is to test whether Gulliver’s theory, as stated in the extract above, holds in reality.

Hypothesis

I agree with the theory to some extent: when I measured myself, I found my measurements to be consistent with the theory to within ± 4 cm.

However, since I have only tested it on myself so far, I cannot apply this rule to everyone, as there are many factors to take into account.
Because of this, I believe that the theory is restricted to certain groups of people (e.g. those whose body parts are in direct proportion to one another) and may not hold for the majority, as a number of factors can affect it.

One factor that can alter the consistency of the theory is gender. Boys tend to have a larger build than girls, and hence I do not believe the theory to be true for boys. For boys I would say that ‘thrice round the thumb is once round the wrist, twice round the wrist is once round the neck and two and a half times round the neck is once round the waist’, whereas for girls I would agree with Gulliver.
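The two sets of ratios can be compared with a short worked example. This is only a sketch: the names `GULLIVER`, `BOYS_HYPOTHESIS` and `predict` are made up for illustration, and the 6.0 cm thumb is a hypothetical value, not a measurement from the data.

```python
# Ratios between successive measurements: thumb -> wrist -> neck -> waist.
GULLIVER = (2.0, 2.0, 2.0)         # "twice round the thumb is once round the wrist, and so on"
BOYS_HYPOTHESIS = (3.0, 2.0, 2.5)  # ratios suggested for boys in the hypothesis


def predict(thumb_cm, ratios):
    """Return the predicted (wrist, neck, waist) circumferences in cm."""
    wrist = thumb_cm * ratios[0]
    neck = wrist * ratios[1]
    waist = neck * ratios[2]
    return wrist, neck, waist


# Hypothetical thumb circumference of 6.0 cm (illustration only).
print(predict(6.0, GULLIVER))         # (12.0, 24.0, 48.0)
print(predict(6.0, BOYS_HYPOTHESIS))  # (18.0, 36.0, 90.0)
```

The same thumb measurement leads to very different predicted waists under the two rules, which is why the comparison between boys and girls matters.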

Age also has to be taken into consideration. A child and an adult differ in many ways, and applying the same theory to both groups would seem unreasonable. An adult is physically mature, whereas a child has yet to go through puberty. For example, a child has not yet fully developed a figure whereas an adult has, which affects the waist measurement.
In addition, some people are overweight and others underweight, which may also cause a difference in measurements.
Although age and size are factors that could affect the theory’s consistency, my main comparison will be between the genders.

Data to be used and collected

Primary data - data collected by the researchers themselves.
Secondary data - data collected by others and re-used by the researcher.

I will use secondary data, as it can be obtained at a fraction of the cost, time and inconvenience of collecting primary data.

However, secondary data has limitations, and there are practical problems that may be encountered (some have already been mentioned above).

LACK OF AVAILABILITY - since the data has already been collected and handed over, you cannot be sure whether bias has been avoided. Also, since there is no access to the boys’ side, you cannot be sure how their data has been handled.
BIAS - some pupils may have altered their measurements in order to influence the result of the investigation. However, this should show up when the graphs are plotted, since any anomalies that lie far from the general pattern will be visible. These anomalies will then be discarded.
INACCURATE DATA - again, some pupils may have altered their measurements, and younger pupils may not know how to measure themselves properly, which would lead to inaccurate data. To reduce the chances of this, I have gone to one of the younger classes and taken their measurements myself.

Sample size
I have divided my sample into 3 main categories: the highest, middle and lowest classes.
My sample therefore consists of the following year groups, with 20 pupils from each (10 boys and 10 girls), giving a total of 60 results:
1) year 4 girls/boys
2) year 7 girls/boys
3) year 11 girls/boys

Using every class, boys and girls, would be impractical, since it would mean analysing more than 100 different results and doing far more calculations than necessary. Choosing the highest, lowest and middle classes therefore seems sensible.

The measurements taken will be of the thumb, wrist, neck and waist. Three measurements (top, middle and bottom) will be taken of the thumb and of the waist, and I will use the average of each.
Unfortunately, there are only 10 girls in year 4 in total, so I cannot carry out my sampling method on them: my sample size is already 10 from each group, so every pupil will be included. The same situation has arisen for the year 4 and year 11 boys.

Sampling method

Simple random sampling is when a group of subjects (a sample) is chosen from a larger group (a population). Each subject from the population is chosen randomly and entirely by chance, such that each subject has the same probability of being chosen at any stage during the sampling process.
Systematic sampling is the selection of every kth element from a sampling frame, where k, the sampling interval, is calculated as:

k = population size (N) / sample size (n)

Using this procedure each element in the population has a known and equal probability of selection.
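The interval calculation can be sketched in a few lines. The function name `systematic_sample` and the list of 22 numbered data sets are assumptions made for illustration; the sketch also rounds k down to a whole number, which the formula above does not specify.

```python
import random


def systematic_sample(items, n, rng=random):
    """Pick every k-th item after a random start, with k = N // n."""
    population_size = len(items)
    k = population_size // n      # sampling interval, rounded down
    start = rng.randrange(k)      # random starting point inside the first interval
    return [items[start + i * k] for i in range(n)]


data_sets = list(range(1, 23))    # hypothetical: 22 numbered data sets
sample = systematic_sample(data_sets, 10, random.Random(1))
print(sample)
```

When N is not an exact multiple of n, as with 22 and 10 here, rounding k down means the last few items can never be chosen, so the equal-probability property only holds exactly when N divides evenly by n.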

Stratified random sampling is when a random sample of specified size is drawn from each stratum of a population.
There may often be factors which divide up the population into sub-populations (groups / strata) and we may expect the measurement of interest to vary among the different sub-populations. This has to be accounted for when we select a sample from the population in order that we obtain a sample that is representative of the population. This is done by stratified sampling.
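As a rough sketch of the idea, a fixed-size random sample can be drawn from each stratum separately. The function name `stratified_sample` and the group sizes below are hypothetical and are not taken from the actual data.

```python
import random


def stratified_sample(strata, per_stratum, rng=random):
    """Draw a simple random sample of the same size from every stratum."""
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }


# Hypothetical strata: numbered data sets for two groups.
strata = {
    "girls": list(range(1, 23)),  # 22 data sets
    "boys": list(range(1, 19)),   # 18 data sets
}
result = stratified_sample(strata, 10, random.Random(2))
print(result)
```

Each group is guaranteed to be represented by exactly 10 pupils, which a single random draw from the combined list would not guarantee.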

I do not think stratified sampling is needed, since the sample is not large enough to need dividing into sub-groups, and it is more time-consuming than the other methods.
In systematic sampling, the researcher must ensure that the chosen sampling interval does not coincide with a pattern in the data, as any such pattern would threaten randomness. A random starting point must also be selected. If this is not done, bias may be introduced.

I have therefore chosen simple random sampling, as I find it the most convenient method and sufficient for this type of investigation.
I will do this by using the random function on the calculator (Ran#), which generates random numbers that can be scaled up to a given value. For example:
if I have 10 sets of data, I will enter the number 10 and apply the random function on the calculator. This will give me a random number from 1 to 10, from which I will get my first of five values. (Decimal numbers will be rounded to the nearest whole number.)
A worked example is given below:

For year 7 girls there are 22 sets of data; I need 10, since that is my required sample size.
1) 22Ran# = 8.12 [8]: my first value will be taken from my 8th data set.
2) 22Ran# = 6.97 [7]: my second value will be taken from my 7th data set, and so on.
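The calculator procedure can be imitated as follows. This is a sketch under assumptions: `calculator_style_sample` is a made-up name, `rng.random()` stands in for Ran#, and the rule of redrawing a 0 or a repeated number is an assumption, since the method above does not say what happens in those cases.

```python
import random


def calculator_style_sample(n_sets, sample_size, rng=random):
    """Imitate 'N x Ran#' rounded to the nearest whole number."""
    chosen = []
    while len(chosen) < sample_size:
        value = round(n_sets * rng.random())  # rng.random() lies in [0, 1)
        # Assumed rule: redraw if the result is 0 or has been picked already.
        if 1 <= value <= n_sets and value not in chosen:
            chosen.append(value)
    return chosen


picks = calculator_style_sample(22, 10, random.Random(0))
print(picks)

# Alternative: draw 10 distinct data-set numbers from 1..22 in one step.
shortcut = random.Random(0).sample(range(1, 23), 10)
print(shortcut)
```

Rounding to the nearest whole number makes the top value (22 here) about half as likely as the others, because only results from 21.5 upwards round to it. `random.sample` avoids this and never repeats a number, so it is closer to a true simple random sample.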

Graphs
The graphs I intend to use to show the distribution of the data range from histograms to scatter graphs.

Histogram - a representation of a frequency distribution by means of bars, whose widths represent class intervals and whose areas are proportional to the corresponding frequencies.
Since the data is continuous, a histogram is appropriate; it also makes it easier to compare distributions and to estimate the mean.

Cumulative frequency diagram - will help analyse the data by giving an estimate of the median and, if needed, the inter-quartile range, which shows how consistent the measurements being tested are.
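Reading the median and quartiles off a cumulative frequency diagram amounts to linear interpolation between the plotted points. The sketch below assumes this approach; the function name `value_at`, the class boundaries and the frequencies are all hypothetical.

```python
def value_at(cum_target, upper_bounds, cum_freqs, lowest_bound):
    """Read a value off the cumulative frequency polygon by linear interpolation."""
    prev_bound, prev_cum = lowest_bound, 0
    for bound, cum in zip(upper_bounds, cum_freqs):
        if cum_target <= cum:
            fraction = (cum_target - prev_cum) / (cum - prev_cum)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cum = bound, cum
    return upper_bounds[-1]


# Hypothetical wrist classes 12-14, 14-16, 16-18, 18-20 cm
# with frequencies 2, 4, 3, 1 (10 pupils in total).
upper_bounds = [14, 16, 18, 20]
cum_freqs = [2, 6, 9, 10]
n = cum_freqs[-1]

median = value_at(n / 2, upper_bounds, cum_freqs, 12)
lower_quartile = value_at(n / 4, upper_bounds, cum_freqs, 12)
upper_quartile = value_at(3 * n / 4, upper_bounds, cum_freqs, 12)
iqr = upper_quartile - lower_quartile

print(median, lower_quartile, upper_quartile, iqr)  # 15.5 14.25 17.0 2.75
```

A smaller inter-quartile range means the middle half of the measurements lie closer together.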

Pie chart - a visual representation of data as proportions of a whole. Each sector is proportional to the quantity it represents. It is very useful for comparisons, as these can be made at a glance from the diagram.
Moreover, the mode can easily be identified from this graph, as it is the largest sector.

Frequency diagram - used to summarise and display the distribution of a data set graphically. It is useful for comparing distributions and identifying the modal class and the least common class.

Scatter graphs - I will be using this graph as it is a good way of illustrating two sets of data and establishing whether there is a relationship between them, and of what type (in this case thumb/wrist, wrist/neck, etc.).
Furthermore, by drawing a line of best fit, a gradient can be determined, which will in effect tell us how true the theory is: if twice round the thumb is once round the wrist, for example, the gradient of wrist against thumb should be close to 2.

Calculations
To test the theory I will carry out a number of tasks and calculations.

This will involve analysing, comparing and interpreting graphs, e.g. finding the gradient, the correlation and so on.
From this I hope to determine whether there is a relationship between the body parts in accordance with the theory.

The gradient {for the scatter graph(s)} will be calculated as:

Gradient = (difference in y) / (difference in x) = dy / dx
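As an alternative to reading dy and dx off a hand-drawn line, the gradient of a line of best fit can be computed by least squares. This swaps in a different technique from the one described above, and the thumb and wrist values are hypothetical; `best_fit_gradient` is a made-up name.

```python
def best_fit_gradient(xs, ys):
    """Least-squares gradient of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical measurements in cm (illustration only).
thumbs = [5.0, 6.0, 7.0, 8.0]
wrists = [10.0, 12.5, 13.5, 16.0]

gradient = best_fit_gradient(thumbs, wrists)
print(gradient)  # approximately 1.9
```

A gradient close to 2 for wrist against thumb would be consistent with "twice round the thumb is once round the wrist", provided the line also passes near the origin.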

I will also work out the modal class and the least common class, the median and the mean. The modal class is the group with the highest frequency, and the least common class is the group with the lowest frequency.

The median {for the cumulative frequency diagram} will be calculated as:

The mean {for the histogram} will be calculated using the following formula:

Mean (x̄) = ∑fx / ∑f
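For grouped data, x is usually taken to be the midpoint of each class and f its frequency. A small sketch, using hypothetical classes and frequencies (the name `grouped_mean` is made up for illustration):

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data as sum(f * x) / sum(f)."""
    total_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    total_f = sum(frequencies)
    return total_fx / total_f


# Hypothetical wrist classes 12-14, 14-16, 16-18, 18-20 cm.
midpoints = [13, 15, 17, 19]
frequencies = [2, 4, 3, 1]

mean = grouped_mean(midpoints, frequencies)
print(mean)  # 15.6, since 156 / 10 = 15.6
```

The result is only an estimate, because every value in a class is treated as if it were equal to the class midpoint.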

Another calculation I will do is percentage error (including average percentage error), which will help me decide how close the theory is to reality, as any error will lead to inaccurate data and conclusions. Percentage error shows how close a value comes to the actual or accepted amount.
To work out the percentage error I will use the following formula:
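A common definition, assumed here, is percentage error = |measured - accepted| / accepted × 100, with the average percentage error being the mean of the individual errors. Which value counts as "accepted" (the measurement or the theory's prediction) is also an assumption in this sketch, and the figures are hypothetical.

```python
def percentage_error(measured, accepted):
    """Assumed definition: |measured - accepted| / accepted * 100."""
    return abs(measured - accepted) / accepted * 100


def average_percentage_error(pairs):
    """Mean of the percentage errors over (measured, accepted) pairs."""
    errors = [percentage_error(m, a) for m, a in pairs]
    return sum(errors) / len(errors)


# Hypothetical example: the theory predicts a 12.0 cm wrist, the tape shows 15.0 cm.
single = percentage_error(15.0, 12.0)
average = average_percentage_error([(15.0, 12.0), (11.0, 10.0)])
print(single, average)  # approximately 25.0 and 17.5
```

A low average percentage error across a group would suggest the theory describes that group well.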

Plan
The plan is to see whether the theory is true.

To do this I will investigate using the statistical methods explained above, which will involve interpreting, analysing and comparing graphs, as well as calculations.
I have already stated my aim and hypothesis, as well as my choices, i.e. simple random sampling, the specific graphs, etc.
I will now carry out my sampling method to obtain the required data and then plot it on various graphs. Calculations (e.g. gradient, mean, mode) will then be done to see how valid the theory is.
