Renee Uba

Data Handling Project

In this piece of coursework I am investigating how the weight of a sample of pupils can be affected by various factors, e.g. age, gender and height.

        I have chosen to compare weight with the three factors listed above: age, gender and height. Throughout this investigation I am aiming to show that the weight of a certain group of people may be affected by some factors, and that other factors may have no effect on weight at all.

Plan:

        I have been given a survey of all the pupils in ‘Mayfield School’. I have over 800 cells of data containing the following information on each pupil:

  • Year Group
  • Surname
  • Forename
  • Age (in years and months)
  • Gender
  • IQ
  • Height (m)
  • Weight (kg)

In order to carry out my investigation I need to take a sample from the collected information. For my statistical analysis to be valid I require a sample of over 30 data points. I am going to take a sample size of 60 for each gender, because later on in my investigation I am going to split my data by gender, so I will need 60 data points for males and 60 data points for females. The reason for such a large sample size is that if I come across any outliers I can rule them out completely and the sample will still be large enough for my analysis to be valid. The samples will be taken at random; I am going to select my data points for both males and females by stratified sampling. I am using random sampling so that my investigation is fair (not biased). Stratified sampling will ensure that it is fair because I will be obtaining a fair range of data from different parts of the Mayfield data.
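As a rough illustration of the stratified sampling described above, the following Python sketch draws a random sample from each stratum in proportion to its size. It assumes the strata are year groups within one gender, which the plan does not state explicitly. The `pupils` records, the field names and the group sizes are made up for the example and are not the Mayfield data.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, stratum_key, sample_size, seed=0):
    """Draw about `sample_size` records, proportionally from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Group the records by stratum (assumed here to be the year group).
    strata = defaultdict(list)
    for record in records:
        strata[stratum_key(record)].append(record)

    total = len(records)
    sample = []
    for key in sorted(strata):
        group = strata[key]
        # Each stratum's quota matches its share of the whole population.
        # Rounding means the total can differ slightly from sample_size.
        quota = round(sample_size * len(group) / total)
        sample.extend(rng.sample(group, min(quota, len(group))))
    return sample


# Hypothetical population: one gender only, invented year-group sizes.
sizes = {7: 100, 8: 80, 9: 60, 10: 40, 11: 20}
pupils = [
    {"id": f"{year}-{i}", "year": year, "gender": "F"}
    for year, size in sizes.items()
    for i in range(size)
]

sample = stratified_sample(pupils, lambda p: p["year"], 60)
counts = Counter(p["year"] for p in sample)
```

With these invented sizes the quotas come out as 20, 16, 12, 8 and 4 pupils for Years 7 to 11, which add up to the 60 data points required. The same function would be run once for males and once for females.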

My data is going to be collected in a frequency table, and this information is going to be represented in diagrams using:

  • Scatter diagrams – by drawing the line of best fit I will be able to determine whether, and how, the two variables I am comparing relate to each other.
  • Box and whisker plots – my information is going to be obtained by drawing a stem and leaf diagram and manually counting the data to find the median, lower quartile, upper quartile and interquartile range.  The median is useful as it helps me to find where the middle of ...
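The two calculations behind these diagrams can be sketched in Python as follows. This is only an illustration under stated assumptions: the quartiles use the counting convention in which the lower and upper quartiles are the medians of the lower and upper halves of the sorted data (with the middle value left out when the count is odd), and the line of best fit is calculated by least squares rather than drawn by eye. The weights and heights in the example are invented, not taken from the Mayfield data.

```python
def median(sorted_values):
    """Middle value of an already sorted list (mean of the middle two if even)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def box_plot_summary(values):
    """Median, quartiles and interquartile range for a box and whisker plot."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Assumed convention: leave the middle value out of both halves when n is odd.
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    lq, med, uq = median(lower), median(data), median(upper)
    return {"min": data[0], "lq": lq, "median": med,
            "uq": uq, "max": data[-1], "iqr": uq - lq}


def line_of_best_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the mean point
    return slope, intercept


# Invented weights in kg, for illustration only.
summary = box_plot_summary([40, 45, 47, 50, 52, 55, 60])
# summary["lq"] == 45, summary["median"] == 50, summary["uq"] == 55, summary["iqr"] == 10

# Invented points that lie exactly on y = 2x + 1.
slope, intercept = line_of_best_fit([1, 2, 3, 4], [3, 5, 7, 9])
```

A positive slope would suggest that the second variable tends to increase with the first, and a slope close to zero would suggest little or no relationship between them.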
