Selecting Secondary Data
Method
The Mayfield School Excel spreadsheet holds over 800 pieces of data about fictitious pupils attending Mayfield School. Because this is far more data than the coursework requires, we had to narrow it down. For the purpose of my coursework, I felt that 60 pieces of data was a sufficient sample.
There are many different types of sampling I could have used to select my data:
Systematic Sampling – Every member of the sample is chosen at regular intervals from the list. A sample chosen in this way can be biased if low or high values occur in a regular pattern.
Random Sampling – Every member of the population has an equal chance of being selected. Random samples need to be carefully chosen. A random sample is one of the ways to avoid biased data.
Stratified Sampling – A population may contain separate groups or strata. Each group needs to be fairly represented in the sample. The number from each group is proportional to the group size. The selection is made at random from each group.
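As a rough illustration, systematic and stratified sampling as described above could be sketched in Python like this. The function names, the population of 100 numbered pupils and the two year groups are all made up for the example and are not taken from the Mayfield data.

```python
import random

def systematic_sample(items, n, rng=random):
    """Pick n items at regular intervals, starting from a random offset."""
    step = len(items) // n          # interval between chosen items
    start = rng.randrange(step)     # random starting point within the first interval
    return [items[start + i * step] for i in range(n)]

def stratified_sample(groups, n, rng=random):
    """Pick about n items, taking from each group in proportion to its size."""
    total = sum(len(members) for members in groups.values())
    sample = []
    for members in groups.values():
        share = round(n * len(members) / total)   # proportional share, rounded
        sample.extend(rng.sample(members, share)) # random choice within the group
    return sample

# Made-up population: 100 numbered pupils split into two year groups.
pupils = list(range(1, 101))
groups = {"Year 7": pupils[:60], "Year 8": pupils[60:]}

print(systematic_sample(pupils, 10))
print(stratified_sample(groups, 10))
```

With these made-up groups, the stratified sample takes 6 pupils from the group of 60 and 4 from the group of 40. Because each share is rounded, the total can be slightly more or fewer than n for other group sizes.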
Other types of data sampling include convenience sampling, quota sampling and cluster sampling; however, I felt that they were neither relevant nor useful for my coursework project. Out of the three relevant methods of data sampling, I decided to use random sampling, as it was the least biased and the easiest to use of the three. I went onto the internet and found a random number generator. I entered the specifics I needed for my sample, and the random number generator produced sixty random numbers from the more than eight hundred pieces of data in the spreadsheet. I then deleted the data I no longer needed, leaving only my random sample of sixty pieces of data from which to generate graphs and other results.
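The random number generator step could be reproduced along the following lines. This is only a sketch: the population size of exactly 800, the seed value and the variable names are assumptions, and it is not the online generator that was actually used.

```python
import random

POPULATION_SIZE = 800   # assumed: pupils numbered 1 to 800 in the spreadsheet
SAMPLE_SIZE = 60

# Fixed seed (an arbitrary choice) so that the same sample can be produced again.
rng = random.Random(2024)

# Choose 60 different row numbers, each row being equally likely to be picked.
chosen_rows = sorted(rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE))

print(chosen_rows)
```

Sampling without replacement means that no pupil can be picked twice, so the sample always contains sixty different pupils.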
Selected Data
Random Sample
Plan of Action
Using my newly refined and minimised data, I plan to make three scatter graphs: one showing whether there is a correlation between height and weight, another showing weight against average number of hours of television watched, and another showing average Key Stage 2 SAT results against average number of hours of television watched per week. I also plan to find the ranges of certain parts of my data, and to use some form of averaging. I may also draw other forms of graphs, such as cumulative frequency polygons and histograms.
I also plan to collect some of my own data from pupils in my school, similar to the data on the Mayfield School spreadsheet, so that I can compare the two sets of data.
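The strength of a correlation on a scatter graph, and the range of a set of data, can also be worked out numerically. The sketch below uses Pearson's correlation coefficient, which is one common measure and is an assumption on my part, since the plan does not name one. The heights and weights are made up for illustration and are not from the Mayfield spreadsheet.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient: close to +1 or -1 for strong correlation, near 0 for none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def value_range(values):
    """Range: the largest value minus the smallest value."""
    return max(values) - min(values)

# Made-up heights (cm) and weights (kg), for illustration only.
heights = [150, 155, 160, 165, 170]
weights = [42, 47, 50, 56, 60]

print(round(pearson_r(heights, weights), 3))
print(value_range(heights))   # 170 - 150 = 20
```

A value close to 1 would suggest a strong positive correlation, which on the scatter graph appears as points lying close to an upward-sloping line of best fit.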
Questionnaire
Name:
Average hours of TV watched per week:
Height (cm):
Weight (kg):
Key Stage 2 SAT results.
Please tick the correct box.
English: 1 2 3 4 5
Maths: 1 2 3 4 5
Science: 1 2 3 4 5
Limitations
In my GCSE Statistics coursework, there are some aspects of my work that I would change. If there had been more time, I might have used a stratified sample when collecting my primary data, as I feel some of the information I found with my questionnaire may have been slightly biased and not representative of the whole school. I must also mention that two of the 60 participants failed to return their questionnaires, so only 58 pieces of data are being used. I also think that it would have been a good idea to explore more types of graphs and charts, as I feel some of my results could have been enhanced by a wider variety of ways to display my data, but I did not feel that there was enough time or urgency to use so many different types. If there had been more time, it would also have been good to compare the data for Key Stage 3 pupils with the data for Key Stage 4 pupils. I did not look at the data given for Key Stage 4 pupils at all, as I thought I had enough to do with everything I was looking at for Key Stage 3.
I have come across a number of difficulties throughout my coursework. The most prominent of these was using the Microsoft Excel computer program. I could sort data on the spreadsheet, and manage with most of the functions to find sums, averages and so on, but when it came to turning the data into graphs, I really struggled. After battling with the graphs on the spreadsheet for a good while, I decided that it was both easier and preferable for me to draw the graphs by hand. If there had been more time, I think I would have liked the opportunity to work out for myself how to draw graphs on the computer.
I found doing the coursework very time-consuming and sometimes quite frustrating; however, overall I really enjoyed the experience of putting my Maths skills to practical use. I also found writing up all the analyses and conclusions very helpful, as it is not often that I get to write in detail about mathematical processes. This piece of coursework has really helped me to improve and revise my knowledge of statistics, and I found it very beneficial.
Primary Data Introduction
For my coursework I needed to show that I could work not only with secondary, ready-provided data, but also with data which I had specified and collected myself, selecting it and displaying it in graphs and charts. In order to do this I had to prepare a questionnaire to hand out to various pupils around the school.
Selecting Primary Data
Ideally, I would have liked to hand my questionnaire to a stratified sample of 60 pupils in the school, giving it to the correct proportion of each year group. However, in the end I decided to take a random sample from all the year groups, as this was a much quicker and easier task to perform as far as my coursework was concerned. I based my questions on the data given in the Mayfield School spreadsheet, asking the participants to tell me the average number of hours of television they watched per week, their height in centimetres, their weight in kilograms, and their Key Stage 2 English, Maths and Science SAT scores. With these findings, I intend to make graphs similar to the ones I drew using the secondary data, and to compare the results of the two. I predict that if my hypothesis is correct, and my samples are both as fair as possible, then the graphs that I compare will be very similar to each other.
Selected Primary Data
Averages
There are three main types of average that I am considering using in this Statistics coursework.
Mean Average
This is the most commonly used type of average, and it takes every value in the data into account. It is found by adding up all the values and then dividing the total by the number of values.
Median Average
The median average is also fairly accurate, but it can occasionally be tedious to find. It involves putting the values in numerical order and then counting in to the middle value, which is the median. When there is an even number of values, the median is the mean of the two middle values.
Modal Average
In my opinion, the easiest average to use is the modal average. This is simply the value that occurs most often.
For my coursework I decided to use the modal average to find the averages of my data, because it is an easy and simple method, which will require me to use tally charts to work out my modal averages.
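For comparison, the three averages and a tally chart can be worked out in a few lines of Python. This is only a sketch using the standard statistics module, and the hours of television below are made up for illustration rather than taken from my samples.

```python
from collections import Counter
from statistics import mean, median, multimode

# Made-up hours of TV watched per week, for illustration only.
hours = [10, 14, 14, 7, 21, 14, 10, 30]

# Tally chart: each value paired with how many times it occurs, most frequent first.
tally = Counter(hours)
print(tally.most_common())

print("mean:", mean(hours))          # total of the values divided by how many there are
print("median:", median(hours))      # middle of the sorted values (mean of the two middle ones here)
print("mode(s):", multimode(hours))  # most frequent value(s), given as a list in case of ties
```

For this made-up data the mean is 15, while the median and mode are both 14. The single large value of 30 pulls the mean upwards but does not affect the median or the mode.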
Bibliography
Key Maths Statistics Book
The Pupils of Balshaws High School