Statistics - Mayfield High database - TV watching and exam results are negatively correlated.

Maths Coursework

Statistics

By Arnita Manandhar

Form 11S

Mrs Dodds

For this Statistics coursework, I will be studying one hypothesis and will try to prove it in three different ways. To do this I need a fairly large amount of data, so I have used the Mayfield database. (For reference, Mayfield School is a fictitious school.) Before making my hypothesis, I needed to cut the data down to only what I needed for the coursework. Of the 1183 students in the database, I have used only Year 11, which has 170 students in total, as they would have been preparing for their GCSEs, which links directly to their exam results. I will also use gender and average hours of TV watched per week as factors in my hypothesis.

HYPOTHESIS:

It is said that the more TV you watch, the more likely it is to affect your exam results in a negative way. Relating to that, it is also commonly believed that girls tend to achieve greater success in their exam results than boys, and so it seems reasonable to predict that boys watch more TV than girls. I personally agree with this theory, and so to prove it I will split this hypothesis into three parts:

TV watching and exam results are negatively correlated;
Boys watch more TV than girls;
And hence, girls have the best exam results overall.

The Mayfield database was not created or collected by me, so I cannot call the data I use in this coursework primary data. I am using secondary data because collecting enough data of my own to sample from would take too much time. Because it is secondary data, it needs to be studied in more detail in case of any anomalous results: the data is already not 100% reliable, so it is essential to make it as accurate as possible.

For the whole of the hypothesis I initially planned to sample at least 50 students, because I believe that number is more or less suitable to represent the whole population, but small enough to be manageable and less confusing to understand. However, I have decided to use 100 students so that I have a wider range of results, which should make my coursework more reliable. The 100 I choose will be stratified into 50 boys and 50 girls. The same sample will be used for all three parts of my hypothesis so that I can compare them properly; if I had chosen different people for the different parts, I would not be able to tell whether my theory was correct or not.

To select my 100 Year 11 students, I will first use stratified sampling to separate the data into two groups by gender, boys and girls. Then I will use random sampling on each group to select 50 from each. To do this simply, I will use a calculator: press shift and ran, then multiply the result by the number of girls, which is 86. Then, using the same method, I enter the number of boys, which is 84. I will use this random method so that I am sure that every student has an equal chance of being selected, thus making the sample fair.
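The calculator method described above can be sketched in Python. This is only an illustration under some assumptions: the students in each group are taken to be numbered from 1 upwards, the random decimal is rounded up to the next whole number, and any repeated number is simply drawn again. The function names and the seed are made up for the example and are not part of the coursework.

```python
import random


def calculator_pick(group_size, rng):
    # A calculator's random key gives a decimal from 0 up to (but not including) 1.
    # Multiplying by the group size and moving up to the next whole number
    # gives a student number between 1 and group_size.
    return int(rng.random() * group_size) + 1


def pick_students(group_size, how_many, rng):
    # Keep drawing until we have the required number of different students.
    # Assumption: a repeated number is ignored and drawn again.
    chosen = set()
    while len(chosen) < how_many:
        chosen.add(calculator_pick(group_size, rng))
    return sorted(chosen)


rng = random.Random(11)  # fixed seed only so the example can be repeated
girls_sample = pick_students(86, 50, rng)  # 86 girls in Year 11
boys_sample = pick_students(84, 50, rng)   # 84 boys in Year 11

print(len(girls_sample), len(boys_sample))
```

Each student number in a group is equally likely to come up on every draw, which is what makes the sample fair.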

For the first part of my hypothesis, I will present my results on a scatter graph and will also work out the coefficient of regression, the average and the mode. The results of the second part of my hypothesis will be illustrated with stem and leaf diagrams and box plots, as well as a population pyramid. The results of the third part of my hypothesis will be presented on histograms.
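As a rough sketch of the calculations for the first part, the Python below works out the gradient of the least-squares line of best fit (taken here to be what "coefficient of regression" means), the mean and the mode. The five pairs of values are invented purely to show the method and are NOT taken from the Mayfield database; the function name is also hypothetical.

```python
from statistics import mean, mode


def regression_slope(xs, ys):
    # Gradient of the least-squares line of best fit: Sxy / Sxx.
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Made-up values for illustration only, not real Mayfield data.
tv_hours = [5, 10, 10, 20, 30]
exam_scores = [14, 12, 13, 10, 8]

slope = regression_slope(tv_hours, exam_scores)
average_tv = mean(tv_hours)
most_common_tv = mode(tv_hours)

# A negative slope means exam scores fall as TV hours rise.
print(slope, average_tv, most_common_tv)
```

If the slope for the real sample is negative, the line of best fit on the scatter graph slopes downwards, which would support a negative correlation.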

The reason I have not chosen to use a systematic sample is that it would not work with the second and third parts of my hypothesis. What I mean by this is that there are too few students in each group: it would be almost like picking the first 50 results, rather than being fairer.
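The problem with systematic sampling here can be shown with a short sketch. It assumes the usual classroom version of the method, where the step is the group size divided by the sample size, rounded down to a whole number, and the students are numbered from 1. The function name is made up for the example.

```python
def systematic_sample(group_size, how_many, start=1):
    # Step between chosen students, rounded down to a whole number.
    step = group_size // how_many
    return [start + i * step for i in range(how_many)]


girls = systematic_sample(86, 50)  # step is 86 // 50 = 1
boys = systematic_sample(84, 50)   # step is 84 // 50 = 1

print(girls[:5], girls[-1])
```

Because the step works out as 1 for both groups, the method just picks students 1 to 50 and the students at the end of each list could never be chosen.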

The next four pages show the gender, the average number of hours of TV watched per week and the total Key Stage Two exam score for all of Mayfield’s Year 11 students.

...