I will use simple random sampling for my sub hypothesis. I will randomly choose 20 pages from each newspaper and then count how many pictures are on each sampled page. There are a number of problems that could be encountered, for instance whether or not to include diagrams, logos and adverts in the count. I will not count diagrams and logos, but to avoid too much confusion I will include pictures that are part of adverts.
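A minimal sketch of how the 20 pages could be drawn in Python. The function name `choose_pages` and the 80-page newspaper are assumptions made for illustration; they are not figures from the newspapers used.

```python
import random

def choose_pages(total_pages, sample_size=20, seed=None):
    """Pick sample_size different page numbers, each page equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no page is chosen twice
    return sorted(rng.sample(range(1, total_pages + 1), sample_size))

# Hypothetical newspaper with 80 pages (assumed, not from the real data)
pages = choose_pages(total_pages=80, seed=1)
print(pages)
```

Fixing the seed only makes the draw repeatable for checking; leaving it as `None` gives a fresh random selection each time.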
I will not be using cluster sampling, quota sampling, convenience sampling, opinion polls or questionnaires. I am collecting data from newspapers, so asking people through opinion polls, questionnaires and convenience sampling would be pointless. Cluster sampling requires the population to be divided into groups, and quota sampling requires the data in the sample to be of a particular type. Since my raw data fits neither of these, those methods would not work.
For my main hypothesis I will use a variety of diagrams. I plan to use histograms, cumulative frequency diagrams, box plots, population pyramids and comparative pie charts. I will be using all of these because I expect to get a wide range of data, so it will need to be grouped, and these are the best diagrams for grouped data. Because all my data is grouped I will not be using scatter graphs, as these require ungrouped data and their purpose is to show the relationship between two variables, which is not what I am trying to find.
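As an illustration of what grouping the data involves, the sketch below builds a grouped frequency table with a running cumulative frequency, which is what a histogram and a cumulative frequency diagram are drawn from. The class width of 10 and the list of values are invented for the example and are not real results.

```python
from collections import Counter

def grouped_frequencies(values, class_width=10):
    """Return rows of (lower bound, upper bound, frequency, cumulative frequency).

    Each class covers lower <= value < upper. Empty classes are kept so
    that gaps would still show up on a histogram.
    """
    counts = Counter((v // class_width) * class_width for v in values)
    table = []
    running_total = 0
    for low in range(min(counts), max(counts) + class_width, class_width):
        running_total += counts[low]
        table.append((low, low + class_width, counts[low], running_total))
    return table

# Invented words-per-sentence values, for illustration only
example = [12, 7, 23, 18, 31, 9, 14, 26]
for row in grouped_frequencies(example):
    print(row)
```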
These diagrams will also help me with my calculations. From the cumulative frequency diagrams and box plots I will find the median and interquartile range for each of the four newspapers tested, and from the grouped data I will estimate the mean and find the modal class. I will also use Spearman’s Rank to look for a correlation between the two types of newspaper, and between the broadsheet and tabloid papers issued on the same day.
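The calculations named above can be sketched in Python as follows. This is only an outline under stated assumptions: the helper names are made up for the example, the Spearman formula used here, 1 - 6Σd²/(n(n² - 1)), assumes there are no tied values, and which pairs of values are ranked against each other is left open because the plan does not specify it.

```python
from statistics import median, quantiles

def median_and_iqr(values):
    """Median and interquartile range from raw (ungrouped) values."""
    # quantiles() follows one particular quartile convention, so the answer
    # can differ slightly from one read off a cumulative frequency curve
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q3 - q1

def ranks(values):
    """Rank 1 = smallest value. Assumes no two values are equal."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman's rank correlation coefficient for paired data without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))
```

A coefficient close to 1 means the two sets of ranks agree, close to -1 means they are reversed, and close to 0 means there is little or no correlation.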
Selection And Collection Of Data
I collected all of my data from four newspapers: two issues of the Daily Mail and two issues of The Times. Using two issues of each paper meant that the results for my main hypothesis would be more reliable, especially if one paper happened to have an unusually low number of words per sentence on a particular day, which could have led me to the wrong conclusion. For my main hypothesis I collected 100 pieces of data from each newspaper, 400 in all. I believe this was a good sample size, as it is large enough for the results to be reliable and for the risk of bias to be low, while any larger sample would have become too time consuming to obtain.
In order to get a broad range of articles (news, sports, columnists etc.) I used simple random sampling and stratified sampling. I used simple random sampling to choose 20 pages in each newspaper, which ensured that every page had an equal chance of being chosen. I then used stratified sampling to get a fair proportion of sentences from each article, so if an article was short I would only sample a couple of sentences from it, whilst if it was longer I would sample more. This is a good technique for avoiding biased results; however, to reduce that possibility further I then used systematic sampling. This was a simple and quick method for choosing which sentences from the article I would sample. There is the possibility that the data may be unrepresentative if a pattern exists in the text, but I found when collecting my data that this was not the case.
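The stratified and systematic steps described above could be sketched as below. The function names, the article lengths and the total of 8 sentences are all assumptions chosen to keep the example small; they are not the figures used in the real collection.

```python
import random

def allocate(article_lengths, total):
    """Share `total` sentences between articles in proportion to their length."""
    whole = sum(article_lengths)
    # Rounding can make the shares add up to slightly more or less than total
    return [round(total * n / whole) for n in article_lengths]

def systematic_sample(items, k, seed=None):
    """Take k items at a regular interval, starting from a random position."""
    if k <= 0 or not items:
        return []
    k = min(k, len(items))
    step = len(items) // k
    start = random.Random(seed).randrange(step)
    return [items[start + i * step] for i in range(k)]

# Hypothetical page with three articles of 40, 10 and 30 sentences
shares = allocate([40, 10, 30], total=8)
print(shares)

# Sentences are stood in for by their position numbers 1 to 40
print(systematic_sample(list(range(1, 41)), k=shares[0], seed=3))
```

Longer articles receive a larger share, and within each article the sentences are then taken at equal spacing from a random starting point.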
For my main hypothesis I found that using all three of these types of sampling was the best way to get a quick sample, and it made biased results very unlikely. I did not use convenience sampling or opinion polls, as these require data to be collected from people, and this was not necessary for my hypothesis because I was collecting data from newspapers. Cluster sampling and quota sampling could not be used, as there were no obvious groupings of data in the newspaper and no way to get data of a particular type. On some pages I found that there were no articles because the whole page was dedicated to adverts, so results from these pages were unobtainable. My solution was to ignore that page and randomly choose another one.
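One way to picture the replacement of advert-only pages is the sketch below. It assumes a hypothetical 80-page newspaper and an invented set of advert-only pages; the `is_usable` check stands in for looking at the page by hand.

```python
import random

def choose_usable_pages(total_pages, is_usable, sample_size=20, seed=None):
    """Randomly choose pages, skipping any page that has no articles."""
    rng = random.Random(seed)
    candidates = list(range(1, total_pages + 1))
    # Shuffling and working through the list in order is the same as
    # repeatedly picking a new random page that has not been tried yet
    rng.shuffle(candidates)
    chosen = []
    for page in candidates:
        if is_usable(page):
            chosen.append(page)
        if len(chosen) == sample_size:
            break
    return sorted(chosen)

# Invented advert-only pages, for illustration only
advert_only = {3, 7, 12}
pages = choose_usable_pages(80, lambda p: p not in advert_only, seed=2)
print(pages)
```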
For my sub hypothesis I collected data from 20 pages in each newspaper. I used simple random sampling to select which pages I was going to collect from. This was the quickest and easiest method of selecting a random sample, and it works especially well when only a small sample is required, which is what I needed. Also, as with the main hypothesis, every page had an equal chance of being chosen, although there was no guarantee that the sample would be unbiased. I found that all of the other sampling methods were unnecessary, as I did not need to divide the sample into categories and data didn’t need to be obtained from people. As predicted in my plan, I encountered the problem of whether or not to include adverts, diagrams, logos etc. As I stated in my plan, I counted pictures that were included in adverts, but nothing else was counted, as it isn’t really a picture. One unexpected problem that I encountered early on was the huge difference in the number of pictures on a page between the two newspapers issued at the same time. This made my data less consistent, and there were quite a few anomalies that would affect the mean later on. I also had timing issues, so for the sub hypothesis I only collected data from one broadsheet and one tabloid newspaper, not two of each as I had done for the main hypothesis.
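To show why anomalies matter for the mean, here is a small illustration with invented picture counts (these are not the collected results). A single unusually high page pulls the mean up noticeably while the median stays where it was.

```python
from statistics import mean, median

# Invented pictures-per-page counts, for illustration only
typical = [2, 3, 3, 4, 3]
with_anomaly = typical + [20]   # one page with far more pictures than usual

print(mean(typical), median(typical))
print(round(mean(with_anomaly), 2), median(with_anomaly))
```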