The size of sample chosen for a survey should depend upon how accurate the final estimates are required to be. Normally a compromise is made between the ideal sample size and the anticipated cost of the survey. The complexity of a sample usually depends on the availability of supplementary information that can be used to introduce efficiencies into the overall design.
A crucial element in sample design and selection is defining the source of materials from which a sample can be chosen. This source, known as the sampling frame, is generally a list, such as a list of housing properties in a town or a list of students studying at a university.
The sampling frame can also consist of geographic areas with well-defined natural or artificial boundaries, when no suitable list of the target population exists. In that case, a sample of geographic areas is selected, and an interviewer canvasses the sample areas and lists the appropriate households, retail stores, or other units, so that some or all of them can be designated for inclusion in the final sample.
The sampling frame can also consist of less definite things, such as all possible permutations of integers that make up banks of telephone numbers, in the case of telephone surveys that seek to include unlisted numbers. The quality of the sampling frame, in particular whether it is up to date and how complete it is, is probably the dominant factor in ensuring adequate coverage of the desired population.
Statistical (scientific) sampling is generally required when there are large or irregular variations in the population and when limited time and money prevent the measurement of the whole population. It is accepted that some accuracy may be lost when using a sample result, as the limits of error are often unknown.
To avoid selecting samples that give biased results and misrepresent segments of the population, a random or probabilistic procedure should be implemented.
The first and simplest method of selecting a sample that I am going to discuss is known as ‘Simple Random Sampling’. As its name suggests, it involves randomly selecting a sample from a population. A.S.C. Ehrenberg gives a good explanation: he suggests that an easy way of random sampling is to draw slips of paper from a hat, instead of checking every entity in a population.
The advantage of this method is that systematic selection bias can be avoided, as each entity in the sample is selected by chance. To explain this concept further, an example is useful. Suppose one were to conduct a survey to determine how many families own a second car. Instead of checking each home, one could take a sample. If one put slips of paper identifying each home into a hat, shuffled them well, and randomly drew the number needed, one would get a random sample.
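The slips-in-a-hat procedure can be sketched in a few lines of Python. This is only an illustration: the frame of 1,000 homes, the 300 second-car owners, and the function name `simple_random_sample` are all made-up assumptions, not figures from any real survey.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each equally likely (no replacement)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: 1,000 homes, of which 300 own a second car.
homes = [{"id": i, "second_car": i < 300} for i in range(1000)]

# Draw 100 "slips from the hat" and estimate the proportion of owners.
sample = simple_random_sample(homes, n=100, seed=1)
estimate = sum(h["second_car"] for h in sample) / len(sample)
print(f"Estimated proportion with a second car: {estimate:.2f}")
```

Because every home has the same chance of being drawn, the estimate will differ from sample to sample but is not pushed systematically in either direction.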
If a large number of samples were taken, the proportions of families with a second car would balance out, so that on average one would get the correct population proportion. In practice, however, one would only ever take one sample. This could result in an unrepresentative sample; one could be unlucky and select only families that do not own a second car.
The chances of getting such a result, which would be an unrepresentative sample, can be minimised by making the sample size large enough.
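To give a rough sense of how quickly this risk shrinks, the chance of drawing a sample with no second-car families at all can be computed directly. The sketch below assumes a hypothetical population of 1,000 homes with 300 second-car owners, sampled without replacement; the numbers and the function name `prob_no_owners` are illustrative only.

```python
from math import comb

def prob_no_owners(population, owners, n):
    """Probability that a simple random sample of size n (without replacement)
    contains no second-car owners at all."""
    return comb(population - owners, n) / comb(population, n)

# Assumed population: 1,000 homes, 300 of which own a second car.
for n in (1, 5, 10, 20, 50):
    print(f"n = {n:2d}: P(no owners in sample) = {prob_no_owners(1000, 300, n):.3g}")
```

Under these assumptions the probability of such an extreme sample falls steeply as the sample size grows, although less extreme forms of unrepresentativeness remain possible at any sample size.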
Reducing the size of a survey is normally done to save on costs, whether in money or in time. A related consideration is the likely size of the sampling error for a given budget. There are three methods that can be used when sampling human populations: multi-stage sampling, clustering and stratified sampling. These are the most widely recognised methods, although statisticians other than A.S.C. Ehrenberg have adopted other methods.
Multi-stage sampling, as its name suggests, involves stages in the sampling process. The number of stages depends on the surveyor and the purpose of the survey. Say, for example, one was doing a survey on soap operas to establish the most popular one. Because financial and time restrictions make it impractical to sample the whole population of the United Kingdom directly, one could first pick a sample of towns or districts. One could then pick out individuals within each selected town or district. This could be achieved by stopping people in the street or inviting them to reply to a local newspaper advertisement. This type of sampling is called ‘two-stage’ sampling, because it involves two stages.
Using this method allows one to avoid the cost of handling lists of individuals in towns or districts in which no one has been selected. Lists of individuals are therefore required only for the selected towns or districts, and it is from these lists that the individuals are chosen.
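A minimal sketch of two-stage sampling is given below. It assumes, purely for illustration, that individuals within each chosen town are drawn at random from a list of residents; the town names, the `list_residents` helper and the function `two_stage_sample` are hypothetical.

```python
import random

def two_stage_sample(towns, n_towns, n_people, list_residents, seed=None):
    """Stage 1: pick towns at random. Stage 2: pick people within each chosen town."""
    rng = random.Random(seed)
    chosen_towns = rng.sample(towns, n_towns)
    sample = {}
    for town in chosen_towns:
        # A list of residents is only compiled for towns that were selected.
        residents = list_residents(town)
        sample[town] = rng.sample(residents, min(n_people, len(residents)))
    return sample

# Hypothetical frame: 50 towns, each with 200 residents.
towns = [f"town_{i}" for i in range(50)]

def list_residents(town):
    return [f"{town}_person_{j}" for j in range(200)]

result = two_stage_sample(towns, n_towns=5, n_people=20,
                          list_residents=list_residents, seed=42)
for town, people in result.items():
    print(town, len(people))
```

Note that `list_residents` is called only five times here, which mirrors the cost saving described above: no list is ever handled for the 45 towns that were not selected.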
Clustering is another form of sampling, though it is restricted to cases where every unit at the last stage of multi-stage sampling is selected. Diversity is the key: each cluster should be as diverse as possible, since diversity within clusters reduces sampling error. In practice, however, this is rarely the case. The opposite usually happens: units within a cluster tend to be similar, and this similarity increases the sampling error. This means that, from a statistician’s point of view, cluster sampling is not an efficient method of sampling. One advantage of cluster sampling, however, is cost, which is minimal compared with other sampling methods. One has to weigh the cost factor carefully against the accuracy factor, bearing in mind that the sample may yield little accurate information.
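The loss of efficiency can be illustrated with a small simulation. The sketch below makes a deliberately extreme assumption: every household in a cluster gives the same answer, so clusters are perfectly homogeneous. All figures (100 clusters of 20 households, 30 owner clusters, 500 repetitions) are invented for the illustration.

```python
import random
import statistics

rng = random.Random(0)

# Assumed population: 100 clusters of 20 households each.
# Clusters 0..29 consist entirely of second-car owners (1), the rest of non-owners (0).
clusters = [[1 if c < 30 else 0] * 20 for c in range(100)]
households = [h for cluster in clusters for h in cluster]

def srs_estimate(n):
    """Proportion of owners in a simple random sample of n households."""
    return statistics.mean(rng.sample(households, n))

def cluster_estimate(k):
    """Proportion of owners when k whole clusters are selected."""
    chosen = rng.sample(clusters, k)
    return statistics.mean([h for cluster in chosen for h in cluster])

# Both designs observe 200 households per survey; repeat each 500 times.
srs_sd = statistics.pstdev([srs_estimate(200) for _ in range(500)])
cluster_sd = statistics.pstdev([cluster_estimate(10) for _ in range(500)])
print(f"Spread of estimates, simple random sampling: {srs_sd:.3f}")
print(f"Spread of estimates, cluster sampling:       {cluster_sd:.3f}")
```

With clusters this uniform, ten clusters carry little more information than ten households, so the cluster estimates vary far more from survey to survey even though the same number of households is observed. Real clusters are less uniform than this, so the gap would normally be smaller.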