There is no formula for determining the size of a non-random sample. Often, especially in research, you simply enlarge the sample gradually and analyse the results as they come in. When new cases no longer yield new information, you conclude that your sample is saturated and finish the job. This method is, however, very sensitive to biased sampling, so you should take care not to omit any groups from your population.
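The stopping rule described above can be sketched in Python. This is only an illustration: the function name, the `patience` threshold and the coded answers are all hypothetical and are not taken from the survey.

```python
from typing import Hashable, Iterable, List


def sample_until_saturated(cases: Iterable[Hashable], patience: int = 5) -> List[Hashable]:
    """Collect cases one at a time and stop once `patience` consecutive
    cases have added no previously unseen response."""
    seen = set()   # distinct responses observed so far
    sample = []    # every case collected, in order
    stale = 0      # consecutive cases that told us nothing new
    for case in cases:
        sample.append(case)
        if case in seen:
            stale += 1
        else:
            seen.add(case)
            stale = 0
        if stale >= patience:
            break  # treat the sample as saturated
    return sample


# Hypothetical stream of coded answers (illustrative only, not survey data).
answers = ["a", "b", "a", "c", "a", "b", "c", "a", "d", "b"]
collected = sample_until_saturated(answers, patience=3)
print(collected)  # ['a', 'b', 'a', 'c', 'a', 'b', 'c']
```

In this made-up stream the rule stops before the answer "d" is ever seen, which illustrates why the method is sensitive to the order and coverage of the people approached.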
MINIMISING BIAS:
When considering the issue of bias in the survey, the following sources of bias were taken into account in the questionnaire:
- characteristics of the interviewer: attitudes and opinions, and a tendency to seek responses that support his/her preconceived ideas
- misunderstandings by either interviewer or respondent about what the other is saying
- characteristics of respondent (e.g. race, religion, social class)
SAMPLING FRAME
The above points were addressed through the procedure for collecting the data. People were approached in the city centre of Doncaster on three separate days: Monday, Wednesday and Friday. A conscious decision was made to gain a diverse range of opinions by approaching people across eight different age groups (see appendix 1) and by getting responses from both males and females. Another consideration was to gain responses both from visitors passing through the centre and from working people travelling into the city centre for lunch. On all three days the survey was conducted at the same point, the middle of the town centre, in order to reach a diverse range of respondents while keeping conditions consistent and minimising bias.
THE QUESTIONNAIRE:
When designing and producing the questionnaire, the following points were considered:
Most problems with questionnaire analysis can be traced back to the design phase of the project. Well-defined goals are the best way to assure a good questionnaire design. When the goals of a study can be expressed in a few clear and concise sentences, the design of the questionnaire becomes considerably easier. The questionnaire is developed to directly address the goals of the study.
One of the best ways to clarify your study goals is to decide how you intend to use the information. This was considered before I began designing the study.
Whenever I was unsure of a question, I referred to the study goals and a solution became clear. I asked questions that directly addressed the study goals, avoiding the temptation to ask questions merely because the answers would be "interesting to know".
As a general rule, with only a few exceptions, long questionnaires get less response than short questionnaires. Therefore the questionnaire was kept short. Response rate is the single most important indicator of how much confidence you can place in the results. A low response rate can be devastating to a study. Therefore, you must do everything possible to maximise the response rate. One of the most effective methods of maximising response is to shorten the questionnaire.
Many people have difficulty knowing which questions could be eliminated. For the elimination round, I read each question and asked, "How am I going to use this information?" If the information will be used in a decision-making process, then keep the question... it's important. If not, throw it out.
One important way to assure a successful survey is to include other experts and relevant decision-makers in the questionnaire design process. Their suggestions will improve the questionnaire and they will subsequently have more confidence in the results.
Give the questionnaire a title that is short and meaningful to the respondent. A questionnaire with a title is generally perceived to be more credible than one without.
Include clear and concise instructions on how to complete the questionnaire. These must be very easy to understand, so I used short sentences and basic vocabulary.
Begin with a few non-threatening and interesting items. If the first items are too threatening or "boring", there is little chance that the person will complete the questionnaire. People generally look at the first few questions before deciding whether or not to complete the questionnaire. Make them want to continue by putting interesting questions first.
Use simple and direct language. The questions must be clearly understood by the respondent. The wording of a question should be simple and to the point. Therefore I didn't use uncommon words or long sentences, and I made items as brief as possible. This reduces misunderstandings and makes the questionnaire appear easier to complete. One way to eliminate misunderstandings is to emphasise crucial words in each item by using bold, italics or underlining.
Leave adequate space for respondents to make comments. One criticism of questionnaires is their inability to retain the "flavour" of a response. Leaving space for comments will provide valuable information not captured by the response categories. Leaving white space also makes the questionnaire look easier and this increases response.
Place the most important items in the first half of the questionnaire. Respondents often send back partially completed questionnaires. By putting the most important items near the beginning, the partially completed questionnaires will still contain important information.
Hold the respondent's interest. We want the respondent to complete our questionnaire. One way to keep a questionnaire interesting is to provide variety in the type of items used. Varying the questioning format will also prevent respondents from falling into "response sets". At the same time, it is important to group items into coherent categories. All items should flow smoothly from one to the next.
If a questionnaire is more than a few pages and is held together by a staple, include some identifying data on each page (such as a respondent ID number). Pages often accidentally separate.
Use professional production methods for the questionnaire--either desktop publishing or typesetting and keylining. Be creative. Try different coloured inks and paper. The object is to make your questionnaire stand out from all the others the respondent receives.
The final consideration of a questionnaire is to try it on representatives of the target audience. If there are problems with the questionnaire, they almost always show up here. If possible, be present while a respondent is completing the questionnaire and tell her that it is okay to ask you for clarification of any item. The questions she asks are indicative of problems in the questionnaire (i.e., the questions on the questionnaire must be without any ambiguity because there will be no chance to clarify a question when the survey is mailed).
Qualities of a Good Question:
There are good and bad questions. The qualities of a good question are as follows:
1. Evokes the truth. Questions must be non-threatening. When a respondent is concerned about the consequences of answering a question in a particular manner, there is a good possibility that the answer will not be truthful. Anonymous questionnaires that contain no identifying information are more likely to produce honest responses than those identifying the respondent. If your questionnaire does contain sensitive items, be sure to clearly state your policy on confidentiality.
2. Asks for an answer on only one dimension. The purpose of a survey is to find out information. A question that asks for a response on more than one dimension will not provide the information you are seeking. For example, a researcher investigating a new food snack asks "Do you like the texture and flavour of the snack?" If a respondent answers "no", then the researcher will not know if the respondent dislikes the texture or the flavour, or both. Another questionnaire asks, "Were you satisfied with the quality of our food and service?" Again, if the respondent answers "no", there is no way to know whether the quality of the food, service, or both was unsatisfactory. A good question asks for only one "bit" of information.
3. Can accommodate all possible answers. Multiple choice items are the most popular type of survey questions because they are generally the easiest for a respondent to answer and the easiest to analyse. Asking a question that does not accommodate all possible responses can confuse and frustrate the respondent. For example, consider the question:
What brand of computer do you own? __
A. IBM PC
B. Apple
Clearly, there are many problems with this question. What if the respondent doesn't own a microcomputer? What if he owns a different brand of computer? What if he owns both an IBM PC and an Apple? There are two ways to correct this kind of problem.
The first way is to make each response a separate dichotomous item on the questionnaire. For example:
Do you own an IBM PC? (circle: Yes or No)
Do you own an Apple computer? (circle: Yes or No)
Another way to correct the problem is to add the necessary response categories and allow multiple responses. This is the preferable method because it provides more information than the previous method.
What brand of computer do you own?
(Check all that apply)
__ Do not own a computer
__ IBM PC
__ Apple
__ Other
4. Has mutually exclusive options. A good question leaves no ambiguity in the mind of the respondent. There should be only one correct or appropriate choice for the respondent to make. An obvious example is:
Where did you grow up? __
A. Country
B. Farm
C. City
A person who grew up on a farm in the country would not know whether to select choice A or B. This question would not provide meaningful information. Worse than that, it could frustrate the respondent and the questionnaire might find its way to the trash.
5. Produces variability of responses. When a question produces no variability in responses, we are left with considerable uncertainty about why we asked the question and what we learned from the information. If a question does not produce variability in responses, it will not be possible to perform any statistical analyses on the item. For example:
What do you think about this report? __
A. It's the worst report I've read
B. It's somewhere between the worst and best
C. It's the best report I've read
Since almost all responses would be choice B, very little information is learned. Design your questions so they are sensitive to differences between respondents. As another example:
Are you against drug abuse? (circle: Yes or No)
Again, there would be very little variability in responses and we'd be left wondering why we asked the question in the first place.
6. Follows comfortably from the previous question. Writing a questionnaire is similar to writing anything else. Transitions between questions should be smooth. Grouping questions that are similar will make the questionnaire easier to complete, and the respondent will feel more comfortable. Questionnaires that jump from one unrelated topic to another feel disjointed and are not likely to produce high response rates.
7. Does not presuppose a certain state of affairs. Among the most subtle mistakes in questionnaire design are questions that make an unwarranted assumption. An example of this type of mistake is:
Are you satisfied with your current auto insurance? (Yes or No)
This question will present a problem for someone who does not currently have auto insurance. Write your questions so they apply to everyone. This often means simply adding an additional response category.
Are you satisfied with your current auto insurance?
___ Yes
___ No
___ Don't have auto insurance
One of the most common mistaken assumptions is that the respondent knows the correct answer to the question. Industry surveys often contain very specific questions that the respondent may not know the answer to. For example:
What percent of your budget do you spend on direct mail advertising? ____
Very few people would know the answer to this question without looking it up, and very few respondents will take the time and effort to look it up. If you ask a question similar to this, it is important to understand that the responses are rough estimates and there is a strong likelihood of error.
It is important to look at each question and decide if all respondents will be able to answer it. Be careful not to assume anything. For example, the following question assumes the respondent knows what Proposition 13 is about.
Are you in favour of Proposition 13?
___ Yes
___ No
___ Undecided
If there is any possibility that the respondent may not know the answer to your question, include a "don't know" response category.
8. Does not imply a desired answer. The wording of a question is extremely important. We are striving for objectivity in our surveys and, therefore, must be careful not to lead the respondent into giving the answer we would like to receive. Leading questions are usually easily spotted because they use negative phraseology. As examples:
Wouldn't you like to receive our free brochure?
Don't you think the Congress is spending too much money?
9. Does not use emotionally loaded or vaguely defined words. This is one of the areas overlooked by both beginners and experienced researchers. Quantifying adjectives (e.g., most, least, majority) are frequently used in questions. It is important to understand that these adjectives mean different things to different people.
10. Does not use unfamiliar words or abbreviations. Remember who your audience is and write your questionnaire for them. Do not use uncommon words or compound sentences. Write short sentences. Abbreviations are okay if you are absolutely certain that every single respondent will understand their meanings. If there is any doubt at all, do not use the abbreviation. The following question might be okay if all the respondents are accountants, but it would not be a good question for the general public.
What was your AGI last year? ______
11. Is not dependent on responses to previous questions. Branching in written questionnaires should be avoided. While branching can be used as an effective probing technique in telephone and face-to-face interviews, it should not be used in written questionnaires because it sometimes confuses respondents. An example of branching is:
1. Do you currently have a life insurance policy? (Yes or No) If no, go to question 3
2. How much is your annual life insurance premium? _________
These questions could easily be rewritten as one question that applies to everyone:
1. How much did you spend last year for life insurance? ______ (Write 0 if none)
12. Does not ask the respondent to order or rank a series of more than five items. Questions asking respondents to rank items by importance should be avoided. This becomes increasingly difficult as the number of items increases, and the answers become less reliable. This becomes especially problematic when asking respondents to assign a percentage to a series of items. In order to successfully complete this task, the respondent must mentally continue to re-adjust his answers until they total one hundred percent. Limiting the number of items to five will make it easier for the respondent to answer.
The Order of the Questions
Items on a questionnaire should be grouped into logically coherent sections. Grouping questions that are similar will make the questionnaire easier to complete, and the respondent will feel more comfortable. Questions that use the same response formats, or those that cover a specific topic, should appear together.
Each question should follow comfortably from the previous question. Writing a questionnaire is similar to writing anything else. Transitions between questions should be smooth. Questionnaires that jump from one unrelated topic to another feel disjointed and are not likely to produce high response rates.
Most investigators have found that the order in which questions are presented can affect the way that people respond. One study reported that questions in the latter half of a questionnaire were more likely to be omitted, and contained fewer extreme responses. Some researchers have suggested that it may be necessary to present general questions before specific ones in order to avoid response contamination. Other researchers have reported that when specific questions were asked before general questions, respondents tended to exhibit greater interest in the general questions.
It is not clear whether or not question order affects response. A few researchers have reported that question order does not affect responses, while others have reported that it does. Generally, it is believed that question-order effects exist in interviews, but not in written surveys.
Length of a Questionnaire
As a general rule, long questionnaires get less response than short questionnaires. However, some studies have shown that the length of a questionnaire does not necessarily affect response. More important than length is question content. A subject is more likely to respond if they are involved and interested in the research topic. Questions should be meaningful and interesting to the respondent.
PRESENTATION OF DATA:
From the data collected by the questionnaire, the following table was compiled in order to plot an ogive.
Below is a bar chart showing the answers given by the 50 people asked the following question: What is the average amount you spend per visit?
ANALYSIS:
What follows is an explanation of the calculations that were carried out, and how, in order to produce the graphs and tables on the previous pages.
Cumulative Frequency
The cumulative frequency is the running total of the frequencies. On a graph, it can be represented by a cumulative frequency polygon, where straight lines join up the points, or a cumulative frequency curve.
My cumulative figures were produced as below:
1+13= 14
14+20= 34
34+12= 46
46+4= 50
50+0= 50
These data are used to draw a cumulative frequency polygon by plotting the cumulative frequencies against the upper class boundaries.
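The running total can be sketched in a few lines of Python. The class frequencies (1, 13, 20, 12, 4, 0) are read off the additions shown above; the variable names are illustrative only.

```python
from itertools import accumulate

# Class frequencies, as implied by the running totals in the working above.
frequencies = [1, 13, 20, 12, 4, 0]

# accumulate() yields the running total after each class.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [1, 14, 34, 46, 50, 50]
```

The final class adds nothing, so the last cumulative frequency stays at the total of 50 respondents.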
Mean:
There are three types of average: mean, median and mode. (The range, covered below, measures spread rather than an average.) The mean is what most people mean when they say 'average'. It is found by adding up all of the numbers and dividing by how many numbers there are. So the mean of 3, 5, 7, 3 and 5 is 23/5 = 4.6.
When you are given data which has been grouped, the mean is $\sum fx / \sum f$, where $f$ is the frequency and $x$ is the midpoint of the group ($\sum$ means 'the sum of').
Therefore my figure for mean was derived as follows:
195.5/50 = 3.91
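A minimal Python sketch of the grouped-mean calculation follows. The frequencies come from the cumulative frequency working. The class midpoints are an assumption: they take classes 1.50 wide running from 0.01-1.50 up to 7.51-9.00, with midpoints rounded to two decimal places, and only the 3.01-4.50 class is actually named in the text. Under that assumption the sketch gives the same total of 195.5 used above.

```python
# ASSUMED class midpoints (classes 0.01-1.50, 1.51-3.00, ..., 7.51-9.00).
midpoints = [0.76, 2.26, 3.76, 5.26, 6.76, 8.26]
# Class frequencies taken from the cumulative frequency working.
frequencies = [1, 13, 20, 12, 4, 0]


def grouped_mean(midpoints, frequencies):
    """Return sum(f * x) / sum(f) for grouped data."""
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_f = sum(frequencies)
    return sum_fx / sum_f


print(round(grouped_mean(midpoints, frequencies), 2))  # 3.91
```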
Mode:
The mode is the number in a set of numbers which occurs the most. So the modal value of 5, 6, 3, 4, 5, 2, 5 and 3 is 5, because there are more 5s than any other number.
On a histogram, the modal class is the class with the largest frequency density.
The modal class from looking at my table is 3.01-4.50, as this has the largest frequency, being 20 (34 is the cumulative frequency up to this class).
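Both ideas can be sketched in Python: the mode of a list of values, and the modal class of grouped data. The class labels below are assumed (only 3.01-4.50 appears in the text); the frequencies are those implied by the cumulative frequency working.

```python
from statistics import mode

# Mode of the example set from the text: 5 occurs three times.
values = [5, 6, 3, 4, 5, 2, 5, 3]
modal_value = mode(values)
print(modal_value)  # 5

# ASSUMED class labels; frequencies from the cumulative frequency working.
class_labels = ["0.01-1.50", "1.51-3.00", "3.01-4.50",
                "4.51-6.00", "6.01-7.50", "7.51-9.00"]
frequencies = [1, 13, 20, 12, 4, 0]

# With equal class widths, the largest frequency also has the
# largest frequency density, so it identifies the modal class.
modal_class = class_labels[frequencies.index(max(frequencies))]
print(modal_class)  # 3.01-4.50
```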
Range
The range is the largest number in a set minus the smallest number.
Therefore the range is the maximum spend per visit minus the minimum spend per visit:
9.00-1.00 = 8.00
Measures of Dispersion
Measures of dispersion measure how spread out a set of data is.
Variance and Standard Deviation
The formulae for the variance and standard deviation are given below, where $\mu$ means the mean of the data, $x_r$ is each value in turn and $n$ is the number of values.
$$\sigma^2 = \frac{\sum (x_r - \mu)^2}{n}, \qquad \sigma = \sqrt{\frac{\sum (x_r - \mu)^2}{n}}$$
The standard deviation, $\sigma$, is the square root of the variance.
What the formula means:
(1) $x_r - \mu$ means take each value in turn and subtract the mean from it.
(2) $(x_r - \mu)^2$ means square each of the results obtained from step (1). This is to get rid of any minus signs.
(3) $\sum (x_r - \mu)^2$ means add up all of the results obtained from step (2).
(4) Divide the result of step (3) by $n$, which is the number of values.
(5) For the standard deviation, take the square root of the answer to step (4).
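The five steps can be followed one at a time in Python. As an illustration only, the sketch reuses the small set 3, 5, 7, 3, 5 from the mean example rather than the survey data.

```python
from math import sqrt

values = [3, 5, 7, 3, 5]
n = len(values)                          # number of values
mean = sum(values) / n                   # 23 / 5 = 4.6

deviations = [x - mean for x in values]  # step 1: subtract the mean
squared = [d ** 2 for d in deviations]   # step 2: square each result
total = sum(squared)                     # step 3: add them up
variance = total / n                     # step 4: divide by n
std_dev = sqrt(variance)                 # step 5: square root

print(round(variance, 2))  # 2.24
print(round(std_dev, 3))   # 1.497
```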
Following are my figures for standard deviation:
$\sum fx^2 / \sum f - \bar{x}^2$
$= 610757.9/50 - (3.91 \times 3.91)$
$= 12215.158 - 15.2881$
$= 12199.8699$
=
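As a rough cross-check of the grouped formula, the sketch below recomputes it in Python. The frequencies come from the cumulative frequency working. The midpoints are an assumption (classes 1.50 wide from 0.01-1.50 up to 7.51-9.00, rounded to two decimal places), chosen because they give the same total of 195.5 used for the mean.

```python
from math import sqrt

# ASSUMED class midpoints; frequencies from the cumulative frequency working.
midpoints = [0.76, 2.26, 3.76, 5.26, 6.76, 8.26]
frequencies = [1, 13, 20, 12, 4, 0]

sum_f = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

mean = sum_fx / sum_f
variance = sum_fx2 / sum_f - mean ** 2   # sum(fx^2)/sum(f) - mean^2
std_dev = sqrt(variance)

print(round(sum_fx2, 2))   # 864.53
print(round(variance, 4))  # 2.0025
print(round(std_dev, 3))   # 1.415
```

Under these assumed midpoints the sum of fx² comes to about 864.5 rather than 610757.9, so the figures in the working above may be worth re-checking against the table in the appendix.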
Quartiles:
If we divide a cumulative frequency curve into quarters, the value at the lower quarter is referred to as the lower quartile, the value at the middle gives the median and the value at the upper quarter is the upper quartile.
A set of numbers may be as follows: 8, 14, 15, 16, 17, 18, 19, 50. The mean of these numbers is 19.625. However, the extremes in this set (8 and 50) distort the range. The inter-quartile range measures the spread of the numbers by taking the middle 50% of the values. It is useful because it ignores the extreme values.
The lower quartile is the (n+1)/4 th value (n is the total cumulative frequency, i.e. 50 in this case) and the upper quartile is the 3(n+1)/4 th value. The difference between these two is the inter-quartile range (IQR).
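The (n+1)/4 rule can be tried on the example set 8, 14, 15, 16, 17, 18, 19, 50. The sketch below uses `statistics.quantiles` from the Python standard library with `method="exclusive"`, which places the quartiles at the (n+1)/4, 2(n+1)/4 and 3(n+1)/4 positions and interpolates between neighbouring values; it is an illustration on the example set, not on the survey data.

```python
from statistics import quantiles

data = [8, 14, 15, 16, 17, 18, 19, 50]

# With n = 8 the quartile positions are the 2.25th, 4.5th and 6.75th values.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1
full_range = max(data) - min(data)

print(q1, q2, q3)   # 14.25 16.5 18.75
print(iqr)          # 4.5
print(full_range)   # 42
```

The range of 42 is dominated by the two extreme values, while the inter-quartile range of 4.5 reflects how tightly the middle half of the values is packed.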
By looking at my ogive it can be seen that the quartiles are as follows:
Q3 = 4.71
Q1 = 2.55
Therefore the IQR is, Q3 – Q1 = 4.71 – 2.55 = 2.16
Median:
The median value was also read from the graph and is 3.70.
Skewness
A normal distribution is a bell-shaped distribution of data where the mean, median and mode all coincide. A frequency curve showing a normal distribution would look like this:
In a normal distribution, approximately 68% of the values lie within one standard deviation of the mean and approximately 95% of the data lies within two standard deviations of the mean.
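The 68% and 95% figures can be checked with `statistics.NormalDist` from the Python standard library. The sketch uses the standard normal distribution (mean 0, standard deviation 1), since the proportions are the same for any normal distribution.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Proportion of values within one and two standard deviations of the mean.
within_one = z.cdf(1) - z.cdf(-1)
within_two = z.cdf(2) - z.cdf(-2)

print(round(within_one, 4))  # 0.6827
print(round(within_two, 4))  # 0.9545
```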
If there are extreme values towards the positive end of a distribution, the distribution is said to be positively skewed. In a positively skewed distribution, the mean is greater than the mode.
A negatively skewed distribution, on the other hand, has a mean which is less than the mode because of the presence of extreme values at the negative end of the distribution.
There are a number of ways of measuring skewness:
From looking at the ogive it can be seen that the skewness is negative, as there are one or two unusually low values. The bulk of respondents therefore reported figures above the average.
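One commonly used measure is the quartile (Bowley) coefficient of skewness, (Q3 - 2 × median + Q1) / (Q3 - Q1). It is offered here only as an illustration and is not necessarily the measure intended above. The sketch applies it to the quartiles and median read from the ogive (2.55, 3.70 and 4.71); the function name is hypothetical.

```python
def quartile_skewness(q1, median, q3):
    """Bowley's quartile coefficient of skewness: negative when the
    median sits closer to the upper quartile than to the lower one."""
    return (q3 - 2 * median + q1) / (q3 - q1)


skew = quartile_skewness(2.55, 3.70, 4.71)
print(round(skew, 3))  # -0.065
```

The result is negative but close to zero, which suggests that any negative skew in these figures is slight.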
INTERPRETATION:
From looking at the data collected, collating it and presenting it in the forms that I have, it is clear that the most common amount spent on lunch by the 50 people surveyed in the city centre of Doncaster was between 3.01 and 4.50.
I feel it is not necessary to discuss or analyse this any further, as the whole interpretation was based upon a single question out of the eight asked. However, the possible sources of error in the data, any restrictions that were imposed and the method by which I collected the data are discussed below.
EVALUATION:
The sampling method I chose was non-random sampling.
The reasons why I chose this method are as follows:
. Purpose of the study
- identify variable relationships
- exploratory research
. Cost versus value
- probability sample too expensive
- low incidence of preferred respondents
- willingness to participate
. Time constraints
. Amount of acceptable error
The data was collected over a 3-day period in the city centre of Doncaster, between 12 and 2 o’clock.
People were approached on the assumption that they would be purchasing lunch, whether they were visitors or working people out at lunch.
This assumption is itself the first source of error and restriction, as it may be argued that the results do not properly address what the survey set out to achieve.
The sampling frame may therefore also be questioned, as it may not be representative of the population of Doncaster.
Another restriction was the time available for the research. As mentioned above, the data was collected on three different days but at the same time of day. The time spent collecting data may therefore be open to question.
Overall, the following suggestions are made for ways in which the whole investigation could have been improved:
. Could have used a different method of sampling, for example, random sample
. Collected the data over more than 3 days, to broaden the net
. Approached more than 50 people, in order to gain a better representation of the target audience
. Tried a pilot survey, in order to minimise error and bias.
INTERPRETING AND PRESENTING INFORMATION
ASSIGNMENT 2
Produced for: Trevor Louth
By: Tanzeela Anwar
APPENDICES