Aims
The aim of this study is to see whether identifying the ink colour of colour words printed in the colour they name (e.g. the word ‘red’ printed in red) takes a similar amount of time to identifying the ink colour of colour words printed in a conflicting colour (e.g. the word ‘blue’ printed in yellow).
Hypotheses
The experimental hypothesis is that participants will take longer overall to state the colours of words printed in a different colour to the one they name than to state the colours of words printed in the same colour as the one they name. The null hypothesis is that there will be no difference in overall time between stating the colours of words printed in the same colour as the one they name and stating the colours of words printed in a different colour to the one they name.
Method
Method and design
The study was a laboratory experiment, which was chosen because it did not originally appear to be necessary to carry out the study in the participants’ natural environment. It also allowed careful monitoring of the dependent variable, and meant that extraneous variables could be controlled. Each participant did both parts of the study, making this a ‘repeated measures’ design. This type of design was chosen because the participants would have to do both parts of the study in order to compare the results for the two parts. It was not immediately recognised that there would be any order effects in the experiment; however, the experiment was still counterbalanced, so one half of the participants did part one first, and the other half did part two first.
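The counterbalancing described above can be sketched as follows. This is only an illustration: the report does not say how participants were allocated to the two orders, so the random allocation, the function name counterbalance and the participant numbers 1 to 30 are all assumptions.

```python
import random

def counterbalance(participant_ids, seed=None):
    """Allocate half of the participants to each card order.

    Random allocation is an assumption; the report only states that
    one half did part one first and the other half did part two first.
    """
    ids = list(participant_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for pid in ids[:half]:
        orders[pid] = ("card 1", "card 2")  # part one first
    for pid in ids[half:]:
        orders[pid] = ("card 2", "card 1")  # part two first
    return orders

# Hypothetical participant numbers for a class of thirty
orders = counterbalance(range(1, 31), seed=0)
```

With thirty participants this gives fifteen in each order, which matches the two groups of fifteen described in the procedure.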
Variables
The independent variable in the study was the type of card: one card contained words printed in the same colour as the one they named, and the other contained words printed in a different colour to the one they named. The dependent variable was the time, in seconds, that it took each participant to state the colours of all of the words on each card.
Participants
An opportunity sample, made up of people who were readily available, was used in the experiment. This was practical, as it was time-effective. The participants came from a class of thirty students in a Staffordshire school. The students in this class were aged thirteen to fourteen, and there was a reasonable mix of both genders.
Apparatus
The apparatus in the study consisted of two cards, and a stopwatch to time each participant. There were two sets of the two cards, as both halves of the experiment were being done at the same time. The words and the colours of the words on the cards were selected as they came to mind, although it was ensured that the colours and words were not repeated immediately, as this might have become an extraneous variable if participants were able to recognise the same colour straight away. See the appendix for a copy of the cards.
Procedure
Informed consent was received from the headmaster of the school as the class contained students that were under sixteen years of age. Permission was also gained from the tutor of the class, because the participants would be tested in his class time.
The participants were taken out of their classroom and the experiment was done in a different, empty classroom, as it was quieter there and therefore less distracting for the participants.
In the actual study, each participant was asked to state the colours of the words on one card first, followed by the colours of the words on the second card. The time it took each participant to read out each of the two cards was recorded. The two groups were tested at the same time by using a split-half method (two experimenters testing fifteen and another two testing fifteen), in order to get through the experiment more quickly and practically.
Controls
Extraneous variables that were identified before the experiment were considered throughout. For example, the font on each of the two cards was kept the same size and style, which was simple and easy to read. The participants were also given the right to not take part in the study if they wished, or to withdraw their result from the study afterwards. The participants were not commented on personally, and no details were taken besides those necessary. Therefore, ethical issues were also considered.
Results
Summary table showing central tendency and standard deviation for the time it took to read out each card in seconds
See appendices for the working out of the averages and the standard deviation.
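As an illustration of the working referred to above, the averages, range and standard deviation for one card could be calculated as in the sketch below. The timings shown are hypothetical and are not the results of this study; the function name summarise is also an assumption. The sample standard deviation (dividing by n - 1) is assumed here, as the report does not say which form was used.

```python
import statistics

def summarise(times):
    """Return central tendency and spread for a list of timings in seconds."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "mode": statistics.multimode(times),  # a list, as there may be ties
        "range": max(times) - min(times),
        # Sample standard deviation (n - 1); use statistics.pstdev
        # for the population form instead.
        "sd": statistics.stdev(times),
    }

# Hypothetical timings in seconds, NOT the study's data
card1_times = [10, 11, 11, 12, 13]
card2_times = [20, 22, 22, 25, 31]

card1_summary = summarise(card1_times)
card2_summary = summarise(card2_times)
```

The same function is applied to each card so that the two sets of figures are worked out in exactly the same way.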
Summary table commentary
For each of the two groups of results, the mean, median and mode are similar, differing by no more than two seconds.
Looking at the averages, there is a definite difference between the times for the two groups. For all three averages, the time for group 2 is roughly double the time for group 1. It therefore appears that there was less difficulty in stating the colours of the words on card 1 than on card 2, suggesting that there was little or no interference from the written word when it named the colour it was printed in.
There is a definite difference between the standard deviation for card 1 and the standard deviation for card 2, meaning that there was a wider spread of results for card 2 than for card 1. There were a few anomalous results, which would have accounted for the larger standard deviation for the second card.
Graphical Description of Results
Descriptive statistics commentary
Looking at the bar chart, there is a definite difference between the time it took to state the colours on each of the two cards. The mean, median and mode are also fairly similar within each card. This graph supports the idea that there is little if any interference from the words on card 1, and that the participants experienced a lot more interference on card 2.
Relationship of results to hypothesis
The experimental hypothesis was that participants would take longer to state the colour of each word when the words read differently to the colour they were printed in than when they read the same as the colour they were printed in. The original results (see appendices), the summary table and the bar chart all appear to support the experimental hypothesis. The mean, median and mode show a very large gap between the two times: the time for card two is more or less twice the time for card one in all three averages.
Discussion
Validity
The participants were accurately timed as planned during the automatic and non-automatic tasks. However, in the non-automatic task (card 2), it is uncertain whether it was actually the written words interfering with the participants’ ability to identify the colour of the ink, because the participants were not asked if that was the case. Another problem that was encountered was to do with the fact that two participants did different tasks at the same time in the same room, and they tended to get confused as the other was doing his or her task aloud. It may, therefore, have been a different type of interference that was being measured instead.
The participants were sent back to their classroom after the study, which meant that they conferred with others that had not yet done the study. This affected the outcome of the study because there appeared to be competition effects further through the experiment, as participants appeared to be trying to get through the tasks as quickly as possible. This does not appear to have made the individual timings faster, probably because the participants struggled as they went through the tasks too quickly. Instead of timing the difference between getting through the two tasks, however, it may have been the level of competition between each participant that was timed.
Taking account of these factors, the level of the study’s internal validity may not be very high. The task itself may also have a low level of validity, because it was not very realistic; identifying the colours of words is not an everyday task. Because it was a laboratory experiment, it took place under controlled conditions, which were unnatural, and the unfamiliar environment may have intimidated the participants.
The classroom used for the experiment was for general use, and a teacher walked in whilst the experiment was being carried out. An experimenter was distracted at this point, and failed to stop the stopwatch in time. The number of seconds gained was therefore estimated and taken off. This result was obviously not very valid.
Suggestions for improved validity
A number of things could have been done to improve the study’s validity. If one participant were tested at a time, then it would be ensured that the interference was from the Stroop effect, and not the other participant. This would have made the result timings shorter overall. Testing the level of interference from someone else reading aloud could, however, be a more ecologically valid task. The participants in this case could have been tested on their memory of what they had read, or the time it took them to read it out. A large level of interference would have been demonstrated if the timings were long or if the participants had trouble recalling what they had read. This task could have also been carried out in the participants’ classroom, which would be their natural environment, and the interference from the rest of the class talking could have been measured. This would be applicable to everyday life, as it could aim to see if allowing a class to talk amongst themselves had an effect on an individual’s education.
The participants could have also been asked what was affecting their ability to identify the colour of the ink, to ensure a higher level of internal validity in the experiment. The participants could have also been sent back to a different classroom to ensure that they did not confer with those who had not yet done the study. This would hopefully mean that there would be no competition effects in the experiment, though it is difficult to tell whether this would make the timings shorter or longer. If the participants worked through the tasks at their own pace, then they would not struggle and have to repeat some of the colours. In this case, the timings might be longer. They might, however, naturally work through the tasks as quickly as possible anyway, meaning that the timings would be either shorter or the same overall.
Reliability
There is internal consistency in the results, in the sense that all of them suggest that the experimental hypothesis may be true. However, for some participants the timings for card one were not as far apart from the timings for card two as they were for others. There does not appear to be any link between this and doing a particular card before the other, so there were probably few if any order effects. This could be because the participants picked up on the fact that they were being timed, and tried to get through the cards as quickly as possible.
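One way of checking for a link between card order and the gap between the two timings is to compare the average gap for each order group, as in the sketch below. The records shown are hypothetical and are not the study’s raw results; the order labels and the function name mean_gap_by_order are also assumptions made for illustration.

```python
import statistics

# Hypothetical records: (order, card 1 time, card 2 time) in seconds.
# These are NOT the study's raw results.
records = [
    ("card1_first", 10, 21),
    ("card1_first", 12, 24),
    ("card2_first", 11, 23),
    ("card2_first", 13, 24),
]

def mean_gap_by_order(records):
    """Average (card 2 time - card 1 time) for each order group."""
    gaps = {}
    for order, card1_time, card2_time in records:
        gaps.setdefault(order, []).append(card2_time - card1_time)
    return {order: statistics.mean(values) for order, values in gaps.items()}

gap_by_order = mean_gap_by_order(records)
```

If the average gaps for the two order groups were similar, as they are in this made-up example, that would be in line with there being few if any order effects.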
The experiment was highly standardised; every participant was read the same brief and the same de-brief, which was basic to account for the lower age of the participants. Each participant had to do the same task with the same cards. However, although the split-half method was used to eliminate order effects, it may have affected the results on either half of the timings. For example, those who did card two first would have undergone practice effects and found card one a lot easier than those who did that task first.
It could be that the difference in time between the two cards should generally have been shorter, because a number of unfortunate factors may have made the timings longer, meaning that if the study were replicated, the results might not be similar. A teacher also walked into the room in the middle of the experiment, which distracted two of the participants, making their timings longer.
Improving Reliability
The colours on the two cards should be altered if the study were to be repeated, so that the colours were definite, and could not be, say, blue or purple depending on the individual participants’ opinions. In this case, it could be ensured that the participants wouldn’t delay in responding to any of the colours. The timings in this case would have been shorter overall. Measures should have also been taken to prevent anyone interrupting our experiment, to prevent distraction from the task. This would have also made the timings shorter in general.
General alterations like these would mean that if the study were carried out repeatedly, then the results would be similar each time. There would also be more consistency in the results, because there would be fewer factors to alter the participants’ timings.
Although there were minor standardisation problems to do with order effects in the experiment, they probably could not have been avoided because the study had to be a repeated measures design in order to compare the two sets of results.
Implications of study
With regard to the past studies that were mentioned earlier, the results are in line with past conclusions. These results cannot be compared directly with Stroop’s original results because there were more words in each part of the task, and so naturally the timings would be longer in this study. Because of the larger number of words, however, the results in this experiment would show a clearer division between the timings for the two tasks. This could have, however, created a bigger margin of error in comparison to Stroop’s original experiment. There were also reliability problems in the experiment, and such problems were probably avoided in Stroop’s original experiment.
This study, unlike past ones, suggests that the written word will not interfere with identifying its colour if the colour is the same as the word written. In this task, interference therefore comes from a word reading differently to the colour that it is printed in and, as Stroop’s original experiment shows, no interference is received from opposing ink colours when one is asked to read the colour words. It is possibly common sense that the automatic task of reading each word in part one of this study can assist in identifying the colours of the words.
Generalisation of Findings
The results can possibly be generalised for this age group because the results were all in line with the experimental hypothesis. There was also a fairly large sample of participants. However, the sample that was used was an opportunity sample and was therefore biased. The participants were also of the same class and from the same area, which may have had an impact on the results as their education and upbringing would have been similar.
Because the task was fairly simple, however, it could be said that the class and the area would not have been a validity issue, and that the results could therefore be generalised to the target population.
Application of study to everyday life
This study shows that when automatic processes are carried out, other factors can be neglected. For example, little attention is paid to the colours of the words that we read. An example of this being an issue in everyday life would be in advertising. Do the colours of words on objects such as billboards actually have an impact on potential customers? A picture on a billboard may also not be necessary if people’s immediate response is to read any words on the billboard. Even if it is necessary, advertisers should account for the fact that people will usually read the words before looking at the picture, meaning that the intended impact of the advertisement should not depend on looking at the picture first and then reading the words.
References
Studies and theories mentioned in the background research
Webpages
- J.R. Stroop’s original study, 1935:
- MacLeod and Dunbar’s theory that reading is a faster process, 1988:
- Kline’s Benchmark study, 1964:
- McKenna and Sharma’s study on the emotional Stroop effect, 1995:
- General introduction to the Stroop effect:
- The ‘picture-word Stroop task’:
Books
N/A
Appendices
List of appendices
Item 1: The brief used in the experiment (includes standardised instructions)
Item 2: The de-brief used in the experiment
Item 3: Card 1 (words printed in the same colour that they are written as)
Item 4: Card 2 (words printed in different colours to what they were written as)
Item 5: The raw results from the experiment (two pages)
Item 6: The working out of the mean, median, range and standard deviation from the results.