In line with the aims of the study, the target population was people of both genders aged 16-18, and participants were recruited using an opportunity sample. The 20 participants were male and female sixth form students at St Aidan's Church of England High School; the youngest was 16 and the oldest 18. Conditions were allocated to participants by alternation, whereby odd-numbered participants (1st, 3rd, 5th, ...) were allocated to Condition A (categorised) and even-numbered participants (2nd, 4th, 6th, ...) were allocated to Condition B (randomised). Psychology students did not participate because they would be more likely to guess the aim of the experiment, which could introduce a confounding variable.
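The alternation rule described above can be sketched in a few lines of Python. This is only an illustration of the rule; the function name `allocate` is hypothetical, and the only figures taken from the study are the 20 participants and the two condition labels.

```python
def allocate(position):
    """Return the condition for the participant tested in the given 1-based position."""
    # Odd positions (1st, 3rd, 5th, ...) -> Condition A (categorised);
    # even positions (2nd, 4th, 6th, ...) -> Condition B (randomised).
    return "A" if position % 2 == 1 else "B"


# Allocation for the 20 participants in the order they were tested.
allocation = {position: allocate(position) for position in range(1, 21)}

for position, condition in allocation.items():
    print(f"Participant {position:2d}: Condition {condition}")
```

Alternation of this kind gives equal group sizes (10 per condition here) without requiring a random number source.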
The materials used were as follows:
- Standardised instructions and agreement form (one per participant; see Appendix 4);
- Either categorised word grid or randomised word grid (one per participant; see Appendix 4);
- Lined paper (one A4 sheet per participant);
- Pen;
- Stopwatch.
Before carrying out the investigation, the word grids were prepared. Six words were chosen from each of the four different semantic categories: sports, animals, countries and colours. In the categorised word grid the words were arranged such that each category had its own line. For the randomised word grid, the words were arranged randomly, with the order determined by a custom PHP script (see Appendix 5) to eliminate any confounding variables that may have arisen from manual randomisation.
The materials were then prepared: 20 blank sheets of lined paper were gathered and 20 copies of the consent form were printed, along with one printout of each word grid.
An empty classroom was used to carry out the experiment, and no two participants were in the room simultaneously in order to avoid cheating or distraction of any sort. Once in the classroom each participant was given the standardised instructions and agreement form to read, understand and sign. When ready, the appropriate word list and blank sheet of paper were given to the participant and the stopwatch was set for 60 seconds. After this, the word list was taken away and covered, and the participant was allowed as much time as they required in order to recall the words that they remembered.
When each participant finished writing they were debriefed and told the aims of the experiment. They were also given a further opportunity to withdraw, and the chance to ask questions or make comments.
Results
The mean number of words recalled by participants in Condition A (categorised condition) was 14.6, compared to 15.6 in Condition B (randomised condition). This was unexpected, as it was hypothesised that those to whom information was presented randomly would remember less information than those to whom it was presented in categories. Figure 1 shows the median (14.5 for Condition A; 15 for Condition B), range (6 for Condition A, 10 for Condition B) and interquartile range (4 for Conditions A and B) of the results.
Figure 2 shows the mean and standard deviation of the number of words recalled by participants in the two conditions. The results in Condition B are more widely spread than those in Condition A, and the mean number of words recalled in Condition B is higher than in Condition A, although this difference was not statistically significant (see Appendix 2). This may be due to individual differences (see Discussion).
The frequency with which each word was recalled was also recorded. The most frequently recalled words were dog, Uganda and red (each recalled 18 times out of a possible 20), and the least frequently recalled were swimming and cow (each 8/20), Australia (6/20) and China (4/20). In total, 303 words were recalled out of a possible 480, meaning that 177 were forgotten. The most-recalled category was colours (85/120) and the least-recalled was countries (64/120). See Appendix 1 for the recall frequency of each word.
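The totals quoted above can be checked with a short piece of arithmetic. The sketch below uses only figures stated in this report (the four category totals, 20 participants and 24 words); the variable names are illustrative.

```python
# Category totals as reported (each out of 6 words x 20 participants = 120).
recalled = {"colours": 85, "sports": 77, "animals": 77, "countries": 64}

PARTICIPANTS = 20
WORDS_PER_GRID = 24

possible = PARTICIPANTS * WORDS_PER_GRID   # 480 possible recalls
total = sum(recalled.values())             # words actually recalled
forgotten = possible - total               # words not recalled

per_category = possible // len(recalled)   # 120 possible recalls per category
for category, count in recalled.items():
    print(f"{category}: {count}/{per_category} = {count / per_category:.1%}")
print(f"total: {total}/{possible} = {total / possible:.1%}, forgotten: {forgotten}")
```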
The independent t-test (see Appendix 2) was used to measure the significance of the difference in results between the two conditions. A value of t ≥ 1.734 was required for significance at p ≤ 0.05 (one-tailed) with 18 degrees of freedom. The value of t was found to be 1.188, meaning that the difference between the conditions is not significant.
All raw data and calculations can be found in Appendices 1-3.
Discussion
Unlike previous research such as that by Bower et al. (1969), who found that organisation of information increased recall, this experiment found that the degree of organisation with which the word grids were presented had little effect on the amount of information participants remembered. This suggests that organisation of information upon presentation has little effect on how well a person remembers that information. The experimental hypothesis for this study can therefore be rejected and the null hypothesis retained.
Nonetheless, organisation cannot be rejected as a key factor in remembering information. Participants assigned to Condition B tended to group the words into the same categories that were laid out in Condition A. Indeed, this appears to be the same phenomenon that Bousfield (1953) describes as categorical clustering. Having been noted, this trend was analysed by measuring the degree of categorisation (as a percentage; see Appendix 3) for each of the participants and for the two word lists themselves.
Had the word grids been recalled word-for-word, participants in Conditions A and B would have produced results with 100% and 0.5% categorisation, respectively. The actual averages were 85.1% and 68.83%, respectively, which indicates that participants, particularly those in Condition B, undertook a substantial amount of categorical organisation of the information presented. This further suggests that organisation as a cognitive process is very important for the encoding of information, supporting Mandler's (1967) claim that organisation is a necessary condition for memory. The average degree of categorical clustering may have been lower for Condition B than for Condition A because the categories were not obvious, which allowed more freedom in the choice of organisational method.
However, if semantic categorisation were the only way of storing information, it would follow that every participant would show 100% categorisation in each condition. This was not the case, which may be accounted for by individual differences. For example, one participant (B01, see Appendix 1) said that she remembered the words by associating colours with animals, which does itself suggest a degree of categorisation, but not one in which the categories are clustered in blocks. An example of a method of organisation which did not involve categorisation was shown by participant B08, who commented that she used rhyming strings of words to remember more easily. This was a form of acoustic organisation, not one which relied on semantic categories. However, even though categorical clustering was not evident in all of the participants' recalls, each participant organised the data in their own way (rhymes, mental images, the order in which the words occurred on the list, etc.). This also supports Tulving's (1968) claim that people presented with randomly sorted information will attempt to organise it in some manner.
In Condition A, the categorisation of the words was very evident through the layout of the word grid and this might account for the higher average degree of categorisation. This held even for participants who had forgotten words. For example, participant A04 said he knew that he had failed to recall an entire category, but could not remember what the category was. Upon being told what the category was he successfully recalled all six words (the countries) without further prompting. Participant A05 similarly commented that she knew how many words of each category she had forgotten. Other participants in Condition A also commented that they counted how many words they had forgotten. This shows that, even without the categories being explicitly labelled, participants in Condition A were able to notice patterns in the categories of the words and, using the given information that there were 24 words, to work out how many words in each category remained.
This indicates that when memories of this nature are stored and categorically organised, what might be called a domino effect is seen upon recall: knowledge of the semantic category acts as a recognition cue for one of the words, and that word in turn acts as a cue for the words that follow. A demonstration of this became evident in debriefing: many of the participants requested to see the word list again, and of those, each one made remarks similar to "I knew I'd forgotten a colour." A more commonplace example of this domino effect may be seen when reciting the alphabet. Most people can say it from A to Z without hesitation. However, asking a person to recite it from the letter T, for example, may cause hesitation because no preceding letters are available to act as a recognition cue, as they would be if the sequence R, S, T were given.
Returning to the original finding that there was no significant difference in the number of words recalled in either condition, the reason for the discrepancy between this study and previous research, which did find significant differences, may be that participants were given 60 seconds to remember the word list and an unlimited amount of time to write down the words that they remembered. This gave an average of 2.5 seconds to remember each word, which may have given participants in Condition B the opportunity to recognise that there were four distinct groups of words. A way round this for future research may be to either increase the number of words, increase the number of categories, decrease the amount of time given to remember the words or limit the time given to recall the words. In other words, it may be that if the obviousness of the categorisation is reduced, a significant difference between the two conditions may become evident as the semantic domino effect may not develop.
Another interesting finding was the distribution with which the words were recalled. For example, the most commonly-recalled country (18/20) was Uganda, compared with Germany second (14/20) and England third (13/20). Similarly, the most common animals to be recalled were dog (18/20), cat (17/20) and chimpanzee (16/20). Conversely, words such as China (4/20) and cow (8/20) were very infrequently recalled. The patterns observed here indicate that the words most often recalled fall into one of two groups:
- Words very common in usage and typical to their category in the word grid; or
- Infrequently-used words which stand out.
This would explain the large number of participants remembering Uganda and chimpanzee, for example, as these words are very infrequently used and may have stood out from the more generic words in the grid. It may also account for why words like China, cow and swimming were frequently forgotten: they are neither very common nor very uncommon in everyday usage, nor are they stereotypical of their respective categories. For example, if a person were asked to name a sport, swimming would be an unlikely answer and football a more likely one, despite swimming being a relatively common word to encounter. This builds on the idea of categories acting as recognition cues for subsequent words.
Also interesting was the distribution of recalls by category: colours were recalled the most frequently (85/120), compared with sports and animals (both 77/120) and, least frequently, countries (64/120). There could be several explanations for this, but it appears to result primarily from two factors: the frequency of usage, and the size of the categories' domains. For example, colours are frequently used words and there are relatively few words that fall under that category; sports and animals are also categories from which often-used words are drawn, but there are many more words that fit into them than there are for colours; and countries are less frequently used words. Therefore, a decrease in common usage and an increase in size may lead to proactive interference, causing more confusion and, occasionally, incorrect words to be recalled. This is demonstrated, for example, by the fact that the word America was recalled three times despite not being on either of the lists (see Appendix 1).
In the results from Condition B, there is also evidence that primacy and recency effects may have occurred. Green and dog are the first and last words on the grid, and they were recalled by 10 and 9, respectively, of the 10 participants in that condition. No such effect was found, however, in Condition A, suggesting that the order in which words are sequenced has little effect if there is a more significant method of organisation present (in this case, categories).
These patterns indicate that organisation is the key factor in remembering information, but at any one time there may be several methods of organisation occurring simultaneously, such as the words' semantic categories, the order that the words are written down, and the frequency of the words' usage, among others.
This study did, however, have limitations, the most prominent of which is the potential lack of population validity as a result of the relatively small sample size used and the highly restricted age group from which participants were drawn. This could be overcome in future research by widening the target population and using a larger sample in order to identify trends in more detail. In terms of ecological validity, the study used artificial stimuli to test memory; naturally occurring stimuli could be used instead in order to observe the effects of organisation on learning in a natural setting and thus improve the ecological validity.
There are implications of this study for many aspects of life which involve learning, but particularly education. It has shown that information is better learnt when organised, either upon presentation or as a mental process. The implication of this is that pupils and students may learn information more efficiently through teaching methods involving organising information into structures and providing tasks to do so if the information is not already organised. The former would provide explicit organisation, and the latter would allow individual pupils and students to find their own ways to learn greater amounts of information.
Future research might aim to investigate further into the effects of categorisation. This could be done by using a larger list of words or by drawing words from more distinct categories, and observing if, how and how much participants categorise these words; and relating this to the amount of information they remember. A wider target population would also be beneficial. It is often cited that children learn information more efficiently than older adults, and giving participants from the two age groups the same task and comparing the results would provide insight into how the process of learning is different between them, if indeed it is different.
To conclude, this study has found no significant effect of organisation of information upon the learning of this information, but organisation cannot be ruled out as a significant factor. It may be the case that organisation upon encoding, rather than presentation, is the factor that determines the storage of the information. This organisation may be in the form of categorisation, but individual differences exist with regard to how this information is organised. Other factors may be how commonly the information is experienced in the given context, and how many recognition cues are available for the information to be recalled.
References
Bousfield, W.A. (1953). The occurrence of clustering in the recall of randomly arranged associates. Journal of General Psychology, 49, pp. 229–240.
Bower, G.H., Clark, M.C., Lesgold, A.M. & Winzenz, D. (1969). Hierarchical retrieval schemes in recall of categorized word lists. Journal of Verbal Learning and Verbal Behavior, 8, pp. 323–343.
Coolican, H. (2004). Research Methods and Statistics in Psychology, 4th rev. ed., p. 662. Hodder & Stoughton.
Gutchess, A.H., Yoon, C., Luo, T., Feinberg, F., Hedden, T., Jing, Q., Nisbett, R.E. & Park, D.C. (2006). Categorical organization in free recall across culture and age. Gerontology, 52, pp. 314–323.
Hart, J., Berndt, R. & Caramazza, A. (1985). Category-specific naming deficit following cerebral infarction. Nature, 316, pp. 439–440.
Kahana, M.J. & Wingfield, A. (2000). A functional relation between learning and organization in free recall. Psychonomic Bulletin & Review, 7, pp. 516–521.
Mandler, G. (1967). Organization and memory. In K.W. Spence & J.T. Spence (Eds.), The psychology of learning and motivation, 1, pp. 327–372. New York: Academic Press.
Rubin, D.C. & Olson, M.J. (1980). Recall of semantic domains. Memory & Cognition, 8, pp. 354–366.
Tulving, E. (1968). Theoretical issues in free recall. In T.R. Dixon & D.L. Horton (Eds.), Verbal behavior and general behavior theory, pp. 2–36. Englewood Cliffs, NJ: Prentice-Hall.
Appendix 1: Raw data
The participant identification numbers are in the form Ann or Bnn, where A refers to the categorised condition, B refers to the randomised condition, and nn is a two-digit number rendering the full ID unique to each participant.
The numbers in the following table refer to the order in which the words were recalled, and the numbers in bold show the last word recalled (and thus the total number of words recalled by each participant). Words not recalled are marked by a dot. A key to other symbols used is shown at the bottom of the page.
* Word recalled similar in meaning but different to original.
† Word recalled more than once.
‡ Word misspelt but accepted.
The following table shows words recalled by participants which were not on the word grid. If a participant is not listed, they only recalled words which were on the grid.
* Word similar in meaning but different to a word on the grid
During debriefing, there was the opportunity for participants to ask any questions or give feedback. Any significant comments and questions were noted and are listed in the table below.
The words, ordered first by their frequency (out of 20), then by their category, then alphabetically, are shown in the table below.
Appendix 2: Central tendency, spread and significance
Central tendency and spread
The following tables show the number of correct words recalled by each participant, together with the mean, standard deviation and other values required for calculating the significance of the results.
In summary:
- The mean number of words recalled in Condition A is 14.6, and the mean for Condition B is 15.6.
- The median for Condition A is 14.5, and that for Condition B is 15.
- The range for Condition A is 6, and that for Condition B is 10.
- The standard deviation for Condition A is 2.15, and that for Condition B is 5.21.
Significance
The t-test for unrelated data was used to test the significance of the difference between the results. This was calculated using the following equation with the values from the tables above.
A value of t ≥ 1.734 was required for significance with p ≤ 0.05. Putting in the values from the table above:
The value of t was thus found to be 1.188, which is below the threshold for statistical significance.
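As an illustration of how a t-test for unrelated data is calculated, the following Python sketch uses the standard pooled-variance form of the test. This is an assumption: it may not be the exact form of the equation used above. The scores in `cond_a` and `cond_b` are hypothetical and are NOT the study's raw data, so the resulting t value is not the study's value of 1.188. Only the critical value of 1.734 and the 18 degrees of freedom are taken from this report.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t statistic (b minus a) and degrees of freedom for two unrelated samples."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(b) - mean(a)) / standard_error, df


# Hypothetical scores for illustration only (10 participants per condition).
cond_a = [12, 13, 14, 14, 15, 15, 16, 16, 17, 18]
cond_b = [10, 12, 13, 15, 15, 16, 17, 18, 20, 20]

CRITICAL_T = 1.734  # value required at p <= 0.05 with 18 degrees of freedom, as stated above

t, df = independent_t(cond_a, cond_b)
significant = t >= CRITICAL_T
print(f"t = {t:.3f} (df = {df}); significant: {significant}")
```

With two groups of 10 the degrees of freedom come to 18, matching the figure used for the critical value.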
Appendix 3: Measure of categorisation in recall
The categorisation of recall was done by first replacing recalled words with letters that represent their respective categories.
S = sports; A = animals; X = countries; C = colours;
This leaves a string of letters. A block is defined as a series of one or more of the same letters joined together, e.g. [CCC] or [X]. The degree of categorical clustering was calculated by using the following equation.
C is the degree of categorical clustering, a is the number of categories recalled, n is the total number of words recalled, wi is the number of words in each category i, and bi is the number of blocks encompassed by that category. It is done in this way because Σwi will always be equal to n. This means that if all words are grouped together in their various categories, then bi will be 1 in each case, so the numerator and denominator of the fraction will be equal and C = 1. For any other values of bi, C < 1, and the closer C is to 0, the lower the degree of categorisation. A value of C = 0 indicates a complete lack of categorical clustering, i.e. no adjacent words which are in the same category.
C when referring to the degree of categorisation of a participant's results will be represented with a percentage, so C = 1 refers to 100% categorisation and C = 0.3819 refers to 38.19% categorisation, and so on.
Using the data from participant A09 as an example:
All 4 categories were used → a = 4;
12 words* were recalled → n = 12;
There are 3 words in S in 3 blocks → w1 = 3, b1 = 3;
There are 4 words in C in 2 blocks → w2 = 4, b2 = 2;
There are 4 words in A in 1 block → w3 = 4, b3 = 1;
There is 1 word* in X in 1 block → w4 = 1, b4 = 1.
* Words not on the original word grid are omitted.
Putting these into the equation gives C = 0.8000, or 80%, which is the degree of categorisation for participant A09.
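The first two steps of this procedure (replacing words with category letters, then counting words and blocks per category) can be sketched in Python as follows. The word-to-category mapping is taken from the word grids. The recall order in `recall` is hypothetical: it was constructed only so that its counts match the figures quoted for A09 above, and it is not that participant's actual recall. The function names are illustrative. The sketch stops at the counts wi and bi that feed the equation for C.

```python
from itertools import groupby

# Category letters as defined above: S = sports, A = animals, X = countries, C = colours.
GRID = {
    "S": ["football", "rugby", "tennis", "badminton", "golf", "swimming"],
    "A": ["bird", "cat", "dog", "horse", "cow", "chimpanzee"],
    "X": ["England", "Germany", "China", "Uganda", "Australia", "Finland"],
    "C": ["red", "green", "blue", "purple", "yellow", "white"],
}
CATEGORY_LETTER = {word: letter for letter, words in GRID.items() for word in words}


def category_string(recall):
    """Replace each recalled word with its category letter; words not on the grid are omitted."""
    return "".join(CATEGORY_LETTER[word] for word in recall if word in CATEGORY_LETTER)


def words_and_blocks(letters):
    """Return {letter: (w, b)}: words recalled and number of blocks for each category."""
    counts = {}
    for letter, run in groupby(letters):  # each run of identical letters is one block
        w, b = counts.get(letter, (0, 0))
        counts[letter] = (w + len(list(run)), b + 1)
    return counts


# Hypothetical recall order (includes one off-grid word, "America", which is dropped).
recall = ["football", "red", "green", "rugby", "dog", "cat", "bird", "horse",
          "tennis", "blue", "white", "America", "Uganda"]

letters = category_string(recall)
counts = words_and_blocks(letters)
n = len(letters)   # total number of on-grid words recalled
a = len(counts)    # number of categories recalled
print(letters, counts, f"n = {n}, a = {a}")
```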
The degree of categorisation to four decimal places for each participant is shown below:
If words were recalled in the order that they appear on the grids, then:
For the categorised grid, C = 100%;
For the randomised grid, C = 0.5%.
Appendix 4: Original materials
Standardised instructions and agreement form
This is the form used to ensure that participants understood the procedures of the investigation, and that they understood their rights as participants, for example, to withdraw.
Word grids
The following 6×4 word grids are identical to those presented to participants. Both grids contain the same 24 words. The words are split into 4 categories each of 6 words. The categories are, respectively: sports, animals, countries, colours.
Categorised grid (Condition A)
Each category is listed on its own line.
Randomised grid (Condition B)
The order of the words in this list was randomised using a PHP script (see Appendix 5), working horizontally from the top-left to the bottom-right. This eliminated any patterns, specifically semantic patterns, that may have arisen through manual randomisation.
Appendix 5: Random word order script source code
This script, whose source code is shown in a monospace font, was used in order to eliminate any chance of a pattern occurring accidentally as a result of human error. Line numbers are shown on the left. The script is written in PHP and can therefore be executed by embedding it in an HTML page on a PHP-enabled server, or otherwise by running it with a PHP command-line interpreter.
1: $wordid = array();
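2: $output = array(); $wordlist = ""; // initialise the output string so that line 31 does not append to an undefined variable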
Lines 3-12 put all integers from 1 to 24 in a random order in the array $output.
3: for($i = 1; $i <= 24; $i++) {
4: $wordid[] = $i;
5: }
6:
7: for($j = 1; $j <= 24; $j++) {
8: $key = mt_rand(0, count($wordid) - 1);
9: $output[] = $wordid[$key];
10: unset($wordid[$key]);
11: sort($wordid);
12: }
Lines 13-33 convert the numbers in $output to words. Line 15 pads single-digit numbers with a leading zero (e.g. 3 becomes 03) to avoid ambiguity between, for example, 14 and 1 followed by 4. Line 31 replaces the numerical values from $rep_num with the corresponding words in $rep_word and adds them to the final word list.
13: foreach($output as $num) {
14:
15: if($num <= 9) { $num = "0" . $num; }
16:
17: $rep_num = array(
18: "01", "02", "03", "04", "05", "06",
19: "07", "08", "09", "10", "11", "12",
20: "13", "14", "15", "16", "17", "18",
21: "19", "20", "21", "22", "23", "24"
22: );
23:
24: $rep_word = array(
25: "football", "rugby", "tennis", "badminton", "golf", "swimming",
26: "bird", "cat", "dog", "horse", "cow", "chimpanzee",
27: "England", "Germany", "China", "Uganda", "Australia", "Finland",
28: "red", "green", "blue", "purple", "yellow", "white"
29: );
30:
31: $wordlist .= str_replace($rep_num, $rep_word, $num) . " \n";
32:
33: }
Line 34 displays the randomised word list.
34: echo $wordlist;
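For comparison, the same randomisation can be sketched more compactly in Python. This is an alternative illustration, not the script that was used in the study. It assumes the 24 words are shuffled as a single list and then laid out horizontally in rows of six, as described in Appendix 4. The function name `randomised_grid` and the seed value are hypothetical.

```python
import random

WORDS = [
    "football", "rugby", "tennis", "badminton", "golf", "swimming",
    "bird", "cat", "dog", "horse", "cow", "chimpanzee",
    "England", "Germany", "China", "Uganda", "Australia", "Finland",
    "red", "green", "blue", "purple", "yellow", "white",
]


def randomised_grid(words, columns=6, rng=None):
    """Shuffle a copy of the words and split it into rows, filling from top-left to bottom-right."""
    if rng is None:
        rng = random.Random()
    order = list(words)   # copy, so the original list is left untouched
    rng.shuffle(order)    # in-place uniform shuffle
    return [order[i:i + columns] for i in range(0, len(order), columns)]


# A fixed seed (arbitrary value) makes the grid reproducible; omit rng for a fresh order each run.
grid = randomised_grid(WORDS, rng=random.Random(2024))
for row in grid:
    print("  ".join(row))
```

Shuffling the whole list in one step replaces the draw-and-remove loop on lines 7-12 of the PHP script, and working with the words directly removes the need for the number-to-word replacement on lines 13-33.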