The dependent variable in the research paper is the actual behaviour, which in this case is the length of time spent studying. The independent variable is the inter-test period (whether the daily, weekly or three-week testing schedule is applied). The reinforcer is the test itself.
The participants in the first section of this study were six female students aged 18 to 51 years and two male students aged 21 and 23 years. All the participants had passed a prerequisite course in general psychology and were members of an introductory lecture course in educational psychology. Participants were self-selected, as they volunteered to take part in the study for course credit. All of the participants had to meet strict selection criteria: they had to have no previous experience in behavioural psychology; they had to have at least an overall C average; and they had to have no commitments (work or academic) that would conflict with the hours the study room was available for use.
The study materials consisted of the full text of Ferster and Perrott’s “Behavior Principles” (1968) and selected chapters from “The Technology of Teaching” by Skinner (1968). Any titles or names that would allow the text materials to be identified were removed. Each daily reading assignment averaged approximately fifteen pages.
The study area was a 17x25 ft seminar room in which the students could sit either in desk chairs or around a large table. Admittance to the room during study times was forbidden to anyone other than the participants, and notices to this effect were placed around the room.
The actual ‘time spent studying’ was defined as the time from when a participant sat down with the text material in front of them until they rose from the chair. This time was recorded to the nearest half minute by the observers. Behaviours such as staring into space or lighting a cigarette were counted as part of the overall studying behaviour and so were included in the recorded time spent studying. Zero minutes of study time were recorded for any session that a participant failed to attend.
The observation room was an 8x11 ft room adjacent to the study area, with a 35x43 inch one-way glass window that gave an unhindered view of the entire study area. The observers were two male graduate students and two male faculty members of the university. During each three-hour study session, observations were made by a single observer who sat at a table behind the window and recorded the times at which each student began and stopped studying. Two other observers, seated so as to reduce the likelihood of copying, also independently recorded each student’s studying time from the same clock. During session 11, an inter-observer reliability check was taken for all participants, which gave agreement scores ranging from 92 to 100 percent.
The study area was available Monday through Thursday between 3pm and 6pm. All participants were free to come and go between these times or not to attend at all. Four requests were made of the participants: that talking be kept to a minimum; that they avoid bringing anyone else into the study room; that only the materials provided within the room be studied; and that neither the study materials nor any notes on them be removed from the room.
Grades were based on a point system: the total quiz points accumulated over the nine-week course, out of a possible 180, determined a grade of A to D. An A required 160-180 points, a B 140-159, a C 120-139, and a D 100-119. Daily tests were worth 5 points each, and each weekly test, covering the four daily assignments set from Monday to Thursday, was worth 20 points. A voluntary discussion, in which study methods and test items were reviewed, followed every testing period. Tests were given on Tuesday through Friday under the daily testing schedule and on Friday under the weekly testing schedule.
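The grade bands above amount to a simple lookup from accumulated points to a letter grade. The following is a minimal sketch of that lookup; the function name `grade_for` is hypothetical, and returning `None` for totals under 100 is an assumption, since the description does not say what happened below the D band.

```python
# Grade bands as described above: (minimum points, grade), highest band first.
GRADE_BANDS = [(160, "A"), (140, "B"), (120, "C"), (100, "D")]
MAX_POINTS = 180  # total quiz points available over the nine-week course


def grade_for(points):
    """Return the letter grade for an accumulated quiz total.

    Returns None for totals under 100 points (an assumption: the
    description gives no outcome below the D band).
    """
    if not 0 <= points <= MAX_POINTS:
        raise ValueError("points must be between 0 and 180")
    for minimum, grade in GRADE_BANDS:
        if points >= minimum:
            return grade
    return None


for total_points in (172, 159, 120, 100, 85):
    print(total_points, grade_for(total_points))
```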
The research design was a within-subjects (ABAB) design. Weeks 1-3 consisted of daily testing and weeks 4 and 5 of weekly testing; week 6 saw a return to daily testing; weekly testing was again employed in weeks 7 and 8; and daily testing was reinstated in the final week (week 9).
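As a quick consistency check, the week-by-week design can be combined with the point values given earlier (5 points per daily test, four daily tests per week, 20 points per weekly test) to confirm the stated maximum of 180 points. This is only an illustrative sketch; the names `SCHEDULE` and `points_available` are hypothetical.

```python
# Testing condition in each week of Experiment 1, as listed in the design.
SCHEDULE = {
    1: "daily", 2: "daily", 3: "daily",
    4: "weekly", 5: "weekly",
    6: "daily",
    7: "weekly", 8: "weekly",
    9: "daily",
}

DAILY_TESTS_PER_WEEK = 4   # daily tests fell on Tuesday through Friday
DAILY_TEST_POINTS = 5
WEEKLY_TEST_POINTS = 20


def points_available(week):
    """Return the quiz points that could be earned in the given week."""
    if SCHEDULE[week] == "daily":
        return DAILY_TESTS_PER_WEEK * DAILY_TEST_POINTS
    return WEEKLY_TEST_POINTS


total = sum(points_available(week) for week in SCHEDULE)
print(total)  # 180
```

Every week carries 20 points under either condition, so the two schedules differ only in when the points are tested, not in how many are at stake.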
During daily testing conditions, every participant attended all the study sessions. Under the weekly testing schedule, however, absences were frequent in the early sessions of the week and decreased as the test approached. The overall mean study time per session was 59.35 mins under the daily testing schedule and slightly lower, at 51.89 mins, under the weekly testing schedule. When the conditions changed from daily to weekly testing, there was a distinct increase in the session-to-session variability of study time. One participant (#8) studied fairly consistently throughout the experiment and attended all the study sessions. Participants #6 and #7 changed their studying behaviour dramatically with the change from daily to weekly testing and produced scalloped results, with the time spent in each study session increasing as the weekly test approached.
A second experiment compared students’ studying behaviour under a daily testing schedule with that under a three-week testing schedule, in order to investigate the effects of a longer inter-test interval on the participants’ studying behaviour.
The participants in the second section of this study were six male and six female students aged 19 to 26 years, again self-selected from an undergraduate course in educational psychology. The selection criteria, reading materials, study and observation areas, instructions given to the participants and response definitions were all identical to those in Experiment 1. Inter-observer reliability was checked in three sessions and gave agreement scores of between 96 and 100 percent.
The reading materials were introduced to the study room so that at any time the students had access to at least the next two weeks of reading materials. The material was allowed to accumulate over the duration of the course.
Once again a within-subjects (ABAB) design was used. Weeks 1 and 2 consisted of daily testing; weeks 3, 4 and 5 made up the three-week test period; weeks 6 and 7 saw a return to daily testing; and weeks 8, 9 and 10 saw a return to the three-week testing interval. Daily tests were again worth 5 points each and the three-week test was worth 60 points.
In the first two daily testing weeks (sessions 1 through 8), attendance was consistent, as in Experiment 1. When testing changed from a daily to a three-week schedule, each participant’s session-to-session variability in study time increased markedly. No student had 100 percent attendance. In one or both of the three-week test periods, nearly all participants produced scalloped response patterns. In session 21, a double reading assignment was given by mistake, which resulted in unusually high numbers of minutes studied by four of the participants. Study patterns in the third and fourth daily testing weeks (sessions 21 through 28) were similar to those observed in the first two weeks of daily testing. On the return to the three-week testing schedule in session 29, scallop patterns similar to those seen in the first three-week testing period were again observed.
The overall mean study time per session was 68.7 mins under the daily testing schedule and slightly higher, at 73.52 mins, under the three-week testing schedule. The consistent study patterns observed under daily testing in the first experiment were reproduced in the second. Under the three-week testing schedule, nearly all the participants showed clear scalloped patterns of study. However, two subjects exhibited unusual study patterns in which they attended, at most, only every other session. It was later discovered that they had broken the rules of the course by making other commitments that interfered with the study sessions.
The two experiments showed a functional relationship between the schedule of testing and the subsequent distribution of study behaviour. The purpose of the second experiment was to see whether the scallop effects seen in the first experiment would become more prominent with a longer period of time between tests, and this was borne out by the results.
There are, however, some problems with the study. The definition of study behaviour does not take into account whether the participant was actually studying: participants could easily have spent their time thinking about something else, because they were deemed to be ‘studying’ so long as they were seated with the text materials in front of them. The age range in the first experiment was 33 years, whereas in the second it was only 7 years. As different age groups tend to have differing study patterns and levels of motivation (Grant and Evans, 1994), the comparison of the two groups would have been affected slightly. The differing sizes of the groups do not really affect the study, because it examines studying behaviour from a behaviour analysis perspective, in which the focus is on the individual. Having only one person observe up to twelve people at a time was a weakness that could have been reduced by having one observer for every three participants. It was partly offset by having two independent observers record the same data, and agreement between the observers in the reliability checks was consistently high.
The implication of this study for educational course design is that, in theory, students will study more often and more regularly if they are tested daily. However, this is completely impractical for many reasons, including the time spent preparing and marking the tests, the teaching time lost to testing, and the resources required, even something as simple as the amount of paper needed to test each child in every class every day. The testing itself would also become extremely monotonous for both the tester and the person(s) being tested. The study has shown, however, that leaving large intervals between assessments produces a scalloped effect in studying time. The majority of schools and universities use long intervals of this kind (i.e. prelims and exams, or midterms and finals), with an inter-test time of around 4 months, which would produce very low levels of studying for an extremely long period of time. From the results obtained in this study, a more effective method of reinforcing students’ studying behaviour would be to test more frequently. However, as previously noted, daily testing is highly impractical, and even if testing were carried out every month, a scalloped effect would still be extremely likely to occur. A far better method would be to use a form of variable-interval testing, in which tests would occur on average once a month but the pupils would not know exactly when a test would be presented, and so would be more likely to study continuously.
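The unpredictable testing proposed above can be illustrated with a short sketch. It draws the gap between successive tests at random so that gaps average about a month while no individual test date can be anticipated; because the randomness lies in elapsed time, this corresponds to a variable-interval arrangement in schedule terms. The function name `draw_test_days`, the 30-day mean, the 15-day spread and the 240-day course length are all illustrative assumptions, not values taken from the study.

```python
import random


def draw_test_days(course_days, mean_gap=30, spread=15, seed=None):
    """Return the day numbers on which unannounced tests would fall.

    Each gap between tests is drawn uniformly from
    [mean_gap - spread, mean_gap + spread] days, so gaps average
    mean_gap days but no single gap is predictable.
    """
    rng = random.Random(seed)
    days = []
    day = 0
    while True:
        day += rng.randint(mean_gap - spread, mean_gap + spread)
        if day > course_days:
            return days
        days.append(day)


# Hypothetical 240-day course; the seed only makes the example repeatable.
schedule = draw_test_days(course_days=240, seed=1)
gaps = [later - earlier for earlier, later in zip([0] + schedule, schedule)]
print(schedule)
print(gaps)
```

In practice the schedule would be drawn once by the instructor and kept from the students, since the effect depends on the test dates being unknown in advance.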
References
Chance P. (1999) “Learning and Behavior”, USA, Brooks/Cole Publishing Company.
Ferster C.B. and Perrott M.C. (1968) “Behavior Principles”, New York, New Century.
Ferster C.B. and Skinner B.F. (1957) “Schedules of Reinforcement”, New Jersey, Prentice Hall.
Grant L. and Evans A. (1994) “Principles of Behavior Analysis”, New York, HarperCollins College Publishers.
Holland J.G., Solomon C., Doran J., and Frezza D.A. (1976) “The Analysis of Behavior in Planning Instruction”, California, Addison-Wesley Publishing Company.
Kelleher R.W. and Morse W.H. (1968) Determinants of the Behavioral Effects of Drugs, in Tedeschi D.H. and Tedeschi R.E. (Eds) “Importance of Fundamental Principles of Drug Evaluation”, New York, Raven Press.
Mawhinney V.T., Bostow D.E., Laws D.R., Blumenfeld G.J. and Hopkins B.L. (1971) A Comparison of Students Studying Behavior Produced by Daily, Weekly, and Three-Week Testing Schedules, Journal of Applied Behavior Analysis, 4(4), pp. 257-264.
Skinner B.F. (1968) “The Technology of Teaching”, New York, Appleton.
Zeiler M.D. (1979) Output dynamics, in Zeiler M.D. and Harzem P. (Eds) “Reinforcement and the Organization of Behaviour”, New York, Wiley.