Fig 1 – Necker’s Cube (Pike & Edgar, 2005).
A criticism of Gibson, and indeed of bottom-up processing generally, is that it fails to explain certain illusions such as the Necker’s cube (Fig 1), where the cube can be seen from two perspectives. This, proponents of top-down processing would suggest, indicates that the stimulus alone is insufficient to determine what is perceived. They would argue that there are cognitive processes at work that allow us to perceive the two perspectives, with one hypothesis being formed and then replaced when another presents itself. Gibsonian theory would respond that the cube lacks the depth and texture cues that are required for perception, and that it therefore lacks ecological validity (Pike & Edgar, 2005). However, we do perceive the cube in a three-dimensional form, albeit from two possible perspectives. Gibson also cannot explain naturally occurring illusions. For example, if you stare at a waterfall for some time and then gaze at a stationary object, the object appears to move in the opposite direction to the falling water. If everything needed for perception were contained within the stimulus, there should be no perception of movement, which may suggest the involvement of top-down processing. Whilst rich or complete stimuli may be able to provide all that is needed for perception, bottom-up theories struggle to explain how incomplete or impoverished stimuli are perceived. Gibson also omitted to explain what he meant by information being “picked up” from the environment, leaving the question of the processes involved in perception unanswered (Pike & Edgar, 2005).
David Marr’s theory of perception, whilst agreeing that perception is bottom-up, set out to explain how sensory information is analysed, in contrast to Gibson’s mainly descriptive view of how perception occurs (as cited in Pike & Edgar, 2005). Unlike Gibson, Marr concentrated on object recognition as the goal of perception rather than perception for action. His theory breaks the analysis of the retinal image down into four stages, each building on the one before, and so extends Gibson’s theory by addressing how objects are “picked up”. Whilst there is research to support parts of Marr’s theory, particularly in relation to the formation of the 2½D sketch (one of the stages in the theory), there is also a body of research that proposes a flaw in another of his stages, the full primal sketch (Pike & Edgar, 2005). However, it is perhaps Marr’s computational approach to research that has had the most significant impact on the study of perception.
In the natural world, where a more complicated situation exists, intuition suggests that a combination of top-down and bottom-up processing may be involved in perception (Pike & Edgar, 2005). If everything that is needed for perception is found in the stimulus, then bottom-up processing theory struggles to explain the impoverished picture below (Fig 2). If this were the first time you saw this picture, it might look like a series of black and white blobs. However, once it is explained that it is in fact a picture of a Dalmatian, we perceive it differently. Once we know the picture is of a dog, it is difficult to perceive it any other way, which strongly suggests that top-down processing, in the form of prior knowledge, has a compelling influence on our perception (Pike & Edgar, 2005).
Fig 2. (BBC, 2003)
The constructionist theory suggests that stimuli are impoverished or incomplete and that we in fact build our perception through a combination of bottom-up and top-down processing. R. L. Gregory, one of the main proponents of the constructionist theory, suggests that we form a series of hypotheses that help to fill in the gaps in incomplete stimuli (Pike & Edgar, 2005). These hypotheses are influenced by stored knowledge, that is, by top-down processing. From the constructionist view, perception is indirect, as opposed to the direct view of Gibson (Pike & Edgar, 2005).
A criticism of this theory, however, is that it does not explain how these hypotheses are formed, in much the same way that Gibson does not explain what he meant by “picked up” (Pike & Edgar, 2005). Neither does it suggest any mechanism for how we decide when to stop hypothesising and arrive at a particular perception. There is also the issue of false perception, when we look at an object and perceive it as a certain object even though we know it is geometrically impossible. This is highlighted by the Penrose Triangle (Fig 3).
Fig 3 – Penrose Triangle (Pike & Edgar, 2005).
Whilst we understand that this figure is impossible to construct, when we attend to each corner we perceive it as a three-dimensional possibility. This indicates a conflict between our perceptual hypothesis and our knowledge that this is an impossible figure drawn in two dimensions (Pike & Edgar, 2005). The theory at present therefore offers no answer as to how we can know something is wrong yet still perceive it erroneously.
It is the conflict between bottom-up and top-down processing that has led toward a dual theory of processing. The theoretical approaches can also be classified according to whether they treat perception as serving action or object recognition (Pike & Edgar, 2005). Both Marr and Gregory tended to look at perception with an end goal of object recognition, whereas for Gibson it was perception for action. However, in the natural world we do both. If an object is thrown at someone, it does not matter what the object is as long as they move to avoid injury, but it does matter afterwards when they want to study the object to discover what it was (Pike & Edgar, 2005).
Modern research conducted on hamsters has suggested that there are at least two streams of information flowing from the retina to the brain. The first is the ventral stream, which runs from the primary visual cortex towards the inferotemporal cortex. This appears to be concerned with object recognition (Pike & Edgar, 2005). The second, the dorsal stream, also runs from the primary visual cortex but towards the parietal cortex, and seems more concerned with the analysis of motion. At some point, however, these two streams combine to provide an integrated view of perception (Norman, 2002). Schneider (as cited in Pike & Edgar, 2005) labelled these systems the “what” and “where” processing streams. Norman (2002) points out that the dorsal stream is faster than the ventral stream and is better at processing motion and driving visually led behaviour. It has a short memory and appears to operate without much conscious involvement. All of this seems to fit the Gibsonian view of perception.
The ventral stream, whilst slower, is more adept at processing fine detail. It is concerned with visual object recognition. It operates with conscious involvement and makes greater use of memory, drawing on stored visual representations (Norman, 2002). This, it could be argued, is similar to the constructionist theory of perception.
However, as Norman has suggested, the two streams work synergistically in parallel. Both systems deal with size, distance and shape (Norman, 2002). Milner and Goodale’s (as cited in Pike & Edgar, 2005) work with DF, who suffered damage to her ventral system, showed that size information was available to her unconsciously through her dorsal stream even though it was unavailable through her ventral stream, indicating the synergistic nature of these visual processes.
The argument that any distinction between the dorsal and ventral processes may be better understood as a synergistic dual process can also be extended to the bottom-up and top-down distinction. There is evidence to suggest that it is in fact a combination of both that may be involved in perception. Backward masking studies demonstrate that masking occurs when the mask either overlays the target or exactly coincides with the target. If this occurs, then it generally follows that a participant is unable to remember the target (Pike & Edgar, 2005). However, it was also discovered that masking occurred when a four-dot mask was presented in either of two conditions: first, when the mask and target were presented together and the target was displayed very briefly; and second, when the mask was displayed after a very brief presentation of the target (Enns & Di Lollo, 2000).
This is explained by reference to the re-entrant process. Evidence from neuroscience suggests that information passed along a neural network does not all travel one way. Neurons also fire signals back along re-entrant pathways (Pike & Edgar, 2005). In line with the constructionist view, and in particular with Gregory, we develop a perceptual hypothesis that is then tested. The stimuli are received and a hypothesis is returned along the re-entrant pathway. However, by the time the information is passed along the neural pathway, the mask is in place and the information has changed. This leads to a new hypothesis needing to be formed and the earlier one being “forgotten”. Whilst bottom-up processing is required, it is the formation of a hypothesis and its testing along re-entrant pathways that also demonstrates top-down involvement.
Making a clear distinction between bottom-up and top-down processing might be useful in looking at individual parts of perception; however, it may not be useful in understanding the whole process of perception. It may be like trying to understand how a plane flies by concentrating only on how the wings function without taking the engine into account. It is clear that we need bottom-up processing, as the stimuli are the genesis of perception. However, this is not sufficient to explain perception completely, as top-down processing is needed to make sense of impoverished stimuli. As Norman (2002) suggests:
“…they can coexist in a broader theory of perception.”
Like the evidence on the ventral and dorsal streams, the Enns and Di Lollo research strongly suggests a dual approach, with bottom-up and top-down processes together providing a more holistic account of perception. However, further research is needed to explain fully how these processes interact, particularly in relation to hypothesis testing, such as when we stop forming hypotheses or how we finally decide which one is correct.
References
BBC. (2003, November). Reith lectures 2003: The emerging mind. Retrieved from http://www.bbc.co.uk/radio4/reith2003/lecture3.shtml
Enns, J. T., & Di Lollo, V. (2000). What's new in visual masking? Trends in Cognitive Sciences, 4(9), 345-352.
Norman, J. (2002). Two visual systems and two theories of perception: An attempt to reconcile the constructivist and ecological approaches. Behavioral and Brain Sciences, 25(1), 73-96.
Pike, G., & Edgar, G. (2005). Perception. In H. Kaye (Ed.), Cognitive Psychology (2nd ed., pp. 63-104). Milton Keynes, England: The Open University.