Evaluate the view that recognition is the only goal of visual perception.

This essay first draws on two computational theories of visual perception and a number of neuropsychological studies to evaluate the view that recognition is the only goal of visual perception. It then outlines Gibson’s ecological approach to perception and evaluates the degree to which this perspective opposes the idea that perception is for recognition. Finally, evidence from cognitive neuroscience is presented and its contribution to the debate is discussed.

According to Epstein and Rogers (1995), perception is the set of processes by which individuals recognise, organise, and make sense of the sensations they receive from the external world. It is the modality of visual perception, that is, perception by means of the eyes, and more specifically visual recognition, which is of particular interest here. Visual recognition has fascinated psychologists for decades, and can be described as the matching of the retinal image of an object to a description or representation of that object stored in memory (Farah & Ratcliff, 1994). In order to make sense of the way animals perceive their world, cognitive psychologists adopt a computational perspective, suggesting that visual perception is mediated by internal processing mechanisms.

One theory of visual recognition which exemplifies a computational perspective is that of Marr (1982). He proposed that a series of explicit computational stages gradually transforms retinal stimulation into the perception of an object. Four modules or representations, which Marr termed ‘sketches’, help the viewer to elaborate the structure of light sensed from the environment into a percept. A ‘raw primal’ sketch of the light structure is built from features such as edges, blobs and terminations; these are grouped into a ‘full primal’ sketch, from which a ‘2.5D’ representation is derived that includes information about depth and distance (Braisby & Gellatly, 2005). The final ‘3D’ stage of the computation allows the observer to identify an object by matching it against internal representations of the world stored in memory. A number of studies have attempted to test elements of Marr’s (1982) theory. Marr and Hildreth (1980), for example, tested the raw primal sketch aspect of the theory using a computer program. They reported that when the program was applied to images of everyday scenes, the algorithm was, to a degree, successful in locating the edges of objects. Furthermore, in a test of the three-dimensional aspect of Marr’s theory, Enns and Rensink (1990) found that participants were able to extract and make use of three-dimensional information in a series of grouping tasks.
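The edge-finding step can be made concrete with a small sketch. The approach associated with Marr and Hildreth (1980) smooths the image with a Gaussian, takes its second derivative (the Laplacian of the Gaussian) and marks edges where the filtered signal crosses zero. The Python below is a simplified one-dimensional illustration of that idea and not a reconstruction of their program: the function names, the value of sigma, the numerical tolerance and the toy intensity profile are all assumptions made for this example.

```python
import math

def log_kernel(sigma):
    """Sampled second derivative of a 1D Gaussian (a 1D analogue of the
    Laplacian-of-Gaussian filter). The radius of 4*sigma is an assumption."""
    radius = int(math.ceil(4 * sigma))
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-(x * x) / (2 * sigma * sigma))
        kernel.append(((x * x) / sigma ** 4 - 1 / sigma ** 2) * g)
    # Subtract the mean so the kernel sums to zero: uniform regions then
    # produce (numerically) no response.
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]

def convolve(signal, kernel):
    """Same-length convolution; border samples are repeated at the ends."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        for j, k in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            total += k * signal[idx]
        out.append(total)
    return out

def zero_crossings(response, tolerance=1e-6):
    """Indices i where the response changes sign between i and i + 1.
    Values within `tolerance` of zero are treated as zero and ignored."""
    edges = []
    for i in range(len(response) - 1):
        a, b = response[i], response[i + 1]
        if a * b < 0 and min(abs(a), abs(b)) > tolerance:
            edges.append(i)
    return edges

# Toy intensity profile: a dark region followed by a bright region.
profile = [10.0] * 20 + [200.0] * 20
edges = zero_crossings(convolve(profile, log_kernel(sigma=2.0)))
print(edges)
```

For this profile the only sign change falls between samples 19 and 20, which is where the intensity step was placed. A larger sigma blurs more and so responds to coarser changes in intensity, which is one way of thinking about how a primal sketch could be assembled at several scales.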

Exactly how 3D representations are derived from a 2.5D sketch is complex. According to Marr and Nishihara (1978), three-dimensional objects can be described by one or more generalised cones, that is, any 3D shape that has a cross-section of a consistent shape throughout its length (Braisby & Gellatly, 2005), such as a cylinder or rectangular block. The object to be recognised is recovered by identifying the generalised cones of which it is composed, using several sources of information such as the major axis of each generalised cone and the contour outline of the object and its concavities. One particular problem in recognising objects therefore arises when these features are difficult to locate. Lawson and Humphreys’ (1996) findings suggest that whilst rotating an object does not in itself affect participants’ ability to recognise it, recognition is impaired when the major axis is harder to locate, such as when the object is tilted towards the viewer and the major axis appears foreshortened. A number of neuropsychological studies also provide evidence to support Marr and Nishihara’s (1978) theory of object recognition. For instance, Warrington and Taylor (1978) found that patients with right hemisphere damage could recognise objects presented in a typical view, but failed to do so when the objects were presented in an unusual view. Furthermore, these patients were unable to determine whether two photographs showed the same object when one was taken from a typical view and the other from an unusual view, suggesting that the principal axis is a vital component in object recognition.

Biederman’s (1987) recognition-by-components model, or geon theory, is another dominant and widely accepted computational theory of object recognition. According to this theory, objects are composed of a series of geons, or three-dimensional shape concepts such as a block, cylinder, funnel or wedge. Biederman suggests that these simple geometric ‘primitives’ can be combined to produce more complex shapes. Biederman and Gerhardstein (1993) termed this distinctive arrangement of parts a ‘geon structural description’, which is extracted from the visual object and is matched in parallel against stored representations of the 36 geons that make up the basic set. The identification of a visual ...
