The way in which objects are perceived leads on to the idea of physical object recognition. When we look at an object we immediately see it for what it is: a table in front of us is instantly recognised as a table. The visual process behind this is complex, however, as we need to establish where a shape starts and ends, how different objects are distinguished from one another, and how we know an object to be what it is. Theories of how we come to recognise objects have been studied and tested. The initial studies focused mainly on the recognition of two-dimensional patterns. Template theories propose that a template is stored in long-term memory and that an object is identified by matching the stimulus input against the closest stored template. Another suggestion is that we recognise things through their features: we know that a door is tall and has a handle, and we match these features against information stored in memory. Physiological studies have also linked recognition to the responses of individual neurons. Hubel and Wiesel suggested that cells in the visual cortex have an on/off response, their rate of firing depending on how light falls upon their receptive fields (1962, as cited in Sekuler and Blake). These neurons were said to aid us in the detection of features, as processing in the visual cortex is based on a series of straight lines and edges.
Marr and Biederman are two researchers who have studied object recognition within the constructivist tradition. Both propose major concepts, and Biederman extends Marr's work, drawing his own conclusions. Marr's approach is based upon a framework of stages that in turn lead to the final visual output. Marr believed that object recognition proceeds from an input, through a raw primal sketch and a 2½-D sketch, to a 3-D representation (Reisberg, 2001). The input is the intensity of light at each point of the image on the retina. This input enables the edges and primitives to be identified, producing what is known as the raw primal sketch: a 'grey-level representation' of the object derived from changes of light. These changes are caused by different angles and textures altering the intensity, producing shadows and brightness that allow an outline of the object to be formed. This follows Gestalt principles, as features of similar size and orientation are grouped together. In the 2½-D sketch the grouped primitives are processed further, making use of texture, shading and binocular disparity. This provides a sense of depth, which is important for the next stage of recognising an object. The final stage, which Marr took to be the visual output, is the 3-D representation. This draws together all of the previous stages, giving the perceiver an overall picture of the object and allowing them to recognise it from any viewpoint. Marr and Nishihara (1978, as cited in Eysenck and Keane, 2001; Page, 2003) put forward the idea of using cylinders when describing objects.
Fig 2
They proposed that recognising a 3-D object involves matching a 3-D model against a series of 3-D descriptions stored in memory, and that concavities are identified before anything else; these then give the basis for the overall object recognition. Overall, Marr's account does explain how the visual system could work, but it operates at a greyscale, monocular level, which does not contain enough information to individualise each object. Marr's idea that light intensity identifies the edges of objects could also pose problems, as a change in brightness may not always represent a border; it may instead be caused by lighting conditions.
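The first step of Marr's pipeline, locating candidate edges from changes in light intensity, can be illustrated with a minimal sketch. This is not Marr's actual algorithm: the tiny grey-level patch, the finite-difference gradient and the threshold value below are all illustrative assumptions.

```python
# Illustrative sketch of the "raw primal sketch" idea: mark pixels where
# the grey-level intensity changes sharply, treating those as candidate
# edges. Patch, gradient scheme and threshold are invented for the demo.

def edge_map(image, threshold):
    """Mark pixels where horizontal or vertical intensity change exceeds threshold."""
    rows, cols = len(image), len(image[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx = abs(image[r][c + 1] - image[r][c]) if c + 1 < cols else 0
            dy = abs(image[r + 1][c] - image[r][c]) if r + 1 < rows else 0
            edges[r][c] = max(dx, dy) > threshold
    return edges

# A 4x4 grey-level patch: a bright square on a dark background.
patch = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
edges = edge_map(patch, threshold=50)
```

The marked pixels trace the boundary of the bright square while its uniform interior stays unmarked. The sketch also makes the criticism above concrete: a shadow falling across the patch would change intensities and be marked as an "edge" just the same.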
Biederman believed that objects look different depending on the perspective from which they are viewed. He proposed that objects are built from a general set of basic elements called geons, and that images are broken down into these geon components. In all he said that there are 36 different geons, but these can be arranged in many different ways to produce many different shapes and patterns. Biederman's model placed emphasis on five 'non-accidental' properties of edges: 1) points on a straight line in a sketch also lie on a straight line on the object; 2) points on a curve suggest a curve on the object; 3) symmetrical points suggest symmetrical parts on the object; 4) parallel parts in the sketch are also parallel on the object; and 5) lines meeting in the sketch meet in the same way on the object. It is these five properties that determine where in the object the geons are found. A study carried out by Biederman, Ju and Clapper in 1985 (as cited in Eysenck and Keane, 2001) found that objects could be recognised even when most of their geons were missing: line drawings with missing geons were presented to subjects, and in 90% of cases subjects were still able to recognise the object for what it was. Biederman's ideas were tested by a variety of studies, but they attracted criticism too. Firstly, although Biederman was able to show how one class of object is distinguished from another, he was unable to show how two objects from the same class are distinguished. He also did not take into account context, as it could be that we use our environment to aid our judgement of what an object is.
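The recognition-by-components idea can be caricatured in a few lines of code. In this toy, each stored object is just a set of part labels standing in for geons, and an input is matched to the stored object whose parts it overlaps most; the catalogue and part names are invented, and real geon matching also encodes how the parts are arranged.

```python
# Toy illustration of Biederman's recognition-by-components: match an
# input's parts (stand-ins for geons) against a stored catalogue.
# All object and part names here are invented for the example.

CATALOGUE = {
    "mug":  {"curved cylinder", "handle arc"},
    "lamp": {"cone shade", "straight stem", "flat base"},
    "case": {"brick", "handle arc"},
}

def recognise(parts_seen):
    """Return the stored object sharing the most parts with the input."""
    return max(CATALOGUE, key=lambda name: len(CATALOGUE[name] & parts_seen))

# Even with a geon missing, the degraded input still matches
# (cf. Biederman, Ju and Clapper's partial line drawings).
print(recognise({"cone shade", "straight stem"}))  # prints "lamp" despite the missing base
```

The sketch also mirrors the first criticism above: two same-class objects built from near-identical geon sets (two different mugs, say) would be indistinguishable to this kind of matcher.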
Looking at object recognition leads on to the question of whether faces are recognised in the same manner as objects. Faces all have the same features, yet each is unique, individualising it from all others. There have been suggestions that faces are recognised differently from objects, first indicated through brain-damaged patients who suffer a type of agnosia called prosopagnosia (Gross, 2001). This prevents patients from recognising faces even though their ability to recognise objects remains relatively intact, supporting the idea that there is a distinct neural structure used purely for the discrimination and recognition of faces. Farah (1994, as cited in Eysenck and Keane, 2001) carried out a study with a patient suffering from prosopagnosia, called LH, testing LH against a series of normal controls on their ability to recognise pairs of faces and pairs of spectacles. LH performed just as well as the controls in recognising the spectacles but performed poorly when recognising the faces.
Faces, like objects, have been studied and theorists have created models of how they are recognised. Bruce and Young (1986, as cited in Eysenck and Keane, 2001) put forward an eight-component model of what is needed to recognise familiar faces and to process unfamiliar ones. According to the model, a familiar face is recognised through structural encoding, face recognition units, person identity nodes and name generation. Structural encoding involves producing a number of descriptions and representations of faces, which gives the initial sense that what is being looked at is indeed a face. The face recognition units hold the structural information of familiar faces, which allows a face to be recognised as known or seen before. The person identity nodes then provide information about the individual, for example where they live, their occupation and so on, allowing a mental picture to be built up of the identity behind the face. Finally, the name is generated and associated with the face; however, names are stored separately from other information, which is said to be why people sometimes recognise someone but are unable to put a 'name to the face'. In the case of an unfamiliar face, processing involves structural encoding, expression analysis, facial speech analysis and directed visual processing. The structural encoding is the same process that takes place for a familiar face, as this is a general process that informs the perceiver what they are looking at. Expression analysis takes a person's facial features and uses them to extract their emotional state, informing the perceiver of the person's mood. Facial speech analysis is where the observer looks at the person's lip movements in order to identify speech sounds and patterns.
Finally, directed visual processing allows the perceiver to process specific facial information individually, leading to an overall processed vision of the face. Bruce and Young's model therefore concludes that familiar and unfamiliar faces are processed differently from one another. Experimental evidence supporting this model includes the work of Young, Hay and Ellis (1985, as cited in Gross, 2001), who asked participants to keep a diary of problems they encountered with face recognition. Out of a total of 1008 incidents, participants never reported putting a name to a face without knowing other information about the person, whereas on 190 occasions they could remember information about a person but not the person's name.
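The familiar-face route in Bruce and Young's model (structural encoding, then face recognition unit, then person identity node, then name generation) can be sketched as a chain of lookups. The stored codes, person details and the empty name store below are invented; the point of the sketch is that because names live in a separate store, a face can be recognised and biographical details retrieved while the name lookup still fails, matching the diary findings above.

```python
# Hedged sketch of Bruce and Young's familiar-face route as lookups.
# All stored data are invented for illustration.

FACE_RECOGNITION_UNITS = {"face_042": "person_7"}   # known structural codes
PERSON_IDENTITY_NODES = {"person_7": "teacher, lives locally"}
NAME_STORE = {}                                     # name not currently retrievable

def identify(structural_code):
    """Walk the familiar-face route; later stages can fail independently."""
    person = FACE_RECOGNITION_UNITS.get(structural_code)
    if person is None:
        return "unfamiliar face"
    details = PERSON_IDENTITY_NODES[person]
    name = NAME_STORE.get(person, "name unavailable")
    return f"{details}; {name}"

print(identify("face_042"))  # details retrieved, but no name
```

Because the model never reaches the name store without first passing through the identity nodes, this architecture also predicts what the diary study found: a name is never retrieved without other information about the person, while the reverse failure is common.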
Much research has looked at whether a face is processed as a whole structure or as parts that are processed individually. Evidence supporting configural processing includes the work of Farah et al. (1998, as cited in Eysenck and Keane, 2001), who presented participants with a face, followed by a mask, and then a second face. Participants had to decide whether the second face was the same as the first. The key to this study was the mask and whether it was a whole face or a scrambled face. Farah predicted that if faces are processed as wholes then participants would do better when shown the mask with scrambled parts, and this proved to be right: participants were more distracted when a whole face was shown to them. Another study, by Tanaka and Farah (1993, as cited in Page, 2003), found that participants were better at recognising which feature belonged where when facial features were shown in the context of a face rather than in a scrambled one, again suggesting that a face is processed as a whole image. Contradicting studies, which refer back to Biederman's idea that objects are made up from components, examine the idea that faces are processed part by part. Bradshaw and Wallace (1971, as cited in Gross, 2001) tested whether participants could tell that a second face differed from a first depending on how many of its features were changed. Each participant was shown a line drawing of a face and then, a little later, a second face. The more differences there were between the two faces, the quicker the participants responded, which suggests that parts of the face are processed individually.
Research has also suggested that upright and inverted faces are processed differently. Thompson's Thatcher illusion shows that visual processing can indeed be deceived: a picture with inverted eyes and mouth looks fairly normal when viewed upside down, but when turned upright it shows a completely different picture. This indicates that upright faces are processed more as a whole, whereas inverted faces are processed by components.
Fig 3
http://www.essex.ac.uk/psychology/visual/thatcher.html
Overall, most of the evidence concludes that faces are processed as wholes, which is why faces are said to be 'special'. This differs from how objects are processed, which has been suggested to occur in a series of stages based upon component theories. Prosopagnosia patients indicate that faces are processed in a different part of the brain and so have their own unique processing system, which is why such patients can recognise objects but not faces. Gregory's idea that we perceive things based upon environmental cues led on to further research on object recognition by Marr and Biederman, which outlined the use of light patterns to derive primitives and edges. Tanaka and Farah based their research on the idea of context when looking at faces, and found that context plays an important role when identifying the face as a whole visual image. The model proposed by Bruce and Young indicates that although the face and its features are processed in a configural manner, aspects such as emotion and information like the person's name and occupation are processed separately. Faces can also be deceiving, as shown by the Thatcher illusion: how a face is viewed depends on the way it is processed, which may lead to misinterpretation of how the face is perceived and judged by other people. In conclusion, faces are special, which is why it takes a complex set of ideas and theories to explain how they are processed and seen by each individual, compared with how we see everyday objects and surroundings.
Bibliography
References:
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147.
Biederman, I., Ju, G. & Clapper, J. (1985). The perception of partial objects. Unpublished manuscript, State University of New York at Buffalo.
Bradshaw, J. L. & Wallace, G. (1971). Models for the processing and identification of faces. Perception and Psychophysics, 9, 443-448.
Bruce, V. & Young, A. W. (1986). Understanding face recognition. British Journal of Psychology, 77, 305-327.
Farah, M. J. (1990). Visual Agnosia: Disorders of object recognition and what they tell us about normal vision. Cambridge, MA: MIT Press.
Gregory, R. L. (1980). Perceptions as hypotheses. Philosophical Transactions of the Royal Society of London.
Gibson, J. J. (1950). The Perception of the Visual World. Boston: Houghton Mifflin.
Hubel, D. H. & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. Journal of Physiology, 160, 106-154.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco, CA: W. H. Freeman.
Marr, D. & Nishihara, H. K. (1978). Representation and recognition of the spatial organisation of three-dimensional shapes. Philosophical Transactions of the Royal Society, Series B, 269-294.
Young, A. W., Hay, D. C. & Ellis, A. W. (1985). The faces that launched a thousand slips: Everyday difficulties and errors in recognising people. British Journal of Psychology, 76, 495-523.
Internet Sources:
Fig 3: http://www.essex.ac.uk/psychology/visual/thatcher.html
Directly consulted sources:
Eysenck, M. W. and Keane, M. T. (2001). Cognitive Psychology: A Student's Handbook. Psychology Press.
Biederman (1987)
Marr (1982)
Farah(1998)
Marr and Nishihara (1978)
Gross, R. (2001). The Science of Mind and Behaviour (4th ed.). Hodder and Stoughton.
Bradshaw and Wallace (1971)
Farah (1994)
Gibson (1966)
Gregory (1966)
Young, Hay and Ellis (1985)
Levine, M. W. (2000). Fundamentals of Sensation and Perception (3rd ed.). Oxford.
Gibson (1966)
Page, M. (2003). Lecture Notes. University of Hertfordshire.
Biederman (1987)
Marr and Nishihara (1978)
Tanaka and Farah (1993)
Reisberg, D. (2001). Cognition: Exploring the Science of the Mind (2nd ed.). W. W. Norton and Company.
Biederman (1987)
Sekuler, R. and Blake, R. (1994). Perception (3rd ed.). McGraw-Hill.
Gibson (1963)
Hubel and Wiesel (1962)