Segregation is the ability to tell sounds apart. The auditory system accomplishes this through auditory scene analysis, using cues such as pitch, envelope and location.
There has been a paradigm shift between past and present-day research. Older studies of auditory perception were concerned with tones that are not generally heard in the real world, so more modern studies have tried to be more realistic. Instead of focussing on pitch and frequency, they focus on sounds and words (Gaver, 1988).
When we hear a sound, we usually use everyday listening to pay attention to the source of the sound. For example, from all the auditory stimuli available, we may hear an event and decide that it is a car, approaching us quickly and driving through a large puddle. The alternative is musical listening, in which we listen to the texture of a sound, how smooth it may or may not be, and whether it is masking other sounds. Musical listening is more about the nature of the sound itself than about determining a source.
In the forest, I am using everyday listening, as I am attempting to locate and identify the source of each sound I hear. A given sound provides information about an interaction of materials at a specific location within the environment. It carries that information from the source of the event to my auditory system, enabling me to pinpoint the event’s location and, in turn, understand my environment (Gaver, 1988). A frequently asked question is ‘if a tree falls in the forest and no one is there to hear it, does it make a sound?’ The answer to this question is not relevant, but the interpretation is. Sound can refer to a physical stimulus or to a perceptual response. If sound is taken to be physical, the answer is yes: a physical sound consists of pressure changes in the air or environment, and these pressure changes occur whether or not someone is there to hear them (Goldstein, 2002). However, if sound is taken to be a perceptual response, the answer is no: sound is then the experience we have when we hear, and if there is no person, there is no experience to be had (Goldstein, 2002).
‘I can hear two foxes, or maybe it’s one...’
Auditory illusions occur when expectations are created but not followed through. We expect sounds with similar pitches to come from the same source, so two different tunes can sound different again when played together. The scale illusion was discovered by Deutsch in 1973. It involves a major scale of tones, with successive tones alternating from ear to ear. The scale is played simultaneously in both ascending and descending form (Deutsch, 1999). When a tone from the ascending scale is in the right ear, a tone from the descending scale is in the left ear, and vice versa. The tones are equal-amplitude sine waves. Most people experience an illusion when listening to this: a melody corresponding to the higher tones appears to come from one earphone, and a melody corresponding to the lower tones appears to come from the other. When the earphone positions are reversed, the ear that had heard the higher tones continues to hear the higher tones, and the ear that had heard the lower tones continues to hear the lower tones (Deutsch, 1999). The scale illusion shows how top-down and bottom-up processing, discussed below, can interact with perception. In Deutsch’s example, the participants are seen to group sounds according to pitch rather than according to the ear at which each tone actually arrives.
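The stimulus pattern described above can be laid out in a few lines of Python. This is only a minimal sketch: the tones are written as semitone offsets from the tonic, the function names are hypothetical, and the choice of which ear receives the first ascending tone is an assumption made for illustration. The last step models the reported percept only as a simple regrouping of each tone pair into a higher and a lower tone.

```python
# Major-scale degrees as semitone offsets from the tonic (one octave).
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11, 12]


def scale_illusion_pattern(scale=MAJOR_SCALE):
    """Return the tone sequences delivered to the right and left ears.

    The ascending and descending scales are played at the same time,
    and successive tones of each scale alternate from ear to ear.
    Assumption: the first ascending tone goes to the right ear.
    """
    ascending = list(scale)
    descending = list(reversed(scale))
    right, left = [], []
    for i, (up, down) in enumerate(zip(ascending, descending)):
        if i % 2 == 0:
            right.append(up)
            left.append(down)
        else:
            right.append(down)
            left.append(up)
    return right, left


def grouped_by_pitch(right, left):
    """Regroup each simultaneous tone pair into higher and lower tones."""
    higher = [max(r, l) for r, l in zip(right, left)]
    lower = [min(r, l) for r, l in zip(right, left)]
    return higher, lower


right, left = scale_illusion_pattern()
higher, lower = grouped_by_pitch(right, left)
print("right ear:", right)
print("left ear: ", left)
print("higher melody:", higher)
print("lower melody: ", lower)
```

In this sketch each ear receives a jagged sequence that jumps between high and low tones, whereas the regrouped sequences are two smooth melodies, which is the contrast the illusion rests on.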
‘I can hear some sounds that I can’t identify....’
Processing sound information must begin with the information being received by the receptors. This is called bottom-up processing. Sometimes, however, processing is shaped first by the knowledge the person brings to the perceived stimulus (the sound); this is known as top-down processing. Perception, when working effectively, often involves both processes working together (Goldstein, 2002). To identify these unknown sounds, I use both processes. Top-down processing allows me to apply past knowledge to the sound. Have I heard it before? For example, it may be the sound of a bird. I have heard one breed of bird before, and this is similar, so top-down processing suggests it may be a bird. Bottom-up processing focuses on the sounds I hear and their pitch, intensity and timbre. Using this alongside top-down processing, I am able to ask whether this sound matches my experience of the sound a bird makes. From this, I can conclude that yes, it does.
In 1993, Ballas said that the time taken to identify a sound depends on the number of alternative causes for the sound (Goldstein, 2002). For example, a clicking sound could be made by a pen, a light switch, a keyboard, etc. This parallels the Hick-Hyman law, which states that choice reaction time increases with the logarithm of the number of alternatives. Ballas presented participants with computerised sound samples and asked them to press the spacebar when they identified the sound. He found that causal uncertainty correlated with identification time, meaning that the more uncertain the participant was of the sound’s source, the longer they took to identify it (Goldstein, 2002). The study also concluded that the quicker we react to a sound, the more accurate we are. Ballas then went on to look at sounds that are frequent within the environment. He wanted to see whether we recognise more effectively the sounds that we hear more frequently. A timer went off at intervals during the day, and participants had to note the time, the date, the first sound they heard, the action involved and the context of the sound. Ballas found his hypothesis to be correct: the more ecologically frequent the sounds were, the more readily they were identified (Gaver, 1983).
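The Hick-Hyman law mentioned above is commonly written as RT = a + b log2(n), where n is the number of equally likely alternatives. The short Python sketch below shows the shape of that relationship only. The function name and the constants a and b are hypothetical values chosen for illustration; they are not taken from Ballas or from any fitted data.

```python
import math


def hick_hyman_time(n_alternatives, a=0.2, b=0.15):
    """Predicted response time in seconds for n equally likely alternatives.

    a (base time) and b (seconds per bit of uncertainty) are made-up
    illustrative constants, not empirical estimates.
    """
    if n_alternatives < 1:
        raise ValueError("need at least one alternative")
    return a + b * math.log2(n_alternatives)


# A click that could come from 1, 2, 4 or 8 possible sources.
for n in (1, 2, 4, 8):
    print(n, "alternatives:", round(hick_hyman_time(n), 2), "s")
```

Each doubling of the number of possible causes adds the same fixed amount of time, so the predicted time to identify a sound grows with uncertainty but ever more slowly.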
‘I decide to move in a particular direction because I can hear cars, and when I move in that direction they get louder’
Everyday listening was introduced by Gaver in 1993. He took an ecological approach to everyday listening, to iron out some of the constraints he saw in the traditional approaches (Gaver, 1993). He said that we pay attention to what is making a sound. A car makes a sound. We recognise this by interpreting the various events going on within it, such as those involving the engine, the petrol and the tyres. Gaver reminds us, however, that the environment can change these sounds by absorbing them, making shrill noises seem quieter. The brain is then given the task of identifying whether the noise comes from the sound-producing object or from the environment. The car and the environment have different intensities and spectra, and the auditory array allows these to be distinguished subconsciously. The auditory array refers to the ability to determine differences in intensity and spectrum among sounds converging on the ear from every direction, and to make sense of these to form an informative structure. Gaver went on to design a framework to hold every possible sound. His model consisted of levels. At the most general level, a sound indicates that something has happened. At the basic level, the sound-producing event falls into one of three categories: vibrating solids, aerodynamic events (vibrating gases) and liquid sounds (Gaver, 1988). Examples would be a scraping sound, an explosion and a drip, respectively. Above this basic level are temporally patterned events, such as walking; compound events, which involve more than one basic-level event, such as writing, which involves a complex series of impacts and scrapes; and hybrid sources, sound events that incorporate all three categories, such as a boat on water. However applicable this model may be, it is not without its limitations.
‘Suddenly, I realise the car sound is one I know well – I have been rescued by the Police!’
The Police siren is a universally recognised sound, deliberately constructed to be easily heard. Most people can identify an alarm sound without ever having heard it before, because of certain characteristics that most alarm sounds share (Hellier et al., 1993). Hellier et al. (1993) looked into alarm tones, attempting to find general characteristics by comparing alarm sounds from different sources. Until recently, warnings were given using high-pitched, shrill sounds from whistles, bells and horns. More recently, however, speech alarms have been implemented. It is not yet known which type, if either, is more reliable; however, it can be seen from the discussion above that our auditory system is more than capable of working out what it hears, with or without a little help from our other senses.
Works Cited
Deutsch, D. (1999). The Psychology of Music. Chicago, USA: Academic Press.
Gaver, W. (1986). Auditory icons: Using sound in computer interfaces. CA, USA: Buxton USA Publishers.
Gaver, W. (1988). Everyday listening and auditory icons. University of California, San Diego: Unpublished Doctoral Dissertation.
Goldstein, E. B. (2002). Sensation and Perception. CA, USA: Wadsworth-Thompson Publishing.
Hellier, E. J., Edworthy, J., & Dennis, I. D. (1993). Improving auditory warning design: Quantifying and predicting the effects of different warning parameters on perceived urgency. Human Factors, 35(4), 693-706.
Litovsky, R. Y. (2011). A cocktail party model of spatial release from masking by both noise and speech interferers. Journal of the Acoustical Society of America, 130(3), 1463.