However effective these cues can be, they are not without their limitations. If the frequency is less than 1500 Hz, IIDs become unreliable and ambiguous, leaving ITD as the main source of information. Additionally, if the wavelength is short relative to the size of the head, which is the case at frequencies above around 750 Hz, ITDs too become ambiguous, as it is difficult to ascertain how many periods of the wave have passed. Because of this, it has been thought that ITDs are used to locate sounds of low frequency and IIDs to locate sounds of high frequency, which is the basis of the ‘duplex theory’ (Rayleigh 1907). Although the evidence mentioned above does seem to support this idea, the theory fails to explain, for example, how information from a complex sound containing a range of frequencies is assimilated (Searle et al. 1976).
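The two frequency figures quoted above can be roughly recovered from the relationship between wavelength and head size. The sketch below is only illustrative: the speed of sound (343 m/s) and the effective ear-to-ear path (0.23 m) are assumed values chosen for the example, not measurements from the studies cited, and the function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at about 20 degrees C
EAR_TO_EAR_PATH_M = 0.23    # assumed effective path between the two ears


def wavelength_m(freq_hz: float) -> float:
    """Wavelength in metres of a tone of the given frequency."""
    return SPEED_OF_SOUND_M_S / freq_hz


def full_wavelength_limit_hz() -> float:
    """Frequency whose wavelength equals the ear-to-ear path.

    Below this frequency the wave is long compared with the head, so the
    head casts little acoustic shadow and intensity differences are small.
    """
    return SPEED_OF_SOUND_M_S / EAR_TO_EAR_PATH_M


def half_wavelength_limit_hz() -> float:
    """Frequency whose half wavelength equals the ear-to-ear path.

    Above this frequency the interaural delay can exceed half a period,
    so a phase difference no longer identifies the delay uniquely.
    """
    return SPEED_OF_SOUND_M_S / (2.0 * EAR_TO_EAR_PATH_M)


if __name__ == "__main__":
    print(f"IID limit: about {full_wavelength_limit_hz():.0f} Hz")
    print(f"ITD limit: about {half_wavelength_limit_hz():.0f} Hz")
```

With these assumed values the two limits fall close to 1500 Hz and 750 Hz; a different head size would shift both figures proportionally.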
The ‘cone of confusion’ (see Fig. 1) also casts doubt on whether ITDs and IIDs are enough to enable localization in isolation. The idea is that sounds coming from different sources within a specific area will have the same ITDs and IIDs, and therefore other cues must be used to localize the sound more specifically. The ellipses in the figure represent the areas from which different sounds can emanate and yet seem as though they are coming from the same source.
Fig. 1: The cones of confusion (adapted from Wenzel and Begault, 2000).
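The cone of confusion can be illustrated numerically. The sketch below assumes a rigid spherical head and a distant source, and uses Woodworth's spherical-head approximation for the ITD, which depends only on the angle between the source and the median plane. The head radius (0.0875 m) and the speed of sound are assumed values, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed
HEAD_RADIUS_M = 0.0875      # assumed radius of a spherical head


def lateral_angle_rad(azimuth_deg: float, elevation_deg: float) -> float:
    """Angle between the source direction and the median plane.

    Azimuth is measured from straight ahead (0) towards the right ear (90);
    elevation is measured upwards from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Clamp to [-1, 1] to guard asin against floating-point rounding.
    component = max(-1.0, min(1.0, math.sin(az) * math.cos(el)))
    return math.asin(component)


def itd_s(azimuth_deg: float, elevation_deg: float) -> float:
    """ITD in seconds from Woodworth's formula: (r / c) * (angle + sin(angle))."""
    lat = lateral_angle_rad(azimuth_deg, elevation_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (lat + math.sin(lat))


if __name__ == "__main__":
    # Three different directions that lie on the same cone of confusion.
    for az, el in [(30, 0), (150, 0), (90, 60)]:
        print(f"azimuth {az:3d}, elevation {el:2d}: {itd_s(az, el) * 1e6:.1f} us")
```

A source 30 degrees to the front right, one 30 degrees to the rear right, and one raised 60 degrees directly to the side all share the same lateral angle, so under this model they produce the same ITD.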
Therefore, it can be said that head movements also work as a binaural cue, as they enable further information about the source of the sound to be obtained. Hirsh (1971) concluded that head movements do indeed improve the participant’s ability to localize a sound and therefore contribute greatly to this process. For example, if the head is moved from side to side and there is no change in the sound, it can be deduced that the sound is coming from directly above or below the head. It should be noted, however, that the cone of confusion is based on the assumption that the head is a disembodied, perfectly spherical, solid object, and so it disregards the effect of diffraction off, for example, the shoulders or body of the person, which have also been shown to play a part in sound localization.
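The way a head turn resolves this ambiguity can be sketched with a deliberately simple model in which the ears are two points and the ITD is the path difference divided by the speed of sound. The ear separation (0.18 m), the speed of sound and the function names are assumptions made for the example only.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed
EAR_SEPARATION_M = 0.18     # assumed straight-line distance between the ears


def itd_s(azimuth_deg: float, elevation_deg: float, head_yaw_deg: float = 0.0) -> float:
    """ITD of a distant source after the head has turned by head_yaw_deg.

    Azimuth 0 is straight ahead and positive angles are to the right;
    a positive yaw means the head has turned to the right.
    """
    relative_az = math.radians(azimuth_deg - head_yaw_deg)
    el = math.radians(elevation_deg)
    return EAR_SEPARATION_M * math.sin(relative_az) * math.cos(el) / SPEED_OF_SOUND_M_S


def itd_change_s(azimuth_deg: float, elevation_deg: float, head_yaw_deg: float) -> float:
    """Change in ITD produced by turning the head from rest."""
    return itd_s(azimuth_deg, elevation_deg, head_yaw_deg) - itd_s(azimuth_deg, elevation_deg)


if __name__ == "__main__":
    for label, az, el in [("front right", 30, 0), ("rear right", 150, 0), ("overhead", 0, 90)]:
        change_us = itd_change_s(az, el, 10.0) * 1e6
        print(f"{label:11s}: ITD change after a 10 degree turn = {change_us:+.1f} us")
```

In this model the front and rear sources start with the same ITD, but a turn to the right decreases it for the front source and increases it for the rear source, while the ITD of a source directly overhead does not change at all.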
One phenomenon related to binaural localization is that of binaural beats. These occur when two sounds of differing frequencies are presented simultaneously, one to each ear, with the result that the sound appears to fluctuate in loudness and roughness at a rate dependent on the difference in frequency between the two sounds (Licklider, Webster & Hedlun 1950). Licklider et al (1950) took this to imply that phase information about the sound is preserved within the auditory nerve, as without it the fluctuations could not occur at a rate equal to the exact frequency difference. Additionally, it was found that when this frequency difference is steadily increased, the perceived location of the sounds alters and the listener gradually comes to report hearing two distinct sounds as opposed to one that is warbling, indicating that there is a threshold beyond which the phenomenon no longer occurs (approximated at around 400 c.p.s.). Binaural beats occur mainly with sounds of low frequency; they are most distinct between 300 and 600 Hz and will usually cease to occur above 1000 Hz, although this may differ slightly. This upper limit also appears to be related to hormones: it is higher for men than for women, although the gap lessens at certain stages of the menstrual cycle, as nerve transmission is affected by this.
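The link between the beat rate and the frequency difference can be shown with a short sketch. It builds one tone for each ear and tracks the interaural phase difference, which completes one full cycle every 1/(frequency difference) seconds. The tone frequencies, the sample rate and the function names are illustrative assumptions.

```python
import math


def beat_rate_hz(f_left_hz: float, f_right_hz: float) -> float:
    """Rate of the binaural beat: the difference between the two tones."""
    return abs(f_left_hz - f_right_hz)


def interaural_phase_cycles(f_left_hz: float, f_right_hz: float, t_s: float) -> float:
    """Phase lead of the right-ear tone at time t, in cycles, wrapped to [0, 1)."""
    return ((f_right_hz - f_left_hz) * t_s) % 1.0


def dichotic_tones(f_left_hz: float, f_right_hz: float,
                   duration_s: float, sample_rate_hz: int = 8000):
    """Return a list of (left, right) samples with one pure tone per ear."""
    n_samples = round(duration_s * sample_rate_hz)
    return [
        (math.sin(2 * math.pi * f_left_hz * i / sample_rate_hz),
         math.sin(2 * math.pi * f_right_hz * i / sample_rate_hz))
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    left_hz, right_hz = 500.0, 504.0  # hypothetical tones within the 300-600 Hz range
    print("beat rate:", beat_rate_hz(left_hz, right_hz), "Hz")
    for t in (0.0, 0.125, 0.25):
        print(f"t = {t:.3f} s, interaural phase = "
              f"{interaural_phase_cycles(left_hz, right_hz, t)} cycles")
```

Because neither ear receives both tones, no physical beat exists in either signal; the fluctuation can only arise where the two ears' phase information is compared.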
Another phenomenon is binaural adaptation. The studies conducted by Hafter et al in 1983 and 1988 are the main authorities in this area. They involved lateralisation tests using trains of high-frequency clicks (around 4 kHz) to investigate thresholds for the identification of IIDs and ITDs. When the interval between the clicks was 10 ms, the thresholds decreased as the number of clicks in the train increased. In terms of the framework of Green and Swets (1974), these results imply that each of the clicks in the train contained the same amount of information. When the interval between the clicks was reduced to 1 ms, however, the threshold did not decrease nearly as much as in the previous trials, which has been interpreted to mean that the information provided is weighted much more heavily on the first click than on subsequent ones. This means that in the case of trains with a high click rate, only the first click is processed to find the location of the sound source, with the later clicks being of little further use for this. When the click rate is increased, the rate of adaptation increases as well. However, when a click is missed out, or a short, low-intensity noise is inserted into the click train, binaural adaptation is interrupted (Hafter et al 1988). Hafter described this as an ‘active release’: rather than simply halting binaural adaptation, it indicates that a change has taken place and so encourages the sound to be localized again.
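The reasoning about equal and unequal information per click can be made concrete with a simple power-law sketch. If every click contributes an equal, independent amount of information, the threshold would be expected to fall with the square root of the number of clicks. The exponent k below is a hypothetical parameter introduced for illustration (k = 1 for equal information, k near 0 when only the first click counts); the numbers are invented and are not taken from Hafter et al.

```python
def predicted_threshold(n_clicks: int, single_click_threshold: float, k: float = 1.0) -> float:
    """Threshold for a train of n_clicks under an assumed power-law model.

    threshold = single_click_threshold * n_clicks ** (-k / 2)
    k = 1.0 : every click carries the same information (square-root rule)
    k = 0.0 : clicks after the first add nothing, so no improvement
    """
    if n_clicks < 1:
        raise ValueError("n_clicks must be at least 1")
    return single_click_threshold * n_clicks ** (-0.5 * k)


if __name__ == "__main__":
    single = 100.0  # hypothetical single-click threshold, arbitrary units
    for k in (1.0, 0.5, 0.0):
        row = [round(predicted_threshold(n, single, k), 1) for n in (1, 4, 16)]
        print(f"k = {k}: thresholds for 1, 4 and 16 clicks = {row}")
```

Under this sketch, the 10 ms condition described above would correspond to k close to 1 and the 1 ms condition to a much smaller k.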
The binaural masking level difference is measured by presenting a pure tone within white noise to each ear. The tone is first set to its masking threshold (the level at which it can no longer be heard above the white noise); the tone in one ear is then inverted, which allows it to be distinguished from the noise again (Hirsh, 1948a). The level of the inverted tone is then adjusted until it too is at the masking threshold, and the difference between the two thresholds is called the binaural masking level difference. This phenomenon is also dependent on the frequency of the sound: the binaural masking level difference can be 15 dB for sounds around 500 Hz, the frequency at which the effect is strongest (Jiang, McAlpine & Palmer, 1997), but only about 2 dB if the frequency is above 1500 Hz. This indicates that if the pure tone is inverted it becomes much easier to detect. Additionally, if the tone is played to one ear within the white noise at the masking threshold and just the white noise is played to the other ear, the participant is able to distinguish the tone, despite not having been able to when the tone and noise were presented to one ear only. Once the pure tone is added to the white noise in the second ear, it is once again masked. The tone also remains undetectable if the white noise played to each ear is different, so the noise in both ears must come from the same noise generator for the effect to occur. Investigations have been conducted into the presence of masking level differences for sounds other than pure tones, such as speech or complex sounds, and it has been found that they occur for these also, with detection of the masked sound being improved if its interaural phase or level difference differs from that of the masking sound.
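The two listening conditions described above can be sketched as follows. The same noise is sent to both ears, and the tone is either identical in both ears or inverted in one. Subtracting one ear's signal from the other is used here purely as an illustrative device to show that the inverted tone carries an interaural difference that the noise lacks; it is not presented as the mechanism the auditory system uses. All parameter values and function names are assumptions.

```python
import math
import random


def make_stimulus(invert_tone_in_right_ear: bool, n_samples: int = 800,
                  sample_rate_hz: int = 8000, tone_hz: float = 500.0,
                  tone_amplitude: float = 0.1, seed: int = 0):
    """Return (left, right) sample lists: identical noise plus a tone in each ear."""
    rng = random.Random(seed)
    left, right = [], []
    for i in range(n_samples):
        noise = rng.gauss(0.0, 1.0)  # same noise sample goes to both ears
        tone = tone_amplitude * math.sin(2 * math.pi * tone_hz * i / sample_rate_hz)
        left.append(noise + tone)
        right.append(noise - tone if invert_tone_in_right_ear else noise + tone)
    return left, right


def interaural_difference_rms(left, right) -> float:
    """RMS of (left - right): zero when both ears receive identical signals."""
    diffs = [l - r for l, r in zip(left, right)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def masking_level_difference_db(threshold_same_phase_db: float,
                                threshold_inverted_db: float) -> float:
    """Difference between the two masked thresholds, in dB."""
    return threshold_same_phase_db - threshold_inverted_db


if __name__ == "__main__":
    for inverted in (False, True):
        left, right = make_stimulus(inverted)
        print(f"tone inverted: {inverted}, "
              f"interaural difference RMS = {interaural_difference_rms(left, right):.4f}")
    # Hypothetical thresholds chosen to give the 15 dB figure quoted for 500 Hz.
    print("difference:", masking_level_difference_db(60.0, 45.0), "dB")
```

When the tone is identical in both ears the two signals match exactly and the difference is zero, whereas the inverted tone survives the subtraction while the shared noise cancels. If each ear were given independent noise, the noise would no longer cancel, which is consistent with the requirement for a common noise generator.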
Binaural masking level differences are thought to occur because the interaural phase of low-frequency sounds is processed by the brain in order to indicate the horizontal position of the source of the sound (Goldberg and Brown, 1969). When this interaural phase is altered, for example to enable the tone to be distinguished from the white noise, it is as though the noise is being moved away from directly in front of the participant (Jiang et al, 1997). Jiang et al (1997) also suggest that this phenomenon demonstrates the ability to distinguish sounds by their location in order to make them more audible even when heard alongside other sounds.
In conclusion, sound localization and lateralization appear to be quite dependent on interaural time and intensity differences, although these can become unreliable at particular frequencies or as a result of the cone of confusion. In these circumstances, other methods can be employed, for example, moving the head. Binaural beats refers to the phenomenon of two sounds of differing frequencies appearing to warble when presented to the participant with one in each ear. This is dependent on frequency in the sense that the frequency difference between the two sounds cannot be too great, or the effect ceases and two separate tones are heard instead. The same applies to binaural adaptation, as the clicks used to investigate it must be presented at a high rate for the adaptation to be apparent. The phenomenon of binaural masking level differences also depends on frequency, which dictates the extent of the difference. Frequency is therefore imperative in many aspects of binaural hearing.
References
Begault, D.R. (1994). 3-D sound for virtual reality and multimedia. New York: AP Professional.
Goldstein, E. (2002). Sensation and perception (Rev. ed.). Pacific Grove, CA: Wadsworth-Thomson Learning.
Hall, J. W., Grose, J. H. and Hartmann, W. H. (1998). The masking-level difference in low-noise noise. The Journal of the Acoustical Society of America, 103 (5), 2573-2577.
Handel, S. (1993). Listening: An introduction to the perception of auditory events. Massachusetts: Maple-Vail.
Jiang, D., McAlpine, D. and Palmer, A. R. (1997). Detectability index measures of binaural masking level difference across populations of inferior colliculus neurons. The Journal of Neuroscience, 17 (23).
Licklider, J. C. R., Webster, J. C. and Hedlun, J. M. (1950). On the frequency limits of binaural beats. The Journal of the Acoustical Society of America, 22, 468-473.
Sekuler, R. and Blake, R. (2006). Perception (5th ed.). New York: McGraw Hill.
Wenzel, E. M. and Begault, D. R. (2000). The role of dynamic information in virtual acoustic displays. Retrieved 29th April 2006 from http://human-factors.arc.nasa.gov/ihh/spatial/research/Wenzel_dynamic_information.html