Head-related transfer function
A head-related transfer function (HRTF) is a response that characterizes how an ear receives a sound from a point in space; a pair of HRTFs for two ears can be used to synthesize a binaural sound that seems to come from a particular point in space. Some consumer home entertainment products designed to reproduce surround sound from stereo (two-speaker) headphones use HRTFs. Some forms of HRTF processing have also been included in computer software to simulate surround sound playback from loudspeakers.

Humans have just two ears, but can locate sounds in three dimensions: in range (distance), in direction above and below, in front and to the rear, as well as to either side. This is possible because the brain, inner ear and the external ears (pinnae) work together to make inferences about location. This ability to localize sound sources may have developed in humans as an evolutionary necessity, since the eyes can only see a fraction of the world around a viewer, and vision is hampered in darkness, while the ability to localize a sound source works in all directions, to varying accuracy, and even in the dark.

Humans estimate the location of a source by taking cues derived from one ear (monaural cues), and by comparing cues received at both ears (difference cues or binaural cues). Among the difference cues are time differences of arrival and intensity differences. The monaural cues come from the interaction between the sound source and the human anatomy, in which the original source sound is modified before it enters the ear canal for processing by the auditory system. These modifications encode the source location, and may be captured via an impulse response which relates the source location and the ear location. This impulse response is termed the head-related impulse response (HRIR). Convolution of an arbitrary source sound with the HRIR converts the sound to that which would have been heard by the listener if it had been played at the source location, with the listener's ear at the receiver location. HRIRs have been used to produce virtual surround sound.

The HRTF is the Fourier transform of the HRIR. The HRTF is also sometimes known as the anatomical transfer function (ATF).

HRTFs for the left and right ears (expressed in the time domain as HRIRs) describe the filtering of a sound source x(t) before it is perceived at the left and right ears as xL(t) and xR(t), respectively.
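The filtering of x(t) into xL(t) and xR(t) can be sketched in a few lines of Python with NumPy. This is only an illustration: the function name `binaural_synthesis` is hypothetical, and the two HRIRs are toy placeholders (a pure delay and attenuation), not measured data.

```python
import numpy as np

def binaural_synthesis(x, hrir_left, hrir_right):
    """Convolve a mono signal with left/right HRIRs to get a two-channel signal.

    Both HRIRs are assumed to have the same length so the channels can be stacked.
    """
    x_left = np.convolve(x, hrir_left)
    x_right = np.convolve(x, hrir_right)
    return np.stack([x_left, x_right])

# Toy HRIRs (assumed, not measured): the right ear receives the sound
# 3 samples later and at half the amplitude, as for a source on the left.
hrir_left = np.array([1.0, 0.0, 0.0, 0.0])
hrir_right = np.array([0.0, 0.0, 0.0, 0.5])

x = np.array([1.0, 2.0, 3.0])
out = binaural_synthesis(x, hrir_left, hrir_right)  # row 0 is xL, row 1 is xR
```

With measured HRIRs in place of the toy arrays, the two rows would be played back over the left and right headphone channels.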

The HRTF can also be described as the modifications to a sound from a direction in free air to the sound as it arrives at the eardrum. These modifications are caused by the shape of the listener's outer ear, the shape of the listener's head and body, the acoustical characteristics of the space in which the sound is played, and so on. All these characteristics will influence how (or whether) a listener can accurately tell what direction a sound is coming from.

How HRTF works

The associated mechanism varies between individuals, as their head and ear shapes differ.

An HRTF describes how a given sound wave input (parameterized as frequency and source location) is filtered by the diffraction and reflection properties of the head, pinna, and torso, before the sound reaches the transduction machinery of the eardrum and inner ear (see auditory system). Biologically, the source-location-specific prefiltering effects of these external structures aid in the neural determination of source location, particularly the determination of the source's elevation.

Technical derivation

Linear systems analysis defines the transfer function as the complex ratio between the output signal spectrum and the input signal spectrum as a function of frequency. Blauert (1974; cited in Blauert, 1981) initially defined the transfer function as the free-field transfer function (FFTF). Other terms include the free-field to eardrum transfer function and the pressure transformation from the free-field to the eardrum. Less specific descriptions include the pinna transfer function, the outer ear transfer function, the pinna response, or directional transfer function (DTF).

The transfer function H(f) of any linear time-invariant system at frequency f is:
H(f) = Output(f) / Input(f)


One method used to obtain the HRTF from a given source location is therefore to measure the head-related impulse response (HRIR), h(t), at the eardrum for the impulse δ(t) placed at the source. The HRTF H(f) is the Fourier transform of the HRIR h(t).
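As a minimal sketch of the relation between h(t) and H(f), the following Python snippet takes the discrete Fourier transform of a sampled HRIR with NumPy. The sampling rate and the toy HRIR (a direct sound plus one weaker delayed copy) are assumptions made for illustration, not measured values.

```python
import numpy as np

fs = 48000  # sampling rate in Hz (assumed)

# Toy HRIR (assumed): direct sound plus a weaker copy 4 samples later.
h = np.zeros(64)
h[0] = 1.0
h[4] = 0.5

H = np.fft.rfft(h)                       # complex HRTF samples
f = np.fft.rfftfreq(len(h), d=1.0 / fs)  # matching frequencies in Hz
magnitude_db = 20 * np.log10(np.abs(H))  # magnitude response in dB
phase_rad = np.angle(H)                  # phase response in radians

# The inverse transform recovers the impulse response.
h_back = np.fft.irfft(H, n=len(h))
```

The magnitude and phase arrays together carry the same information as the HRIR, which is why either representation can be stored in an HRTF data set.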

Even when measured for a "dummy head" of idealized geometry, HRTFs are complicated functions of frequency and the three spatial variables. For distances greater than 1 m from the head, however, the HRTF can be said to attenuate inversely with range. It is this far-field HRTF, H(f, θ, φ), that has most often been measured. At closer range, the difference in level observed between the ears can grow quite large, even in the low-frequency region within which negligible level differences are observed in the far field.

HRTFs are typically measured in an anechoic chamber to minimize the influence of early reflections and reverberation on the measured response. HRTFs are measured at small increments of θ such as 15° or 30° in the horizontal plane, with interpolation used to synthesize HRTFs for arbitrary positions of θ. Even with small increments, however, interpolation can lead to front-back confusion, and optimizing the interpolation procedure is an active area of research.
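One simple way to interpolate, shown here only as a sketch, is to blend linearly between the two measured HRIRs whose azimuths bracket the target angle. The function name `interpolate_hrir` and the toy data are hypothetical; the sketch does not wrap around at 360° and makes no claim to be the procedure used by any particular system.

```python
import numpy as np

def interpolate_hrir(azimuths_deg, hrirs, target_deg):
    """Linearly interpolate between the two measured HRIRs that bracket target_deg.

    azimuths_deg: sorted 1-D sequence of measured azimuths in degrees
    hrirs: array of shape (len(azimuths_deg), n_taps)
    """
    azimuths_deg = np.asarray(azimuths_deg, dtype=float)
    hrirs = np.asarray(hrirs, dtype=float)
    if not azimuths_deg[0] <= target_deg <= azimuths_deg[-1]:
        raise ValueError("target azimuth outside the measured range")
    hi = int(np.searchsorted(azimuths_deg, target_deg))
    if azimuths_deg[hi] == target_deg:
        return hrirs[hi].copy()  # exact measurement available
    lo = hi - 1
    w = (target_deg - azimuths_deg[lo]) / (azimuths_deg[hi] - azimuths_deg[lo])
    return (1.0 - w) * hrirs[lo] + w * hrirs[hi]

# Toy two-tap "HRIRs" measured at 0, 15 and 30 degrees (assumed values).
measured_az = [0.0, 15.0, 30.0]
measured_hrirs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
halfway = interpolate_hrir(measured_az, measured_hrirs, 7.5)
```

Sample-by-sample blending of this kind ignores differences in arrival time between the two measurements, which is one reason the interpolation step is usually refined in practice.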

In order to maximize the signal-to-noise ratio (SNR) in a measured HRTF, it is important that the impulse being generated be of high volume. In practice, however, it can be difficult to generate impulses at high volumes and, if generated, they can be damaging to human ears, so it is more common for HRTFs to be directly calculated in the frequency domain using a frequency-swept sine wave or by using maximum length sequences. User fatigue is still a problem, however, highlighting the need for the ability to interpolate based on fewer measurements.
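The frequency-domain approach can be illustrated with a simulated swept-sine measurement. In this sketch the "recording" is produced by convolving the sweep with a known toy response, so that the estimate can be checked; the sweep parameters, the response values and the small regularizer `eps` are all assumptions, and a real measurement would also have to deal with noise and loudspeaker response.

```python
import numpy as np

fs = 48000                      # sampling rate in Hz (assumed)
n = 4096                        # sweep length in samples (assumed)
t = np.arange(n) / fs

# Linear sine sweep from 0 Hz up to fs/2 over the whole excitation.
sweep_rate = (fs / 2) / (n / fs)             # Hz per second
x = np.sin(np.pi * sweep_rate * t**2)

# Stand-in for the real measurement: a toy "ear" response (assumed values).
h_true = np.array([0.0, 1.0, 0.5, -0.25, 0.1])
y = np.convolve(x, h_true)                   # what the microphone would record

# Estimate H(f) = Output(f) / Input(f), with a small regularizer to avoid
# dividing by near-zero bins, then return to the time domain.
m = len(y)
X = np.fft.rfft(x, n=m)
Y = np.fft.rfft(y, n=m)
eps = 1e-12
H_est = Y * np.conj(X) / (np.abs(X) ** 2 + eps)
h_est = np.fft.irfft(H_est, n=m)[: len(h_true)]
```

Because the sweep spreads its energy over many samples, it can deliver far more total energy than a single impulse of the same peak level.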

The head-related transfer function is involved in resolving the cone of confusion, a set of points for which the interaural time difference (ITD) and interaural level difference (ILD) are identical, so that sound sources at many locations on the cone cannot be told apart from these two cues alone. When a sound is received by the ear it can either travel straight into the ear canal or be reflected off the pinna and enter the ear canal a fraction of a second later. The sound contains many frequencies, so many copies of the signal enter the ear canal at different times depending on their frequency (according to reflection, diffraction, and the interaction of high and low frequencies with the size of the structures of the ear). These copies overlap, and in the process certain components are enhanced (where the phases of the signals match) while others are canceled out (where the phases do not match). Essentially, the brain is looking for frequency notches in the signal that correspond to particular known directions of sound.
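The enhancement and cancellation described above can be sketched with the simplest possible model: a direct sound plus one delayed, weaker copy. For a delay τ the two copies are exactly out of phase at f = 1/(2τ), which produces the first notch. The delay, gain and sampling rate below are assumed values chosen so the notch falls on an FFT bin; a real pinna produces several direction-dependent reflections.

```python
import numpy as np

def first_notch_hz(delay_s):
    """Lowest frequency cancelled when a sound is summed with a copy delayed by delay_s."""
    return 1.0 / (2.0 * delay_s)

fs = 48000            # sampling rate in Hz (assumed)
delay_samples = 3     # extra travel time of the reflected copy (assumed)
g = 0.8               # relative amplitude of the reflected copy (assumed)

h = np.zeros(48)
h[0] = 1.0            # direct sound
h[delay_samples] = g  # reflected copy

mag = np.abs(np.fft.rfft(h))  # bins are 1 kHz apart (fs / 48)
notch = first_notch_hz(delay_samples / fs)
```

In this toy case the magnitude drops to 1 - g at the notch frequency and rises to 1 + g where the two copies are in phase.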

If another person's ears were substituted, the individual would not immediately be able to localize sound, as the patterns of enhancement and cancellation would be different from those patterns the person's auditory system is used to. However, after some weeks, the auditory system would adapt to the new head-related transfer function. The inter-subject variability in the spectra of HRTFs has been studied through cluster analyses.

Recording technology

Recordings processed via an HRTF that approximates the HRTF of the listener, such as in a computer gaming environment (see A3D, EAX and OpenAL), can be heard through stereo headphones or speakers and interpreted as if they comprise sounds coming from all directions, rather than just two points on either side of the head. The perceived accuracy of the result depends on how closely the HRTF data set matches the characteristics of one's own ears.

See also

  • A3D
  • Binaural recording
  • Dummy head recording
  • Environmental audio extensions
  • OpenAL
  • Sound Retrieval System
  • Sound localization
  • Soundbar
  • Sensaura
  • Transfer function


The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 