3D sound, HRTF, spatialized sound, object-based audio, binaural audio… There are many names for the technology that recreates 3D sound on stereo headsets. How is that possible? Let’s start at the beginning:
Rather than relying on multiple speakers, 3D sound uses only two speakers, preferably headphones. In order to understand how this works, we need to understand how localization of audio works for people with normal hearing. Our brains figure out where audio is located by responding to, among other cues, small differences in the timing and intensity of sound arriving at our two ears. The illustration below shows these two phenomena.
There are fancy names for these two main effects. The first is the difference in time between when the sound arrives at one ear versus the other. This is the interaural time difference (ITD). The other effect is the difference in sound intensity between the two ears, the interaural level difference (ILD). So, for any sound situated in space, the human brain figures out approximately where that sound is located from the incredibly subtle differences in when the sound arrives at each ear and how loud it is at each ear.
3D audio reproduces those subtle differences for every object in a scene. This requires a lot of computational power, mathematics and sophisticated digital signal processing algorithms. For example, if a sound is over to your left, it should reach your left ear before it reaches your right ear, and 3D audio will reproduce this interaural time difference. It should also be a bit louder in your left ear than in your right, because your head partly blocks the sound. Again, 3D sound will mimic this interaural level difference.
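As a rough illustration of the two cues described above, here is a minimal Python sketch that delays and attenuates a mono signal for the far ear. It is only a sketch under stated assumptions: the ITD uses Woodworth's spherical-head approximation, the head radius (0.0875 m) and the 10 dB maximum level difference are assumed round numbers, and real ILD depends strongly on frequency, which this ignores. The function names are made up for this example.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # metres per second in air at about 20 C
MAX_ILD_DB = 10.0        # assumed, illustrative maximum level difference


def itd_seconds(azimuth_deg):
    """Woodworth spherical-head ITD estimate.

    Azimuth 0 is straight ahead, +90 is fully right, -90 is fully left.
    The sign of the result follows the sign of the azimuth.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))


def ild_db(azimuth_deg):
    """Crude, frequency-independent ILD estimate (an assumption)."""
    return MAX_ILD_DB * math.sin(math.radians(azimuth_deg))


def spatialize(mono, azimuth_deg, sample_rate=48000):
    """Return (left, right) sample lists for a mono input.

    The ear farther from the source gets the signal later and quieter.
    """
    delay = round(abs(itd_seconds(azimuth_deg)) * sample_rate)
    gain = 10 ** (-abs(ild_db(azimuth_deg)) / 20)
    near = list(mono) + [0.0] * delay                 # arrives first, full level
    far = [0.0] * delay + [s * gain for s in mono]    # delayed and attenuated
    if azimuth_deg >= 0:
        return far, near   # source on the right: left ear is the far ear
    return near, far       # source on the left: right ear is the far ear
```

Under these assumptions, a source 90 degrees to the left at a 48 kHz sample rate reaches the right ear about 31 samples (roughly 0.66 ms) later than the left ear.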
To complicate all of this, the shape of your outer ear (pinna) also affects your ability to locate sounds in space. Sound waves bounce off your outer ear into your ear canal. Every ear is different and has a unique signature. The complex shape of our outer ears results in a complex pattern of sound resonances and diffractions that filter the sound. Even small variations in the shape of the pinna will change the spectrum of the sound pressure entering the ear canal. Thus the unique shape of our ears introduces a uniquely altered sound wave into our ear canals.
Our brains can decipher the tiny changes that our pinnae introduce to the audio to help pinpoint the sound location even further. If you change the shape of your pinnae, you will have a harder time locating where sounds are coming from.
There is a mathematical function that tries to capture all the ways in which the physical shape and structure of our head and outer ears change the sounds that reach our eardrums. This function is called the Head-Related Transfer Function (HRTF). Basically, it describes how our head and ears alter a sound, depending on the direction it comes from, before it reaches our eardrums.
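In practice, an HRTF is often applied in the time domain, where it is known as a head-related impulse response (HRIR): the mono sound is convolved with one HRIR for the left ear and one for the right. The sketch below shows that idea in plain Python. The impulse responses here are made-up toy values for a source on the left, not measured data, and the function names are hypothetical.

```python
def convolve(signal, kernel):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binauralize(mono, hrir_left, hrir_right):
    """Filter a mono signal with a left/right HRIR pair for one direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIRs (assumed values): the left ear hears the sound immediately
# and loudly, the right ear hears it two samples later and quieter.
TOY_HRIR_LEFT = [1.0, 0.3, 0.1]
TOY_HRIR_RIGHT = [0.0, 0.0, 0.4, 0.2, 0.05]

left, right = binauralize([1.0, 0.5], TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
```

A real renderer would pick (or interpolate) a measured HRIR pair for each source direction and would usually do the convolution with FFTs for speed, but the principle is the same.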
A technology that people are already familiar with is surround sound (a general term for multichannel audio beyond just stereo). Surround sound uses multiple speakers positioned around the listener to mimic the real world. Audio is created in separate channels for each speaker and embedded with the video or music. The diagram below shows the setup of a 5.1 system. The 5 indicates the number of main speakers and the .1 indicates the dedicated low-frequency channel that feeds the subwoofer. The drawback with surround sound is that it only reproduces audio in a flat plane around the listener, because the recommended speaker setup places every speaker at roughly the height of the listener’s head. Here is a good source for more information on the various surround sound formats.
Comparing 3D Audio experienced on a headset to surround sound.
Surround sound is a limited listening experience that relies on a small number of speakers, each fed by its own audio channel, to create a sense of location. The number of audio channels typically caps out at 7 speaker locations for consumer home audio. It requires specialized amplifiers to decode the multiple channels and depends on the speaker setup for accurate, if limited, audio reproduction. To be blunt, if you only have 7 speaker locations at which to place a sound, the precision with which that sound can be located will be very low.
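To put a rough number on how sparse the speaker positions are, the short sketch below finds the widest angular gap between neighbouring speakers. The angles are an assumption: they follow the commonly recommended 5.1 placement (front left and right at 30 degrees, surrounds at about 110 degrees), and a real room may differ.

```python
# Assumed 5.1 speaker azimuths in degrees (0 = straight ahead, positive = right).
SPEAKER_AZIMUTHS_5_1 = {"C": 0, "L": -30, "R": 30, "Ls": -110, "Rs": 110}


def largest_gap_deg(azimuths):
    """Widest angle between adjacent speakers around the full circle."""
    angles = sorted(a % 360 for a in azimuths)
    gaps = [
        (angles[(i + 1) % len(angles)] - angles[i]) % 360
        for i in range(len(angles))
    ]
    return max(gaps)


widest = largest_gap_deg(SPEAKER_AZIMUTHS_5_1.values())
```

With these assumed angles the widest gap is the 140 degree stretch behind the listener, where no speaker sits at all. Sounds in such a gap can only be suggested by blending the two neighbouring speakers.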
3D game audio, on the other hand, delivers stunningly accurate location information over a stereo headset by using Head-Related Transfer Function (HRTF) algorithms to mimic the effects of the pinnae, the head and various listening environments. The sound is delivered straight into the ear canal to produce the impression of real 3D audio sources. Headphones are the best way to get the 3D sound into your head. There’s no way for speakers to do the job as well, because there’s no way for them to stop each ear from hearing the sound that’s intended for the other.
Drown Earphones reproduce the 3D audio generated by the best modern games incredibly accurately. We take the highly accurate 3D audio and deliver it directly, through our patented waveguide, to your ear canal without any alteration by reflection or refraction in the outer ear. This allows us to improve your ability to locate items and adversaries within the game. Over-the-ear headphones broadcast that 3D audio from a short distance away from the ear, and the pinna of each user introduces slight variations to the 3D audio. This degrades the signal and reduces the accuracy of the spatial information.
Additionally, we transmit the acoustic vibrations directly to the nerves in the pinna to greatly improve the overall audio experience. Bass sounds are improved and the audio feels more immersive overall.