What Is Ambisonics Recording?
BINAURAL FOR VR APPLICATIONS
The immersive audio space is heating up these days, mainly due to the rise of virtual reality products like Oculus, Magic Leap and Google Cardboard. There are many different paths one can take when creating immersive audio for VR applications. However, one thing is clear: if you’re going to record sound for VR applications, you’re probably going to use an Ambisonics microphone.
Though binaural and ambisonics have a good deal in common, a couple of key characteristics set them apart:
- Binaural: Utilizes two microphones and the natural shape of the recordist’s outer ears to create an immersive audio recording.
- Ambisonics: Utilizes an array of microphones, often 4, in a very specific configuration to produce a similarly immersive recording.
- Binaural: Results in a single 2-channel audio file, which plays well with consumer technology (and is largely the reason binaural is better known than ambisonics).
- Ambisonics: Results in anywhere from 4 to 16 channels of audio, often in B-format, limiting its ability to integrate with consumer technology.
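For a four-capsule microphone, the step from raw capsule signals (often called A-format) to B-format can be sketched as simple sums and differences. The capsule names and the 0.5 scaling below are assumptions for illustration only; real converters also apply frequency-dependent correction filters, which this sketch leaves out.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Convert one sample from four tetrahedral capsules to first-order B-format.

    Assumed capsule layout (hypothetical names): front-left-up (flu),
    front-right-down (frd), back-left-down (bld), back-right-up (bru).
    """
    w = 0.5 * (flu + frd + bld + bru)  # omnidirectional component
    x = 0.5 * (flu + frd - bld - bru)  # front minus back
    y = 0.5 * (flu - frd + bld - bru)  # left minus right
    z = 0.5 * (flu - frd - bld + bru)  # up minus down
    return w, x, y, z


# A sound picked up equally by all four capsules has no directional part.
print(a_to_b_format(1.0, 1.0, 1.0, 1.0))
```

The four outputs are the W, X, Y and Z channels that make up first-order B-format.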
Some Quick Hard Facts On Ambisonics (Taken Straight From Wikipedia [What?! They Were Accurate!])
- Ambisonics was developed in the UK in the 1970s under the auspices of the British National Research Development Corporation.
- Despite its solid technical foundation and many advantages, Ambisonics has not been a commercial success, and survived only in niche applications and among recording enthusiasts.
- With the easy availability of powerful digital signal processing (as opposed to the expensive and error-prone analog circuitry that had to be used during its early years) and the successful market introduction of home theatre surround sound systems since the 1990s, interest in Ambisonics among recording engineers, sound designers, composers, media companies, broadcasters and researchers has returned and continues to increase.
Pros and Cons of this crazy recording technique
Pros:
- All speakers contribute to any one sound in any direction, as opposed to conventional pan-potted (pair-wise mixing) techniques which use only two adjacent speakers. This gives better localisation, particularly to the sides and rear.
- The stability and imaging of the reproduced soundfield vary less with listener position than with most other surround systems. The soundfield can even be appreciated by listeners outside the speaker array, although with reduced localisation performance.
- Only three channels are required for basic horizontal surround, four for a full-sphere soundfield. Basic full-sphere replay requires a minimum of six loudspeakers, four for horizontal.
- Ambisonics can be scaled to any desired spatial resolution at the cost of additional transmission channels and more speakers for playback. Higher-order material remains downwards compatible and can be played back at lower spatial resolution without requiring a special downmix.
- The core technology of Ambisonics is free of patents and a complete tool chain for production and listening is available as free software for all major operating systems.
Cons:
- Not supported by any major record label or media company.
- Not widely known, since it has never been well-marketed.
- Conceptually difficult for people to grasp, as opposed to the conventional one channel, one speaker paradigm.
- The decoding stage makes it more complicated for the consumer to set up.
- Since any one virtual source will be reproduced by several speakers with strong correlation (a situation which is usually avoided in N.1 production), it is prone to phasing artifacts when the listener moves or turns.
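The channel counts quoted in the list above follow a simple pattern as the Ambisonics order goes up. A minimal sketch, assuming the usual formulas of 2N + 1 channels for horizontal-only and (N + 1)² for full-sphere material at order N (the function names are made up for this example):

```python
def horizontal_channels(order):
    """Channels needed for horizontal-only Ambisonics at a given order."""
    return 2 * order + 1


def full_sphere_channels(order):
    """Channels needed for full-sphere Ambisonics at a given order."""
    return (order + 1) ** 2


for order in (1, 2, 3):
    print(order, horizontal_channels(order), full_sphere_channels(order))
```

First order gives the three and four channels mentioned above, and third order gives the 16 channels at the top of the range quoted earlier.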
Why Ambisonics Recording Is Great For VR Applications
There are a ton of spatialization software companies popping up these days. These companies take sounds and virtually place them within an environment. This is how the VR sound engine works in an Oculus video game, for example. When creating the video game, an engineer places sounds in a room before you ever experience that room, just as they do with visuals. Then when you play that game, both visuals and audio are there waiting to track with you as you move your head in space and time.
Capturing a recording in ambisonics is like capturing a sound field. Imagine standing in a field and holding an ambisonics microphone in your hand. When you hit record, you’re capturing sonic information specific to that environment right then and there. Unlike a binaural recording or a conventional multichannel surround format, its transmission channels do not carry ear or speaker signals. Instead, they contain a speaker-independent representation of a sound field called B-format.
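To make "speaker-independent representation" a little more concrete, here is a rough sketch of how a single mono sound arriving from a known direction can be written into the four first-order B-format channels. It assumes the traditional convention in which W is scaled by 1/√2 and azimuth is measured counterclockwise from straight ahead; other conventions order and scale the channels differently, and the function name is made up for this example.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumes azimuth 0 = front, positive azimuth = to the left,
    positive elevation = up, and W scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)                  # omnidirectional part
    x = sample * math.cos(az) * math.cos(el)   # front/back
    y = sample * math.sin(az) * math.cos(el)   # left/right
    z = sample * math.sin(el)                  # up/down
    return w, x, y, z


# A sound directly to the listener's left ends up almost entirely in W and Y.
print(encode_b_format(1.0, azimuth_deg=90))
```

Notice that nothing in these four numbers refers to a speaker or an ear; the direction is stored in the balance between the channels.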
This B-format is huge, because it is highly modifiable.
B-format is often decoded to the listener’s speaker setup. This extra step allows the producer to think in terms of source directions rather than loudspeaker positions and offers the listener a considerable degree of flexibility as to the layout and number of speakers used for playback.
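As an illustration of that decoding step, the sketch below turns the same horizontal B-format sample into feeds for two different speaker layouts. It uses a simple "virtual microphone" approach, pointing an imaginary cardioid microphone at each speaker, and assumes W is scaled by 1/√2. This is only one basic way to decode; production decoders are more refined, and the function name and `pattern` parameter are assumptions for this example.

```python
import math


def decode_horizontal(w, x, y, speaker_azimuths_deg, pattern=0.5):
    """Decode horizontal B-format to one feed per speaker.

    Each feed is a virtual microphone aimed at the speaker's azimuth.
    pattern=0.5 gives a cardioid; pattern=1.0 would be omnidirectional.
    No overall gain normalization is applied.
    """
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        directional = x * math.cos(az) + y * math.sin(az)
        feeds.append(pattern * math.sqrt(2) * w + (1 - pattern) * directional)
    return feeds


# One B-format sample for a source straight ahead (W = 1/sqrt(2), X = 1, Y = 0).
front = (1 / math.sqrt(2), 1.0, 0.0)
square = decode_horizontal(*front, [45, 135, 225, 315])
hexagon = decode_horizontal(*front, [0, 60, 120, 180, 240, 300])
print(square)
print(hexagon)
```

The recording itself never changes; only the list of speaker angles does, which is the flexibility described above.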
Capturing an ambisonics recording and giving it to a company like Big Ears is like letting a child loose in a candy store. The file format allows their software to manipulate the recording for any playback situation without degrading its quality or effectiveness.
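One example of this kind of manipulation is rotating the whole sound field to follow a listener's head, which is what a VR headset needs. A minimal sketch for rotation about the vertical axis, assuming first-order B-format with X pointing forward and Y pointing left (the function names are hypothetical):

```python
import math


def rotate_yaw(w, x, y, z, angle_deg):
    """Rotate a B-format sample counterclockwise about the vertical axis.

    W (omnidirectional) and Z (up/down) are unaffected by a yaw rotation.
    """
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return w, x_rot, y_rot, z


def compensate_head_yaw(w, x, y, z, head_yaw_deg):
    """Counter-rotate the field so sounds stay put when the head turns."""
    return rotate_yaw(w, x, y, z, -head_yaw_deg)


# A source straight ahead; the listener turns their head 90 degrees to the left.
front = (1 / math.sqrt(2), 1.0, 0.0, 0.0)
print(compensate_head_yaw(*front, head_yaw_deg=90))
```

After the head turn, the source sits in the negative Y direction, that is, to the listener's right, which is where a sound that was in front of you should be once you have turned left.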
What happens when VR moves from a consumption tool to a creation tool?
The growth will be explosive and the supply will not be able to meet the demand. In order for VR to grow, everyone needs to have access to it, not just on the playback side but on the creation side. All great technologies have succeeded because everyday consumers have had control. This new frontier is no different.
YouTube has already incorporated a feature that allows a user to upload 360 video, and there are already consumer 360 video cameras on the market.
This means I could capture my cousin’s birthday party in 360 video, upload it to YouTube, and in minutes, my buddy wearing an Oculus across the country could be visually transported.
Next, how about a consumer ambisonics microphone that does the same thing?
From One Ear To Another,