This article is about modern multi-channel surround sound processing for home theaters, and how it relies on audio codecs.
Codecs are the key to home theater audio nirvana; without them you're stuck with 2-channel stereo. Codecs are translators. They use mathematical algorithms to organize, compress, and de-compress sound. The subject is complicated, but the gist of it is codecs act as mediums allowing audio to be split into multiple channels on a recording medium, and in some cases permit multi-plexing across channels. Since their birth in the 1970's, Dolby and DTS in particular have been known for their continuous evolution of audio formats and accompanying codecs. Both companies have transcended generations of sound format modernization, culminating with 3-Dimensional (3D) codec solutions introduced to consumers in the mid-2010's. They're not the only game in town, but they are the "big dogs" of the industry.
Dolby Atmos, DTS:X, DTS:X Pro, and Auro-3D are competing so-called "immersive" surround sound techniques that eschew traditional directed audio channels in favor of audio "objects" which are basically free-form when it comes to output channels. Some of these audio formats are backward compatible with channel-based audio, and some are not.
Learn more about these competing 3D audio formats in Epic Battles of Home Theater Audio: Dolby, DTS, and Auro
When it comes to codecs, Dolby Atmos tends to be a particularly confusing subject for many people. The fact is, there is no codec for Dolby Atmos, though it requires one. Since it is not its own codec, you can think of Atmos as a sub-routine that must be called by a codec. Meanwhile, DTS:X, DTS Neural:X, and Auro-3D don't have this problem. They are stand-alone codecs and do not rely on other technology to work (Auro-3D is part of the Auro-Codec).
Secret Decoder Rings
Codecs are specific. If you want to listen to content created with a particular encoder, you'll need the opposite or reverse-logic version of that codec: a decoder. Codecs are intellectual property, but for various reasons some decoders have evolved the ability to process a multiplicity of encoded audio streams, while others have a singular focus. The proliferation of computer-based playback devices available to consumers (e.g. media servers) has resulted in decoders being commonly available. Many codec owners will distribute the decoder version of their codec for free in order to promote its use.
Modern codecs tend to share a number of properties normally associated with internet security methodologies. For example, some codecs are asymmetric: their decoders are designed so that an end user cannot reverse-engineer an encoder from them. That's what makes it possible for many modern decoders to be released free to the public. If anyone could create encoded content from the decoder alone, it would likely destroy the economic value of the owner's ability to sell encoders.
Online discussions of codecs are almost always referring to codec decoders.
When it comes to the world of multi-channel audio, understanding and choosing from the variety of codecs can be daunting; and the fact is most of them are only compatible with a single audio format. This means aligning your content audio format with your playback device is essential. If you're building a media server, make sure you will have the means to use any necessary licenses (if any) required for decoding and audio playback before you make your final design decisions.
Channels: How Many Is Enough?
A few consumer-grade audio codecs support up to 32 channels of sound, though most are limited to 14 channels at the upper end. In home theater discussions, "channels" normally means output channels; that is, how many speakers are in your system. However, the term can also refer to input channels, and there the dynamics are quite different. Legacy audio codecs are limited to a maximum of eight (8) channels or tracks. Keep in mind one (1) of those channels is almost always reserved for Low-Frequency Effects (LFE), i.e. bass. That leaves seven (7) channels for everything else, which is where the "7.1" surround-sound standard comes from. That is the limit of legacy analog audio multi-channel: eight (8) dedicated channels. Since around 2015, there has been another solution for more complex systems. It involves codecs that utilize a ground-breaking concept called Audio Objects (also known as Object Based Audio).
Let's discuss for a moment digital codecs and maximum number of channels. Regardless of the number of speakers the business-end of your home theater (normally, an A/V receiver) can handle, and which audio codecs it supports, there is also the matter of the content you want to play and how it is formatted. These days, that content is likely to be in some sort of digital, multi-channel format. Well, guess what? Those digital storage formats, whether they are online (streaming) or a file on a physical device such as a USB stick, all also rely on codecs to organize audio data.
As you can see in this chart of digital codecs, there are approximately 18 mainstream codecs that support eight (8) independent audio channels or the 7.1 format. If you wanted more channels, you have about 11 options to choose from. That sounds like a lot; however, when you dig into the details (see the aforementioned chart), you'll find only a few are natively supported by most operating systems. Now, let's say you're intent on building a 32-speaker home theater system and you expect your source material to have 32 independent channels of audio data. At that point, you can only count on three (3) digital codecs that are natively supported across the board of operating systems. Of those, one is raw audio (PCM). You are now left with just two (2) practical solutions for audio: AAC (MPEG-4) or AC-3. The point here is do your homework before you commit!
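To make the channel arithmetic concrete, here is a minimal Python sketch that splits a layout label such as "7.1" into its full-range and LFE parts. The function name `count_channels` is a hypothetical helper invented for this illustration; it is not part of any codec or library.

```python
def count_channels(layout: str) -> dict:
    """Split a layout label such as "7.1" into full-range and LFE counts.

    Hypothetical helper for illustration only: the number before the dot is
    the count of full-range channels, the number after it is the LFE count.
    """
    main_text, lfe_text = layout.split(".")
    main, lfe = int(main_text), int(lfe_text)
    return {"full_range": main, "lfe": lfe, "total": main + lfe}


# The layouts mentioned in this article.
for label in ("5.1", "7.1", "13.1"):
    print(label, count_channels(label))
```

For "7.1" this gives seven full-range channels plus one LFE channel, for the eight total described above.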
Modern Multi-Channel Audio Codecs
Before delving into a very high-level review of the most common modern multi-channel surround sound codecs for the home theater market, here is a list of related articles that offer more depth on various subjects, should any of them interest you:
- Epic Battles of Home Theater: Dolby, DTS, and Auro
- Epic Battles of Home Theater: Audio Objects
- How Audio Files Work: Codecs and Containers
Now, let's get down to the business of the audio codecs you should be paying attention to if you're building or already have a modern home theater. The key players are:
Dolby
Released in 1992, AC-3 was the dawn of true 5.1 surround sound for home theaters. Today, the current limit of Dolby's most advanced audio codec is 32-channels (though to date no consumer product on the market supports that many channels).
Looking for a more comprehensive guide to Dolby formats? Check out Epic Battles of Home Theater: Dolby, DTS, and Auro.
Dolby AC-4 (2019)
Dolby AC-4 is the most recent lossy audio codec released by Dolby Laboratories, and is likely to begin trickling into the audio/video world in earnest this year (2020).
Originally announced in 2015, the format's launch window was drastically underestimated by Dolby. Besieged with technical challenges, spats, and in-fighting within the broadcast industry, the format originally anticipated for release a year later has taken nearly five (5) years to get out the door.
AC-4 was strictly built for broadcast streams. Don't allow Dolby's marketing machine to fool you into believing otherwise!
From a structural viewpoint, AC-4 is somewhat unique. It blurs the lines between a codec and an audio container format. AC-4 is the first broadcast format to jump on the bandwagon of audio objects, re-imagined as "frames." Streams consisting of channelized audio content are embedded inside the frames.
It seems very likely Dolby will utilize AC-4 as a stepping stone to move toward codecs packaging audio as audio objects only, a-la Auro-3D (2011) and DTS:X Pro (2019), both of which have proven the concept is favorable in the market. However, content authors and streamers should expect sub-streams using traditional channelized audio will continue to be in high demand for quite some time. Less capable devices with limited audio channel outputs or devices with limited bandwidth may prefer to receive audio data encompassing minimal channel encoding due to their hardware and presentation limits, potentially discarding object oriented and surround sound channelized bit-streams to conserve bandwidth or because the form factor means it's pointless to transfer such data types. An abandonment of legacy channelized codecs is extremely unlikely within the next 5-10 years, given the vast number of Dolby playback products in the field that do not support audio objects.
Dolby Atmos (2014)
Sometimes, being the first to market is not as advantageous as one would think. Of course, that depends on how one measures success. From an adoption perspective, Atmos is the clear leader in so-called "immersive" sound effects. It has the greatest brand-name recognition of all the "3D" surround sound formats and is the most prevalent of such formats on media (e.g. Blu-rays). A quick scan of the Blu-ray Database reveals over 1,300 Atmos-enabled titles versus 250+ for DTS:X, and fewer than 50 for Auro. If you want to create a home theater designed around maximizing its ability to replay content in a native 3D format (versus a simulated format such as DTS Neural:X), Dolby Atmos is the clear winner. However, if you go that route, understand you must play by Atmos' rules. This means your speakers must be placed in specific locations, within particular guidelines. All that glitters is not always gold.
Dolby Atmos is not a codec. It is functionality built into Dolby TrueHD.
Atmos was designed to bridge both a technological gap and a home theater design gap at the same time. It's not a bad thing. Just be certain you understand what you're buying into. First, Atmos functions as a bridge of sorts between the world of discrete audio channels and the emerging world of audio objects. Second, it acts as a go-between in how end users visualize constructing a home theater. Suddenly, the ceiling and "height" speakers are a thing. In the past, ceiling speakers were viewed as an inferior form of audio relegated to shopping malls and large department stores; little more useful than a public address system. Atmos changed that view. Now, it is not uncommon for a new home theater to be designed in such a way that the room is viewed as a cube, with each side of the cube receiving some sort of audio treatment (for the floor, this typically means potential enhancements such as chairs with bass vibrators, often referred to as "butt kickers"; a term similar to the brand name of one manufacturer).
The subject of Dolby Atmos support is confusing because Atmos support falls within the auspices of the TrueHD codec. While Atmos is part of the Dolby TrueHD standard, not all TrueHD encoder/decoder devices support Atmos. The key to Atmos support lies in the TrueHD codec version. Audiophiles may recall Dolby Laboratories launched TrueHD in 2005, but Atmos did not debut until 2014; and that was on a small number of HD and Blu-ray DVDs only. Atmos didn't hit the home theater receiver market until late 2015, and even then we're talking about slim pickings of only the most expensive receivers. Widespread support for Atmos didn't begin to happen until 2017. The end result is there are a lot of devices in use that do not support Dolby Atmos (though they do of course support Dolby TrueHD).
Hopefully, by now you can see where this is going. The codec is effectively the same. The difference lies in the version of the TrueHD codec. If your listening device's TrueHD codec was licensed prior to the launch of Atmos, your device will not support Atmos. If your device licensed the TrueHD codec after Atmos support was added to the codec, then you'll have the ability to replay Atmos content. The average time a home theater enthusiast waits between a/v receiver upgrades has dwindled from over 10 years to under five (5).1 The reason is exactly what we are talking about here - the rapid advance of audio and video technology - which is pushing consumers to upgrade older equipment more frequently as they seek the capability of taking advantage of new developments in "immersive" home theater technologies. If you're curious, take a look at this timetable of Dolby vs. DTS audio tech advancements.
What happens if a non-Atmos compatible TrueHD device attempts to play Atmos-specific content? The object-based Atmos information will be ignored if the decoder doesn't recognize it.
Although Atmos falls under the auspices of the Dolby TrueHD standard, Atmos functions in a unique way compared to prior Dolby audio. Atmos does not involve any audio channels at all. To the contrary, Atmos provides spatial context to content, and it does this by specifying characteristics of up to 128 audio objects [see What Is Object-Based Audio?].
When you think of a transmission of audio data, any context provided by an Atmos broadcast will be represented by these audio objects, which are transmitted along with the eight (8) distinct channel streams. The connection between Dolby TrueHD and PCM remains a 1:1 relationship (8 simultaneous channels). Embedded Atmos data does not affect channel-based audio streams.
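The "ignore what you don't understand" behavior can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual TrueHD bitstream layout: the `Frame` class, its fields, and the `decode` function are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One hypothetical frame: a channel bed plus optional object metadata."""
    bed: list                                    # one sample per bed channel (8 for 7.1)
    objects: list = field(default_factory=list)  # e.g. [{"id": 1, "pos": (x, y, z)}]


def decode(frame: Frame, supports_objects: bool) -> dict:
    """Return what a decoder would hand on to the rest of the system."""
    out = {"bed": list(frame.bed)}  # the channel bed is always decoded
    if supports_objects:
        out["objects"] = list(frame.objects)
    # A decoder without object support never looks at frame.objects at all.
    return out


frame = Frame(bed=[0.1] * 8, objects=[{"id": 1, "pos": (0.0, 0.5, 1.0)}])
legacy = decode(frame, supports_objects=False)
modern = decode(frame, supports_objects=True)

print(legacy["bed"] == modern["bed"])  # the channel audio is identical either way
print("objects" in legacy)             # the legacy decoder produced no object data
```

The point of the sketch is that the eight channel streams come out the same on both devices; only the object-aware decoder receives the extra spatial data.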
TrueHD
Released in 2005, Dolby TrueHD is the root of Dolby's generation of multi-channel surround sound prior to AC-4. Putting this in perspective, Dolby Atmos is not a codec; it is an extension that rides on top of TrueHD. Knowing this, it becomes easier to fathom how TrueHD became the linchpin of Dolby's audio strategy for nearly 15 years. Unfortunately, many consumers don't understand these nuances, just as many people mistakenly believe THX is a stand-alone audio and/or video codec. It's not. Both Atmos and THX are simply add-on functionality provided by codecs such as TrueHD; the real work-horses of multi-channel audio.
TrueHD's lossless codec is actually Meridian Lossless Packing (MLP), and was developed by Meridian.
2005 was a busy year for Dolby. Dolby TrueHD debuted alongside Dolby Digital Plus. The codecs differ substantially from one another. TrueHD is a lossless codec based on Meridian Lossless Packing and yields a much higher maximum bit-rate (18 Mbps vs. 6.144 Mbps).2 Though TrueHD can carry up to 14 audio channels (13.1), in practice it is limited to 7.1. At the time, Dolby sought to future-proof its codecs and get ahead of where the market seemed to be going. As with Dolby Digital Plus, the maximum channel capability of Dolby TrueHD was designed to exceed what was physically possible at the time in order to set the stage for the future. In the case of Dolby TrueHD, its high bit-rate created an additional challenge as well: storage space and information processing requirements well beyond the capabilities of any audio processor and storage medium at the time. In lossy mode, TrueHD's compression is mediocre. It cannot attain better than roughly a 2:1 ratio (50% compression), meaning there is still a huge amount of audio data to be stored and transmitted even as a "lossy" codec. Going from 7.1 to 13.1 almost doubles the space and bandwidth required, making it impractical for content producers to include it as a native format on media. Even if they could, the reduction in bit-rate required would likely position the end-result as inferior to tracks with fewer channels with higher audio resolution.
TrueHD was well ahead of its time. For starters, the HDMI standard at that time did not support the bandwidth TrueHD was capable of (and the current version in 2020 still doesn't, 15 years later). The same was true of Blu-ray discs until late 2017. Promoted as an 8-channel audio format (7.1) upon release, TrueHD was designed primarily to support the HD-DVD standard (coupled with the aforementioned effort at readiness for greater multi-channel formats anticipated in the future). Most Blu-ray discs also support TrueHD audio. Even with the advent of AC-4, don't expect TrueHD to relinquish its crown in home theaters anytime soon. AC-4 is squarely a broadcast-oriented platform.
With eight (8) channels, TrueHD could more than saturate an HDMI cable's bandwidth; let alone what a Blu-Ray disc could handle. HDMI was the primary bottleneck of data throughput in home theater until the release of HDMI 2.0 in 2013, when its capabilities finally matched Blu-Ray discs (18 Mbps throughput speed). Even limited to eight (8) channels, at maximum audio definition, Dolby TrueHD easily exceeded this and could overwhelm a system. Even today, the most recent HDMI version (2.1) released in 2017 maxes out at a data throughput speed of 42.6 Gbps.3 This is still less than what TrueHD is capable of using just half of its available channels. Furthermore, this doesn't take into consideration the requirements of video bandwidth and other forms of data normally included in a signal. Doubling today's maximum HDMI throughput (version 2.1), you'd still be able to completely saturate its bandwidth with just a TrueHD audio stream.
What is HDMI 3? A marketing gimmick; much like 3G/4G/5G in the cellular telecommunications industry. HDMI 3 is a poorly coined slang term meaning "HDMI 3rd generation," which is HDMI 2.1.
Ironically, TrueHD's capabilities are also limited by itself. Its maximum bandwidth per channel is approximately 4,608 kbps uncompressed (24-bit depth x 192 kHz sampling rate = 4,608 kbps). Given its maximum encoded bandwidth of 18,000 kbps, even when compressed, the codec cannot quite handle eight (8) audio channels at maximum fidelity. Thus, there is a trade-off between number of channels versus audio fidelity. Six (6) channels (5.1) can make use of its maximum per-channel data throughput, but a full complement of 7.1 sound (8 channels) cannot. Coupled with the limitations of HDMI and the bandwidth demands of high-definition video, even six (6) channels of maxed out audio is not realistic at this time when paired with the best video resolution.
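The trade-off above is easy to check with a little arithmetic. The following Python sketch uses only the figures quoted in this article (24-bit, 192 kHz, an 18,000 kbps ceiling, and a best-case 2:1 compression ratio); the constant and function names are invented for the illustration, and real-world compression varies with the content.

```python
BIT_DEPTH = 24             # bits per sample
SAMPLE_RATE_KHZ = 192      # thousands of samples per second
MAX_ENCODED_KBPS = 18_000  # TrueHD ceiling quoted in this article
COMPRESSION_RATIO = 2.0    # assumed best case of roughly 2:1, per this article

# Uncompressed bandwidth of a single channel at maximum fidelity.
per_channel_kbps = BIT_DEPTH * SAMPLE_RATE_KHZ  # 24 x 192 = 4608


def encoded_kbps(channels: int) -> float:
    """Estimated encoded bit-rate for the given number of full-fidelity channels."""
    return channels * per_channel_kbps / COMPRESSION_RATIO


for channels in (6, 8):
    rate = encoded_kbps(channels)
    print(channels, "channels:", rate, "kbps, fits:", rate <= MAX_ENCODED_KBPS)
```

Under these assumptions six channels come to 13,824 kbps and fit under the ceiling, while eight channels come to 18,432 kbps and just exceed it, which matches the "cannot quite handle" description above.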
Dolby Digital Plus
Based on Dolby Digital (AC-3), Dolby Digital Plus (also known as DD+, Dolby Digital Enhanced AC-3, and E-AC-3) is a lossy codec capable of supporting up to 16 channels, including one dedicated LFE channel.
E-AC-3 is incompatible with AC-3.
Dolby Digital Plus also debuted in 2005. Sporting a maximum data compression ratio of 12:1 (4:1 to 6:1 typical), DD+ was designed to be multi-purpose. For instance, its streams are always 16-bit audio depth at either 32 or 44.1 kHz. This yields a total bit-rate of 192, 448, or 640 kbps. Why is this important? Well, it's probably not to the average person, but it's very important to broadcasters. They need a reliable and consistent method of transmitting digital content. Dolby Digital Plus's data throughput rate is about 10 times higher than that of its predecessor (Dolby Digital).
Now that we've established what DD+ is, why would Dolby release TrueHD - a lossless AND lossy codec - and Dolby Digital Plus at virtually the same time? Why not simply use TrueHD as its lossy platform du jour? After all, TrueHD has a lossy mode. The answer is: they serve different markets. Dolby Digital Plus' predecessor (AC-3) was very popular across the board in all markets. However, DD+ was designed with an eye toward broadcast streaming particularly. AC-3 set a precedent by establishing itself as the de-facto standard in television broadcast audio and home theater, but at the time there really wasn't a better alternative for the latter (even though audio fidelity was reduced from formats released well before it). The release of TrueHD and Dolby Digital Plus marks a turning point in Dolby's audio codec history. It was the point at which Dolby consciously forked home theater and broadcast audio codecs into distinct paths.
Dolby Digital Plus was invented prior to Atmos and other object-oriented sound formats and therefore, it did not originally support them. It does now.4
AC-3 was adopted by the ATSC (Advanced Television Systems Committee) in 1995 as the standard for broadcast digital multi-channel audio (ATSC A/53). DD+ was designed specifically to accommodate the transmission of Blu-ray audio across a broadcast medium. TrueHD was designed to deliver a Blu-ray multi-channel audio stream in all its full bit-rate glory within a local environment. Case in point: Netflix was one of the first companies to adopt DD+ across its entire streaming platform.5 A primary benefit of AC-3 for broadcasters was its ability to transmit both a multi-channel and 2-channel stereo signal simultaneously.6
DTS Coherent Acoustics (DCA)
All DTS codecs are owned by a company of the same name (DTS, Inc.), which used to be called Digital Theater Systems. DTS is another company with considerable pedigree in the audio industry, though it did not enter the market until about 10 years after Dolby. Commonly referred to as simply DTS, the home theater variants of the company's codecs are technically known as DTS Coherent Acoustics or DCA codecs. However, in an effort to avoid confusing the matter, let's stick with the common nomenclature of simply DTS for the purpose of this discussion. The average person couldn't care less anyway.7
DTS codecs are based on the adaptive differential pulse-code modulation (ADPCM) audio data compression algorithm. In contrast, Dolby Digital (AC-3) is based on the modified discrete cosine transform (MDCT) compression algorithm. Different flavors of ice cream.
Atmos vs. DTS
There are subtle differences between the two, though most consumers are oblivious to their nuances. Here are a few examples:
- DTS is free of channel restrictions; while Atmos enforces specific channels for height
- Atmos also upmixes, but handles it a bit differently from how DTS does. DTS:X hands off upmixing requests to an entirely different codec, whereas Atmos upmixing is incorporated into the Atmos portion of the TrueHD codec.
- DTS does not specify speaker locations; whereas Atmos does.
DTS:X and its cousin DTS:X Pro are the first exclusive audio object formats from DTS. The manner in which they operate differs substantially from Dolby's Atmos audio-object implementation. Using an exclusive audio object format (meaning all audio input is converted into audio objects), these codecs produce a decoder stream consisting of audio objects only. Period.
Another key differentiator from its primary rival, Dolby Atmos, is that the DTS:X family is speaker "agnostic," as DTS calls it. Unlike Atmos, there is no pre-determined speaker position required for any given output type. However, whether or not that is a good thing is the subject of much debate. While it certainly sounds good (pun intended) to consumers, many audiophiles are skeptical that such an implementation is the best method when the end user's goal is immersive realism.
Aside from a pure audio-object model (on the output side), DTS broke ranks with Dolby in terms of its marketing strategy as well. DTS:X is encoded via DTS' MDA (Multi-Dimensional Audio), a license-free codec that allows movie makers to control the placement, movement and volume of sound objects.8
DTS:X (Nemesis of Atmos)
DTS:X is a high fidelity, multi-channel audio codec that competes directly with Dolby's TrueHD.
Dolby Labs was the first company to lay down the audio-object gauntlet. Like Dolby Atmos, DTS:X uses a revolutionary concept called audio objects to identify audio data that doesn't fit neatly into the 7.1 channel sound format.
DTS:X is DTS' response to Dolby Atmos. Just like Atmos, consumers tend to think of DTS:X as simply adding ceiling (or "height") speaker capabilities, however that is in fact an oversimplification. Just like Dolby Atmos, instead of using channels as we've seen historically in surround sound (i.e. for more immersion, add more channels), both Dolby Atmos and DTS:X utilize an object-based model for defining the audio data sent to the ceiling speakers (and potentially others). The whole thing gets quite complicated, but the gist of it is the number of audio channels is unchanged when compared to non-Atmos/non-DTS:X codecs. This is how non-Atmos and non-DTS:X capable devices are able to be "backwards compatible" with Atmos and DTS:X content. They aren't really backwards compatible so much as they simply ignore the object-based data they don't understand.
DTS Neural:X
DTS Neural:X is basically pseudo DTS:X.
DTS Neural:X up-mixes non-DTS:X (spatial) content so that it may be played on a home theater system designed for Atmos and DTS:X (i.e. with ceiling/height speakers). It uses algorithms to make educated guesses as to which channels contain data that should be converted into audio objects (such as sounds destined for ceiling speakers), which are then fed to a DTS:X decoder.
When activated on supporting devices, Neural simulates DTS:X like content by making inferences about how audio content would have been encoded if it were using object-based encoding. In other words, it creates object-based audio content when none exists by effectively guessing at how it should sound. Neural evaluates the data coming in on the normal audio channels of a track and estimates where portions of the content could have possibly been assigned if there was real object-based data. The result is a mixed bag. Some listeners tend to like the effect, and some don't. It takes creative license with regards to what the original author or director intended. That may or may not be a problem, depending on your perspective and the work of art involved.
DTS Neural:X is a separate DTS codec and replaces DTS Neo, unlike Dolby Atmos (which offers the same functionality but under the Atmos compatible TrueHD codec).
DTS Neural:X sounds comparable to what Dolby Atmos does with its up-mixing - and it is - but with a twist. DTS Neural:X does one thing that Dolby Atmos can't: Neural:X is capable of taking discrete mono signals, combining them, and upmixing them for 7, 9, or 11 speakers. Atmos cannot do that (though it can up-mix certain combinations of Dolby surround sound recordings to simulate Atmos capabilities).9 This raises the question: is that important? It's a controversial topic, but on balance the expert opinion seems to be "no." Any way you look at it, Atmos, DTS:X, and Neural:X all offer some really cool features.10
DTS:X Pro
DTS:X Pro is the first exclusive audio object format from DTS. It is a substantial departure from Dolby's Atmos focus, which is a hybrid of traditional audio channels plus audio objects. The main difference between DTS:X and DTS:X Pro is the latter is designed for commercial implementations, with a stunning 32-speaker output capability.
Auro Technologies
Auro Technologies is a corporation and recording studio located in Mol, Belgium. Founded in 1980, the company is known for innovative work in high-end commercial environments, such as sound stages and movie theaters. The company is a pioneer in 3D audio, having developed the first 3-dimensional audio concept and format in 2005.11
Auro-3D
Auro-3D is a little-known spatial multi-channel listening codec that deserves broader attention.
An alternative implementation to Dolby Atmos and DTS:X of three-dimensional spatial sound objects, Auro-3D was invented by a Belgian-based company named Auro Technologies. Auro-3D has several different implementations, such as Auro 11.1 and AuroMax. Auro-3D is an interesting beast. On the one hand, it offers competition to Atmos and DTS:X; on the other hand, it feels like a step backwards in some respects (one mode is channel based only and another is built on top of DTS). Then again, what's not to like about a 26-channel codec?
Dolby and DTS have a lot of built-in clout within the entertainment industry. After all, they have been defining audio tracks in movies since the early 1970's. However, in some respects Auro-3D is arguably the best of the bunch. In spite of its attempts to woo consumers, the fact is there is a chicken-and-egg problem. Why encode with a particular codec if there is no demand for it? Why build a decoder into a product (e.g. a/v receiver) unless there is a demand for it? Thus, champions of new ideas must convince both sides they want it. Dolby excels at this process. They have consistently led the rollout of new technologies since their debut of digital sound encoding equipment in the 1970's.
It boils down to the level of support from your 3rd party codec in whatever product your device uses. Since these codecs are all proprietary, users are dependent on the licensing arrangement and hardware/software implementations of the vendors supplying the codecs. Unfortunately, this means you must be diligent and check whichever particular codec you are using or are considering using. Furthermore, locked hardware (i.e. hardware devices that cannot be upgraded via software/firmware updates) is particularly vulnerable to planned obsolescence when it comes to audio codecs and formats. Caveat emptor.
More capable than Atmos, DTS:X, and DTS:X Pro, Auro-3D has in fact been around much longer than either the Dolby or DTS audio-object based codecs. Even so, Auro-3D (also frequently referred to as simply Auro) is the latest mainstream entrant into the home theater surround sound codec rodeo. Sporting a maximum of 32 channels, Auro-3D is the first audio codec that relies exclusively on the latest concept in multi-channel audio content delivery: audio objects.12 To date, there is no specific file extension associated with Auro-3D.
Auro-3D splits the difference between Dolby Atmos and DTS:X from an implementation perspective. Auro-3D is Auro's only codec. Like Atmos, it understands both channels and audio objects, however like DTS:X it does not require a specified position of "height" speakers (as does Atmos). Now, you might be thinking how can this be? That doesn't make any sense. Well, as mentioned previously, Auro-3D's approach is more-or-less a hybrid or convergence of the strategy utilized by Atmos and DTS:X. While the placement of speakers is not rigid as with Atmos, it's not completely free-form like DTS:X either. Auro-3D tags audio object data directed to a specific channel or channel type (e.g. ceiling speakers) and uses what Auro calls "channelized" or directed audio. In other words, Auro-3D maps audio objects to specific channels when the metadata for the object indicates such. This allows content creators to establish rules for audio playback (again, much like Atmos), while also permitting some flexibility given the fact the creator cannot know how the playback environment will be setup.
The standard Auro-3D configuration at the time of this writing is up to a 13.1 channel (speaker) layout, with 7.1, 9.1, and 11.1 being more common. Auro-3D allows the stream to be encoded as any of these, and the decoder is able to do a very good job of matrixing up or down to fit the user's environment. Speaker configurations beyond 13.1 are possible, but are exceptionally rare at this time. The format supports up to 32 speakers, in line with DTS:X Pro.
Where's the Auro-3D Codec? There isn't one. Auro-3D is not an independent codec. It is layered on top of DTS. While it has its own algorithms, Auro-3D doesn't technically qualify as a codec in the sense that DTS:X and Dolby TrueHD do. This is because Auro-3D is reliant upon DTS:X, and adds another layer of processing on top of it. Take away DTS:X, and Auro-3D will not work. Therefore, there is no stand-alone codec for Auro-3D.
Auro-3D is not just another spatial codec; it is a multi-channel solution that eschews the fixed-channel + audio object design of Dolby Atmos and DTS:X. Incorporating up to 26 channels, AuroMax converts ALL channel data containing three-dimensional spatial audio data into sound objects. Therein lies the reason why Auro-3D is in many ways superior to its competitors. Auro-3D is a fresh approach, using a modern framework as its starting position, while Dolby and DTS are saddled with continuing to provide and support legacy solutions. Auro-3D does not so much mimic Atmos and DTS:X as it re-writes the playing field in a way that follows the object-based logic of those formats.12 As of this writing, Auro-3D has a very small footprint in the spatial sound codec consumer market (i.e. home theater systems), which remains dominated by Dolby Atmos and DTS:X. There is no substantial difference to the perception of most listeners when it comes to the sound quality of one of these audio designs versus the others.
What Is Object-Based Audio?
What do Dolby Atmos, DTS:X, and Auro-3D all have in common? They each incorporate object-based logic when it comes to expressing audio content. In a nutshell, traditional channel-based audio maps each channel 1:1 to a speaker position, while object-based audio is a more fluid concept. Artifacts in an audio stream are designated as objects, and the decoding device determines (based on a set of variables it controls) how to express each audio object to the end user. Take the sound of an explosion or screeching tires in a movie car chase. Channel-based audio assigns those sounds to specific speaker channels. This is a good approach if your intention as an end user is to reproduce sound as the original author intended. However, it is not sensitive to environmental variables that could impact your listening experience, such as the position of speakers in your room. Object-based audio, on the other hand, hands more control over to the decoding device. The challenge lies in mapping the characteristics of an object to the equipment decoding and expressing it; for example, the decoder must account for whether or not your movie room has overhead speakers.
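The contrast can be sketched in a few lines of Python. This is a toy illustration, not how any Dolby, DTS, or Auro renderer actually works: the speaker names and angles, the `render_object` function, and the "nearest two speakers, weighted by angular closeness" rule are all assumptions made for the example. Real decoders use far more sophisticated panning.

```python
import math

# Hypothetical speaker layouts: name -> (azimuth_deg, elevation_deg).
LAYOUT_5_1 = {
    "L": (-30, 0), "R": (30, 0), "C": (0, 0),
    "Ls": (-110, 0), "Rs": (110, 0),
}
# Same room with two overhead speakers added.
LAYOUT_5_1_2 = dict(LAYOUT_5_1, TopL=(-90, 45), TopR=(90, 45))


def angular_distance(a, b):
    """Angle in degrees between two (azimuth, elevation) directions."""
    az1, el1, az2, el2 = map(math.radians, (*a, *b))
    cos_angle = (math.sin(el1) * math.sin(el2)
                 + math.cos(el1) * math.cos(el2) * math.cos(az1 - az2))
    # Clamp to guard against floating-point values just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def render_object(position, layout, spread=2):
    """Return per-speaker gains for one audio object (toy nearest-speaker panner)."""
    nearest = sorted(
        layout, key=lambda name: angular_distance(position, layout[name])
    )[:spread]
    # Closer speakers get more weight; the small constant avoids division by zero.
    weights = {
        name: 1.0 / (angular_distance(position, layout[name]) + 1e-6)
        for name in nearest
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}  # unit total power


# One object, described once by the content author: up and to the left.
helicopter = (-90, 45)
gains_with_ceiling = render_object(helicopter, LAYOUT_5_1_2)
gains_without_ceiling = render_object(helicopter, LAYOUT_5_1)
```

The same `helicopter` object lands almost entirely in `TopL` when the room has overhead speakers, and is shared between `Ls` and `L` when it does not. A channel-based mix would have needed a separate, fixed assignment for each layout.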
A more detailed explanation of Audio Objects may be found here.
Audio codec decoders not only decode the encoded audio streams on each channel; they also make decisions about the mapping between raw audio data (after it's been decoded) and the speaker configuration in the listening environment. Let's say you've got a 5.1 channel home theater setup versus a 7.1 speaker setup. If the incoming audio stream is an encoded 5.1 channel signal, the decoder has a simple task when mapping it to the 5.1 speaker configuration. However, in the case of the 7.1 speaker configuration, the decoder has to make some decisions about which sounds get sent to the extra 2 speakers (channels) in the room.
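Here is one simple way such a decision could look, sketched in Python. The rule used here is an assumption chosen for clarity, not the method of any particular decoder: each 5.1 surround signal is shared equally between the side and rear speakers on its side of the room, scaled by 1/√2 so the total acoustic power stays the same. The function and channel names are hypothetical, and commercial upmixers analyze the signal in far more elaborate ways.

```python
import math


def upmix_5_1_to_7_1(frame):
    """Map one 5.1 sample frame (channel name -> sample value) onto a 7.1 layout."""
    share = 1.0 / math.sqrt(2.0)  # equal-power split between side and rear
    return {
        # Front channels and LFE pass straight through: the easy part.
        "L": frame["L"], "R": frame["R"], "C": frame["C"], "LFE": frame["LFE"],
        # Each surround signal is divided between two physical speakers.
        "Lss": frame["Ls"] * share, "Lrs": frame["Ls"] * share,
        "Rss": frame["Rs"] * share, "Rrs": frame["Rs"] * share,
    }


frame_5_1 = {"L": 0.2, "R": -0.1, "C": 0.5, "LFE": 0.05, "Ls": 0.4, "Rs": -0.3}
frame_7_1 = upmix_5_1_to_7_1(frame_5_1)
```

With a 5.1 room the decoder would simply hand `frame_5_1` to the speakers unchanged; the 7.1 room is where a policy like the one above has to be chosen.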
All of the "ceiling speaker" audio formats (that is, Dolby Atmos, DTS:X, and Auro-3D) are not actually built expressly for ceiling speakers per se. Rather, they represent a redesign in the architectural expression of sound in multi-channel listening environments. This process of re-imagining sound interpretation is conducive to creating new(ish) concepts, such as overhead sound projection, which did not exist before. Object-based sound also has a huge practical advantage: it gets around PCM's limit of eight (8) audio channels. Using object-based audio programming, especially in conjunction with the traditional channels, allows the expansion of audio definition into a more holistic representation of 3D space. It shifts the focus from 5.1 or 7.1 channel thinking to a 360-degree conceptual approach. Now, the possibility exists to more finely tune where in three-dimensional space a particular sound should be heard from.
It's a game changer.
Audio object encoders (e.g. Atmos) deliver audio object information using the same eight channels of PCM as traditional digital audio.
Audio objects "wrap" or encode audio information with metadata that provides guidance to compatible decoders by describing how the actual data frame should be presented to the listener. It is effectively a philosophy of applying metadata to pure audio information. By representing audio content as objects, the particular channel in which the content is delivered no longer matters. Any object with any spatial positioning may be transmitted via any channel, opening the possibility of creative utilization of low-channel bandwidth applications yet-to-be invented.
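A small Python sketch shows why the carrying channel stops mattering once position lives in the metadata. Everything here is hypothetical and simplified for illustration: the `AudioObject` fields, the `positions_by_object` helper, and the idea of a "carrier" dictionary keyed by channel index are not taken from any real format's specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AudioObject:
    """Audio samples wrapped with the metadata that tells a decoder where they belong."""
    object_id: int
    samples: Tuple[float, ...]
    azimuth_deg: float
    elevation_deg: float


def positions_by_object(
    carrier: Dict[int, List[AudioObject]]
) -> Dict[int, Tuple[float, float]]:
    """Read each object's position from its metadata, ignoring the carrier channel."""
    positions = {}
    for _channel_index, objects in carrier.items():
        for obj in objects:
            positions[obj.object_id] = (obj.azimuth_deg, obj.elevation_deg)
    return positions


flyover = AudioObject(object_id=1, samples=(0.1, 0.2), azimuth_deg=-90.0, elevation_deg=45.0)
sent_on_channel_0 = positions_by_object({0: [flyover]})
sent_on_channel_7 = positions_by_object({7: [flyover]})
```

Both results place the flyover at the same spot in the room, because the decoder reads the position from the object itself and never from the channel index that delivered it.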
Audio objects allow content authors more granular control compared to linear multi-channel audio recording solutions.
Devices that do not understand the underlying audio object metadata simply ignore it. Even looking at it from a purely digital perspective, the audio object metadata is not transmitted as human-audible content, and therefore there is no risk of "noise" on devices that don't read/understand the object information. This makes the object content safe to deliver to legacy decoder systems, as they will simply ignore the data they don't understand.
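The "ignore what you don't understand" behavior can be sketched as a stream of tagged chunks. This is a simplified model and not the layout of any real bitstream: the `PCM` and `OBJ_META` tags and both decoder functions are hypothetical names invented for the example.

```python
# Chunk types a hypothetical legacy decoder was built to understand.
KNOWN_TO_LEGACY = {"PCM"}


def legacy_decode(chunks):
    """Keep the audio payloads a legacy decoder recognizes; skip everything else."""
    return [payload for tag, payload in chunks if tag in KNOWN_TO_LEGACY]


def object_aware_decode(chunks):
    """Return the same audio payloads plus the object metadata that accompanies them."""
    audio = [payload for tag, payload in chunks if tag == "PCM"]
    metadata = [payload for tag, payload in chunks if tag == "OBJ_META"]
    return audio, metadata


stream = [
    ("PCM", [0.1, 0.2]),
    ("OBJ_META", {"object_id": 1, "elevation_deg": 45}),
    ("PCM", [0.3, 0.4]),
]
legacy_audio = legacy_decode(stream)
modern_audio, modern_metadata = object_aware_decode(stream)
```

Both decoders end up with identical audio. The legacy decoder never treats the metadata chunk as sound, so nothing audible leaks through; it simply plays the stream without the positioning information.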
End Notes
1 Dominic, Jason. (n.d.). How Long Do Home Receivers Last? https://hometheateracademy.com/how-long-home-receivers-last/
2 Dolby TrueHD uses MLP (Meridian Lossless Packing), a lossless audio compression technique created by Meridian Audio, LTD and licensed by Dolby Labs.
3 HDMI. (12 May 2020). Wikimedia Foundation.
4 Dolby Digital Plus. (2018). Dolby Laboratories, Incorporated.
5 Fleischman, Mark. (20 October 2010). Netflix Adopts DD+ for Streaming.
6 Dolby Digital Plus Audio Coding Technical Paper. (2008). Dolby Laboratories, Incorporated.
7 Patschke, D., Kefauver, A. P. (2007). Fundamentals of Digital Audio. United Kingdom: A-R Editions, Incorporated.
9 . (August 2014). Dolby Laboratories. p. 15.
10 DTS:X® Pro technology puts you there. (2020). DTS, Incorporated.
11 Auro-3D. (20 December 2019). Wikimedia Foundation.
12 Ted. (18 November 2015). What You Don’t Know About Auro-3D May Surprise You. Strata-gee.