Epic Battles of Home Theater: Audio Objects

The first mainstream Atmos-compatible A/V receiver was the Marantz SR7009. Released in September 2014, it was not widely available until 2015. Far from a ground-breaking product release, its arrival was hardly noticed. In retrospect, this is not surprising. At the time, very few Atmos-compatible ceiling or height speakers were on the market, and there was barely any Atmos-enabled media content. Almost no consumers had even heard of Atmos, and it would be another three (3) years before demand for spatial sound systems began to take off.

Essential Install is a high-end audio/visual consulting firm based in the United Kingdom with an online newsletter. Two years after Marantz's soft launch of Atmos, one of the newsletter's editors interviewed Wilfried Van Baelen (CEO of Auro Technologies) regarding this very subject. During the discussion, which EI published in late 2016, Van Baelen predicted 2017 would be the year object-based audio finally took off.1 He was right. Yet, today in 2020 - a full six (6) years after that first receiver - most consumers still don't understand exactly what Atmos is, let alone the increasing number of competing formats that are now (or soon will be) available.

Much of this consumer confusion stems from how these concepts and products are marketed. Between Dolby and DTS - the top-tier players in the industry - Dolby has a tendency to regurgitate the same narrative over and over, while DTS has a tendency to produce slightly different products with very similar names.

Theater and Home Theater are not Created Equal

Audio codec release dates and functionality follow different timelines in the cinema and home audio markets. Most consumers are not aware that there is a lag between the release of new formats in the commercial (cinema) market and the consumer (home theater) market. Historically, the cinemas are where the big money is, and therefore they get the new toys first. For example, the home theater version of the DTS:X Pro codec - the latest audio codec from DTS (2019) - is capable of driving up to 32 speakers. Yet, the equivalent decoder in a cinema may drive up to 64 speakers in addition to LFE (Low-Frequency Effects) channels.2

Object Based Audio

Object-based audio introduces a completely new architecture in audio recordings, providing greater flexibility for audio playback environments.

Channel-based audio directly connects specific sound recording channels (tracks) with specific output channels (speakers). Audio objects are a revolutionary concept in audio recording and delivery that turns channel-based audio on its head. How is it different?

Audio objects are dynamic. They are not mapped to any particular playback channel (speaker).

Traditional channel-based audio theory largely dictates where sound should be heard. Audio objects describe audio content. Decoders must interpret this information within the context of any given playback environment, offering a more personalized listening experience for the end user. Audio objects only require a single channel to deliver information, but the more channels available, the more audio data may be transferred at any given time. One upside to audio objects is that if a decoder does not understand them, it will simply ignore the extraneous metadata that describes the audio content. In other words, the actual sound will still be heard. However, playback will conform to the old association method of track-to-channel (recording-to-playback). This is fine for most people, but if the decoder is object-capable, it takes the listening experience to another level. A potential downside is that if all the audio objects were transmitted on just one or two tracks, for example, a non-compatible decoder would interpret the source material as mono or stereo respectively, with a corresponding playback experience.

Data embedded in the audio stream precedes the audio content and provides a description of the content. The playback device then examines this information in the context of the audio data it has already received and the playback system. For instance, an audio object might be presented to your speakers differently by the same decoder if you have an 11.1 sound system versus a 5.1 sound system. Different room dynamics and speaker layout combinations create different experiences. An audio object decoder needs to take these factors into consideration because it has primary control over the sound playback, whereas a traditional channel-based recording format is directive.

Advantages of Audio Objects

There are several notable advantages to object based audio formats over fixed channel-based solutions. Among them:

  1. Content creators need not be concerned with a limited number of channels from which to choose.
  2. Sound engineers and directors are less likely to need to weigh trade-offs between their goals and what is realistically possible for the audience.
  3. The burden shifts from the content creator and director to the end user's decoder.
  4. End users have a much easier time tailoring the multimedia experience to their liking.

With channel-based audio, if a director wants certain sounds in a movie to appear to emanate from somewhere in particular, a sound engineer must apply those sounds to specific audio tracks, so that certain speaker positions (the rear surround channels, for example) receive those particular sounds. Audio objects turn this concept on its head. The sound engineer need only describe characteristics of the sound in question. On the playback side of the equation, the decoder analyzes the intent of the sound and decides which speaker is most appropriate based on the current environment and configuration. The decoder processes the audio object through various algorithms, and the sounds themselves are then piped through one or more speakers. Only at that moment is the sound converted into channels representative of the physical speaker locations in the current environment.
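
To make the decoder's job in the paragraph above more concrete, here is a minimal sketch in Python. It is not how Atmos, DTS:X, or Auro-3D actually render sound. The AudioObject class, the render function, and the speaker angles are all assumptions made up for illustration, and the panning rule (constant-power panning between the two speakers that bracket the object) is just one simple technique a renderer could use. The point is that the same object description produces different speaker feeds on different layouts.

```python
import math
from dataclasses import dataclass

# Assumed speaker azimuths in degrees (0 = straight ahead, positive = listener's right).
# These layouts are illustrative, not taken from any Dolby, DTS, or Auro specification.
LAYOUT_5_1 = {"L": -30, "C": 0, "R": 30, "Ls": -110, "Rs": 110}
LAYOUT_7_1 = {"L": -30, "C": 0, "R": 30,
              "Lss": -90, "Rss": 90, "Lrs": -150, "Rrs": 150}


@dataclass
class AudioObject:
    """Hypothetical object: a sound plus a description of where it belongs."""
    name: str
    azimuth_deg: float


def render(obj, layout):
    """Return a gain per speaker, panning the object between the adjacent
    pair of speakers that brackets its azimuth (constant-power panning)."""
    # Walk the speakers in order around the listener.
    speakers = sorted(layout.items(), key=lambda kv: kv[1] % 360)
    target = obj.azimuth_deg % 360
    gains = {name: 0.0 for name in layout}
    for i, (name_a, az_a) in enumerate(speakers):
        name_b, az_b = speakers[(i + 1) % len(speakers)]
        span = (az_b - az_a) % 360        # angular gap between the pair
        offset = (target - az_a) % 360    # how far the object sits into that gap
        if offset <= span:
            frac = offset / span if span else 0.0
            gains[name_a] = math.cos(frac * math.pi / 2)
            gains[name_b] = math.sin(frac * math.pi / 2)
            break
    return gains


# One object, described once, rendered for two different rooms.
helicopter = AudioObject("helicopter", azimuth_deg=150)  # behind and to the right
gains_5_1 = render(helicopter, LAYOUT_5_1)
gains_7_1 = render(helicopter, LAYOUT_7_1)
```

In the assumed 7.1 room a speaker sits exactly where the object is described, so it gets the whole sound. In the assumed 5.1 room no such speaker exists, so the sound is shared between the two surrounds, weighted toward the right one.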

Audio Codecs and Object Based Content

Why can't I find a codec for Dolby Atmos?

Dolby Atmos is perhaps the most well known audio object format (by name) that isn't an audio codec. The subject of Audio Objects confuses many audio enthusiasts when it comes to understanding how to leverage Atmos and similar, competing technologies within the context of a digital multimedia system. The fact is, there is no codec built for Atmos specifically. The same is true of DTS:X, DTS Neural:X, and Auro-3D. Making matters more confusing, they all require codecs to function properly. How can this be?

Audio object encoders (e.g. Atmos) deliver audio object information using the same eight (8) channels of PCM as traditional digital audio,
yet they are free of channel restrictions.

Audio codec decoders not only decode the encoded audio streams on each channel; they also make decisions about the mapping between raw audio data (after it's been decoded) and the speaker configuration in the listening environment. Let's say you've got a 5.1 channel home theater setup versus a 7.1 speaker setup. If the incoming audio stream is an encoded 5.1 channel signal, the decoder will have a simple task when converting that to the 5.1 speaker configuration. However, in the case of the 7.1 speaker configuration, the decoder has to make some decisions about which sounds get sent to the extra 2 speakers (channels) in the room.
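
The 5.1-into-7.1 decision described above can be sketched in a few lines of Python. This is only an illustration of two plausible choices, not the behavior of any particular receiver or codec. The function name play_5_1_on_7_1, the channel labels, and the fill_rears switch are all hypothetical.

```python
import math


def play_5_1_on_7_1(frame, fill_rears=True):
    """Map one instant of a 5.1 signal (channel name -> sample) onto a 7.1 layout."""
    # The channels both layouts share pass straight through.
    out = {ch: frame[ch] for ch in ("L", "C", "R", "LFE")}
    if fill_rears:
        # Choice 1: share each surround between the side and rear speaker,
        # scaled by 1/sqrt(2) so the combined power matches the original.
        g = 1 / math.sqrt(2)
        out.update(Lss=frame["Ls"] * g, Lrs=frame["Ls"] * g,
                   Rss=frame["Rs"] * g, Rrs=frame["Rs"] * g)
    else:
        # Choice 2: send the surrounds to the sides only and leave the rear pair silent.
        out.update(Lss=frame["Ls"], Lrs=0.0, Rss=frame["Rs"], Rrs=0.0)
    return out


frame = {"L": 0.2, "C": 0.5, "R": 0.2, "LFE": 0.1, "Ls": 0.4, "Rs": -0.4}
shared = play_5_1_on_7_1(frame)
sides_only = play_5_1_on_7_1(frame, fill_rears=False)
```

Either choice is defensible, which is exactly why two decoders can make the same 5.1 source sound different in the same 7.1 room.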

Audio Metadata

Audio objects "wrap" or encode audio information with metadata that provides guidance to compatible decoders by describing how the actual data frame should be presented to the listener. It is effectively a philosophy of applying metadata to pure audio information. By representing audio content as objects, the particular channel in which the content is delivered no longer matters. Any object with any spatial positioning may be transmitted via any channel, opening the possibility of creative utilization of low-channel bandwidth applications yet-to-be invented.

Audio objects allow content authors more granular control compared to linear multi-channel audio recording solutions.

Devices that do not understand the underlying audio object metadata simply ignore it. Even looking at it from a purely digital perspective, the audio object metadata is not transmitted as human-audible content, and therefore there is no risk of "noise" on devices that don't read/understand the object information. This makes the object content safe to deliver to legacy decoder systems, as they will simply ignore the data they don't understand.
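
A toy model in Python may help show why ignoring the metadata is harmless. Everything here is hypothetical: the Frame container, the two decode functions, and the metadata fields are invented for illustration and do not reflect the real bitstream layout of any codec. The idea is simply that the channel audio and the object description travel side by side, and a legacy decoder only ever reads the former.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical container: ordinary channel audio plus optional object metadata."""
    channels: dict
    object_metadata: list = field(default_factory=list)


def legacy_decode(frame):
    """A channel-only decoder: it never reads the metadata, so the metadata cannot become noise."""
    return dict(frame.channels)


def object_aware_decode(frame):
    """An object-capable decoder returns the same channel audio plus the placement hints it found."""
    return dict(frame.channels), [m["position"] for m in frame.object_metadata]


plain = Frame(channels={"L": [0.1, 0.2], "R": [0.1, 0.0]})
tagged = Frame(channels={"L": [0.1, 0.2], "R": [0.1, 0.0]},
               object_metadata=[{"name": "rain", "position": "overhead"}])

# The legacy decoder produces identical output whether or not metadata is present.
same_audio = legacy_decode(plain) == legacy_decode(tagged)
```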

Channel Based Audio

Most audio recordings are channel based. Sounds are recorded onto media via "tracks." When played back, these audio tracks are decoded into "channels." The latter term can be a bit confusing, because its meaning is slightly different depending on the context. For example, 2-channel audio is what we think of as stereo. The two channels correspond to Left and Right, which correspond to human left and right ears. Content recorded in stereo is designed to be played back across a 2-channel system as well. So, let's say you want to play a 2-channel source on a 1-channel (mono) system or on a 3-channel system. In the first case, the decoder could simply discard one channel or it could combine the two channels according to a pre-defined filter. A third option would be to ignore the content completely because it's not the single channel source our decoder wants. The 3-channel case is more complicated. How does one produce or simulate content that doesn't exist? Again, the choices are basically some sort of filtering and merging of the existing two channels to create a simulated or "virtual" 3rd channel, or not attempting to play a 3rd channel at all and only playing the two you've got. Or, alternatively, ignoring the content altogether.
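
The mono and 3-channel cases above are easy to sketch in Python. The function names are made up for this example, and a plain average stands in for the "pre-defined filter" (real decoders may weight or filter the channels differently).

```python
def stereo_to_mono(left, right, discard_right=False):
    """Fold two channels into one, either by dropping one or by averaging both."""
    if discard_right:
        return list(left)
    return [(l + r) / 2 for l, r in zip(left, right)]


def stereo_to_three(left, right):
    """Invent a "virtual" center channel from what the left and right channels share."""
    center = [(l + r) / 2 for l, r in zip(left, right)]
    return list(left), center, list(right)


left = [1.0, 0.0, 0.5]
right = [0.0, 1.0, 0.5]
mono = stereo_to_mono(left, right)
l3, c3, r3 = stereo_to_three(left, right)
```

Note that the virtual center contains nothing new. It is derived entirely from the two channels that already exist, which is the limitation the paragraph above describes.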

Legacy codecs map audio recording tracks to specific output channels (speakers).
Audio objects take that concept and throw it out the window.

Traditional, channel-based audio delivery requires explicitly addressed outputs (speaker channels) for every input (recording channel), and each channel is associated with a particular speaker or group of speakers. For example, with channelized audio the majority of human speech in movies is designed to be played back on the center front audio channel.

Number of Channels and PCM

Number of recording channels versus number of playback channels is a dance the audio industry has been performing for decades. Today, the maximum number of recorded audio channels appears to have maxed out at eight (8). This figure corresponds to the maximum number of channels attainable via PCM audio on a computer. PCM, or Pulse Code Modulation, is a digital representation of analog sounds. It's essentially the raw format computers record and play back sound in, regardless of what sort of device the "computer" is (e.g. your cell phone is a computer).

When the number of channels of audio content coming to the playback device doesn't match the number of output channels, the device between them - usually some type of decoder built into another device (e.g. A/V receiver) - must determine how to handle the situation. The number of channels in a multi-channel recording and the number of audio channels on the playback side often don't directly correspond to one another.

At any rate, as you can see this upper limit of eight (8) channels poses some challenges for sophisticated playback systems, such as 11.1 or greater surround sound home theater systems and 32 or 64 speaker systems in movie theaters. Then there are other environments with their own, but similar audio challenges, such as churches and sports venues. No matter your playback environment, it's easy to see how eight (8) channels poses limitations.

Equally important is the intent of the sound director or producer who created the audio content to begin with. Recording Channels are the dedicated portions of media used to store audio data. Audio Channels correspond to playback channels the content creator intended. The problem for the content creator is the mirror challenge of the playback side of the equation. As a content generator, how do I derive the most flexibility from a limited portfolio of eight (8) channels to convey the experience I want the listener to have? This is particularly challenging considering that one of those channels is likely to be needed for low frequency bass (usually referred to as LFE or Low-Frequency Effects). Now, as a producer we only have seven (7) channels to work with.

Astute readers will note that 7+1 channels corresponds nicely to a 7.1 channel surround system. Now, you can see why 7.1 became the standard for home theater and remained so for quite some time. With the advent of Blu-ray and HD DVDs, it became possible for movie producers and directors to cram all the data for 7.1 channel sound onto the same medium as a high-definition video movie. Thus, the age of high definition video and more sophisticated audio was born for the home theater market.

Fast forward to today, and we have 11.1, 13.1, and even 32-channel playback systems for home theater. Just imagine the challenge of packing audio data for 32 channels into an eight-track recording. Doing so requires an extraordinary amount of data compression.

Audio objects, on the other hand, don't define where a sound emanates from using channels. Rather, they describe audio content.

Dolby Atmos, DTS:X, DTS Neural:X, and Auro-3D

What do Dolby Atmos, DTS:X, and Auro-3D all have in common? They each express audio content via object-based logic versus the traditional notion of audio channels.

Audio Objects are a game changer.

All of the so-called "ceiling speaker" audio formats (Dolby Atmos, DTS:X, and Auro-3D) are not built expressly for ceiling speakers per se. Rather, they represent a redesign in the architectural expression of sound in multi-channel listening environments. This process of re-imagining sound interpretation is conducive to creating very explicit audio concepts - such as overhead sound projection - which did not exist before. Object-based sound also has a huge operating advantage from a practical perspective: it gets around PCM's limit of eight (8) audio channels. Using object-based audio programming - especially in conjunction with the traditional channels - allows the expansion of audio definition into a more holistic representation of 3D space. It shifts the focus from 5.1 or 7.1 channel thinking to a 360 degree conceptual approach. Now, the possibility exists to more finely tune where in three-dimensional space a particular sound should be heard from.

Most consumers associate these audio formats with overhead or ceiling speakers, but this is a misconception. Ceiling speakers or "height" channels tend to blur the distinction between traditional channel-based multi-channel audio and object-based audio. This is because some legacy channel-based audio formats allow certain channels to optionally be directed to a "height" speaker. This is totally dependent on the decoder and of course the end-user's speaker layout. DTS Neo:X is an example of this type of codec. A number of characteristics differentiate true object-based audio architectures, such as Atmos. Their audio streams describe the characteristics of a sound, defining it as an "object." This makes the intent of the content creator very clear. So, in the case of overhead sound, we know it's supposed to be heard from above. Of course, that doesn't mean the end user has overhead sound playback capability, and that is where the decoder comes into play, as it is still responsible for deciding exactly where and how that sound will be heard in the playback environment.

A Brief History of Audio Object Evolution

Unfortunately, due to Dolby's giant (and very effective) marketing engine, most consumers believe Atmos was the first audio object format. However, this is far from the truth.

The concept of "audio objects" was invented by Wilfried Van Baelen - current CEO of Auro Technologies - in 2005. At the time, the company was named Galaxy Studios & Auro Technologies, a professional industry icon formed in 1980.

Auro does not seem to have had any interest in the home theater market at that time. And why should they? In 2005, the 5.1 home theater system was still the norm, and Auro's focus was on large scale multi-channel systems used in movie theaters and on professional, high-end stages. The home theater market did not appear to have a need for anything like a potentially unlimited multi-channel system. Besides, it would have been impossible in 2005 for an audio object based system to work in the first place in the home theater market. It was barely feasible anywhere at all from a technological standpoint. Recall high-definition DVDs didn't exist until 2006. Content delivery is the linchpin that any high bandwidth solution relies upon. This is still a challenge even today.

At any rate, Dolby Laboratories was the first company to lay down the proverbial audio-object gauntlet, though DTS soon followed suit. Once high-def DVDs began rapidly increasing in popularity, Dolby decided the time was right to begin selling its version of audio object packaging (called Atmos) to the higher-end niche consumer audio market. DTS likely had similar aspirations, and was the first to publicly announce a consumer-oriented solution. Like Dolby Atmos, DTS:X uses a revolutionary concept called audio objects to identify audio data that doesn't fit neatly into the 7.1 channel sound format. However, even though DTS made the first step forward in the consumer and pro-sumer markets, Dolby had a better marketing campaign and would go on to dominate the consumer market. Auro, meanwhile, would be left in the dust (where it still is for the most part), missing out on the most significant ramp-up in home theater system interest in history (2014 to present).

Dolby Atmos

Dolby Atmos is not a codec; it is a capability. It adds functionality to Dolby TrueHD.

The subject of Dolby Atmos support is confusing because Atmos falls within the auspices of the TrueHD codec. While Atmos is part of the Dolby TrueHD standard, not all TrueHD encoder/decoder devices support Atmos. The key to Atmos support lies in the TrueHD codec version.

Audiophiles may recall Dolby Laboratories launched TrueHD in 2010, but Atmos did not debut until 2014. Atmos reached the home theater receiver market in late 2015, when it was released on a small number of HD and Blu-ray DVDs. At that time, only a handful of the most expensive home theater receivers were capable of decoding Atmos, and its launch in the consumer market went largely unnoticed. Widespread support for Atmos did not even begin to occur until 2017. The end result is even today, many home theater devices don't support Dolby Atmos (though they do support Dolby TrueHD).

What happens if a non-Atmos compatible TrueHD device attempts to play Atmos-specific content?

You can probably see where I'm going with this. The TrueHD codec is effectively the same. The difference lies in the version of the codec. If your listening device's TrueHD codec was licensed prior to the launch of Atmos, your device will not support Atmos. Conversely, if your device licensed the TrueHD codec after Atmos support was added, you'll be able to play Atmos content. The average time a home theater enthusiast waits between A/V receiver upgrades has dwindled from over 10 years to under five (5).3 The reason is exactly what we are talking about here - the rapid advance of audio and video technology - which is pushing consumers to upgrade older equipment more frequently as they seek to take advantage of new developments in "immersive" home theater technologies.

Check out the timetables of Dolby vs. DTS tech advancements.

Although Atmos falls under the auspices of the Dolby TrueHD standard, Atmos functions in a unique way compared to prior Dolby audio. Atmos itself does not define any audio channels at all. Rather, Atmos provides spatial context to content, and it does this by specifying characteristics of up to 128 audio objects (see Object Based Audio for details).

When you think of a transmission of audio data, any context provided by an Atmos broadcast will be represented by these audio objects, which are transmitted along with the eight (8) distinct channel streams. The connection between Dolby TrueHD and PCM remains a 1:1 relationship (8 simultaneous channels). Embedded Atmos data does not affect channel-based audio streams.

DTS: Alphabet Soup

DTS is a competitor to Dolby, and therefore - as one would imagine - has a platform that competes with Dolby's Atmos product. Unfortunately for consumers (and even audio enthusiasts), DTS' product nomenclature is very confusing. The result is a veritable "alphabet soup" of names that all sound the same. Let's attempt to clear up any confusion.

Here's the complete list of modern DTS multi-channel codecs:

  • DTS Neo:X
  • DTS-HD High Resolution
  • DTS-UHD
  • DTS:X
  • DTS Neural:X
  • DTS Virtual:X
  • DTS:X Pro

Now, let's separate out those that utilize object-based audio:

  • DTS-UHD
  • DTS:X
  • DTS:X Pro

DTS:X

DTS:X is a high fidelity, multi-channel audio codec that competes directly with Dolby's TrueHD format.

DTS:X is a blend of traditional and object-based audio codecs on par with Dolby TrueHD (with Atmos). DTS:X is a combination of DTS (a multi-channel codec) + Neural (DTS' equivalent to Dolby Atmos).

Atmos and most DTS codecs will up-mix, but handle the process differently.
DTS utilizes unique codecs, whereas Atmos up-mixing is incorporated into a portion of the TrueHD codec.

DTS:X was DTS' response to Dolby Atmos. Just as with Atmos, consumers tend to think of DTS:X as simply adding ceiling (or "height") speaker capabilities; however, that is an oversimplification. Instead of using channels as we've seen historically in surround sound (i.e. for more immersion, add more channels), both Dolby Atmos and DTS:X utilize an object-based model for defining the audio data sent to the ceiling speakers (and potentially others). The whole thing gets quite complicated, but the gist of it is that the number of audio channels is unchanged when compared to non-Atmos/non-DTS:X codecs. This is how non-Atmos and non-DTS:X capable devices are able to be "backwards compatible" with Atmos and DTS:X content. They aren't really backwards compatible so much as they simply ignore the object-based data they don't understand.

All DTS codecs are owned by a company of the same name (DTS, Inc.), which used to be called Digital Theater Systems. DTS is another company with considerable pedigree in the audio industry, though it did not enter the market until about 10 years after Dolby.

DTS Neural:X

DTS Neural:X is pretend Atmos. It up-mixes non-spatial (non-3D) content and converts it to DTS:X.

DTS Neural:X up-mixes non-DTS:X (non-spatial) content for playback on a home theater system designed for Atmos and DTS:X (e.g. with ceiling/height speakers). It uses an algorithm to parse channel-based audio data and convert it into audio objects (such as sounds destined for ceiling speakers), which are then fed into the DTS:X decoder.

When activated on supporting devices, DTS Neural:X simulates DTS:X-like content by making inferences about how audio content would have been encoded if it were using object-based encoding. In other words, it creates object-based audio content when none exists by effectively guessing at how it should sound. Neural:X evaluates the data coming in on the normal audio channels of a track and estimates where portions of the content could have been assigned if there were real object-based data. The result is a mixed bag. Some listeners like the effect, and some don't. It takes creative license with regard to what the original author or director intended. That may or may not be a problem, depending on your perspective and the work of art involved.

DTS Neural:X is an independent DTS codec and replaced DTS Neo, unlike Dolby Atmos (which offers the same functionality but under the Atmos compatible TrueHD codec). The end result is comparable to what Dolby Atmos does with its up-mixing, with a twist.

There is one area in particular where DTS Neural:X trumps Dolby Atmos: mono signals. Neural:X is capable of combining discrete mono signals and up-mixing to 7, 9, or 11 speakers, which is quite clever. This makes it easier to feed independent source materials - perhaps even from different devices - into an encoder, and create a multi-channel output that no one would know was derived from multiple sources. Atmos cannot do this (though it can up-mix certain combinations of Dolby surround sound recordings to simulate Atmos capabilities).4 Naturally, this raises the question: Is this important? It's a controversial topic, but on balance the expert opinion seems to be "no" from the perspective of home theater. Any way you look at it, Atmos, DTS:X, and Neural:X offer some really cool features.5

DTS Virtual:X

You may think of DTS Virtual:X as Atmos for people with no ceiling speakers.

DTS Virtual:X is designed to offer simulated height speaker channels in environments where ceiling or height speakers do not exist. It accomplishes this via psycho-acoustic algorithms that adjust the sound emanating from standard floor-mount and wall-mounted speakers to provide the audible illusion of changes in height. For proper effectiveness, the speakers need to be mounted at the listener's ear height.

DTS:X Pro

DTS:X Pro is suited for the audio enthusiast who is ready to eschew the old world of channel-based audio. It is the closest competitor to Auro-3D.

DTS:X Pro is very similar to Auro-3D. As mentioned elsewhere in this article, if the audio object metadata does not exist, most audio-object decoders will treat the incoming audio based on which channels the incoming signals are on. This is the only scenario for an audio object decoder when the incoming track channels matter. However, this is not the case with DTS:X Pro and Auro-3D. They do not understand conventional channel-limited signals. If there's no metadata, the signal will be ignored.

Like Auro-3D, the DTS:X Pro codec understands audio objects ONLY. So, what happens if you pipe audio into a DTS:X Pro decoder that is not encoded as objects? The answer is covered under "How Do I Play Older Content in DTS:X Pro and Auro-3D?"

DTS-UHD: The First

The audio object format you've likely never heard of, and why.

Believe it or not, DTS-UHD was the first audio object codec designed specifically for home theater.

Unfortunately, almost no one has ever heard of it. Why?

In late 2013, DTS perfected the first single-chip decoder designed for audio objects aimed specifically at the home theater market. Announced publicly in January 2014,6 DTS-UHD seemed poised to revolutionize the home theater listening environment. Yet, almost no one noticed, except Dolby Labs.

Later that year, Dolby announced a home theater version of Dolby Atmos. Dolby was able to pull the rug out from under its competitor for two (2) reasons.

First and most importantly, Atmos already had name recognition via its presence (though fledgling) in cinemas. Dolby also had (and still has) a stronger marketing position than DTS. Dolby moved to simultaneously pressure content providers to begin producing Atmos-encoded versions of cinema movies as they were processed and released in the DVD market.

The second factor was that instead of creating a new codec - as DTS had done - Dolby managed to fast-track its development of Atmos by merging it with its existing TrueHD codec. This meant that Dolby didn't have to invest in the R&D to develop a new, stand-alone codec for Atmos. Instead, it only needed to create a wedge in TrueHD to allow Atmos to take over under certain circumstances. Dolby's efforts paid off, and it managed to effectively beat DTS to market.

Thus, for these reasons the term "Atmos" quickly became synonymous with this revolutionary method of audio reproduction in the home theater. Just as "Kleenex" is a term often used to refer to facial tissue, even though Dolby was not the first to hit the audio object consumer market, it was the first company to arrive on the scene with a brand name consumers could identify and easily remember.

I have to wonder if DTS had made a departure from its history of naming all its variant codecs "DTS-something" whether DTS might have had a better shot at winning the perception battle early on, especially considering the fact DTS has a demonstrably more flexible suite of options from a technical perspective in its incorporation of audio object management (though Auro is even better). Still, it doesn't really matter from a consumer perspective. Brand recognition is crucial.

Auro-3D

Auro-3D is a little-known spatial multi-channel listening codec that is more capable than Atmos or DTS:X.

Auro-3D is an alternative implementation of three-dimensional spatial sound objects, comparable to Dolby Atmos and DTS:X. Invented by a Belgian company named Auro Technologies, Auro-3D has several different implementations, such as Auro 11.1 and AuroMax. Auro-3D is an interesting beast. On the one hand, it offers competition to Atmos and DTS:X; on the other, it feels like a step backwards in some respects (one mode is channel based only and another is built on top of DTS). Then again, what's not to like about a 26-channel codec?

Dolby and DTS have a lot of built-in clout within the entertainment industry. After all, they have been defining audio tracks in movies for nearly 50 years. However, Auro - a relative newcomer - is worthy of attention. Like a diamond in the rough, Auro-3D is arguably the best of the audio object formats, yet in spite of its attempts to woo consumers it remains uncommon. Auro-3D is the victim of the "chicken-and-egg" problem. Why encode with a particular codec if there is no demand for it? Why build a decoder into a product (e.g. A/V receiver) unless there is a demand for it? Champions of new technical ideas must convince both sides of the usability equation that they want a new format. They must convince the enabler and the end user. Dolby excels at this process. They have consistently led the roll-out of new technologies since their debut of digital sound encoding equipment in the 1970s. Yet, this upstart debuted an object-based audio codec six (6) years before Dolby Atmos was even announced, and a full eight (8) years before Atmos was released in theaters.7,8

From a device perspective, the equation largely boils down to the level of effort required for implementation, how broadly market demand is perceived, and how prevalent a particular codec is within the content universe. Since codecs are proprietary, users are dependent on the licensing arrangement and hardware/software implementations of the vendors supplying the codecs, and the whims of equipment manufacturers. Unfortunately, this means as a consumer you must be diligent and mindful of whichever particular codec(s) you are using or are considering using. Furthermore, locked hardware - devices that cannot be upgraded via software/firmware updates - is particularly vulnerable to planned obsolescence when it comes to audio codecs and formats. Caveat emptor!

Auro-3D is more than just another spatial codec. Auro-3D was the first codec to make a radical departure from the fixed multi-channel + audio object design of Dolby Atmos and DTS:X by completely abandoning the old-world channel-based design. Incorporating up to 26 channels, AuroMax converts ALL channel data containing three-dimensional spatial audio data into sound objects. Therein lies the reason why Auro-3D is in many ways superior to its competitors. Auro-3D is a fresh approach, using a modern framework as its starting position, while Dolby and DTS are saddled with continuing to provide and support legacy solutions. Auro-3D does not emulate Atmos and DTS:X, so much as it re-writes the playing field. As of this writing, Auro-3D has a very small footprint in the spatial sound codec consumer market (i.e. home theater systems), which remains dominated by Dolby Atmos and DTS:X. There is no substantial difference to the perception of most listeners when it comes to the sound quality of one of these audio designs versus the others.

Why Isn't There an Auro-3D Codec?

Auro-3D is not an independent codec. It is layered on top of DTS. While it has its own algorithms, Auro-3D doesn't technically qualify as a codec in the sense that DTS:X and Dolby TrueHD do. This is because Auro-3D is reliant upon DTS:X, and adds another layer of processing on top of it. Take away DTS:X, and Auro-3D will not work. Therefore, there is no stand-alone codec for Auro-3D.

How Do I Play Older Content in DTS:X Pro and Auro-3D?

The short answer is: you don't.

Don't worry though. The average person doesn't have to deal with this. Consumer-grade devices will simply roll over automatically to a codec that can understand content that isn't encoded with audio object metadata.
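For the enthusiasts, the rollover described above can be pictured as a simple decision the receiver makes for each incoming stream. The sketch below is purely illustrative: the names AudioStream, pick_decoder and the decoder labels are hypothetical and do not come from any Dolby, DTS or Auro product or API.

```python
from dataclasses import dataclass


@dataclass
class AudioStream:
    """Hypothetical description of an incoming soundtrack."""
    label: str                  # e.g. the format name shown on the disc
    has_object_metadata: bool   # True if the track carries audio objects


def pick_decoder(stream: AudioStream) -> str:
    """Return which (hypothetical) decoder path a receiver might use.

    Assumption: object-only decoders are used only when the stream
    actually carries object metadata; otherwise the device falls back
    to a channel-based decoder.
    """
    if stream.has_object_metadata:
        return "object-based decoder"
    return "channel-based decoder"


if __name__ == "__main__":
    older_disc = AudioStream(label="legacy 5.1 track", has_object_metadata=False)
    newer_disc = AudioStream(label="object-based track", has_object_metadata=True)
    print(older_disc.label, "->", pick_decoder(older_disc))
    print(newer_disc.label, "->", pick_decoder(newer_disc))
```

In other words, the older disc still plays; it just never touches the object-only decoder.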

Now, if you're an audio enthusiast and not just an average consumer, you might be wondering why the heck DTS decided to do this in the first place. The important point here is DTS:X Pro and Auro-3D are the only current codecs that can ONLY handle object-based audio. This is a big shift from Dolby's viewpoint, for example. And yet it is also an example of what you can do (for better or worse) when you build a codec from the ground up - as both DTS and Auro did - versus piggy-backing on an older codec (as Dolby did). As mentioned previously in this article, Dolby's approach gave it a home-field advantage at the dawn of object-based audio. However, as of 2019 the downside of that decision is coming home to roost. Meanwhile, DTS and Auro are simply carving out what they had already created into a stand-alone product.

My personal suspicion is that when Auro upped the ante in this game, DTS:X Pro was DTS' response, while Dolby remains mute on the subject (for now). And why should Dolby care? After all, they still control the majority of the market, though DTS has made significant in-roads in the last couple of years. Any way you look at it, the reality is this is mostly a marketing ploy. There aren't any significant advantages (at the moment) in the consumer and pro-sumer markets to isolating audio object processing completely from channel-based processing. However, that is very likely to change over time, so one could posit this could give DTS and Auro an advantage in the future. The home theater market is notoriously slow to change, though in recent years we are seeing notable advancements in technology about every five (5) years or so. If that trend continues, I would expect to see another jump in home audio evolution that piques end users' attention around 2022-2023. We shall see.

Another potential driver of innovation is the current (as I write this) worldwide COVID-19 pandemic, which holds great potential to re-engineer the demand side of the equation for the home theater experience as fewer people go out to cinemas and more spend time in their homes. Time will tell, of course, but if this occurs it will be the first time the tail has wagged the dog in this industry.

End Notes

1 Gustafson, Alice. (22 December 2016). 2017 Will Be The Year For Devices Supporting Auro-3D, Dolby Atmos And DTS:X.

2 Arnaud, D. (8 January 2019). TRINNOV AUDIO FIRST TO SUPPORT DTS:X® PRO TECHNOLOGY. Trinnov High-End.

3 Dominic, Jason. (n.d.). How Long Do Home Receivers Last?

4 Jacquel, Tristan. (2 May 2018). DTS Neural:X and Dolby Surround: the new post-processing technologies from Dolby and DTS. Son-Video.

5 Palmer, Michael S. (1 June 2016). Up-mixed: Dolby Surround v DTS:Neural:X. High-Def Digest.

6 DTS Demonstrates DTS-UHD™ Decoder Using Single-Chip Audio DSP at Consumer Electronics Show. (7 January 2014). Business Wire.

7 Auro 11.1. (20 April 2020). Wikipedia. Wikimedia Foundation.

8 Dolby Atmos. (3 May 2020). Wikipedia. Wikimedia Foundation.

Other References

Object-based audio. (2020). IRT GmbH.