© Wallace Jackson 2015
Wallace Jackson, Digital Audio Editing Fundamentals, DOI 10.1007/978-1-4842-1648-4_4

4. The Transmission of Digital Audio: Data Formats

Wallace Jackson
(1) Lompoc, California, USA
Now that you understand the fundamental concepts, terms, and principles behind analog audio and how it is digitized into digital audio, it is time to explore how digital audio is compressed and stored using popular open source digital audio file formats.
You’ll learn about advanced digital audio concepts, such as compression, codecs, bit rates, streaming audio, captive digital audio, and HD audio. Finally, you’ll look at a number of powerful digital audio formats that are supported by open source content development platforms, such as HTML5, Java, JavaFX, and Android Studio. You can use any of the digital audio formats to deliver digital audio content for podcasts, music publishing, web site design, audio broadcasts, or multimedia applications.

Audio Compression and Data Formats

Once you sample your audio, you compress it into a digital audio file format for streaming over the Web or for captive audio file playback within an application. In this chapter, you’ll look at encoding audio using bit rates, and learn about streaming and the new 24-bit HD audio standard, which is now utilized in broadcast and satellite radio. I’ll also cover audio codecs and the audio file formats that they support across open platforms, such as HTML5, Java, and Android. (I cover digital audio data footprint optimization in Chapter 12, after you learn more about Audacity and digital audio editing.)
I want to make sure that you have a deep understanding of these digital audio new media assets so that you can eventually “render” them inside of your target application and attain a professional product that offers an impeccable end-user experience.

Digital Audio Codecs: Bit Rates, Streaming, and HD

Digital audio assets are compressed using something called a codec, which is short for “coder-decoder.” A codec is an algorithm that applies data compression to digital audio samples; its settings determine the playback rate, called the bit rate, as well as whether the audio supports streaming, that is, playback while the data is transferring over a network. First, let’s take a look at how you use digital audio assets in your applications: Do you store audio inside an application, or do you stream it from a remote server over the Internet? After that, you’ll consider the audio playback rate, or data-streaming bit rate, that you’ll want to use. Finally, you’ll learn about HD audio and see whether it’s appropriate for your digital audio applications. Only then will you be ready to look at the different audio file formats, which are actually codecs!

Digital Audio Transmission: Streaming Audio or Captive Audio?

Just as with the digital video you view on the Internet every day, digital audio assets can either be captive, that is, contained within an application (for example, inside an Android APK file), or streamed from a remote data server. As with digital video, the upside to streaming digital audio data is that it reduces the data footprint (size) of your application’s files. The downside is reliability.
Streaming audio reduces your data footprint because you don’t have to include all of that data-heavy digital audio new media inside your app’s file. If you are planning to code a jukebox application, for instance, you will want to stream your digital audio data, as packing a large song library into your app’s file could easily add 10 gigabytes to it.
Otherwise, for application audio, such as user interface feedback sounds, gameplay audio, and so forth, try to optimize your digital audio data so that you can include it inside your app file as a captive asset. This way, it is available to your application users the moment it is needed.
As you know, I’ll go over optimization in Chapter 12, after digital audio editing has been covered. The reason that I want to cover this topic toward the end of the book is that the last step in the asset creation process is exporting your digital audio data using one of the formats discussed in the next section.
The downside to streaming digital audio is that if your user’s connection (or your audio server) goes down, your audio file won’t be present for your end users to play and listen to! The reliability and availability of a digital audio data stream is a key factor to consider on the other side of the streaming audio vs. captive digital audio decision.
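To make the two delivery paths concrete, here is a minimal Android sketch showing both approaches side by side. The res/raw resource name and the streaming URL are hypothetical placeholders, not assets from this book.

    import android.content.Context;
    import android.media.MediaPlayer;
    import java.io.IOException;

    public class AudioDelivery {

        // Captive audio: the asset ships inside the APK (res/raw/ui_click is hypothetical)
        public static void playCaptive(Context context) {
            MediaPlayer player = MediaPlayer.create(context, R.raw.ui_click);
            player.start();  // no network involved; the data is always available
        }

        // Streamed audio: the asset lives on a remote server, keeping the APK small
        public static void playStreamed() throws IOException {
            MediaPlayer player = new MediaPlayer();
            player.setDataSource("http://audio.example.com/track01.mp3");  // hypothetical URL
            player.setOnPreparedListener(MediaPlayer::start);  // start once enough data is buffered
            player.prepareAsync();  // buffer in the background; fails if the server is unreachable
        }
    }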

Streaming Digital Audio Data: Setting Your Bit Rates Optimally

One of the primary concepts in streaming your digital audio is the bit rate of that digital audio data. Again, this is very similar to digital video, which also uses the concept of bit rates to determine the size of the data pipe that the audio data streams through. The digital audio bit rate is defined during digital audio file compression by the settings that you give to the codec.
Digital audio files that need to support a lower bit rate to accommodate slower bandwidth networks have more compression applied to the digital audio data. This results in a lower audio-quality level. However, lower playback quality isn’t as noticeable in digital audio as it is in digital video.
Low bit-rate digital audio tends to play back smoothly across a greater number of hardware devices: with fewer bytes of audio data to transfer over any given data network, there are also fewer bytes for the CPU inside each hardware device to process.
As a processor gets faster, it can process more bytes per second. As a data bandwidth connection gets faster, it can more comfortably send or receive more bytes per second.
Therefore, it is important to remember that you are not only optimizing your audio file size for network transfer; you are also optimizing the amount of system memory your digital audio asset requires, as well as the number of processing cycles the CPU spends decoding the asset’s sample data.
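As a back-of-the-envelope sanity check, the arithmetic behind a bit-rate decision is simple: bits per second divided by eight gives bytes per second, and multiplying by duration gives the total transfer (and buffering) load. A minimal sketch, with the bit rate and duration chosen purely for illustration:

    public class BitRateMath {
        public static void main(String[] args) {
            int bitRateKbps = 128;      // an illustrative streaming bit rate
            int durationSeconds = 240;  // a four-minute song

            // kbps x 1000 / 8 = bytes per second; multiply by duration for the total
            long totalBytes = (long) bitRateKbps * 1000 / 8 * durationSeconds;
            System.out.printf("%d kbps for %d seconds = %.2f MB transferred%n",
                    bitRateKbps, durationSeconds, totalBytes / (1024.0 * 1024.0));
            // prints: 128 kbps for 240 seconds = 3.66 MB transferred
        }
    }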

High-Definition HD Digital Audio: 24-Bit 48 kHz Sampling Data

As I mentioned in Chapter 3, the industry baseline for superior standard-definition (SD) audio quality is known as CD-quality audio, defined as a 16-bit sample resolution at a 44.1 kHz sampling frequency. It was used to produce audio CD products way back in the 20th century, and it still serves as the minimum digital audio quality standard.
There is also a more recent HD audio standard that uses a 24-bit data sample at either a 48 kHz or a 96 kHz sample frequency. It is used today in HD radio and satellite broadcasts, as well as in HD audio–compatible Android devices, such as the Droid X HD “high-fidelity” Android smartphones, providing the user with an extremely high-fidelity digital audio experience. HD audio is supported by several of the open source codecs.
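You can quantify the difference between CD-quality SD audio and 24-bit HD audio by multiplying out the raw (uncompressed) data rates; here is a quick sketch of that arithmetic:

    public class PcmDataRate {
        // raw PCM data rate = sample frequency x sample resolution x channel count
        static long bitsPerSecond(int sampleRateHz, int bitDepth, int channels) {
            return (long) sampleRateHz * bitDepth * channels;
        }

        public static void main(String[] args) {
            System.out.println("CD quality (16-bit 44.1 kHz stereo): "
                    + bitsPerSecond(44100, 16, 2) / 1000 + " kbps");  // 1411 kbps
            System.out.println("HD audio (24-bit 48 kHz stereo): "
                    + bitsPerSecond(48000, 24, 2) / 1000 + " kbps");  // 2304 kbps
        }
    }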

Digital Audio Storage and Playback: File Formats

There are considerably more digital audio codecs supported in the open platforms (HTML5, Java, and Android) than there are digital imaging codecs, of which there are only four: PNG, JPEG, GIF, and WebP. Android Studio audio support, for instance, includes MP3 (MPEG-1 Audio Layer III) files, WAV or AIFF (PCM) files, MP4 or M4A (MPEG-4 AAC) files, OGG files, FLAC files, and MID, MXMF, and XMF MIDI files, which, as you know from Chapter 2, are not really digital audio data. Let’s cover all the digital audio formats that support sampled (digitized waveform) data.

MIDI: Musical Instrument Digital Interface’s MID, XMF, and MXMF

Since MIDI was covered in Chapter 2, I will just go over the different file formats supported in open platforms, such as HTML5 (browsers and operating systems), Java (using JavaFX), and Android Studio. There are several MIDI file formats, including MID, XMF, and MXMF. They are exceptionally compact because they contain zero waveform data; there is only performance data, such as note on, note off, aftertouch, and so on. You opened and scored a MIDI file named fidelio.mid using Rosegarden in Chapter 2!
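To see just how little data “performance only” really is, here is a minimal sketch using Java’s standard javax.sound.midi API: two events, a note on and a note off, are the entire “recording” of a half-second middle C.

    import javax.sound.midi.MidiChannel;
    import javax.sound.midi.MidiSystem;
    import javax.sound.midi.Synthesizer;

    public class MidiPerformanceData {
        public static void main(String[] args) throws Exception {
            Synthesizer synth = MidiSystem.getSynthesizer();
            synth.open();
            MidiChannel channel = synth.getChannels()[0];
            channel.noteOn(60, 93);   // note-on event: middle C at velocity 93
            Thread.sleep(500);        // hold the note for half a second
            channel.noteOff(60);      // the matching note-off event
            synth.close();            // the waveform was synthesized, never stored
        }
    }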

MP3: The Popular MPEG-1 Audio Layer III Digital Audio Data Format

The most popular digital audio format in history is the MP3 digital audio file format, which is short for MPEG-1 Audio Layer III (not, as is commonly assumed, “MPEG-3”). Most of you are familiar with MP3 digital audio files from music download web sites such as Napster. Most of us collected songs in this format to use on popular MP3 players or in CD-ROM music collections. The MP3 digital audio file format is popular because it has a relatively good compression-to-quality ratio and because the codec needed to play MP3 audio files is found everywhere, including Android, iOS, Blackberry, Windows, Java, JavaFX, and HTML5.
MP3 is an acceptable format to use in your web site or application as long as you can get the highest quality level possible out of it by using the optimal encoding work process (again, this will be covered in Chapter 12).
Because of software patents, Audacity 2 can’t include MP3 encoding software or distribute any MP3 software from its own web site, which is why I showed you how to download and install the free LAME and FFMPEG encoders for Audacity.
It’s important to note that the MP3 codec outputs a lossy compression audio file format. Lossy compression is where some of the audio data, and thus quality, is discarded during a data compression process; it cannot be recovered. This is similar to the JPEG compression algorithm for digital images, which can cause visual artifacts (purple, green, or yellow pixel smudges).
Open platforms do support the open source lossless audio codec called FLAC, which stands for Free Lossless Audio Codec. Support for FLAC is now as widespread as MP3, due to the free nature of the software decoder.

FLAC: The 24-Bit HD Audio Capable Free Lossless Audio Codec

FLAC uses a fast algorithm, so the decoder is highly optimized for speed. FLAC supports 24-bit audio, and there are no patent concerns in using it. It is a great audio codec to use in Android or HTML5 if you need high-quality audio with a reasonable data footprint (file size). FLAC supports a range of sample resolutions, from 4-bit to 32-bit data per sample, and a wide range of sample frequencies, from 1 Hz up to 655,350 Hz in 1 Hz increments, so it is extremely flexible. From an audio playback hardware standpoint, I suggest using a 16-bit sample resolution with either a 44.1 kHz or a 48 kHz sample frequency, unless you’re targeting HD audio, in which case you should use 24-bit samples at 48 kHz.
FLAC is supported in Android 3.1 or later, as well as in Java. Therefore, if your end users are on current Android devices, you should be able to safely utilize the FLAC codec. It is possible to use completely lossless new media assets in Android application development by combining PNG8, PNG24, or PNG32 imagery with FLAC audio, as long as your application targets Android 3.1 or later hardware devices. Next, let’s take a look at another impressive open source codec.
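If you would rather verify FLAC support at runtime than rely on the platform version alone, Android’s MediaCodecList (available in this form since API level 21) lets you enumerate the device’s decoders. A minimal sketch:

    import android.media.MediaCodecInfo;
    import android.media.MediaCodecList;

    public class FlacSupportCheck {
        // Returns true if the device exposes a decoder for the audio/flac MIME type
        public static boolean hasFlacDecoder() {
            MediaCodecList codecs = new MediaCodecList(MediaCodecList.ALL_CODECS);
            for (MediaCodecInfo info : codecs.getCodecInfos()) {
                if (info.isEncoder()) continue;  // we only care about decoders here
                for (String type : info.getSupportedTypes()) {
                    if ("audio/flac".equalsIgnoreCase(type)) return true;
                }
            }
            return false;
        }
    }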

Ogg Vorbis: A Lossy High-Performance Open Source Codec

Another open source digital audio codec supported by Android is the Ogg Vorbis format . This lossy audio codec is brought to you by the Xiph.Org Foundation. The Vorbis codec data is usually held inside an OGG audio data file extension, and thus, Vorbis is commonly called the Ogg Vorbis digital audio data format.
Ogg Vorbis supports sampling rates from 8 kHz to 192 kHz and up to 255 discrete channels of digital audio; as you now know, that channel count is the maximum value 8 bits can represent. Ogg Vorbis is supported across all Android versions, or API-level releases.
Vorbis is quickly approaching the quality of MPEG HE-AAC and Windows Media Audio (WMA) Professional, and it is superior in quality to MP3, AAC-LC, and WMA. It’s a lossy format, so the FLAC codec still has superior reproduction quality over Ogg Vorbis, as FLAC contains all the original digital audio sample data. Ogg Vorbis audio and Ogg Theora video are supported in HTML5.

MPEG-4: Advanced Audio Coding AAC-LC, AAC-ELD, or HE-AAC

Android, with a market share in the 90% range across all hardware devices, supports all of the MPEG-4 AAC (Advanced Audio Coding) codecs, including AAC-LC, HE-AAC, and AAC-ELD. Java, the development and publishing platform nearing a 90% market share among developers, also supports these codecs. AAC audio data samples are held in MPEG-4 file “containers” (.3gp, .mp4, or .m4a file extensions). AAC-LC and HE-AAC can be decoded by all versions of Android; AAC-ELD is only supported in Android 4.1 and later. ELD stands for Enhanced Low Delay; this codec is intended for real-time, two-way communications applications, such as digital walkie-talkies or Dick Tracy–style smartwatch apps.
The simplest AAC codec is the AAC-LC (Low Complexity) codec, which is the most widely used. This is sufficient for most digital audio encoding applications. AAC-LC yields a higher quality result at a lower data footprint than the MP3 codec.
The most complex AAC codec, HE-AAC (High Efficiency), supports sampling rates from 8 kHz to 48 kHz and both stereo and Dolby 5.1 channel encoding. Android decodes both the V1 and V2 levels of HE-AAC, and can also encode audio using the HE-AAC-V1 codec on devices running Android 4.1 or later.
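On Android, choosing among the AAC variants comes down to MediaRecorder constants. The following sketch records microphone input as AAC-LC in an MPEG-4 container; the sample rate and bit rate shown are illustrative choices, not requirements.

    import android.media.MediaRecorder;
    import java.io.IOException;

    public class AacRecordingSketch {
        public static MediaRecorder startAacRecording(String outputPath) throws IOException {
            MediaRecorder recorder = new MediaRecorder();
            recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
            recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);  // .mp4/.m4a container
            recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);     // AAC-LC; HE_AAC and AAC_ELD also exist
            recorder.setAudioSamplingRate(48000);      // sample frequency, in Hz
            recorder.setAudioEncodingBitRate(128000);  // 128 kbps target bit rate
            recorder.setOutputFile(outputPath);
            recorder.prepare();
            recorder.start();
            return recorder;  // the caller is responsible for stop() and release()
        }
    }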
Because of software patents, Audacity doesn’t include an MPEG-4 encoder. Be sure to download and install the free FFMPEG 2.2.2 encoder for Audacity, from http://lame.buanzo.org before you start Chapter 5, where you’ll use Audacity 2.1.1. You should have done this in Chapter 1, so just make sure that you have the libraries installed to maximize Audacity’s features!

AMR: The MPEG-4 Adaptive Multi-Rate Audio Codecs for Voice

For encoding speech, which usually features a different type of sound wave than music does, there are also two other AMR (Adaptive Multi-Rate) audio codecs, which are extremely efficient for encoding things like speech or “short-burst” sound effects.
There is an AMR-WB (Adaptive Multi-Rate Wideband) codec in Android that supports nine discrete settings, from a 6.6 kbps bit rate up to 23.85 kbps, sampled at 16 kHz. This is a pretty high sampling rate where voice is concerned! This is the codec to use on Narrator tracks, if you’re creating interactive e-book Android Studio applications, for example.
There’s also an AMR-NB (Adaptive Multi-Rate Narrowband) codec in Android that supports eight discrete settings, from 4.75 kbps to 12.2 kbps audio bit rates sampled at 8 kHz. This is an adequate sample rate if the data going into the codec is high quality or if resulting audio samples do not require high quality due to the noisy nature of the content (for example, a bomb blast).
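Reusing the MediaRecorder approach from the AAC sketch above, only the format, encoder, and rate constants change for the speech codecs. A hedged sketch, using the top bit-rate setting of each codec:

    import android.media.MediaRecorder;

    public class AmrRecorderSettings {
        // Configures an existing MediaRecorder for wideband (true) or narrowband (false) speech
        public static void applyAmrSettings(MediaRecorder recorder, boolean wideband) {
            if (wideband) {
                recorder.setOutputFormat(MediaRecorder.OutputFormat.AMR_WB);
                recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_WB);
                recorder.setAudioSamplingRate(16000);     // AMR-WB samples at 16 kHz
                recorder.setAudioEncodingBitRate(23850);  // top AMR-WB setting: 23.85 kbps
            } else {
                recorder.setOutputFormat(MediaRecorder.OutputFormat.AMR_NB);
                recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
                recorder.setAudioSamplingRate(8000);      // AMR-NB samples at 8 kHz
                recorder.setAudioEncodingBitRate(12200);  // top AMR-NB setting: 12.2 kbps
            }
        }
    }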

Pulse-Code Modulation: Windows WAV or Mac AIFF PCM Codecs

Finally, almost all operating systems support the PCM (pulse-code modulation) codecs, commonly known as the Windows WAVE (WAV) audio format and the Apple AIFF (Audio Interchange File Format); this includes Windows, Mac OS, Blackberry, and the Linux-based platforms, such as Android, Tizen, Ubuntu, openSUSE, Firefox OS, Opera OS, and Chrome OS. Most of you are familiar with this lossless digital audio format from one of these two popular operating systems. It is lossless because zero compression is applied. PCM audio is commonly used for CD-ROM content, as well as for telephony applications, precisely because it is an uncompressed digital audio format: with no CPU-intensive compression algorithms applied to the data stream, decoding overhead is a non-issue for telephony equipment or CD players.
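The “zero compression” point is easy to see in the WAV container itself: a 44-byte header followed by raw sample data, stored exactly as sampled. Here is a minimal sketch that builds the canonical header for 16-bit PCM data, with fields laid out per the standard RIFF/WAVE specification:

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.charset.StandardCharsets;

    public class WavHeader {
        // Builds the canonical 44-byte header that precedes raw 16-bit PCM samples
        static byte[] build(int sampleRateHz, short channels, int pcmDataLength) {
            short bitsPerSample = 16;
            int byteRate = sampleRateHz * channels * bitsPerSample / 8;
            ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
            b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
            b.putInt(36 + pcmDataLength);                        // remaining chunk size
            b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
            b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
            b.putInt(16);                                        // fmt chunk size
            b.putShort((short) 1);                               // format 1 = PCM (uncompressed)
            b.putShort(channels);
            b.putInt(sampleRateHz);
            b.putInt(byteRate);
            b.putShort((short) (channels * bitsPerSample / 8));  // block alignment
            b.putShort(bitsPerSample);
            b.put("data".getBytes(StandardCharsets.US_ASCII));
            b.putInt(pcmDataLength);                             // raw sample data follows, as-is
            return b.array();
        }
    }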
For this reason, when we start compressing digital audio assets into various file formats in Chapter 12, which covers digital audio data footprint optimization, you will use PCM as the “baseline” file format.
You probably won’t put PCM into Kindle (MOBI), Java (JAR), or Android (APK) distributable files, however, because other formats, such as FLAC and MPEG-4 AAC, deliver the same, or perceptually the same, quality using a fraction of the data; in the case of AAC, up to an order of magnitude less.
Ultimately, the only way to find out which of the audio formats supported by Android produces the best result for any given audio asset is to encode the digital audio in all the primary codecs that you know are well supported and efficient. I show you how this is accomplished in Chapter 12.

Summary

In this chapter, you looked at the digital audio data encoding concepts, principles, and file formats used to compress and decompress digital audio assets, as well as to publish and distribute them to end users. You also learned how sample resolution, sample frequency, bit rate, streaming, and HD audio contribute to your digital audio sample’s quality and to its data footprint.
In the next chapter, you get hands-on with Audacity 2.1.1 and begin learning digital audio editing concepts, terms, and work processes.