Levels

Intro from Jay Allison: What's a good level? Not too hot, not too cold. Just right. Our TOOLS editor Jeff Towne spends a lot of time answering this perennial audio question. Transom is proud to present Jeff's Everything You Wanted to Know About Levels feature, and then some--with audio and visual examples, links to products and resources, the works. Print it out and study it, so you don't have to ask again.

from Jeff Towne

Achieving proper levels when recording and mixing is one of the most fundamental tasks in audio production, yet it remains confusing to many producers. There’s a good reason for that: there is a dizzying array of standards, often in conflict with one another. Adding to the confusion, there’s no single answer for what the “correct” audio level is, but understanding the most common norms is very important. Independent producers and reporters are increasingly responsible for creating the final audio product, whether it’s a podcast, a short feature, or a complete radio program.

For many years, reporters and producers had the luxury of remaining fuzzy in their understanding of the finer points of audio levels, because there were always audio engineers at radio stations who understood this arcane topic, who would make adjustments before it reached the listeners’ ears. In fact there were often several engineers who would make adjustments to the audio signal as it made its way through the distribution and broadcast chain, and there were standard procedures for aligning the levels that attempted to assure that a relatively consistent sound went out over the airwaves.

Photo of levels

But the world is changing. Producers now often deliver the final audio directly to the listener, as happens in podcasts, or with minimal intervention by network or station engineers. When a producer uploads a program to the Public Radio Satellite System’s ContentDepot, or PRX, those soundfiles are delivered directly to stations, where they are often simply placed in a station’s automation system, to air at the volume the producer chose. We’ve all experienced the jarring effect of a commercial blaring out of our radio or television at a volume much higher than the surrounding program, and a similar phenomenon can result if producers do not mix their programs to a standard level.

Most radio stations have hardware designed to control the audio levels of the broadcast chain, which can even-out disparate sources to some degree, but that processing does not always compensate for widely varying sources, so it’s preferable that each program has similar levels.

Peaks vs. Average Levels.

The distinction between peak and average levels is crucial. It’s important to attend to peak levels in order to avoid distortion.  If an audio signal reaches full-scale, and stays there for more than a passing moment, it will usually sound crunchy, blurry and distorted.  On a waveform display, one can see when the tops and/or bottoms of the waves are flattened, rather than displaying the gradual up and down undulations of an undistorted waveform. That’s called “clipping” and it almost always sounds bad. You avoid it by watching peak-reading meters and making adjustments to prevent  the levels from reaching 0dBfs.

Clipping
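As a rough illustration of what a clip indicator does, here is a minimal Python sketch that scans a list of samples for runs that sit at full scale. It assumes floating-point samples where 1.0 is full scale; the function name and the three-sample threshold are hypothetical choices for this example, not how any particular meter works.

```python
def find_clipped_runs(samples, full_scale=1.0, min_run=3):
    """Return (start_index, length) for each run of consecutive
    samples whose magnitude is at or beyond full scale."""
    runs = []
    run_start = None
    for i, s in enumerate(samples):
        if abs(s) >= full_scale:
            if run_start is None:
                run_start = i          # a new run of full-scale samples begins
        elif run_start is not None:
            if i - run_start >= min_run:
                runs.append((run_start, i - run_start))
            run_start = None
    # Handle a run that lasts until the end of the audio
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs


# Three flattened samples in a row are reported; the single full-scale
# sample later on is treated as a passing moment and ignored.
audio = [0.2, 0.9, 1.0, 1.0, 1.0, 0.8, -1.0, 0.1]
print(find_clipped_runs(audio))
```

A single sample touching full scale is not necessarily audible, which is why the sketch only reports runs; the right run length is a judgment call.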

Recording in the field is largely concerned with peak levels: the basic task is to record at as high a level as possible, without clipping.

When mixing or mastering, the concern changes from merely avoiding clipping, to assuring consistent levels over time, and an overall loudness that meshes well with other productions. In order to achieve that, you need to pay attention to Average Levels. These can be observed by using a meter that displays average, or RMS levels. RMS stands for Root Mean Square, and it’s a way of averaging loudness values over time, not just the momentary values of short transient  sounds.
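To make the difference between the two readings concrete, here is a small Python sketch that computes a peak level and an RMS level for the same signal. It assumes floating-point samples where 1.0 is full scale, and it measures RMS against that same full-scale value, so a sine wave reads about 3 dB lower on RMS than on peak. Some meters use a convention that adds those 3 dB back, so treat the numbers as illustrative.

```python
import math


def peak_dbfs(samples):
    """Level of the single loudest sample, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    """Root Mean Square level: square each sample, average, take the root.
    10*log10 of the mean square equals 20*log10 of the RMS value."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


# One second of a 1 kHz sine wave at half of full scale
sample_rate = 48000
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / sample_rate)
        for n in range(sample_rate)]

print(round(peak_dbfs(tone), 2))  # about -6.02
print(round(rms_dbfs(tone), 2))   # about -9.03
```

Speech and music have a much larger gap between the two numbers than a steady tone does, which is why both kinds of meter are useful.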

The difficult part of mixing a production so that it has the “right” average levels is that there are several competing standards. There are perfectly valid arguments to be made for any of them. Some advocate for average levels of –20 dBfs or even lower, which allows for greater dynamic range (the difference between the average volume and 0 dBfs, the loudest value that peaks can have).  Large dynamic range is especially appropriate for classical music, and for film sound, both of which often incorporate dramatic changes between soft and loud passages. But an overly-wide dynamic range can be problematic if the listener is left straining to hear the quiet parts, or is startled by dramatic peaks.

Overly Dynamic/Overly Compressed

On the opposite end of the spectrum, some advocate for average levels at –12 dBfs. This standard is widely used by talk-radio, as well as many pop-music stations. This louder overall level means that there’s less dynamic range: peaks cannot be louder than the average level by as much, but all parts of the program are more easily heard. This can be especially helpful for listening in loud environments, where there’s background interference, like the road noise one hears when driving in a car. Higher average levels, and smaller dynamic range mean that low-level signals won’t get lost in background noise, but sometimes the audio material can sound squeezed and lifeless.

Dynamics processing, such as compression and limiting, is often required to tame the peaks enough to get average levels up to –12 dBfs, and that kind of processing sometimes sounds unnatural. There’s been an ongoing problem in the music industry, sometimes called the Loudness Wars: the average levels of commercial music have been getting higher and higher, until there’s little dynamic range left, just loudness. This phenomenon is partly the result of developments in audio processing technology. The advent of “look-ahead limiters,” first in hardware boxes like the T.C. Electronic Finalizer, then software-based processors such as the Waves L-1, made it possible to push average levels higher and higher without causing peaks to clip and distort. By buffering a tiny amount of sound, these processors are able to anticipate incoming audio levels, and apply compression and limiting in a more aggressive way than traditional analog processors can. But these tools, and other digital processing that can manipulate dynamic range, can easily be over-used, making audio louder, but lifeless.

A good middle ground is to mix so average levels are at –15dBfs. That is the standard recommended by the Public Radio Satellite System, so files uploaded to ContentDepot should be mixed so that average levels are at that level. There’s an additional twist to that standard: peaks should not exceed –3 dBfs, so there’s actually only 12 dB of dynamic range. This would be similar to mixing with average levels of –12dBfs and allowing peaks to go to zero, but it’s not quite the same. It’s often smart practice to leave some open headroom at the top, and not let peaks go all the way to zero. This way, further manipulations of the file, which may happen if a radio station’s automation system or broadcast chain applies additional processing, like a level adjustment, or EQ or compression, won’t cause the signal to distort. A peak that goes completely full-scale, hitting 0 dBfs might sound fine in the original file, but could clip and distort if any kind of processing is applied, even something that reduces volume, like a high-pass filter or other EQ.
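The two numbers in that recommendation can be turned into a simple check. The Python sketch below takes an average (RMS) reading and a peak reading, both in dBfs, and lists any problems. The function name and the 1 dB tolerance around the average are assumptions made for this example; the recommendation itself only gives the –15 and –3 figures.

```python
def check_delivery_levels(average_dbfs, peak_dbfs,
                          target_average=-15.0, peak_ceiling=-3.0,
                          tolerance_db=1.0):
    """Compare measured levels with a target average and a peak ceiling.
    Returns a list of problem descriptions (empty if the mix passes).
    tolerance_db is a hypothetical allowance, not part of any standard."""
    problems = []
    if peak_dbfs > peak_ceiling:
        over = peak_dbfs - peak_ceiling
        problems.append(f"peaks are {over:.1f} dB above the ceiling")
    offset = average_dbfs - target_average
    if abs(offset) > tolerance_db:
        direction = "above" if offset > 0 else "below"
        problems.append(f"average is {abs(offset):.1f} dB {direction} the target")
    return problems


print(check_delivery_levels(-15.2, -3.5))  # a mix that fits the targets
print(check_delivery_levels(-12.0, -0.5))  # a mix that is too hot on both counts
```

The readings themselves would come from your meters, or from a measurement of the finished file.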

PRX recommends following the same level standards that the ContentDepot uses, and for good reason: the idea is for radio programs to have a consistent level no matter where they come from. A station can just drop the files from ContentDepot and PRX into their playback or automation computer and they’ll all play out at a similar loudness, without jarring, abrupt jumps up or down in volume.

These suggested broadcast levels align vaguely with another set of standards: the K-System, developed by famed mastering engineer Bob Katz.

Overcompressed – too little dynamic range

His main motivation was fighting back against the loudness wars in commercial music. Recordings were getting louder and louder, but sounding worse and worse. He developed the K-System, based somewhat on standard practices in the film industry, to encourage mix engineers and mastering engineers to resist the urge to go louder and louder, but instead to retain dynamic range, for a more natural sound, and to adhere to standard average levels.  The K-System acknowledges that different contexts may call for different standards. It recommends average levels at –20 dBfs for high-fidelity music, -14 dBfs for pop music, and –12 dBfs for broadcast.

The PRSS/PRX standard is kind of in-between: the –15 dBfs mark for average levels is very close to the K-14 standard (–14 dBfs), and the 12 dB of dynamic range is similar to the recommendations for broadcast audio.
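One way to picture the K-System scales is as the same meter with its zero mark moved. The Python sketch below uses the three reference levels mentioned above; the scale names K-20, K-14 and K-12 and the function names are labels chosen for this example.

```python
# Average level (in dBfs) that sits at the 0 mark on each scale
K_SYSTEM_REFERENCE_DBFS = {"K-20": -20.0, "K-14": -14.0, "K-12": -12.0}


def k_meter_reading(rms_dbfs, scale):
    """Where an average level lands on a K-System scale, in dB above
    (positive) or below (negative) that scale's 0 mark."""
    return rms_dbfs - K_SYSTEM_REFERENCE_DBFS[scale]


def headroom_db(scale):
    """Room between the scale's 0 mark and digital full scale."""
    return 0.0 - K_SYSTEM_REFERENCE_DBFS[scale]


# A mix averaging -15 dBfs sits 1 dB under the K-14 zero mark,
# and well above the K-20 zero mark.
print(k_meter_reading(-15.0, "K-14"))
print(k_meter_reading(-15.0, "K-20"))
print(headroom_db("K-12"))
```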

But there are still many other conflicting standards, each with a solid rationale for their numbers, and some with different ways of measuring the levels. The BBC in Great Britain uses a system called PPM. Their internal standards designate average levels as PPM 4, which translates roughly to –18 dBfs. They prefer that peaks go no higher than PPM 6, which is roughly equivalent to –10 dBfs, which only allows 8 dB of dynamic range.
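The two PPM figures above are enough to sketch a rough conversion. The Python example below draws a straight line through those two points (PPM 4 at about –18 dBfs, PPM 6 at about –10 dBfs), which works out to 4 dB per PPM mark. Treat that as an assumption for illustration only: it says nothing about the ballistics of a real PPM, and the mapping to dBfs depends on how a given facility is aligned.

```python
def ppm_to_dbfs(ppm_mark, ref_mark=4.0, ref_dbfs=-18.0, db_per_mark=4.0):
    """Approximate dBfs value for a PPM scale mark, assuming a straight
    line through the two points quoted in the text."""
    return ref_dbfs + (ppm_mark - ref_mark) * db_per_mark


average = ppm_to_dbfs(4)
peak = ppm_to_dbfs(6)
print(average, peak, peak - average)  # the gap is the 8 dB mentioned above
```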

Elsewhere in the European Union, the level standards are not completely set-in-stone. There’s a somewhat widespread standard called EBU R-128 that sets average levels at the equivalent of –23 dBfs, but there’s also a fairly new proposal called ITU BS-1770, that may have some success in standardizing broadcast audio levels. It’s related to a US standard called ATSC RP A/85, so there’s some hope of having a standard level that may be compatible across continents. But these matters are still in flux, so the best practice is to ask what levels are preferred by whoever is ultimately receiving the audio.

As mentioned above, for the time being, if you’re distributing your audio to Public Radio stations in the US, you should align to average levels at –15 dBfs, and peaks at –3 dBfs.

Measuring Up.

So, how do you get your audio to those specs? It’s partly a matter of having good level meters, and knowing how to use them. The K-System, referenced above, also includes the idea of having a fixed listening level of 83 dB(SPL). Even if you don’t adhere to that specific standard, maintaining a fixed monitor level will go a long way toward making your mixes more even, simply by using your ears. Try to avoid turning the volume of your speakers up and down randomly: have an unchanging setting for your amplifier, or mixer, even your headphones, or whatever you use to listen when mixing. Ideally, a volume control with a very precise knob using stepped gradations will allow you to more accurately create a repeatable monitoring level, but that’s a pretty esoteric requirement. Marking the position of all relevant knobs and/or faders with tape, or a marker, should get you close enough.

Meters are important aids, but remember, what we’re going for is an even loudness over time, but the essential nature of sound is that it’s changing constantly. So, use the meters as a guide, to verify what you’re hearing, but use your ears as your primary tools. There are some hard-to-quantify characteristics of sound that will make some elements feel louder or softer than meters may indicate. In the end, trust your ears.

Bomb Factory Essential Meter

Ideally, a 1 kHz sine wave tone, played from within your editing program, should make your meters read –15 dBfs on a peak-reading meter, and 0 VU on a VU-style meter. Pink noise, registering those same levels on your meters, should show 83 dB (SPL) on a sound level meter placed where you’ll be sitting while mixing. You can get an SPL meter at your local Radio Shack, or there are inexpensive apps for the iPhone and Android smartphones that do the same thing.
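If your editing program does not have a tone generator, a reference tone is easy to build. This Python sketch creates the samples for a 1 kHz sine wave whose peaks sit at –15 dBfs. It assumes floating-point samples where 1.0 is full scale and a 48 kHz sample rate, and it only sets the peak amplitude; how that tone reads on an RMS or VU-style meter depends on how the meter is calibrated.

```python
import math


def sine_tone(frequency_hz=1000.0, peak_dbfs=-15.0,
              seconds=1.0, sample_rate=48000):
    """Return a list of samples for a sine wave with the given peak level."""
    amplitude = 10 ** (peak_dbfs / 20)   # convert dBfs to a linear amplitude
    count = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]


tone = sine_tone()
peak = max(abs(s) for s in tone)
print(round(20 * math.log10(peak), 2))  # about -15.0
```

The samples would still need to be written to a sound file or played through your system before they are any use for calibration.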

For metering within your digital workstation, there are many choices. For Pro Tools, it’s good practice to create a master fader, and insert a good meter as a plug-in on that track. Pro Tools usually ships with a plug-in called the Bomb Factory Essential Meter Bridge. It can usually be found in PlugIns>> Other>>BF Essential Meter Bridge. It simulates an old-fashioned VU meter, and has several virtual push-buttons to calibrate to different level standards. When mixing for Public Radio in the US, press the –15 button, and make sure the switch is set to read RMS values. That same meter can be used to show peak levels, but it’s much better at showing average levels.

For Pro Tools as well as other editing programs that can use external plug-ins, there are several other good choices.

FreeGMeter

There’s a very capable free meter plug-in called Sonalksis Free-G.

It looks like a typical peak-reading meter, but displays both peak and RMS values at the same time. It’s quite handy to be able to see both modes, and it even allows adjustment of the channel’s gain from within the plug-in.

If you’re able to spend some serious money, the Dorrough Meters from Waves are amazing. Dorrough hardware meters have been standard in the broadcast industry for years, and with good reason. They’re easy to read, and can display a wide range of helpful data. But the plugins are also pretty expensive, street price about $180 at press time.

There is also a very good metering plug-in built around the K-System that we’ve mentioned a few times: http://www.audiopluggers.com/kmeter/index.html It’s about $50, but with one catch: If you use Pro Tools, it does NOT run natively as an RTAS plug-in. You would also need a VST-to-RTAS wrapper utility, like this one: http://www.fxpansion.com/index.php?page=15 which sadly costs $99 (but will also let you use a wider array of plug-ins within Pro Tools, not just this one metering plug-in). If your digital workstation can use VST or AU plug-ins natively, the K-Meter is simpler to install and use.

There are many other metering plug-ins; if you find one that you like, the only requirements are that it can display RMS levels, and can calibrate 0 VU to –15 dBfs.

Some programs have very good meters built in. Adobe Audition’s meters can be made very large, with fine detail. Hindenburg Journalist’s meter is very good, and can be switched to display different standards. Hindenburg even has a loudness meter that shows loudness over time and can be inserted on any track.

It’s always important to use your ears to verify that things are sounding right, but meters are important aids too. If you use a reliable meter, it can show you whether your clips have been turned up too loud: if so, you’ll see the peak meters hit the top of the scale and light up the red indicators. If this happens, you need to turn down that track, or control the peaks with a dynamics processor.

Spoken voice levels

Using a meter that shows RMS values is just as important. If it’s a traditional VU-style meter, you want the needles to bounce up around the 0 VU mark; it’s even OK if they dip over into the red briefly. You don’t want them staying over there, or to be pushed hard all the way to the right, but a short excursion above 0 is fine. For average dialog, it’s normal for a VU meter’s needles to hang around –7, but they should pop up toward zero from time to time. This is hard to get used to at first: normal spoken dialog might look too low on a traditional VU meter, compared to a denser, busier source, like pop music, but it’s OK for the meters to indicate –7 or so for much of the time, as long as your levels do get up around 0 VU from time to time. On a digital scale, you want the levels to hang somewhere near the desired average, in the Public Radio case –15 dBfs, with occasional pops up toward zero. If not, you may need to increase the level of the track or clip, by using volume automation, or by inserting a dynamics processor. In fact it’s almost always necessary to use some compression or limiting to get an unaccompanied spoken voice up to average levels of –15 dBfs.

Volume Automation

Understanding one’s meters is not as simple as one might hope. Measuring sound is a complex task: there are many different aspects of it that can be quantified (electrical power, sound pressure levels, relative intensity and more). Those different characteristics of sound are all interrelated, and the units of each measurement are expressed in “decibels” (dB) but they’re actually different values, depending on what kind of dBs are being measured. It’s a hard concept to visualize, but decibels are relative, rather than absolute, values: they are always a comparison to some standard. It’s dB relative to something.

Just to make things even more odd, decibels use a logarithmic scale, so that a relatively manageable range of numbers can represent a very wide range of intensities. If decibels were linear units, like inches or liters, we’d have to resort to cumbersomely large numbers in order to describe the full range of sound intensities that we encounter.

There are two practical consequences of these quirks of measurement: a dB is not a simple unit with a standard value like a mile or a gram, and the loudness represented by each dB gets larger and larger as the values increase. The difference in intensity changes dramatically with even small movement along the scale.
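The arithmetic behind the logarithmic scale is short enough to try out. This Python sketch uses the standard conversions: 20 times the base-10 logarithm for amplitude-type quantities (voltage, sample values, sound pressure) and 10 times the logarithm for power. The function names are just labels for this example.

```python
import math


def db_to_amplitude_ratio(db):
    """Ratio between two amplitudes (voltage, sample value) that differ by db."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Ratio between two powers that differ by db."""
    return 10 ** (db / 10)


def amplitude_ratio_to_db(ratio):
    """Decibel difference for a given amplitude ratio."""
    return 20 * math.log10(ratio)


# Each step along the scale multiplies, rather than adds
for db in (3, 6, 10, 20, 60):
    print(db, round(db_to_amplitude_ratio(db), 2), round(db_to_power_ratio(db), 2))

# Halving an amplitude is a drop of roughly 6 dB
print(round(amplitude_ratio_to_db(0.5), 2))
```

A 60 dB difference works out to a thousand-fold change in amplitude and a million-fold change in power, which is the kind of range that would be cumbersome to write in linear units.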

The dBs that the general public might be familiar with are properly labeled as dB(SPL). SPL stands for Sound Pressure Level, and this scale is used for the loudness of audio phenomena in the environment, as perceived by human ears. In this case the ratio is based on the quietest perceivable sound having a value of 0 dB(SPL) and the intensity increases in a complex way. A very rough rule of thumb is that the perceived loudness doubles every 10dB.  (You’ll find different values for that rate of doubling when calculating voltages, or power, or sound pressure – but the lesson to take from it is that small changes in the numbers of dBs of any type can represent large changes in loudness.)

SPL meter

Loudness perception is more complex than can be represented with a simple number, but it’s generally accepted that 40-60 dB(SPL) is the value for conversational speech, 80-90 dB(SPL) for loud traffic,  and 130-140 dB(SPL) for a loud rock concert, which is also considered the threshold of pain.

As mentioned above, the K System uses the “magic” value of 83 dB(SPL) as a monitoring level, and that has become something of a standard in mixing and mastering studios. If you align your listening environment so that your reference level, the average loudness of your sound, is always 83 dB(SPL), you can often get very consistent levels mostly just by using your ears. But even having a stable monitor volume is not always reliable on its own: your perception of loudness can be altered by many factors, like fatigue, health, the monitoring environment, extraneous noise, and other variables, so be sure to also check your meters.

Conversely, meters are not completely reliable: human perception of sound is very complex, and does not react in the even, linear ways like a meter might. The duration of a sound will affect its perceived loudness, as will the shape and complexity of the waveform, its pitch, and several other attributes. Many attempts have been made to tailor the scale to represent the ways the human ear perceives the sound, and you may see sound levels described as A-Weighted, or any of several other curves that have come into use for specific purposes, all compensating for the non-linearity of the human perception of sound. Just to make things even more complicated: human hearing is by definition about perceiving changes in SPL, not a static value, so all of these measurements are also made over time.

So use both: your ears will give you important data, as will meters.

In order to fully understand your meters, you need to have at least a passing familiarity with the scale they are using. They’re marked in dB, and the kinds of dB that we encounter most often in the modern audio recording world are dBfs, which is an abbreviation for decibels relative to full-scale. Those are the units used on the digital meters we most commonly see on recorders and in computer software. In this system, full-scale, the highest intensity of signal that can be encoded, is represented by the number zero, and all other levels are indicated by negative numbers: how many dB the signal is below that full-scale. As a signal gets more intense, its dBfs value gets closer to zero: –2 dBfs is louder than –5 dBfs, which is louder than –10 dBfs, and so on.
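Here is how that scale relates to the numbers actually stored in a file. The Python sketch below assumes 16-bit audio, where sample values run from -32768 to 32767, and treats 32768 as full scale; other bit depths would use a different full-scale value.

```python
import math

INT16_FULL_SCALE = 32768  # largest magnitude a 16-bit sample can hold


def sample_to_dbfs(sample_value, full_scale=INT16_FULL_SCALE):
    """Level of one integer sample in dB relative to full scale."""
    magnitude = abs(sample_value)
    if magnitude == 0:
        return float("-inf")   # digital silence has no finite dB value
    return 20 * math.log10(magnitude / full_scale)


for value in (32768, 26000, 16384, 8192, 100):
    print(value, round(sample_to_dbfs(value), 1))
```

Larger sample values give results nearer to zero, and nothing can come out above zero because no sample can be larger than full scale.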

Older equipment often featured VU meters; in most cases they were mechanical needles pivoting across an arched scale. This scale did not use zero to represent the absolute maximum level. That zero does not mean the same thing as the full-scale zero: this zero indicates a value calibrated to a standard output voltage. Using this meter, it was possible, even desirable, to exceed the zero value at times; peaks above 0VU were fine.

Dorrough hardware meters

Both kinds of metering are useful in their own ways, and each is better suited for measuring different aspects of an audio signal. The trick is that it’s very hard to represent sound on a meter: the very nature of sound is that it’s the CHANGE in sound intensity and pressure over time, and there’s a very complex relationship between the physical properties of the sound itself and the way our ears perceive it. So any kind of metering is an approximate representation of what we want to know, but if one learns to use them, they can be very helpful in getting clean, predictable, evenly-balanced sound.

It’s generally easier to measure peaks with a digital meter that uses a scale graded in dBfs. And it’s often easier to visualize average levels when using a VU-style meter, whether it’s purely digital, or an analog meter relying on the mechanical action of physical needles. Those old analog VU meters were notoriously inaccurate, partly due to the inertia and momentum of the physical needles used in the displays, but they nonetheless did a pretty good job of representing average levels.

Recording in the Field.

The primary task of field recording is to record as loud as possible, without clipping. The meters on various recorders behave differently, so it may take a little while to get used to properly interpreting what those meters are telling you. The excellent Sound Devices recorders have very accurate and readable meters, but the way they are designed, desirable, safe, recording levels can cause the meters to flash red. You just have to get used to this quirk, and realize that in this rare case, seeing red indicators on the meters is OK, although you still need to avoid lighting up that very top LED light.

Field recorder input meter

Seeing red lights, or other clipping indicators on a meter is usually not good: the generally accepted convention is for red lights to indicate clipping, but any given meter may indicate it in other ways, so be sure to learn how your recorder displays clips, and avoid them! Some recorders even have clip indicators separate from the regular level meter. If input levels are high enough to be triggering those clip lights, the first thing to do is to turn down the input gain. If the source is very loud, you may need to engage a “pad” on either the recorder or the microphone, which will knock the level of the incoming signal down by a preset amount.

The tricky part is to not turn your inputs down too low. If you record at too low a level, you risk creating a noisy recording. Increasing the level of the recording when mixing will also increase residual noise from the recorder and microphone. If you record extremely low levels, your sound may also be muddy and indistinct, because you haven’t given the recorder enough signal to convert to useful digital information. It’s like under-exposing a digital photo: you can brighten it back up in an image editing program, but the picture will look grainy and blocky because you used so little of the sensor’s range to encode the data. The same holds true for audio: raising the gain dramatically later in your computer will reveal noise and other imperfections of the recording chain. A common rookie mistake is to turn the input volume down when hearing background noise during recording. That noise is going to reappear if you have to raise the gain when you mix, so it’s better to try to mitigate that while recording, by getting closer to the source, or trying a different microphone, or adjusting the settings on your recorder, such as engaging a high-pass filter.
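The arithmetic shows why boosting later does not help. In this Python sketch, the speech and noise levels are made-up numbers, and the recorder's own noise is assumed to sit at the same level no matter how hot the recording is, which is a simplification. Gain applied in the mix raises the signal and the noise by the same amount.

```python
def after_gain(signal_dbfs, noise_dbfs, gain_db):
    """Gain added in the mix moves the signal and the noise together."""
    return signal_dbfs + gain_db, noise_dbfs + gain_db


def signal_to_noise_db(signal_dbfs, noise_dbfs):
    """How far the signal sits above the noise."""
    return signal_dbfs - noise_dbfs


# Hypothetical recorder noise at -75 dBfs in both cases
timid = after_gain(signal_dbfs=-40.0, noise_dbfs=-75.0, gain_db=25.0)
healthy = after_gain(signal_dbfs=-20.0, noise_dbfs=-75.0, gain_db=5.0)

print(timid, signal_to_noise_db(*timid))      # noise ends up at -50 dBfs
print(healthy, signal_to_noise_db(*healthy))  # noise stays down at -70 dBfs
```

Both recordings end up with speech at the same level, but the timidly recorded one carries 20 dB more noise along with it.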

(There’s one exception to that rule: the microphone preamps on some recorders are slightly more noisy when turned ALL the way up to their absolute maximum. You may need to experiment with your particular combination of microphone and recorder and see if there’s added noise when you set the input knob up at 9 or 10.)

As a rule, record as loud as you can, without clipping. Watch your meters carefully, and be ready to adjust the input gain (gently, gradually!) on the fly if your meters are reading too high or too low. It’s better to err on the low side than the high, because quiet levels are easier to fix in the mix than distorted peaks, but recordings made at TOO low a volume can also be problematic.

Many recorders offer Automatic Gain Control (AGC) and/or Limiting, which can adjust the levels automatically, but these vary widely in quality. With very few exceptions, AGC will create an unnatural-sounding recording, with background sounds pumping up and down in an unpleasant way. It’s also important to verify how the AGC works: some Zoom recorders, for instance, do not adjust the level over time, instead setting a fixed recording level based on the signal intensity present when placing the machine in record-ready mode. This avoids the pumping sounds of conventional AGC, but it also does not react to changing audio levels, and may therefore still allow distorted or too-low recordings.

Built-in limiters also range widely in their sound quality and effectiveness. In theory, a limiter can tame unexpected peaks by automatically reducing the input gain on signals that exceed a certain threshold level. Limiters tend to be more specific in their action than AGC circuits, and so, often sound better. They can be a lifesaver when encountering unexpected loud sounds, or troublesome sources that have excessively large differences between loud and soft components. But you may need to experiment with your recorder in order to decide whether its limiter sounds good. The Sony PCM-D50 has an unconventional limiter that manages to sound very natural, but many recorders have slow-acting limiters that tend to overshoot, and take too long to return to normal record levels, creating odd-sounding volume dips on the recording following a loud peak.

Ideally, one would record without AGC or limiting, and even-out the levels in the more controlled environment of a digital workstation. But if avoiding clipping means that you’d be forced to record very low signals, you may need to engage one of those tools, or carefully ride the input levels by manually adjusting the gain knob.

In the Mix.

At the mix stage, the engineer’s relationship with levels is completely different from recording in the field. Yes, one still needs to avoid clipping, so that your final mix doesn’t distort, but you’re less concerned with getting your levels as loud as possible, and more concerned with making them even. This is all about getting the average levels even, not the peak levels, and getting those average levels to match accepted standards.

As mentioned above, the expected practice in the Public Radio system in the US is to mix your session so that average levels are at –15 dBfs and peaks do not exceed –3 dBfs. You achieve that by adjusting the levels of each clip in your project, so that your audio level meters, inserted on the main output, show those levels.

The volume of the individual clips can be adjusted in three main ways: by adjusting the level of the clips themselves, by doing volume adjustments to the track by riding the fader or writing volume automation, or by using dynamic processors, like compressors and limiters, either on the track or applying the effect to the region.

Many people start by “normalizing” individual regions or clips. This can be helpful, or counterproductive, depending on the nature of the recordings. If the audio is extremely quiet, performing a process like this can be a handy way to get each clip close to what you’re looking for. The problem is that most Normalizing processes look only at peak levels, and adjust the region so that its loudest point is up at 0 dBfs. That output level can sometimes be adjusted, but keep in mind that most normalizing is done based on peaks.

But peak levels don’t correlate well with how loud the clip sounds; you need to look at average (RMS) levels for that. There are some RMS normalizing utilities out there, but they’re rare. RMS normalizing will get you much closer to even volumes across clips. The downside of it is that the process could then do something bad to the sounds’ peaks; there’s even the chance that you could force a peak to clip and distort. So you need to watch both the peaks and the average levels, as you change a region’s volume.
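The difference between the two approaches shows up clearly on a clip that is mostly quiet but contains one loud spike. This Python sketch works out the gain each kind of normalizing would apply. It assumes floating-point samples where 1.0 is full scale; the target values and function names are choices made for this example, not the behavior of any particular program.

```python
import math


def peak_dbfs(samples):
    return 20 * math.log10(max(abs(s) for s in samples))


def rms_dbfs(samples):
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


def peak_normalize_gain_db(samples, target_peak_dbfs=-3.0):
    """Gain that moves the loudest sample to the target peak level."""
    return target_peak_dbfs - peak_dbfs(samples)


def rms_normalize_gain_db(samples, target_rms_dbfs=-15.0):
    """Gain that moves the average level to the target."""
    return target_rms_dbfs - rms_dbfs(samples)


# A quiet clip with a single loud spike (a door slam, a mic bump)
clip = [0.05, -0.05] * 500 + [0.9]

peak_gain = peak_normalize_gain_db(clip)
rms_gain = rms_normalize_gain_db(clip)
print(round(peak_gain, 1))                    # negative: the clip gets turned DOWN
print(round(rms_gain, 1))                     # positive: the clip gets turned up
print(round(peak_dbfs(clip) + rms_gain, 1))   # where the spike would land
```

Peak normalizing lets the spike dictate the level and leaves the quiet material even quieter. RMS normalizing raises the body of the clip, but the spike would then land well above 0 dBfs, so it would clip unless it is tamed first.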

Another problem with normalizing is that, in many programs, that process writes a new file with the new volume, which eats up disk space, and also makes it harder to go back and change your mind about where a region begins or ends. Not all programs write a new file when normalizing, but most do, including Pro Tools, Audition and Audacity.

Hindenburg

One of the attractive things about Hindenburg Journalist is that its leveling functions, including auto-level, do not write new files; the adjustment is completely non-destructive. The amount of boost or cut can be tweaked as much as desired, and the boundaries of the volume adjustments can be moved as well. You can even apply, and adjust, crossfades across the borders of regions with different volumes.

Yet another problem with normalizing, at least peak normalizing, is that it does not assure that the clips are at the proper level, only that they’re at an arbitrary level you specify. RMS normalizing can get the clips closer to a target loudness, but any kind of automatic process like that is subject to quirky behavior. If there’s a very loud peak somewhere in the region that you’re normalizing, it will throw off the level for the region as a whole, and you may still have a region that sounds louder or softer than the other regions you’re working with.

If the levels of individual clips are fairly well recorded, therefore close to the right level to start, it’s probably better to skip normalizing or other kinds of gain applied directly to the individual regions, and instead use volume automation and/or compression.

In most mixes, you’ll still want to do some volume automation on some tracks to compensate for momentary loud or soft sections. It’s best to adjust the levels of individual tracks, not the final mixed output. Each track may need unique volume changes, which may interact with one another as each change is made, so listen carefully after you make a change, and watch your meters: make sure that the volume changes you are making are resulting in a healthy level at the final mix, but are not hitting the 0 dBfs mark, causing clips and distortion. This is especially likely to be a concern if you are layering sounds; each element on its own might be fine, but two or three tracks playing together may create an overload on the stereo master track.
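The layering problem is easy to demonstrate with numbers. In this Python sketch, two tracks each peak comfortably below full scale, but their sum does not. It assumes floating-point samples where 1.0 is full scale, and uses two identical tones as a worst case, since real tracks rarely line up this perfectly.

```python
import math


def mix_tracks(tracks):
    """Sum several tracks sample by sample, as a mix bus does."""
    return [sum(frame) for frame in zip(*tracks)]


def peak_dbfs(samples):
    return 20 * math.log10(max(abs(s) for s in samples))


sample_rate = 48000
voice = [0.7 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(4800)]
music = [0.7 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(4800)]

print(round(peak_dbfs(voice), 1))                       # about -3.1
print(round(peak_dbfs(mix_tracks([voice, music])), 1))  # about +2.9, over full scale
```

Each track looks fine on its own meter, which is why the meter on the master output is the one to watch.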

Volume automation in Audacity

Automating the output volume of a track is a straight-ahead process in most editing programs: there’s usually a graphical line one can adjust when in volume mode, or perhaps the upper edge of the waveform image can be dragged up or down, to raise and lower the volume. The visual depiction of the waveform can be helpful in guiding you to the spots that need boosting or cutting, but be sure to use your ears, more than your eyes. Some variation of the levels is natural and desirable so don’t over-smooth it. If you do ramp the volume up and down, be sure to be gradual; in most cases, abrupt changes to a track’s volume will sound unnatural. If a track sounds like it needs constant riding of the levels up and down, it’s a good candidate for dynamics processing.
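Under the hood, a volume automation line is just a gain value that changes over time. This Python sketch shows one simple way to build a gradual ramp, assuming a ramp that moves evenly in dB from one level to another; real editors may shape their fades differently, and the function names here are invented for the example.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear gain factor."""
    return 10 ** (db / 20)

def ramp_gains(start_db, end_db, length):
    """Per-sample gain factors that move evenly (in dB) from start_db to end_db."""
    if length < 2:
        return [db_to_gain(end_db)] * length
    step = (end_db - start_db) / (length - 1)
    return [db_to_gain(start_db + step * i) for i in range(length)]

def apply_gains(samples, gains):
    """Multiply each sample by its matching gain factor."""
    return [s * g for s, g in zip(samples, gains)]

# Hypothetical move: bring a track down by 6 dB over one second
# at 44,100 samples per second, instead of all at once.
sample_rate = 44100
gains = ramp_gains(0.0, -6.0, sample_rate)
steady_signal = [0.5] * sample_rate
faded = apply_gains(steady_signal, gains)

print("gain at start:", gains[0])
print("gain at end:  ", round(gains[-1], 3))
```

An abrupt change would be the same 6 dB applied between two neighboring samples; spreading it over a second (or more) is what keeps the move from calling attention to itself.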

The third way of evening-out the audio level is to reduce the dynamic range of the audio by using compression and limiting. Getting your peaks at the right level does not guarantee that your average levels will be correct. Of course the converse is true as well: if you mix so that your average levels are correct, you may find that peaks are too high, perhaps even clipping. If that’s happening to your mix you may be able to address it with volume automation, but the more practical answer is often to use compression and/or limiting.

It’s important to note that the term “compression” is used in different ways in the audio world. One usage refers to data compression; reducing the data-size of a digital audio file by converting it to an MP3, or AAC or OGG, or other such file types, in order to make it easier to store or transmit.

But we’re talking about the other kind of compression: audio level compression, or dynamic range reduction. Compression and limiting are just two flavors of the same process: limiting is just extreme compression, usually applied only to the loudest parts of a sound. Both types of processing reduce the levels of loud sounds, while leaving the quieter sections alone. The result is a more-manageable dynamic range, with reduced peak levels, and less difference between the loud parts and the quiet parts. With the peaks reduced, the overall level can be increased, making the average levels louder. Some compressors and limiters do that automatically; that process is sometimes called upward compression. If the compressor you’re using does not do that, you may need to adjust the parameter on the compressor called “make-up gain” in order to return the processed track to an ideal average level.

L1 Limiter

In general, you’ll get better results by applying compression to individual tracks, rather than the final stereo master track, although it may sometimes be advisable to do both. In fact, it’s often very handy to put a brick-wall limiter, such as the Waves L1, or the Massey L2007 Mastering Limiter, on the stereo master track, just as a safety, to catch any stray peaks. I always have one of those two plug-ins in place on my master fader, with the output level set to –3 dBfs, so that no matter what I do, no peaks will exceed that level.

If my average levels are too low, I’ll lower the threshold of the limiter, the level at which it starts to take effect. That has the effect of increasing the overall average mix level, while still holding the peaks at –3 dBfs. Those two plug-ins happen to have automatic gain compensation, so as the action of the limiter increases, the average level of the program material increases. Many traditional compressors and limiters work the other way: the more compression or limiting is applied, the lower the overall output levels. Those compressors usually include a control for make-up gain, allowing the overall signal volume to be raised to compensate for the reduced peak levels.
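The relationship between threshold, output level and average loudness can be sketched with simple dB arithmetic. This is an idealized model, not the actual behavior of the L1 or L2007: it assumes that nothing gets past the threshold and that whatever reaches the threshold is moved to the output ceiling, and it ignores the attack, release and look-ahead behavior of real plug-ins. The function name and the example levels are invented.

```python
def limiter_output_db(input_db, threshold_db, ceiling_db):
    """Idealized brick-wall limiter with automatic gain compensation.

    Levels above the threshold are held at the threshold, then everything
    is shifted so that the threshold lands on the output ceiling.
    """
    limited = min(input_db, threshold_db)
    return limited + (ceiling_db - threshold_db)

# Hypothetical mix: voice averaging around -21 dBfs, with a stray peak at -2 dBfs.
average_in = -21.0
peak_in = -2.0
ceiling = -3.0

for threshold in (-3.0, -6.0, -9.0):
    average_out = limiter_output_db(average_in, threshold, ceiling)
    peak_out = limiter_output_db(peak_in, threshold, ceiling)
    print(f"threshold {threshold}: average {average_out} dBfs, peak {peak_out} dBfs")
```

With the threshold at -3 dBfs the limiter only catches the stray peak. Lowering the threshold to -9 dBfs raises the quiet material by 6 dB, from -21 to -15 dBfs in this example, while the peak stays pinned at -3 dBfs.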

Compression image
Compression

Compressing or limiting the final master track may not be enough, or may introduce undesirable audible artifacts, like the volume of music or ambience backgrounds jumping up and down. In many cases, it’s best to compress or limit individual tracks, so that only those elements are affected by the processing.

Voice tracks, in particular, can often benefit from some compression, just to even-out the natural tendency of the spoken word to have a wide range of volumes.

Gregg McVicar wrote a good column about adjusting voice levels for Transom.org several years ago that still offers good advice.

Getting good results with a compressor takes some practice, but in general, you activate the compressor, play some audio into it, and then adjust the ratio and threshold controls until you start to see some gain reduction on loud peaks. I prefer to insert a compressor as a plug-in on a track and let it affect the whole track, although if there’s only a momentary need for peak control, I may highlight the troublesome area and apply compression just to that section. In Pro Tools, I’d use an AudioSuite effect, which writes the processing permanently to the highlighted region.

If you’re inserting compression on a track, it makes sense to place different voices on individual tracks, and use a compressor on each track, if needed, possibly with a different setting for each.

Compression image
Compression

Looking more closely at the compressor plug-in: the ratio control adjusts the severity with which the compressor acts. A 3:1 ratio means that once the sound is above the threshold, for every 3 dB the original level increases, the compressor will only allow the output to increase by 1 dB. A ratio of 3:1 or 4:1 is usually a good starting point; it’s usually gentle enough that the gain-reduction will not be noticeable. Most compressors have a meter that indicates gain-reduction, so watch that meter, and turn the threshold value down, until you start to see gain reduction of a few dB. Taking 3-4 dB off of peaks is a good rule of thumb for gentle compression, and will go a long way toward evening-out the typical spoken voice. There are no strict rules about how much compression is proper; the best thing to do is to listen. Too much compression will make the sound levels pump up and down, or accentuate low-level sounds and breaths in an unnatural way. If it starts sounding weird, raise the threshold setting until it sounds better.
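The ratio and threshold controls boil down to a small piece of arithmetic. Here is a minimal Python sketch of that steady-state math, assuming a simple "hard knee" compressor and ignoring attack and release times; the function names and the example settings are hypothetical, not taken from any particular plug-in.

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Output level of a simple hard-knee compressor, before any make-up gain."""
    if input_db <= threshold_db:
        return input_db  # below the threshold, the signal is left alone
    # Above the threshold, every `ratio` dB of input yields only 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio

def gain_reduction_db(input_db, threshold_db, ratio):
    """What the gain-reduction meter would show for this input level."""
    return input_db - compressed_level_db(input_db, threshold_db, ratio)

# Hypothetical settings: threshold at -18 dBfs, ratio 3:1.
threshold = -18.0
ratio = 3.0

for level in (-24.0, -18.0, -12.0):
    out = compressed_level_db(level, threshold, ratio)
    reduction = gain_reduction_db(level, threshold, ratio)
    print(f"in {level} dBfs -> out {out} dBfs (gain reduction {reduction} dB)")
```

With these settings a peak at -12 dBfs is 6 dB over the threshold, so it comes out only 2 dB over, at -16 dBfs. That is 4 dB of gain reduction, right in the gentle range described above, while anything at or below -18 dBfs passes through unchanged.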

Limiting is simply compression using a very high ratio, perhaps 10:1 or higher. This is too severe to use at low thresholds, but when applied to the very top of the audio range, to treat the very highest peaks, a limiter can control short transient spikes without sounding unnatural. Percussive sounds, an explosively-loud laugh, yelling, and other short sounds that are much louder than the surrounding audio can often be brought under control with careful limiting.

Other tools.

Levelator image
Levelator

Normalizing, automating levels, compressing and limiting, checking meters to see if the levels are hitting your targets…it’s a lot of work! If only someone could develop an application to do all of that automatically…Well, someone did and it’s called: The Levelator.

Even better, it’s free to download and use (although they’d certainly appreciate a donation if you end up using it). The Levelator only works on .wav or .aiff files, and has been designed to work best on voice. It usually does NOT work well on a mixed piece with music or ambience, but it can work magic on an unaccompanied voice or series of voices. It uses a complicated array of processes that somehow create a very even output loudness from an original source with widely divergent levels. For better or worse, it’s dead simple to use: there are no controls, just drop your .wav or .aiff file on its icon and let it run. I’ve had some recordings come out sounding over-processed and unnatural, but most come out very clean. This is an especially good tool for podcasters, particularly those who integrate several voices.

Pre-Levelated image
Pre-Levelated
Listen to “Conversational Levels”
Levelated image
Levelated
Listen to “Conversational Levels (levelated)”

In a different way, the editing software called Hindenburg Journalist can achieve a similar result. The program has good meters that can display several common level standards, and features the additional attribute of assessing the average level of each sound clip that’s imported into a track, and adjusting it automatically so that it meets a predetermined level standard. The program does a remarkably good job of making each clip have the same apparent volume as the others. What is usually a tedious task of adjusting each clip so that it bounces the meters in the same way is done automatically for you. Of course one may still need to make some tweaks here and there, but for the most part, a lot of the mixing is done by the program. Hindenburg Journalist also features a “Voice Profiler” which can automatically apply compression and EQ to a voice track, based on an analysis of the contents of that track. The program also includes a very simple, but great-sounding, compressor plug-in that can be used on any track.

AudioLeak image
AudioLeak

No matter what audio workstation you use, and which tools you employ, verifying that you got it right, that your average levels and peaks are in the right range, can be difficult. AudioLeak is free software for the Mac that can analyze many aspects of a sound, including the crucial RMS “leq” value, which calculates average levels over extended periods of time.

http://www.channld.com/audioleak/
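If you want a rough sanity check of your own, the same kind of measurement can be sketched in a few lines of Python. This is only an approximation of what a real analyzer does: it assumes samples between -1.0 and 1.0, takes one unweighted RMS reading over the whole clip, and checks it against the -15 dBfs average and -3 dBfs peak targets used in this article for ContentDepot and PRX. The 2 dB tolerance is my own assumption for illustration, and meters differ in how they reference RMS, so do not expect the numbers to match every meter exactly.

```python
import math

TARGET_AVERAGE_DBFS = -15.0
MAX_PEAK_DBFS = -3.0
AVERAGE_TOLERANCE_DB = 2.0  # assumption for illustration, not part of any spec

def level_report(samples):
    """Peak and whole-clip RMS levels in dBfs (assumes the clip is not pure silence)."""
    peak = 20 * math.log10(max(abs(s) for s in samples))
    rms = 20 * math.log10(math.sqrt(sum(s * s for s in samples) / len(samples)))
    return {
        "peak_dbfs": peak,
        "average_dbfs": rms,
        "peak_ok": peak <= MAX_PEAK_DBFS,
        "average_ok": abs(rms - TARGET_AVERAGE_DBFS) <= AVERAGE_TOLERANCE_DB,
    }

# Hypothetical test signal: one second of a 440 Hz tone peaking at -12 dBfs.
amplitude = 10 ** (-12.0 / 20)
tone = [amplitude * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]

report = level_report(tone)
print("peak:   ", round(report["peak_dbfs"], 1), "dBfs, ok:", report["peak_ok"])
print("average:", round(report["average_dbfs"], 1), "dBfs, ok:", report["average_ok"])
```

A steady tone is the easy case: its RMS level sits about 3 dB below its peak. Real speech and music have a much wider gap between average and peak, which is exactly why both numbers need checking.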

By using a good meter, and trusting your ears, and applying some of the techniques here, you can output a final mix that will sit well next to other professional productions. Stations will thank you, and listeners will thank you. There are ongoing proposals for new standards for audio levels, that extend from commercial broadcasting, through the record industry, and the film world as well, and so there may be new expectations, and tools to help get there. The push is for greater dynamic range on recorded media, but that’s unlikely to penetrate too deeply into the broadcast world; too many people are listening to the radio, and TV, with lots of background noise, so it’s unlikely that the audience would appreciate quieter voices and louder explosions, at least not on their radios.

The beautiful thing about standards is that you can have so many of them, so be sure to find out what audio levels are expected by the consumers of your productions. Perhaps one day there will only be a few established standards, but for now, it’s a bit unsettled. In the interim, remember, submissions to ContentDepot or PRX should be mixed with average levels at –15 dBfs, peaks no higher than –3 dBfs.

Happy mixing!

Resources and References.

What is a decibel?

http://www.animations.physics.unsw.edu.au/jw/dB.htm
http://en.wikipedia.org/wiki/Decibel#Acoustics
http://en.wikipedia.org/wiki/VU_meter
http://en.wikipedia.org/wiki/Peak_programme_meter
http://www.moultonlabs.com/more/hearing_the_louds_and_the_softs_of_it/
http://www.digido.com/level-practices-part-2-includes-the-k-system.html

Jeff Towne

About
Jeff Towne

Jeff Towne has been producing radio programs since he was a teenager, back then with a portable Marantz cassette deck and a Teac four-track reel-to-reel tape recorder, and now with digital recorders and computer workstations. After honing his broadcasting skills at High School and College radio stations, Jeff has spent over two decades as the producer of the nationally-syndicated radio program Echoes. At Echoes, he has done extensive recording of interviews and musical performances, produced documentary features, and prepared daily programs for satellite and internet distribution. As Transom.org's Tools Editor, Jeff has reviewed dozens of audio recorders, editing software, and microphones, and written guides for recording, editing and mixing audio for radio and the web. Jeff has also taught classes and presented talks on various aspects of audio production. When not tweaking audio files, Jeff can probably be found eating (and compulsively taking pictures) at that little restaurant with the unpronounceable name that you always wondered about.

Comments

  • Henry Howard

    5.19.11

    What a great summary Jeff.

    One technique that I use when I open a raw session or field file is to scan through quickly for the one to three cycle spikes. My software has a tool for that.

    I then reduce that single cycle back down to average.

    Many times that spike is a mouth noise, tap or plosive. Once these spikes are removed, then running an rms peak normalize will bring the file up to a much better level without applying any compression.
    Very few if any limiters will catch a one to three cycle spike consistently.

  • Jeff Towne

    5.19.11

    That’s a great technique Henry! I’ll do a similar thing, usually manually though. I’ll just eyeball the file, find the crazy out-of-the-ordinary peaks, and either just cut them out, or compress them with a “destructive” process: if I’m in Pro Tools, I’ll select just the problematic peak, and apply an AudioSuite compressor or Limiter to bring that spike down to a more reasonable level. One can often get away with fairly severe compression on a single spiky peak. I’m curious, what process, in what program, are you using to do RMS normalization? Thanks!

  • Jay Allison

    5.19.11

    Henry and Jeff, what tools do you use for finding offending spikes and reducing them? Quickly.

  • Stephan Bisaha

    5.20.11

    Hello Jeff,

    I cannot tell you how helpful this article is for me. I am a college student and between audio classes in the Music and Communication departments, as well as working at our radio station, I still did not have a good grasp on how to properly manage levels until now. I very much look forward to putting these tips to good use. Thank you again!

  • Jeff Towne

    5.20.11

    Jay – I just eyeball the peaks, I find it easy to see the problematic ones. If there are extremely brief spikes that aren’t apparent to the naked eye, those just get caught by the L1 limiter that I have inserted on either the track, or the master track, or both. Henry mentioned that his software would find peaks, I hope he’ll describe that process. Within Pro Tools, it’s a little tedious, but there is a mode you can set that will jump from transient to transient with the tab key (amazingly enough, called “Tab to Transients.”) It’s the little symbol under the tool icons, in the black band near the top of the edit window, to the left of the a…z. If it’s UN-highlighted, the tab key jumps from edit to edit. If it’s highlighted by a blue border, the tab key jumps from transient to transient. But there’s no preference for especially high transients, so that won’t be especially quick, you’ll end up scrolling across a lot of peaks.

    Once an especially big peak has been found, if it’s not being handled by a compressor or limiter on the track, I’ll click-and-drag to highlight it, then apply an AudioSuite compressor. That will write a new sound file with less dynamic-range, and paste it over the original, in sync. It might take a couple of tries to find the right settings in the compressor, much will depend on the overall levels of the surrounding audio, but if you click the preview button on the AudioSuite compressor, and watch to see the gain-reduction meter indicate that it’s taking off a few dB, that will give you a sense.

    Or you can just apply the effect, listen, then hit un-do if you don’t like the result, then try again with new settings. I usually highlight a little extra time before and after the peak. That usually makes for a smooth transition, because the compressor isn’t actually going to be doing anything in the quieter areas before and after the peak, so the edges of the processed section and the unprocessed section should be smooth. If the transitions are creating clicks or thumps, try adjusting the threshold setting on the compressor plug-in, so that the compressor does not engage until the signal gets louder. DO NOT use make-up gain, the idea in this circumstance is to only bring the peak down in volume.

    If the peak is still getting through, adjust the threshold down so the compressor engages at a lower volume, or increase the ratio setting so the compression is more severe, or reduce the attack time, so that the compressor engages more quickly.

    And Stephan: thanks for the kind words, I hope the article ends up being helpful!

  • Gary Lerude

    5.20.11

    Great tutorial, most helpful by going beyond the theory to practical techniques. I wasn’t aware of the -3/-15 dBfs standard and will adopt that for future episodes of my podcast.

  • Rich Halten

    5.20.11

    Thanks for the terrific 411, Jeff. Two quick questions:

    1. I knock down offending peaks in PT by reducing the level w/ the volume plug-in. Does the compressor do a better job of that?

    2. Would it be overkill to use the Waves L1 to normalize and maximize loudness (it does both, right?) on individual voice tracks AND the Massey L2007 on the Master track?

  • Alain B

    5.28.11

    Thank you for a very useful and quite interesting article.
    Answered many questions.
    I just finished recording a lot of kids and they tend to speak very soft and/or loud.
    I mostly used my ears.
    After reading your article I will still be using my ears but now I know what to look for with the meter !

  • Jeff Towne

    6.05.11

    Rich: the compressor usually does a smoother job of knocking-down peaks than a simple volume adjustment, because it will do more or less gain-reduction depending on the original audio. If you just reduce the gain on a region, you’ll have volume dips at the edges of the affected region that may sound bumpy. Of course you can always make crossfades in and out of the gain-reduced region, which might help, or in some cases you just might not be able to hear the transitions, even if they’re not perfect. But if you have your compressor’s threshold set correctly, and you select a little before and after the problematic spike, you should have a short period of the compressor NOT doing anything, then reducing the gain as the level crosses the threshold, then again, doing nothing at the end, which should result in a smooth, undetectable peak-reduction.

    That said, some audio peaks are hard to tame with a compressor, they may sneak past even very low attack values, or cause weird pumping, regardless of the release setting, and so, sometimes, just reducing the volume will work better. If you get clicks or thumps, try crossfading the edges.

    And no, it’s not overkill to use limiters on both individual tracks and the master track, in fact it’s a very good idea, I do that most of the time. I have either an L1 or that Massey L2007 on the master fader, and individual voice tracks will almost always have a Waves Renaissance Compressor on them, and maybe an L1 if they’re still a little spiky.

  • Jeff Towne

    6.05.11

    Gary: thanks for the kind words, I do hope these tips are helpful. I don’t think it’s a bad idea to use the same levels as if you were making a program for broadcast, but it’s worth keeping in mind that there really isn’t any accepted level standard for podcasts. It depends a lot on what your content is, if it’s all voice, you may try to push your averages a little hotter, maybe toward -12 dBfs, and as long as your peaks stay under 0 dBfs, you’re OK. Just remember that most things sound better with a bit of dynamic range, so I wouldn’t recommend squeezing it to have much louder average levels than that. Probably the most important thing about a podcast is to be internally even, and to be consistent from program to program, so that people listening to one or several of your episodes in a row aren’t jolted by big changes in volume, either within a show or among several of them. Good luck!

  • Jay Allison

    6.23.11

    Jeff, in Pro Tools the signal generator gives the option of Peak and RMS. If you set it to -15 as you suggest to establish zero and unity, which mode would you use? Peak is a little lower.

    Also, some stations still broadcast in mono. When they sum both sides of your perfectly leveled mix, are they amping up the combined signal strength to the point that it might overload?

    Finally, have you tried the Waves L2 limiter yet? It comes with the new Broadcast and Production bundle. I wonder if it’s a dramatic improvement.

  • Keith Sjoquist

    9.22.11

    Hi,

    It looks like I missed most of the chatter on this, but I wanted to chime in and say what a terrific article this was. I work in Video Game Audio and the level standards in my world are likely even more disparate than in broadcast. It was encouraging to see you outline many of the practices I’ve adopted over time by trial and error. You did an excellent job of articulating the nuts and bolts of why audio leveling is difficult and important and how to use your tools to wrangle it effectively. I wholeheartedly agree with the overall theme of the article: Calibrate first, listen second, process third fourth or not at all.

    Thanks (Jeff) for this article, and thanks (Transom) for providing such excellent content.

  • Shawn

    2.18.12

    Just came across this. Great article!

  • Simon Bradley

    4.17.12

    Brilliant article – very clear and thorough. But I have a question relating to the use of the AudioLeak analyser. Regarding the -15 dBfs target: do we go with the A-weighted L+R reading or the unweighted?

  • Sir Kelz

    8.05.12

    I have been searching for days for a summary on audio standards for broadcast to help me in my final mix for a music TV program. This article has answered all my questions and has even given me tips on how to improve my mix. Thanks a lot Jeff, nice work. Permit me to use this material if I want to put it up as an article on my blog. I will definitely include your name and a link to this site, that’s if you are okay with it. God bless.

  • Jack street

    1.11.13

    There is a sound quality I hear in professionally produced talk radio clips that seems to go beyond what is being discussed here. It’s hard to describe, but it sounds rich, resonant and warm. What other processes are typically done that might achieve this sound?
