Detailed instructions for use are in the User's Guide.
How to Choose an Acoustic Echo Canceller
Application Note Polycom Installed Voice Business Group
September 2004
Introduction
Acoustic echo cancellers (AEC) greatly enhance the audio quality of a multipoint hands-free communications system. They allow conferences to progress more smoothly and naturally, keep the participants more comfortable, and prevent listener fatigue. An AEC solution that is poorly designed or inappropriate for the location will not provide these benefits and can even degrade audio quality significantly. There are many AEC solutions available, ranging in price from a few dollars to several thousand dollars. There is also a broad range in quality and performance. This paper outlines some guidelines for determining the performance needed in a given location, judging the performance of an AEC solution based on specifications and listening tests, and ultimately finding the best AEC solution for any application. Presented first, however, are guidelines for verifying that an acoustic echo canceller is necessary in an application.

Why AEC?
Acoustic echo cancellation is useful in any hands-free telecommunications situation involving two or more locations. Acoustic echo is most noticeable (and annoying) when delay is present in the transmission path. This happens primarily in long-distance circuits and in systems that use speech compression (such as videoconferencing or digital cellular phones). Even though the echo might not be as annoying when there is no delay (such as with short links between conference rooms in the same building, or distance learning over fiber-optic cable), room acoustics will still affect the sound and may hamper communication. In addition, howling can occur whenever the microphone is positioned too close to the speaker, with or without transmission delay; most acoustic echo cancellers eliminate it.

Acoustic echo cancellers can be used in both narrowband (3.5 kHz) and wideband (7 kHz) conferencing systems. Narrowband applications include teleconferencing and low bit-rate videoconferencing.
Wideband applications include high-quality teleconferencing and videoconferencing, as well as distance learning. Users of wideband conferencing systems should be particularly interested in acoustic echo cancellation, because it allows them to make the most of the additional audio capabilities of their systems.

The primary beneficiaries of an echo canceller are the people at the far (or remote) end of the transmission path. The near (or local) echo canceller prevents the remote people's voices from being returned (i.e., echoed) to them through the audio system. People speaking on the same (local) end as the AEC should not notice it if it is doing its job properly. Although the people on the far end receive the direct benefit of better audio quality, the AEC also lets the conversation flow more smoothly, which benefits both parties.
[Figure 1 diagram. Labels: Near End, Far End; Mic, Speaker; AEC with Nonlinear Processing; transmit (XMT) path: near speech + echoes + distortion; receive (RCV) path: far speech.]
Figure 1: Illustration of the effects of AEC operation and room acoustics on the transmitted speech. The far speech that travels through the receive path is not modified as it passes through the AEC. In an echo canceller that is poorly designed, there may be residual echoes as well as distortion added to the near speech signal (these effects are described in detail later). This degrades the speech that is transmitted, so that the poor audio quality is noticed on the far end.
Why not just use a speakerphone?
Speakerphones are half-duplex, which means that only one person can talk at a time. The speakerphone determines which side is active (or louder) by comparing the signal levels on both sides. It turns off the other side until the louder side is finished. Once one side has "captured" the circuit, most speakerphones do not permit any sort of interruption. This inhibits the natural flow of conversation, especially if one party is long-winded.

Acoustic Echo Cancellation vs. Line Echo Cancellation
Acoustic echo cancellation and line echo cancellation address similar problems and are often based on the same technology. However, a line echo canceller generally cannot replace an acoustic echo canceller, because acoustic echo cancellation is a more difficult problem. With line echo cancellation there are generally only one or two reflections from telephone hybrids or impedance mismatches in the telephone line. These echoes are usually delayed by less than 32 ms, and do not change very
frequently, if at all. With acoustic echo cancellation, the echo path is very complex (dozens or hundreds of reflections), lasts 100-200 ms, and can vary continuously during a conversation as people move around the room. Acoustic echo cancellers are therefore much more complicated devices. While line echo cancellers may have smaller price tags, they can't perform under the conditions that acoustic echo cancellers can handle.

Steps to Choosing an AEC
Now that the need for an acoustic echo canceller is recognized, the best AEC solution for the application is determined using the following four-step selection process.

· Find AEC products with the features and form factor needed for the application. There may be several acceptable form factors. Even if one seems particularly suited to the application, consider all of them for a broad selection of price and performance.
· Eliminate products that don't meet G.167 or the tail length requirements of the application. Although these two factors are necessary, they are not sufficient. However, if an AEC solution does not meet these requirements, it will most likely not sound very good at all, so don't waste time arranging listening tests. If it does meet these requirements, further testing and evaluation should be done to ensure that it is appropriate for the application. When possible, find out the testing environment as well as the results of the G.167 testing.
· Judge audio quality and state machine performance by comparative listening. A panel of several people should listen to the different solutions (preferably the same people, under similar conditions, during a short time span). They should listen for the common problems echo cancellers may have, as well as overall quality.
· Choose the best solution. Weigh the performance, price, and convenience of each solution, and choose the one that will work best in the application.
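To give a rough sense of scale for the tail lengths mentioned above, the following sketch converts an echo tail length into the number of filter coefficients (taps) a time-domain adaptive filter would need to span it. The sampling rates (8 kHz for narrowband, 16 kHz for wideband) and the helper name `taps_for_tail` are assumptions made for illustration; they are not taken from this paper, and real products may use other filter structures.

```python
def taps_for_tail(tail_ms, sample_rate_hz):
    """Number of filter taps needed to span tail_ms at the given sampling rate.

    Assumes a simple time-domain FIR adaptive filter with one tap per sample.
    """
    return tail_ms * sample_rate_hz // 1000


# Assumed sampling rates (not stated in this paper).
NARROWBAND_HZ = 8000   # commonly used for roughly 3.5 kHz audio bandwidth
WIDEBAND_HZ = 16000    # commonly used for roughly 7 kHz audio bandwidth

cases = [
    ("line echo, 32 ms, narrowband", 32, NARROWBAND_HZ),
    ("acoustic echo, 100 ms, narrowband", 100, NARROWBAND_HZ),
    ("acoustic echo, 200 ms, narrowband", 200, NARROWBAND_HZ),
    ("acoustic echo, 200 ms, wideband", 200, WIDEBAND_HZ),
]

for label, tail_ms, rate in cases:
    print(f"{label}: {taps_for_tail(tail_ms, rate)} taps")
```

Under these assumptions, a 32 ms line echo tail needs 256 taps, while a 200 ms acoustic tail needs 1,600 taps at 8 kHz and 3,200 taps at 16 kHz. This is one way to see why a line echo canceller generally cannot stand in for an acoustic one.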
Step 1: Find AEC solutions with the features and form factor needed for the application.

Features
Certain features may be desirable for certain applications. For example, wide bandwidth may be a necessity for videoconferencing or high-quality audio conferencing systems. For integrated systems, the number and quality of microphones and speakers will be an issue. Automatic control of microphone and speaker levels may be desirable. A graphical user interface (perhaps through a connection to a Windows machine) may be needed. These kinds of features are too varied to be discussed in detail in this paper, but will certainly be a consideration in the selection of an echo cancellation solution.
Form Factor
The form factor of the solution is very important because it determines how useful it is in an application. The performance of the product may not matter if its form factor makes it inconvenient or impossible to use in the desired application. Of course, there can usually be some flexibility in choosing a form factor. Licensing an algorithm or buying chipsets may both be acceptable to an OEM (although one may be more convenient), but a complete AEC solution would be out of the question for the OEM. (Indeed, that may be what the OEM is using the chipset to build!) A list of common form factors for echo cancellation follows.

AEC for OEMs
The appropriate form factor depends on the volume an OEM plans to produce. Although an off-the-shelf solution may be priced higher, at small volumes a total solution decreases development costs while reducing time-to-market. These are trade-offs to be weighed when making this choice:

· Modules are suitable for moderate to high volume products and can speed the process of moving a product to market. They provide full functionality and quick integration into a design. They can save a great deal of resources during the design process, and can provide a value-added feature to systems that may be used in a variety of applications.
· Chipsets are best for high to very high volume products. They allow tighter integration into a board, but require more effort for the board design.
· Algorithms are best for very high volume products, especially embedded applications that are sensitive to size and power consumption. Algorithms provide the opportunity to use the processor for multiple tasks. They also can be ported to other platforms. Although algorithms are the cheapest per unit at very high volumes, they require the most system integration work. This includes the supporting code, software interfacing issues, and integration with other resources.
AEC Solutions for Integrators and End Users
Typically, a conferencing application will require one AEC per location. Depending on the size of the room and other factors (such as the number of participants in each room), an AEC solution (packaged product) is available in a number of forms. The typical forms are:

· AEC only (standalone AEC) - this is the least expensive to implement in a system but requires the integrator or customer to supply all external equipment (such as microphones, amplifiers, and speakers) for moving audio in and out of the product.
· AEC for medium to large rooms - these products may contain microphone inputs and record inputs / outputs in addition to the standard audio inputs and outputs required for AEC operation.
· AEC for videoconferencing - these products may contain multiple inputs and outputs, or incorporate "phone add" modules to permit the addition of a 2-wire conference (telephone call into the videoconference).
Step 2: Eliminate the products that don't meet G.167 or the tail length needs of the application.

It is relatively easy to determine how well an AEC cancels echoes. Most AEC products are based on the same algorithm: the adaptive LMS (least mean squares) digital filter. This is a very well-defined algorithm that has been used for years. Since this process is well established, it is fairly easy to determine whether a manufacturer has done an adequate job of implementing it. The performance of the AEC can basically be judged by two criteria.

· First, the product must be compatible with the ITU G.167 recommendation for AEC.
· Second, the AEC must have an adequate tail length for the environment it is to be used in.

Although these criteria are necessary, they are not sufficient to determine whether an AEC is good enough. There will most likely be several AEC solutions that meet these specifications. These are the specifications that can be compared on paper. What remains are the characteristics that can only be evaluated by comparative listening, and these will make the most difference in how an AEC sounds.

G.167 Compliance or Compatibility
The ITU G.167 Recommendation for Acoustic Echo Controllers gives criteria for a number of performance characteristics typically listed on manufacturers' data sheets. These include such specifications as initial convergence time (or rate of convergence), amount of cancellation, and bandwidth. G.167 compliance is a good indication that the LMS algorithm (the actual echo canceling filter) has been implemented reasonably well. It also means that the manufacturer has subjected the product to a series of standard tests, and that the specifications are most likely based on valid experimental data. This makes the selection process easier, because it sums up many different characteristics. Products can be eliminated easily based on G.167 compliance, rather than by evaluating each performance characteristic individually.
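For readers who want to see what the adaptive LMS filter named at the start of Step 2 does, here is a minimal sketch in plain Python. It is a textbook LMS update on a toy problem, not any vendor's implementation: the six-coefficient "room" response, the step size `mu`, the filter length, and the function names are all hypothetical values chosen for illustration. There is no near-end speech, no nonlinear processing, and no state machine, which are the parts this paper says can only be judged by listening.

```python
import math
import random


def lms_echo_canceller(far, mic, num_taps, mu):
    """Basic LMS adaptive filter: returns (residual signal, final weights)."""
    weights = [0.0] * num_taps      # current estimate of the echo path
    history = [0.0] * num_taps      # recent far-end samples, newest first
    residual = []
    for x, d in zip(far, mic):
        history.insert(0, x)
        history.pop()
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate       # what would be sent to the far end
        for i in range(num_taps):   # LMS update: step along error * input
            weights[i] += mu * e * history[i]
        residual.append(e)
    return residual, weights


def power_db(signal):
    """Mean power of a signal in dB (small floor avoids log of zero)."""
    mean_square = sum(s * s for s in signal) / len(signal)
    return 10.0 * math.log10(mean_square + 1e-20)


# Hypothetical toy echo path and far-end signal (illustrative values only).
random.seed(0)
room = [0.5, -0.3, 0.2, 0.1, -0.05, 0.02]
far = [random.uniform(-1.0, 1.0) for _ in range(4000)]

# Microphone signal = far-end signal convolved with the room response.
mic = [
    sum(room[k] * far[n - k] for k in range(len(room)) if n - k >= 0)
    for n in range(len(far))
]

residual, weights = lms_echo_canceller(far, mic, num_taps=8, mu=0.05)

# Amount of cancellation measured over the last 500 samples.
cancellation_db = power_db(mic[-500:]) - power_db(residual[-500:])
print(f"echo reduced by about {cancellation_db:.0f} dB after convergence")
```

In this idealized setting the filter weights converge to the toy room response and the echo is almost entirely removed. How quickly the residual shrinks and how far it drops are the toy analogues of the convergence time and amount of cancellation that data sheets report; real rooms, with background noise, near-end talkers, and echo paths far longer than eight taps, are much harder.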
When an echo canceller is G.167 compliant, the following specifications commonly found on data sheets have met the requirements of the standard in the room in which the echo canceller was tested:

· Bandwidth
· Weighted Terminal Coupling Loss (or total cancellation)
· Initial Convergence Time (or convergence rate)
· Recovery Time After Echo Path Variation

Since most of the specifications found on data sheets are covered by G.167, it is not important to consider each of these specifications in detail. The manufacturer's equipment should have already been verified to meet the requirements of the standard. If the product exceeds any of the requirements, this may improve the audio quality to some degree. This improvement, however, will not be as significant as the effects of the tail length and state machine. Therefore, all G ...