- Many people do not like to wear headphones. Even lightweight cordless headphones are cumbersome. The headphones that are best acoustically can be uncomfortable to wear for long periods of time. They also attenuate external sounds and socially isolate the user.
- Headphones can have notches and peaks in their frequency responses that resemble pinna responses. If uncompensated headphones are used, elevation effects can be severely compromised.
- Sounds heard over headphones often seem to be too close. Indeed, the physical source actually is very close, and the compensation needed to eliminate the acoustic cues to its location is sensitive to headphone position.

Loudspeakers circumvent most of these problems, but it is not obvious how one can use loudspeakers to deliver binaural sound. One solution is a technique called crosstalk-cancelled stereo (or transaural stereo).

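The idea is simply expressed in the frequency domain. In the arrangement shown above, signals S1 and S2 drive the loudspeakers. The signal Y1 reaching the left ear is a mixture of S1 and the "crosstalk" from S2. To be more precise, Y1 = H11 S1 + H12 S2, where H11 is the HRTF between the left speaker and the left ear and H12 is the HRTF between the right speaker and the left ear. Similarly, Y2 = H21 S1 + H22 S2, where H21 and H22 are the HRTFs from the left and right speakers to the right ear. If we were allowed to use headphones, we presumably would know the desired signals Y1 and Y2 at the ears. The problem is to find the proper signals S1 and S2 to create these desired results. Mathematically, this merely requires inverting the equations:

$$
\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
=
\begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}
\begin{bmatrix} S_1 \\ S_2 \end{bmatrix}
\quad\Longrightarrow\quad
\begin{bmatrix} S_1 \\ S_2 \end{bmatrix}
=
\frac{1}{H_{11} H_{22} - H_{12} H_{21}}
\begin{bmatrix} H_{22} & -H_{12} \\ -H_{21} & H_{11} \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
$$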
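
A minimal numerical sketch of this inversion, solved independently at each frequency bin with NumPy. The function name `crosstalk_cancel`, the array layout, and the regularization weight `beta` are assumptions made for illustration and do not come from the original text. Setting `beta = 0` gives the plain inverse; a small positive `beta` (Tikhonov regularization, a technique swapped in here) keeps the solve well behaved at frequencies where the matrix is close to singular.

```python
import numpy as np


def crosstalk_cancel(H, Y, beta=1e-3):
    """Find speaker spectra S such that H @ S approximates Y in every bin.

    H    : complex array, shape (n_bins, 2, 2); H[k] = [[H11, H12], [H21, H22]]
    Y    : complex array, shape (n_bins, 2); desired ear spectra [Y1, Y2]
    beta : hypothetical regularization weight; 0 gives the exact inverse
    """
    H = np.asarray(H, dtype=complex)
    Y = np.asarray(Y, dtype=complex)

    # Hermitian (conjugate) transpose of each 2x2 matrix.
    Hh = np.conj(np.swapaxes(H, -1, -2))

    # Regularized normal equations: (H^H H + beta I) S = H^H Y.
    A = Hh @ H + beta * np.eye(2)
    b = Hh @ Y[..., None]  # column vectors, shape (n_bins, 2, 1)

    # np.linalg.solve handles the whole stack of 2x2 systems at once.
    return np.linalg.solve(A, b)[..., 0]


# Usage with one made-up frequency bin: direct paths 1.0, crosstalk paths 0.4.
H_demo = np.array([[[1.0, 0.4], [0.4, 1.0]]])
Y_demo = np.array([[1.0, 0.0]])  # sound wanted at the left ear only
S_demo = crosstalk_cancel(H_demo, Y_demo, beta=0.0)
```

In the usage example the right speaker is driven with a negative signal, which is what cancels the left speaker's crosstalk at the right ear. Real HRTFs would replace the made-up values, one 2x2 matrix per frequency bin.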

In practice, inverting the matrix is not trivial:
- At very low frequencies, all of the transfer functions are identical (why?), and thus the matrix is singular. (Fortunately, in reverberant environments low-frequency information is not very important for localization.)
- An exact solution tends to produce very long impulse responses. This problem becomes more and more severe the further the direction to the desired source is from the line between the two loudspeakers.
- The result will depend on where the listener is relative to the speakers. (Proper effects are obtained only near the so-called "sweet spot," the assumed listener location used when the equations are inverted.)

Done carefully, crosstalk-cancelled stereo can be quite effective, producing elevation as well as azimuth effects. The phantom source can be placed significantly outside of the line segment between the two loudspeakers. However, since crosstalk-cancelled stereo still needs binaural signals, we shall confine our remaining observations to headphone systems.