Directional audio coding


Directional audio coding (DirAC) is a technique for various tasks in spatial sound reproduction. It builds on Spatial Impulse Response Rendering (SIRR), sharing the same principles and partly the same methods. The processing can be divided into three steps:

Analysis: the sound signals are divided into frequency bands using a filterbank or the short-time Fourier transform (STFT). The diffuseness and direction of arrival of sound are then analyzed in each frequency band as a function of time.

Transmission: a mono channel is transmitted together with the directional information or, in applications targeting the best quality, all recorded channels are transmitted.

Synthesis: the sound in each frequency band is first divided into diffuse and non-diffuse streams. The diffuse stream is then reproduced using a method that produces a maximally diffuse perception of sound, and the non-diffuse stream is reproduced with a technique that produces as point-like a perception of the sound source as possible.

Synthesis can be implemented in various ways, depending on microphone technique, transmission type, and reproduction system.
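The analysis step and the split into two streams can be illustrated with a minimal sketch. It is one possible formulation, not necessarily the one used in the demos. It assumes B-format input (W, X, Y, Z) already transformed to the STFT domain, a dipole gain of sqrt(2) relative to W, an energy-preserving square-root split of the streams, and plain block averaging over a short window. The function names `analyze_band` and `split_streams` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def analyze_band(w, x, y, z):
    """Estimate direction of arrival and diffuseness for one frequency band.

    w, x, y, z: equal-length sequences of complex STFT values from one band,
    covering the short time window over which the averages are taken.
    Assumption: the dipole signals x, y, z carry a gain of sqrt(2) relative to w.
    """
    n = len(w)
    # Time-averaged intensity-like vector Re{conj(W) * [X, Y, Z]}.
    # With this convention it points towards the sound source.
    ix = sum((wi.conjugate() * xi).real for wi, xi in zip(w, x)) / n
    iy = sum((wi.conjugate() * yi).real for wi, yi in zip(w, y)) / n
    iz = sum((wi.conjugate() * zi).real for wi, zi in zip(w, z)) / n
    # Time-averaged energy estimate.
    energy = sum(
        abs(wi) ** 2 + (abs(xi) ** 2 + abs(yi) ** 2 + abs(zi) ** 2) / 2.0
        for wi, xi, yi, zi in zip(w, x, y, z)
    ) / n
    azimuth = math.atan2(iy, ix)
    elevation = math.atan2(iz, math.hypot(ix, iy))
    if energy <= 0.0:
        # Silent band: no directional information, treat as fully diffuse.
        return azimuth, elevation, 1.0
    norm = math.sqrt(ix * ix + iy * iy + iz * iz)
    diffuseness = 1.0 - SQRT2 * norm / energy
    # Clamp against rounding errors.
    return azimuth, elevation, min(1.0, max(0.0, diffuseness))


def split_streams(value, diffuseness):
    """Split one time-frequency value into (non-diffuse, diffuse) parts.

    Assumption: square-root weights, so the two parts together carry the
    same energy as the input value.
    """
    return (math.sqrt(1.0 - diffuseness) * value,
            math.sqrt(diffuseness) * value)
```

Under these assumptions a single plane wave gives a diffuseness close to zero, so nearly all of its energy goes to the non-diffuse stream, while a sound field whose averaged intensity vanishes gives a diffuseness of one.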

Example applications for DirAC

Reproduction of B-format recordings. Demos are available for a 5.0 loudspeaker setup. Traditionally, B-format recordings are reproduced using, e.g., Ambisonics, which produces coherent loudspeaker signals. This results in a blurred spatial image and a small optimal listening area. In DirAC, the coherence can be avoided in both diffuse and non-diffuse reproduction, which produces less blurring and a larger listening area.

Transmission of spatial information as a side band to a mono signal in teleconferencing. Demos are available for a 5.0 loudspeaker setup. The microphone setup is a custom B-format microphone composed of four miniature capsules. Sound is transmitted as a mono signal, with a narrow side band containing the azimuth direction for each frequency band as a function of time.
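How narrow such a side band can be depends on how coarsely the azimuth is quantized. The following sketch shows one way to do this with a uniform quantizer. The 6-bit default, the function names, and the rate formula are illustrative assumptions and are not taken from the teleconferencing system described here.

```python
import math


def quantize_azimuth(azimuth, bits=6):
    """Map an azimuth in radians to an integer index (uniform quantizer)."""
    levels = 1 << bits
    step = 2.0 * math.pi / levels
    return int(round((azimuth % (2.0 * math.pi)) / step)) % levels


def dequantize_azimuth(index, bits=6):
    """Map an index back to an azimuth in the range (-pi, pi]."""
    levels = 1 << bits
    azimuth = index * 2.0 * math.pi / levels
    return azimuth - 2.0 * math.pi if azimuth > math.pi else azimuth


def side_info_bitrate(num_bands, frames_per_second, bits=6):
    """Raw side-information rate in bits per second, without entropy coding."""
    return num_bands * frames_per_second * bits
```

With this scheme the worst-case direction error is half a quantization step, so each added bit halves the error and adds one bit per band and frame to the side band.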

The teleconferencing demos also show the application of DirAC as a new type of directional microphone for noisy recording environments. This is implemented by reproducing only the sound coming from the direction of the speech source. Even as the SNR decreases from 0 to -25 dB, speech remains somewhat intelligible, although the reproduced speech signal contains a lot of distortion.
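Reproducing only the sound from one direction can be sketched as a gain applied to each time-frequency value according to its analyzed azimuth. The raised-cosine window, the `beam_width` parameter, and the function names below are assumptions made for illustration; the demos may use a different weighting.

```python
import math


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + math.pi) % (2.0 * math.pi) - math.pi)


def directional_gain(azimuth, target_azimuth, beam_width):
    """Gain of 1 at the target direction, falling smoothly to 0 at the beam edge."""
    half_width = beam_width / 2.0
    d = angular_difference(azimuth, target_azimuth)
    if d >= half_width:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * d / half_width))


def apply_directional_filter(values, azimuths, target_azimuth, beam_width):
    """Weight each time-frequency value by the gain for its analyzed azimuth."""
    return [
        directional_gain(az, target_azimuth, beam_width) * v
        for v, az in zip(values, azimuths)
    ]
```

Because the gain changes from one time-frequency value to the next, this kind of filtering can suppress noise from other directions, but it also modulates the signal, which is consistent with the distortion noted above.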

Upmixing of stereo files to multichannel files. The stereophonic file is played back in simulated anechoic conditions and recorded with a simulated B-format microphone. The resulting sound can then be decoded for arbitrary reproduction systems.

Some selected publications	Short description
Laitinen MV, Kuech F, Disch S, and Pulkki V. "Reproducing Applause-Type Signals with Directional Audio Coding." J. Audio Eng. Soc., 59(1/2), 2011.	Surrounding applause-type signals are very difficult for many parametric spatial audio reproduction methods. This article shows that such coding is nevertheless possible, although the processing requires a very fine time-frequency resolution.
Vilkamo J, Lokki T, and Pulkki V. "Directional audio coding: Virtual microphone-based synthesis and subjective evaluation." J. Audio Eng. Soc., 57(9), 2009.	The use of virtual microphones in DirAC processing is presented, and extensive listening tests show that the quality produced by DirAC is very good.
Pulkki V, Laitinen MV, and Erkut C. "Efficient spatial sound synthesis for virtual worlds." AES 35th International Conference, London, UK, February 11-13, 2009.	The use of DirAC in virtual world audio rendering is shown. DirAC can be used to position virtual sources, to control the spatial extent of the sources, and to provide reverberation efficiently. Recorded spatial sound scenes can also easily be augmented with virtual sources.
Laitinen MV and Pulkki V. "Binaural reproduction for directional audio coding." IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, USA, October 18-21, 2009.	It is shown that DirAC provides a well-externalized perception of spatial sound when head tracking is used in headphone listening.
Pulkki V. "Directional audio coding in spatial sound reproduction and stereo upmixing." AES 28th International Conference, Piteå, Sweden, June 2006.	The use of DirAC in high-fidelity reproduction of B-format recordings is presented, together with the idea of using DirAC in stereo-to-multichannel upmixing.
Pulkki V and Faller C. "Directional audio coding: Filterbank and STFT-based design." 120th AES Convention, Paris, France, May 20-23, 2006. Audio Engineering Society. Paper 6658.	The use of DirAC in teleconferencing is presented, with some discussion of the selection of time-frequency analysis methods for different applications.

http://www.acoustics.hut.fi/research/cat/sirr/
Modified: 23.5.2011