Directional audio coding
Directional audio coding (DirAC) is a technique for various tasks in spatial sound reproduction. It is based on Spatial Impulse Response Rendering (SIRR): it relies on the same principles and partly on the same methods. The processing can be divided into three steps:

Analysis: the sound signals are divided into frequency bands using a filterbank or the short-time Fourier transform (STFT). The diffuseness and the direction of arrival of sound are analyzed in each frequency band as functions of time.

Transmission: a mono channel is transmitted together with the directional information or, in applications targeting the best quality, all recorded channels are transmitted.

Synthesis: the sound in each frequency band is first divided into a diffuse and a non-diffuse stream. The diffuse stream is reproduced with a method that produces a maximally diffuse perception of sound, and the non-diffuse stream with a technique that produces as point-like a perception of the sound source as possible. Synthesis can be implemented in various ways, depending on the microphone technique, the transmission type, and the reproduction system.

Example applications for DirAC

Reproduction of B-format recordings. Demos are available for a 5.0 loudspeaker setup. Traditionally, B-format recordings are reproduced using e.g. Ambisonics, which produces coherent loudspeaker signals. This results in a blurred spatial image and a small optimal listening area. In DirAC, the coherence can be avoided in both diffuse and non-diffuse reproduction, which produces less blurring and a larger listening area.

Transmission of spatial information as a side band to a mono signal in teleconferencing. Demos are available for a 5.0 loudspeaker setup. The microphone setup is a custom B-format microphone composed of four miniature capsules. Sound is transmitted as a mono signal, with a narrow side band containing the azimuth direction for each frequency band as a function of time.

A new type of directional microphone for noisy recording environments. The teleconferencing demos also show this application of DirAC. It is implemented by reproducing only the sound coming from the direction of the speech source. As the SNR decreases from 0 to -25 dB, speech remains somewhat intelligible, although the reproduced speech signal contains a lot of distortion.

Upmixing of stereo files to multichannel files. The stereophonic file is recorded with a simulated B-format microphone in simulated anechoic conditions. The sound can then be decoded to arbitrary reproduction systems.
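The analysis step and the split into diffuse and non-diffuse streams can be illustrated for a single frequency band with a minimal sketch. It is not the implementation used in the demos. It assumes horizontal-only traditional B-format STFT values (W attenuated by 1/sqrt(2), X and Y as the two horizontal dipole components), a commonly used energetic estimate of direction and diffuseness, and a square-root energy split between the two streams. The names analyze_band, plane_wave and split_streams are hypothetical.

```python
import cmath
import math


def plane_wave(p, azimuth_deg):
    """Hypothetical helper: B-format (W, X, Y) values of a plane wave with
    complex pressure p arriving from azimuth_deg.
    Assumption: traditional B-format, W attenuated by 1/sqrt(2)."""
    az = math.radians(azimuth_deg)
    return (p / math.sqrt(2.0), p * math.cos(az), p * math.sin(az))


def analyze_band(frames):
    """Estimate (azimuth in degrees, diffuseness in [0, 1]) for one band.

    frames: list of (W, X, Y) complex STFT values of the same frequency
    band over a short time window. Averaging over the window is what
    separates a steady direction from a diffuse field."""
    ix = iy = energy = 0.0
    for w, x, y in frames:
        # Intensity-like vector; with B-format dipoles it points
        # toward the source.
        ix += (w.conjugate() * x).real
        iy += (w.conjugate() * y).real
        # Energy estimate matching the assumed 1/sqrt(2) scaling of W.
        energy += abs(w) ** 2 + 0.5 * (abs(x) ** 2 + abs(y) ** 2)
    if energy == 0.0:
        return 0.0, 1.0  # silence: no direction, treat as fully diffuse
    azimuth = math.degrees(math.atan2(iy, ix))
    diffuseness = 1.0 - math.sqrt(2.0) * math.hypot(ix, iy) / energy
    return azimuth, min(1.0, max(0.0, diffuseness))


def split_streams(signal, diffuseness):
    """Split one band value into (non-diffuse, diffuse) parts.
    Assumption: square-root gains, so the energies of the two streams
    sum to the energy of the input."""
    direct = math.sqrt(1.0 - diffuseness) * signal
    diffuse = math.sqrt(diffuseness) * signal
    return direct, diffuse


if __name__ == "__main__":
    # A single source at 60 degrees with a phase that rotates over time.
    frames = [plane_wave(cmath.rect(1.0, 0.3 * n), 60.0) for n in range(8)]
    azimuth, psi = analyze_band(frames)
    print(azimuth, psi)
    print(split_streams(frames[0][0], psi))
```

In such a sketch a single plane wave gives a diffuseness near zero and its own azimuth, while sound arriving with equal energy from opposite directions cancels in the averaged intensity and gives a diffuseness near one. For the teleconferencing case, the mono signal and the per-band azimuth would be the quantities that are transmitted.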
http://www.acoustics.hut.fi/research/cat/sirr/