Tik-76.115 Individual Project: Guinea Pig
$Id: architecture.html,v 2.5 1996/04/22 14:08:12 hynde Exp $

Sound Player - Architecture

This document describes the general architecture of the Sound Player Module. The general structure of the player is shown in the figure below:

[ Picture of the architecture ]
Fig 1: Architecture of the Sound Player

The player manages a set of sound samples. The audio outputs of the different samples are mixed together and the result is stored in output buffers. From there the audio driver converts the sound data to the format required by the audio hardware and writes it out. The interface is used to operate the player.

1 · Sound Samples

Sound samples contain the sound data. Usually samples are loaded from files on disk, but there are other types of samples too. Files on disk should be in one of the supported formats (see: Sound File Format); currently AIFF-C/AIFF and raw data files are supported. A sound sample can contain one or more channels, each of which has its own volume, sample pointer and output connections. The sample pointer is the index of the next sample word to be played. The output connections specify to which output channel(s) the channel is connected and with what weight.
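The per-channel bookkeeping described above can be sketched as a small data structure. All the names below (SampleChannel, SoundSample, outputs and so on) are illustrative assumptions, not the player's actual identifiers:

```python
# Illustrative sketch of the per-channel state described above.
# All names here are assumptions, not the player's real identifiers.

class SampleChannel:
    def __init__(self, volume=1.0):
        self.volume = volume      # this channel's own volume
        self.pointer = 0          # index of the next sample word to play
        # output connections: {output_channel_index: weight}
        self.outputs = {0: 1.0}

class SoundSample:
    """A sound sample with one or more channels of audio data."""
    def __init__(self, data_per_channel):
        self.data = data_per_channel  # one list of sample words per channel
        self.channels = [SampleChannel() for _ in data_per_channel]

# A stereo sample: left channel routed to output 0, right to output 1.
sample = SoundSample([[0.1, 0.2], [0.3, 0.4]])
sample.channels[0].outputs = {0: 1.0}
sample.channels[1].outputs = {1: 1.0}
```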

1.1 · Operations on samples

Samples have a set of functions that are used to play them and to manipulate their parameters. Internally the player also uses a number of other functions when playing the samples.

1.2 · Sample types

There are different types of samples. The most basic type is the null sample, which produces only silence (for a given number of sample frames) as its output. This type is useful as a pause between two samples.

The most important sample type is the file sample which plays audio data loaded from a file as mentioned earlier.

The third sample type is the catenated sample, which takes a set of individual samples and plays them in sequence, one after the other. A catenated sample behaves just like an individual sample, and samples of any type can be catenated. Samples that are part of a catenated sample should not be used individually. Currently this sample type is not implemented in the player itself; it is emulated by the player's python interface.
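The sample types above can be illustrated with a small sketch. The class names and the way catenation is emulated here are assumptions for illustration only; they do not reproduce the player's actual python interface:

```python
# Illustrative sketch of the sample types described above.
# Names and structure are assumptions, not the actual interface.

class NullSample:
    """Outputs only silence, for a given number of sample frames."""
    def __init__(self, length):
        self.length = length
    def read(self):
        return [0.0] * self.length

class FileSample:
    """Plays audio data loaded from a file (here: passed in directly)."""
    def __init__(self, data):
        self.data = data
    def read(self):
        return list(self.data)

def catenate(samples):
    """Emulate a catenated sample: play the parts in sequence."""
    out = []
    for s in samples:
        out.extend(s.read())
    return out

# A file sample, a two-frame pause, then another file sample.
played = catenate([FileSample([0.5, 0.5]), NullSample(2), FileSample([0.25])])
```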

2 · Sound mixing

The mixing of the different sound sources (samples) is done by telling each sample to copy (or more precisely, add; see fill buffer above) a certain number of sample frames to the output buffers.

The mixing happens when the audio driver (or audio hardware) is ready to accept more data. When this happens, the output buffers are first cleared and then each playing sample is told to add its data to the output buffers. The audio driver is then responsible for converting the output buffers' contents to the format required by the hardware.
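A minimal sketch of this mixing cycle, assuming each playing sample exposes an add-to-buffer operation (the names and the ToneSample helper are made up for illustration):

```python
# Sketch of the mixing cycle described above; names are assumptions.

def mix_cycle(playing_samples, num_channels, frames):
    # First clear the output buffers.
    buffers = [[0.0] * frames for _ in range(num_channels)]
    # Then each playing sample adds its contribution.
    for sample in playing_samples:
        sample.add_to(buffers, frames)
    return buffers  # the audio driver converts these for the hardware

class ToneSample:
    """Toy sample that adds a constant value to output channel 0."""
    def __init__(self, value):
        self.value = value
    def add_to(self, buffers, frames):
        for i in range(frames):
            buffers[0][i] += self.value

buffers = mix_cycle([ToneSample(0.25), ToneSample(0.5)], num_channels=2, frames=4)
```

Note how the samples add into the buffers rather than overwrite them; this is what makes the mixing work.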

2.1 · Output buffers/channels

For each output channel there is a corresponding output buffer where the audio data from sound mixing is stored. The audio driver uses the contents of these buffers to create a data stream suitable for sending to the audio device.

Each sample contains, for each of its channels, information indicating to which output buffer(s) (or output channels) the sound data is to be copied. Each output channel also has an output volume level of its own.

When data is copied to the buffers, the sample's volume, the output connection weights and the output levels of the output channels together define the scaling factor used when copying the data from the sample's buffer to the output buffer. Thus, when new data is copied to the buffers, the operation goes roughly like this:

   scale = sample_volume * output_conn_weight * output_level;

   buffer += scale * sample_data;
where buffer is the output buffer, sample_data is the audio data from the sound sample and scale is the scaling factor. Optionally, there may also be a scaling factor for the audio device. For example, when 16-bit sound samples are output through the 24-bit outputs available on SGI machines, the output data has to be scaled to the 24-bit range. The data in the output buffers, as well as all scaling factors, is stored as floating point numbers.
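As a worked instance of the scaling operation above (all values are made up for illustration):

```python
# Worked example of the scaling formula above; the values are made up.
sample_volume = 0.8
output_conn_weight = 0.5   # weight of this channel -> output connection
output_level = 1.0         # the output channel's own volume level

scale = sample_volume * output_conn_weight * output_level  # 0.4

sample_data = [1.0, -0.5, 0.25]
buffer = [0.0, 0.0, 0.0]
for i, x in enumerate(sample_data):
    buffer[i] += scale * x   # add, not overwrite: this is the mixing
```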

3 · Audio driver

The audio driver is dependent on the actual audio hardware and driver software used. In theory this is the only part that needs to be replaced when moving the player to a different audio environment (hardware and software).

When the audio device is ready to accept more input, it requests the sound mixing to prepare new data, which is then stored in the output buffers. When the new data is ready, it has to be converted to the format accepted by the audio hardware (the data in the output buffers is stored as floats). Currently this is done simply by converting the floating point numbers straight to 16-bit integers (overflows cause distortion and are not addressed at all at this point).
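A sketch of this straight float-to-16-bit conversion, under the assumption that buffer values are nominally in the -1.0 to 1.0 range (the actual range used by the player is not specified here). As stated above, overflow is not handled, so out-of-range input would simply distort:

```python
# Naive float -> 16-bit conversion as described above.
# Assumes buffer values nominally in [-1.0, 1.0]; overflow is NOT
# handled, matching the text (out-of-range values would distort).

def to_int16(buffer):
    return [int(x * 32767) for x in buffer]

pcm = to_int16([0.0, 0.5, -1.0, 1.0])
```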

One or two output channels are supported on Linux with the MSND sound driver. There is also a working version for IRIX (a quick one-hour port). In addition to the features found in the MSND version, the IRIX version supports 4-channel output and digital 24-bit output.

More information: Sound Player for Linux/MSND and Silicon Graphics

4 · Interface

Commands for the player are issued through the interface module. The module uses a simple text-based protocol for communication. A simpler way to use the player is the sound player module for python, which provides easy access to the player's functions via a set of classes and methods and hides the underlying protocol from the user.
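The actual protocol commands are not specified in this document; the sketch below only illustrates the general idea of a simple line-based text protocol. The command names (play, stop) and replies are made up for illustration and are not the player's real protocol:

```python
# Toy line-based command handler. The commands ("play", "stop") and
# replies are invented for illustration; the player's real protocol
# is not shown in this document.

def handle_line(line, state):
    parts = line.strip().split()
    if not parts:
        return "error: empty command"
    cmd, args = parts[0], parts[1:]
    if cmd == "play" and len(args) == 1:
        state["playing"].add(args[0])
        return "ok"
    if cmd == "stop" and len(args) == 1:
        state["playing"].discard(args[0])
        return "ok"
    return "error: unknown command"

state = {"playing": set()}
replies = [handle_line(l, state) for l in ["play beep.aiff", "stop beep.aiff"]]
```

A python wrapper module can then hide this protocol behind ordinary method calls, which is the role the sound player module for python plays.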