fognl

Get off my lawn.

Tuesday, August 05, 2008

SDDM Series Intro

I have been following a series of articles on Steve Mitchell's blog about using Amazon EC2, and found the information there useful. Deploying to the "cloud" is something I've been interested in doing for a while. Steve's articles go into considerable detail about the nuts and bolts of the process, which eliminates that part of the mystery for me.

After reading Steve's entries, I got to thinking about using my own blog for articles more useful than my norm, which consists largely of making fun of crappy movies or displaying pictures of cute dogs. I guess there's no harm in inflicting my writing on people for a useful cause, so I'm inspired to write a series now, too. Some of the information in this post would have been useful when I was looking for it, so why not?

I'm going to write one about a project I've been working on at home lately, for my home project studio. It's interesting enough to keep me, well, interested, and who knows, maybe someone else will find it interesting too.

I guess after that, I'll return to my regularly-scheduled programming with a nice picture of some kittens.

Anyway, the project I'll write about involves an application called SDDM. It's a MIDI-triggered sample player for Linux. What sets it apart from the other MIDI-triggered sample players out there (there don't appear to be many) is that this one is designed to meet the following criteria:

  • Allow the arbitrary mapping of MIDI note numbers to "instruments", and MIDI velocity ranges to individual samples.
  • Allow completely arbitrary definition of sample sets, with no limits on the number of samples assigned to instruments.
  • Be able to play samples at least as fast as incoming MIDI messages appear, with no audible latency.
  • Allow for the playback of one instrument to cancel the playback of an arbitrary set of other instruments.
  • Allow for the arbitrary definition of "sub mixes", so a single sample set can play back on any arbitrary set of ports, and be recorded in the same manner as a multiplicity of "real" instruments (e.g. a drumset with the snare on one track, kick on another, cymbals on another).
  • Present itself as a normal "Jack" client, so other audio applications can interact with it through the Jack service (for those of you who don't know, Jack is sort of "SOA for audio").

Speed

A primary concern for SDDM was speed. Above all else, it has to be fast, because my main use for it is to play back high-resolution samples of a real drum set recorded in a studio, including ghost notes, fast rolls, double-bass work, etc., in addition to playing multiple samples at the same time. The effect on a complete recording is a set of MIDI-triggered drums with a sound indistinguishable from real drums.

With those goals in mind, SDDM is written as a native Linux application in C++.

Object Model

SDDM's object model is pretty simple.

For the Drumkit itself, there's a Drumkit class, which maintains a mapping of MIDI note numbers to Instruments. An Instrument maintains a mapping of velocity ranges (e.g. 0-15, 16-32, 33-55, etc.) to a set of Layers. A Layer maintains a reference to a Sample, which contains a buffer for the actual sample data loaded from .wav files on disk. There's more to it than that (sub mixes, etc.), but that's the general layout.
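
To make that concrete, here's a minimal C++ sketch of that hierarchy. The class names come straight from the description above; the member names, key types, and helper methods are my own shorthand, not SDDM's actual headers:

#include <map>
#include <string>
#include <vector>

// Raw audio data loaded from a .wav file on disk.
class Sample {
public:
    std::string fileName;
    std::vector<float> dataL, dataR;    // per-channel frames (assumed layout)
};

// One velocity layer: a velocity range mapped to a Sample.
class Layer {
public:
    int minVelocity, maxVelocity;       // e.g. 0-15, 16-32, 33-55, ...
    Sample* sample;
};

// An Instrument owns its velocity layers plus per-instrument settings.
class Instrument {
public:
    std::vector<Layer> layers;
    short pan;                          // -100..100 (see "Controlling Volume and Pan")
    int pitch;                          // see "Controlling Pitch"
    short getPan() const { return pan; }      // accessors matching the later snippets
    int getPitch() const { return pitch; }
    Layer* findLayer(int velocity);     // pick the layer whose range matches
};

// The kit itself: a mapping of MIDI note numbers to Instruments.
class Drumkit {
public:
    std::map<int, Instrument*> instruments;       // key: MIDI note number
    Instrument* findInstrument(int noteNumber);   // NULL if the note isn't mapped
};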

In addition to that, a MIDI driver and an Audio driver class are needed. I defined these as abstract classes, with a starter set of implementations (AlsaMidiDriver and JackAudioDriver, respectively).

Each of these implementations starts its own thread and registers itself as a client of the ALSA MIDI subsystem or the Jack audio subsystem, respectively. They communicate with the application through a set of listener interfaces (abstract classes in C++), IMIDIListener and IAudioListener.

Finally, there is the SDDM class, which implements both of the interfaces and handles the details of processing incoming MIDI notes and playing the samples associated with them. The SDDM class fills the role of both "midi client" and "audio client".
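
The interface names below (IMIDIListener, IAudioListener) and the fact that SDDM implements both come from the description above; the specific method signatures are guesses for illustration, not SDDM's real ones:

// Listener interfaces the driver threads call back into.
class IMIDIListener {
public:
    virtual ~IMIDIListener() {}
    virtual void onNoteOn(int noteNumber, int velocity) = 0;   // hypothetical signature
};

class IAudioListener {
public:
    virtual ~IAudioListener() {}
    // Called from the audio thread: mix the active notes into nFrames
    // of the supplied output buffers. Hypothetical signature.
    virtual void process(float* outL, float* outR, int nFrames) = 0;
};

// SDDM plays both roles: "midi client" and "audio client".
class SDDM : public IMIDIListener, public IAudioListener {
public:
    virtual void onNoteOn(int noteNumber, int velocity);
    virtual void process(float* outL, float* outR, int nFrames);
};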

Note Queue

Shared between the driver threads is an STL queue called "playingNotes", which maintains a list of the Note objects representing individual sample instances to be played.

When the MIDI client receives a MIDI note-on message, it looks up the Instrument with a matching note number in the active Drumkit object, then finds the Layer (if any) in that Instrument whose velocity range includes the velocity of the played note. It extracts the sample data from the Layer's sample and creates a Note object. It then locks the playingNotes queue and inserts the Note.
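
Building on the class sketches above, the producer side might look roughly like this. The class names are from the post; the mutex, the std::list (the post says "STL queue", but a list makes it easier for the audio thread to walk every active note), and the Note constructor are my assumptions:

#include <list>
#include <pthread.h>

// A single triggered sample instance (minimal stand-in for SDDM's Note class).
struct Note {
    Instrument* instrument;
    Sample* sample;
    float samplePosition;       // advanced by the audio thread (see "Controlling Pitch")
    float volumeL, volumeR;     // per-channel gains (see "Controlling Volume and Pan")
    float step;                 // playback-rate multiplier (see "Controlling Pitch")
    Note(Instrument* i, Sample* s)
        : instrument(i), sample(s), samplePosition(0.0f),
          volumeL(1.0f), volumeR(1.0f), step(1.0f) {}
};

// Shared between the MIDI and audio threads.
std::list<Note*> playingNotes;
pthread_mutex_t playingNotesMutex = PTHREAD_MUTEX_INITIALIZER;
Drumkit* activeDrumkit;

// Called by the MIDI driver thread for each note-on message.
void SDDM::onNoteOn(int noteNumber, int velocity) {
    Instrument* instrument = activeDrumkit->findInstrument(noteNumber);
    if (!instrument) return;

    Layer* layer = instrument->findLayer(velocity);
    if (!layer) return;

    Note* note = new Note(instrument, layer->sample);

    pthread_mutex_lock(&playingNotesMutex);
    playingNotes.push_front(note);      // newest note first
    pthread_mutex_unlock(&playingNotesMutex);
}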

The Audio client thread gets a periodic callback from the Jack subsystem. The Jack callback function takes a list of buffers (pointers to floating-point numbers) and a number of "frames" to fill. The audio client's responsibility is to take all of the current sample data and fill the supplied buffers for the specified number of frames. It is VITAL that this process proceed as quickly as possible. Any delay in this loop is audible as a stuttering sound. Since Jack's callback into the application is synchronous, a delay in any one application slows not only that application's performance, but the whole collection of applications connected to Jack as well. If you want your application to become very unpopular with Jack and its friends, print something to stdout for each iteration of your buffer-processing loop. :-)
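
For reference, wiring a client into Jack with the real JACK C API looks something like this. The client and port names are placeholders, and SDDM hides all of this inside its JackAudioDriver, so treat it as the general shape rather than its actual code:

#include <jack/jack.h>
#include <cstdio>
#include <cstring>

static jack_port_t* outL;
static jack_port_t* outR;

// Jack calls this on its realtime thread; it must return quickly.
// Disk I/O, printf, or a long-held lock here causes audible dropouts.
static int jack_process(jack_nframes_t nframes, void* arg) {
    float* bufL = (float*)jack_port_get_buffer(outL, nframes);
    float* bufR = (float*)jack_port_get_buffer(outR, nframes);

    memset(bufL, 0, sizeof(float) * nframes);
    memset(bufR, 0, sizeof(float) * nframes);

    // ... mix the active notes into bufL/bufR (see below) ...
    (void)arg;
    return 0;
}

int main() {
    jack_client_t* client = jack_client_open("sddm-sketch", JackNullOption, NULL);
    if (!client) {
        fprintf(stderr, "could not connect to the Jack server\n");
        return 1;
    }

    jack_set_process_callback(client, jack_process, NULL);

    outL = jack_port_register(client, "out_L", JACK_DEFAULT_AUDIO_TYPE,
                              JackPortIsOutput, 0);
    outR = jack_port_register(client, "out_R", JACK_DEFAULT_AUDIO_TYPE,
                              JackPortIsOutput, 0);

    jack_activate(client);              // the jack_process() callbacks start here
    // ... run until told to quit, then jack_client_close(client) ...
    return 0;
}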

The SDDM audio client locks down the playingNotes queue, extracts all of the Notes and loops through them (most-recently-played first), mixing all of them, altering their volume, pan, and pitch as it goes. As it plays the samples, it tracks the position of the individual samples so it can later decide when to remove them from the queue and delete them. (It performs this operation right after playing the samples.)
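
Sticking with the stand-in types from the sketches above, the consumer side of that queue might look roughly like this (the per-note volumeL/volumeR/step fields are explained in the sections that follow, and the locking strategy is a simplification of whatever SDDM actually does):

// Mix every active note into the output buffers for this callback.
void SDDM::process(float* outL, float* outR, int nFrames) {
    pthread_mutex_lock(&playingNotesMutex);

    std::list<Note*>::iterator it = playingNotes.begin();  // most-recently-played first
    while (it != playingNotes.end()) {
        Note* note = *it;
        Sample* s = note->sample;
        int totalFrames = (int)s->dataL.size();

        for (int f = 0; f < nFrames; ++f) {
            int pos = (int)note->samplePosition;
            if (pos >= totalFrames) break;              // ran off the end of the sample

            outL[f] += s->dataL[pos] * note->volumeL;   // mixing is just addition
            outR[f] += s->dataR[pos] * note->volumeR;

            note->samplePosition += note->step;         // pitch control (see below)
        }

        // Remove and delete notes that have played all of their frames.
        if ((int)note->samplePosition >= totalFrames) {
            delete note;
            it = playingNotes.erase(it);
        } else {
            ++it;
        }
    }

    pthread_mutex_unlock(&playingNotesMutex);
}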

Mixing Audio

The process of mixing audio was, to me, a complete mystery when I started playing with this idea. I had no idea how to do it, but it turns out to be pretty intuitive. A sample buffer (at least in this system) is a pointer to floating-point data, so to mix two samples together, you just add their values and store the result in the buffer. Something like this:
*audioL++ = (sample1 + sample2);
Simple!

Controlling Volume and Pan

This also turns out to be pretty simple. You control a sample's volume by multiplying its value by a number from 0 (silent) to 1 (full volume). So 70% would be 0.7. If you want to "amplify" a sample, just multiply it by a value greater than 1. You can increase the volume of a sound in this way up until the point where the loudest sample in the sound exceeds 0dB, at which point the sample gets clipped. One of these isn't noticeable, but too many of them result in a "zipping" sound coming from the speakers.

Since all of the samples are stereo (and if they're not, you simulate it by copying a mono sample's data into both channels), you have to perform the volume-control operation on both the "left" and "right" buffers. Pan is controlled the same way: multiply the left and right values by factors determined by the pan setting for the instrument:

short pan = note->getInstrument()->getPan(); // A value between -100 and 100

volumeR += pan;
volumeL -= pan;
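
Pulling the two ideas together (the multiply-by-0-to-1 description above and the pan adjustment just shown), here's one way the combined volume and pan could end up as per-channel multipliers. The 0-100 volume scale is my assumption, not necessarily SDDM's:

// Hypothetical helper: combine an instrument-level volume and pan setting into
// the 0..1 per-channel gains used while mixing. Scales are assumptions.
void computeGains(float volume, short pan, float& volumeL, float& volumeR) {
    // volume: assumed 0..100; pan: -100 (hard left) .. 100 (hard right)
    float left  = volume - pan;         // panning shifts level between the channels
    float right = volume + pan;

    if (left  < 0.0f) left  = 0.0f;  if (left  > 100.0f) left  = 100.0f;
    if (right < 0.0f) right = 0.0f;  if (right > 100.0f) right = 100.0f;

    volumeL = left  / 100.0f;           // the 0..1 multipliers applied per sample
    volumeR = right / 100.0f;
}

Whatever the exact scale, the end result is the same: each channel's sample value gets multiplied by its own 0..1 gain before being added into the output buffer.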

Controlling Pitch

Pitch control is almost as simple as level and pan. You control pitch by controlling how "fast" you step through a set of samples. Suppose you have 1024 frames you need to play. If you want to play them at their pre-defined speed, you just step through, frame by frame, and perform the operations shown above. To play at a higher-than-normal pitch, you skip some of the frames. Lower-pitch playback involves playing the same frame two or more times before moving on to the next one.

The Note class in SDDM has a samplePosition member that's updated by the audio client as it's playing the samples. This is just a floating-point number, and the audio client's processing loop factors in the defined pitch of the Instrument it's playing, and increments the sample-position counter accordingly. So the pitch-control logic amounts to this:
float step = 1.0f + (((float)note->getInstrument()->getPitch()) / 100);

// populate the main buffers, etc.

note->samplePosition += step;
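
The post above doesn't say whether SDDM interpolates when samplePosition lands between two frames; the simplest approach just truncates to the earlier frame, which is what this fragment assumes:

// Fetch the current frame for a note whose samplePosition advances by 'step'.
// Truncation (no interpolation) is an assumption here, not necessarily SDDM's choice.
int frame = (int)note->samplePosition;              // e.g. 2.4 plays frame 2
float sampleL = note->sample->dataL[frame];
float sampleR = note->sample->dataR[frame];

// step > 1.0 skips frames (higher pitch); step < 1.0 revisits frames (lower pitch).
note->samplePosition += step;

Linear interpolation between frame and frame + 1 would smooth out the skip/repeat artifacts, at a small extra cost per sample.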

The End Result

The initial iteration of SDDM worked better than I imagined it would. As I mentioned, my primary concern was speed. I wrote it as a stripped-down, Formula 1-style app, as fast as I knew how to make it. It appears to have worked. On a worn-out old Pentium 4 running Ubuntu with the Gnome interface, SDDM keeps up without any hiccups while playing 20-25 tracks in Ardour, with the CPU running at about 60%. Beyond that, it starts to show timing problems. The effect is a "jerkiness" to the sound of the performance, like a drummer who keeps dropping his/her sticks. On a more realistic machine (a dual-core 2.7 GHz Pentium), it does fine while playing back around 60 tracks, with the CPU showing about 10% load.

SDDM is definitely not a "real time" application. It pre-loads all of the audio files it plays and keeps dynamic memory allocation to an absolute minimum while the audio engine is processing, but there are still operations going on that aren't guaranteed to complete within a certain time. I'll address this eventually, but it's not a high priority right now. If those operations are affecting the application's performance, I can't hear it.

Future

There is a lot left to do on SDDM. I'm already working on a user interface for it, and I plan to add support for sample-rate conversion (load a 48 kHz sample on a system running at 96 kHz, get playback at the right speed), as well as support for other audio formats in addition to .wav (ogg, mp3, flac, and so on). I'll most likely do this using gstreamer or similar. I'll also open-source it. It should be fun.

Next up

Writing a UI in Qt for SDDM.
