fognl

Get off my lawn.

Sunday, August 17, 2008

A Tale of 2 Bucks



I spotted this pair of buck heads mounted on a wooden fixture today at the State Fair. The Ranger sitting beside the display said that the two bucks were found with their antlers locked together. They apparently got tangled up while they were fighting. When they were found, one was dead from starvation, and the other was almost dead. The Ranger said the antlers were so tangled up, they couldn't be pulled apart without breaking one or both of them. Kind of a tragedy, really.

I'm no deer psychologist, but I'm guessing there were some regrets shortly after the impact shown above:

Buck A: Oops.
Buck B: Oh, nice job.

I'm sure animals communicate with each other in their own way, so I'm really curious as to what these two talked about in the hours and days following their little misunderstanding. I'm guessing they probably struggled for a while to get apart, then cooperated trying to get a drink out of a creek, then eventually became pretty good friends.

Friday, August 08, 2008

SDDM: Qt event handling with Signals and Slots

This is the second installment in a series of posts on my pet project, SDDM, coming soon to a Linux distro near you.

I chose Qt as the UI library I would use for SDDM, after reading about its features.

Qt is kind of unique in that it's a pretty good C++ UI class library, but also features a "meta-object compiler" to allow things like runtime reflection, events, properties, and other features not supported natively by C++. It also includes a lot of non-UI things like XML parsing, URLs, filesystem access, and container classes, in addition to its large set of UI elements. Custom controls are pretty easy to create as well.

Events
In any desktop UI library, you need the notion of "events" so that UI elements on a window can communicate their status to the parent window, where the application-level logic is written. For example, take a window with a button on it. What you need is some mechanism for a button (or any other control) to tell its parent window (in effect): "Hey, this thing just happened, take whatever action you need to." This is basic "observer pattern" stuff, and there's a variety of approaches used by the various UI environments.

The "Java approach", used in AWT, Swing, and SWT, is to define interfaces for "listeners", which respond to events posted by controls. In effect, what you do is create a class that implements a specific "listener" interface, with the implementation providing the application-level event handling, for each event you need to intercept. As you might imagine, this adds up to a lot of classes, since a separate class is needed in most cases to handle every different event. However, Java provides a way to define anonymous inline classes. This makes it unnecessary to actually create a new class, decide on a name for it, and so on.Typically, the code you write to listen to an event from a control is pretty concise, looking something like this (fictional) example:

myButton.addListener(new ButtonListener() {
    public void clicked(Event evt)
    {
        // process the click.
    }
});

This is still kind of messy, but much cleaner than having to write a discrete class for each case where you need to handle an event from a control.

Languages like VB and Delphi are tailor-made for desktop UI programming, and incorporate the notion of "events" right into the language. Delphi's approach involves assigning function addresses to event pointers on controls, so the controls can call the functions directly to communicate their status. This is probably the most efficient approach of all, since there's no intermediate layer between the control and the function it's calling.

VB's approach, I don't really care about. Last time I used VB, it achieved events through some "mystery meat" process involving COM and OLE, which probably resembles the manufacture of sausage. It wasn't nearly as flexible as Delphi's approach. Delphi allows one event-handling function to be assigned to potentially many event-handler pointers in one or more controls, whereas VB didn't. In my experience, it (like VB itself) was minimally useful at best.

In the case of C++, none of these approaches is practical. C++ doesn't allow the inline definition of an anonymous class like Java does, so if you take the Java approach to events with listeners and the like, you end up with lots of little single-purpose classes lying around. You can adopt the function-pointer approach that Delphi uses, but it gets messy pretty quickly too. One problem is the fact that C++ member functions don't actually have an address that you can assign to a pointer. Actually, they do, but in order to take a member function's address, you have to include the name of the class in the address, using this syntax:

// simple class with a function to point to
struct someClass {
    int f(int a, int b) { return a+b; }
};

someClass inst;

// take the address
int (someClass::*func)(int,int) = &someClass::f;
// call the function through the pointer
int result = (inst.*func)(3,4);

Straightforward enough I guess, but notice how the name of the class has to be included in the pointer declaration. The pointer named "func" isn't just a pointer to a function, it's a pointer to a function in someClass. An important distinction. In an application with a UI, it's a pointer to a function in your application's parent (window/dialog/widget) class. Which means a reference to the parent widget class has to appear in the class of the control you're listening for events on. This is something you definitely don't want in a reusable component.
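
To illustrate, here's a hedged sketch of what that coupling looks like; the class names here are made up, and a real control would obviously do more than this. The point is that the "reusable" button has to name a specific parent class just to call back into it:

// Hypothetical sketch of the coupling problem: a button class that calls a
// member function on its parent has to name the parent's class directly.
class MyWindow;                        // the control now depends on this specific class

class CoupledButton {
public:
    MyWindow *owner;                   // which window to call back into
    void (MyWindow::*clickFunction)(); // which member function to call

    void handleClick();                // defined below, once MyWindow is complete
};

class MyWindow {
public:
    void okClicked() { /* handle the click */ }
};

void CoupledButton::handleClick()
{
    if (owner && clickFunction)
        (owner->*clickFunction)();     // call the parent's handler
}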

There might be some kind of macro-based voodoo you could perform to get around this limitation. It would work fine I'm sure, except for the "macro-based voodoo" part. I can't think of how it would work, in any case.

That leaves templates. Essentially, you define the control as a template class, with the type parameter referring to the application-level class the control will call into through member-function pointers.

This definitely works, but also adds a lot of bulk to an application. Consider a button class like this:

template <typename T>
class MyButton: public BaseWidget {
    ...

    // member-function pointer into the parent class T
    void (T::*clickFunction)(const MyButton*);
};

Consider a nominal-sized application with, say, 30 classes representing various kinds of parent widgets in it, doing this sort of thing:

class MyWindow: public Window {
    ...

    MyButton<MyWindow> okButton;

    MyWindow(const char *caption): Window(caption)
    {
        // Assign the "click" event to our handler
        okButton.clickFunction = &MyWindow::okButtonClicked;
    }

    void okButtonClicked(const MyButton<MyWindow> *button)
    {
        // handle button click
    }
};

The compiler generates a separate class for each different window class that uses the MyButton template. In the example above, you get a MyButton<MyWindow> instantiation (a class with some mangled name in the object code), and 29 or so others, one for each class that makes use of the button. Follow that example for every other kind of control you can have, and you get the idea.

In case you somehow didn't get the idea, it's this: You end up with Mt. Everest-sized executables. See Visual Be++ for an example of this approach. In that GUI designer's host environment, you can create an application the "native" way with a couple of buttons and some edit controls, and the finished executable will be around 20kB in size. Use Visual Be++'s templated-based UI library (which adds events, properties, and some other Delphi-isms) for the same application, and you end up with an executable about 240kB in size.

Put simply, C++ doesn't support a convenient way for controls on windows to notify their parent windows of something. To do this sort of thing without using an old-skool Windows-style window procedure (driving yourself nuts in the process) or any of the approaches noted above, you almost have to resort to some sort of language augmentation and specialized tool set.

That is exactly what Qt does. Qt's approach to the problem falls deeply into the "mystery meat" zone. Trust me, there is a lot of sausage manufactured here. It involves a combination of specialized tools (the aforementioned "meta-object compiler" and qmake), and some Qt-centric quasi-keywords.
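
To give a sense of the tooling, here's a sketch of a minimal qmake project file. The file names are placeholders, not anything from a real project; the point is that qmake notices Q_OBJECT in the listed headers and arranges for moc to run over them as part of the build:

# Illustrative minimal qmake project file; file names are placeholders.
TEMPLATE = app
TARGET   = myapp
CONFIG  += qt
HEADERS += somewidget.h parentwidget.h
SOURCES += main.cpp somewidget.cpp parentwidget.cpp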

I'm not generally a fan of language augmentation, since it ties you to a specific tool set. I'm also not crazy about specialized tools being required for a build, since that limits which build tools you can use for that type of program.

But, hey, so what? If I stopped with that, this would be a short and pointless blog entry where I did nothing but gripe about the suckiness of C++'s support for UI events. Qt has a lot of features I want to use. The tool set, though specialized, is easy to use. The benefit I get from Qt far outweighs the annoyance of having to stick with a specific tool chain. The "sausage" in this case is tasty enough that I'm willing to overlook the manufacturing process and keep my mind off of what (or who) might have fallen into the sausage press.

Signals and Slots
The Qt Meta layer adds the notion of "signals" and "slots". A "signal" is something emitted by an object to whoever might be listening. A "slot" is something employed to listen for signals. So, for example, a button emits a "clicked" signal, and a dialog box with a button on it can define a slot to capture that signal.

Here is a simple Qt control class declaration, showing the additional Qt keywords and macros in use:

class SomeWidget: public QWidget
{
    Q_OBJECT

public:
    SomeWidget(QWidget *parent = 0, const char *name = 0);
    virtual ~SomeWidget();

    // Respond to a Qt mouse event.
    virtual void mousePressEvent(QMouseEvent *evt);

signals:
    void clicked();
};

In the implementation, you "emit" an event signal when something of interest takes place:

void SomeWidget::mousePressEvent(QMouseEvent *evt)
{
    // blah blah, decide whether to emit a "clicked" signal, because the user clicked in this control's client area.

    // This is basically it. The keyword "emit", followed by a function call.
    emit clicked();
}

The Q_OBJECT macro marks the widget as something the MOC (meta-object compiler) should generate extra MOC code for. You'll find the extra MOC code in your project directory as .cpp files with names beginning with "moc_". One of these files is generated for each header that declares a Q_OBJECT class. (Take a look in there. There's a sausage press, a large cage full of frightened-looking cats, and a crew of sweaty men in stovepipe hats, furiously smelting.)

A parent widget with the SomeWidget control on it would be defined something like this:

class ParentWidget: public QWidget {
    Q_OBJECT

public:
    ParentWidget(QWidget *parent);
    ...

private:
    SomeWidget *someWidget;

private slots:
    void someWidgetClicked();
};

The slots Qt-keyword indicates that the methods that follow can be connected to signals; they get called whenever a connected control emits its signal with the emit Qt-keyword.

In the constructor, you use another quasi-keyword, connect, to join the signals emitted by various controls to handlers in the parent widget.

ParentWidget::ParentWidget(QWidget *parent): QWidget(parent)
{
    // initialize someWidget
    this->someWidget = new SomeWidget(this);

    // connect someWidget's clicked signal to our slot
    connect(someWidget, SIGNAL(clicked()), this, SLOT(someWidgetClicked()));
}

// This gets called whenever someWidget emits a clicked() event.
void ParentWidget::someWidgetClicked()
{
    // someWidget was clicked
}

And that's basically it. It's pretty clean, and probably more convenient than the approach used by any UI library I've used to date.
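
For completeness, a minimal host program for the example above might look something like this. I'm assuming Qt4-style headers here, and that the ParentWidget declaration lives in a hypothetical parentwidget.h:

#include <QApplication>
#include "parentwidget.h"     // hypothetical header holding the ParentWidget class above

int main(int argc, char **argv)
{
    QApplication app(argc, argv);

    ParentWidget window(0);   // top-level widget, no parent
    window.show();

    return app.exec();        // run the Qt event loop
}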

Next up: Custom controls in Qt.

Tuesday, August 05, 2008

SDDM Series Intro

I have been following a series of articles on Steve Mitchell's blog about using Amazon EC2, and found the information there useful. Deploying something to the "cloud" is something I've been interested in doing for a while. Steve's articles go into considerable detail about the nuts and bolts of the process, which eliminates that part of the mystery for me.

After reading Steve's entries, I got to thinking about using my own blog for articles more useful than my norm, which consists largely of making fun of crappy movies or displaying pictures of cute dogs. I guess there's no harm in inflicting my writing on people for a useful cause, so I'm inspired to write a series now, too. Some of the information in this post would have been useful when I was looking for it, so why not?

I'm going to write one about a project I've been working on at home lately, for my home project studio. It's interesting enough to keep me, well, interested, and who knows, maybe someone else will find it interesting too.

I guess after that, I'll return to my regularly-scheduled programming with a nice picture of some kittens.

Anyway, the project I'll write about involves an application called SDDM. It's a MIDI-triggered sample player for Linux. The difference between it and the other MIDI-triggered sample players out there (there don't appear to be many) is that this one is designed to meet the following criteria:

  • Allow the arbitrary mapping of MIDI note numbers to "instruments", and MIDI velocity ranges to individual samples.
  • Allow completely arbitrary definition of sample sets, with no limits on the number of samples assigned to instruments.
  • Be able to play samples at least as fast as incoming MIDI messages appear, with no audible latency.
  • Allow for the playback of one instrument to cancel the playback of an arbitrary set of other instruments.
  • Allow for the arbitrary definition of "sub mixes", so a single sample set can play back on any arbitrary set of ports, and be recorded in the same manner as a multiplicity of "real" instruments (e.g. a drumset with the snare on one track, kick on another, cymbals on another).
  • Present itself as a normal "Jack" client, so other audio applications can interact with it through the Jack service (for those of you who don't know, Jack is sort of "SOA for audio").

Speed

A primary concern for SDDM was speed. Above all else, it has to be fast, because my primary use for this is to play back high-resolution samples of a real drum set recorded in a studio, including ghost notes, fast rolls, double-bass work, and so on, in addition to playing multiple samples at the same time. The effect on a complete recording is a set of MIDI-triggered drums that sounds indistinguishable from real drums.

With those goals in mind, SDDM is written as a native Linux application in C++.

Object Model

SDDM's object model is pretty simple.

For the Drumkit itself, there's a Drumkit class, which maintains a mapping of MIDI note numbers to Instruments. An Instrument maintains a mapping of velocity ranges (e.g. 0-15, 16-32, 33-55, etc.) to a set of Layers. A Layer maintains a reference to a Sample, which contains a buffer for the actual sample data loaded from .wav files on disk. There's more to it than that (sub mixes, etc.), but that's the general layout.
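
As a rough sketch (not SDDM's actual code; the members shown here are guesses at the minimum each class would need), the mappings look something like this:

#include <map>
#include <vector>

class Sample {
public:
    std::vector<float> dataL, dataR;        // sample data loaded from a .wav file on disk
};

class Layer {
public:
    int minVelocity, maxVelocity;           // e.g. 0-15, 16-32, 33-55, ...
    Sample *sample;                         // the sample this layer plays
};

class Instrument {
public:
    std::vector<Layer> layers;              // velocity ranges -> samples

    Layer *findLayer(int velocity);         // pick the layer whose range covers this velocity
};

class Drumkit {
public:
    std::map<int, Instrument*> instruments; // MIDI note number -> instrument
};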

In addition to that, a MIDI driver and an Audio driver class are needed. I defined these as abstract classes, with a starter set of implementations (AlsaMidiDriver and JackAudioDriver, respectively).

These implementations each start their own threads and register themselves as clients of the ALSA MIDI subsystem and the Jack audio subsystem, respectively. They communicate with the application through a set of listener interfaces (abstract classes in C++), IMIDIListener and IAudioListener.

Finally, there is the SDDM class, which implements both of the interfaces and handles the details of processing incoming MIDI notes and playing the samples associated with them. The SDDM class fills the role of both "midi client" and "audio client".
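
Here's a hedged sketch of how those pieces fit together; the method names and signatures are invented for illustration, not SDDM's actual API:

// Listener interfaces, expressed as abstract classes.
class IMIDIListener {
public:
    virtual ~IMIDIListener() {}
    // Called by the MIDI driver thread when a note-on message arrives.
    virtual void onNoteOn(int noteNumber, int velocity) = 0;
};

class IAudioListener {
public:
    virtual ~IAudioListener() {}
    // Called by the audio driver when it needs nFrames of output.
    virtual void process(float *outL, float *outR, int nFrames) = 0;
};

// The SDDM class acts as both the MIDI client and the audio client.
class SDDM: public IMIDIListener, public IAudioListener {
public:
    virtual void onNoteOn(int noteNumber, int velocity);
    virtual void process(float *outL, float *outR, int nFrames);
};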

Note Queue

Shared between the driver threads is an STL queue called "playingNotes", which maintains a list of the Note objects representing individual sample instances to be played.

When the MIDI client receives a MIDI note-on message, it looks up the Instrument in the active Drumkit object with a matching note number, and finds the Layer in the found Instrument (if any) with a velocity range which includes the velocity of the played note. It extracts the sample data from the Layer's sample, and creates a Note object. It locks the playingNotes queue, and inserts the Note.
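
Continuing the sketch from above, the note-on path might look roughly like this. The names (activeDrumkit, the Note constructor, the pthread mutex) are assumptions, and a std::list stands in for the shared "playingNotes" container so finished notes can later be removed from anywhere in it:

#include <list>
#include <pthread.h>

std::list<Note*> playingNotes;                    // shared with the audio thread
pthread_mutex_t playingNotesLock = PTHREAD_MUTEX_INITIALIZER;

void SDDM::onNoteOn(int noteNumber, int velocity)
{
    // Look up the instrument mapped to this MIDI note in the active kit.
    std::map<int, Instrument*>::iterator it = activeDrumkit->instruments.find(noteNumber);
    if (it == activeDrumkit->instruments.end())
        return;                                   // no instrument mapped to this note

    Instrument *instrument = it->second;

    // Find the layer whose velocity range covers the played velocity.
    Layer *layer = instrument->findLayer(velocity);
    if (!layer)
        return;

    // Wrap the layer's sample in a Note and hand it to the audio thread.
    Note *note = new Note(instrument, layer->sample, velocity);

    pthread_mutex_lock(&playingNotesLock);
    playingNotes.push_back(note);
    pthread_mutex_unlock(&playingNotesLock);
}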

The Audio client thread gets a periodic callback from the Jack subsystem. The Jack callback function takes a list of buffers (pointers to floating-point numbers) and a number of "frames" to fill. The audio client's responsibility is to take all of the current sample data and fill the supplied buffers for the specified number of frames. It is VITAL that this process proceed as quickly as possible. Any delay in this loop is audible as a stuttering sound. Since Jack's callback into the application is synchronous, a delay in any one application slows not only that application's performance, but the whole collection of applications connected to Jack as well. If you want your application to become very unpopular with Jack and its friends, print something to stdout for each iteration of your buffer-processing loop. :-)

The SDDM audio client locks down the playingNotes queue, extracts all of the Notes and loops through them (most-recently-played first), mixing all of them, altering their volume, pan, and pitch as it goes. As it plays the samples, it tracks the position of the individual samples so it can later decide when to remove them from the queue and delete them. (It performs this operation right after playing the samples.)
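
A hedged sketch of that audio-thread side, using the same invented names (mixInto and isFinished are placeholders for whatever the real Note code does): start from silence, mix every active note into the supplied buffers, then drop the notes whose samples have finished playing.

#include <cstring>

void SDDM::process(float *outL, float *outR, int nFrames)
{
    // Start from silence so notes can be accumulated (mixed) into the buffers.
    std::memset(outL, 0, nFrames * sizeof(float));
    std::memset(outR, 0, nFrames * sizeof(float));

    pthread_mutex_lock(&playingNotesLock);

    std::list<Note*>::iterator it = playingNotes.begin();
    while (it != playingNotes.end()) {
        Note *note = *it;

        // Add this note's sample data into the buffers, applying
        // volume, pan, and pitch as it goes (see the sections below).
        note->mixInto(outL, outR, nFrames);

        if (note->isFinished()) {                 // sample has played all the way through
            delete note;
            it = playingNotes.erase(it);
        } else {
            ++it;
        }
    }

    pthread_mutex_unlock(&playingNotesLock);
}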

Mixing Audio

The process of mixing audio was, to me, a complete mystery when I started playing with this idea. I had no idea how to do it, but it turns out to be pretty intuitive. A sample buffer (at least in this system) is a pointer to floating-point data, so to mix two samples together and put them in the buffer, you add the two samples' values together, and store them in the buffer. Something like this:
*audioL++ = (sample1 + sample2);
Simple!

Controlling Volume and Pan

This also turns out to be pretty simple. You control a sample's volume by multiplying its value by a number from 0 (silent) to 1 (full volume); 70% volume is a multiplier of 0.7. If you want to "amplify" a sample, just multiply it by a value greater than 1. You can increase the volume of a sound this way until the loudest sample in the sound exceeds 0dB, at which point the sample gets clipped (truncated to the maximum value). One clipped sample isn't noticeable, but too many of them produce a "zipping" sound from the speakers.

Since all of the samples are stereo (and if they're not, you simulate it by copying a mono sample's data into both channels), you have to perform the volume-control operation on the "left" and "right" buffers separately. Pan, then, is just a matter of adjusting the left and right volume multipliers in opposite directions, based on the pan setting for the instrument:

short pan = note->getInstrument()->getPan(); // A value between -100 (hard left) and 100 (hard right)

// Scale the pan into the same 0..1 range as the volume multipliers,
// then shift gain from one side to the other.
float panShift = pan / 100.0f;

volumeR += panShift;
volumeL -= panShift;
// (a real implementation would clamp these back into the 0..1 range)

Controlling Pitch

Pitch control is almost as simple as level and pan. You control pitch by controlling how "fast" you step through a set of samples. Suppose you have 1024 frames you need to play. If you want to play them at their pre-defined speed, you just step through, frame by frame, and perform the operations shown above. To play at higher-than-normal pitch, you skip some of the frames. Lower-pitch playback involves playing the same frame two or more times before moving on to the next one.

The Note class in SDDM has a samplePosition member that's updated by the audio client as it's playing the samples. This is just a floating-point number, and the audio client's processing loop factors in the defined pitch of the Instrument it's playing, and increments the sample-position counter accordingly. So the pitch-control logic amounts to this:
float step = 1.0f + (((float)note->getInstrument()->getPitch()) / 100);

// populate the main buffers, etc.

note->samplePosition += step;
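
To make that concrete, the per-note inner loop might look something like this fragment. The names outL/outR, sampleDataL/sampleDataR, sampleLength, and the volume factors are assumed from the sketches above, and nearest-sample truncation is just the simplest option (a real mixer might interpolate between frames):

for (int frame = 0; frame < nFrames; frame++) {
    int pos = (int)note->samplePosition;          // truncate to the nearest earlier sample frame
    if (pos >= sampleLength)
        break;                                    // ran off the end of the sample data

    outL[frame] += sampleDataL[pos] * volumeL;    // mix into the output with volume/pan applied
    outR[frame] += sampleDataR[pos] * volumeR;

    note->samplePosition += step;                 // bigger steps = higher pitch, smaller = lower
}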

The End Result

The initial iteration of SDDM worked better than I imagined it would. As I mentioned, my primary concern was speed. I wrote it as a stripped-down Formula1-style app, as fast as I knew how to make it. It appears to have worked. On a worn-out old Pentium 4 running Ubuntu with the Gnome interface, SDDM keeps up without any hiccups while playing 20-25 tracks in Ardour, with the CPU running at about 60%. Beyond that, it starts to show timing problems. The effect of this is a "jerkiness" to the sound of the performance, like a drummer who keeps dropping his/her sticks. On a more realistic machine (dual-core 2.7GHz Pentium), it does fine while playing back around 60 tracks, with the CPU showing about 10% load.

SDDM is definitely not a "real time" application. It pre-loads all of the audio files it plays, and keeps dynamic memory allocation to an absolute minimum while the audio engine is processing, but there are still processes going on that aren't guaranteed to happen within a certain time. I'll address this eventually, but it's not a high priority right now. If those processes are affecting the application's performance, I can't hear it.

Future

There is a lot left to do on SDDM. I'm already working on a user interface for it, and I plan to add support for sample-rate conversion (load a 48kHz sample on a system running at 96kHz, get playback at the right speed), as well as support for other audio formats in addition to .wav (ogg, mp3, flac, and so on). I'll most likely do this using gstreamer or similar. I'll also open-source it. It should be fun.

Next up

Writing a UI in Qt for SDDM.