Tuesday, December 22, 2009

MIDI on Linux is not easy

I've put a little tutorial in the KDE userbase wiki about MIDI on Linux because Linux users have very little help setting up their systems for MIDI usage.

First, I would like to apologize for a joke that was not my original idea. I found it on the renaissance music web site of Alain Naigeon. He says that MP3 is like fast food, WAV is tasting the meal, and MIDI is cooking the recipe. He also says that playing a MIDI file is not reproducing a performance, it's a new performance. I agree, but on the other hand each MIDI file is a recording of a musical performance. Even more so than an MP3 or a CD, because it contains no sound, only the musician's actions. For instance, the folks at the Minnesota International Piano e-Competition distribute the competitors' performances, recorded on Yamaha Disklavier concert grand pianos, as standard MIDI files. And you can play these files in KMid2. Try this one: Igor Stravinsky's Petrouchka, played by Alessandro Taverna, first prize winner of this year's contest.

Returning to the subject. I would like to show you a screenshot of the about dialog from the software synth included (and ready to be used out of the box) in all MS Windows operating systems:

The Roland Sound Canvas samples are also included in Mac OSX. In both cases it is a small sound font file in DLS format. It is not the full and great Roland Virtual Sound Canvas, but a light version. Anyway, the point is that both software manufacturers provide resources to users wanting MIDI support out of the box. If you download and install VMPK on any Windows version, you can start playing and getting sound at once without having to configure anything at all, and without reading a single word of system documentation.

What is the situation on Linux? There are MIDI applications. There are many operating system MIDI drivers (ALSA). Some software synths and free SoundFonts are also available. What is missing? Distributions gathering all the required pieces, putting them together and easing the task of setting up Linux for MIDI usage out of the box. It is their job, after all.

Monday, December 7, 2009

Precompiled Headers

Hello, KDE Planet!

Let me insist with another post about time, but in this case about the time spent compiling programs: always too much, because vita brevis est.

I've added to KMid2's build system a CMake script enabling precompiled headers (PCH) for the GCC compiler. It is optionally enabled using the CMake option "-DWANT_PCH=yes". I initially wrote this script for Rosegarden, and it has since been refined and included in several other projects with the goal of reducing build time.

Enabling PCH for a project is not automatic. You need to write a header file including enough library headers, and pass this big header to the ADD_PRECOMPILED_HEADER macro. It is also a good idea to use ADD_DEPENDENCIES in the targets, to ensure that the PCH file is generated before the target components. The macro automatically adds an "-include" argument to each compilation step in the directory, so PCH can be enabled or disabled for a project without needing to modify the sources.

The trick is creating the big header file with enough #includes to be worthwhile, but not too many. Compiling the PCH file costs not only time, but also many megabytes of disk space. For KMid2 this header is named "qt_kde.h" and includes mostly Qt and KDE headers, and also several Standard Library and ALSA includes.

From my development machine, here are some measurements comparing compile times with and without PCH.

Without PCH
real    1m44.137s (104137 ms)
user    1m25.417s
sys     0m8.149s

With PCH
real    1m21.081s (81081 ms)
user    1m10.732s
sys     0m8.641s

KMid2 is a very small application. For comparison, Rosegarden 1.7 needs between 11 and 15 minutes to build on the same machine. The savings in KMid2 using PCH are 23 seconds, 22.14%. Not too bad! But KDE's build system has another way to save compile time, and it is ready out of the box for any KDE project through the CMake option "-DKDE4_ENABLE_FINAL=yes". This setting creates a Single Compilation Unit for each target. Times measured for KMid2:

With ENABLE_FINAL, and without PCH:
real    1m19.808s (79808 ms)
user    1m14.025s
sys     0m4.932s

Savings: 24.329 seconds, 23.36%
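The savings figures can be reproduced from the "real" times alone. Here is a small Python sketch of that arithmetic; the helper names are mine and are not part of any build script.

```python
def to_ms(minutes, seconds):
    """Convert a reading such as 1m44.137s into whole milliseconds."""
    return round((minutes * 60 + seconds) * 1000)


def savings(baseline_ms, improved_ms):
    """Return (milliseconds saved, percentage saved relative to the baseline)."""
    saved = baseline_ms - improved_ms
    return saved, 100.0 * saved / baseline_ms


# "real" times taken from the measurements above
baseline = to_ms(1, 44.137)    # without PCH
with_pch = to_ms(1, 21.081)    # with PCH
with_final = to_ms(1, 19.808)  # with ENABLE_FINAL, without PCH

for label, value in (("PCH", with_pch), ("ENABLE_FINAL", with_final)):
    saved, percent = savings(baseline, value)
    print(f"{label}: {saved / 1000:.3f} s saved, {percent:.2f}%")
```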

Conclusion: ENABLE_FINAL is slightly better than PCH, and it can be automatically used by all KDE4-based applications. So why bother with PCH? I think that PCH may win if you can use the same PCH file when compiling a whole set of KDE programs at once, but I've not tested this hypothesis yet...

Friday, December 4, 2009

Tempus Fugit

Time is the dimension of music, just as width and height are the dimensions of pictures. When transmitting MIDI data on a wire, events don't need to be marked with time labels, because time simply flows. MIDI is a real-time protocol. When the events are stored in data structures, like SMF (Standard MIDI Files), the events must be timestamped. In SMF, each event has a delta-time, the elapsed time since the previous event, measured in ticks. The time formats in SMF closely resemble the conventions of the written music tradition, the musical score.

Let's start with the metronome. A metronome is a device that signals musical tempo. It is used by musicians to keep the rhythm and speed while practicing. Metronome units are beats per time unit, for instance crotchets per minute. MM=60 means 60 crotchets in one minute, or one crotchet equal to one second. The crotchet, also known as the quarter note, is half the length of a half note, and the half note is half the length of a whole note. Conversely, half of a crotchet is an eighth note, which is twice the length of a sixteenth. These figures are the time lengths of the musical notes, and they are relative measures. Music is written in terms of relative times.

In SMF, time is measured in ticks. The division value, which is declared in the file header, is the number of ticks in a quarter note. Common division values are 96, 120, 240, 384, 960. The numbers are usually divisible by 3 and 4, and big enough to avoid fractional tick counts when measuring the length of very short notes.
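To see why divisibility by 3 and 4 matters, here is a small Python sketch that converts note figures into ticks and refuses any figure that does not fit a whole number of ticks. The function name and the choice of exact fractions are mine, only for illustration.

```python
from fractions import Fraction


def ticks_for(division, quarters):
    """Length in ticks of a note lasting `quarters` quarter notes.

    Raises ValueError when the division cannot represent the note exactly.
    """
    ticks = Fraction(division) * Fraction(quarters)
    if ticks.denominator != 1:
        raise ValueError(f"division {division} cannot represent {quarters} quarters")
    return int(ticks)


# With division 96: a sixteenth is a quarter of a quarter note,
# and an eighth-note triplet is a third of a quarter note.
print(ticks_for(96, Fraction(1, 4)))  # sixteenth note
print(ticks_for(96, Fraction(1, 3)))  # eighth-note triplet
```

A hypothetical division of 100 would fail for the triplet, because 100 is not divisible by 3.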

Tempo changes are SMF meta-events. They may happen at any point in the time line and are represented in microseconds per quarter note. This magnitude is inverted with respect to the usual metronome units (quarters per minute), so that it can be represented by integers instead of decimal numbers. For instance, MM=60 would be represented as SMF tempo=1000000, and MM=120 as tempo=500000.
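The conversion between both units is a single division, because one minute has 60,000,000 microseconds. A minimal Python sketch, with function names of my own choosing:

```python
MICROSECONDS_PER_MINUTE = 60_000_000


def bpm_to_smf_tempo(bpm):
    """Quarter notes per minute -> microseconds per quarter note (integer)."""
    return round(MICROSECONDS_PER_MINUTE / bpm)


def smf_tempo_to_bpm(tempo):
    """Microseconds per quarter note -> quarter notes per minute."""
    return MICROSECONDS_PER_MINUTE / tempo


print(bpm_to_smf_tempo(60))      # one quarter note lasts one second
print(bpm_to_smf_tempo(120))     # one quarter note lasts half a second
print(smf_tempo_to_bpm(500000))  # back to metronome units
```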

As note start and length properties are represented by relative magnitudes instead of absolute times, and tempo changes are flexible enough to be placed at arbitrary points, it is easy to modify them without side effects. If you insert a tempo change in an SMF using a MIDI editor, you don't need to change the length or the starting time of the following or previous notes. Of course, when the SMF is finally rendered for listening, the sequencer engine needs to do all the time calculations. But not so fast, wait a minute!

When you play MIDI files, it is a common requirement to play slower or faster than the nominal tempo encoded in the file. It is like zooming in picture viewers. For instance, a student may need to render a piece at half speed or slightly slower than it normally plays, or a dancer may need it a bit faster. The ALSA sequencer engine provides a very handy mechanism for programs to do this: the time skew property. You can enjoy this variable-speed functionality in KMid2, and in KMidimon when it is used to play MIDI files.
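The time calculation itself can be sketched in a few lines of Python. This is plain arithmetic, not the ALSA API: the tempo map layout and the `speed` parameter are my own assumptions, with `speed` standing in for the effect of the skew (0.5 plays at half speed, 2.0 at double speed).

```python
def ticks_to_seconds(tick, division, tempo_map, speed=1.0):
    """Absolute time in seconds of a position given in ticks.

    tempo_map is a list of (start_tick, microseconds_per_quarter) pairs,
    sorted by start_tick, and assumed to begin at tick 0.
    """
    seconds = 0.0
    for i, (start, tempo) in enumerate(tempo_map):
        if start >= tick:
            break
        # This tempo lasts until the next tempo change, or until `tick`.
        next_start = tempo_map[i + 1][0] if i + 1 < len(tempo_map) else tick
        end = min(next_start, tick)
        seconds += (end - start) * tempo / (division * 1_000_000)
    return seconds / speed


# Two quarter notes at MM=60, then a tempo change to MM=120.
tempo_map = [(0, 1_000_000), (192, 500_000)]
print(ticks_to_seconds(384, 96, tempo_map))             # nominal speed
print(ticks_to_seconds(384, 96, tempo_map, speed=0.5))  # half speed
```

Note that the notes and the tempo map are not touched at all when the speed changes; only the final conversion is scaled.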

Thursday, December 3, 2009

Parallel Lives

Plutarch was a Roman-Greek historian and biographer, well known for his work "Parallel Lives", a series of biographies of famous Greeks and Romans, arranged in pairs to illuminate their virtues and vices. I wanted to try a similar approach, pairing some citizens of digital audio and digital image technology. But the vices and virtues only apply to computer users and developers, because technology, like science, is neutral.

Let me start before the computer era. At some point in the past, painting was the only way to fix an image on a 2D surface. Sound, on the other hand, was impossible to fix in a literal way. It could only be written down using a symbolic music language, the same one used today (musical scores). In the 19th century, photography and the phonograph appeared. At this point there were two artistic approaches for taking images: painting and photography. Music has a similar duality: it is still possible to “draw” scores, and also to take snapshots of musical performances to be preserved and later reproduced literally. Of course there are non-artistic usages of photographic and phonographic recordings, as well.

Let's stop here a moment. After the birth of photography, nobody really has any doubt about the future of the plastic arts. Today, we all enjoy paintings, children learn to draw and paint, and architects and engineers use technical drawings in their daily jobs. And there is also photography, with artistic and non-artistic branches. We have specific tools in our computers to fulfill all the tasks related to images. There are Krita and Karbon14, GIMP and Inkscape, Photoshop and Illustrator. For each digital image program there is a vector drawing one. There is also CAD software, of course, and ray tracing. More about this later.

Bitmap (raster) graphics are digital representations of 2D images. A digital bitmap is a matrix of pixels (dots), each pixel representing the color (and sometimes transparency) of a point. Resolution, or pixel density, is measured in pixels per unit of length. Common resolutions are 100 dots per inch for display devices, and 600 dpi for printers. We can calculate the size in bits of a picture in bitmap format if we know the pixel size and the width and height of the picture in pixels. For instance, an image of 50x20 pixels in true color, 24 bits per pixel, weighs exactly 50*20*24 = 24000 bits = 3000 bytes. It is not possible to make a similar calculation for a vector image, as its size depends on the complexity and details of the drawing, and not on the dimensions.
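The bitmap calculation is a simple product. A tiny Python sketch, with a function name of my own, ignoring any file header or row padding that a real file format may add:

```python
def bitmap_size_bits(width, height, bits_per_pixel):
    """Size in bits of the raw pixel data of an uncompressed bitmap."""
    return width * height * bits_per_pixel


bits = bitmap_size_bits(50, 20, 24)
print(bits, "bits =", bits // 8, "bytes")
```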

Digital audio is represented in PCM (pulse code modulation) format as a stream of samples. Each sample is a measure taken by a microphone at equal intervals of time. Resolution, or sampling frequency, is measured in samples per time unit, for instance in Hz (samples per second). Common resolutions are 44100 Hz for CD quality recordings, or 96000 Hz for higher quality. We can calculate the size of an audio recording knowing the number of channels, the sample size, the sampling frequency and the time length. For instance, a recording of 1 second of monaural (1 channel) sound, using 24 bit samples at 44100 Hz, takes 24*44100*1*1 = 1058400 bits = 132300 bytes.

Both bitmap graphics and PCM digital audio recordings are usually compressed to reduce their size. Some of the compression methods also discard the less relevant information. The JPEG and MP3 formats are examples of lossy compression. Uncompressed formats are, for instance, BMP and WAV. Lossless compression examples are PNG and FLAC.
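The audio counterpart adds time as a factor. Again a minimal Python sketch with names of my own, counting only the raw samples and no container header:

```python
def pcm_size_bits(seconds, channels, bits_per_sample, sample_rate):
    """Size in bits of the raw samples of an uncompressed PCM recording."""
    return seconds * channels * bits_per_sample * sample_rate


bits = pcm_size_bits(seconds=1, channels=1, bits_per_sample=24, sample_rate=44100)
print(bits, "bits =", bits // 8, "bytes")
```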

Vector graphics are stored in several file formats. A modern standard is SVG, based on XML. There is also an XML-based music representation, called MusicXML. A very common music format, part of the MIDI standard, is SMF (Standard MIDI File), using the .MID filename extension. Vector graphics need to be rendered into bitmap graphics before being displayed, using a graphics rendering library such as Cairo. The same happens with MIDI music, which is rendered into audio by a synthesizer.

Vector graphics are usually schematic, not very photorealistic. But there are some ray tracing programs allowing photorealistic rendering of a symbolic, textual vector graphics source. Something similar happens with MIDI rendering using samplers, producing orchestral music with a high level of realism. But many MIDI synthesizers leave a very characteristic electronic timbre.

When you render an empty vector graphic (say, a simple white surface) into a bitmap, you can see that the size of the resulting rendering doesn't depend on the image contents, while the size of the original vector graphics image does. The same goes for MIDI. If you create a MIDI file of John Cage's composition 4'33'' (which is absolute silence), it will take only a few bytes. The rendering into digital audio weighs several megabytes uncompressed, the same size taken by 4 minutes and 33 seconds of a jazz song.

About the transformations. The name Scalable Vector Graphics already hints at one of its strengths: scalability. You can resize a vector graphic without losing quality, in contrast to a bitmap resize operation. In digital audio, the only dimension is time. You can change the speed, the tempo of a musical composition in MIDI format, without losing quality at all. Or you can transpose a song, lowering or raising the notes' pitches. In digital audio you can do those transformations (time stretch and pitch shift) using an FFT (fast Fourier transform) algorithm, but it usually creates artifacts. You can do more transformations in MIDI: mute one instrument, or change it for another one. Adjust the volume of an instrument, or even of individual notes.
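To show how cheap these transformations are on symbolic data, here is a Python sketch. The Note structure is hypothetical, just a simplified stand-in for the events of a MIDI file, and I am using the note velocity as an approximation of the loudness of individual notes.

```python
from typing import NamedTuple


class Note(NamedTuple):
    start: int     # start time in ticks
    length: int    # duration in ticks
    pitch: int     # MIDI note number, 0..127 (60 is middle C)
    velocity: int  # 1..127, roughly how hard the key was struck


def transpose(notes, semitones):
    """Shift every pitch by a number of semitones; timing is untouched."""
    result = []
    for note in notes:
        pitch = note.pitch + semitones
        if not 0 <= pitch <= 127:
            raise ValueError(f"pitch {pitch} is outside the MIDI range")
        result.append(note._replace(pitch=pitch))
    return result


def scale_velocity(notes, factor):
    """Make every note louder or softer, clamped to the valid range."""
    return [
        note._replace(velocity=max(1, min(127, round(note.velocity * factor))))
        for note in notes
    ]


melody = [Note(0, 96, 60, 100), Note(96, 96, 64, 100), Note(192, 192, 67, 100)]
print(transpose(melody, 2))         # a whole tone higher
print(scale_velocity(melody, 0.5))  # softer
```

Nothing is resampled or interpolated here, so there is no loss of quality and there are no artifacts: the synthesizer simply plays different notes.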

There are also deep differences in computer support for these technologies. Digital audio is very well supported, like bitmap graphics. Vector graphics are also very well supported, and the adoption of the SVG standard has been very beneficial for Graphical User Interfaces. MIDI is, by contrast, really badly supported, in both hardware and software. Modern graphics cards include 2D and 3D acceleration, graphics libraries like Cairo and Qt4 obtain a real benefit from the hardware enhancements, and graphics acceleration is being increasingly adopted by most computer manufacturers and software vendors. On the other hand, audio hardware interfaces have been dropping MIDI support, which was never very good (remember the old cheesy sound of MIDI files?) And Linux distros don't include an easy to use software synthesizer. Both Windows and Mac OSX include a software synthesizer ready to be used out of the box, including the Roland SoundCanvas (lite) SoundFont, which is not a beast but has reasonable quality. Meanwhile, the Linux distros don't care about that. And what about MIDI software? OSSv4 dropped MIDI support. The ALSA sequencer is very good, but needs more good applications.

Wednesday, December 2, 2009

VMPK supported platforms

I've read a Debian bug report complaining that VMPK fails to compile on FreeBSD (by the way, it will also fail on other Unix flavors, except maybe SGI Irix).

When I was planning VMPK, it was going to be a Linux-only program, based on the ALSA sequencer. Indeed, there is still a test program in my aseqmm library using the same piano widget. Because I only needed limited functionality, I decided to try RtMIDI instead, which offers Windows and Mac OSX support. As Qt4 is also supported on these operating systems, the program was born multi-platform.

Maybe the number of supported platforms is going to grow in the future. That depends on RtMIDI. Any volunteer out there? The developer of RtMIDI, Gary P. Scavone, is very collaborative and kindly accepted the patches I've sent him in the past. About OSS, there is a note in the RtMIDI documentation saying that a decision was made to not include support for the OSS API. And there is no MIDI support in OSSv4. But there are more MIDI APIs, like Jack MIDI...

Tuesday, December 1, 2009


So, why am I writing this blog?

I am a software developer, both to make a living and as a passion, a vocational activity: developing free software pro bono. Almost always for Linux, using the Qt and KDE platforms, and MIDI technology. That is: music software. Some of my programs: KMid2, a MIDI/karaoke player; VMPK, a virtual piano; KMetronome, a MIDI metronome; KMidimon, a MIDI monitor. I will talk here about these, and other computer programs.

Sometimes people are surprised by my interest in MIDI. Isn't it a dead, ancient technology? Why should anybody be interested in MIDI, when there are better alternatives like MP3? What is MIDI really about? I will try to answer these questions in future posts.

Meanwhile, a few days ago I released a preview of KMid2, a new incarnation of the classic KMid software. Here is a little demo. "Ay, linda amiga" is an anonymous Spanish renaissance song from the "Cancionero de Palacio".