Making MIDI Art from Text

When it’s someone’s birthday, you usually sing them “Happy Birthday”. When you’re in quarantine, this is not feasible, and sending a recording is too cheesy. The text “Happy birthday! 🎉” doesn’t really do the music much justice. So I’m left with one, and only one, option: make MIDI art that spells out “happy birthday”, and make sure it sounds nothing like the original melody. Here’s what the script produced:

Happy Birthday

A normal person might just do this using the marker tool in the piano roll section of a DAW, but where’s the fun in that? So guess what: I decided to build a generic tool, which nobody asked for, that takes in any string and turns it into MIDI.

Before we go there though, what about existing solutions? I’ve heard of Algoart and similar software, but they don’t really do what I want them to. So here we are.

Interfacing with MIDI

MIDI uses serialized data to represent music. A MIDI file can hold multiple tracks, and the messages on those tracks are addressed to channels. For what I’m doing here, we don’t really have to go into the depths of how MIDI works. All we really have to know is that MIDI files can have multiple tracks, and you write out notes to a single track.

mido was the quickest way to get this done in Python. Here’s how you’d write out a single note to a MIDI file with Python:

import mido

mid = mido.MidiFile()
track = mido.MidiTrack()
mid.tracks.append(track)  # attach the track, or it won't be saved with the file

track.append(mido.Message('note_on', note=60, velocity=100, time=32))
track.append(mido.Message('note_off', note=60, velocity=100, time=0))

# Save the file
mid.save('test_single_note.mid')

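The note=60 specifies which note you want to play (60 is middle C), and velocity=100 specifies how hard you want to hit the key. time=32 is a delta time in ticks: how long to wait after the previous message before sending this one. A note sounds from its note_on until its note_off, so its length is whatever delta sits on the note_off. From this, it’s easy to see how the code above creates a note (by turning it on and off).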
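The time field is the part that’s easiest to get backwards, so here’s a small sketch of how delta times add up. It doesn’t use mido at all: the (type, note, delta) tuples and the note_spans helper are made up for illustration and only mirror the message fields used above.

```python
# Plain tuples stand in for mido messages: (type, note, delta_ticks).
# This layout is hypothetical and only mirrors the fields used above.
def note_spans(messages):
    """Return (note, start_tick, end_tick) for each note_on/note_off pair."""
    now = 0        # absolute time in ticks
    started = {}   # note number -> tick at which it was switched on
    spans = []
    for kind, note, delta in messages:
        now += delta  # every delta is measured from the previous message
        if kind == 'note_on':
            started[note] = now
        elif kind == 'note_off' and note in started:
            spans.append((note, started.pop(note), now))
    return spans

# Delay on the note_on: the note starts at tick 32 and ends immediately.
print(note_spans([('note_on', 60, 32), ('note_off', 60, 0)]))  # [(60, 32, 32)]
# Delay on the note_off: the note starts at tick 0 and lasts 32 ticks.
print(note_spans([('note_on', 60, 0), ('note_off', 60, 32)]))  # [(60, 0, 32)]
```

So if you want a note to be held, the delay goes on the note_off rather than on the note_on.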

What’s the sound of a character?

Although we see the correspondence between a string and how it looks on screen, a computer doesn’t. s = 'Hello World' represents a sequence of character codes; it does not tell the computer that the H is being rendered with this font, at this size, etc. So we actually have to render this text into an image to be able to write it out to MIDI.

Since a typeface is typically represented by parametric curves, we can’t just read an image straight off a string. We have to rasterize our string using a particular font to get our image. Visually, this is what that process looks like:

Image taken from “A Closer Look At Font Rendering” (Smashing Magazine)

We can convert this into a binary image, and lower the resolution enough so that each pixel can be converted into a note. This note can then be written out to a MIDI file.
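Here’s a rough sketch of that step in plain Python. It assumes the rasterized text is already available as a grid of 0..255 brightness values where bright means ink; the to_note_grid name, the block size, the threshold and the velocity of 100 are all placeholders of mine, not what text2midi actually does.

```python
def to_note_grid(gray, block=2, threshold=128, velocity=100):
    """Downsample a grayscale image and binarize it into note velocities.

    gray is a list of equal-length rows of 0..255 values (bright = ink).
    Each block x block tile becomes one cell: `velocity` if its mean
    brightness reaches `threshold`, otherwise 0 (silence).
    Rows and columns that don't fill a whole tile are dropped.
    """
    rows, cols = len(gray), len(gray[0])
    grid = []
    for top in range(0, rows - block + 1, block):
        out_row = []
        for left in range(0, cols - block + 1, block):
            tile = [gray[r][c]
                    for r in range(top, top + block)
                    for c in range(left, left + block)]
            mean = sum(tile) / len(tile)
            out_row.append(velocity if mean >= threshold else 0)
        grid.append(out_row)
    return grid

# A 4x4 image with a bright top-left quadrant and a faint bottom-right one.
image = [
    [255, 255,   0,   0],
    [255, 255,   0,   0],
    [  0,   0,  60,  60],
    [  0,   0,  60,  60],
]
print(to_note_grid(image))  # [[100, 0], [0, 0]]
```

The faint quadrant falls below the threshold and becomes silence, which is the whole point of binarizing: every cell either plays a note or doesn’t.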

Writing to MIDI

Now we just have to carefully iterate over this array of numbers and write it out to MIDI to get our MIDI art back. Here’s the for-loop that does exactly that (note_vels is the 2D array of pixel values, which double as note velocities):

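for col in note_vels.T:  # walk the image one pixel column at a time
    new_col = True
    for j, note_vel in enumerate(col):
        if note_vel != 0:
            # Only the first note of a column advances time by 20 ticks;
            # the others get a delta of 0 and stack up on the same tick.
            time = 20 if new_col else 0
            note = 60 - 2 * j  # rows further down the image map to lower notes
            track.append(mido.Message('note_on', note=note, velocity=note_vel, time=time))
            track.append(mido.Message('note_off', note=note, velocity=127, time=0))
            new_col = False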
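To see what that loop emits, it helps to run the same logic on a tiny grid. This is a mido-free sketch: grid_to_events and its parameters are hypothetical names of mine, the events are plain tuples instead of mido messages, and the range check is an addition that the loop above doesn’t have.

```python
def grid_to_events(grid, top_note=60, step=2, column_ticks=20):
    """Turn a grid of velocities (rows x columns) into note events.

    Returns (type, note, velocity, delta_ticks) tuples in track order.
    """
    events = []
    for col in zip(*grid):          # walk the image column by column
        first_in_col = True
        for j, vel in enumerate(col):
            if vel == 0:
                continue            # blank pixel, no note
            note = top_note - step * j
            if not 0 <= note <= 127:
                raise ValueError(f"row {j} maps outside the MIDI range: {note}")
            delta = column_ticks if first_in_col else 0
            events.append(('note_on', note, vel, delta))
            events.append(('note_off', note, 127, 0))
            first_in_col = False
    return events

grid = [
    [100,   0],
    [100, 100],
]
for event in grid_to_events(grid):
    print(event)
# ('note_on', 60, 100, 20)
# ('note_off', 60, 127, 0)
# ('note_on', 58, 100, 0)
# ('note_off', 58, 127, 0)
# ('note_on', 58, 100, 20)
# ('note_off', 58, 127, 0)
```

Two things fall out of this. With a top note of 60 and a step of 2, row 30 lands on note 0, so the image can be at most 31 rows tall. And a column with no lit pixels adds no delay at all, so blank columns don’t take up any time in the output.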


Technically this is not that exciting, but I can still do something that I couldn’t quite get done with the software out there. I find it quite awesome that all it took was a really small chunk of code and some bored minutes.

Here’s the original MIDI art I made by hand:


If you want to try this out for yourself, you can do it as follows:

pip install text2midi

Once you’ve installed the script, the following is a generic template for using text2midi:

python -m text2midi "<message string>" path/to/output/file.mid

For example, you could try running something like this (single quotes keep the shell from touching the exclamation mark, so no backslash is needed):

python -m text2midi 'Hello, World!' hello.mid

To view the generated MIDI file, you can use the DAW of your choice. Something like Logic Pro X, Reaper, or (probably) GarageBand will work just fine.