My first Ruby program, and converting Evolution maildirs to mbox

The Pragmatic Programmers famously recommend learning one new programming language every year. Last year I learned AppleScript (page in German). My plan to learn Objective-C this spring didn’t work out for lack of time, but now I have come closer to fulfilling my 2005 obligation: I wrote my first useful Ruby program.

Email archival woes: The background is this. When I left HP Labs two weeks ago, I wanted to take the email folders from my HP-provided Linux box with me. Stupidly, I just copied the Evolution folder without thinking about how to import it into my email client, Apple’s Mail.app. The right thing would have been to export the folders in mbox format, because Mail can import mbox but cannot import the maildir format used by Evolution.

Luckily, maildir to mbox conversion didn’t look too hard. Both store the emails in plain text. Maildir stores each message in its own file (or multiple files for MIME-multipart messages). Mbox stores an entire folder in one file.
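To make the mbox side concrete, here is a minimal Ruby sketch of the general idea, not the script linked further down. It assumes each message is already loaded as a string, and the method name to_mbox as well as the placeholder sender and date on the separator line are made up for illustration. An mbox file is simply the messages one after another, each introduced by a line starting with "From " and followed by a blank line; body lines that happen to start with "From " are commonly quoted with ">" so they are not mistaken for a separator.

```ruby
# Sketch only: joins already-loaded messages into one mbox-style string.
# The sender and date on the separator line are placeholders (assumptions),
# a real converter would usually take them from each message.
def to_mbox(messages, sender: "MAILER-DAEMON", date: "Thu Jan  1 00:00:00 1970")
  messages.map do |message|
    # Quote lines that would otherwise look like a new separator line.
    escaped = message.gsub(/^From /, ">From ")
    # Make sure the message ends with a newline before the blank line.
    escaped += "\n" unless escaped.end_with?("\n")
    # Separator line, the message itself, then one blank line.
    "From #{sender} #{date}\n#{escaped}\n"
  end.join
end
```

Calling to_mbox with an array of message strings returns one string that could be written to a file and, under the assumptions above, imported as an mbox folder.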

Shell scripting: So I set out to do the conversion. I started with a bit of cat and grep and sed on the terminal command line. That looked as if it could work. When the commands didn’t fit on one line any more, I moved them into a bash script. Converting simple plain text emails took about five lines. After half a screen, HTML emails and simple attachments worked. I believe this is actually the most complex shell script I’ve ever written. Mendel Cooper’s Advanced Bash-Scripting Guide was an excellent companion.

But some emails, e.g. with other emails attached inside, turn out to be quite complex and require some form of recursion. This is where I decided to switch tools; when you start to need recursion, bash is quite definitely not the right tool for the job. Normally I would do something like this in PHP (I have a Perl aversion), but having been impressed by Ruby recently, I decided to give it a try.

First steps with Ruby: Starting out with Ruby is simple. I had already read some Martin Fowler posts and I went through Vincent Foley’s excellent Ruby on Rails tutorial a couple of days ago, so I already knew basic syntax (calling and defining methods, doing conditionals and loops) and some nice block-and-closure-fu. Together with the Pragmatic Programmers’ online version of Programming Ruby and the online class library reference and some 1337 G00g13 ski11z, that’s enough to get going.

65 lines later, all emails were translated. I’ve used file IO, directory listing, array filtering and regexes.
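For readers who haven’t seen Ruby, here is a small sketch of what directory listing, array filtering and regexes look like together. It is an illustration and not an excerpt from the script; the helper names message_files and subject_of are hypothetical, and it assumes a directory with one plain message file per email.

```ruby
# Hypothetical helper: returns the paths of the regular files in dir,
# sorted by name, skipping ".", ".." and other dotfiles.
def message_files(dir)
  Dir.entries(dir)
     .reject { |name| name =~ /\A\./ }      # array filtering with a regex
     .map    { |name| File.join(dir, name) }
     .select { |path| File.file?(path) }    # drop subdirectories
     .sort
end

# Hypothetical helper: pulls the Subject header out of a raw message.
# Only the header block (everything before the first blank line) is searched.
# Returns nil when there is no Subject header.
def subject_of(raw)
  headers = raw.split(/\r?\n\r?\n/, 2).first.to_s
  headers[/^Subject:\s*(.*)$/i, 1]
end
```

Something like message_files(".").map { |path| subject_of(File.read(path)) } would then list the subjects of all messages in the current directory.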

The script: Here it is. Run it inside the maildir folder and redirect stdout into an mbox file: ruby maildir_to_mbox.rb > output.mbox

maildir_to_mbox.rb

If you know Ruby, please have a look. Did I do anything stupid? Any useful idioms that could improve the code?

Ruby is nice: So far, I like the language. The syntax is pleasantly devoid of ASCII noise, the libraries make life easy, the array handling is superb. My main peeve so far: There are too many ways to do the same thing (No, I don’t like Perl’s “There’s more than one way to do it” mantra). I’m looking forward to doing more with Ruby.

So what should I learn next year?
