I love Ruby, The Language. Love it.

Sometimes, code seems to just pour out of my fingers and I often get almost childish delight in the way the code turned out.

I don't always love ruby, the implementation. It's slow, even compared to other interpreted languages. Fortunately, things are getting better, but we're not there yet.

REXML certainly has a nice API; better than just about any XML API in any other language I've used. But it is not a strict parser, which kind of gives me the willies. However, since it's wonderful to use, that's what I usually use.

I needed to compare two XML files. I have some running code that generates a 250K XML file (no whitespace!) every hour, containing a list of foo items. Each foo item contains hundreds (possibly thousands) of bar items. I performed major surgery on the code and I wanted to make sure that the old and new versions were generating the same XML. I'm running the two code bases side by side for a while, so I wanted to be able to compare the XML easily.

The order of the foo items in the XML doesn't matter and the order of the bar items doesn't matter, just that the same bar belongs to the same foo in both XML files.
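For the curious, here is a minimal sketch of that kind of order-insensitive comparison with REXML. It is not the actual script: the real element names aren't foo and bar, and I'm assuming here that every item carries an `id` attribute that identifies it and that the foo elements sit directly under the root. The helper names `foo_index` and `same_items?` are made up for illustration.

```ruby
require 'rexml/document'
require 'set'

# Build a hash of foo id => Set of bar ids.
# Assumes (hypothetically) that foo and bar elements are identified by an "id" attribute.
def foo_index(xml)
  doc = REXML::Document.new(xml)
  index = {}
  doc.root.each_element('foo') do |foo|
    bars = Set.new
    foo.each_element('bar') { |bar| bars << bar.attributes['id'] }
    index[foo.attributes['id']] = bars
  end
  index
end

# Hash and Set equality both ignore ordering, so document order drops out.
def same_items?(xml_a, xml_b)
  foo_index(xml_a) == foo_index(xml_b)
end
```

Because the comparison happens on a hash of sets, shuffling foo items or bar items changes nothing, but moving a bar from one foo to another shows up as a difference.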

So I whip together a ruby script using REXML and fire it up on the MacBook Pro. The fan on the MacBook Pro starts to whine noticeably. I hate running CPU-intensive tasks on the MacBook Pro because I'm always afraid it will catch fire. So I copy the script and the XML files to another machine and run it there.

After waiting for a couple of minutes, I start to reconsider using Ruby for this task, especially since I know I'm going to have to run this comparison on multiple files.

So I fire up Eclipse and bang out a Java version (using dom4j). I find a bug or two, but it's all done. So I copy it over to the server where the ruby version is still running.

How slow is ruby + REXML? So slow that I can actually write this particular tool in Java and then run it in less time than it takes for the ruby version to run.

P.S. Apparently, if you want speed with REXML, you are supposed to use the StreamListener, which is all well and good, but the same is true in Java for DOM versus SAX, and I was still able to use dumb ole DOM for this dumb ole task.
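For reference, a StreamListener version might look roughly like the sketch below. I haven't benchmarked it, so treat it as an illustration of the API rather than a proven fix. The `FooBarListener` class and `stream_index` helper are hypothetical names, and the `id` attribute on foo and bar elements is an assumption on my part.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'
require 'set'

# Collects foo id => Set of bar ids while the parser streams through the file,
# so no DOM tree is ever built. The "id" attribute is an assumed identifier.
class FooBarListener
  include REXML::StreamListener
  attr_reader :index

  def initialize
    @index = {}
    @current = nil
  end

  # Called for every opening tag; attrs is a hash of attribute name => value.
  def tag_start(name, attrs)
    case name
    when 'foo'
      @current = @index[attrs['id']] = Set.new
    when 'bar'
      @current << attrs['id'] if @current
    end
  end

  # Leaving a foo means later bars no longer belong to it.
  def tag_end(name)
    @current = nil if name == 'foo'
  end
end

def stream_index(xml)
  listener = FooBarListener.new
  REXML::Document.parse_stream(xml, listener)
  listener.index
end
```

Two files can then be compared with `stream_index(a) == stream_index(b)`, since hash and set equality ignore ordering.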

P.P.S. I still love Ruby.