(If you're interested in this topic, I have more detail about this project in the notes/slides/video of my May 2009 presentation at the Rails Project Night)
Part of the reason that I haven't been blogging for the last few months is that I've been very busy working with a startup called Careerious. They had been trying to build a business around a crufty and badly-implemented C#/ASP application, and I was called in (along with Paul Doerwald) to rewrite it in Ruby on Rails. The new version of the application is now live and is a significant improvement - and that's not just because I say so: some parts of the application run 50 times faster than before.
Much as I'd like to make a Reddit-friendly headline such as "Rails Runs Fifty Times Faster Than .NET", that wouldn't exactly be fair. A more precise title might be "Competent Rails Developers Replace Crappy Unoptimized .NET Application And Get Rid Of Many N+1 Problems To Improve Performance" - which isn't quite so bait-worthy.
The original version of the application was started several years ago, built on top of an even older application that was used for municipal planning. This meant that not only was there a lot of unused code and database cruft, but the core was built in 2001-era C# and .NET, back when 3-tier architectures and object-relational mappings were esoteric 'leading edge' technologies. Not only was the foundation weak, but over time the project was taken over by junior staff - so the architecture was never cleaned up and legacy quirks turned into "we can't change it now because we don't really know how it works and we might break something" standards.
Deciding to rebuild an application from scratch is usually a bad idea. Joel Spolsky has a famous essay about how rebuilding from scratch lost Netscape the Browser Wars with Microsoft. At the last Presentation Night, Steven Baker repeated the advice: If it ain't broke, don't fix it. I've learned this myself - sometimes the hard way.
This was a special case for two reasons. Firstly, it wasn't a complete rewrite: we kept the basic database structure, but cleaned it up a lot (from 250 tables to about 50), and we hardly made any changes at all to the pages, forms, and navigation. Secondly, the middleware code was broke: logging in as an employer used to take up to a minute, maxing out the server for that entire time.
The new Rails version we built does the same process in slightly over a second.
The new version also, for the first time, allowed the Careerious management to see how their proprietary algorithms actually worked in the application: we built a test interface that let them run their algorithms against real data and track every step of their operation. They also had easy access to manage and maintain the data for the first time - the C#/.NET application had a slow and complicated proprietary command-line admin tool (that included 30 database tables of meta information), which we replaced with a nice scaffolding-style web interface. Paul Doerwald did some very slick metaprogramming to make this work with a minimal amount of custom coding - hopefully he'll present on it soon.
Some interesting points:
- C# is completely readable to someone who is familiar with Java. In fact, I did the entire rewrite in Emacs on my MacBook Pro - I never installed Visual Studio or any other Microsoft tools. Careerious provided us with all of the source code as well as the entire data exported into PostgreSQL format. C# is so similar to Java that I was able to use Emacs' "java-mode" for colouring and formatting of the C# code.
- Even though much of the original code was messy, it was easy to figure out where most things went. The .NET framework is pretty clear. It may not be three-tier, but at least it is two-tier: there are no PHP-style database calls directly on the web pages. Even without the fancy Visual Studio tools, it's easy to see what's going on, where different page components come from, and where the form submissions go. It may be because the original programmers weren't very sophisticated, but there was a refreshing lack of the 'invisible magic' that makes some Ruby and Rails code so hard to figure out.
- The original MS SQL Server database didn't use serial integers as primary keys (SQL Server does support them, but this application didn't use them) - every row in the database was given a long alphanumeric ID that is universally unique. I can see how this would be useful in a situation where databases need to be replicated, synchronized, or sliced - but it was terribly abused by the original developers: in a column called 'candidate_id' only half of the entries would actually be keys to candidates, while the rest would be keys to other tables entirely! The original code had to check foreign keys against multiple tables - using multiple queries each time. You can see why some things took a while to run.
- Many people complain that object-relational frameworks like Hibernate or ActiveRecord are inefficient. I can certainly see how this would be the case for certain kinds of retrieval, such as getting totals from complex hybrid datasets, or when you only need one or two column values from a large table. However, ActiveRecord was one of the keys to the massive performance improvement we were able to make. Logging in as an employer used to take so long because the first screen after logging in is a list of job candidates sorted by a set of very complex criteria. In the original code, without an object-relational framework, each matching criterion that was checked against a candidate (and the many tables that join on the candidate table) generated its own SQL query. For an employer with several hundred candidates and a complex set of matching criteria, this could easily lead to tens of thousands of queries being run just to load this page. When we redid this logic in Rails, we made sure to preload all of the pertinent data before doing the comparisons - that way all of the checks were against objects in memory. Half a dozen queries up front run a lot faster than tens of thousands of queries throughout the process.
- Ruby is a more succinct language than C#, and Rails is a more succinct framework than .NET. We also cleaned up a lot of poorly written code (in one place a chunk of code was copied-and-pasted 27 times rather than simply put into its own function). At the end of the day, our version of the application has roughly one fifth of the code of the original. Interestingly, the new version also has about one fifth as many database tables as the original.
- Part of the application's requirements was generating graphics and PDFs. The original used a Microsoft framework for the graphics, and a translated version of a Java framework for the PDFs. For Ruby, we used the (infamous) RMagick library for graphics and the venerable PDF::Writer library for PDFs. From comparing the code calls, I would have to say that .NET/C# seems to have the better frameworks. RMagick and PDF::Writer were certainly usable, but the C# code was able to simply call methods for, say, 'pie slice' while we had to build our own versions. (And yes, I know about Prawn, but I didn't want to commit to pre-1.0 software on a project with a tight timetable like this one.)
- Timeframes for changes and fixes to the original C# version were measured in weeks or months. Now changes are measured in terms of hours or days. Since the new version has so much less code, fewer tables, and is better organized, it's much easier to work with. Careerious have a lot of plans for the future, and now they aren't being held back by their code.
- I spend a lot of my time these days working on my own applications, where every single decision, from interface to database schema, is my own. It was refreshing to work on such a tightly-constrained project. I didn't have to worry about the fine points of the navigation or user experience, or weigh the consequences of deep architecture decisions - I just needed to make it do what it was already doing, only better. Ruby on Rails vs. a crufty old C# application was hardly a fair fight, but it certainly was fun!
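To make the preloading point from the list above a bit more concrete, here is a rough sketch in plain Ruby. It is not the Careerious code: `FakeDb`, `match_one_by_one`, `match_preloaded`, and the skills data are all made up for illustration, and the "database" is just a hash that counts how often it is asked something. In a real Rails app the same idea would normally be expressed through ActiveRecord's eager loading rather than by hand.

```ruby
# Hypothetical in-memory stand-in for a database; it counts every "query".
class FakeDb
  attr_reader :query_count

  def initialize(skills_by_candidate)
    @skills_by_candidate = skills_by_candidate
    @query_count = 0
  end

  # One query per call: does this single candidate have this single skill?
  def has_skill?(candidate_id, skill)
    @query_count += 1
    @skills_by_candidate.fetch(candidate_id, []).include?(skill)
  end

  # One query in total: load the skills of every listed candidate at once.
  def skills_for_all(candidate_ids)
    @query_count += 1
    @skills_by_candidate.slice(*candidate_ids)
  end
end

# Old style: every criterion checked against every candidate hits the database.
def match_one_by_one(db, candidate_ids, required)
  candidate_ids.select do |id|
    required.all? { |skill| db.has_skill?(id, skill) }
  end
end

# New style: preload once, then run all the checks against objects in memory.
def match_preloaded(db, candidate_ids, required)
  skills = db.skills_for_all(candidate_ids)
  candidate_ids.select do |id|
    required.all? { |skill| skills.fetch(id, []).include?(skill) }
  end
end

# Made-up sample data: candidate id => list of skills.
data = { 1 => %w[ruby sql], 2 => %w[ruby], 3 => %w[sql java] }
required = %w[ruby sql]

slow_db = FakeDb.new(data)
fast_db = FakeDb.new(data)

slow = match_one_by_one(slow_db, data.keys, required)
fast = match_preloaded(fast_db, data.keys, required)

puts "same matches: #{slow == fast}"
puts "queries used: #{slow_db.query_count} one-by-one, #{fast_db.query_count} preloaded"
```

Both versions pick the same candidates; the only difference is how many times the database gets asked. With three candidates the gap is small, but the one-by-one count grows with candidates times criteria while the preloaded count stays fixed.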
So if you have a messy old legacy C# application, seriously consider switching it over to a modern application framework like Rails - especially if you're considering a rewrite anyhow. You won't have to pay license fees or depend on proprietary tools from a single vendor (I can tell you many tales about that from my Lotus Notes days) - and the new version is likely to be easier to maintain and improve.
Also, you should definitely check out the new Careerious. It's a unique tool for job seekers, employers, or HR staff. It uses some unique and powerful technology - trust me, I rewrote most of it!