As a consultant, there’s a very common complaint that I hear from clients. The complaint is along the lines of, “It’s all such a mess,” or “We need to re-write it from scratch.” They’re almost always right in the first case, and almost invariably mistaken in the second. A messy codebase is a pain, but learning the wrong lesson from it just means that they’re going to experience the same pain all over again once they’ve done their re-write - if they’re still in business when they finish.

The first question to ask is simple: why is it all such a mess? If it’s a mess because you made a completely wrong technology choice (e.g. classic ASP for a point-of-sale application, or a thick client where a web client was required) or because the team that wrote it simply didn’t have a clue and have all since been fired (sorry: “found opportunities elsewhere”), then perhaps a re-write is in order. Other than that, there’s almost no good reason to do a complete re-write. Regardless, that’s not the point of this post.

The point of this post is that it’s usually such a mess because people don’t know how to fix it - or, more probably, people don’t know how to even decide on a strategy to follow and are drowning in technical debt as a result. Here’s a simple one:

Fix what you know is broken.

If you honestly have no idea where to start, ask for help. Plenty of people will give it. Firstly, though, try this:

  • Do you have source control? No? Then download Git and fix that.
  • Do you have continuous integration? No? Then download TeamCity and fix that.
  • Do you have unit tests? No? Then go and write at least a “Hello, world!” test to get yourself started.
  • Do you have an issue-tracking system? No? Then try YouTrack, JIRA or similar. There are countless options. The main thing is to start writing problems down. Use sticky notes if you have to.
  • Do you have an automated deployment solution? No? Octopus is your friend.
  • Do you have log aggregation? If not, Seq will make a world of difference to you.
  • Are you afraid to change the code? Well… work out at least one reason why, and fix it.
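
If the “Hello, world!” test in the list above sounds too trivial to bother with, here is roughly what one could look like. This is a minimal sketch in Python using the standard library’s unittest module; the greet function is a hypothetical stand-in for whatever small function in your own codebase you choose to test first.

```python
import unittest


def greet(name: str) -> str:
    """Hypothetical function under test; swap in something from your own code."""
    return f"Hello, {name}!"


class HelloWorldTest(unittest.TestCase):
    def test_greets_the_world(self):
        # One assertion is enough to prove the test runner is wired up.
        self.assertEqual(greet("world"), "Hello, world!")
```

Saved as, say, test_hello.py, this runs with `python -m unittest`. The test itself matters far less than the fact that you now have somewhere to put the next one.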

There’s really no excuse for not having these sorts of tools. Moreover, there’s no excuse for not having the agility that these sorts of tools offer.

Once you have a build, a rudimentary test suite and a deployment solution, the next step is clear:

Fix what you know is broken.

What’s at the top of your issue-tracking list? Does it make sense? If so, then that’s what’s broken. Go and fix it. If not, then its priority is what’s broken. Fix it by re-prioritising it so that it does make sense.

I visited one client recently that had a test automation task as a “drop everything and fix now” priority - but below that were cases that were costing parts of their business money every single day. In this case, the prioritisation was broken. So… fix it and move on.
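
To make the re-prioritising concrete, here is a rough sketch of ordering a backlog by what each issue costs the business per day. Everything in it is an assumption for illustration: the Issue type, the daily_cost field and the example figures are invented, and a real tracker will have its own fields and a messier notion of cost.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Issue:
    title: str
    daily_cost: float  # hypothetical estimate of money lost per day while unfixed


def reprioritise(issues: list[Issue]) -> list[Issue]:
    """Return the issues ordered so that the most expensive one comes first."""
    return sorted(issues, key=lambda issue: issue.daily_cost, reverse=True)


# Invented example backlog, currently in the "wrong" order.
backlog = [
    Issue("Automate the regression tests", daily_cost=0.0),
    Issue("Invoices fail for some customers", daily_cost=500.0),
]

top = reprioritise(backlog)[0]
print(top.title)
```

The numbers do not need to be precise. Even a rough estimate is usually enough to show that the item at the top of the list should not be there.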

Once you’ve fixed something that was broken, release it. That’s right: release the thing. “Oh,” you might say, “but it has to go through n levels of QA, UAT and sign-off first.” Guess what: that’s the next thing that’s broken. So… given that you know what’s broken, what now?

Fix what you know is broken.

I’m starting to sound like a broken record here, but I’m also starting to sound like a broken record whenever I have to deliver this lesson in person :)

You need to get your release cycles down to something manageable, and if you’ve had a messy codebase for a while then I guarantee that you’re afraid of releasing to production because of what might have changed while you weren’t looking.

The solution is to start releasing earlier and more often. Get used to the idea that a production release is boring and routine, not unfamiliar and scary. Releasing to production should be scripted and ideally entirely automated (but that’s the subject of a squillion other blog posts) so I’m not going to re-hash it here. Just accept that if you’re afraid of releasing to production then that’s the next thing that’s broken. After all, if you haven’t changed the code since your last release then what’s likely to go wrong? If you have changed the code, then what you’re really afraid of is your testing regime, not deployment per se.
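
For a sense of what “scripted” can mean at its very simplest, here is a hedged sketch of a release script that runs each step in order and stops at the first failure. The steps themselves are assumptions: deploy.py and its --target flag are hypothetical placeholders, and a proper deployment tool will do far more than this.

```python
from __future__ import annotations

import subprocess
import sys
from typing import Callable, Sequence

Step = Sequence[str]

# Hypothetical steps; replace them with your real build, test and deploy commands.
RELEASE_STEPS: list[Step] = [
    [sys.executable, "-m", "unittest", "discover"],  # run the test suite
    [sys.executable, "deploy.py", "--target", "production"],  # hypothetical deploy script
]


def release(steps: Sequence[Step], run: Callable = subprocess.run) -> bool:
    """Run each step in order; stop and return False at the first failure."""
    for step in steps:
        print("running:", " ".join(step))
        if run(step).returncode != 0:
            print("step failed; stopping the release")
            return False
    return True
```

Calling `release(RELEASE_STEPS)` performs the whole release the same way every time. That sameness is what makes releasing boring, and boring is the goal.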

Once you have your releases automated and none of the above things are scary any more, you’re down to the boring, menial task of just chipping away at your technical debt. Identify the highest-priority item to fix; fix it; release it.

Can’t find the actual bug but have found another one nearby? Fix that bug. Then ship it. To production. And watch the logs. You’ll be surprised by how many times you discover that they’re related - or, at least, that fixing one bug makes the other one easier to find.

If you’re struggling to pick the highest-priority item to fix, here’s a tip: if it’s hard to choose, then the candidates are close enough to equal that it doesn’t matter which one you pick. It’s more important to pick one and start work.

It really is that simple, ladies and gentlemen :)