Legacy code is code written by other people

Published in Miscellaneous

Legacy code means unfamiliar code: it has much more to do with unfamiliarity than with the quality of the code itself.

From Adam Tornhill's talk Prioritizing Technical Debt as If Time & Money Matters (28:42), 19:53–22:14:

Legacy code is related to technical debt, but the two are not the same.

So what is legacy code?

There are many definitions out there. The definition I prefer is that:

  • Legacy code is code that somehow lacks in quality,
  • and more importantly, it's code that we didn't write ourselves.

Right.

And that second part is really important, so let me share a story with you. This is something that happened to me five years ago, and I've met it several times since then.

What happened here was that I was visiting our client. I did a workshop for one of their teams. We did a hotspot analysis of their code bases. And they had two code bases. So we analyzed them. We talked about them, discussed how we could address the findings.

And then, after a while,

  • someone in the team mentioned that "You know what? We actually have a third code base as well."
  • So I start to think "Okay, should we analyze that one too?"
  • And everyone kind of looked at each other and laughed a little bit like "Uh, no, we don't really have to do that."
  • So I said "Why not?"
  • "Because we know the third code base is a mess."
  • And I was like "Yeah, we really have to look at it," right?

So we did. And we looked at the hot spot[s], we looked at the code health metrics, cyclomatic complexity, all that stuff...

And guess what? Objectively there was absolutely no difference in code quality between the first two code bases and the third one.

And you know, when you bring up something like that, everyone starts to question it, that "Hey, maybe the tool is measuring the wrong thing; maybe you are measuring the wrong thing." Right?

So we had to spend a lot of time actually comparing code samples, and after a while everyone was fairly convinced – even though they didn't like it – that code base number three was indeed in no worse shape than the other two.

So why did everyone think that it was such a mess?

The reason of course was that code base number three turned out to be developed in a different part of the organization, in a team that has since been disbanded. And this team had simply inherited that code base, meaning they were now responsible for a piece of code that they didn't write themselves. They didn't understand the problem domain, and they didn't understand the solution.

So [legacy code] has much more to do with unfamiliarity than any properties of the code itself.

Makes sense! Low-quality code can be okay to work with if it's familiar enough. On the other hand, unfamiliar code can feel smelly and intimidating at first, even if it turns out to be of high quality.

When someone leaves a team or an organization, they leave behind legacy code (read: code that is unfamiliar to others), unless someone else is already familiar with all the current code written by the person who is leaving.
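To make that last point concrete, here is a minimal sketch in Python. It assumes that "has authored part of a file" is a usable stand-in for "is familiar with the file", which is a simplification of mine and not something from the talk. The function name `unfamiliar_files`, the file names, and the people are all hypothetical.

```python
from typing import Dict, Set


def unfamiliar_files(authors_by_file: Dict[str, Set[str]],
                     current_team: Set[str]) -> Set[str]:
    """Return the files that nobody on the current team has authored.

    Assumption: authorship is used as a rough proxy for familiarity.
    """
    return {
        path
        for path, authors in authors_by_file.items()
        if not (authors & current_team)  # no overlap: no familiar author left
    }


# Hypothetical data: who has written code in which file.
authors_by_file = {
    "billing.py": {"alice", "bob"},
    "reports.py": {"bob"},
    "auth.py": {"carol"},
}

before = unfamiliar_files(authors_by_file, {"alice", "bob", "carol"})
after = unfamiliar_files(authors_by_file, {"alice", "carol"})  # bob has left

print(sorted(before))  # []
print(sorted(after))   # ['reports.py']
```

Nothing about `reports.py` changed when bob left, yet by the definition above it became legacy code overnight. In a real repository, the author sets could probably be derived from version-control history, but that part is left out of this sketch.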

A snapshot from the video (21:15), with Adam Tornhill on the left and an illustration on the right labeled "The Technical Debt That Wasn't"