Every codebase has its dark corners where methods are too long and the logic is hard to follow. Let’s not look to place blame on the engineers who came before us, but instead assume they did the best they could with the knowledge they had at the time.
At Jana, we do believe in spending time to tackle technical debt. We try to be as practical about this as possible, refactoring code when we know it needs to be changed for a new feature. We have a great set of unit tests and integration tests that can help ease our anxiety and raise our confidence that we didn’t just break the entire system for our millions of monthly active users.
I plan on covering refactoring techniques in a series of follow-up blog posts, but first I wanted to take a stab at an abstract description of how I see medium to large refactoring efforts.
Before I Begin
As software engineers, we are notoriously bad at sizing and estimation. One valuable habit is to time-box your initial efforts. I might give myself a couple of hours to make some meaningful progress. I might throw away that effort if I now feel the task is bigger than the team can absorb at this time. Maybe I’ve made a bigger mess without getting a glimpse of where I want to go. It’s better to communicate this clearly than to get lost in a black hole.
The more complex the code, and the less familiar I am with it, the more overwhelming it can feel. Methods with several hundred lines of code stare back at me, challenging me to untangle them. Let’s assume that this graphic is the code. Each line is a line of code. Since I don’t understand them at this point, they’re all the same to me.
I spend some time reading the code, giving myself a general high-level overview of all the functionality included. It serves as a starting point, but for now my understanding of the code is just a lot of seemingly loosely related logic in one long method. The colors loosely represent how the lines of code relate to each other. In the graphic below, they don’t seem to be related to each other at all. At the beginning, they all blur together. There’s no rhythm to it, no obvious patterns.
My next step is to start small, with my automatically running unit tests guiding my efforts. I start with a handful of techniques to make the code more readable. My favorites include early returns, extracting small methods, and removing dead features (I will cover these in a future blog post). It can be slow going, without a lot of obvious gains. Good thing for time boxes.
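To make two of those techniques concrete, here is a minimal Python sketch of early returns combined with one small extracted method. Everything in it is hypothetical and invented for illustration: the order dictionary, the function names, and the shipping rules are assumptions, not code from any real system.

```python
FREE_SHIPPING_THRESHOLD = 50  # hypothetical business rule
FLAT_RATE = 5                 # hypothetical flat shipping fee


def shipping_cost_before(order):
    # Original shape: the interesting rule is buried three levels deep.
    cost = 0
    if order is not None:
        if order["items"]:
            if order["total"] < 50:
                cost = 5
            else:
                cost = 0
    return cost


def qualifies_for_free_shipping(order):
    # Small extracted method: gives the comparison a name.
    return order["total"] >= FREE_SHIPPING_THRESHOLD


def shipping_cost_after(order):
    # Early returns: each special case exits immediately,
    # so the main rule sits at the bottom with no nesting.
    if order is None:
        return 0
    if not order["items"]:
        return 0
    if qualifies_for_free_shipping(order):
        return 0
    return FLAT_RATE
```

Both versions return the same values for the same input, which is exactly what the unit tests should confirm after each small step.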
But as I repeat the process, revisiting the refactored code, my understanding deepens, and some patterns slowly start to appear. Removing dead code makes the method simpler to understand. I don’t consider this cheating. I absolutely believe removing dead code is a key component of refactoring.
With the dead code removed, and small refactors, lines of code that seem unrelated suddenly have a strong relation to other lines, and I can start to rearrange the lines to group them together logically.
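A small sketch of what that regrouping might look like, again in Python and again with made-up names (`build_summary_before`, `legacy_flag`, and the data shapes are all assumptions for illustration). In the first version, lines about the greeting and lines about the order totals are interleaved, and a flag that is never set to true hides among them. In the second version, the dead branch is gone and the related lines sit next to each other.

```python
def build_summary_before(user, orders):
    name = user["first"].strip()
    total = 0
    legacy_flag = False  # dead: never set to True anywhere
    name = name.title()
    for order in orders:
        total += order["amount"]
    greeting = "Hello, " + name
    average = total / len(orders) if orders else 0
    if legacy_flag:
        greeting = greeting.upper()
    return {"greeting": greeting, "total": total, "average": average}


def build_summary_after(user, orders):
    # Group 1: everything about the greeting.
    name = user["first"].strip().title()
    greeting = "Hello, " + name

    # Group 2: everything about the order statistics.
    total = sum(order["amount"] for order in orders)
    average = total / len(orders) if orders else 0

    return {"greeting": greeting, "total": total, "average": average}
```

Once the lines are grouped like this, each group becomes an obvious candidate for its own method.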
The Home Stretch
Repeating the same steps, I continue to rearrange the logic, remove dead lines, and extract large blocks of functionality to other methods.
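As a rough picture of where this can end up, here is a hypothetical Python example of a method after its larger blocks have been extracted. The signup scenario, the helper names, and the validation rules are assumptions chosen only to keep the sketch short. The point is the shape: the top-level function reads like an outline, and each helper holds one block of the original logic.

```python
def normalize_email(form):
    # Extracted block: input cleanup.
    return form.get("email", "").strip().lower()


def validate(email, password):
    # Extracted block: all validation rules in one place.
    errors = []
    if "@" not in email:
        errors.append("invalid email")
    if len(password) < 8:
        errors.append("password too short")
    return errors


def build_record(email, plan):
    # Extracted block: building the record to store.
    return {"email": email, "plan": plan or "free"}


def process_signup(form):
    # What is left of the long method: a short outline of the steps.
    email = normalize_email(form)
    errors = validate(email, form.get("password", ""))
    if errors:
        return {"ok": False, "errors": errors}
    return {"ok": True, "record": build_record(email, form.get("plan"))}
```

For example, `process_signup({"email": " A@B.com ", "password": "12345678"})` returns a successful result with the email normalized to `a@b.com` and the plan defaulted to `free`.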
Around this time, it feels like I’ve reached a tipping point. My small brain can now understand all the heavy lifting this method was doing.
By repeating a few simple techniques, I try to improve the code with each pass, until the logic is grouped together in a way that makes sense for how I understand the codebase, the current business, and where we anticipate the business going.
Thanks for reading my abstract description of refactoring. Let me know in the comments if this resonates with you or gives you hope that your codebase can be improved without declaring bankruptcy. I know this post was abstract, but subscribe to our technology blog RSS feed to get notified when I follow up with detailed descriptions of the refactoring techniques I use most often.