Politics, Power, and Science: Refactor or Fail!

There are many great development styles. For the majority of shops where I have worked, they are merely theoretical. If you work in one of the shops that actually maintains its code base, then you are very lucky indeed. Most shops use the "House of Crap" pattern for their software development style.

What is the "House of Crap" development style?

Imagine a house where you have tools, parts, dirty dishes, dirty clothes, clean dishes, clean clothes, half-eaten food, stacks of unsorted magazines, books and mail, and unsorted piles of random stuff stacked in every room. If you need electricity somewhere, you just run a wire from the mains to the place you need the power. If you need water somewhere, just run a pipe. Why waste time with any planning or cleanup? We need the power and water today. The wires and pipes are run right over the top of the stacks of boxes and parts and tools. If an outlet or spigot is covered up by a junk pile, don't bother taking the time to dig it out; just run a new wire or pipe from the nearest other source.

Nobody ever cleans or sorts anything, or takes out the trash, because there is just no time. They are way too busy keeping the existing stacks and piles from falling over, and running new wires and pipes, to do anything else. You obviously can't move the piles of stuff, because then you would have to rerun all the pipes and wires. Way too much work; best to just throw some more stuff on top of the piles and string up a few more wires and pipes.

Eventually the developers get tired of living in the "house of crap" development style (sty?) they are "being forced" to live with. They clamor to burn the old house down and build a brand new house that won't have any of the problems of the existing house.

What they don't recognize is that the problems they are having with the existing code base are a symptom of their own poor development habits. The house being messed up is not the house's fault; it is the fault of the people living in that house. If nobody will straighten anything up or pick up a broom, then the house is going to get filthy in no time. If nobody does any planning or work to organize the rooms, or to run power and water to the proper places, then the house will be a horrible mess.

The urge to rewrite is very common. All too often the existing programmers are all working on code they inherited from others. This code has grown organically over the years and has developed many warts. There are many bugs, and bizarre features whose purpose nobody remembers. The natural inclination is to throw out the existing code and start over with a clean slate, to develop an even better system.

Resist this urge at any cost. There is never any need to ever rewrite an entire code base from scratch. Never. Not Once. Really. Why would I lie to you?

Many of the features of an application will have been developed through painstaking trial and error over time. Just because you don't know why a certain line of code is in a file doesn't mean that line isn't important. It may be there because it fixes a bug that caused data loss 7 years ago. Throwing away all this code is like throwing away decades of programming experience. A code base is the institutional memory for your company or project.

Additionally, what is known as the second system effect kicks in when you rewrite an existing system from scratch. This effect can make the second system perform much worse than the current code base, and be much buggier. Too often a second system is an opportunity to throw in every possible feature under the sun. An attempt to please all people all the time usually fails.

Over the years I have observed projects both open source and commercial attempt to perform a complete code rewrite. Best case is that the project is delayed by a few years. Worst case is a complete failure.

Examples of projects that have stalled out for years or failed entirely because of a complete rewrite: Vista, Mozilla, dBase IV, and several video games that have been in a vaporware release cycle for a decade now.

And you know, let us face reality. If your coding practices were horrible enough to get a code base into a completely unmanageable state in the first place, why do you think you won't make the same mistakes again in your next code base? Better to just face up to your failings now and fix the existing code base, because without good re-factoring techniques the system you write will quickly degrade to the same point as the existing code.

So, the existing code base is too horrible to let suffer anymore and I am suggesting never rewriting a project from scratch, so what is it that you should do?

The good news is that it is not too difficult to fix your existing code base. It just takes time and effort. Any code base needs constant, in-place re-factoring. Most people feel that they don't have time to maintain their code and make it better because they feel too busy to do anything. They also hold out hope for the new system, so they don't do anything in their existing system because that would just waste time.

1. Look for patterns in the code that repeat.

When you are cleaning up a house you would start by at least putting similar categories of items in a single box.

Do you have areas of the code that are cut-and-paste variations of the same code? If code exists in more than one place you should really think about making that into a function call. Typically bug fixes or changes will have been applied to the different copies in different ways. Each copy will also tend to diffuse out into the surrounding code in its section. Abstract out that code into a function.

An example of this is a set of procedures in a client-server message router for a hospital interface system that I worked on. There were about 20 places in the code where the parent was talking to a child process, or the child process was talking to the parent. There were strange things happening and nobody knew why it was so inconsistent. Every area that did the communication had been uniquely changed and fuzzed into the surrounding support code a few lines, until it was difficult to even tell where the parent and child communication was. It wasn't even clear from the names used that the functions were talking to the child or parent.

I standardized this code into just a few functions, replacing hundreds of lines of code with a few dozen lines of code all accessed with very clearly named functions.
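As a rough illustration of that kind of cleanup (not the actual hospital router code), here is a minimal C sketch of what the extracted functions could look like. The names `send_to_peer` and `recv_from_peer`, and the length-prefixed framing, are assumptions made up for this example. The point is only that every call site shrinks to one clearly named call instead of its own hand-rolled read/write loop.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes. Returns 0 or -1.
 * (Sketch only: an interrupted write is simply treated as an error.) */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes, retrying on short reads. Returns 0 or -1. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical name: send one length-prefixed message to the other process. */
int send_to_peer(int fd, const char *msg)
{
    uint32_t len;
    assert(msg != NULL);
    len = (uint32_t)strlen(msg);
    if (write_all(fd, &len, sizeof len) != 0)
        return -1;
    return write_all(fd, msg, len);
}

/* Hypothetical name: receive one message into buf, NUL-terminated.
 * Returns the message length, or -1 on error or if buf is too small
 * (in which case the caller should treat the connection as broken). */
int recv_from_peer(int fd, char *buf, size_t bufsize)
{
    uint32_t len;
    assert(buf != NULL);
    if (read_all(fd, &len, sizeof len) != 0)
        return -1;
    if (len >= bufsize)
        return -1;
    if (read_all(fd, buf, len) != 0)
        return -1;
    buf[len] = '\0';
    return (int)len;
}
```

With something like this in place, a bug in the framing or the retry logic gets fixed once, not in 20 slightly different copies.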

2. The next step is to group related functions into a single code module.

Once all the similar items are grouped into boxes, then you would want to put each category of boxes in the proper area of the house. Tools and parts go into the garage. Dishes go to the kitchen. Clothes go to the laundry room, or to the dressers, if they are clean.

In the same way as you would organize a house, you should organize similar functions together into separate "rooms" or code modules.

The child/parent inter-process communication I was just talking about was all similar. I created a separate code module for the functions, with a way to create a new handle for the communication, and child and parent functions to send messages to each other and to check for existing messages from each other.

This code module was totally opaque outside itself. Because this module was completely isolated from the rest of the code base it would have been trivial to switch this module out for a totally different method.
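Here is a minimal sketch of how such a module might be laid out in C. It is not the original code: the `ipc_` names, the two-pipe transport and the 255-byte message limit are all assumptions chosen to keep the example short. The top half is what a hypothetical `ipc.h` would expose; the bottom half is what would stay private in `ipc.c`.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* ---- Public interface: what a hypothetical ipc.h would expose ---- */
typedef struct ipc_channel ipc_channel;   /* opaque handle */

ipc_channel *ipc_create(void);            /* call before fork() */
void ipc_destroy(ipc_channel *ch);
int ipc_parent_send(ipc_channel *ch, const char *msg);
int ipc_child_send(ipc_channel *ch, const char *msg);
int ipc_parent_has_message(ipc_channel *ch);
int ipc_child_has_message(ipc_channel *ch);
int ipc_parent_recv(ipc_channel *ch, char *buf, size_t bufsize);
int ipc_child_recv(ipc_channel *ch, char *buf, size_t bufsize);

/* ---- Private implementation: what would live in ipc.c ---- */
struct ipc_channel {
    int to_child[2];    /* parent writes [1], child reads [0] */
    int to_parent[2];   /* child writes [1], parent reads [0] */
};

ipc_channel *ipc_create(void)
{
    ipc_channel *ch = malloc(sizeof *ch);
    if (ch == NULL)
        return NULL;
    if (pipe(ch->to_child) != 0) {
        free(ch);
        return NULL;
    }
    if (pipe(ch->to_parent) != 0) {
        close(ch->to_child[0]);
        close(ch->to_child[1]);
        free(ch);
        return NULL;
    }
    return ch;
}

void ipc_destroy(ipc_channel *ch)
{
    if (ch == NULL)
        return;
    close(ch->to_child[0]);
    close(ch->to_child[1]);
    close(ch->to_parent[0]);
    close(ch->to_parent[1]);
    free(ch);
}

/* One frame = 1 length byte + up to 255 payload bytes, sent in a single
 * write so that a small frame arrives in the pipe as one piece. */
static int send_msg(int fd, const char *msg)
{
    unsigned char frame[256];
    size_t len;
    assert(msg != NULL);
    len = strlen(msg);
    if (len > 255)
        return -1;
    frame[0] = (unsigned char)len;
    memcpy(frame + 1, msg, len);
    return write(fd, frame, len + 1) == (ssize_t)(len + 1) ? 0 : -1;
}

/* buf must hold at least 256 bytes. Returns payload length or -1. */
static int recv_msg(int fd, char *buf, size_t bufsize)
{
    unsigned char hdr;
    if (bufsize < 256)
        return -1;
    if (read(fd, &hdr, 1) != 1)
        return -1;
    if (hdr > 0 && read(fd, buf, hdr) != (ssize_t)hdr)
        return -1;
    buf[hdr] = '\0';
    return hdr;
}

/* Non-blocking check: is there something to read on fd right now? */
static int has_msg(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    return poll(&p, 1, 0) > 0 && (p.revents & POLLIN) != 0;
}

int ipc_parent_send(ipc_channel *ch, const char *msg) { return send_msg(ch->to_child[1], msg); }
int ipc_child_send(ipc_channel *ch, const char *msg) { return send_msg(ch->to_parent[1], msg); }
int ipc_parent_has_message(ipc_channel *ch) { return has_msg(ch->to_parent[0]); }
int ipc_child_has_message(ipc_channel *ch) { return has_msg(ch->to_child[0]); }
int ipc_parent_recv(ipc_channel *ch, char *buf, size_t n) { return recv_msg(ch->to_parent[0], buf, n); }
int ipc_child_recv(ipc_channel *ch, char *buf, size_t n) { return recv_msg(ch->to_child[0], buf, n); }
```

Because callers only ever hold an `ipc_channel *` and never see the pipes, the private half could be swapped for sockets or shared memory without touching a single call site. Closing the unused pipe ends after `fork()` is left out of the sketch.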

3. Make interfaces opaque.

Now keep the tools in the garage and the dishes in the kitchen. There is a tendency when you are working on projects to let things get messy again. This is OK, as long as part of your plan is to clean everything up and put it in its proper place again before you go to bed.

Making your code modules opaque to each other is a very effective method to keep things from getting messy and all the different parts getting intermixed again.

I cannot stress this enough. Do not let any private data leak out of a code module. Don't give raw access to any data that is inside a code module. Always use accessor functions to give access to data fields. Programmers use what they can access so best to just not give access to data structures that can change later.

Sure, it makes it tougher to debug a program when you can only see what data is behind a pointer from inside that module's code. If you think about it though, this makes perfect sense. If you are outside the scope of a code module, then it doesn't matter what data is in that code module.

Clear, opaque interfaces make it obvious exactly where that code lives and make it possible to work with the data only through the module's interface. They also keep the module's .h file stable, so the rest of the code base does not have to be recompiled with every minor change inside the module.

It also makes it impossible for a recompile of just that code module to cause strange interactions when you change a data structure in the module. This is a bizarre bug to have bite you, when your data structures are out of date between areas of the code. You can easily lose a day to this until you figure out that you just needed to do a clean recompile of everything.

I fixed a problem with a string module whose data structure was being accessed directly from hundreds of places in the code base. This required adding a dozen functions to the string module to do the operations that had previously been done in raw code at each call site. Without the interface, the code that was directly accessing the string data structure in all those places couldn't be tested to ensure that it was functioning correctly. It also meant that a single bug fix had to be applied in as many as 50 places.
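To make the idea concrete, here is a small sketch of an opaque string type in C. It is not the string module from that project; the `str_` names and the growth policy are assumptions for illustration. Callers would see only the `typedef` and the function prototypes in a header, while the `struct` definition stays in the module's .c file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* In the header: callers get an opaque type and accessor functions only. */
typedef struct str str;

/* In the .c file: the layout can change without recompiling any caller. */
struct str {
    char  *data;   /* always NUL-terminated */
    size_t len;    /* length without the terminator */
    size_t cap;    /* allocated bytes */
};

str *str_new(const char *init)
{
    str *s;
    size_t len;
    assert(init != NULL);
    len = strlen(init);
    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->cap = len + 1;
    s->data = malloc(s->cap);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->data, init, len + 1);
    s->len = len;
    return s;
}

void str_free(str *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}

/* Accessors: the only way outside code learns anything about the string. */
size_t str_length(const str *s) { return s->len; }
const char *str_cstr(const str *s) { return s->data; }

/* Append more to s. Returns 0 on success, -1 if out of memory.
 * more must not point into s's own buffer. */
int str_append(str *s, const char *more)
{
    size_t add, need;
    assert(s != NULL && more != NULL);
    add = strlen(more);
    need = s->len + add + 1;
    if (need > s->cap) {
        size_t cap = s->cap * 2;
        char *p;
        if (cap < need)
            cap = need;
        p = realloc(s->data, cap);
        if (p == NULL)
            return -1;
        s->data = p;
        s->cap = cap;
    }
    memcpy(s->data + s->len, more, add + 1);
    s->len += add;
    return 0;
}
```

When someone needs a new string operation, it gets added here as another function with its own test, instead of being written as raw pointer fiddling at the call site.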

4. As you build the interfaces, build test harnesses for each interface.

Wait, I thought this was on re-factoring, not boring testing. Yes, I know. But it is essential to test what you have so that you can change it later and have assurances that the code is correct. Having untested code in your program makes the program very fragile, just waiting to come crashing down like a house of cards on some data that hits a corner case you didn't test for.

Use tools like gcov to ensure that every code path in the module is walked by the unit testing.

A place where I used this method was in a memory management module that I put into our code base at a startup I worked for. Because this module was so general purpose and it was going to go into so many places in the code base, I wanted to ensure that there were no issues. So I wrote a test harness that walked every code path and threw pure garbage into the functions, making sure that wherever possible a sensible action would occur. I made sure that any exceptions that the library itself couldn't handle were bubbled back to the caller so they could take proper action on their side. Then the harness did that a million times in a row while I watched for memory leaks and profiled the performance for any bottlenecks I could find. I fixed a dozen strange areas and added a dozen more tests to make sure that every code path was exercised for every condition.
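The sketch below shows the general shape of such a harness in C. It is a toy, not the module from that startup: `mem_alloc`, `mem_free`, `mem_live_blocks`, `run_mem_harness` and the `MEM_MAX_ALLOC` limit are all hypothetical names invented for the example. The harness feeds the allocator boundary and garbage sizes over and over, checks that bad requests are refused, and checks that nothing is left allocated at the end.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MEM_MAX_ALLOC ((size_t)64 * 1024)   /* assumed per-request limit */

static long live_blocks = 0;   /* blocks handed out and not yet freed */

/* Hypothetical wrapper: refuses nonsense sizes instead of passing them on. */
void *mem_alloc(size_t size)
{
    void *p;
    if (size == 0 || size > MEM_MAX_ALLOC)
        return NULL;
    p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Freeing NULL is a harmless no-op, as with free(). */
void mem_free(void *p)
{
    if (p == NULL)
        return;
    free(p);
    live_blocks--;
}

long mem_live_blocks(void) { return live_blocks; }

/* Run the garbage-input loop the given number of times.
 * Returns the number of checks that failed (0 means all passed). */
int run_mem_harness(unsigned rounds)
{
    static const size_t sizes[] = {
        0, 1, 7, 4096, MEM_MAX_ALLOC, MEM_MAX_ALLOC + 1, SIZE_MAX
    };
    const size_t nsizes = sizeof sizes / sizeof sizes[0];
    int failures = 0;
    unsigned r;
    size_t i;

    for (r = 0; r < rounds; r++) {
        for (i = 0; i < nsizes; i++) {
            size_t sz = sizes[i];
            int must_fail = (sz == 0 || sz > MEM_MAX_ALLOC);
            void *p = mem_alloc(sz);

            if (must_fail && p != NULL)
                failures++;          /* accepted a garbage request */
            if (p != NULL) {
                memset(p, 0xAB, sz); /* touch every byte we were given */
                mem_free(p);
            }
            mem_free(NULL);          /* must be harmless */
        }
    }
    if (mem_live_blocks() != 0)
        failures++;                  /* something leaked */
    return failures;
}
```

Call `run_mem_harness` from a small test driver. Building that driver with GCC's `--coverage` option and then running gcov on the source file shows which lines the harness never reached, which tells you what test to write next.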

A note on exception handling: a great programmer, one of the authors of the IEEE floating point math standard, once said that a programming error is an exception that isn't properly handled.

5. Test again at higher levels.

Integration testing at each level is crucial to making a program that is smooth and polished and a joy to work with.

Automate this testing as much as possible. An automated nightly build would be great and would give QA new builds to look at whenever they want, so they wouldn't bug the developers so much. Even if there is no QA in your environment, getting an error email within a day when a check-in breaks the build would be really handy.

6. Re-factor as needed to add new functionality.

If you find that you are getting sawdust in the engines because you are doing woodwork and engine work in the same shop, you might consider building a wood shop next to your garage.

Figure out what features you need to add over the next few iterations of your program. Re-factor as much as you need to in order to add the requested features. This requires having a goal and some planning. This level of organization can be difficult.

It makes it easier to justify the time required for re-factoring your program if you can justify it in terms of feature adds. In order to accomplish X, we must re-factor Y, which will take 1 week.

Only re-factor as needed or during some down time.

7. Don't re-factor an interface and change it at the same time.

Just as you wouldn't build a wood shop while doing woodworking in that same shop, don't think you can do the equivalent in a programming environment. You don't hammer a board onto the wall with your left hand while you are running the table lathe with your right. You focus on each project in turn so that you can give each your best effort.

We humans have limits. We can only juggle so many balls at the same time. If you are re-factoring code, just re-factor it and keep the exact same functionality. Once you are sure the re-factor is successful and everything works as is, _then_ add in the new functionality. Doing it in two steps keeps it clear exactly which part is causing an error so you don't waste hours trying to figure out what caused a brand new bug.
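One cheap way to check that a re-factor really kept "the exact same functionality" is to leave the old function in place for a while and compare it against the new one over a wide range of inputs. The C sketch below uses a made-up example (`days_in_month_old` and `days_in_month_new` are hypothetical, not from any real code base). Note that the new version deliberately keeps the old version's odd behavior for out-of-range months; fixing that would be a separate, later change.

```c
#include <assert.h>

/* Legacy version: correct, but a maze of nested ifs. */
int days_in_month_old(int month, int year)
{
    if (month == 2) {
        if (year % 4 == 0) {
            if (year % 100 == 0) {
                if (year % 400 == 0)
                    return 29;
                return 28;
            }
            return 29;
        }
        return 28;
    }
    if (month == 4 || month == 6 || month == 9 || month == 11)
        return 30;
    return 31;   /* also what it returns for month 0, 13, -5, ... */
}

static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Re-factored version: same results, including the out-of-range quirk. */
int days_in_month_new(int month, int year)
{
    static const int days[12] = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };
    if (month < 1 || month > 12)
        return 31;   /* preserved on purpose: re-factor first, fix later */
    if (month == 2 && is_leap_year(year))
        return 29;
    return days[month - 1];
}

/* Step 1 of 2: prove the re-factor changed nothing. Aborts on the first
 * input where old and new disagree. */
void check_refactor_equivalence(void)
{
    int year, month;
    for (year = 1800; year <= 2400; year++)
        for (month = -2; month <= 15; month++)
            assert(days_in_month_old(month, year) ==
                   days_in_month_new(month, year));
}
```

Only once `check_refactor_equivalence` passes would you delete the old function and then, as a separate step, change the behavior (for example by rejecting invalid months).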

You don't have to completely re-factor a code base all at once. Every day every programmer on your team can just create a single function from several similar chunks of code that were scattered about the code base. Every week or two these similar functions could be pulled out into a code module and an interface created and tested.

Just like sweeping a floor and picking up a little bit one day and doing the dishes or a load of laundry another day can clean up a house, doing a little bit of housekeeping on your code every day can clean up your code base in a very short time.

8. K.I.S.S.

Keep your code as simple and as straightforward as it can possibly be and still work, but no simpler than that.

Reading code is harder than writing code. If you write code that is so complicated that it is very difficult for you to write, then nobody who isn't a lot brighter than you can read and understand it without a lot of difficult, time-consuming study, even to make a minor change. Writing complicated code is a sure sign that you do not understand what the code is doing. If you understand the problem, then your code will be clean and simple.

If you cannot solve a simple general example problem or use case yourself with pencil and paper, then you cannot instruct a computer how to do it without a lot of trial and error. Trying to solve a problem in code without first figuring out how you would solve it without a computer is a guarantee of an overly complex solution that solves the problem in a roundabout way, like a Rube Goldberg machine.

9. Rewrite bad code sections to be good.

Rewrite sections of the code that are particularly buggy or difficult to maintain to be simple and clear. I think that the masters of programming said it best:

"Code should be clear and simple - straightforward logic, natural expression, conventional language use, meaningful names, neat formatting, helpful comments - and it should avoid clever tricks and unusual constructions. Consistency is important because others will find it easier to read your code, and you theirs, if you all stick to the same style." Kernighan and Pike, The Practice of Programming.

Every month have each programmer on the team rewrite one bad section of code. If a variable has a poor name, then change the name.

10. Delete old code and old comments from your project.

Trust in your revision control system to track the old sections of code. Make the code do only what it says it does. Having large sections of code knocked out with ifdefs or comments, or having deprecated code modules still in your code base, is very confusing and can lead to bad errors if the old code is accidentally switched on by a stray edit of a comment or an inadvertent define change.

Trust in your revision control system to tell you who made changes to the system and why those changes are being made. Train your developers so they can become comfortable with how to use the revision control system features.

Get rid of the revision control log section that gets pasted into the top of your source files. Nothing hides code better than 100 log entries at the beginning that everyone always scrolls past without ever looking at them.

Don't let people own sections of the code. There is no need to put names or initials on code modules or on bug fixes.

11. Clean up all your compiler warnings.

Most of them are easy to eliminate. The reason you want to get rid of the warnings is that they are hiding a few bad things right now. Seriously. At least 1 in 100 of the warnings that you are ignoring right now has very bad consequences. More than likely there are code paths that hit the end of a function without returning a value. You have borked something up and are just working around it.

The big reason to get rid of the warnings is that when you make a change and recompile you don't want new warnings that could be serious to be hidden by a huge scroll of warnings.

Don't worry about the last one or two if it is too much effort to get rid of every warning.
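As a small illustration of the "hit the end of a function without returning" case, here is a C sketch with a hypothetical `find_index` function. In the "before" version the final `return -1;` was missing, so a failed search fell off the end of the function and the caller got whatever happened to be lying around.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the index of key in arr, or -1 if it is not there.
 *
 * Before the cleanup, the function ended right after the loop. The compiler
 * warned that control reaches the end of a non-void function, but the
 * warning was lost in the scroll. Searches that found the key worked fine;
 * searches that did not find it returned garbage.
 */
int find_index(const int *arr, size_t n, int key)
{
    size_t i;
    assert(arr != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (arr[i] == key)
            return (int)i;
    }
    return -1;   /* the line the warning was asking for */
}
```

With GCC, `-Wall` includes the warning for this case (`-Wreturn-type`), and adding `-Werror` once the build is clean keeps new warnings from creeping back in.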

12. Code reviews. Project Presentations.

Teach others about your code. Learn from others about their code. Make the reviews fun and educational for everyone. If you are having problems in a section of code, then pair program for the tough sections. This is where programmers get to socialize and show off their mad skillz with their peers.

Part of the reason that the code base is so messy is that nobody has made any effort to learn more about the code base than they have to in order to accomplish their tasks and projects, and nobody has made any effort to educate anyone else about the code. I don't know about anyone else, but I love reading and talking about code. I love learning new techniques and learning about how others implement software.

And you don't have to wait until the end of a project to get feedback from your team members. You can give a presentation to the team about how you are planning to implement a feature or complete a project and get feedback on the design, maybe get some pointers on how to implement the project better or faster. It's better to take a couple of extra days planning than to waste a lot of time trying to implement an incomplete specification.

I find that presenting code and project plans to people helps me clear up problems and leads me to a greater understanding of the issues myself. Nothing makes programming and projects more clear than serializing it into a stream of language.

13. Lead by example.

Don't force these strategies onto your team. Try just doing a few things yourself and showing people in code reviews how much better things are. Clean up the code in the section where you are working that day. Communicate with people and get their buy-in ahead of time for changes you would like to make. Communication is very good. Explain to them some variable or function name you would like to change, or some section of code that you would like to change into a function. Just do it a little bit at a time, slow and steady. Teach them what you did afterwards by showing them what the code looked like before and what it looked like afterwards.

It may take a few months or a year, but most people just need a little education to learn how to maintain a code base. A few others will hop on board and help you out; they were just waiting for someone to lead them.

At no point have I said that these changes are going to be easy. It took years of neglect to get the code base into the shape it is in, and it will probably take a few years to get it back into shape. This is going to take effort from everyone.

There will be resistance from team members. They may hate all or part of any change.

1. Some people think they own sections of code and resist any changes to that code ever.

You may have to sweet talk some prima donnas. Don't get upset or angry over things like bracing style or white space. Life is too short. Worst case, you just work around the section of code that those people own and leave them in peace.

Ask those people about their code. Ask for code walkthroughs for team members. Compliment them on interesting bits, and give them input on how they can improve some areas. The reason they are gripping that code so tightly is that they think only they care about it and want it to be right. If you can reassure them that others also care, then they may relax their grip on the code.

Or they could be insane, in which case just run away.

2. Some people hate opaque interfaces.

They want access to the internals of data structures from any place in the program. This may be for debugging purposes, or they may want direct access to a string data type so they can create a data operation on it.

Talk with them about why opaque interfaces are a good idea. Show them how it moves all the code into a single module that can be tested. Give them real life examples of problems it will solve in your environment. Have them recall a half dozen past issues that opaque interfaces would have prevented. Show them how to put breakpoints into the code module to catch changes to the data. Show them how to extend the string object so that it adds the functionality they need, as well as adding the tests to the testing framework to test the new functionality, instead of giving them access to the data structure.

3. Managers will insist that there is no time to maintain the code.

To overcome this objection you have to show them that over time this will actually speed up development. You can talk about specific problems you are having in the code base and demonstrate how doing these simple methods will fix those issues on an ongoing basis.

You can also show how not rewriting the code base from scratch can save years of development effort.

If you are the manager, then you already know that you have to take time to clean the code base. Roll the maintenance effort into each project you are expected to finish. What you are going to find is that adding in a time estimate for each project to re-factor the existing code is actually going to give you more accurate time estimates for the projects.

As the code actually starts looking good again, development will speed up because everyone now understands the existing code better.

Think how much your bosses are going to love you for saving the existing code base so that they don't have to pay for a whole new project and extra team members so you can do both what you are doing now and the replacement project.

4. Some people won't like any changes.

Change is bad. This is a basic truth. You must write up clear instructions on the new procedures around the code base. What is the command in your revision control system to see all the log entries over time? What is the command to look at who was the last person to change a line? You must train people in the new way of doing things so they don't feel lost and afraid. Setting up a wiki or a TWiki really helps.

If you are the manager you may need to insist on the new way of doing things. But try to persuade and educate people first. Get their input on things and ask them why things are being done the way they are now. Actually listen to what they are saying and take it all into account. If there are good reasons to keep doing things the way they are, then keep them the way they are. If you need to implement some of these changes, then justify it with logical reasons as to why you are changing the old methods to new methods. Then make sure everyone is properly trained on the new way of doing things. Change is very scary, and walking people through the new ways is very important to giving them a good feeling about work.

Once your code base is clean again, much more easily maintained and extended, and you are practiced with these development techniques, the way to keep your code base clean is to continue re-factoring and cleaning with each new bug fix or each new feature add. The good news is that this doesn't take anywhere near the effort it took to clean up your existing code base. You may even look forward to a re-factoring effort to add in a major new feature.

Thursday, April 19, 2012