Sunday, March 13, 2005

Stop the rot: Can't stop the rot

Entropy, software rot, broken windows; call the effect what you will, but I'm convinced that the natural state of any piece of software is chaos, and every project will tend towards it. Even with solid design guidelines and vigilance, every piece of software will eventually fall foul of entropy and become an unworkable, unenhanceable, unmaintainable mess.

So we're all doomed then?

Well, no. We just have to be pragmatic about it. By applying good practices we can push that date back, and all we need to do is make sure that date is far enough in the future to ensure that we get enough good value out of the software before we need to throw it away and work on the next project.

Because of the way XP works, it's ideally suited to delaying that date for as long as possible, whilst also being on the brink of making sure that date slams into your project early and without warning.

We develop only for today's problems and worrying about tomorrow's when the next day starts; we guarantee that we'll always need to make changes to our code.

We develop tests for our code before we write the code itself, and we maintain those tests so that they're as exhaustive as possible; we have the confidence to change the system as needs arise.

We release early and often and encourage our customers to change direction when we're going the wrong way, throw away things they don't need and enhance things they do; we change what the system needs to do.

These things coupled can mean that our code is changing more rapidly than in most 'traditional' projects. The core of our system may never be complete, classes appear and disappear, whole areas of the system morph. With people confidently applying fundamental changes, their differing ideas of how things should work can battle for attention.

Therefore, the potential speed of entropy within the system is high, which is one of the reasons why we pair program.

Pair programming is intended to ensure that good practice gets applied whenever code is designed, tested or written. By having that sanity checker on hand at all times we hope to ensure that the design decisions are good. We have someone to tell us when we're going the wrong way, to look at the bigger picture whilst we focus on the nitty gritty of knocking the code out. They get involved in the detail, pointing out when we miss a test case and even telling us that our indentation pattern is wrong. Our pair is there to make sure the quality doesn't slip, and to make sure that when we cut corners, we're cut short.

But pairing isn't enough. When two people pair together often, they get familiar with each other's idiosyncrasies. Developers are (mostly) people after all, and whilst one gets tired of telling the other the same thing every day, the other gets tired of always getting it in the neck. Not only that, but you have the same problem the other way round.
As people pair together more and more often each gets less and less likely to police their partner's work.

Also, when people pair together all the time they're more likely to stick to the areas of code they already know to the exclusion of the rest of the team. That means that the knowledge of that part gets stuck in that pair and never moves outside it. Also they don't collect the same level of knowledge in other areas of the system. You end up with experts in particular areas that get less and less inclined to move away from that area.

So, if you don't cycle your pairs, and these two things start to happen you find that pockets of the system will start to deteriorate. If constant pairs are always working on the same areas then the quality of work in those areas will start to drop, maybe even without other team members being aware of the fact.

By cycling the pairs we hope to reduce the chances of this happening. Whenever a person pairs with someone they haven't paired with for a while they 'perk up'. They push the quality up as they spot things that the 'normal' partner doesn't. Even if they're pointing out the same problems that the other pair spots, it feels less like 'badgering' as it's someone new doing it. They also point out any bad habits that the normal pair has decided they can get away with.
In addition, if pairs cycle it isn't possible for the same pair to work on the same area all the time, so the knowledge has to spread through the team. As each new person looks at a section of the system there is a certain amount of explaining, or education that's needed. The design is vocalised, diagrammed, made clear. By doing this, flaws in the design are more likely to get spotted. If the flaws are spotted then they're infinitely more likely to get fixed. The rot is less likely to take hold.

However, whether you cycle the pairs or not, it must be recognised that in all cases it is every team member's personal responsibility to ensure that rotten code is cut out before it infects the rest of the system. By cycling the pairs we just make it easier to both spot the rot, and remove it.

Sometimes the changes that are needed can be too fundamental to make 'on the fly'. Maybe the fatal design flaw didn't seem so bad last month and so it wasn't fixed when it was easy to do so. Sometimes it's not that there's a single large change that's needed, but that there's too many little ones. It's at these times the whole team needs to take a break, look at what's gone wrong and work out what needs to be done.

This may be a test-audit, code-audit, a large scale change of architecture, or some other set of tasks.
Whatever it is, if you're going to take that break its purpose needs to be clear.

As a team you need to get together, decide what needs to be done, produce task cards, estimate them, prioritise them. Work on them like any other piece of work. You need to set the boundaries in the same way as you would for the customer. The work should be tracked in the same way as usual and, importantly, the customer needs to agree to the work and understand what's being done and why it's needed. This means that you are publicly responsible for the work, and that means that it's more likely to be completed.
When you ask the customer for time to 'tidy up' and there's nothing formal on the plan, then that time is never well spent. If you're responsible for showing that things are tidy, then you make sure that you put things in their place.

So, you want to stop the rot?
  • Accept the system's going to become rotten, just make sure it happens far in the future.
  • Cycle the pairs, so you don't get too comfortable.
  • If things start getting too bad to fix on the fly, stop, work out why, and take time off to fix them.
  • And, most importantly: Take responsibility for the rot that you find.

1 comment:

Anonymous said...

spot the rot - always spot the rot!