Wednesday, May 04, 2005

Software development companies need not prevail

Jay Fields suggests in a blog entry (here) that the idea of in-house development is flawed and needs to die... (I'm paraphrasing)

Sorry Jay, but I really disagree. There's nothing inherently wrong with the idea. The problems you cite are problems that any short-sighted organisation will find when it's attempting to build software. I've worked for a couple of software houses, and they suffered from exactly the same problems you describe: short-sighted recruitment processes, cost-cutting exercises, low-quality code.

True, when you're working in-house these problems may be more common. The terms 'cost-centre' and 'profit-centre' are two that really cut to the bone.

However, I'm now working as an in-house developer on a team where staff turnover is extremely low (around 10%), the team is built of very high-calibre developers, morale is high, uptake of new technology is weighed against its reasoned benefit, code quality is higher than I've ever seen before and customer satisfaction is through the roof.
We don't have any external people / consultants / software houses involved and I don't see any reason to have any.

In-house development isn't necessarily doomed to failure. Short-sighted, cost-driven organisations are the problem, and software houses are by no means immune to it.

Tuesday, May 03, 2005

Don't fancy yours much

A little while ago I was looking for a broadband connection. I almost went with Bulldog. Glad I didn't. Someone out there's really not happy with the service...

Bulldog Hell

Friday, March 25, 2005

Assertive Documentation

I've been reading another Pragmatic Programmers book: Pragmatic Unit Testing in Java with JUnit [PUT]. It's yet another excellent book that covers topics relevant well outside the core of its material. That is, anyone who's using, or thinking of using, unit tests will get a lot out of this book irrespective of their choice of language. Knowledge of Java helps, but is by no means mandatory, and the advice it gives is universal.


It talks about strategies for devising complete tests, reminds us that test code is production code like any other, and states that test code should 'document your intent' (pg 6). By this I take it to mean that there should be a clear statement of what the tested code should and should not do. If the test code is to be documentation then it has to be clear, since unclear documentation is useless. However, the book skips over an important means to that end: the assertion text.


The assertion text is one place where you can state your intention without having to worry about the general guff involved in code. You get to write a single sentence saying what it is that is being tested, outside of the constraints of the programming language. In order to write a good assertion, think about what that assertion is for. It is a statement that should say clearly and completely, without support from its context, what is being tested. When the assertion fails, its text is taken out of that context and stated without its supporting code. With the assertion text stated on its own, which would you prefer to read (using a JUnit-like format):



Testsuite: sortMyListTest
Testcase: testForException:
FAILED: Should have thrown an exception

or

Testsuite: sortMyListTest
Testcase: testForException:
FAILED: When passed an empty list,
an exception is thrown


The first statement (taken from the example code below) is incomplete. Whilst it can easily be read in the context of the code, taken away from its origin it loses so much that investigation is required to find the fault. The second statement tells you enough to be able to track down the problem without even looking at the test code.
As an aside: a well-named test procedure can help as well. One option would be to change the name of the test procedure to testForExceptionWhenPassedAnEmptyList.


In addition, taking a little bit of time over the assertion text can help to clarify the meaning of the tests. When you're writing the tests, if you can't state the assertion in an easy-to-understand manner then it may be a clue that there's a problem with the scope of the assertion. Are you testing for multiple things in one assertion? Is it clear to YOU what it is that you're testing for? That is, producing a smoke test that ends in the assertion 'bigProcThatDoesLoads does loads and loads of stuff, then some stuff and a bit more stuff, except when there's this condition when it does this stuff instead' is as much use as assertTrue( false ) when it comes to finding out what's wrong with the code.


Once you have a clear set of assertions sitting within well-named procedures you can take the concept of the test code as documentation another step. That is, it should be possible to generate plain text documentation from the test results. Whilst the test code is accurate documentation, it's not quite as clear as plain text. Well-written tests are pretty easy to read, but I always find that it's the assertion text I look at first. The generation process could remove duplicate assertion text as it goes along to ensure the result is a concise set of documentation.
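As a rough sketch of what that generation step might look like (the DocumentedAssert class and describedAssertTrue method names are mine, not from [PUT] or JUnit, and there are plenty of other ways to do it): tests route their assertions through a helper that records each assertion text in first-seen order, drops duplicates, and dumps the collection as plain text once the suite has run.

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import junit.framework.Assert;

// Collects the text of every assertion that gets exercised so that it can be
// written out as plain text documentation once the suite has run.
public class DocumentedAssert {

    // LinkedHashSet keeps first-seen order and silently drops duplicates.
    private static final Set ASSERTION_TEXTS = new LinkedHashSet();

    public static void describedAssertTrue( String text, boolean condition ) {
        ASSERTION_TEXTS.add( text );
        Assert.assertTrue( text, condition );
    }

    // Dump the collected texts as a simple bullet list, e.g. from a suite-level teardown.
    public static void writeDocumentation( String heading, String fileName ) throws IOException {
        PrintWriter out = new PrintWriter( new FileWriter( fileName ) );
        try {
            out.println( heading );
            for ( Iterator i = ASSERTION_TEXTS.iterator(); i.hasNext(); ) {
                out.println( "  * " + i.next() );
            }
        } finally {
            out.close();
        }
    }
}

A test would then call DocumentedAssert.describedAssertTrue( "When passed an empty list, an exception is thrown", bExceptionThrown ) in place of a bare assertTrue, and something at the end of the suite would call writeDocumentation( "sortMyList", "sortMyList.txt" ).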


Going back to the example given in [PUT] page 36, one could imagine the generated documentation reading along these lines:


sortMyList


  • When passed a list of integers, the list is mutated into ascending order

  • When passed a list of strings, the list is mutated into ascending order according to ASCII values

  • When passed a list of objects, the list is mutated into ascending order according to the return from the method getValue

  • When passed a list all of the same value, the list is not mutated in any way

  • When passed an empty list, an exception is thrown



In order to do this you need the assertion text to be stated whichever way the test goes, in both the positive and negative cases. That is, the [PUT] example is no longer valid:




public void testForException() {
    try {
        sortMyList( null );
        fail( "Should have thrown an exception" );
    } catch (RuntimeException e) {
        assertTrue( true );
    }
}



Rather, the test needs to be formed like this (in “as close as I care about” Java):




public void testForException() {
    boolean bExceptionThrown = false;
    try {
        sortMyList( null );
    } catch (RuntimeException e) {
        bExceptionThrown = true;
    }
    assertTrue( "When passed an empty list, an exception is thrown",
                bExceptionThrown );
}



Also, when writing tests in the file-driven manner there needs to be some way of reading the assertion text in from the file. Maybe those #'s cease to be comments and become assertion text signifiers. On a line of its own the assertion text applies to a family of tests; when placed at the end of the test data it becomes an additional qualifier. Since your test code is production code, you may even be using a generic file reader that will replace tags, e.g. {x}, with the test data (a rough sketch of such a reader follows the generated documentation below)...


Applying this to the example from [PUT] pg 40, describing a getHighestValue function:



#
# When passed a list of ints, returns the highest value
#
9 7 8 9
9 9 8 7
9 9 8 9

...
#
# When passed a single value, returns that value
#
1 1
0 0
#
# Works with the boundaries of 32bit integer values
#
2147483647 # (positive maximum: {x})
-2147483648 # (negative minimum: {x})


We would get the documentation:


getHighestValue


  • When passed a list of integers, returns the highest value

  • When passed a list of negative integers, returns the highest value

  • When passed a mixture of positive and negative integers, returns the highest value

  • When passed a single value, returns that value

  • Works with the boundaries of 32bit integer values (positive maximum: 2147483647)

  • Works with the boundaries of 32bit integer values (negative minimum: -2147483648)
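As promised, here's a sketch of one way such a reader could work. It's not the [PUT] original; the test class name, data file name and the stand-in getHighestValue are all assumptions of mine. Lines starting with '#' carry the assertion text for the family of tests that follows; a trailing '#' comment on a data line becomes an additional qualifier, with any {x} tag replaced by that line's data; the first number on a data line is taken as the expected result and the rest as the input (a line with a single value is treated as both expected result and sole input, which is another assumption).

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import junit.framework.TestCase;

public class GetHighestValueFileTest extends TestCase {

    public void testFromFile() throws IOException {
        BufferedReader in = new BufferedReader( new FileReader( "getHighestValue.dat" ) );
        try {
            String familyText = "";
            String line;
            while ( ( line = in.readLine() ) != null ) {
                line = line.trim();
                if ( line.length() == 0 ) {
                    continue;
                }
                if ( line.startsWith( "#" ) ) {
                    // A sole '#' is just a separator; otherwise the line sets the
                    // assertion text for the data lines that follow it.
                    String text = line.substring( 1 ).trim();
                    if ( text.length() > 0 ) {
                        familyText = text;
                    }
                    continue;
                }
                // Split a data line into its values and an optional qualifier.
                String data = line;
                String qualifier = "";
                int hash = line.indexOf( '#' );
                if ( hash >= 0 ) {
                    data = line.substring( 0, hash ).trim();
                    qualifier = " " + line.substring( hash + 1 ).trim().replaceAll( "\\{x\\}", data );
                }
                StringTokenizer values = new StringTokenizer( data );
                int expected = Integer.parseInt( values.nextToken() );
                List args = new ArrayList();
                while ( values.hasMoreTokens() ) {
                    args.add( Integer.valueOf( values.nextToken() ) );
                }
                if ( args.isEmpty() ) {
                    // A single value on a line: expected result and sole input are the same.
                    args.add( new Integer( expected ) );
                }
                assertEquals( familyText + qualifier, expected, getHighestValue( args ) );
            }
        } finally {
            in.close();
        }
    }

    // Stand-in for the function under test from [PUT]; the real signature may differ.
    private int getHighestValue( List values ) {
        int highest = Integer.MIN_VALUE;
        for ( int i = 0; i < values.size(); i++ ) {
            highest = Math.max( highest, ( (Integer) values.get( i ) ).intValue() );
        }
        return highest;
    }
}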



The resulting process does sound an awful lot like JavaDoc, PHPDoc and the rest of those helper applications. Now, I don't go for such things: I think the chances of the comments being kept up to date are pretty slim, it's very difficult to police, and the resulting documentation is patchy at best. However, the assertion text lives within the code in a much more symbiotic manner. It may be plain text, but it is still part of the code, and when it's right you get an immediate benefit from it.

You're writing your tests before you write your code, aren't you? Surely that means that you're seeing those assertions fail pretty much as soon as you've written them. You can write a test that covers a lot of the functionality of the code before you run it, and when you do you get immediate feedback that tells you how good your assertion texts are. Good assertion text makes it easy to use the failures as a specification for the next bit you need to write. It takes diligence to do it, but it's a lot more likely to happen than accurate JavaDoc comments. And if the generated documentation isn't clear, then it may be a pointer to the fact that the unit tests aren't clear either.

Sunday, March 13, 2005

Stop the rot: Can't stop the rot

Entropy, software rot, broken windows; call the effect what you will, but I'm convinced that the natural state of any piece of software is chaos, and every project will tend towards it. Even with solid design guidelines and vigilance, every piece of software will eventually fall foul of entropy and become an unworkable, unenhanceable, unmaintainable mess.

So we're all doomed then?

Well, no. We just have to be pragmatic about it. By applying good practices we can push back the date when that happens, and all we need to do is make sure it's far enough in the future that we get enough good value out of the software before we have to throw it away and move on to the next project.

Because of the way XP works, it's ideally suited to delaying that date for as long as possible, whilst also being perfectly capable of bringing that date slamming into your project early and without warning.

We develop only for today's problems and worry about tomorrow's when the next day starts; we guarantee that we'll always need to make changes to our code.

We develop tests for our code before we write the code itself, and we maintain those tests so that they're as exhaustive as possible; we have the confidence to change the system as needs arise.

We release early and often and encourage our customers to change direction when we're going the wrong way, throw away things they don't need and enhance things they do; we change what the system needs to do.

These things coupled can mean that our code is changing more rapidly than in most 'traditional' projects. The core of our system may never be complete, classes appear and disappear, whole areas of the system morph. With people confidently applying fundamental changes, their differing ideas of how things should work can battle for attention.

Therefore, the potential speed of entropy within the system is high, which is one of the reasons why we pair program.

Pair programming is intended to ensure that good practice gets applied whenever code is designed, tested or written. By having that sanity checker on hand at all times we hope to ensure that the design decisions are good. We have someone to tell us when we're going the wrong way, to look at the bigger picture whilst we focus on the nitty gritty of knocking the code out. They get involved in the detail, pointing out when we miss a test case and even telling us that our indentation pattern is wrong. Our pair is there to make sure the quality doesn't slip, and to make sure that when we cut corners, we're cut short.

But pairing isn't enough. When two people pair together often, they get familiar with each other's idiosyncrasies. Developers are (mostly) people after all, and whilst one gets tired of telling the other the same thing every day, the other gets tired of always getting it in the neck. Not only that, but you have the same problem the other way round.
As people pair together more and more often each gets less and less likely to police their partner's work.

Also, when people pair together all the time they're more likely to stick to the areas of code they already know, to the exclusion of the rest of the team. That means that the knowledge of that part gets stuck in that pair and never moves outside it, and the pair don't collect the same level of knowledge in other areas of the system. You end up with experts in particular areas who get less and less inclined to move away from them.

So, if you don't cycle your pairs and these two things start to happen, you'll find that pockets of the system start to deteriorate. If constant pairs are always working on the same areas then the quality of work in those areas will start to drop, maybe even without other team members being aware of the fact.

By cycling the pairs we hope to reduce the chances of this happening. Whenever a person pairs with someone they haven't paired with for a while they 'perk up'. They push the quality up as they spot things that the 'normal' partner doesn't. Even if they're pointing out the same problems that the other pair spots, it feels less like 'badgering' as it's someone new doing it. They also point out any bad habits that the normal pair has decided they can get away with.
In addition, if pairs cycle it isn't possible for the same pair to work on the same area all the time, so the knowledge has to spread through the team. As each new person looks at a section of the system there is a certain amount of explaining, or education that's needed. The design is vocalised, diagrammed, made clear. By doing this, flaws in the design are more likely to get spotted. If the flaws are spotted then they're infinitely more likely to get fixed. The rot is less likely to take hold.

However, whether you cycle the pairs or not, it must be recognised that in all cases it is every team member's personal responsibility to ensure that rotten code is cut out before it infects the rest of the system. By cycling the pairs we just make it easier to both spot the rot, and remove it.

Sometimes the changes that are needed can be too fundamental to make 'on the fly'. Maybe the fatal design flaw didn't seem so bad last month and so it wasn't fixed when it was easy to do so. Sometimes it's not that there's a single large change that's needed, but that there are too many little ones. It's at these times that the whole team needs to take a break, look at what's gone wrong and work out what needs to be done.

This may be a test-audit, code-audit, a large scale change of architecture, or some other set of tasks.
Whatever it is, if you're going to take that break its purpose needs to be clear.

As a team you need to get together, decide what needs to be done, produce task cards, estimate them, prioritise them. Work on them like any other piece of work. You need to set the boundaries in the same way as you would for the customer. The work should be tracked in the same way as usual and, importantly, the customer needs to agree to the work and understand what's being done and why it's needed. This means that you are publicly responsible for the work, and that means that it's more likely to be completed.
When you ask the customer for time to 'tidy up' and there's nothing formal on the plan, then that time is never well spent. If you're responsible for showing that things are tidy, then you make sure that you put things in their place.

So, you want to stop the rot?
  • Accept the system's going to become rotten, just make sure it happens far in the future.
  • Cycle the pairs, so you don't get too comfortable.
  • If things start getting too bad to fix on the fly, stop, work out why, and take time off to fix them.
  • And, most importantly: Take responsibility for the rot that you find.

Friday, March 04, 2005

What's your view?

We've known for a little while that we need to formalise the way in which we think about the implementation of our UI.

We've got pretty good at making sure our PHP business objects are well defined, and we refactor at quite a nice pace now. New business rules, classes and structures drop in easily. All in all, pretty well thought out, we like to think.

Our UI isn't quite so clear-cut. We've had a bit of a go at an MVC-based model, but it was put together before we really knew what MVC was. It's a little messy. Not unfathomable, but a little messy.
So, we've decided that it's worth taking a good hard look at Struts, and at PHP implementations of MVC, to see how it should be done. Don't get me wrong, we're not about to jump in and throw away everything we've already done. And we're certainly in no rush to get started. We figure we can look, learn, experiment with some small areas of the application, refine our knowledge and finally see where and when we can refactor our more complex stuff into the new framework. More evolution than revolution.

So we started reading up on php.MVC. It appears to be fairly well respected in the community, so it seems a good place to start.
Only, in the middle of its 'User Guide 101' it says the following:
"The php.MVC framework consists of a complex arrangement of many classes. Luckily we do not need to know in great detail how all these classes work, in order to use the framework."

Now if we were talking about a calendar class, a spell checker or some other 'drop in and use' set of classes, then yes, I'd agree. But we're not, we're talking about the fundamental framework that is the basis for the design of a whole slice of our application. If we don't understand how that framework fits together and how every component should be used then we're going to be in serious trouble. We do need to know in great detail how all these classes work, in order to use the framework effectively. If we don't then we'll end up with the same problems we have now. It worries me that such a project would suggest anything else.