Wednesday, December 06, 2006

Repeating a point

The other day I mentioned the principle "Don't repeat yourself". I think it may have inspired Andy Clarke to write this up, and he's quite right. It comes from the Pragmatic Programmers.

APC's spot on in his description as it relates to writing code, but he doesn't go far enough.

DRY relates to every part of software development, not just the bit where you're knocking out code.

If, in any part of the process, you find you have a duplication of knowledge then you have a potential problem.

Anyone ever read that comment at the top of a procedure and found its description doesn't match the code that follows?

Watched that demonstration video and found that it's showing you an utterly different version of the system to that which you've just installed?

Looked at that definitive ER diagram and found it's missing half the tables?

Well, don't put a comment at the top of the procedure; instead, document the behaviour by writing an easy-to-read unit test for it. The knowledge might still be duplicated (the test duplicates the knowledge inside the procedure), but at least those pieces of knowledge are validated against each other. If you have to repeat, put in some validation.
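
To make that concrete, here's a minimal sketch in Python. The procedure `apply_discount` and its rules are made up purely for illustration; the point is that the test names read like the comment you would otherwise have written, and they fail loudly if the code drifts away from them.

```python
import unittest


def apply_discount(price_pence, percent):
    """Hypothetical procedure under test: take a percentage off a price in pence."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_pence * (100 - percent) // 100


class ApplyDiscountBehaviour(unittest.TestCase):
    """Each test name documents one piece of the procedure's behaviour."""

    def test_ten_percent_off_fifty_pounds_is_forty_five_pounds(self):
        self.assertEqual(apply_discount(5000, 10), 4500)

    def test_zero_percent_leaves_the_price_unchanged(self):
        self.assertEqual(apply_discount(1999, 0), 1999)

    def test_a_discount_over_one_hundred_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(5000, 150)
```

Saved as a file, this can be run with `python -m unittest`, and the list of test names doubles as the description of what the procedure does.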

Don't have one team writing automated functional tests and another producing videos; write your video scripts as automated tests and have the videos generated with every build.

Instead of manually creating a set of ER diagrams and documentation on what the system will be like, write some documentation generation software and have them generated from the current state of the database.

You might notice that there's a running theme here... generation. Well yup. One of the best ways of reducing the chances of discrepancies between sources of knowledge is by ensuring there is only one representation of that knowledge and generating the others.
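
As a rough sketch of what "generate it from the current state of the database" can look like, here's a small Python example. It assumes SQLite (via the standard `sqlite3` module) simply because it needs no setup; the function name `describe_schema` and the example tables are invented for illustration.

```python
import sqlite3


def describe_schema(conn):
    """Return plain-text documentation built from the live schema."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        lines.append(f"Table: {table}")
        # PRAGMA table_info yields (cid, name, type, notnull, default, pk).
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            flags = []
            if pk:
                flags.append("primary key")
            if notnull:
                flags.append("not null")
            suffix = f" ({', '.join(flags)})" if flags else ""
            lines.append(f"  {name}: {col_type}{suffix}")
    return "\n".join(lines)


# Hypothetical schema, created in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sheep (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE cheese (id INTEGER PRIMARY KEY, type TEXT);
""")
print(describe_schema(conn))
```

Because the description is read straight from the schema, there is no second copy of that knowledge to fall out of date.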

It's one of the reasons why I've been working on the new Open Source library 'P-dd' (Php Database Documentor). It's intended to be a simple library for the production of database documentation from a number of different sources. The ultimate aim is to be able to read from any of the major RDBMSs, Wikis, XML files and suchlike, and to simply output many different forms: HTML, GIF, PDF, XML, OpenOffice documents. Over the next week I intend to let people know where they can find it, in its early form...

Wednesday, November 29, 2006

Worth repeating...

A mantra for all elements of software development...

Repeat after me:

Don't Repeat Yourself

Don't Repeat Yourself

Don't Repeat Yourself.

Monday, November 06, 2006

Testing doesn't have to be formal to be automated

Something I hear quite a bit from people who don't 'do' automated testing is the set of excuses that goes something like this:

"Look, we know it's a really good idea and everything but we just can't afford the start-up time to bring in all these automated test tools, set up a continuous integration server and write all the regression tests we'd need in order to get it up and running. And even if WE thought we could, there's no way we'd get it past the management team."

Now let's for a second assume that the standard arguments haven't worked in response: How can you afford not to; It'll save you time in the long run; Yada yada yada. When people are in that mindset there's not much you can do...

For reasons that I'm going to explain to you right now, we have a project on the go that isn't covered by automated tests. It's an inherited system that can't have tests retro-fitted in the kind of time we have. In reality, most of the work we're doing is actually removing functionality, with a few cosmetic changes and a little bit of extra stuff in the middle. No more than a couple of weeks' work.

It turns out our standard automated test tools can't just be readily fitted onto the system we have.

But that's OK. It certainly doesn't mean we're not going to test, and I'm damn sure that we're going to automate a big chunk of it.

First of all we've separated all the functionality that can be delivered in a new module and that part WILL be fully unit and story tested. That leaves a pretty small amount of work in the legacy system. Small enough that we could probably accept the risk associated with not doing any automated testing.

But that would be defeatist.

So instead we've picked up Selenium.

Not the full-blown Selenium server and continuous integration hooks and whatever. Just the Firefox-based IDE.

It's simple, and requires absolutely minimal set-up... it's an xpi that just drops straight into Firefox like any other extension. Having installed the IDE you get action record and playback, and nice context-sensitive right-click options on any page that allow you to 'assert "this ole text" appears on page' or 'check the value of this item is x'. Basically it's almost trivial to get a regression test up and running. Then you can use the IDE to run the test.

So having got that up and running, before we set about deleting huge swathes of functionality we create a regression test that ensures that the functionality we want to keep stays there. We've found that a decent-sized test that covers a fair few screens and actions can take us as little as an hour to get together. To put it into perspective: we put together a test script today that took about 20 minutes to run manually. That test took us about 45 minutes to sort out in the Selenium IDE and then a minute or two to run it each time after that. So by the time we'd run it 3 times we'd saved ourselves around 10 minutes!
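
The arithmetic is easy to check. The sketch below (in Python, with function names made up for the purpose) takes the figures quoted above: 20 minutes per manual run, 45 minutes to record the test, and one or two minutes per automated run. On those numbers the third run is the first one where automating comes out ahead, by somewhere between 9 and 12 minutes, and every run after that saves the best part of 20 minutes more.

```python
import math


def minutes_saved(runs, manual_minutes, setup_minutes, automated_minutes):
    """Time saved after `runs` runs, compared with doing every run by hand."""
    manual_total = runs * manual_minutes
    automated_total = setup_minutes + runs * automated_minutes
    return manual_total - automated_total


def runs_to_break_even(manual_minutes, setup_minutes, automated_minutes):
    """Smallest number of runs at which automating is strictly cheaper."""
    saving_per_run = manual_minutes - automated_minutes
    if saving_per_run <= 0:
        return None  # automation never pays for itself
    return math.floor(setup_minutes / saving_per_run) + 1


# Figures from the post; the automated run is "a minute or two".
for automated in (1, 2):
    print(
        automated,
        runs_to_break_even(20, 45, automated),
        minutes_saved(3, 20, 45, automated),
    )
```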

Running it might involve executing a SQL script manually, running a Selenium script, then checking some e-mails arrive, then running another Selenium script. In short, it might involve a few tasks performed one after another. And yes we could automate the whole lot, but like I say... we just don't have the time right now. But using the tool to add what tests we can right now, to help us with our short (a few hours each) tasks means that we're building up a functional test suite without ever really thinking about it. We'll keep those scripts, and maybe in a couple of weeks we'll realise that we DO have the time to set up everything else we need for proper functional testing.

Yep, it could be a hell of a lot better (and on our other projects it is), but some informal testing using an automated runner is an order of magnitude better than no automated testing at all.

Wednesday, November 01, 2006

Going Dotty

There's a new big thing in my sphere of interest: Dotty.

For the uninitiated: Dotty, Neato and Lefty are part of Graphviz, a family of tools that take pretty simple text files and generate directed or undirected graphs.

For the initiated: No, I can't believe I've never found it before either!

It was name-checked during the Google LTAC by at least one presenter, and they reminded me that I'd heard its name quite some time ago and meant to look it up. When we were looking at a diagramming problem just a few weeks ago, I figured I should track it down. I was by no means disappointed.

Basically we wanted something that would graph our MVC workflow configurations to make them more readable.

That is, our MVC structure allows us to string arbitrary tasks together: perform x, if result is y, go to task z, if result is h go to task i.

The idea is to keep these configurations as simple as possible; they're only really receiving user input and then prodding objects, but still, there are some complexities. This is especially true when branches split and rejoin. For some reason, XML files or PHP arrays can be difficult to read ;-)

Quite a long time ago we wrote a small application that would graph them in HTML, but we never liked its results. When paths split and rejoin, the HTML representation wouldn’t show the rejoin.

So, as I say, we picked up Dotty.

Simply genius.

For directed graphs the Dot output is stunning. We can pass it a file in (the trivially simple) Dot language and it'll produce great looking diagrams.

For example, the file:

digraph finite_state_machine {
    // Default fonts for every node and edge
    node [ fontsize="12", fontname="arial"]
    edge [ fontsize="8", fontname="arial" ]
    // Tasks are labelled "name (task class)"; edge labels are task results
    EntryPoint [ label="EntryPoint (BuildSheepFromInput)", shape="diamond" ];
    EntryPoint->SaveEditedSheep [ label="DEFAULT" ];
    SaveEditedSheep [ label="SaveEditedSheep (SaveEditedSheepTask)" ];
    SaveEditedSheep->SaveCheese [ label="DEFAULT" ];
    // Boxes are entry points into other workflows
    EditSheep__EntryPoint [ shape="box" ];
    SaveEditedSheep->EditSheep__EntryPoint [ label="ERRORS" ];
    SaveCheese [ label="SaveCheese (SaveCheeseTask)" ];
    SaveCheese->GetCheeseType [ label="DEFAULT" ];
    GetCheeseType [ label="GetCheeseType (GetCheeseTypeTask)" ];
    DisplayWensleydale__EntryPoint [ shape="box" ];
    GetCheeseType->DisplayWensleydale__EntryPoint [ label="WENSLEYDALE" ];
    GetCheeseType->DisplayCheddar__ComposeMessage [ label="CHEDDAR" ];
}

Would produce:
SaveCheeseWorkflow – Example DOT image

Stunning!

It's not difficult to write code to generate the DOT files, and the output from neato (the same as dot, but for undirected graphs) is just as high quality.
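
To give a feel for how little code that takes, here's a minimal sketch in Python. The dictionary layout (a `handler` name plus a map of result to next task) is an assumption made up for this example, not our actual configuration format, and `workflow_to_dot` is an illustrative name.

```python
def workflow_to_dot(name, tasks):
    """Render a workflow as DOT text: one node per task, one edge per result."""
    lines = [f"digraph {name} {{"]
    for task_name, spec in tasks.items():
        handler = spec["handler"]
        lines.append(f'    {task_name} [ label="{task_name} ({handler})" ];')
        for result, next_task in spec["transitions"].items():
            lines.append(f'    {task_name}->{next_task} [ label="{result}" ];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical configuration: perform a task, then branch on its result.
save_cheese = {
    "SaveEditedSheep": {
        "handler": "SaveEditedSheepTask",
        "transitions": {
            "DEFAULT": "SaveCheese",
            "ERRORS": "EditSheep__EntryPoint",
        },
    },
    "SaveCheese": {
        "handler": "SaveCheeseTask",
        "transitions": {"DEFAULT": "GetCheeseType"},
    },
}

print(workflow_to_dot("SaveCheeseWorkflow", save_cheese))
```

Write the result to a file and `dot -Tpng workflow.dot -o workflow.png` turns it into an image. Because every edge is drawn from the same configuration the application runs, branches that split and rejoin show up correctly.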

Of course, as soon as I saw the output the cogs started moving in my mind... I'm now on a bit of a brainstorm on what can come next: How about ER diagrams generated from the database schema and published on an internal site? Generated documentation is never out of date, and it's a damn sight easier having it generated on the fly than it is to load up Visio and get THAT monstrosity to do the job for you.
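
Here's a rough sketch of that idea, in Python and against SQLite purely because both need no setup. It is not the library mentioned in this post; `schema_to_dot` and the example tables are invented for illustration. It draws one box per table and one edge per foreign key.

```python
import sqlite3


def schema_to_dot(conn):
    """Build DOT text with a box per table and an edge per foreign key."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    lines = ["digraph er {", '    node [ shape="box" ];']
    for table in tables:
        lines.append(f'    "{table}";')
    for table in tables:
        # PRAGMA foreign_key_list yields
        # (id, seq, referenced table, from column, to column, ...).
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            referenced, from_column = fk[2], fk[3]
            lines.append(
                f'    "{table}"->"{referenced}" [ label="{from_column}" ];'
            )
    lines.append("}")
    return "\n".join(lines)


# Hypothetical schema, created in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flock (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE sheep (
        id INTEGER PRIMARY KEY,
        flock_id INTEGER REFERENCES flock(id)
    );
""")
print(schema_to_dot(conn))
```

Run as part of a build, something like this would regenerate the diagram from whatever the schema looks like that day.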

Anyway, the ER diagramming library will be open source, and it IS on its way... I promise.

(Note: if you want more info on dotty, take a look here)

Tuesday, October 31, 2006

Run Done, and other news

Well, it's done. Finally the running season's finished for me after completing the Rainforest Foundation's 10km run.

My target for the year was to beat 55 minutes, and I managed it twice. Sunday's run was finished in 54:01, but I managed to beat that 3 weeks ago in the Nike Run London event... 53:23. So to say I'm pleased is an understatement!

Happy Bob

Aside: To those people protesting against Nike at the event: You have my sympathies, you really do, but it was a Nike 10km 2 years ago that got me running. I'd not long before given up smoking and the Nike run gave me a real incentive to get myself fit again (I say again, but really I'm not sure I've ever been that fit).

Now that doesn't mean that Nike is a great company that gave me my health, or anything like that. But it does illustrate a point... the vast majority of the people at the event were runners, and Nike took the time to organise a great run for them, so it's probably not an audience that's going to be swayed by your argument. As I said to the protesters on the day, and I really mean this: You organise a 10km run, and I'll run it. An anti-globalisation 10km run in the middle of the most multi-cultural city in the world...

Anyway, next year I'll be stepping it up a touch more and the aim is to get under 50 minutes. The first run's booked already.

And the other news? Well, over the last few weeks I've been trying to get some coding done on a small open source project. It's a PHP library that will generate documentation on arbitrary relational databases. It's in its early stages right now, but I thought it was time to mention it and see if there's any interest out there.

It'll produce bog standard HTML documentation as well as dotty files that you can use to generate diagrams like this:

ExamplePddDiagram

It'll be available soon...