Tuesday, August 16, 2005

Database Upgrade Scripts – Why all the effort?

On Wilfred van der Deijl's blog, Chuck (amongst other things) effectively asks:

"How much effort do you dedicate to total automation when you still need someone smart enough to [deal with failures]".

In order to address this I think I first want to address the question, what are we trying to do with automation.
The aim of is to make it easy to re-run the upgrade with as little effort as possible, with as many configruations as possible, thus testing the upgrade as much as possible and hopefully making the upgrade solid and reliable. Of course we can't make sure that the upgrade works every time with every configuration of data, but we can make sure it runs successfully a lot more often than it doesn't.

This gives us confidence in our ability to change the databases, and in doing so makes us more likely to re-work difficult data structures rather than work around them. This means that our database is better to work with, which makes our jobs easier.

Also, whenever we make a problem free roll-out we increase the confidence the rest of the business has in our ability to do our job. It makes them more open to receiving more upgrades more often and makes them easier to roll them out when we do. This means our customers get their bug fies earlier and their enhancements earlier, and can therefore give us feedback on where to make more enhancements earlier in the software's lifespan.

In short, making the upgrades solid helps us write better software by allowing us to focus on the software rather than the software roll-out, and the customer on the possibilities of new software rather than software failure..

How does automation give us this? Well, the ability to automate builds does not, it's the way in which those automated builds are used that does: Why not make it possible for an overnight process to perform the upgrade to as many different production databases as you can gather the resources to do?

In putting together an automated build plan we want builds on test databases that occur every night, using recent (that day's?) backups of production systems. We want to be able to run through the upgrade as it would occur on live as many times as possible, and we want to do that without draining our resources to the point that it's all we are doing.

If we can have unattended backups of our live system, unattended rebuilds of that live system on a test server, unattended upgrading of that database and then unattended testing of the resultant application, then we can eaily test the upgrade of the live system from the moment the project starts to the night before the production upgrade takes place.

If we do this, then the odds are that by the time we decide to upgrade the production system we know about that last minute data problem that will cause the build to fail. We'll have fixed the problem before the upgrade started, and we'll have dealt with in when we're not in a panic, we'll have done all this without teams of users waiting for the production database to come back up.

Our aim is not to have a completely unskilled worker to perform the task of upgrading the production database, but to make sure the upgrade will go smoothly far more often than not. We want that standby DBA to stay on standby.

That seems like worth a bit of effort to me!

Technorati Tags: , , , , , , , , ,

Friday, August 12, 2005

A seminal essay and a difference of opinion

Thanks to Wilfred van der Deijl's post here, I've just read a January 2003 article by Martin Fowler and Pramod Sadalage... and I can't believe that I had previously missed it!

Martin and Pramod very clearly discuss the overarching issues that appear once you start to look at database development in an agile manner, and you can very easily see from their discussion how important a clear and solid patch runner structure is to successful agile database development.

Not long after this article was written we were just starting to think about the same practices and many of the conclusions we came to matched. Not least the need to implement sandbox databases for each workspace and the need for consistent test data across those databases.

However, our conclusions did differ in one fundamental way, and I find it difficult to understand Martin and Pramod's (M&P) approach in this particular area.

My issue relates directly to the propagation of database changes across the development team. In M&P's approach, database changes are automatically applied to the development workspaces as they occur, managed by the team's DBA. They point out that:

'people are usually concerned that automatically updating developers databases underneath them will cause a problem, but we found that it worked just fine'

I am one of those concerned people. I am concerned in the same way that I would be if I was told that the source code I was working with would be automatically updated at regular intervals as changes occurred. Let me rephrase the above statement with that in mind:

'people are usually concerned that automatically updating developers source code underneath them will cause a problem, but we found that it worked just fine'

Suddenly the statement does not seem quite so reasonable!

In order to develop I need to clearly understand what is within my control and what is outside of my control. I need to be able to rely on the known state of my own workspace. For me, this is a founding principle behind the idea of the development workspace as a sandbox.
A given version of an application is developed to run against a given version of the database. Because of this, the database is as much a part of the workspace as the rest of the source code.

There is also the issue of the database change being applied in advance of the related code change being checked into the version control system, and the manual process therefore required. In M&P's process they will notify the DBA of the changes required once they are decided upon. At some point in the near future the DBA will then make the changes to the central databases and the changes will propagate to the development workspaces. As this happens there are three clear risks that I can see:

1.The database schema change is made a significant amount of time before or after the associated code change is committed to version control.
2.The change required by the developer is miscommunicated to the DBA and an incorrect change is applied.
3.The change is mistakenly applied to development databases that are following a different branch to that in which the change is made.

In all the above cases, the effective result is that the change may invalidate a particular component of the application due to the development (or integration) database being out of step with the source code being ran against that database.
I believe all this risks can be mitigated against by making the following changes to the described process:

1.Make the developers responsible for producing the actual patches that will be eventually be ran against all databases, development, through integration to live. These patches should form part of an clear structured patch runner, and should be placed in version control alongside the rest of the application source code.

2.Make each developer responsible for the maintenance of their own database workspace. Make it easy for a developer to upgrade their database, and make them do so using those scripts that will be ran against all other instances of the database. Make it part of the routine that immediately after updating their source code workspace they upgrade their database.

3.Put the database upgrade into the automated build / continuous integration test suite and ensure that the build being performed by the automated build is exactly the same as that which would be ran on the live system.

These practices have the added advantage of minimising the amount of work required by the DBA, who can exist in a more advisory role within the team.

Of course, there are times where data migrations will take a non trivial amount of time to complete. In these cases it is important that developers are warned of the fact and are able to schedule times when these changes take place so as to minimise the impact on their work.

Having said this, agile database development is a new venture for most everybody involved, and work such as this should not be underestimated, and cannot understated. There will likely be many different ideas on this topic, coming from many different people, and it's a pleasure to be involved. It is very easy to forget just how much database development has evolved in the last 10 years!

Technorati Tags: , , , , , , , , ,

Wednesday, August 10, 2005

Named Notation Parameters in easy to read unit test shock!

One of the better kept secrets of Oracle is the ability to use 'named parameter passing' over 'positional parameter passing'. Oracle covers it here.

That is, if I want to call a procedure that has three parameters that have defaults, and I want to a value pass into the third parameter, I can do so by referencing the name of the third parameter….

Give the function definition:
    FUNCTION insert_employee( pn_department_id  NUMBER := NULL
, pn_manager_id NUMBER := NULL
, pc_employee_name VARCHAR2 ) RETURN NUMBER;

Can be called by performing the following:
    vn_employee_id := insert_employee( pc_employee_name => 'Rob Baillie' );

OK, so in the example given it may make a lot more sense to just order the parameters a little better, like have the employee name first, but you get the idea.

So, it looks cool, and you may be able to think of a few instances where it may prove useful. It can be handy if you think your procedure signatures are likely to change and you have an out parameter you want to keep to the end of the procedure definition...

    FUNCTION insert_employee( pc_employee_name IN     VARCHAR2 
, pn_department_id IN NUMBER := NULL
, pn_manager_id IN NUMBER := NULL
, pc_error_message OUT VARCHAR2 ) RETURN NUMBER;

It can be regarded as tidy make sure that error message parameter is at the bottom, but without having named parameter notation you would never get the benefit of the default parameters, and would have to change all your calls if you ever added a new field between the department and manager id.

Quite handy, but nothing really earth shattering.

The place where we’ve found the most use is in unit tests...

Lets say we’re writing a test for the function:

    FUNCTION insert_holiday( pn_employee_id   IN     NUMBER
, pd_from_date IN DATE
, pd_to_date IN DATE
, pc_error_message OUT VARCHAR2 ) RETURN BOOLEAN;

Lets say that this function has many different ways it can fail: Any of the parameters being NULL, the from data being later than the to date, the total number of days over the allocation...

In each case the function will fail in the same way. Fail to insert the holiday record, return FALSE and set pc_error_message to the reason why it failed.

In order to test for this conditions we write a simple failure check procedure with a signature along the lines of:
    PROCEDURE check_insert_holiday_fails( pc_context      VARCHAR2
, pn_employee_id NUMBER := 7438
, pd_from_date DATE := TRUNC( SYSDATE ) + 10
, pd_to_date DATE := TRUNC( SYSDATE ) + 17 )

Each of the insert_holiday’s input parameters are duplicated on the check procedure, and set to be valid values for that procedure.
Check_insert_holiday_fails calls insert_holiday with the passed in values and goes on to check for the correct three error conditions: No increase in number of holiday records, returning false and passing back an error message.

The pc_context is appended to each of the assertion texts in order to give a nice readout. So, for example, the check false assertion may be (using UtPlsql):

    utAssert.this( 'When ' || pc_context ||', insert_holiday returns false', NOT vb_result );

The idea is that if the procedure was called with only the context value specifed, the call to the tested insert_holiday function would be successful (and the test would fail, if that makes sense!).

So, we have a single procedure that will check that a failure state is returned, covering all the components. We can then call this with our error states, using the named parameter notation in order to only change the parameters we are interested in changing. E.G.

    check_insert_holiday_fails( 'invalid employee_id'    , pn_employee_id => -1 );
check_insert_holiday_fails( 'NULL employee_id' , pn_employee_id => NULL );
check_insert_holiday_fails( 'from date after to date', pd_from_date => SYSDATE + 5, pd_to_date => SYSDATE + 2 );
check_insert_holiday_fails( 'from date in past' , pd_from_date => SYSDATE - 1 );
check_insert_holiday_fails( 'to date in past' , pd_to_date => SYSDATE - 1 );

The use of the named notation means that only the parameters important to the failure are stated, rather than the full list. This makes it easier to see each individual test case clearly. If the procedure being tested has a lot of parameters, then the advantage becomes very clear!

Technorati Tags: , , , , , , ,

Update: Sorry for the chnage in permalink address. I just couldn't handle the typo in the title!

Saturday, August 06, 2005

Haloscan Ping 0.3 (alpha 2) released

For those that are interested, a new version of Haloscan Ping has been released. Thanks go to Andrew Beacock and Ryan Cullen for their input!

This version fixes a couple of minor bugs with the first alpha...

  • Atom feeds with no content, but a summary tag are now recognised correctly

  • NULL entries no longer cause the ping screen to crash

  • If an entry is shorter than the minimum number of characters, it will still be chopped off to the nearest sentence

What we're currently working on for the next version:

  • Ability to send multiple pings in a single submit

  • Automatically stripping tags from the content

As with the previous version, if anyone wants it, please mail me!

Technorati Tags: ,, , , , , , , , , , , ,

The Database Patch Runner: Table Centric Views

On Wilfred van der Deijl's blog we got into a nice discussion about grabbing a table centric history of the patches applied to a database when using a functional patch centric upgrade organisation.

I think I've made it obvious in the past that I massively favour the functional grouping, since I think it makes the job of the developer and the person performing the upgrade that much clearer. However, I acknowledge the perceived need for a table history, and started thinking about how you might produce it.

Just as I was getting to some sort of conclusion, Wilfred put together a couple of great ideas... and I thought I'd give them a critical review.

Forgive me for this Wilfred... I really wanted to get this down, I think it's a great topic!

Group the patches by table, then add interpreted comments to the files in order to state which patch a given ALTER TABLE statement belongs with

I'm not sure about this solution for a simple reason. I don't like organising the source files by tables; I think this makes the developer's job that much harder.

In producing a patch the developer needs to update several files and cannot easily see the order in which these changes are applied.

I simple situation where this becomes very important is where a column needs to be moved from one table to another. In order to do so, I cannot see how this can be done without changing at least two files; one that adds the new column, and one that drops the old.. In addition, you need to find somewhere to put the data migration script. Does this go into to the DDL file for the table having the column added? Or do you could add a third file into the mix.

I have found in the past that the create column and data migration get put into the table scripts and there is a manual process added to ensure that the existing column gets dropped. The process is then forgotten in the heat of the production upgrade and the redundant column remains in the schema.

In addition, with the table centric grouping it is easy for a developer to miss a step in the DDL, a column isn't added to a table, for example. By having the functional grouping it's easier to spot this mistake since the developer has a single file that lists all the DDL applicable for that functional change.

Finally, there is a need to add that extra step... adding the comments to the table create scripts. I have an objection to this in the same way that I have an objection to JavaDoc comments. People simply don't add them, and nobody notices / cares until such time as those missing comments are crucial. By then it's too late. The comments could be policed, but whenever I see something that needs policing I see something that needs changing so the policing is no longer needed.

Another way would be to stick to your way with a SQL script per patch and just store all statements in the database... [snip] ... have system triggers that log all DDL statement against particular objects

I love this idea. I think this has got a lot going for it. The big advantage of this technique is that the person writing the patch does not need to change the way they work in order for it to be implemented.

As part of the patch runner installation it can add a system trigger to the current schema that will log the changes to the schema as they occur. This detail log can state which object that has changed, the change that was made, and the patch that issued the change.

The trigger can be notified of which patch is currently running (as stated by Wilfred) by the patch runner logging the name of the current patch in a package variable, a temporary table or some other session specific temporary store.

Once the patch runner has completed it can then produce a report stating which changes were applied by that upgrade, grouped by table. It can produce a report of all changes ever made to the database. If so required it could list the changes classified by some arbitrary grouping of objects (system functional area / table sizes).

As Wilfred states, there is a downside to this... the data is in the database, rather than version control. It is generated after the change has been applied. However, this data can be made available in version control fairly easily. The simplest solution is to have the report run regularly on a development or test server. Copy the report into version control and away you go. If you're running a nightly build then this can be done automatically each night from the test server in advance of the codeset being tagged.

Alternatively you can provide a simple user interface in order to query this information more interactively. If, for example, you're developing browser based applications this will be a fairly simple tool to produce within your normal toolset.

Additionally, such a system DDL trigger can be used to report on ANY changes made to the database, not just those applied through the official upgrade process. Having produced reports that will list object changes from the database it would be a trivial task to change those reports to list any objects that have changed without being ran through the patch runner. It may be that in a strict environment a change made outside of the patch runner (and therefore the standard application upgrade) could be blocked. Obviously, any developer or DBA that knows the structure will be able to work around it without too much work, but this then means that you are certain that the person doing so is consciously making a change outside of the accepted process, rather than because they are not aware that such a process exists.

As I've stated earlier, on Wilfred's blog, I'm not sure our organisation has a need for the table centric view of the changes made to a database. In fact I strongly urge those people who do so to ask why it is they do. I'd love to hear the reasons why... I feel as though I'm missing something!

That said, I think the second solution proposed by Wilfred is spot on, and if we were looking to produce a table centric view we would definitely follow that avenue first.

Technorati Tags: , , , , , , , , ,

Rant: You want my money?

To all you companies out there that want to take my money, and want me to pay you by credit card, on the internet. When you ask for my credit card number, you've got three choices:

1 - Build the entryboxes so that it doesn't matter if I put spaces or dashes in, just let my format the number 1234-1234-1234-1234, strip the number out and just use that.

2 - Build the entry boxes so that I can't put spaces, dashes or too many characters in and make it clear that I can't. That way I'll type my credit card number with just the numbers.

3 - Let me type a combination of spaces, dashes and numbers in the way that's natural to me, then tell me I'm an idiot. Give me messages like "Your credit card number has too many characters, please try again", or (my personal favourite) "You have entered your card number with spaces, please enter just numbers".

Guess which option will reduce the chances of me using your company next time. Guess which option your company most likely uses.

Friday, August 05, 2005

The Database Patch Runner - Rollbacks

In response to my earlier post here, Andrew Beacock asked a couple of seemingly small questions.

First of all, he asked for more detail. I'm going to address that in a later post.

The second question, I didn't really expect. He basically asked “How do you rollback a patch?”

I'm assuming that he's asked this question in relation to the point I made that went along these lines: Since the patch runner stops running patches whenever one of the fails, then recovery is simply a case of rolling back the changes that the patch had already made and then restarting the patch runner. If the runner ensures it doesn't attempt to re-run any successful patches, but does re-run the failed one then it effectively picks up from where it left off and goes on its merry little way.

The point I was trying to make was that since the patch runner stops, then you don't get into the situation where you're attempting to fix a problem that occurred half way through the upgrade by getting to the end that then picking up the mess it left in its wake. Instead you know where the upgrade failed, the patch log states it clearly; you know what the patch was trying to do, since you have a functional grouping of changes in the patch, and this gives you context for the failed change; you know that you only have to worry about the failure that is reported, not the hidden data problems that the rest of the upgrade generated because the first patch didn't do its job properly.

So how do you rollback the changes in order to get the patch up and running again?

To be honest, there's nothing I can think of other than getting your hands dirty; rolling up your sleeves and getting into the database. Picking apart what happened, piecing together the story and trying to gaffer tape up the problems.

However, you can make that job easier:

The big thing is... Think about transactions in your patch

Every DDL statement issues an implicit COMMIT, so don't pepper your patch with DDL, leaving data in a transient state committed in the database. Think about the order in which you do things in order to reduce the chances that a failure at any given point leaves data in an invalid state.

Ensure that your 'add to schema' DDL comes first. That way, if you're moving two columns between tables, and the second column's create fails, you don't get stuck in a situation where you need to reverse the first change or apply the second one manually.

Ensure that your 'remove from schema' DDL comes last. For the same reason as above. It's much easier to deal with if you know that your data migrations have completed and it's just a case of removing those columns manually and marking the patch as complete.

If you can get away with it, don't issue commits in your patch's data migration until it's completed. Admitted, if you're moving millions of rows of data around, this may not be possible. But think about it... do you really need to issue that commit every 10,000 records, or can you get away with running the whole lot through first? That way, when you get a failure, the patch runner can ROLLBACK and you know what state the data is in... the state it was before the patch started.

If you can't get away without committing regularly, or the patch simply takes to long to risk having to repeat it, think about how you can make the migration restartable. That is, store a watermark, make your patch keep track of where it was so that it can restart after a failure. Make sure that the patch can handle being restarted without it trying to re-do work it has already done. If it really is that risky, why not check that you have enough space in your rollback segments before you generate GBytes of data?

Ideally you want to make sure that your patch doesn't fail, but with a bit of thought before you write the patch, you can make your life easier when it does.

Technorati Tags: , , , , , , , ,

Comments back

Comments should be running normally again now. The offending Haloscan comments javascript has been replaced with the original Blogger javascript.

I had inadvertently switched over to Haloscan comments without switching off the Blogger comments. So adding a comment when you arrived from a permalink and clicking on 'Post a comment' gave you a Blogger comment that couldn't be seen from the main page.

Ah well, you live and learn! Normal commenting has now been resumed.

Thursday, August 04, 2005

Comment problems

Just so you know, I'm having trouble with the comments on here...

I've managed to install two competing sets of comments.
Haloscan is dealing with the comments of the main page, then Blogger is dealing with the comments that you see when you look at an individual post. Look at this post and you'll see what I mean.

Over the weekend I'll try to make sure I fix the problem...
For now, please accept my apologies :-)

The Database Patch Runner

Over here Wilfred van der Deijl has raised the time old question when dealing with databases.

How do you easily upgrade a database from an arbitrary version to a new particular version with the smallest amount of effort.

I responded with a brief description here , which caused Amihay Gohen to ask a question or two over E-mail. I thought I'd put together something a little more coherent and detailed here, rather than just send it on to Amihay directly.

This is the basic solution...

Whenever a developer / pair of developers need to change the database they write a patch. This patch contains any alter tables / data migrations / etc. A single file contains the full set of changes for a single conceptual change and may cover changes to several tables.

The patches are written in PL/SQL procedures. That is, each single set of changes is implemented in a single procedure. Each DDL command is issued in an EXECUTE IMMEDIATE, or similar, as is any DML that references a table or column that does not yet exist. The patch is written in a procedure so that we can use the Oracle exception handler to propagate fatal errors in the code to a controlling procedure, something that is far more difficult to do in a script.

These patch procedures are then installed by a "patch runner". This runner is guided by a .txt list of patches that is created by the developers. If a developer wants to add a patch then they write the procedure and add it to the patch list. The runner then uses this patch list to tell it which patches need to be installed, and in what order. The runner is written in PL/SQL, using UTL_FILE and DBMS_SQL to install the procedures.

The runner logs when a patch starts, in a table, and marks the completion or failure of that patch when it finishes. By looking in the patch log before it runs a patch, the patch runner ensures that it never attempts to run a patch that has previously been marked as complete.

In addition, if any patch fails then the patch runner aborts, stating that it failed. "Crash don’t trash!". If the patch runner is restarted then it will attempt to re-run any patches that previously failed. This makes it easier to restart a failed upgrade if ever required. You reverse out the changes already made by the failed patch and then rerun the patch runner.

The runner itself is installed in a single PL/SQL script, and we make sure that it installs all the components it needs before it starts, and tears itself down once its finished. It does not rely on any other part of the database or application in order to work. Since there are no dependencies outside of the patch runner install script, we can use a single patch runner for all of our database based applications, with no changes.

The installation and upgrade of the patch_log table is dealt with in the more traditional manner of a single CREATE TABLE at the start followed by ALTER TABLE statements as it changes over time. The patch log is not destroyed down with the runner.

So, in summary, you check out a given tagged version of the application to run against an arbitrary database and run the patch runner. It loads the list of patches that are applicable for that tagged version. It runs over each patch in turn and checks if it has previously ran. If it has not, then it runs the patch.
By the time it reaches the end of the list it has ran all the patches, none of them twice. You can even run the patch runner twice in succession and guarantee that the second run will not change the state of the database.

The key to that is then to make sure that you use the build often. Get your developers their own database workspaces and make sure that that rebuild and upgrade them often. By repeating the upgrades often you ensure that they basically work before they get moved into a test environment with the bigger worry of large volumes of data.

The patch runner is, of course, only a small part of the database build, and a smaller part of the whole application build. So the runner needs to put into the context of a larger build script. We use batch files here in order to manage our database build process, but I see no reason why other, for complete scripting languages shouldn't be used, or ANT. The runner is, after all, just a PL/SQL script.

As an aside, we also write tests for the patches, checking the pre and post conditions are valid… but that’s a very different topic!

Technorati Tags: , , , , , , , , , ,

Monday, August 01, 2005

Oracle Firefox

Whilst it doesn't give you anything that a bookmark doesn't give you anyway, that little search box next to the address bar still feels pretty handy.

And if you're an Oracle developer, then tahiti.oracle.com is pretty handy too.

Well, someone out there has put the two together here.

Not sure I'll use it that often, but it's worth a look!

Update: 25/01/06
Turns out I use this absolutely loads. If you haven't installed it yet... get it on!

Technorati Tags: , , , , , ,