Wednesday, February 16, 2005

Extreme Oracle... part 2

So, we've evolved some rules for using Oracle with PHP in an XP environment. Some of them are probably 'no-brainers', but then we all need reminding sometimes. Some of them are a little PHP specific, but then the reasoning is a lesson in itself. All of them are subject to change, but then we embrace change ;-)

1 - Do not put SQL into your PHP layer

No matter how trivial a SELECT statement can be or how quick it would be to write that INSERT statement in the PHP layer... Don't. Ever.

It's not the PHP layer's responsibility to understand the structure of the database. It's the database's responsibility. By putting that all the SQL in the Oracle layer we ensure that this responsibility is clearly stated. It also means we know where all the SQL is. So when our data structures change we know that only the Oracle layer needs to change. It's only when the logical structure of the data changes that the PHP layer is affected. In those cases we're probably going to change a lot more than just the database accessors anyway.

Instead of writing SQL in the PHP layer, we write PL/SQL functions to return data. These functions clearly describe the data they return. As fond as I am of SQL, I'd never say that SELECT statements clearly define the data they return, especially when more than two or three tables are involved. A clear function name and a well written unit test is a lot better at stating intention than a piece of SQL can ever be.

The data accessors can then be unit tested within the Oracle layer, close to the data structures. Not only that, but there is no need to include database data in the PHP unit tests. As the Oracle layer contains functions with clearly defined return structures, then these can be stubbed in PHP unit tests. If the SQL lived in the PHP class you're trying to test then stubbing isn't really feasible, leading you to either setup database data in your PHP unit test or, even worse, ignore that part of the class in your testing.

2 - Use ref cursors

Most of the time the data accessor will be returning more than a single value, it'll need to return a result set. In such cases it could be applicable to use one of the PL/SQL array like structures for just this purpose. Unfortunately, the PHP 4 OCI layer doesn't really support these particularly well. This is a bit of a shame, especially with the inclusion of BULK COLLECT in Oracle 8i and above, massively simplifying the collection of data from a SELECT statement into a PL/SQL table structure.

Still, PHP is good at handling ref cursors. For those that don't know, a ref cursor is an open SELECT statement that can be passed between Oracle functions, and through the OCI layer. When used within Oracle, a function returning a ref cursor would be used in place of an OPEN statement. Data is FETCHED from it in the same way as any other Cursor. The PHP Oracle Call Interface deals with the cursor in a similarly straightforward way. This means that you can deal with functions as if they are SQL statements without ever worrying about what the SQL statement actually is.

However, there is a downside. The PHP layer is now responsible for dealing with open cursors and therefore of cleaning up when they're finished with. Be strict, write a function in your PHP layer that will call a parameter specified function, get the cursor, retrieve the record set from it and then clean up. Use it where-ever you need to get the values from a ref cursor.

3 - Define record definitions for your ref cursors

Oracle ref cursors aren't declared with any structure, structure is applied to them when a SELECT statement is assigned to them. PHP is weakly typed and will build the structure it needs as it retrieves the rows from the cursor. You don't need to define the structure of the record that the function returns. Until you test...

Whilst you don't need to define the structure for the ref cursor, Oracle is strongly typed. You DO need to define the record structure when you retrieve any values from the ref cursor. And you MUST be retrieving the values from your ref cursor if you're testing your PL-SQL. Good practice dictates that you should write your tests in the same language as the code you're testing (there's a lengthy argument there that may appear in a later Blog, who knows). So your Oracle code has a record definition for each ref cursor somewhere.

We say put that record definition into the application. We group our functions in packages, and each package contains the record definition for each ref cursor returned by functions in that package.

By putting the definition in the package, you have a full specification for each function in the package where it is most appropriate: in the package specification. By following simple naming conventions it's easy to find the record definition that matches each function. In a non test driven environment you may find pitfalls here. In particular, it is easy to change the function without changing the record specification. However, since our tests use these specifications, if the two ever get out of step our tests break. This means our record definition should always be in step with our function.

The record definition could live in the test, and this is where we first put ours. But we soon realised something: If your record definition is in the package specification then it's available in your Oracle layer. And if it's in your Oracle layer that means you can use it when you...

4 - Use your accessors when retrieving data within the Oracle layer

Two of the reasons we ensure that all the SQL in the system only exists in the database layer is so that we minimise the number of places that SQL can be found and reduce the amount of data structure knowledge required in separate parts of the system. These are also good reasons why we should then make sure that we use the accessor functions we've written when we access data from elsewhere in the database layer.

As I've already stated, we can use ref cursors in the same way as any other cursor. We know the row structure of the cursor returned by a function as it's clearly defined alongside the function specification. So using the ref cursor in the Oracle layer is actually less lines of code than using a locally declared cursor.

It may have been inferred earlier that better PL/SQL table support in the PHP OCI may lead towards a different model, of keeping the cursor concerns in the database and only passing complete data structures into the PHP layer. Unfortunately, this would then result in lessening the usability of those functions in the database layer due to the cumbersome nature of PL/SQL array handing. Consequently, now we've started using ref cursors in our projects we'd have to see a good argument against them before we'd stop. That doesn't mean we couldn't be convinced...

More later...

No comments: