More than 2 decades of writing software, and still loving it...
Thursday, July 12, 2007
Can a change in execution plan change the results?
We've been using Oracle Domain indexes for a while now in order to search documents to get back a ranked order of things that meet certain criteria.
The documents are releated to people, and we augment the basic text search with other filters and score metrics based on the 'people' side of things to get an overall 'suitability' score for the results in a search.
Without giving too much away about the business I work with I can't really tell you much more about the product than that, but it's probably enough of a background for this little gem.
We've known for a while that the domain index 'score' returned from a 'contains' clause is based not only on the document to which that score relates, but also on the rest of the set that is searched. An individual document score does not live in isolation, rather in lives in the context of the whole result set.
No problem. As I say, we've known this for a while and so have our customers. Quite a while ago they stopped asking what the numbers mean and learned to trust them.
However, today we realised something. Since the results are affected by the result set that is searched, this means that the results can be affected by the order in which the optimizer decides to execute a query.
I can't give you a full end to end example, but I can assure you that the following is most definately the case on one of our production domain indexes (names changed, obviously):
We have a two column table 'document_index', which contains 'id' and 'document_contents'. Both columns have an index. The ID being the primary key and the other being a domain index.
The following SQL gives the related execution path:
SELECT id, SCORE( 1 )
FROM document_index
WHERE CONTAINS( document_contents, :1, 1 ) > 0
AND id = :2
SELECT STATEMENT
TABLE ACCESS BY INDEX ROWID SCOTT.DOCUMENT_INDEX
DOMAIN INDEX SCOTT.DOCUMENT_INDEX_IDX01
However, the alternative SQL gives this execution path:
SELECT id, SCORE( 1 )
FROM document_index
WHERE CONTAINS( document_contents, 'Some text', 1 ) > 0
AND id = :2
SELECT STATEMENT
TABLE ACCESS BY INDEX ROWID SCOTT.DOCUMENT_INDEX
INDEX UNIQUE SCAN SCOTT.DOCUMENT_INDEX_PK
Normally, this kind of change in execution path wouldn't be a problem. But as stated earlier, the result of a score operation against a domain index is not just dependant on the individual records, but the context of the whole result set. The first execution provides you a score for the single document in the context of the all the documents in the table, the second gives you a score within the context of just that document. The scores are different.
Now obviously, this is an extreme example, but more subtle examples will almost certainly exist if you combine the domain index lookups with any other where clause criteria. This is especially true if you're using literal values instead of bind variables in which case you may find the execution path changing between calls to the 'same' piece of SQL.
My advice? Well, we're going to split our domain index look ups from all the rest of the filtering criteria, that way we can prepare the set of documents we want the search to be within and know that the scoring algorithm will be applied consistently.
Wednesday, July 04, 2007
Handy "Alert Debugging" tool
One of the coolest things about OO Javascript is that methods can be written to as if they are variables. This means that you can re-write functions on the fly. Bad for writing maintainable code if you're not structured; Fantastic for things like MVC controllers (rather use the controller to forward calls on to the model, you use it to rewire the view so that it calls it directly, and all without the view even realising it!).
What I didn't realise was that the standard window object (and probably so many others out there) can have its methods overwritten like any other.
Probably the simplest example of that proves to be incredibly useful... changing the alert function so that the dialog becomes a confirm window. Clicking cancel means that no further alerts are shown to the user. Great for when you're writin Javascript without a debugger and have to resort to 'alert debugging'.
window.alert = function(s) { if( !confirm(s) ) window.alert = null; }In case you're wondering... I found it embedded in the comments on this post: http://www.joehewitt.com/blog/firebug_for_iph.php. Cheers Menno van Slooten
Tuesday, June 26, 2007
Tab Complete in Windows
Another one of those things that I can never remember off the top of my head so find myself constantly looking it up whenever I get access to a new machine.
I figure it may as well be my own site that I get the info from :-)
To switch on 'Tab Complete' in Windows command line change the following registry keys to '09':
I figure it may as well be my own site that I get the info from :-)
To switch on 'Tab Complete' in Windows command line change the following registry keys to '09':
- HKEY_LOCAL_MACHINE\Software\Microsoft\Command Processor\CompletionChar
- HKEY_LOCAL_MACHINE\Software\Microsoft\Command Processor\PathCompletionChar
Friday, June 22, 2007
Haiku
Saw a cracking Haiku on a t-shirt the other day:
Haiku are easy
But sometimes they don't make sense
Refridgerator.
Then my mind started dwelling on it:
Got the release out
But the testing's not finished
It's falling over
Or
Database is slow
Just can't see what's wrong with it
Set autotrace on
Or
A quick refactor
Turns into a bigger job
Should have unit tests
Haiku are easy
But sometimes they don't make sense
Refridgerator.
Then my mind started dwelling on it:
Got the release out
But the testing's not finished
It's falling over
Or
Database is slow
Just can't see what's wrong with it
Set autotrace on
Or
A quick refactor
Turns into a bigger job
Should have unit tests
Monday, May 28, 2007
Records
And to follow on from the last post... my current personal bests:
I figure if I keep them here, at least I'll always know where they are!
I figure if I keep them here, at least I'll always know where they are!
5km Run | 23:44 (Battersea Park, 'Beat the Baton' 28/05/07) |
10km Run | 53:23 (Hyde Park, 'Run London' 08/10/06) |
Half Marathon | 2:17:49 (Redcar, 'Tees Valley Half Marathon' 12/03/06) |
Rubik's cube | 57 seconds |
Subscribe to:
Posts (Atom)