Tuesday, April 27, 2010

BLOBs are not just blobs

Recently when talking to someone about PBMS it occurred to me that I had been thinking about BLOBs in the traditional database sense in that they were atomic blocks of data the content of which the server knew nothing about. But with PBMS that need not be the case.

The simplest enhancement would be to allow the client to send a BLOB request to the PBMS daemon with an offset and size to just return a chunk of the BLOB. Depending on the application and the BLOB contents this may make perfectly good sense, why force the client to retrieve the entire BLOB if it only want part of it.

A much more interesting idea would be to enable the user to provide custom server side functions that they could run against the BLOB.

So how would his work?

The PBMS daemon would provide its own "BLOB functions" plugin API. The API would be quite simple where the plugin would register the function names it supports. When the PBMS daemon receives a BLOB request specifying a BLOB function name, it calls the BLOB function passing it a hook to the BLOB data and then returns to the client what ever the function returns.

The first use of this that I can imagine would be to provide a function that would return the thumbnail from a jpeg image rather than the entire image. Other functions may just return the jpeg metadata.

The idea is that BLOBs are not just blobs but are highly structured documents which, given the knowledge of the document structure, it is possible to return portions of the BLOB that are of interest to particular applications.

No comments: