Thursday, March 26, 2009

The PrimeBase BLOB Streaming (PBMS) engine alpha version 5.08 is ready

Alpha version 5.08 of the BLOB streaming engine for MySQL has been released. You can download the source code from www.blobstreaming.org/download. The documentation has also been updated.

What's new in 5.08:
  • All PBMS data is stored under a 'pbms' directory in the MySQL server's data directory rather than in the database directories themselves.
  • This version now builds with Drizzle and can be loaded as a 'Blobcontainer' plug-in.
  • Added the possibility of storing BLOB metadata along with the BLOB in the repository.
  • Added the possibility of assigning an alias to a BLOB, which can then be used to retrieve the BLOB instead of using the engine generated URL.
  • Added an updateable system table 'pbms_metadata_header' to control which HTTP headers are stored as metadata.
  • Added an updateable system table 'pbms_metadata' that contains all the metadata associated with the BLOBs.
  • New PBMS API functions have been added to set and get BLOB metadata.
  • A new PBMS API function has been added to allow applications to get the BLOB metadata without getting the actual BLOB.
  • Added some new fields to the 'pbms_repository' system table.
  • Removed the raw BLOB data from the 'pbms_repository' system table and placed it in its own table, 'pbms_blob'.
  • Dropping a database containing PBMS BLOBS referenced from non-PBXT tables no longer requires special handling.
  • System tables can now be selected with an 'order by' clause.
As you can see, a lot of work has gone into this release and a lot of things have changed. I have already covered most of the major changes in my previous two blog postings, but I recommend having a look at the documentation for more details.

A few things to watch for in the new version:

  • The location and format of the BLOB repository files have changed. This means that if you are using an older version, you will need to import the data from the server running the older version of PBMS into tables on a server running the new version. Feel free to contact me if you have any questions about this. I will try to maintain backward compatibility with older versions of PBMS, but until the code is no longer alpha I cannot guarantee this.
  • A patch was added to the PBXT engine to prevent a crash when shutting down MySQL. The PBXT version '1.0.07m-rc' contains this patch and is available for download with the PBMS engine.
  • There remains an unsolved problem that can lead to PBXT hanging when a PBXT table containing longblobs is dropped and the PBMS engine is being used for BLOB storage.
As usual, if you have any questions about PBMS and BLOB storage, please send them to me and I will do my best to answer them.

Barry

Wednesday, March 4, 2009

PBMS supports BLOB metadata and aliases

The PrimeBase PBMS engine now supports user defined metadata.

When the PBMS engine receives a BLOB, the HTTP header tag-value pairs are stored with the BLOB as metadata. To restrict which headers are stored as metadata, an updatable system table 'pbms_http_fields' is provided, to which users can add the names of the headers that are to be treated as metadata. When the BLOB is retrieved from the engine, the metadata is sent back along with the BLOB as HTTP headers, in the same way that it was received.

The metadata can be read from within the database via the 'pbms_metadata' system table, and it can be altered by performing inserts, updates, or deletes on that table.

A BLOB alias is a special metadata field that allows you to associate a database-wide unique name with the BLOB, which can then be used later to retrieve it. If you are familiar with Amazon S3 storage, it works in a similar manner: you can think of the database as the S3 bucket and the BLOB alias as the S3 key. To fetch the BLOB back using the alias, you use a URL with the format <database>/<alias>. The following is an example using 'curl' to send a BLOB with alias 'MyBLOB' to the PBMS engine and then fetch it back again:

curl -H "PBMS_BLOB_ALIAS:MyBLOB" -d "A BLOB with alias" "http://localhost:8080/test"
Returning:
~*test/_1-632-4934f86e-0*17

curl -D - "http://localhost:8080/test/MyBLOB"
Returning:
HTTP/1.1 200 OK
PBMS_CHECKSUM: A3762FF16159FAB246EBA2BE50F98CF4
PBMS_BLOB_SIZE: 17
PBMS_LAST_ACCESS: 1236200654
PBMS_ACCESS_COUNT: 0
PBMS_CREATION_TIME: 1236200654
PBMS_BLOB_ALIAS: MyBLOB
Content-Length: 17

A BLOB with alias
In this example the BLOB has been uploaded to the PBMS engine but not yet referenced, so it will be automatically deleted after a preset time.

As you can see several custom headers have been added to the reply:
  • PBMS_CHECKSUM is the MD5 checksum of the BLOB data.
  • PBMS_BLOB_SIZE is the size of the BLOB data in bytes.
  • PBMS_LAST_ACCESS is the last access time of the BLOB in seconds since Jan. 1 1970.
  • PBMS_ACCESS_COUNT is the number of times the BLOB has been downloaded.
  • PBMS_CREATION_TIME is the creation time of the BLOB in seconds since Jan. 1 1970.
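For anyone consuming these headers from a script, here is a minimal sketch in Python of how they might be interpreted. The helper names (parse_pbms_headers, describe_blob, checksum_matches) are made up for illustration and are not part of PBMS. The sketch assumes that the timestamps can be read as UTC seconds and that PBMS_CHECKSUM is the hexadecimal MD5 digest of the raw BLOB bytes.

```python
import hashlib
from datetime import datetime, timezone


def parse_pbms_headers(raw_reply: str) -> dict:
    """Collect the PBMS_* 'Name: value' lines from a raw header dump."""
    headers = {}
    for line in raw_reply.splitlines():
        name, sep, value = line.partition(":")
        # The status line and any non-PBMS headers are skipped.
        if sep and name.strip().upper().startswith("PBMS_"):
            headers[name.strip().upper()] = value.strip()
    return headers


def epoch_to_utc(seconds: str) -> datetime:
    """Convert seconds since Jan. 1 1970 to a datetime (UTC is an assumption)."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


def describe_blob(headers: dict) -> dict:
    """Turn the string header values into typed fields."""
    return {
        "size": int(headers["PBMS_BLOB_SIZE"]),
        "access_count": int(headers["PBMS_ACCESS_COUNT"]),
        "created": epoch_to_utc(headers["PBMS_CREATION_TIME"]),
        "last_access": epoch_to_utc(headers["PBMS_LAST_ACCESS"]),
    }


def checksum_matches(headers: dict, body: bytes) -> bool:
    """Compare the MD5 of the downloaded body with PBMS_CHECKSUM."""
    expected = headers["PBMS_CHECKSUM"].lower()
    return hashlib.md5(body).hexdigest() == expected


# The reply headers from the curl example above, used as sample input.
SAMPLE_REPLY = """HTTP/1.1 200 OK
PBMS_CHECKSUM: A3762FF16159FAB246EBA2BE50F98CF4
PBMS_BLOB_SIZE: 17
PBMS_LAST_ACCESS: 1236200654
PBMS_ACCESS_COUNT: 0
PBMS_CREATION_TIME: 1236200654
PBMS_BLOB_ALIAS: MyBLOB
Content-Length: 17
"""

info = describe_blob(parse_pbms_headers(SAMPLE_REPLY))
print(info["size"], info["created"].isoformat())
```

Calling checksum_matches with the parsed headers and the downloaded body gives a simple integrity check after a fetch.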
It is also possible to fetch back just the header info without the BLOB data, for example:
curl -H "PBMS_RETURN_INFO_ONLY:yes" -D - "http://localhost:8080/test/MyBLOB"
Returning:
HTTP/1.1 200 OK
PBMS_CHECKSUM: A3762FF16159FAB246EBA2BE50F98CF4
PBMS_BLOB_SIZE: 17
PBMS_LAST_ACCESS: 1236200654
PBMS_ACCESS_COUNT: 0
PBMS_CREATION_TIME: 1236200654
PBMS_BLOB_ALIAS: MyBLOB
Content-Length: 0
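
The three curl calls above can also be expressed in code. The following is a rough sketch in Python using the standard http.client module, which sends header names exactly as written. The PbmsRequest type and the upload_request, fetch_request and send helpers are hypothetical names used only for illustration. The host and port are taken from the curl examples, and unlike 'curl -d' this sketch does not set a Content-Type header.

```python
import http.client
from typing import NamedTuple, Optional
from urllib.parse import quote


class PbmsRequest(NamedTuple):
    """A plain description of one HTTP request to the PBMS engine."""
    method: str
    path: str
    headers: dict
    body: Optional[bytes]


def upload_request(database: str, alias: str, data: bytes) -> PbmsRequest:
    """POST the BLOB to /<database>, naming it with PBMS_BLOB_ALIAS."""
    return PbmsRequest("POST", "/" + quote(database, safe=""),
                       {"PBMS_BLOB_ALIAS": alias}, data)


def fetch_request(database: str, alias: str,
                  info_only: bool = False) -> PbmsRequest:
    """GET /<database>/<alias>; info_only asks for the headers alone."""
    headers = {"PBMS_RETURN_INFO_ONLY": "yes"} if info_only else {}
    path = "/" + quote(database, safe="") + "/" + quote(alias, safe="")
    return PbmsRequest("GET", path, headers, None)


def send(request: PbmsRequest, host: str = "localhost", port: int = 8080):
    """Send the request and return (status, headers, body)."""
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request(request.method, request.path,
                     body=request.body, headers=request.headers)
        reply = conn.getresponse()
        return reply.status, dict(reply.getheaders()), reply.read()
    finally:
        conn.close()


# Usage against a running PBMS engine (not executed here):
#   send(upload_request("test", "MyBLOB", b"A BLOB with alias"))
#   status, headers, body = send(fetch_request("test", "MyBLOB"))
#   status, headers, _ = send(fetch_request("test", "MyBLOB", info_only=True))
```

Building the request separately from sending it keeps the URL and header logic easy to check without a server.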
The PBMS BLOB metadata enables users to store BLOB-specific data with the BLOB and have it returned to them without having to create a separate database table to store it and then execute separate SQL commands to update and retrieve it whenever they upload or download a BLOB.

The BLOB alias allows users to generate their own names for BLOBs which they can then use to access the BLOB without having to make a call to the database to get the PBMS generated URL.