Why use PBMS?
I have explained to people often enough why they should use PBMS to handle BLOB data, so I was surprised when someone asked me where they could find this information and I discovered I had never actually written it down anywhere. So here it is.
If you are unfamiliar with PBMS, it stands for PrimeBase Media Streaming. For details, please have a look at the home page for BLOB Streaming.
Neither MySQL nor Drizzle is designed to handle BLOB data efficiently. This is not a storage engine problem: most storage engines can store BLOB data reasonably efficiently. The problem lies in the server architecture itself, because BLOB data is transferred to and from the server as part of the regular result set. To do this, both the server and the client must allocate a buffer large enough to hold the entire BLOB. DBMSs that are designed to handle BLOBs, such as Oracle, Sybase, and PrimeBase, all pass the BLOB data outside of the regular result set, which avoids having to buffer the entire BLOB. APIs such as ODBC understand this and provide functions such as SQLGetData() that can be called multiple times to retrieve data in chunks, so the client does not have to buffer a whole BLOB unless it wants to.
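To make the buffering point concrete, here is a minimal Python sketch of chunked retrieval. It is not ODBC or PBMS code: the `copy_in_chunks` helper, the 64 KiB chunk size, and the in-memory streams standing in for the server and client are all assumptions made for illustration. The point is only that the largest buffer ever held is one chunk, no matter how big the BLOB is.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; any fixed size works


def copy_in_chunks(source, sink, chunk_size=CHUNK_SIZE):
    """Copy a BLOB from source to sink, holding at most chunk_size bytes at once."""
    total = 0
    largest_buffer = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the end of the data
            break
        sink.write(chunk)
        total += len(chunk)
        largest_buffer = max(largest_buffer, len(chunk))
    return total, largest_buffer


# A 1 MiB in-memory stand-in for a BLOB held by the server.
blob = io.BytesIO(b"x" * (1024 * 1024))
received = io.BytesIO()
total, largest_buffer = copy_in_chunks(blob, received)
print(total, largest_buffer)
```

With a regular result set the equivalent of `largest_buffer` would be the full size of the BLOB, on both ends of the connection.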
PBMS is designed to address this problem and provide MySQL and Drizzle with a means to efficiently handle BLOB data by allowing BLOB data to be transferred outside of the regular result set.
There are currently two approaches to handling BLOBs when using MySQL or Drizzle. One is to store the BLOB data in the database in a BLOB column, which I will call the “BLOBs in database” approach. The other is to store the BLOB in a file somewhere and then store the path to the file in the database, which I will call the “BLOBs in files” approach. I will compare these two methods of handling BLOB data with the “PBMS” approach, which is to use the PBMS daemon.
The BLOBs in database approach:
Advantages:
- Simple to implement: BLOB data is treated no differently from any other data.
- Flexible: DBMS-independent applications can be written using ODBC or JDBC to access the data.
- The referential integrity of the BLOB data is ensured by the database.
- Standard database maintenance ensures the security of the data.
Disadvantages:
- The BLOB data is buffered on both the server and the client side, so a 100 MB BLOB requires a 100 MB buffer on the server and another on the client to receive the BLOB into. If the server is busy handling 100 such requests, it will need 10 GB of buffer space.
- Database replication becomes impractical because of the size of the logs when the BLOB data is written to them.
- Using mysqldump or similar tools to back up databases results in huge backup files, because the BLOBs must be converted to hex strings in order to write them to the backup file, which doubles their size.
- The MySQL Cluster server cannot be used with databases containing BLOBs.
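The two numbers in the list above are easy to check. The following Python lines are only back-of-the-envelope arithmetic, using decimal megabytes and a made-up 256-byte sample rather than real server measurements.

```python
MB = 10**6  # decimal megabyte, to match the round figures in the text
GB = 10**9

blob_size = 100 * MB
concurrent_requests = 100

# One full-size buffer per request on the server side.
server_buffer_bytes = blob_size * concurrent_requests
print(server_buffer_bytes / GB)

# Hex encoding writes two characters for every byte of BLOB data.
sample = bytes(range(256))
hex_size_ratio = len(sample.hex()) / len(sample)
print(hex_size_ratio)
```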
The BLOBs in files approach:
Advantages:
- The BLOB data is not part of the result set, so large buffers are not required.
- The BLOB data can be stored in a location that is remote from the database server.
- Standard replication will work (but the BLOB data will not be replicated).
- Standard backup procedures can be used with the database (but the BLOB data will not be backed up).
Disadvantages:
- A separate backup solution must be found for the BLOB data, and it must be kept consistent with the database backups.
- A separate solution is required to replicate BLOB data.
- Requires a custom-designed system, including client software.
- The client software needs to know how the BLOBs are stored and needs to be provided with a method of accessing them. If the BLOBs are not located locally to the client, additional software may be required.
- Referential integrity, that is, making sure the BLOB files stored on the file system are consistent with the BLOB references stored in the database, is no longer controlled by the database server.
- Doesn’t scale well, because most file systems perform poorly when the number of files starts to exceed a couple of million.
- Installation and maintenance are more complex because specialized knowledge is required.
- The client application is responsible for handling the effects of transaction rollbacks to ensure referential integrity.
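The rollback problem in the last point is the one people tend to underestimate, so here is a small sketch of it. It uses Python's built-in sqlite3 module as a stand-in for MySQL or Drizzle, and the `documents` table, the `store_blob_in_file` helper, and the `fail` flag are all hypothetical names invented for the example. The database can undo its own row, but it knows nothing about the file, so the application has to clean that up itself.

```python
import os
import sqlite3
import tempfile


def store_blob_in_file(conn, blob_dir, name, data, fail=False):
    """Write the BLOB to a file and record its path; remove the file on rollback."""
    path = os.path.join(blob_dir, name)
    with open(path, "wb") as f:
        f.write(data)
    try:
        conn.execute(
            "INSERT INTO documents (name, blob_path) VALUES (?, ?)", (name, path)
        )
        if fail:
            raise RuntimeError("simulated failure later in the transaction")
        conn.commit()
    except Exception:
        conn.rollback()   # the database forgets the row...
        os.remove(path)   # ...but only the application can remove the file
        return None
    return path


blob_dir = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (name TEXT PRIMARY KEY, blob_path TEXT)")

ok_path = store_blob_in_file(conn, blob_dir, "a.bin", b"hello")
failed_path = store_blob_in_file(conn, blob_dir, "b.bin", b"world", fail=True)

rows = conn.execute("SELECT name FROM documents").fetchall()
files = sorted(os.listdir(blob_dir))
print(rows, files)
```

If the `os.remove()` call is forgotten, or the client crashes between the rollback and the cleanup, an orphaned file is left behind with no reference to it in the database.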
The PBMS approach:
Advantages:
- Simple to implement: all data storage and access is handled by the database server and the PBMS engine.
- Flexible: DBMS-independent applications can be written using JDBC to access the data.
- The referential integrity of the BLOB data is ensured by the database.
- Standard database maintenance ensures the security of the data. No special knowledge is required.
- Replication of the BLOB data is possible.
- The BLOB data can be streamed in and out of the database, so buffer sizes are independent of the size of the BLOBs.
- Better performance: tests show that inserts and selects are significantly faster when BLOBs of 50 KB or more are handled using PBMS.
- The solution scales well: BLOBs are packed into files, so the number of files in the file system is much smaller than you would get with a one-BLOB-per-file system.
- The maximum size of a BLOB is limited only by the maximum file size of the host machine.
- The BLOB data can be stored in a location remote from the database server, such as on a different machine or in S3 cloud storage. This reduces the load on the database server host and its network bandwidth use.
- The PBMS daemon ensures that BLOB inserts and deletes are handled properly in the event of a transaction rollback. Transaction checkpoints are also supported.
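To illustrate why packing BLOBs into files helps with scaling, here is a toy Python sketch of an append-only pack file. This is not the PBMS on-disk format: the `BlobPack` class, its in-memory index, and the file name are all hypothetical. It only shows the general idea that many BLOBs can share one file and still be streamed in and out in chunks, each one located by an offset and a length.

```python
import os
import tempfile


class BlobPack:
    """Append-only pack file: many BLOBs in one file, found by (offset, length)."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # blob id -> (offset, length); a real system would persist this
        open(path, "ab").close()  # create the pack file if it does not exist yet

    def append(self, blob_id, chunks):
        """Stream chunks onto the end of the pack and record where the BLOB landed."""
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            length = 0
            for chunk in chunks:
                f.write(chunk)
                length += len(chunk)
        self.index[blob_id] = (offset, length)

    def read(self, blob_id, chunk_size=64 * 1024):
        """Yield the BLOB back in chunks without loading it whole."""
        offset, remaining = self.index[blob_id]
        with open(self.path, "rb") as f:
            f.seek(offset)
            while remaining > 0:
                chunk = f.read(min(chunk_size, remaining))
                if not chunk:
                    break
                remaining -= len(chunk)
                yield chunk


pack_path = os.path.join(tempfile.mkdtemp(), "blobs.pack")
pack = BlobPack(pack_path)
pack.append("a", [b"hello ", b"world"])
pack.append("b", [b"second blob"])

first = b"".join(pack.read("a"))
second = b"".join(pack.read("b"))
file_count = len(os.listdir(os.path.dirname(pack_path)))
print(first, second, file_count)
```

Two BLOBs end up in a single file here; with millions of BLOBs the file system still only sees a handful of pack files.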
Disadvantages:
- PBMS is not shipped with any MySQL distribution, but it is shipped with Drizzle and can be downloaded from http://www.blobstreaming.org.
- The MySQL server does not directly support PBMS, but Drizzle does provide direct support.
Although MySQL doesn’t support PBMS directly, it is not difficult to add support to the InnoDB engine. Anyone interested can contact me and I will happily assist them with it. I use InnoDB for most of my testing.
Conclusions:
PBMS provides efficient BLOB handling that is missing from MySQL and Drizzle. This would be enhanced greatly by integrating PBMS support more directly into the MySQL server. 
I currently have plans to increase the support for PBMS in Drizzle by adding a new column type for storing PBMS BLOB references. This makes the use of PBMS part of the database schema design and simplifies support in client libraries. The client library will then be able to recognize that it is getting a BLOB reference and can make calls to the PBMS daemon to stream the data back to the client. Ideally, this is the type of support MySQL should also provide for PBMS.