Storing Binary Data

110 pts.
Binary data
Database performance
I've been in the process of reviewing the pros and cons of storing binary data (e.g. image files) either in blob fields in a database (presumably MySQL) vs. the filesystem of a server. I'm aware there are some concerns of security, but I'm mainly interested in performance at present. I could easily ask you a million questions on the topic, but for brevity's sake I'll keep it down to two general ones plus a very specific one: 1) First off, precisely rather than running you down with a million questions on the topic, is there any good documentation resource you recommend I look up to better understand the issues at stake? If anything, I'd like to be able to make more educated questions. I have DuBois' book at hand, so I could also look there if you point me to a particular section (book is rather large!). 2) Do you have any general "words of wisdom" of your own on the topic? Like, what side do you take? What has your experience been like? what would you recommend under a given set of circumstances? 3) As for the specific question I have: one arguably interesting pro I've heard from people advocating filesystem storage for binary data is browser side caching; whenever a file is (re)requested, the web server can stat the file and determine if it needs to re-send it to the client browser, depending on its cache; needless to say, this may save up quite a bit of bandwidth for a busy site. Is something similar possible when the binary data is being stored in the database? I don't see a way in which a cheap operation such as stat could be performed on a blob for said cache comparison to take place. In case this is true and filesystem advocates do win the points on this one, is there something of a related nature (caching / performance / saving bandwidth / etc) that database storage has over filesystems?

Answer Wiki

Thanks. We'll let you know when a new response is added.

Storing blob data is where most people start. Where they end up after performance tanks is to move the blob data back to the file system which is where it really belongs.

By storing blob data within the database your database server’s hard drives have to work that much harder trying to support both the normal database traffic as well as the additional traffic of the blob data being written to disk. You also end up wasting a lot of memory on the database server while the database server tries to cache blob data into memory which is of no use as the image probably isn’t going to be called all that often. On a machine with only a couple of Gigs and even a half way busy database this could easily eat up 25-50% of your memory taking away from loading your OLTP data into memory.

There are advantages to storing blobs in the database, though – you get ACID compliance, your backup tasks are a little bit simpler, and you can use replication to duplicate the information to multiple servers.

For the best of both worlds, you can store the “master” data in the database, and use some caching system to keep the image also stored in a file. Your web server can serve the file without touching the database, and your caching system will be responsible for updating the file if it’s not found or if the master data changes. Just make sure that you don’t do SELECT * against a table containing a blob, when you don’t need that column!

Discuss This Question:  

There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when members answer or reply to this question.

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

To follow this tag...

There was an error processing your information. Please try again later.

Thanks! We'll email you when relevant content is added and updated.


Share this item with your network: