"(Or if you're a crazed wunderkind like LiveJournal founder Brad Fitzpatrick, you invent a memory-based distributed hashtable as a cache to put in front of the database.)"
The cool thing about Brad is that he released that creation as open source -- we can all benefit from his genius, like Facebook already has. Put in front of the database, memcached is an amazingly effective way to keep the benefits of SQL storage while scaling simply and reliably. It's impossible to over-hype how much it kicks ass.
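For anyone who hasn't seen the "cache in front of the database" idea in practice, here's a rough sketch of the usual read-through pattern. To be clear, this is not memcached's client API: the `CacheAside` class is made up for illustration, and plain Python dicts stand in for both the cache and the SQL database.

```python
class CacheAside:
    """Toy read-through cache; dicts stand in for memcached and the database."""

    def __init__(self, db):
        self.db = db        # hypothetical stand-in for the SQL database
        self.cache = {}     # hypothetical stand-in for memcached
        self.db_reads = 0   # counts how often we had to hit the database

    def get(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall back to the database, then remember the answer.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def set(self, key, value):
        # Write to the database and drop the stale cache entry.
        self.db[key] = value
        self.cache.pop(key, None)


if __name__ == "__main__":
    store = CacheAside({"user:1": "brad"})
    store.get("user:1")
    store.get("user:1")
    print(store.db_reads)
```

The second read never touches the database, which is the whole point: the database stays the source of truth and the cache only absorbs repeated reads.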
If a class of files requests N>1 copies, at what point after the HTTP PUT can the application be confident that N copies exist? It's tempting to assume I have three copies of that file, but what if a machine fails before MogileFS has created the other two?
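To make the question concrete, this is roughly what I imagine the application would have to do if it really cared: poll until the replica count reaches N, or give up. I don't know whether MogileFS exposes this directly, so `count_replicas` below is a hypothetical callable that the application would have to supply, and `wait_for_replicas` is just a name I made up.

```python
import time


def wait_for_replicas(count_replicas, key, wanted, timeout=5.0, interval=0.1):
    """Poll until `key` has at least `wanted` copies or `timeout` seconds pass.

    `count_replicas` is a hypothetical callable returning the current number
    of copies of `key`; it is an assumption, not a MogileFS API.
    """
    deadline = time.monotonic() + timeout
    while True:
        if count_replicas(key) >= wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def make_fake_tracker(max_copies=3):
    """Fake tracker for the demo: reports one more copy on each call."""
    state = {"copies": 0}

    def count(key):
        state["copies"] = min(state["copies"] + 1, max_copies)
        return state["copies"]

    return count


if __name__ == "__main__":
    ok = wait_for_replicas(make_fake_tracker(), "photo.jpg", wanted=3, interval=0.01)
    print("replicated" if ok else "timed out")
```

Even with something like this, there's still a window between the PUT returning and the poll succeeding where only one copy exists, which is exactly the case I'm worried about.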
Also, it's intended to operate on whole files at a time, although HTTP GET might be usable to fetch a run of bytes. If two web servers both try to write the same filename, doesn't the latest one win?
I can see it's great for certain things, e.g. storing the user's images, but not for the stuff traditionally in the database. Or have I missed something?