Tuesday, April 17, 2012

Smart and Cheap and 135 Terabytes: Backblaze

THIS is cool: 135 TB of RAID 6, HTTP-accessible storage for a little over $7k. That is a phenomenally disruptive price compared to standard storage practices (e.g. SAN).

135 Terabytes for $7384
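Here is a quick back-of-the-envelope check on what that works out to per terabyte. It uses only the two figures above, and the capacity is raw, before any RAID 6 parity overhead, so the usable cost per TB would be somewhat higher.

```python
# Figures quoted above; raw capacity, before any RAID 6 parity overhead.
pod_cost_usd = 7384
pod_capacity_tb = 135

cost_per_tb = pod_cost_usd / pod_capacity_tb
cost_per_gb = cost_per_tb / 1000  # decimal units: 1 TB = 1000 GB

print(f"${cost_per_tb:.2f} per TB, about {cost_per_gb * 100:.1f} cents per GB")
```

That comes to roughly $55 per raw terabyte.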

Not only that, but TCO is low:
"One guy (Sean) ... maintain[s] our fleet of 201 pods, which holds 9,045 drives." (16 petabytes!)
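The fleet numbers in that quote are easy to sanity-check. This is just arithmetic on the quoted figures (the variable names are mine), and the average drive size is a fleet-wide number, not a spec for any one pod.

```python
# Figures from the quote above.
pods = 201
drives = 9045
fleet_pb = 16

drives_per_pod = drives / pods
avg_tb_per_drive = fleet_pb * 1000 / drives  # decimal units: 1 PB = 1000 TB

print(f"{drives_per_pod:.0f} drives per pod")
print(f"{avg_tb_per_drive:.2f} TB per drive on average across the fleet")
```

So each pod holds 45 drives, and one person is looking after all 9,045 of them. The fleet-wide average per drive is well under the 3 TB a 135 TB pod implies (135 / 45), presumably because older pods hold smaller drives; that last part is my guess, the quote doesn't say.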
Philosophy:
"to plan for equipment failure and build a system that operates in spite of it."
Backblaze's online "cloud" backup stack:
desktop computer -> client app -> HTTP -> custom Tomcat app -> memory cache -> LVM -> ext4 -> (md?) RAID 6 -> Linux -> SATA

I think Google was a frontrunner in the idea of building reliable infrastructure on top of cheap commodity hardware and Linux/open source, then testing and engineering the improvements yourself (e.g. Paper and Summary), with principles like:

  • Instead of paying a 3x+ premium for "Enterprise hardware", design your system to expect failure and fail gracefully.
  • Instead of paying a premium for "Enterprise hardware", do your own engineering. (E.g. Backblaze uses Hitachi Deskstar 5K3000 HDS5C3030ALA630 drives. Doesn't it make more sense to have someone test drives and find the best one, rather than pay a bunch of very highly paid managers to sit through sales presentations for the privilege of paying a lot more for an "Enterprise" solution which is less reliable?)
  • Instead of heaping all of the redundancy requirements on your infrastructure (which costs big time, because it's those last few .9's that cost the big bucks), tune your infrastructure to support your app and build your app for redundancy and component failure.
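To put rough numbers on the "9's" point, here is a minimal sketch of how redundancy at the app level stacks up cheap components. It assumes replicas fail independently and that the service is up as long as at least one replica is up; the 99% figure and the function names are made up for illustration, not anything from Backblaze or Google.

```python
import math


def combined_availability(single: float, copies: int) -> float:
    """Availability when at least one of `copies` independent replicas must be up."""
    return 1 - (1 - single) ** copies


def nines(availability: float) -> float:
    """Number of 9's, e.g. 0.999 is about 3."""
    return -math.log10(1 - availability)


# Hypothetical cheap component that is up 99% of the time.
cheap = 0.99
for copies in (1, 2, 3):
    a = combined_availability(cheap, copies)
    print(f"{copies} copies: {a:.6f} ({nines(a):.1f} nines)")
```

Under those assumptions each extra cheap copy adds about two more 9's, which is the argument for buying redundancy with more commodity boxes plus smarter software rather than with premium hardware. Real failures are not fully independent (shared power, shared bugs), so treat this as an upper bound.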
Why does Backblaze give away their trade secrets? Well, they don't really give it all away. It's the freemium thing.
If you need to build a reliable, redundant, monitored storage system, you’ve got more work ahead of you. At Backblaze we’ve developed software that manages and monitors the cloud service.
And that software stack is proprietary :)