[FoRK] Amazon S3 storage service

Eugen Leitl <eugen at leitl.org>, Thu Mar 16 03:27:42 PST 2006

On Tue, Mar 14, 2006 at 10:04:23PM -0500, Stephen D. Williams wrote:

> I know they are in an entirely different price range.  So why is the EMC 
> stuff such a pain with catastrophic failures that take tons of effort to 
> unravel and the scalable stuff like Google's 200,000 servers/drives 
> apparently so easy?

It's not easy -- managing a distributed system of some 200000 nodes takes
some good systems and processes. Also, the core infrastructure
(GFS) is basically read-only. 

Things are easier for a real FS on a small scale with a good
interconnect (10 GBit Ethernet or InfiniBand).

> Obviously using more appropriate data / processing / communication / 
> management models is a big win.  Pushing everything through narrow fibre 
> channels to a large SAN is obviously less total bandwidth than each 
> drive having a fast path to the CPU, even legacy IDE does well there.  

Software RAID is more cost-effective on lightly loaded machines, because
modern CPUs outperform all but the most expensive hardware RAID
controllers while taking on only a moderate load. Also, a caching fs
keeps data close to the CPU, where it's needed.
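
To illustrate why parity work is cheap for a general-purpose CPU, here
is a minimal sketch of RAID-5-style XOR parity. It is an illustration
under assumptions, not how any real software RAID driver is written:
the function names xor_blocks and rebuild_missing are hypothetical, and
a real implementation works on fixed-size stripes inside the kernel.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID-5-style parity)."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # Treat each block as one big integer so the XOR is a single
    # operation per block instead of a Python-level loop over bytes.
    acc = reduce(lambda a, b: a ^ b,
                 (int.from_bytes(b, "big") for b in blocks))
    return acc.to_bytes(size, "big")


def rebuild_missing(surviving_blocks, parity):
    """Recover the one lost data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

The parity block is the XOR of all data blocks, so XORing the parity
with the surviving blocks yields the lost one. The whole job is bulk
XOR over memory, which is why a modern CPU handles it without trouble.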

> Trying to fix bandwidth to a SAN with an expensive and complicated 
> multi-channel multiplexer just widens the gap.

It is also a reliability issue. You can address reliability with
diamond-studded platinum hardware, and a horde of support gnomes
on standby, but it will still fail. Admittedly, unobtainium components
fail far less often than their not-so-glamorous counterparts.

I'd rather live with a diagnostic and failover/healing infrastructure
built around reasonably stable (enterprise-grade SATA) components,
and buy more infrastructure for the saved premium.

> In between are things like iSCSI and distributed filesystems.
> In the long run, we need cheap, independent drives that minimize power, 

I sincerely hope that we'll soon be getting solid-state drives with sub-us
latency instead of today's ms -- as somebody recently observed, today's
large drives are very much like tape cartridges.
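
A back-of-the-envelope sketch of the tape comparison: how long it takes
to read a whole drive sequentially versus one random access per block.
Every number used below (500 GB capacity, 60 MB/s sequential rate, 4 KB
blocks, 10 ms access time) is an illustrative assumption, not a figure
from this thread, and the function names are hypothetical.

```python
def full_scan_hours(capacity_gb, seq_mb_per_s):
    """Hours to stream the whole drive at its sequential rate."""
    seconds = capacity_gb * 1000 / seq_mb_per_s
    return seconds / 3600


def random_read_hours(capacity_gb, block_kb, access_ms):
    """Hours to touch every block once, paying one access delay each.

    Transfer time per block is ignored; only access latency counts.
    """
    blocks = capacity_gb * 1_000_000 / block_kb
    seconds = blocks * access_ms / 1000
    return seconds / 3600


if __name__ == "__main__":
    # Assumed example drive: 500 GB, 60 MB/s, 4 KB blocks.
    print("sequential scan:", full_scan_hours(500, 60), "h")
    print("random, 10 ms:  ", random_read_hours(500, 4, 10), "h")
    print("random, 1 us:   ", random_read_hours(500, 4, 0.001), "h")
```

Under these assumptions the sequential scan takes a couple of hours,
while random access at millisecond latency takes weeks; at microsecond
latency the access overhead drops to minutes. That is the sense in which
a large disk behaves like a tape: fine to stream, painful to seek.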

> have very fast networking through a combination of daisy chaining and 
> network switches, and have a general purpose onboard IO processor with 

I have never understood why nobody builds Ethernet drives. (Yes,
I know, but Seagate doesn't sell them, and you pay through the
nose for shelves and cases).

> plenty of cache that people can load Linux with rapidly evolving 
> networking / storage management software.  So every drive becomes an 
> embedded Linux server that can be directly connected to a compute server 
> / desktop or dispersed on the network.

But when quality 1U Opteron servers cost <600 EUR and come with dual GBit NICs,
why should I use embedded servers and further brake things by chaining more

Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
ICBM: 48.07100, 11.36820            http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
