[FoRK] Amazon S3 storage service
Eugen Leitl <eugen at leitl.org>
Thu Mar 16 03:27:42 PST 2006
On Tue, Mar 14, 2006 at 10:04:23PM -0500, Stephen D. Williams wrote:
> I know they are in an entirely different price range. So why is the EMC
> stuff such a pain with catastrophic failures that take tons of effort to
> unravel and the scalable stuff like Google's 200,000 servers/drives
> apparently so easy?
It's not easy -- managing a distributed system of some 200,000 nodes takes
solid tooling and processes. Also, the core infrastructure
(GFS) is basically read-only.
Things are easier for a real FS on a small scale with a good
interconnect (10 GBit Ethernet or InfiniBand).
> Obviously using more appropriate data / processing / communication /
> management models is a big win. Pushing everything through narrow fibre
> channels to a large SAN is obviously less total bandwidth than each
> drive having a fast path to the CPU, even legacy IDE does well there.
Software RAID is more cost-effective on unloaded machines, because a
modern CPU under only moderate load outperforms all but the most
expensive hardware RAID. Also, a caching fs keeps data close to the CPU,
where it's needed.
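The heavy lifting a RAID controller offloads is mostly XOR parity, which a general-purpose CPU handles trivially. A toy sketch of RAID-5-style parity and single-block reconstruction (illustrative only, not a real md driver):

```python
# Toy RAID-5-style parity: XOR across equal-sized data blocks lets any
# single lost block be rebuilt from the survivors plus the parity block.
def parity(blocks):
    p = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            p[i] ^= byte
    return bytes(p)

data = [b"disk0data", b"disk1data", b"disk2data"]  # three member "disks"
p = parity(data)

# "Lose" disk 1, then rebuild it from parity and the surviving blocks:
# d0 ^ d2 ^ (d0 ^ d1 ^ d2) == d1.
rebuilt = parity([data[0], data[2], p])
assert rebuilt == data[1]
```

This is exactly the arithmetic the Linux md layer does in software; at a few bytes of XOR per byte of data, it's cheap next to the disk transfers themselves.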
> Trying to fix bandwidth to a SAN with an expensive and complicated
> multi-channel multiplexer just widens the gap.
It is also a reliability issue. You can address reliability with
diamond-studded platinum hardware, and a horde of support gnomes
on standby, but it will still fail. Admittedly, unobtainium components
fail far less often than their not-so-glamorous counterparts.
I'd rather live with a diagnostic and failover/healing infrastructure
built around reasonably stable (enterprise-grade SATA) components,
and buy more infrastructure for the saved premium.
> In between are things like iSCSI and distributed filesystems.
> In the long run, we need cheap, independent drives that minimize power,
I distinctly hope we'll soon be getting solid-state drives with sub-us
latency instead of today's ms -- as somebody recently observed, today's
large drives are very much like tape cartridges.
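The ms-to-sub-us gap is worth putting numbers on. A rough back-of-envelope, with illustrative figures (≈10 ms per random seek on a 2006-era disk, a hoped-for ≈1 µs on solid state):

```python
# Random-access rate per spindle implied by seek latency alone.
# Both latency figures are illustrative assumptions, not measurements.
disk_seek_s = 10e-3   # ~10 ms: rotational + head-seek latency
ssd_seek_s = 1e-6     # ~1 us: hoped-for solid-state latency

disk_iops = 1 / disk_seek_s   # ~100 random reads/s per disk
ssd_iops = 1 / ssd_seek_s     # ~1,000,000 random reads/s
speedup = ssd_iops / disk_iops  # roughly four orders of magnitude
```

At ~100 random IOPS, a large disk really is best treated like a tape cartridge: stream it sequentially and avoid seeks.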
> have very fast networking through a combination of daisy chaining and
> network switches, and have a general purpose onboard IO processor with
I have never understood why nobody builds Ethernet drives. (Yes,
I know, but Seagate doesn't sell them, and you pay through the
nose for shelves and cases).
> plenty of cache that people can load Linux with rapidly evolving
> networking / storage management software. So every drive becomes an
> embedded Linux server that can be directly connected to a compute server
> / desktop or dispersed on the network.
But when quality 1U Opteron servers cost <600 EUR and come with dual GBit NICs,
why should I use embedded servers and further brake things by chaining more
Eugen* Leitl leitl http://leitl.org
ICBM: 48.07100, 11.36820 http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE