[FoRK] gmail

Gordon Mohr gojomo at usa.net
Thu Apr 1 20:39:44 PST 2004


daniel grisinger wrote:
> James Tauber wrote:
> 
>> So is it official yet that Google's GMail is *not* a joke?
> 
> 
> looks like it's real.  http://www.gmail.com/ responds, and
> forbes magazine is reporting that while the lunar jobs
> were a joke, the new mail service is not.
> 
> how they are going to manage the data storage requirements
> is beyond me.  every million users who hit their storage
> limit represent a full petabyte of data, at internet scale
> that is going to add up very quickly.

It's highly compressible text, and I betcha the 1GB is
measured in uncompressed data.
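
A quick way to get a feel for the compressibility claim is to
run some mail-like text through a general-purpose compressor.
This is only a sketch: the sample text is made up, zlib is just
a convenient stand-in (nothing here says what Gmail actually
uses), and the ratio it prints is illustrative rather than a
measurement of real mail.

```python
import zlib

# Hypothetical sample: a short message plus a reply that quotes it in full,
# the way mailing-list threads usually do.
original = (
    "looks like it's real. the site responds, and the press is\n"
    "reporting that the new mail service is not a joke.\n"
    "how they are going to manage the data storage requirements\n"
    "is beyond me.\n"
)
quoted = "".join("> " + line + "\n" for line in original.splitlines())
reply = quoted + "\nIt's highly compressible text.\n"

raw = (original + reply).encode("utf-8")
packed = zlib.compress(raw, 9)  # level 9 = best compression, slowest

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes, "
      f"ratio: {len(packed) / len(raw):.2f}")
```

Quoted replies repeat earlier text, which is exactly the kind of
redundancy a compressor picks up.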

They can also take advantage of message body redundancy.
If we're both Gmail customers, and I send you a message,
both my 'Sent' message and your 'Inbox' message can be
a single copy. If I send a list message to 100 Gmail
customers, the savings are greater still.
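
The single-copy idea can be sketched as a store keyed by a hash
of the message body. Everything below is hypothetical: the
MessageStore class, its method names, and the choice of SHA-256
are assumptions for illustration, not a description of how
Gmail is built.

```python
import hashlib


class MessageStore:
    """Toy single-instance store: one body copy, many mailbox references."""

    def __init__(self):
        self.bodies = {}     # digest -> body bytes (stored once)
        self.mailboxes = {}  # (user, folder) -> list of digests

    def deliver(self, sender, recipients, body):
        data = body.encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        # Keep the body only if this exact content is not already stored.
        self.bodies.setdefault(digest, data)
        self.mailboxes.setdefault((sender, "Sent"), []).append(digest)
        for user in recipients:
            self.mailboxes.setdefault((user, "Inbox"), []).append(digest)
        return digest

    def stored_bytes(self):
        """Bytes actually kept on disk (each distinct body counted once)."""
        return sum(len(b) for b in self.bodies.values())

    def logical_bytes(self):
        """Bytes the users would see if every mailbox had its own copy."""
        return sum(
            len(self.bodies[d])
            for refs in self.mailboxes.values()
            for d in refs
        )


store = MessageStore()
recipients = [f"user{i}" for i in range(100)]
store.deliver("sender", recipients, "list message body")
print(store.logical_bytes(), "logical bytes vs", store.stored_bytes(), "stored")
```

With one sender and 100 recipients there are 101 mailbox
references but only one stored body. A real system would also
need per-user headers, reference counting for deletes, and so
on, none of which is shown here.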

And don't you think they're already dealing with such
magnitudes of data with their web cache? It's been said
Google has over 100,000 spinning commodity PCs as part
of their operations. Even if each has only 250GB of disk,
which seems plausible, they're already slinging 25
petabytes or more.

At the Internet Archive, the current white-box servers
have 4x300GB disks, for 1.2 terabytes per machine. As
part of the IA "petabox" project [1], a 1U half-depth
rackserver with 4 IDE drives is being developed. Hitachi
this month began shipping 400GB IDE drives [2]. So a
single 80-machine rack in the style suggested by the
"Google Cluster Architecture" paper [3], with 4 such
drives per machine (80 x 4 x 400GB = 128TB), could provide
storage (if not full service) for over 100,000 Gmail
users at 1GB each, even without compression or
message-body sharing.

They've already got over a thousand similar racks; what's
another thousand to support a hundred million
email accounts?
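
For anyone who wants to check the back-of-envelope numbers,
here they are as a short script. It assumes decimal units
(1PB = 1,000,000GB), that every account fills its full 1GB,
and no compression, sharing, or replication. The variable
names are just labels for the figures quoted above.

```python
import math

GB_PER_PB = 1_000_000  # decimal units, as drive vendors count

# Existing fleet: 100,000 PCs at an assumed 250GB each.
pcs = 100_000
gb_per_pc = 250
fleet_pb = pcs * gb_per_pc / GB_PER_PB

# One rack: 80 machines, 4 drives each, 400GB per drive.
machines_per_rack = 80
drives_per_machine = 4
gb_per_drive = 400
rack_gb = machines_per_rack * drives_per_machine * gb_per_drive

quota_gb = 1
users_per_rack = rack_gb // quota_gb

# Racks for a hundred million full accounts, and racks already in the fleet.
accounts = 100_000_000
racks_needed = math.ceil(accounts / users_per_rack)
existing_racks = pcs // machines_per_rack

print(f"fleet storage:  {fleet_pb:.0f} PB")
print(f"rack capacity:  {rack_gb:,} GB ({users_per_rack:,} users at 1GB)")
print(f"racks needed:   {racks_needed:,}")
print(f"existing racks: {existing_racks:,}")
```

At 128,000 users per rack the count comes out a bit under a
thousand racks; rounding down to 100,000 users per rack gives
the round thousand.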

- Gordon

[1] http://www.petabox.org
[2] http://news.com.com/2100-1015_3-5171944.html
[3] http://www.computer.org/micro/mi2003/m2022.pdf

