[FoRK] Grid Computing + Web Services

J. Andrew Rogers < andrew at ceruleansystems.com > on Mon Oct 30 14:31:30 PST 2006

On Oct 30, 2006, at 4:35 AM, Stephen D. Williams wrote:
> Nearly all problems are embarrassingly parallel and therefore  
> partitionable.  Ebay (per auction), search (per term, domain),  
> MySpace (per profile), Amazon (per item), reservations (per trip),  
> banking (per account), etc.
> There are various amounts of cross communication needed, with  
> presence being one of the most cross-chatty.

The devil is in the details.  Partitioning is so obvious that  
everyone should be doing it, but once you get into the details of  
implementation it starts to become clear why it is not a simple  
answer to all problems.  Let's use Ebay as the example case, since  
they are hypothetically screwing it up.  One of the big semantic  
differences between Ebay and Google is that Ebay requires moderately  
pervasive consistency and isolation, while Google does not.

Partitioning the data space is trivial on the surface, but  
partitioning the computational load is not if you require any kind of  
consistency and isolation guarantees.  Data partitioning can reduce  
some of this but if you are working with a large number of small  
machines, the number of transactions that can be retired per second  
per machine is fixed and small-ish.  The net result is that much  
smaller transaction spikes will put the system out of spec than if  
you were using much larger hardware to buffer the transaction load  
across the auctions.  This is even *worse* if you need soft  
execution time guarantees; some of the 3PC protocols companies like  
Google use to distribute computational loads on the same logical  
object are inherently vulnerable to race conditions that are handled  
with an indefinite number of transaction aborts and restarts  
operating at network latencies.  Just fine for some applications,  
unacceptable for others.
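To make the abort-and-restart dynamic concrete, here is a minimal sketch (in Python, with all names hypothetical) of optimistic concurrency on a single logical object: each transaction reads a version-stamped value, and if a competing commit lands while the transaction is in flight, the commit fails and the whole transaction restarts.  This is the general pattern the paragraph describes, not any specific protocol Google uses.

```python
import random

class VersionedStore:
    """Toy versioned object; a write succeeds only if the writer saw
    the latest version (a compare-and-swap on the version stamp)."""
    def __init__(self):
        self.version = 0
        self.value = 0

    def read(self):
        return self.version, self.value

    def try_commit(self, seen_version, new_value):
        if seen_version != self.version:
            return False          # conflict: a peer committed first
        self.version += 1
        self.value = new_value
        return True

def run_transactions(store, n_txns, conflict_rate=0.3, seed=42):
    """Each transaction reads, 'computes' while in flight, then tries
    to commit; a conflicting peer commit forces abort-and-restart."""
    rng = random.Random(seed)
    aborts = 0
    for _ in range(n_txns):
        while True:
            version, value = store.read()
            # Simulate a peer sneaking in a commit while this
            # transaction is in flight (stands in for network latency).
            if rng.random() < conflict_rate:
                store.version += 1   # peer committed; our snapshot is stale
            if store.try_commit(version, value + 1):
                break
            aborts += 1              # restart the whole transaction
    return aborts

store = VersionedStore()
aborts = run_transactions(store, 1000)
print(f"1000 commits required {aborts} extra aborted attempts")
```

Note that the retry count has no hard bound: under sustained contention the loop can spin indefinitely, which is exactly why this pattern is fine for some applications and unacceptable where soft execution-time guarantees are needed.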

I assume that Ebay synchronizes all auctions on a single instance  
since, ironically, they need better scalability than a distributed  
auction instance will provide given their requirements.  If they had  
a limited number of auctions per server this would probably work just  
fine for most plausible cases, though far more costly than anything  
that Google does.
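A toy sketch of what "a limited number of auctions per server" might look like: hash each auction id to a shard so every auction has exactly one authoritative instance, and serialize all bids for an auction through that shard.  The class and method names are illustrative assumptions, not eBay's actual design.

```python
import hashlib
import threading

class AuctionShard:
    """One 'server': the single authoritative instance for its
    auctions.  All bids for an auction serialize through its lock."""
    def __init__(self):
        self.lock = threading.Lock()
        self.high_bids = {}   # auction_id -> (amount, bidder)

    def place_bid(self, auction_id, amount, bidder):
        with self.lock:   # consistency + isolation per shard
            current = self.high_bids.get(auction_id, (0, None))
            if amount > current[0]:
                self.high_bids[auction_id] = (amount, bidder)
                return True
            return False

class AuctionCluster:
    """Partition the data space: each auction lives on one shard."""
    def __init__(self, n_shards):
        self.shards = [AuctionShard() for _ in range(n_shards)]

    def shard_for(self, auction_id):
        h = int(hashlib.md5(auction_id.encode()).hexdigest(), 16)
        return self.shards[h % len(self.shards)]

    def place_bid(self, auction_id, amount, bidder):
        return self.shard_for(auction_id).place_bid(auction_id, amount, bidder)

cluster = AuctionCluster(n_shards=8)
cluster.place_bid("rare-guitar", 100, "alice")
cluster.place_bid("rare-guitar", 90, "bob")      # rejected: below high bid
accepted = cluster.place_bid("rare-guitar", 120, "carol")
```

The per-shard lock is the crux: it gives each auction the strict bid ordering it needs, but it also means a spike on one hot auction cannot be spread across machines, which is the cost the surrounding text describes.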

Ebay is more or less fitting their internal architecture to the  
problem.  They can massively distribute their system by partitioning  
the data space, but they still need to synchronize on a single data  
instance and the semantics of their application are significantly more  
expensive to scale.  In practice, you handle this with a bunch of  
large-ish machines.  Fortunately, as you point out, their data space  
actually decomposes relatively nicely and they can get away with some  
inconsistency in other areas (like search indexes), giving them some  
wiggle room despite the relatively strict transaction semantics  
required for the auction itself.

There are two cases, and relatively common ones at that, where it  
gets ugly:

- Data domains that do not have a trivial or "nice" decomposition or  
- Applications that require strict consistency guarantees of various  
types from end-to-end

>> A number of people and organizations have hacked on PostgreSQL for  
>> clustering purposes (including yours truly), but it runs up  
>> against the same walls that Oracle does.  This is not a limitation  
>> of the software but of what is allowed in theory.  For *some*  
>> applications you can get away with more by loosening *some*  
>> constraints on the semantics, but for many applications that is a  
>> non-starter.  You can get a lot of mileage by very cleverly  
>> reducing the number of threads synchronizing on a single resource,  
>> but for some applications that number will always be difficult to  
>> scale.
> Can you think of a good, hard example?

(A bit of an ambiguous reference...)

I think our apparent disagreement is around the typical semantics  
that are required for many applications.  What you can get away with  
when massively distributing an application is very much dependent on  
how limited the transaction semantics are that have to be supported.   
For some, like text search, the transaction semantics are extremely  
limited and so it massively distributes very easily.  For others,  
stricter transaction semantics are required that effectively  
constrain the plausible design space e.g. Ebay.  Many of the "easy"  
limited semantic applications have already been done, but there is a  
reason massively distributed apps with stricter semantics are not  
ubiquitous (generally non-existent in fact).  If you start doing  
things like putting strict guarantees on massive shared resources  
(like search indexes) or significantly ratchet up the write load or  
timeliness requirements, it gets difficult.

Here is an example of the kinds of applications that people are  
trying to implement in a massively distributed system today that have  
strict semantics that go beyond the kind of apps that Ebay and Google  
are doing (so far):

Use as the data domain 4D geometries, a mid-sized data set size, and  
rapidly changing time-sensitive data but within the bounds of  
relatively manageable transaction loads.  Very high data availability  
and reliability -- if a server or two fail catastrophically, it  
should cause no major loss.  Assume 100k simultaneous complex read  
queries against the data with soft execution time guarantees as a  
function of the number of users; there will necessarily be many  
copies of the data.  The data must be perfectly consistent across  
queries and no data can be lost or temporarily unavailable, either  
from the system or from the perspective of a user running queries  
against the system.   Geographic distribution; assume the network  
latency is significant.
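One standard way to meet the availability and consistency parts of a spec like this is majority-quorum replication: every write must reach a majority of replicas, every read consults a majority, and because any two majorities overlap, a committed value is never lost or invisible even when a minority of servers fail.  The sketch below is a bare-bones illustration of that scheme under these assumptions, not a claim about how any real system (Google's or otherwise) is built.

```python
class Replica:
    def __init__(self):
        self.alive = True
        self.data = {}            # key -> (version, value)

    def write(self, key, version, value):
        if not self.alive:
            raise ConnectionError("replica down")
        self.data[key] = (version, value)

    def read(self, key):
        if not self.alive:
            raise ConnectionError("replica down")
        return self.data.get(key, (0, None))

class QuorumStore:
    """Majority-quorum replication: any read quorum overlaps any
    write quorum, so committed data survives minority failures."""
    def __init__(self, n=5):
        self.replicas = [Replica() for _ in range(n)]
        self.quorum = n // 2 + 1

    def write(self, key, version, value):
        acks = 0
        for r in self.replicas:
            try:
                r.write(key, version, value)
                acks += 1
            except ConnectionError:
                pass
        if acks < self.quorum:
            raise RuntimeError("write did not reach a quorum")

    def read(self, key):
        replies = []
        for r in self.replicas:
            try:
                replies.append(r.read(key))
            except ConnectionError:
                pass
        if len(replies) < self.quorum:
            raise RuntimeError("read did not reach a quorum")
        return max(replies)[1]    # highest version wins

store = QuorumStore(n=5)
store.write("region:4d-tile-17", version=1, value="geometry-v1")
store.replicas[0].alive = False   # a server or two fail catastrophically
store.replicas[1].alive = False
value = store.read("region:4d-tile-17")
```

Note the cost: every write blocks on a round trip to a majority of replicas, so with geographic distribution the significant network latency lands directly on the write path, which is part of why the strict semantics are expensive to scale.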

This has both of the ugly problems: the data does not neatly  
partition, and the semantics are pretty strict (though not strictly  
ACID either) for a system that must scale well beyond the capacity of  
a single machine.  To the best of my knowledge, Google's architecture  
cannot support both the semantics and guarantees of this  
application.  This application can be built, but it requires  
designing a general architecture specifically fitted to the  
specifications.  This same architecture has a lot of drawbacks that  
make it economically inefficient for implementing something like  
Google.  For some other applications, the semantics are strict but  
the specifications are completely orthogonal to either Google or the  
above example, leading to yet another significantly different  
architecture.

My main point is that there is no general architecture that will work  
for most applications; massively distributing an app in an efficient  
manner requires matching the broader architecture to the transaction  
semantics inherent in the application.  A poorly matched architecture  
won't scale well, putting things back where they began.  There might  
be a middle ground, but stricter semantics usually require more  
customization, and that is where the interesting action is these days.

J. Andrew Rogers
