[FoRK] Grid Computing + Web Services
J. Andrew Rogers andrew at ceruleansystems.com
Mon Oct 30 22:20:23 PST 2006
On Oct 30, 2006, at 6:56 PM, Stephen D. Williams wrote:
> All auctions on a single instance? What stops each auction from
> being on its own server? Each auction would reasonably have a
> single instance, but you could partition auctions among servers in
> any way that is convenient. They may have a scalability problem
> with a single auction that is extremely popular, but since they use
> communication concentrators (web servers that run application
> client code), and auctions are very simple things, I know they
> could handle tens of thousands of transactions per second on a
> reasonable machine.
Tens of thousands of transactions per second? Either you are talking
about some very exotic hardware or really meant *transactions per
minute*, which is also commonly used (e.g. TPC). Current typical
four core server hardware will retire around 500 transactions per
second sustained with a well engineered app.
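The gap between the two claims is easy to make concrete. Using the illustrative figures above (not measured numbers), the brute-force answer is how many such boxes you would need to partition across:

```python
# Back-of-the-envelope capacity check, using the figures from the
# discussion above (illustrative, not benchmarks of any real system).
target_tps = 20_000      # "tens of thousands of transactions per second"
per_server_tps = 500     # sustained rate claimed for a four-core server

servers_needed = -(-target_tps // per_server_tps)  # ceiling division
print(servers_needed)    # 40
```

So "tens of thousands per second" on one reasonable machine versus a rack of forty, if the 500 tps figure holds.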
By "single instance" I did not mean a database, I meant that you do
not have a distributed write load for single objects (e.g. a single
auction). In effect, the auction synchronizes on a single physical
row rather than multiple live copies of the same row. Distributed
instances can be done, but only for availability/durability because
it does bad things to transaction throughput.
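The partition-by-auction idea is simple to sketch: route every bid for a given auction to exactly one server, so all writes for that auction synchronize on one physical row. A minimal sketch (the function and IDs here are hypothetical, not eBay's actual scheme):

```python
import hashlib

def auction_partition(auction_id: str, num_servers: int) -> int:
    """Map an auction to exactly one server, so every write for that
    auction (every bid) lands on the same box and synchronizes on a
    single physical row there. Hash-based routing is one assumption;
    range partitioning would work just as well."""
    digest = hashlib.md5(auction_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_servers

# All clients compute the same mapping, so there is never a
# distributed write load for any single auction.
print(auction_partition("auction-12345", 16))
```

The point being that the object, not the table, is the unit of distribution: single writer per auction, many auctions per cluster.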
> Search indexes at eBay even near the beginning were internally
> cached and refreshed periodically, on the order of minutes whenever
> I checked it.
That makes sense, since strict consistency in the search/index
servers would put a lot of extra load on the auction servers. I did
not remember them being very strictly consistent, which seems like a
very economical constraint to relax considering that it does not
materially affect app functionality.
>> There are two cases, and relatively common ones at that, where it
>> gets ugly:
>> - Data domains that do not have a trivial or "nice" decomposition
>> or partitioning
>> - Applications that require strict consistency guarantees of
>> various types from end-to-end
> At a high level, these are true. The more I look at it, the more I
> see that A) often mistakes have been made in data/semantic
> architecture and B) there are often multiple ways to meet
> consistency guarantees.
I don't disagree. Probably the single best way to distribute loads
in a consistent way is to implement something like a distributed
multi-versioning protocol where A) any single user always
gets a consistent state though not necessarily the same as other
users, and B) the system at large can guarantee that a globally
consistent state is eventually attainable. However, these
implementations get hairy once you start talking about really massive
systems, reliable data, and availability guarantees.
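To make properties A and B concrete, here is a toy single-node version of the idea -- a multi-version store where each reader pins a snapshot. This is only a sketch of the semantics; the hairy part the text refers to is doing this across machines with failures, which this deliberately ignores:

```python
import threading

class VersionedStore:
    """Toy multi-version store: a reader pins a snapshot version, so any
    single user always sees a consistent state (property A), while writers
    advance a global version that all readers eventually observe (B)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0
        self._history = {}  # key -> list of (version, value), ascending

    def write(self, key, value):
        with self._lock:
            self._version += 1
            self._history.setdefault(key, []).append((self._version, value))
            return self._version

    def snapshot(self):
        with self._lock:
            return self._version

    def read(self, key, at_version):
        # Newest value for key whose version does not exceed the snapshot.
        with self._lock:
            visible = [v for (ver, v) in self._history.get(key, [])
                       if ver <= at_version]
            return visible[-1] if visible else None

store = VersionedStore()
store.write("auction:1", {"high_bid": 100})
snap = store.snapshot()
store.write("auction:1", {"high_bid": 120})
# A reader pinned at `snap` still sees a consistent, if older, state:
print(store.read("auction:1", snap))              # {'high_bid': 100}
print(store.read("auction:1", store.snapshot()))  # {'high_bid': 120}
```

Two readers with different snapshots disagree with each other, but each is internally consistent, and both converge once they re-snapshot.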
> Yes, that's a tough one, but fairly exclusive to financial market
> systems, don't you think? Still, you have described a push only
> system which is fairly easy to replicate and coordinate so that
> queries can be processed over striped servers.
Financial markets? Not even close. More like meteorological data,
syndicated news, infectious disease data, etc. All in near real-
time. There are a huge number of applications that we *could* build
in theory that take advantage of vast, rich data sets lying around if
there was an infrastructure that was capable of supporting it. For
many very good reasons, strict consistency and durability guarantees
are required -- bad things could happen otherwise. We use a lot of
this type of data now for non-critical and/or non-real-time uses, but
that completely ignores the arguably more important market for the
same data when it is critical and real-time.
These are "push" type systems in a sense, but with millions of
subscribers with very complex constraints on what they actually see
and no trivial way of partitioning those constraints without a lot of
seemingly unnecessary and expensive brute force. It could also be
framed as "pull" depending on how you want to look at it.
J. Andrew Rogers