[FoRK] Joyent, cloud service evolution (J. Andrew Rogers)
meltsner at alum.mit.edu
Tue Jun 21 21:52:01 PDT 2016
The complicated part is when the huge stream of data can't be
processed in smaller, independent streams. Counting the number of
failed logins per second for some huge service, perhaps, if there's an
attack going on using many different IDs and endpoint addresses at the
same time. [Might not be the best example, but all I could come up
with off the top of my head.]
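
A minimal sketch of the failed-login case, assuming the stream has been split across two hypothetical shards (shard_a, shard_b) and that events are (timestamp, success) pairs. Every name here is made up for illustration. Each shard can count locally, but the attack is only visible once the partial counts are merged, so the global figure needs a cross-shard step:

```python
from collections import Counter

def count_failures(events):
    """Per-second failed-login counts for one shard's slice of the stream."""
    counts = Counter()
    for timestamp, ok in events:
        if not ok:
            counts[int(timestamp)] += 1  # bucket by whole second
    return counts

def merge_counts(partials):
    """Combine per-shard counts into one global per-second count."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total

# Hypothetical events: (timestamp in seconds, login succeeded?)
shard_a = [(100.1, False), (100.7, True), (101.2, False)]
shard_b = [(100.3, False), (100.9, False), (101.5, True)]

partials = [count_failures(shard_a), count_failures(shard_b)]
global_counts = merge_counts(partials)
```

No single shard sees more than two failures in second 100; only the merged count shows three.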
Traditional databases for big concurrent systems like airline
reservations worked extra hard not to sell the same seat twice on a
given plane; I believe most modern ecommerce systems are "optimistic"
-- assume enough widgets are available and apologize later if you run
out for a couple of customers -- which simplifies locking and
significantly improves performance. Insisting on fully transactional,
distributed locks made a two-server DB that I used to work on (not at
the current employer) slower than a single server; for some
applications, even three servers would still be slower than a single
one because the distributed locks were so expensive.
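
As a rough illustration of the "optimistic" approach, here is a sketch under stated assumptions: the class OptimisticInventory and its methods are hypothetical, not taken from any real ecommerce system. Orders are accepted without taking a lock or checking stock, and a later reconciliation pass decides who gets an apology:

```python
class OptimisticInventory:
    """Accept orders without locking; sort out shortages afterwards."""

    def __init__(self, stock):
        self.stock = stock
        self.orders = []  # accepted in arrival order, no lock taken

    def place_order(self, customer, qty):
        # Optimistic: accept immediately without checking or locking stock.
        self.orders.append((customer, qty))
        return True

    def reconcile(self):
        """Fulfil in arrival order; return (fulfilled, apologies)."""
        remaining = self.stock
        fulfilled, apologies = [], []
        for customer, qty in self.orders:
            if qty <= remaining:
                remaining -= qty
                fulfilled.append(customer)
            else:
                apologies.append(customer)  # ran out: apologize later
        return fulfilled, apologies

inv = OptimisticInventory(stock=3)
for name in ["ann", "bob", "cat", "dan"]:
    inv.place_order(name, 1)
fulfilled, apologies = inv.reconcile()
```

The order path never blocks on a lock; the cost is that the last customer finds out late.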
You can assume that the databases will be "eventually consistent" and
get much better performance, especially if it's unlikely that multiple
users will be competing for the same items; at the bleeding edge (not
sure how much has been published), the databases proceed
optimistically, but can back out of transactions that failed due to
conflicts relatively long after those were committed.
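
A toy sketch of that last idea, with loud assumptions: two hypothetical replicas commit locally with no coordination, timestamps are assumed to be comparable across replicas (which is hard in real systems), and the resolution rule "earliest commit wins" is chosen only for illustration. It is not how any particular product does it:

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    committed: dict = field(default_factory=dict)  # seat -> (timestamp, customer)

    def commit(self, seat, customer, timestamp):
        # Commit locally with no cross-replica coordination.
        self.committed[seat] = (timestamp, customer)

def reconcile(replicas):
    """Merge replicas; earliest timestamp wins, later commits are backed out."""
    winners, backed_out = {}, []
    for replica in replicas:
        for seat, (ts, customer) in replica.committed.items():
            if seat not in winners:
                winners[seat] = (ts, customer)
            elif ts < winners[seat][0]:
                # The earlier commit displaces the one merged so far.
                backed_out.append((seat, winners[seat][1]))
                winners[seat] = (ts, customer)
            else:
                backed_out.append((seat, customer))
    return winners, backed_out

east, west = Replica("east"), Replica("west")
east.commit("12A", "ann", timestamp=5)
west.commit("12A", "bob", timestamp=3)
west.commit("14C", "cat", timestamp=4)
winners, backed_out = reconcile([east, west])
```

Both replicas reported success for seat 12A at commit time; the conflict only surfaces at merge, when one already-committed booking is backed out.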
That's my limited understanding of the situation -- most of the
methods JAR deprecates are pretty close to the state of the art for
mid-range commercial products, so I'm curious how far the new stuff
has gotten beyond the lab -- does something like
https://www.cockroachlabs.com/ count as The Old Show or Next
On Tue, Jun 21, 2016 at 5:50 PM, Stephen D. Williams <sdw at lig.net> wrote:
> That's what I expect in a lot of cases: tiers of communications concentrators / condensers / filters, then parallel I/O channels to
> some kind of storage.
> In some cases, you may not even want to store this kind of data on secondary storage: fan it into high speed data channels into a
> big, fast memory. Then do operations on that. As an image even. So you can use GPUs. Replicate it a few times for different
> kinds of analysis.
> On 6/21/16 2:46 PM, Marty Halvorson wrote:
>> After reading all this, I wonder what you'll think of how the CERN LHC collects data?
>> Each experiment generates tens of trillions of bits per second. A reduction of up to 2 orders of magnitude is done in hardware;
>> the result is sent via very high speed pipes to a huge number of PCs, where a further reduction of 1 or 2 orders of magnitude is done.
>> Finally, the data is stored for research, which is usually done on collections of supercomputers.
After 30+ years of email, I have used up my supply of clever .sig material.