[FoRK] Joyent, cloud service evolution (J. Andrew Rogers)
Gregory Alan Bolcer
greg at bolcer.org
Wed Jun 22 10:20:27 PDT 2016
Data comes in all shapes and sizes. One of the best breakdowns I've seen.
As for small databases? Go big or go home. ;-)
On Tue, Jun 21, 2016 at 9:52 PM, Ken Meltsner <meltsner at alum.mit.edu> wrote:
> The complicated part is when the huge stream of data can't be
> processed in smaller, independent streams. Counting the number of
> failed logins per second for some huge service, perhaps, if there's an
> attack going on using many different IDs and endpoint addresses at the
> same time. [Might not be the best example, but all I could come up
> with off the top of my head.]
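
[A minimal sketch of the failed-login example above, in Python. It assumes events arrive roughly in timestamp order; `FailedLoginWindow` and `global_count` are hypothetical names for illustration, not anything from the thread. The point is that the number being asked for is global: if the stream is sharded by ID or address, every shard holds only a partial count, and something still has to add the partial counts up.]

```python
from collections import deque


class FailedLoginWindow:
    """Sliding-window count of failed logins seen by one shard (hypothetical helper)."""

    def __init__(self, window_seconds=1.0):
        self.window_seconds = window_seconds
        self._timestamps = deque()  # timestamps of failures still inside the window

    def record(self, timestamp):
        """Record one failed login; timestamps are assumed to be non-decreasing."""
        self._timestamps.append(timestamp)
        return self.count(timestamp)

    def count(self, now):
        """Drop entries older than the window and return how many remain."""
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


def global_count(shards, now):
    """The attack-wide figure: the sum of every shard's partial count."""
    return sum(shard.count(now) for shard in shards)
```

[Each shard on its own may look quiet while the sum is alarming, which is why this kind of query does not split cleanly into independent streams.]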
> Traditional databases for big concurrent systems like airline
> reservations worked extra hard not to sell the same seat twice on a
> given plane; I believe most modern ecommerce systems are "optimistic"
> -- assume enough widgets are available and apologize later if you run
> out for a couple of customers -- which simplifies locking and
> significantly improves performance. Insisting on fully transactional,
> distributed locks made a two-server DB that I used to work on (not at
> the current employer) slower than a single server; for some
> applications, even three servers would still be slower than a single
> one because the distributed locks were so expensive.
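
[A rough sketch of the optimistic idea, in Python. This is version-check optimistic concurrency, a stricter relative of the "sell it and apologize later" approach described above, and not the design of any product mentioned in the thread. `Inventory`, `Conflict` and `reserve` are hypothetical names. The sketch is single-process; in a real database the compare-and-write in `commit` would have to be atomic, for example a conditional update.]

```python
class Conflict(Exception):
    """Raised when someone else committed between our read and our write."""


class Inventory:
    def __init__(self, stock):
        self.stock = stock
        self.version = 0  # bumped on every successful commit

    def read(self):
        # No lock is taken: just remember which version we saw.
        return self.stock, self.version

    def commit(self, new_stock, expected_version):
        # Assumed to be atomic in a real system.
        if self.version != expected_version:
            raise Conflict("stock changed since it was read")
        self.stock = new_stock
        self.version += 1


def reserve(inventory, quantity, max_retries=3):
    """Try to take `quantity` items; retry on conflict instead of locking."""
    for _ in range(max_retries):
        stock, version = inventory.read()
        if stock < quantity:
            return False  # genuinely out of stock
        try:
            inventory.commit(stock - quantity, version)
            return True
        except Conflict:
            continue  # another buyer got in first; re-read and try again
    return False
```

[When conflicts are rare, almost every call succeeds on the first pass and nobody waits on a distributed lock; the cost only shows up when many buyers compete for the same item.]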
> You can assume that the databases will be "eventually consistent" and
> get much better performance, especially if it's unlikely that multiple
> users will be competing for the same items; at the bleeding edge (not
> sure how much has been published), the databases proceed
> optimistically, but can back out of transactions that failed due to
> conflicts relatively long after those were committed.
> That's my limited understanding of the situation -- most of the
> methods JAR deprecates are pretty close to the state of the art for
> mid-range commercial products, so I'm curious how far the new stuff
> has gotten beyond the lab -- does something like
> https://www.cockroachlabs.com/ count as The Old Show or Next
> Ken Meltsner
> On Tue, Jun 21, 2016 at 5:50 PM, Stephen D. Williams <sdw at lig.net> wrote:
> > That's what I expect in a lot of cases: tiers of communications
> > concentrators / condensers / filters, then parallel I/O channels to
> > some kind of storage.
> > In some cases, you may not even want to store this kind of data on
> > secondary storage: fan it into high speed data channels into a
> > big, fast memory. Then do operations on that. As an image even. So
> > you can use GPUs. Replicate it a few times for different
> > kinds of analysis.
> > sdw
> > On 6/21/16 2:46 PM, Marty Halvorson wrote:
> >> After reading all this, I wonder what you'll think of how the CERN LHC
> >> collects data?
> >> Each experiment generates tens of trillions of bits per second. A
> >> reduction of up to 2 orders of magnitude is done in hardware;
> >> the result is sent via very high speed pipes to a huge number of PCs,
> >> where a further reduction of 1 or 2 orders of magnitude is done.
> >> Finally, the data is stored for research, which is usually done on
> >> collections of supercomputers.
> >> Thanks,
> >> Marty
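
[Back-of-the-envelope arithmetic for the figures quoted above, in Python. The starting rate of 10**13 bits per second is an assumed round number standing in for "tens of trillions"; the reduction factors are the ones stated in the message, taken at their upper bound for the hardware stage. `reduce_rate` is a hypothetical helper.]

```python
# Assumed round figure for "tens of trillions of bits per second".
RAW_BITS_PER_SECOND = 10**13


def reduce_rate(bits_per_second, orders_of_magnitude):
    """Cut a data rate by the given number of orders of magnitude."""
    return bits_per_second // 10**orders_of_magnitude


# Hardware stage: up to 2 orders of magnitude.
after_hardware = reduce_rate(RAW_BITS_PER_SECOND, 2)

# PC farm stage: a further 1 or 2 orders of magnitude.
after_farm_low = reduce_rate(after_hardware, 1)
after_farm_high = reduce_rate(after_hardware, 2)

print(after_hardware, after_farm_low, after_farm_high)
```

[Under those assumptions, roughly 10**11 bits per second leave the hardware stage, and somewhere between 10**9 and 10**10 bits per second (about 1 to 10 gigabits per second) are left to store.]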
> > _______________________________________________
> > FoRK mailing list
> > http://xent.com/mailman/listinfo/fork
> After 30+ years of email, I have used up my supply of clever .sig material.
greg at bolcer.org, http://bolcer.org, c: +1.714.928.5476