[FoRK] Big Data

Koen Holtman k.holtman at chello.nl
Sat Feb 4 05:19:02 PST 2012

I worked on physics data grids back in the early 2000s.

The nice thing about big physics data is that it is big for solid
technical reasons.  All the experiments I was involved with had detailed
technical calculations about the required dimensions of their data storage
systems.  Build it any smaller and your chances of discovering something
new drop to zero.  Build it much larger and you are just wasting money.

In big data about humans, there is no such technical argument about how
big the data needs to be.  There is just the assumption that more data
inevitably leads to more knowledge, more profitable ad placement, a more
efficient economy, etc.  

It feels very much like a cargo cult.

I suspect that for many of these investor-supported and
advertiser-supported big data sets, the point of diminishing returns was
passed a long time ago.  The more perceptive high priests might know this
already, but if they do, they are not telling.



On Fri, 3 Feb 2012, Gregory Alan Bolcer wrote:

> Yeah, but physics stuff is boring.  Big data on "human" behaviors is so
> much more interesting.
> Greg
> On 2/3/2012 9:37 AM, Joseph S. Barrera III wrote:
> > At SLAC/LCLS we do Big Data but not "Cloud".
> >
> > (psexport01.~/release)$ df -h | grep "/data\|Used"
> > Filesystem  Size  Used  Avail  Use%  Mounted on
> >             860T  758T    59T   93%  /reg/data/ana11
> >             860T  572T   245T   71%  /reg/data/ana01
> >             430T  238T   171T   59%  /reg/data/ana02
> >             899T  145T   755T   17%  /reg/data/ana12
> >
> > http://today.slac.stanford.edu/feature/2010/lcls-pcds.asp
> >
> > Not much compared to CERN, of course...
> >
> > http://press.web.cern.ch/public/en/LHC/Computing-en.html
