[FoRK] Software hacks using timestamp counters

Eugen Leitl eugen at leitl.org
Mon Oct 1 03:33:57 PDT 2012

On Sun, Sep 30, 2012 at 11:53:47PM -0700, Stephen D. Williams wrote:

>> Everything is single-process again with minimal synchronization. Very 1990s but on much faster silicon.
> When you have a single item of work with interrelated data and stages 
> that must use multiple cores simultaneously, thinking purely  

There are no longer cores, but nodes on a fabric. No shared memory.
Everything must be by explicit message passing. If it's to be quick,
it must have no serial section.
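
The "no serial section" point is Amdahl's law: any fraction of the work that cannot be spread across nodes caps the achievable speedup, however many nodes the fabric has. A minimal sketch of that arithmetic follows; the function name and the sample fractions are illustrative assumptions, not figures from this thread.

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work is serial.

    Amdahl's law: S(n) = 1 / (s + (1 - s) / n)
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


if __name__ == "__main__":
    # Illustrative values only: compare a fully parallel job with jobs
    # that keep a 1% or 5% serial section as the node count grows.
    for nodes in (64, 1024, 65536):
        for serial in (0.0, 0.01, 0.05):
            bound = amdahl_speedup(serial, nodes)
            print(f"nodes={nodes:6d} serial={serial:.2f} speedup<={bound:9.1f}")
```

With a 1% serial section the bound never exceeds 100x, and 1024 nodes already yield only about 91x, which is why the serial part has to go to zero for large node counts to pay off.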

> single-thread / single-core doesn't solve the problem.  One way to 
> differentiate multi-thread usage is whether threads are used  
> indiscriminately (kill the hardware and let the scheduler sort it out) or 
> as a way to directly manage multi-core use.  The latter case can best be 

You're talking SMP and shared memory space. Doesn't scale much beyond
64 cores, tops.

> handled by creating a thread pool where you have a single thread for each 
> core.  (And maybe a small number of admin threads that run occasionally, 
> like when all work is done.)  Then each of those threads can operate on 
> independent and/or shared work queues.  Done properly, there is almost no 
> locking, memory contention, sleeping, or system calls.
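
The pool described in the quoted text (one worker thread per core, each draining its own work queue) can be sketched as below. This is a structural sketch under stated assumptions: `run_pool`, `worker`, and the squaring "work" are hypothetical names chosen for illustration, threads are not pinned to cores, and in CPython the global interpreter lock means the threads will not run Python bytecode in parallel, so this shows the shape of the design rather than its performance.

```python
import os
import queue
import threading

STOP = object()  # sentinel that tells a worker to exit


def worker(inbox: queue.Queue, results: list, slot: int) -> None:
    """Drain this worker's private queue into a private running total."""
    total = 0
    while True:
        item = inbox.get()      # blocks until work (or STOP) arrives
        if item is STOP:
            break
        total += item * item    # stand-in for real work
    results[slot] = total       # each worker writes only its own slot


def run_pool(items, n_workers=None) -> int:
    """Partition `items` round-robin over one worker per core and sum the results."""
    n_workers = n_workers or os.cpu_count() or 1
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = [0] * n_workers
    threads = [
        threading.Thread(target=worker, args=(inboxes[i], results, i))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for i, item in enumerate(items):
        inboxes[i % n_workers].put(item)  # static partitioning, no shared queue
    for q in inboxes:
        q.put(STOP)
    for t in threads:
        t.join()
    return sum(results)


if __name__ == "__main__":
    print(run_pool(range(10), n_workers=4))
```

Each queue here has a single producer and a single consumer, so the only synchronization is inside the queue itself; the workers never touch each other's totals.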

You can't lock each other out just by waiting for a message to arrive.
Memory contention is already a problem with single cores, hence
the need for stacked memory. DRAM will go >100 GByte/s shortly,
as long as each node has ~GByte.

I can tell you the developers don't realize yet that threading
will desert them rather soon.

The need to do MPI-like and OpenCL stuff for speed isn't on their
radar yet.
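
For readers who have not written in that style, here is a minimal sketch of what "MPI-like" means in practice: every node owns its data, and the only way to combine results is to send and receive messages. It is an assumption-laden stand-in, using threads and queues inside one process as pretend nodes and mailboxes; `node` and `tree_sum` are hypothetical names, and a real system would use an actual message-passing library rather than this.

```python
import queue
import threading


def node(rank, size, local_data, mailboxes, out):
    """One pretend node: owns `local_data`, talks to others only via mailboxes."""
    partial = sum(local_data)
    step = 1
    while step < size:
        if rank % (2 * step) == 0:
            # Receiver in this round: wait for the partner's partial sum, if any.
            if rank + step < size:
                partial += mailboxes[rank].get()  # blocking receive
        else:
            # Sender in this round: hand the partial sum over and retire.
            mailboxes[rank - step].put(partial)
            return
        step *= 2
    out.put(partial)  # only rank 0 gets here, holding the global sum


def tree_sum(chunks):
    """Sum all chunks with a tree reduction of about log2(len(chunks)) rounds."""
    size = len(chunks)
    mailboxes = [queue.Queue() for _ in range(size)]
    out = queue.Queue()
    threads = [
        threading.Thread(target=node, args=(r, size, chunks[r], mailboxes, out))
        for r in range(size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.get()


if __name__ == "__main__":
    print(tree_sum([[1, 2], [3], [4, 5, 6], [7], [8, 9]]))
```

The tree shape is a deliberate choice: a chain that passes one running total from node to node would be a serial section in disguise, while the tree finishes in a logarithmic number of rounds.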
