[FoRK] Software hacks using timestamp counters
eugen at leitl.org
Mon Oct 1 03:33:57 PDT 2012
On Sun, Sep 30, 2012 at 11:53:47PM -0700, Stephen D. Williams wrote:
>> Everything is single-process again with minimal synchronization. Very 1990s but on much faster silicon.
> When you have a single item of work with interrelated data and stages
> that must use multiple cores simultaneously, thinking purely
There are no longer cores, but nodes on a fabric. No shared memory.
Everything must be by explicit message passing. If it's to be quick,
it must have no serial section.
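
A rough sketch of what "explicit message passing, no shared memory" looks like in code. Everything here is an assumption for illustration: the names node and sum_of_squares are made up, and threads with private inboxes stand in for nodes on a fabric (a real system would use MPI or similar over an interconnect). Each node only ever touches data it received as a message.

```python
import queue
import threading

def node(rank, inbox, outbox):
    """One hypothetical 'node': works only on data it received as a message."""
    while True:
        msg = inbox.get()              # block until a message arrives
        if msg is None:                # sentinel message: shut down
            break
        # msg is this node's private chunk; reply with (rank, partial result)
        outbox.put((rank, sum(x * x for x in msg)))

def sum_of_squares(data, n_nodes=4):
    inboxes = [queue.Queue() for _ in range(n_nodes)]
    results = queue.Queue()
    nodes = [threading.Thread(target=node, args=(r, inboxes[r], results))
             for r in range(n_nodes)]
    for t in nodes:
        t.start()
    # Scatter: every node gets its own slice, copied into a fresh list.
    for r in range(n_nodes):
        inboxes[r].put(list(data[r::n_nodes]))
        inboxes[r].put(None)
    # Gather: one reply per node, in whatever order they arrive.
    total = sum(results.get()[1] for _ in range(n_nodes))
    for t in nodes:
        t.join()
    return total
```

Note the gather step at the end is itself a small serial section; on a large fabric one would presumably replace it with a tree or ring reduction.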
> single-thread / single-core doesn't solve the problem. One way to
> differentiate multi-thread usage is whether threads are used
> indiscriminately (kill the hardware and let the scheduler sort it out) or
> as a way to directly manage multi-core use. The latter case can best be
You're talking SMP and shared memory space. Doesn't scale much beyond
64 cores, tops.
> handled by creating a thread pool where you have a single thread for each
> core. (And maybe a small number of admin threads that run occasionally,
> like when all work is done.) Then each of those threads can operate on
> independent and/or shared work queues. Done properly, there is almost no
> locking, memory contention, sleeping, or system calls.
You can't lock others out by waiting for a message to arrive.
Memory contention is already a problem with single cores, hence
the need for stacked memory. DRAM will go >100 GByte/s shortly,
as long as each node has ~GByte.
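
For comparison, the one-thread-per-core pool from the quoted text might be sketched like this. It is a minimal sketch under assumptions: run_pool is a hypothetical name, work is dealt round-robin into per-worker queues up front, and each worker writes only to its own result list, so no explicit lock appears in the sketch itself (each queue.Queue still locks internally). It does not pin threads to cores, and in CPython the GIL means this shows the structure rather than a real speedup for CPU-bound work.

```python
import os
import queue
import threading

def run_pool(tasks, n_workers=None):
    """Run (func, arg) tasks on one worker thread per core (by default)."""
    n_workers = n_workers or os.cpu_count() or 1
    queues = [queue.Queue() for _ in range(n_workers)]
    # Each worker appends only to its own list, so results need no lock.
    results = [[] for _ in range(n_workers)]

    def worker(i):
        while True:
            task = queues[i].get()
            if task is None:           # sentinel: this worker's queue is drained
                break
            func, arg = task
            results[i].append(func(arg))

    # Deal the work out round-robin to independent queues before starting.
    for k, task in enumerate(tasks):
        queues[k % n_workers].put(task)
    for q in queues:
        q.put(None)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Flatten per-worker results; order follows workers, not submission.
    return [r for per_worker in results for r in per_worker]
```

Usage would be something like run_pool([(abs, -i) for i in range(8)]), which returns the eight results grouped by worker.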
I can tell you the developers don't realize yet that threading
will desert them rather soon.
The need to do MPI-like and OpenCL stuff for speed isn't on their