Re: OpenMP ? [new, portable multiprocessing std]

Ron Resnick (resnick@interlog.com)
Wed, 03 Dec 1997 00:32:07 +0200


At 10:07 AM 10/28/97 -0800, Rohit Khare wrote:
>
>[Who's Steven Wallach? Why now? What kind of portability -- just
>POSIX? thread model, or just IPC? Ach, some days you just want trade
>press... RK]
>

No, presumably this would standardize programming across the various
existing MP architectures - like IBM's SP, really just a gang of AIX
servers with shared-all memory access, and other such offerings from
other vendors. This goes beyond
anything POSIX has ever talked about, since the SP has OS extensions
and APIs beyond bare Unix, for things like segment locking.

>As a result scalable software for such systems will exist, at some
>level, only in a message passing model. Message passing is the native
>model for these architectures, and higher level models can only be
>built on top of message passing.
>
>Unfortunately, this situation has existed long enough that there is
>now an implicit assumption in the high performance computing world
>that the only way to achieve scalability in parallel software is with
>a message passing programming model. This is not necessarily true.
>There is now an emerging class of multiprocessor architecture with
>scalable hardware support for cache coherence. These are generally
>referred to as Scalable Shared Memory Multiprocessor, or SSMP,
>architectures [1]. For SSMP systems the native programming model is
>shared memory, and message passing is built on top of the shared
>memory model. On such systems, software scalability is straightforward
>to achieve with a shared memory programming model.
>
>[I don't buy it. Shared-memory is a sham in the long-term. Truly
>distributed -- *decentralized* distributed computing -- will require
>the message-passing model in spades. Shmem is not how ant colonies
>work. Sure, today, this is for supercomputing only, but in the
>long-term cheap, slow, and ubiquitous trumps expensive, fast, and
>centralized. Good PR though... -RK]
>

Of course I do agree with you, Rohit (this once!).
In the contest between SMPs (symmetric multiprocessors) and MPPs
(massively parallel processors) versus lowly PC/Ethernet clusters,
I go for lowly every time. Now that crunch-intensive things like
prime-number factoring are routinely done better with the spare cycles
of low-end machines on the Net than by dedicated supercomputers,
there is little doubt that the loosely-coupled distributed model is,
in general, the more powerful, more flexible, more economical one.

Having said all that though, the arguments being made by this Wallach
guy aren't totally braindead. Shared-all and shared-none models are hardly
new. Both can get the job done. Depending on what the job is, one may
be more efficient than the other. Algorithms that have more compute and
less communication are often better with shared-none, and vice versa.

In a loosely-coupled distributed setting, shared-memory models
invariably have to have message-passing underlying them, since the
only way to move bits over loosely-coupled wiring is with packets.
So Linda, JavaSpaces (and presumably MS's Millennium too) - those
kinds of global shared-memory models - all have to be implemented over
some kind of message-passing protocol, and eventually over discrete IP
packets. And they're subject to exactly the same issues that
message-passing distribution paradigms face: no avoiding it!
Partial failures, partitions, asynchrony & unordered messages -
that's distributed life, for better or for worse.
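
Just to make that concrete, here's a toy sketch of what I mean. The
names (SpaceDemo, SpaceServer, rpc) are all invented for illustration -
this is not Linda's or JavaSpaces' actual API, just plain Java sockets
on localhost with hand-waved error handling. The point is that every
"read" or "write" of the shared store is really one round-trip message:

// A toy "shared space" over messages. All names here are made up for
// the sketch; they are not Linda or JavaSpaces interfaces.
import java.io.*;
import java.net.*;
import java.util.*;

public class SpaceDemo {

    // Server side: the "shared memory", reachable only via messages.
    static class SpaceServer implements Runnable {
        private final ServerSocket listener;
        private final Map<String, String> store = new HashMap<String, String>();
        SpaceServer(ServerSocket s) { listener = s; }
        public void run() {
            try {
                while (true) {
                    Socket c = listener.accept();
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(c.getInputStream()));
                    PrintWriter out = new PrintWriter(c.getOutputStream(), true);
                    String line = in.readLine();
                    if (line != null) {
                        String[] req = line.split(" ", 3); // "PUT key value" or "GET key"
                        synchronized (store) {
                            if (req[0].equals("PUT")) {
                                store.put(req[1], req[2]);
                                out.println("OK");
                            } else {
                                String v = store.get(req[1]);
                                out.println(v == null ? "<empty>" : v);
                            }
                        }
                    }
                    c.close();
                }
            } catch (IOException e) {
                // listener closed; server shuts down
            }
        }
    }

    // Client side: looks like an access to shared state, but each call
    // is really one request/reply exchange over the wire.
    static String rpc(int port, String line) throws IOException {
        Socket s = new Socket("localhost", port);
        PrintWriter out = new PrintWriter(s.getOutputStream(), true);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(s.getInputStream()));
        out.println(line);
        String reply = in.readLine();
        s.close();
        return reply;
    }

    public static void main(String[] args) throws Exception {
        ServerSocket ss = new ServerSocket(0);           // any free port
        new Thread(new SpaceServer(ss)).start();
        int port = ss.getLocalPort();

        rpc(port, "PUT answer 42");                      // "write" shared state
        System.out.println(rpc(port, "GET answer"));     // "read" it back: 42

        ss.close();                                      // stop the server
    }
}

Dress it up with tuples or object serialization however you like -
every "memory" access is still a discrete exchange of messages, with
all the partial-failure baggage that carries.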

However, in a more tightly-coupled distributed setting, there are tricks
you can play with the hardware, which presumably is what this OpenMP
is all about. Those 'cache coherence' things he mentions
use inter-machine DMA tactics - essentially, the wiring between
the boxes starts to look more like a bus (it gets synchronously clocked)
and less like a LAN. This really does mean that a pool of common storage
starts to take on more of the properties of true single-machine shared-memory.
Of course, these things have been highly proprietary. Every vendor defines
these 'super-buses' in their own peculiar way. And they only stretch so far.
You can't have synchronized clocks over arbitrary distances, so these clusters
are pretty much stuck in one room, and scale to numbers like 4-way, 8-way,
whatever. They're not exactly what you'd call a 'scalable, distributed
network'. They're good for what they do, but they're inflexible, not
scalable, proprietary, and expensive -- i.e., a great business for
server vendors to get into.
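
For contrast with the socket sketch above: once the hardware really
does give you one coherent pool of memory, "communication" collapses
into threads touching the same object, with locking as the only
protocol. Here's that flavor in Java (an analogy only - OpenMP itself
is, as I understand it, compiler directives and library calls for
Fortran and C-style loops, not a Java thing):

// Shared-memory flavor: four workers update one location directly.
// The memory system, not a protocol stack, makes each update visible
// to everyone.
public class SharedCounter {
    private long total = 0;

    synchronized void add(long n) { total += n; }
    synchronized long get()       { return total; }

    public static void main(String[] args) throws InterruptedException {
        final SharedCounter c = new SharedCounter();
        Thread[] workers = new Thread[4];               // think "4-way box"
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (long j = 1; j <= 250000; j++) c.add(j);
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < workers.length; i++) workers[i].join();
        System.out.println(c.get());  // 4 * (1 + ... + 250000) = 125000500000
    }
}

No partial failures, no lost or reordered messages - but also no way
to stretch it past the back of one machine room, which is exactly the
trade-off being discussed.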

I'm sure you know all this - I'm not trying to patronize you. But since
you quoted his stuff & panned it without adding much detail,
I thought it was fair territory to point these things out.

There will always be classes of problems (much of numerical computing,
for example) for which such shared-memory architectures are the right
fit, so a standard for these systems will really help the people who
develop for those kinds of environments. But of course I'm with you in
believing that these cases are the exception, not the norm. For most
everyday information-processing tasks, gobs of tiny, cheap, networked
processors with a simple, elegant message-passing protocol at the
software level are the way to go.

Ideally, what I'd like to see is a software model broad enough to span
this whole range of physical deployment - one day running on an
Internet's worth of little embedded, watch-battery-sized machines, the
next on a dedicated MPP. That software model, I believe, is based on
objects. Only objects let you combine the pieces you want into the end
result you need, then recombine them differently tomorrow. Furthermore,
if those objects all share a common binary model & messaging model,
standard services & class libraries, introspection and serialization
capabilities, and an integrated security and concurrency model, and are
'open' (free to license, free to use, etc.) - well, then I guess we'd
have Java :-).

Ron