Integrating distributed computing and the Web...

I Find Karma (
Mon, 1 Jul 96 10:36:03 PDT

Date: Fri, 21 Jun 96 07:15:28 -0700
From: (HPCwire)

by Mark Baker, Geoffrey Fox, Wojtek Furmanski, Syracuse University; HPCwire
Marina Chen, Boston University; Jim Lowie, Cooperating Systems

Syracuse, NY -- Center for Research on Parallel Computation (CRPC)
researchers at the Northeast Parallel Architectures Center (NPAC) at Syracuse
University and collaborators have been developing concepts and prototypes of
"Compute-Webs" over the past year. This work is partly motivated by the
integration of information processing and computation, which promises both a
better programming environment and natural support for data-intensive computing.
The World Wide Web itself represents the largest available computer, with
some 20 million potential nodes worldwide. This potential is expected to grow
by a factor of 10 as the Information Superhighway is fully deployed.

The group's first prototype was built on compute-extended Web servers using
the standard CGI mechanism. It was successfully applied to the factorization
of RSA 130, a 130-decimal-digit number, using the latest sieving algorithm,
with the work distributed across a net of Web servers and clients in a load-balanced,
fault-tolerant fashion. This work was presented at SUPERCOMPUTING '95 and won
the High-Performance Computing Challenge Award for Most Geographically
Dispersed and Heterogeneous Factoring on the World-Wide Computer in the
Teraflop Challenge contest.
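
The load-balanced, fault-tolerant distribution described above can be
illustrated with a minimal sketch. It is not the group's CGI implementation:
the WorkQueue class, its method names, and the lease timeout are all
hypothetical. It assumes workers pull independent work units on request (so
faster machines naturally take more) and that a unit whose worker never
reports back is handed out again after its lease expires.

```python
import time
from collections import deque


class WorkQueue:
    """Hypothetical coordinator: leases work units and reissues expired ones."""

    def __init__(self, units, lease_seconds=60.0, clock=time.monotonic):
        self.pending = deque(units)   # units not currently handed out
        self.leased = {}              # unit -> (worker, lease expiry time)
        self.results = {}             # unit -> first reported result
        self.lease_seconds = lease_seconds
        self.clock = clock            # injectable so the logic can be tested

    def _reclaim_expired(self):
        # Fault tolerance: a unit whose lease ran out goes back in the queue.
        now = self.clock()
        for unit, (_, expiry) in list(self.leased.items()):
            if expiry <= now:
                del self.leased[unit]
                self.pending.append(unit)

    def request(self, worker):
        """Called when a worker asks for work; returns a unit or None."""
        self._reclaim_expired()
        if not self.pending:
            return None
        unit = self.pending.popleft()
        self.leased[unit] = (worker, self.clock() + self.lease_seconds)
        return unit

    def report(self, worker, unit, result):
        """Record a result; only the first report for a unit is kept."""
        self.results.setdefault(unit, result)
        self.leased.pop(unit, None)
        if unit in self.pending:      # a late report after the lease expired
            self.pending.remove(unit)

    def done(self):
        return not self.pending and not self.leased
```

Because workers ask for units rather than being assigned them, no global
knowledge of machine speed is needed, which suits an "embarrassingly
parallel" job such as sieving.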

Clearly, the current Web is not the place to explore complex parallel
algorithms with stringent latency and synchronization requirements. The RSA
130 problem was "embarrassingly parallel" and suitable for the high
functionality but modest performance of the Web. There are at least two
natural extensions of this work, MetaWeb and WebFlow, which implement
coarse-grain software integration and are insensitive to the modest bandwidth
and high latency of geographically distributed computing and current HTTP Web

The researchers found that the CGI-enhanced Web servers that supported RSA
130 factoring did not provide the standard support expected from clustered
computing packages. They are designing their new system, MetaWeb, as a
cluster or MetaComputer management system built on top of Web servers.
MetaWeb includes load balancing, fault tolerance, process management,
automatic minimization of job impact on user workstations, security, and
accounting support. The Web-based approach has two immediate advantages: it
automatically provides MetaComputing linkage for all platforms, including
Windows as well as UNIX operating systems, and it can naturally use "real"
databases such as DB2 or Oracle, which have already been linked to the Web.

All system and job information will be directly stored in a relational or
object database. Initially, MetaWeb will use CGI scripts linking to existing
Perl or C modules, and eventually migrate to a full Java-based system.
Another important feature of this proposed system is the natural linkage
of scientific data and performance visualization. The group intends to link
University of Illinois CRPC researcher Dan Reed's Pablo performance analysis
environment to the Web compute servers so that users can both store
performance traces in the associated databases and display them either offline
or in real time using Java applets.
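
As a rough illustration of keeping job information and performance traces in
a relational database, the sketch below uses SQLite from the Python standard
library as a stand-in for a server database such as DB2 or Oracle. The table
layout and function names are assumptions made for this example; they do not
reflect MetaWeb's actual schema or Pablo's trace format.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the store and create the (hypothetical) tables if missing."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS jobs (
            id     INTEGER PRIMARY KEY,
            host   TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'queued'
        );
        CREATE TABLE IF NOT EXISTS trace_events (
            job_id    INTEGER NOT NULL REFERENCES jobs(id),
            t_seconds REAL NOT NULL,
            event     TEXT NOT NULL
        );
    """)
    return db


def submit_job(db, host):
    """Insert a job row and return its id."""
    cur = db.execute("INSERT INTO jobs (host) VALUES (?)", (host,))
    db.commit()
    return cur.lastrowid


def record_event(db, job_id, t_seconds, event):
    """Append one timestamped performance event for a job."""
    db.execute(
        "INSERT INTO trace_events (job_id, t_seconds, event) VALUES (?, ?, ?)",
        (job_id, t_seconds, event),
    )
    db.commit()


def job_trace(db, job_id):
    """Return a job's events in time order, ready for offline display."""
    rows = db.execute(
        "SELECT t_seconds, event FROM trace_events "
        "WHERE job_id = ? ORDER BY t_seconds",
        (job_id,),
    )
    return rows.fetchall()
```

A display tool would call job_trace for offline viewing, or poll it
repeatedly to approximate a real-time view.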

Web technology has evolved dramatically since the group's first RSA 130
project. The group sees the growing role of Java both for servers and
clients, and VRML for visualization and data specification in the
Compute-Webs. They define the low-level Web computing model, WebVM, as a mesh
of interacting, computationally extended Web servers that forms the base
infrastructure for a variety of high-level programming environments. WebVM
starts from the Intranet domain and current Web technologies, allows new
technologies to be inserted through a portable, transparent module API, and
gradually builds up to reliable worldwide scalability.

MetaWeb facilitates this process by adding system/cluster management and
performance visualization support. WebFlow, a natural early high-level
programming environment, imposes a dataflow programming paradigm on top of
WebVM and offers Java-based tools for visual authoring of computational
graphs as well as "little language"-based scripted support for adaptive
programmatic variations of the dataflow topology. The initial application of
WebFlow is to adaptive mesh refinement in a set of projects that includes the
binary black hole grand challenge and environmental simulations from
University of Texas CRPC researcher Mary Wheeler.
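
The dataflow paradigm mentioned above can be sketched in a few lines. This is
only an illustration of coarse-grain dataflow in general, not WebFlow's API:
run_dataflow, the module names, and the wiring format are hypothetical. Each
module runs once all of its upstream modules have produced output, and the
graph is an ordinary dictionary that a script could rewrite between runs to
vary the topology.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dataflow(modules, wires, inputs):
    """Run a coarse-grain dataflow graph once.

    modules: name -> function called with its upstream outputs, in wire order
    wires:   name -> list of upstream module names feeding that module
    inputs:  name -> externally supplied value for a source node
    """
    values = dict(inputs)
    # static_order() yields every node after all of its upstream nodes.
    for name in TopologicalSorter(wires).static_order():
        if name in values:            # source node: value supplied externally
            continue
        args = [values[up] for up in wires.get(name, [])]
        values[name] = modules[name](*args)
    return values


# Hypothetical three-module graph: two branches that are joined at the end.
modules = {
    "scale": lambda x: x * 2,
    "shift": lambda x: x + 1,
    "combine": lambda a, b: a + b,
}
wires = {
    "scale": ["source"],
    "shift": ["source"],
    "combine": ["scale", "shift"],
}
outputs = run_dataflow(modules, wires, {"source": 10})
print(outputs["combine"])  # prints 31
```

In a distributed setting each module call would be a request to a remote
server rather than a local function, but the scheduling rule stays the same.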

WebFlow inherits concepts from previous coarse-grain dataflow-based
computations, popularized by systems such as AVS, Khoros, HENCE, and CODE. The
Web naturally supports dataflow because this is how distributed information
is already accessed in the Web client-server model. Furthermore,
the already established Web-based framework for electronic publication can
be extended to support Web publication of software modules within a
standardized plug-and-play interface. WebFlow integrates computing, parallel
I/O, and information such as database and VRML visualization services in this

By augmenting the developing WebVM/WebFlow framework with the solid MetaWeb
system management, and by linking the sites of the WebFlow HPCC developers'
network, this project can provide the foundation for a true Web-based
MetaComputer that can span the globe. This will allow HPCC researchers to
fully leverage the Web's primary strength: universal access to common tools
and standards for computing, authoring, and information. The resulting
environment is very appropriate for supporting the geographically distributed
collaboration and computation envisioned in the current NSF resolicitation of
the supercomputer centers.

For more information about this project, see the NPAC projects Web site. For
information about cluster computing packages, see the Cluster Computing Review
Web site or the first issue of the NHSE Review Web site.


Reprinted with permission from the Spring 1996 issue of Parallel Computing
Research, Volume 4, Issue 2, the newsletter of the Center for Research on
Parallel Computation.