[FoRK] Long anticipated,
the arrival of radically restructured database
architectures is now finally at hand.
Stephen D. Williams
sdw at lig.net
Thu May 5 21:51:10 PDT 2005
ACM Queue article by Jim Gray and Mark Compton, based on recent
presentations by Stonebraker et al.
This overlaps a lot of what I've been thinking and saying and it ties it
together nicely with various database and related trends. This is a
useful conceptual map to database, distributed processing, and data
concepts that are inspiring rapid innovation. It sure seems like the end
of a long cold winter of contentment with plain old RDBMS/SQL.
A Call to Arms
ACM Queue vol. 3, no. 3 - April 2005
by JIM GRAY, MICROSOFT
MARK COMPTON, CONSULTANT
Long anticipated, the arrival of radically restructured database
architectures is now finally at hand.
Avalanche of Information
We live in a time of extreme change, much of it precipitated by an
avalanche of information that otherwise threatens to swallow us whole.
Under the mounting onslaught, our traditional relational database
constructs—always cumbersome at best—are now clearly at risk of
collapsing under the load.
In fact, rarely do you find a DBMS anymore that doesn’t make provisions
for online analytic processing. Decision trees, Bayes nets, clustering,
and time-series analysis have also become part of the standard package,
with allowances for additional algorithms yet to come. Also, text,
temporal, and spatial data access methods have been added—along with
associated probabilistic logic, since a growing number of applications
call for approximated results. Column stores, which store data
column-wise rather than record-wise, have enjoyed a rebirth, mostly to
accommodate sparse tables, as well as to optimize bandwidth.
Is it any wonder classic relational database architectures are slowly
sagging to their knees?
But wait… there’s more! A growing number of application developers
believe XML and XQuery should be treated as our primary data structure
and access pattern, respectively. At minimum, database systems will need
to accommodate that perspective. Also, as external data increasingly
arrives as streams to be compared with historical data,
stream-processing operators are of necessity being added.
Publish/subscribe systems contribute further to the challenge by
inverting the traditional data/query ratios, requiring that incoming
data be compared against millions of queries instead of queries being
used to search through millions of records. Meanwhile, disk and memory
capacities are growing significantly faster than corresponding
capabilities for reducing latency and ensuring ample bandwidth.
Accordingly, the modern database system increasingly depends on massive
main memory and sequential disk access.
This all will require a new, much more dynamic query optimization
strategy as we move forward. It will have to be a strategy that’s
readily adaptable to changing conditions and preferences. The option of
cleaving to some static plan has simply become untenable. Also note that
we’ll need to account for intelligence increasingly migrating to the
network periphery. Each disk and sensor will effectively be able to
function as a competent database machine. As with all other database
systems, each of these devices will also need to be self-managing,
self-healing, and always-up.
No doubt, you’re starting to get the drift here. As must be obvious by
now, we truly have our work cut out for us to make all of this happen.
That said, we don’t have much of a choice—external forces are driving these changes.
WELCOME TO THE REVOLUTIONS
The rise, fall, and rise of object-relational databases. We’ve enjoyed
an extended period of relative stasis where our agenda has amounted to
little more than “implement SQL better”—Michael Stonebraker, professor
at UC Berkeley and developer of Ingres, calls this the
“polishing-the-round-ball” era of database research and implementation.
Well, friends, those days are over now because databases have become the
vehicles of choice for delivering integrated application development environments.
That’s to say that data and procedures are being joined at long
last—albeit perhaps by a shotgun wedding. The Java or common language
runtimes have been married to relational database engines so that the
traditional EJB-SQL outside-inside split has been eliminated. Now Beans
or business logic can run inside the database. Active databases are the
result, and they offer tremendous potential—both for good and for ill.
(More about this as the discussion proceeds.) Our most immediate
challenge, however, stems from traditional relational databases never
being designed to allow for the commingling of data and algorithms.
The problem starts, of course, with Cobol, with its data division and
procedure division—and the separate committees that were formed to
define each. Conway’s law was at work here: “Systems reflect the
organizations that built them.” So, the database community inherited the
artificial data-program segregation from the Cobol DBTG (Database Task
Group). In effect, databases were separated from their procedural twin
at birth and have been struggling ever since to reunite—for nearly 40 years.
The reunification process began in earnest back in the mid-1980s when
stored procedures were added to SQL (with all humble gratitude here to
those brave, hardy souls at Britton-Lee and Sybase). This was soon
followed by a proliferation of object-relational databases. By the
mid-1990s, even many of the most notable SQL vendors were adding objects
to their own systems. Even though all of these efforts were fine in
their own right—and despite a wave of over-exuberant claims of
“object-oriented databases über alles” by various industry wags—each one
in turn proved to be doomed by the same fundamental flaw: the high risks
inherently associated with all de novo language designs.
It also didn’t help that most of the languages built into the early
object-relational databases were absolutely dreadful. Happily, there now
are several good object-oriented languages that are well implemented and
offer excellent development environments. Java and C# are two good
examples. Significantly, one of the signature characteristics of this
most recent generation of OO environments is that they provide a common
language runtime capable of supporting good performance for nearly all languages.
The really big news here is that these languages have also been fully
integrated into the current crop of object-relational databases. The
runtimes have actually been added to the database engines themselves
such that one can now write database stored procedures (modules), while
at the same time defining database objects as classes within these
languages. With data encapsulated in classes, you’re suddenly able to
actually program and debug SQL, using a persistent programming
environment that looks reassuringly familiar since it’s an extension of
either Java or C#. And yes, pause just a moment to consider: SQL
programming comes complete with version control, debugging, exception
handling, and all the other functionality you would typically associate
with a fully productive development environment. SQLJ, a nice
integration of SQL and Java, is already available—and even better ideas
are in the pipeline.
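The idea of business logic running inside the engine can be sketched in miniature with SQLite's Python binding, which likewise lets host-language code execute during query evaluation (the table, data, and function here are invented for illustration, not taken from the article):

```python
import sqlite3

# A sketch of "logic inside the database": SQLite lets a host-language
# function be registered and then invoked from SQL, analogous in spirit
# to the Java/C# runtimes embedded in commercial engines.
conn = sqlite3.connect(":memory:")

def discount(price, pct):
    """Business logic that runs during query execution."""
    return round(price * (1 - pct / 100.0), 2)

conn.create_function("discount", 2, discount)
conn.execute("CREATE TABLE orders (item TEXT, price REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("widget", 10.0), ("gadget", 25.0)])
rows = conn.execute(
    "SELECT item, discount(price, 20) FROM orders ORDER BY item").fetchall()
print(rows)  # [('gadget', 20.0), ('widget', 8.0)]
```

The debugging and version-control benefits described above follow from the fact that `discount` is ordinary host-language code, not a vendor-specific procedural dialect.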
The beauty of this, of course, is that the whole
inside-the-database/outside-the-database dichotomy that we’ve been
wrestling with over these past 40 years is becoming a thing of the past.
Now, fields are objects (values or references); records are vectors of
objects (fields); and tables are sequences of record objects. Databases,
in turn, are transforming into collections of tables. This objectified
view of database systems gives tremendous leverage—enough to enable many
of the other revolutions we’re about to discuss. That’s because, with
this new perspective, we gain a powerful way to structure and modularize
systems, especially the database systems themselves.
Clean, Object-Oriented Programming Model
A clean, object-oriented programming model also increases the power and
potential of database triggers, while at the same time making it much
easier to construct and debug triggers. This can be a two-edged sword,
however. As the database equivalent of rules-based programming, triggers
are controversial—with plenty of detractors as well as proponents. Those
who hate triggers and the active databases they enable will probably not
be swayed by the argument that a better language foundation is now
available. For those who are believers in active databases, however, it
should be much easier to build systems.
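A minimal trigger, again sketched through embedded SQLite (the audit schema is invented for illustration):

```python
import sqlite3

# A trigger as "rules-based programming" in the database: an audit row
# is written automatically whenever a balance changes, with no
# application code in the update path.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL);
CREATE TABLE audit (account_id INTEGER, old_balance REAL, new_balance REAL);
CREATE TRIGGER balance_audit AFTER UPDATE OF balance ON accounts
BEGIN
    INSERT INTO audit VALUES (OLD.id, OLD.balance, NEW.balance);
END;
""")
conn.execute("INSERT INTO accounts VALUES (1, 100.0)")
conn.execute("UPDATE accounts SET balance = 75.0 WHERE id = 1")
audit = conn.execute("SELECT * FROM audit").fetchall()
print(audit)  # [(1, 100.0, 75.0)]
```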
None of this would even be possible were it not for the fact that
database system architecture has been increasingly modularized and
rationalized over the years. This inherent modularity now enables the
integration of databases with language runtimes, along with all the
various other ongoing revolutions that are set forth in the pages that
follow—every last one of them implemented as extensions to the core data engine.
TPlite rides again. Databases are encapsulated by business logic,
which—before the advent of stored procedures—always ran in the TP
(transaction processing) monitor. It was situated right in the middle of
classic triple-tier presentation/application/data architectures.
With the introduction of stored procedures, the TP monitors themselves
were disintermediated by two-tier client/server architectures. Then, the
pendulum swung back as three-tier architectures returned to center stage
with the emergence of Web servers and HTTP—in part to handle protocol
conversion between HTTP and the database client/server protocols and to
provide presentation services (HTML) right on the Web servers, but also
to provide the execution environment for the EJB or COM business objects.
As e-commerce has continued to evolve, most Web clients have transformed
such that today, in place of browsers blindly displaying whatever the
server delivers, you tend to find client application programs that
communicate with the server. Although most e-commerce clients continue
to screen-scrape so as to extract data from Web pages, there is growing
use of XML Web services as a way to deliver data to fat-client
applications. Just so, even though most Web services today continue to
be delivered by classic Web servers (Apache and Microsoft IIS, for
example), database systems are starting to listen to port 80 and
directly offer SOAP invocation interfaces. In this brave new world, you
can take a class—or a stored procedure that has been implemented within
the database system—and publish it on the Internet as a Web service
(with the WSDL interface definition, DISCO discovery, UDDI registration,
and SOAP call stubs all being generated automatically). So, the “TPlite”
client/server model is making a comeback, if you want it.
Application developers still have the three-tier and n-tier design
options available to them, but now two-tier is an option again. For many
applications, the simplicity of the client/server approach is
understandably attractive. Still, security concerns—bearing in mind that
databases offer vast attack surfaces—will likely lead many designers to
opt for three-tier server architectures that allow only Web servers in
the demilitarized zone and database systems to be safely tucked away
behind these Web servers on private networks.
Still, the mind spins with all the possibilities our suddenly broadened
horizons seem to offer. For one thing, is it now possible—or even
likely—that Web services will end up being the means by which we
federate heterogeneous database systems? This is by no means certain,
but it is intriguing enough to have spawned a considerable amount of
research activity. Among the fundamental questions in play are: What is
the right object model for a database? What is the right way to
represent information on the wire? How do schemas work on the Internet?
Accordingly, how might schemas be expected to evolve? How best to find
data and/or databases over the Internet?
In all likelihood, you’re starting to appreciate that the ride ahead is
apt to get a bit bumpy. Strap on your seatbelt. You don’t even know the
half of it yet.
Making sense of workflows. Because the Internet is a loosely coupled
federation of servers and clients, it’s just a fact of life that clients
will occasionally be disconnected. It’s also a fact that they must be
able to continue functioning throughout these interruptions. This
suggests that, rather than tightly coupled, RPC-based applications,
software for the Internet must be constructed as asynchronous tasks;
they must be structured as workflows enabled by multiple autonomous
agents. To get a better feel for these design issues, think e-mail,
where users expect to be able to read and send mail even when they’re
not connected to the network.
All major database systems now incorporate queuing mechanisms that make
it easy to define queues, to queue and de-queue messages, to attach
triggers to queues, and to dispatch the tasks that the queues are
responsible for driving. Also, with the addition of good programming
environments to database systems, it’s now much easier and more natural
to make liberal use of queues. The ability to publish queues as Web
services is just another fairly obvious advantage. But we find ourselves
facing some more contentious matters because—with all these new
capabilities—queues inevitably are being used for more than simple ACID
(atomic, consistent, isolated, durable) transactions. Most particularly,
the tendency is to implement publish/subscribe and workflow systems on
top of the basic queuing system. Ideas about how best to handle
workflows and notifications are still controversial—and the focus of ongoing research.
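The queue-on-a-table pattern described above can be sketched as follows (the names are illustrative, not any vendor's API; a production queue would also need locking and visibility rules):

```python
import sqlite3

# Minimal sketch of a transactional queue built on an ordinary table,
# the facility the article says major engines now package directly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)")

def enqueue(msg):
    conn.execute("INSERT INTO queue (body) VALUES (?)", (msg,))

def dequeue():
    # Take the oldest message, then delete it: FIFO semantics.
    row = conn.execute(
        "SELECT id, body FROM queue ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
    return row[1]

enqueue("send-invoice")
enqueue("ship-order")
print(dequeue())  # send-invoice
print(dequeue())  # ship-order
```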
The key question facing researchers is how to structure workflows.
Frankly, a general solution to this problem has eluded us for several
decades. Because of the current immediacy of the problem, however, we
can expect to see plenty of solutions in the near future. Out of all
that, some clear design patterns are sure to emerge, which should then
lead us to the research challenge: characterizing those design patterns.
Building on the data cube abstraction. Early relational database systems
used indices as table replicas to enable vertical partitioning,
associative search, and convenient data ordering. Database optimizers
and executors use semi-join operations on these structures to run common
queries on covering indices, thus realizing huge performance improvements.
Over the years, these early ideas evolved into materialized views (often
maintained by triggers) that extended well beyond simple covering
indices to enable accelerated access to star and snowflake data
organizations. In the 1990s, we took another step forward by identifying
the “data cube” OLAP (online analytic processing) pattern whereby data
is aggregated along many different dimensions at once. Researchers
developed algorithms for automating cube design and implementation that
have proven to be both elegant and efficient—so much so that cubes for
multi-terabyte fact tables can now be represented in just a few
gigabytes. Virtually all major database engines already rely upon these
algorithms. But that hardly signals an end to innovation. In fact, a
considerable amount of research is now being devoted to this area, so we
can look forward to much better cube querying and visualizing tools.
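The cube pattern can be sketched in a few lines: aggregate a tiny fact table along every subset of its dimensions at once (the data and dimension names are invented for illustration):

```python
from collections import defaultdict
from itertools import combinations

# Sketch of the OLAP CUBE pattern: one pass over the fact table fills in
# the aggregate for every combination of grouped and rolled-up dimensions.
facts = [
    ("2004", "east", 10), ("2004", "west", 20),
    ("2005", "east", 5),  ("2005", "west", 15),
]
dims = ("year", "region")

cube = defaultdict(int)
for year, region, sales in facts:
    row = {"year": year, "region": region}
    for r in range(len(dims) + 1):
        for subset in combinations(dims, r):
            # "*" marks a dimension that has been rolled up away.
            key = tuple(row[d] if d in subset else "*" for d in dims)
            cube[key] += sales

print(cube[("*", "*")])     # 50 (grand total)
print(cube[("2004", "*")])  # 30 (2004 across all regions)
print(cube[("*", "west")])  # 35 (west across all years)
```

The compression results mentioned above come from representing such cubes far more cleverly than this exhaustive dictionary, but the semantics are the same.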
Advent of Data Mining
One step closer to knowledge. To see where we stand overall on our
evolutionary journey, it might be said that along the slow climb from
data to knowledge to wisdom, we’re only now making our way into the
knowledge realm. The advent of data mining represented our first step
into that domain.
Now the time has come to build on those earliest mining efforts.
Already, we’ve discovered how to embrace and extend machine learning
through clustering, Bayes nets, neural nets, time-series analysis, and
the like. Our next step is to create a learning table (labeled T for
this discussion). The system can be instructed to learn columns x, y,
and z from attributes a, b, and c—or, alternatively, to cluster
attributes a, b, and c or perhaps even treat a as a time stamp for b.
Then, with the addition of training data into learning table T, some
data-mining algorithm builds a decision tree or Bayes net or time-series
model for our dataset. The training data that we need can be obtained
using the database systems’ already well-understood create/insert
metaphor and its extract-transform-load tools. On the output side, we
can ask the system at any point to display our data model as an XML
document so it can be rendered graphically for easier human
comprehension—but the real power is that the model can be used as both a
data generator (“Show me likely customers,”) and as a tester (“Is Mary a
likely customer?”). That is, given a key a, b, c, the model can return
the x, y, z values—along with the associated probabilities of each.
Conversely, T can evaluate the probability of some given value being
correct. The significance in all this is that this is just the start. It
is now up to the machine-learning community to add even better
machine-learning algorithms to this framework. We can expect great
strides in this area in the coming decade.
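A hedged sketch of the learning-table interface described above, with simple frequency counting standing in for a real mining algorithm (the point is the train/test/generate interface, not the statistics):

```python
from collections import defaultdict

# Train on (attribute -> label) pairs, then use the model both as a
# tester (probability of a value) and a generator (most likely value).
counts = defaultdict(lambda: defaultdict(int))

def train(a, x):
    counts[a][x] += 1

def prob(a, x):
    """Tester: estimated P(x | a)."""
    total = sum(counts[a].values())
    return counts[a][x] / total if total else 0.0

def predict(a):
    """Generator: most likely x given a."""
    return max(counts[a], key=counts[a].get) if counts[a] else None

for a, x in [("young", "buys"), ("young", "buys"), ("young", "skips"),
             ("old", "skips")]:
    train(a, x)

print(predict("young"))       # buys
print(prob("young", "buys"))  # 0.666...
```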
The research challenges most immediately ahead have to do with the need
for better mining algorithms, as well as for better techniques for
determining probabilistic and approximate answers (we’ll consider this
further in a moment).
Born-again column stores. Increasingly, one runs across tables that
incorporate thousands of columns, typically because some particular
object in the table features thousands of measured attributes. Not
infrequently, many of the values in these tables prove to be null. For
example, an LDAP object requires only seven attributes but allows
another 1,000 optional attributes.
Although it can be quite convenient to think of each object as a row in
a table, actually representing them that way would be highly
inefficient—both in terms of space and bandwidth. Classic relational
systems generally represent each row as a vector of values, even when
most of those values are null. Sparse tables created using
this row-store approach tend to be quite large and only sparsely
populated with information.
One approach to storing sparse data is to represent it as triples of
key, attribute, and value. This allows for
extraordinary compression, often as a bitmap, which can have the effect
of reducing query times by orders of magnitude—thus enabling a wealth of
new optimization possibilities. Although these ideas first emerged in
Adabas and Model 204 in the early 1970s, they’re currently enjoying a renaissance.
The challenge for researchers now is to develop automatic algorithms for
the physical design of column stores, as well as for efficiently
updating and searching them.
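The triple representation can be sketched as follows (the data is invented; a real column store would compress each column, often as a bitmap):

```python
# Sketch of the (key, attribute, value) idea: a sparse "row" with
# thousands of possible attributes stores only the attributes it
# actually has, so nulls cost nothing.
rows = {
    "obj1": {"cn": "Alice", "mail": "alice@example.org"},
    "obj2": {"cn": "Bob"},
}

triples = [(key, attr, val)
           for key, attrs in rows.items()
           for attr, val in attrs.items()]

# Column-wise access: pull one attribute across all objects without
# touching the (mostly null) remainder of each row.
mails = {k: v for k, a, v in triples if a == "mail"}
print(len(triples))  # 3 stored values, versus 4 slots row-wise
print(mails)         # {'obj1': 'alice@example.org'}
```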
Dealing with messy data types. Historically, the database community has
insulated itself from the information retrieval community, preferring to
remain blissfully unaware of everything having to do with messy data
types such as time and space. (Mind you, not everyone in the database
community has stuck his or her head in the sand—just most of us.) As it
turns out, of course, we did have our work cut out for us just dealing
with the “simple stuff”—numbers, strings, and relational operators.
Still, there’s no getting around the fact that real applications today
tend to contain massive amounts of textual data and frequently
incorporate temporal and spatial properties.
Happily, the integration of persistent programming languages and
environments as part of database architectures now gives us a relatively
easy means for adding the data types and libraries required to provide
textual, temporal, and spatial indexing and access. Indeed, the SQL
standard has already been extended in each of these areas. Nonetheless,
all of these data types—most particularly, for text retrieval—require
that the database be able to deal with approximate answers and employ
probabilistic reasoning. For most traditional relational database
systems, this represents quite a stretch. It’s also fair to say that,
before we’re able to integrate textual, temporal, and spatial data types
seamlessly into our database frameworks, we still have much to
accomplish on the research front. Currently, we don’t have a clear
algebra for supporting approximate reasoning, which we’ll need not only
to support these complex data types, but also to enable more
sophisticated data-mining techniques. This same issue came up earlier in
our discussion of data mining—data mining algorithms return ranked and
probability-weighted results. So there are several forces pushing data
management systems into accommodating approximate reasoning.
Handling the semi-structured data challenge. Inconvenient though it may
be, not all data fits neatly into the relational model. Stanford
University professor Jennifer Widom observes that we all start with the
schema <stuff/>, and then figure out what structure and constraints to
add. It’s also true that even the best-designed database can’t include
every conceivable constraint and is bound to leave at least a few things unstated.
How best to deal with that reality is a question that has sparked a huge
controversy within the database community. On one side of the debate are
the radicals who believe cyberspace should be treated as one big XML
document that can be manipulated with XQuery++. The reactionaries, on
the other hand, believe that structure is your friend—and that, by
extension, semistructured data is nothing more than a pluperfect mess
best avoided. Both camps are well represented—and often stratified by
age. It’s easy to observe that the truth almost certainly lies somewhere
between these two polar views, but it’s also quite difficult to see
exactly how this movie will end.
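The radicals' position in miniature, using Python's standard XML library on an invented document: data arrives as loosely structured `<stuff/>`, and queries navigate by path rather than by a fixed schema.

```python
import xml.etree.ElementTree as ET

# Semistructured data: the second person simply lacks an email element,
# with no null, no schema change, and no complaint from the store.
doc = ET.fromstring("""
<stuff>
  <person><name>Ada</name><email>ada@example.org</email></person>
  <person><name>Alan</name></person>
</stuff>""")

# Path-style query: names of people who also have an email.
names = [p.findtext("name") for p in doc.findall("person")
         if p.find("email") is not None]
print(names)  # ['Ada']
```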
One interesting development worth noting, however, has to do with the
integration of database systems and file systems. Individuals who keep
thousands of e-mail messages, documents, photos, and music files on
their own personal systems are hard-pressed to find much of anything
anymore. Scale up to the enterprise level, where the number of files is
in the billions, and you’ve got the same problem on steroids.
Traditional folder hierarchy schemes and filing practices are simply no
match for the information tsunami we all face today. Thus, a fully
indexed, semistructured object database is called for to enable search
capabilities that offer us decent precision and recall. What does this
all signify? Paradoxically enough, it seems that file systems are
evolving into database systems—which, if nothing else, goes to show just
how fundamental the semistructured data problem really is. Data
management architects still have plenty of work ahead of them before
they can claim to have wrestled this problem to the mat.
Historically aware stream processing. Ours is a world increasingly
populated by streams of data generated by instruments that monitor
environments and the activities taking place in those environments. The
instances are legion, but here are just a few examples: telescopes that
scan the heavens; DNA sequencers that decode molecules; bar-code readers
that identify and log passing freight cars; surgical unit monitors that
track the life signs of patients in post-op recovery rooms; cellphone
and credit-card scanning systems that watch for signs of potential
fraud; RFID scanners that track products as they flow through
supply-chain networks; and smart dust programmed to sense its surroundings.
The challenge here has less to do with the handling of all that
streaming data—although that certainly does represent a significant
challenge—than with what is involved in comparing incoming data with
historical information stored for each of the objects of interest. The
data structures, query operators, and execution environments for such
stream-processing systems are qualitatively different from what people
have grown accustomed to in classic DBMS environments. In essence, each
arriving data item represents a fairly complex query against the
existing database. The encouraging news here is that researchers have
been building stream-processing systems for quite some time now, with
many of the ideas taken from this work already starting to appear in commercial products.
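A sketch of the compare-against-history pattern, with a flat windowed average and an invented threshold standing in for a real detection model:

```python
from collections import defaultdict, deque

# Each arriving reading is, in effect, a query against that object's
# stored history: here, "is this value far above the recent average?"
history = defaultdict(lambda: deque(maxlen=100))

def observe(sensor_id, value, factor=2.0):
    past = history[sensor_id]
    # Alarm if the new value exceeds twice the historical average.
    alarm = bool(past) and value > factor * (sum(past) / len(past))
    past.append(value)
    return alarm

for v in (10, 11, 9, 10):
    observe("card-1234", v)
print(observe("card-1234", 100))  # True: far above this card's history
print(observe("card-1234", 12))   # False: back within normal range
```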
Triggering updates to database subscribers. The emergence of enterprise
data warehouses has spawned a wholesale/retail data model whereby
subsets of vast corporate data archives are published to various data
marts within the enterprise, each of which has been established to serve
the needs of some particular special interest group. This bulk
publish/distribute/subscribe model, which is already quite widespread,
employs just about every replication scheme you can imagine.
Now the trend in application design is to install custom subscriptions
at the warehouse—sometimes millions of them at a time. What’s more,
realtime notification is being requested as part of each of these
subscriptions. Whenever any data pertaining to the subscription arrives,
the system is asked to immediately propagate this information to the
subscriber. Hospitals want to know if a patient’s life signs change,
travelers want to know if their flight is delayed, finance applications
ask to be informed of any price fluctuations, inventory applications
want to be told of any changes in stock levels, just as information
retrieval applications want to be notified whenever any new content has appeared.
It turns out that publish/subscribe systems and stream-processing
systems are actually quite similar in structure. First, the millions of
standing queries are compiled into a dataflow graph, which in turn is
incrementally evaluated to determine which subscriptions are affected by
a change and thus must be notified. In effect, updated data ends up
triggering updates to each subscriber that has indicated an interest in
that particular information. The technology behind all this draws
heavily on the active database work of the 1990s, but you can be sure
that work continues to evolve. In particular, researchers are still
looking for better ways to support the most sophisticated standing
queries, while also optimizing techniques for handling the
ever-expanding volume of queries and data.
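The inverted matching described above can be sketched with a one-attribute subscription index (the names are invented; real systems compile far richer predicates into a shared dataflow graph):

```python
from collections import defaultdict

# Pub/sub inverts the usual ratio: standing queries are indexed up
# front, and each incoming update is matched against that index instead
# of queries scanning stored data.
subs_by_flight = defaultdict(list)

def subscribe(subscriber, flight):
    subs_by_flight[flight].append(subscriber)

def publish(update):
    # The incoming data item acts as a query against the subscriptions.
    return subs_by_flight.get(update["flight"], [])

subscribe("alice", "UA 100")
subscribe("bob", "UA 100")
subscribe("carol", "DL 7")
print(publish({"flight": "UA 100", "status": "delayed"}))  # ['alice', 'bob']
```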
Keeping query costs in line. All of the changes discussed so far are
certain to have a huge impact on the workings of database query
optimizers. The inclusion of user-defined functions deep inside queries,
for one thing, is sure to complicate cost estimation. In fact, real data
with high skew has always posed problems. But we’ll no longer be able to
shrug these off because in the brave new world we’ve been exploring,
relational operators make up nothing more than the outer loop of
nonprocedural programs and so really must be executed in parallel and at
the lowest possible cost.
Cost-based, static-plan optimizers continue to be the mainstay for those
simple queries that can be run in just a few seconds. For more complex
queries, however, the query optimizer must be capable of adapting to
varying workloads and fluctuations in data skew and statistics, while
also planning in a much more dynamic fashion—changing plans in keeping
with variations in system load and data statistics. For petabyte-scale
databases, the only solution may be to run continuous data scans, with
queries piggybacked on top of the scans.
The arrival of main-memory databases. Part of the challenge before us
results from the insane pace of growth of disk and memory capacities,
substantially outstripping bandwidth capacity and the current
capabilities for minimizing latency. It used to take less than a second
to read all of RAM and less than 20 minutes to read everything stored on
a disk. Now, a multi-gigabyte RAM scan takes minutes, and a terabyte
disk scan can require hours. It’s also becoming painfully obvious that
random access is 100 times slower than sequential access—and the gap is
widening. Ratios such as these are different from those we grew up with.
They demand new algorithms that let multiprocessor systems intelligently
share some massive main memory, while optimizing the use of precious
disk bandwidth. Database algorithms, meanwhile, need to be overhauled to
account for truly massive main memories (as in, able to accommodate
billions of pages and trillions of bytes). In short, the era of
main-memory databases has finally arrived.
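The arithmetic behind that claim, with illustrative rather than measured figures for the hardware of the time:

```python
# Back-of-the-envelope scan times: capacity has grown faster than
# bandwidth, so full scans now take hours, and random access is far
# worse than sequential.
def scan_time_s(capacity_bytes, bandwidth_bytes_per_s):
    return capacity_bytes / bandwidth_bytes_per_s

TB = 1e12
# A ~1 TB disk at ~50 MB/s sequential: about 5.6 hours to scan.
print(scan_time_s(1 * TB, 50e6) / 3600)
# The same disk read via random 8 KB requests (~100 per second):
# roughly two weeks, which is why sequential access dominates designs.
print(scan_time_s(1 * TB, 100 * 8192) / 86400)
```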
Smart devices: Databases everywhere. We should note that at the other
end of the spectrum from shared memory, intelligence is moving outward
to telephones, cameras, speakers, and every peripheral device. Each disk
controller, each camera, and each cellphone now combines tens of
megabytes of RAM storage with a very capable processor. Thus, it’s quite
feasible now to have intelligent disks and other intelligent peripherals
that provide for either database access (via SQL or some other
nonprocedural language) or Web service access. The evolution from a
block-oriented interface to a file interface and then on to a set of
service interfaces has been the defining goal of database machine
advocates for three decades now. In the past, this required
special-purpose hardware. But that’s not true any longer because disks
now are armed with fast general-purpose processors, thanks to Moore’s
law. Database machines will likely enjoy a rebirth as a consequence.
In a related development, people building sensor networks have
discovered that if you view each sensor as a row in a table (where the
sensor values make up the fields in that row), it becomes quite easy to
write programs to query the sensors. What’s more, current distributed
query technology, when augmented by a few new algorithms, proves to be
quite capable of supporting highly efficient programs that minimize
bandwidth usage and are quite easy to code and debug. Evidence of this
comes in the form of the tiny database systems that are beginning to
appear in smart dust—a development that’s sure to shock and awe anyone
who has ever fooled around with databases.
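The sensors-as-rows idea in miniature (readings invented): once each sensor is a row, familiar relational queries apply directly, and distributed query technology decides where the aggregation actually runs.

```python
# Each sensor is a row; its readings are the fields of that row.
sensors = [
    {"id": 1, "room": "lab",  "temp_c": 21.5},
    {"id": 2, "room": "lab",  "temp_c": 22.0},
    {"id": 3, "room": "hall", "temp_c": 19.0},
]

# Equivalent to: SELECT AVG(temp_c) FROM sensors WHERE room = 'lab'
lab = [s["temp_c"] for s in sensors if s["room"] == "lab"]
avg_lab = sum(lab) / len(lab)
print(avg_lab)  # 21.75
```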
Self-managing and always-up. Indeed, if every file system, every disk,
every phone, every TV, every camera, and every piece of smart dust is to
have a database inside, then those database systems will need to be
self-managing, self-organizing, and self-healing. The database community
is justly proud of the advances it has already realized in terms of
automating system design and operation. The result is that database
systems are now ubiquitous—your e-mail system is a simple database, as
is your file system, and so too are many of the other familiar
applications you use on a regular basis. As you can probably tell from
the list of new and emerging features enumerated in this article,
however, databases are getting to be much more sophisticated. All of
which means we still have plenty of work ahead to create distributed
data stores robust enough to ensure that information never gets lost and
queries are always handled with some modicum of efficiency.
STAYING ON TOP OF THE INFORMATION AVALANCHE
People and organizations are being buried under an unrelenting onslaught
of information. As a consequence, everything you thought was true about
database architectures is being re-thought.
Most importantly, algorithms and data are being unified by integrating
familiar, portable programming languages into database systems, such
that all those design rules you were taught about separating code from
data simply won’t apply any longer. Instead, you’ll work with extensible
object-relational database systems where nonprocedural relational
operators can be used to manipulate object sets. Coupled with that,
database systems are well on their way to becoming Web services—and this
will have huge implications in terms of how we structure applications.
Within this new mind-set, DBMSs become object containers, with queues
being the first objects that need to be added. It’s on the basis of
these queues that future transaction processing and workflow
applications will be built.
Clearly, there’s plenty of work ahead for all of us. The research
challenges are everywhere—and none is trivial. Yet, the greatest of
these will have to do with the unification of approximate and exact
reasoning. Most of us come from the exact-reasoning world—but most of
our clients are now asking questions that require approximate or probabilistic answers.
In response, databases are evolving from SQL engines to data integrators
and mediators that offer transactional and nonprocedural access to data
in many different forms. This means database systems are effectively
becoming database operating systems, into which various subsystems and
applications can be readily plugged.
Getting from here to there will involve many more challenges than those
touched upon here. But I do believe that most of the low-hanging fruit
is clustered around the topics outlined here, with advances realized in
those areas soon touching virtually all areas of application design.
Talks given by David DeWitt, Michael Stonebraker, and Jennifer Widom at
CIDR (Conference on Innovative Data Systems Research) inspired many of
the ideas presented in this article.
JIM GRAY is a Distinguished Engineer in Microsoft’s Scalable Servers
Research Group and is also responsible for managing Microsoft’s Bay Area
Research Group. He has been honored as an ACM Turing Award recipient for
his work on transaction processing. To this day, Gray’s primary research
interests continue to concern database architectures and transaction
processing systems. Currently, he’s working with the astronomy
community, helping to build online databases, such as
http://terraservice.net and http://skyserver.sdss.org. Once all of the
world’s astronomy data is on the Internet and accessible as a single
distributed database, Gray expects the Internet to become the world’s best telescope.
MARK COMPTON, who now runs a marketing communications consulting group
called Hired Gun Communications, has been working in the technology
market for nearly 20 years. Before going out on his own, he headed up
marketing programs at Silicon Graphics, where he was the driving force
behind the branding program for the company’s Indigo family of desktop
workstations. Over a four-year period in the mid-1980s, he served as
editor-in-chief of Unix Review, at the time considered the leading
publication for Unix software engineers.