Overview of client/server by Charlie Sauer. (LONG)

I Find Karma (adam@cs.caltech.edu)
Fri, 19 Apr 96 13:03:13 PDT


This is the c/s overview from that Distributed Computing
Environments book I was talking about. I found it to
be a good overview.

Adam

Client/Server Concepts and Technology
by Charles Sauer

1. Elemental Concepts

"Client/server" architectures have been one of the main themes of distributed
computing environments, but it seems that there are many inconsistent concepts
of client/server architectures.
Though "client" and "server" can be easily defined, the phrase "client/server"
is often used in confusing and seemingly contradictory fashions.
This is partly because there are many forms of clients and servers.
As much as anything, the confusion stems from the pervasive use of
client/server architectures in so many different environments; just as the
environments are so diverse, so are the specifics of the clients and servers.
The intent here is to clearly define the generic terms and list the
most important forms of clients and servers.

A "server" is a manager of one or more resources.
A "resource" may be easily identified and physical in nature, e.g., a
printer, or it may be hidden and abstract, e.g., an authentication database.
A "client" is a user of the server's resource(s).

Client/Server Roles: Process View

In the standard computer science sense, a process is a program running on a
processor.
Though much of the discussion of "client/server computing" has focused on
machines that are clients and machines that are servers, some of the most
important client and server concepts and technologies relate to servers and
clients that are individual processes.
Examples of important server processes include authentication servers,
authorization servers, location servers, name servers, and time servers;
these are fundamental to making distributed ("client/server") environments
fit together.

In the case of an authentication server, for example the one in the
Kerberos system, the server's resource is a database sufficient for
identifying individuals, machines and/or processes.
Clients of the server request evidence, say, a key, that they can use
to identify themselves to other servers to request other services.
The authentication server manages the database and handles the requests.
The same server may manage other resources that are not visible or not obvious,
as discussed below with regard to parallel processes and threading.

It is often the case that a client needs to deal with a series or nesting
of servers to use the primary resource of interest.
For example, in the case of a print request, the client may need to first
acquire credentials from an authentication server. Then when the print
request is presented to the print server, that server may consult an
authorization server to determine whether the authenticated client is
authorized to use the requested printer.

In general, "client" and "server" are concepts, but there are often physical
implementations that are perceived as the client and server.
Servers are resource managers and clients are users of those
resources.
The server may be a single process, collections of processes, a single
machine environment, or even multiple machines, for example, a loosely
coupled cluster of machines providing "compute" serving.
An application program on a compute server may itself be viewed as a server.
Similarly, clients may be single processes, multiple processes and/or entire
machine environments.

Client/Server Roles: Machine Views

Though most client/server roles can be described in terms of processes, there
is a tendency to think in terms of clients as the machines that run the
client processes and the servers as the machines that run the server
processes.
It is natural to think of small, desktop, end user machines as clients and
centralized, larger (e.g., floor standing) machines as servers.
However, there are clearly exceptions.
First, a physically small machine may be adequate, if not preferable, for
some kinds of servers, such as
a print server or other server with minimal disk storage requirements.
Second, the nature of the resources may cause apparent role reversals.

The classic role reversal example is an X Windows server.
The responsibility of the X server is to manage the physical terminal
resources, usually a keyboard, a mouse and a display.
So the X server process always runs on a (typically small) end user machine,
not on a centralized, shared machine.
Further, the X Windows client processes may well run on centralized shared
(typically larger) machines.
In the extreme case of a dedicated X terminal implementation, only the
server process runs on the X terminal hardware, and all of the client
processes, including the window manager process, run on centralized, shared
machines.

Just as client/server machines may be physically different, they may also
be architecturally different.
An NFS file server may be a mainframe, a minicomputer, or a personal computer.
An NFS (client,server) pair may be of the same type and processor architecture
or significantly different, such as the case of a DOS client running PC-NFS
and an NFS server running under MVS.

In many cases, a single machine may be both a client and a server.
This could be due to nested service requests, such as the print server
requesting authorization service as described above, or due to peer to peer
relationships, for example, file sharing between two workstations.

Other Granularity Issues

Assuming a process granularity, there are additional implementation issues:

Parallel Processes.
If a server is implemented as a single process, and it takes substantial
time to process a request, then clients may encounter significant waits
while earlier requests are satisfied.
If the resource is something that can only be used by one client at a time,
say, a printer, this is to be expected.
However, if the resource is a database suitable for concurrent access, then
significantly improved responsiveness can be obtained by providing multiple
parallel server processes, any of which can satisfy the client requests.

Multiplexor Processes.
However, having idle processes wastes system resources, so the number of
processes should be chosen to trade off responsiveness against wasted
resources.
Where the overhead of creating and terminating a server process is negligible
relative to the overall processing of a single request, as in, say, a bulk
file transfer request, then it is reasonable to have a multiplexor process
waiting to accept requests.
When a request is received by the multiplexor, then it can create a server
process to handle the request.

A typical example of a multiplexor process is the inetd process found
in many TCP/IP implementations in Unix operating systems.
The inetd process is prepared to deal with dozens of different kinds of
services that are each implemented by separate processes.
Without inetd, a system could easily have more than a hundred processes
waiting for requests and wasting system resources.
Since the requests handled by these services occur infrequently and are
typically long running when they occur, inetd creates a new process
for each request and then waits for another request.
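
As a sketch of this pattern in C (not the actual inetd source; the port
number and server program path are illustrative, and error handling is
omitted):

/* Multiplexor sketch: accept a connection, fork a server process
   to handle it, and go back to waiting for the next request. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_in addr;
    int s, conn;

    signal(SIGCHLD, SIG_IGN);            /* do not accumulate zombies */

    s = socket(AF_INET, SOCK_STREAM, 0);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(7500);         /* illustrative port number */
    bind(s, (struct sockaddr *)&addr, sizeof(addr));
    listen(s, 5);

    for (;;) {
        conn = accept(s, NULL, NULL);
        if (fork() == 0) {               /* child becomes the server */
            dup2(conn, 0);               /* the connection serves as */
            dup2(conn, 1);               /* stdin/stdout of the server */
            close(s);
            execl("/usr/sbin/exampled", "exampled", (char *)0);
            _exit(1);                    /* exec failed */
        }
        close(conn);                     /* parent awaits the next request */
    }
}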

Subprocess Granularity - Threads.
The use of dedicated processes for a server assumes that the supporting
hardware can afford the associated overhead and the clients can allow
the delays associated with process context switching.
In most operating systems, the full context of a process (processor
registers and memory) is sufficiently large that there is significant
overhead associated with saving and restoring contexts when
switching processes.
A "thread" is a separate execution context (processor state) sharing
memory addressing context (e.g., page table entries) with other threads
(execution contexts) within a process.
Since threads share memory and other resources, there is less overhead
of switching from one thread to another than switching from one process
to another.

It is often the case that servers are best implemented as threads
within a
single process, sharing memory to reduce overhead of context switching
and to improve responsiveness.
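
A minimal sketch of this structure using POSIX threads (the request type
and its processing are schematic):

#include <pthread.h>
#include <stdlib.h>

struct request {
    int conn;                  /* e.g., the client connection */
    /* ... other per-request state ... */
};

/* Handle one request; all threads share the process's memory, so the
   server's data structures are directly available to every worker. */
static void *serve(void *arg)
{
    struct request *req = arg;
    /* ... satisfy the request against shared, in-memory data ... */
    free(req);
    return NULL;
}

/* Dispatch each request to its own detached thread; switching among
   these threads does not require changing memory context. */
void dispatch(struct request *req)
{
    pthread_t t;
    pthread_attr_t a;

    pthread_attr_init(&a);
    pthread_attr_setdetachstate(&a, PTHREAD_CREATE_DETACHED);
    pthread_create(&t, &a, serve, req);
    pthread_attr_destroy(&a);
}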

Server Replication.
Since many clients depend on a server to get their work done, if a server
fails or becomes inaccessible, many clients' work will be disrupted.
To try to avoid disruptions, servers are often deployed with especially
robust hardware, with battery or generator backup power, and other provisions.
However, the operating system supporting the server is often a significant
point of failure, as may be the server itself.
To increase availability, it is often desirable to replicate a server on
multiple machines, so that if one server system fails, another can continue
to provide service.
The presence of replicated servers can also help with load balancing and
improve performance.
However, keeping replicated servers' data synchronous may
involve significant complexity and overhead, so in some cases it may be
preferable to work toward providing a single more robust server than
replicated servers.

Other Characteristics of Client/Server Architectures

The above discussion is largely from an implementation point of view.
End users typically don't want to know about implementation.
It is very easy for implementors to allow the client/server boundaries
to be visible to users, and this is often a mistake.
Rather, client/server implementations should usually be targeted at
making boundaries hidden from the end user.
Correspondingly, end users should typically not be aware of locations
of clients and servers; the implementor should strive for location
transparency.

In some senses, it is even more important that location transparency be
provided for application programmers.
Otherwise, applications will likely be inflexible when hardware and software
systems are augmented or otherwise changed.

Often, the boundaries and locations are most visible to the administrators
of client/server systems.
To some extent this is unavoidable; to gain the efficiency and flexibility
of multisystem architectures, one must be prepared to manage the component
systems and their relationships.
Nevertheless, if it is impractical to manage the systems, they will not be
usable, so extra attention must be paid to ease of distributed system
management.

2. Examples

The purpose of this section is to outline some of the many forms of clients
and servers currently deployed.

Structured Data Service

One of the best known forms of client/server implementation is a database
environment where the client is responsible for human interaction and for
formulating queries/updates
based on that interaction, and the server, say a SQL (Structured Query
Language) implementation, is
responsible for actually querying/updating the database.
However, the "database" need not necessarily be visible as such nor represented
in a relational model.

The cc:Mail electronic mail system is based on a database
which stores mail messages and directories of users.
This database, called a "post office," typically exists on a dedicated server
machine.
The mail messages supported by cc:Mail may be larger and more complex than
the simple text files usually supported by electronic mail systems.
The messages may include program object files, faxes, and bit maps
of other graphic images.
Since electronic mail messages are often sent to many addressees, it is
desirable to store messages in a way that multiple users have shared access
to the same disk copy of a message.

In cc:Mail, messages being created or retrieved are stored in private
client storage.
(The private client storage may actually exist on a file server, but for this
discussion it is reasonable to think in terms of memory and disk storage local
to the client.)
The client side of the application provides "item" (component) manipulation,
text editing, and component viewing.
These actions are performed on temporary local copies of a message.
The server provides retrieval of existing messages, a repository for existing
messages, forwarding services for messages destined to other post offices
(including gateways to other mail systems), and user directory services for
addressing.

Another example, primarily in the Unix environment, is the seemingly ubiquitous
"net news" and access to net news via NNTP (Network News Transfer Protocol).
Net news is a loosely managed federation of thousands of machines,
each of which provides access to thousands of shared bulletin boards.
The bulletin boards, called "news groups," each in turn contain many messages,
usually called "articles."
For example, one news group is "comp.arch," which is the
group in the collection of computer oriented ("comp") groups oriented toward
computer architecture ("arch") issues.

Without the use of NNTP, net news is typically used in a multiuser fashion
where each machine has a complete copy of the current articles in the groups
of interest to the users of that machine.
The articles are stored as individual files in the standard file system
hierarchy, e.g., the current files in comp.arch might be stored as

/news/comp/arch/885
/news/comp/arch/886
...
/news/comp/arch/1036

where articles older than number 885 have been "expired," and the most
recent article available is number 1036.

This approach was devised prior to NNTP. It works adequately for multiuser
systems with system administrators prepared to ensure that articles are
transferred properly to and from other systems, that articles are expired
properly, and that news groups are created/deleted properly.
However, it is a non-trivial effort to maintain such a
system, and most people who read/contribute ("post") net news do not want
to deal with the administration aspects.
Further, it is inefficient in both inter-machine communications and disk
storage to communicate and store the many megabytes of articles that are
created every day.

The solution for a typical user of a private desktop machine on a local area
network is not to have the storage and mechanism associated with multiuser
news systems.
Rather, a more convenient and efficient approach is to use NNTP based clients
for reading and posting news articles.
Here, "clients" is plural for two reasons, even assuming a single user:
First, there are at least two basic activities that a user would typically
perform, reading news and posting news, and a separate client program might
be used for each, for example, the Unix "rn" (read news) and "Pnews" (post
news) commands.
Second, there are many different user interfaces used for these same purposes:
the dedicated commands just named, interfaces integrated into editors, and
X Windows based interfaces.
That multiplicity of interfaces is irrelevant to the primary client-server
issues.

In a typical usage scenario, say invocation of rn using NNTP, rn contacts the
predesignated NNTP server to get an index of the current articles
in each of the news groups.
The NNTP server runs on a system that stores the articles/groups
in the file system as suggested above.
Thus the index is readily available when the client requests it.
Based on the index, the client will typically request a list of subject lines,
which the server can easily construct and provide, and then the contents of
each of several articles, which the NNTP server simply transfers to the client.
Similarly, when a user wants to post an article, he or she constructs it with
an editor, say invoked by Pnews, and Pnews sends the completed article to
the NNTP server which stores the article in the file system hierarchy.
(From there it is visible to other local users and NNTP clients, and gets
forwarded to other systems by traditional means or additional NNTP functions
not described here.)
The standard functions are easily provided, and require no permanent storage
or administration at the client.
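
For concreteness, a session between a news reader and an NNTP server might
look roughly like the following ("C:" and "S:" mark client and server lines;
the numeric codes come from the NNTP specification, while the accompanying
text varies by server):

S: 200 news.example.com NNTP server ready (posting allowed)
C: GROUP comp.arch
S: 211 152 885 1036 comp.arch
C: ARTICLE 1036
S: 220 1036 <id@site> article retrieved - head and body follow
   ...headers and text of the article, ended by a line with a lone "."...
C: POST
S: 340 send article to be posted
   ...client sends the composed article, ended by a lone "."...
S: 240 article posted ok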

Lotus Notes is an MS-Windows system analogous to net news, but it provides
additional features/functions, e.g., the capability to deal with more complex
data types than plain text.
But the underlying implementation concepts, from a client/server point of
view, are fairly similar to the NNTP architecture and implementation.

X Window System Servers and Clients

As discussed previously, the role of an X Window System server is to manage
the physical resources at the user machine, i.e., the graphics subsystem and
the input devices.
The system running the X server receives keyboard, mouse and other input,
which it encodes in the forms specified by the X protocol and passes
to the appropriate client process.
The client process may have a general role, for example, a window manager client
process manages the size/location/visibility of all windows, or a very specific
function, such as to copy an image from disk to the X server.
A client process supplies output to the X server in unstructured form, such as
a raw bit map, or in structured form, such as a character in a specific font,
and the server displays the output on the screen (assuming the client's window
is at least partially visible).

Internet Daemon (inetd) Services

In the Internet Protocol family, there are hundreds of defined/proposed
services, ranging from fundamental/widely used services such as the FTP
file transfer protocol, to well known but less frequently used services
such as the NTP Network Time Protocol, to relatively unknown and/or
experimental services.

Without inetd, a system would likely be able to practically support only a
small subset of these services.
(This is true, if for no other reason, because of the
many processes that would need to be running to respond to requests for
the various services.)
With inetd, there are configuration files: /etc/services, which lists the
services supported by the system and the TCP or UDP port numbers used to
establish client/server communication, and /etc/inetd.conf, which names the
program that actually implements each service.
The "file transfer protocol daemon," ftpd, is such a program.

When a client attempts to initiate an inetd controlled service, say FTP
file transfer, it attempts to establish a connection to the appropriate
port number at the server.
At the server, inetd responds to this request, initiates an ftpd process,
and passes the connection request on to the ftpd process.
The client and the ftpd proceed, perhaps transferring large files,
until the FTP session is done and the server ftpd process terminates.
This happens without further involvement of inetd.

The inetd role just described may be repeated many times in parallel,
for several concurrent file transfers, each with a separate ftpd process,
plus remote login and other services, or in series or in pretty much arbitrary
mixtures.
This happens with essentially no overhead when the services are
inactive and relatively low overhead in establishing resource intensive
services.
However, for services with high rates of requests, it is probably inappropriate
to use inetd.
It is more efficient to have a separate, dedicated process.

Diskless booting

In a typical TCP/IP environment, a machine with stable local storage (usually a
local disk) can retain its 32 bit internet address
to properly identify itself when it initiates network
activity.
However, a diskless workstation without
stable local storage needs a mechanism for determining its internet
address.
Since the machine does not have local storage, it further needs to use the
network to obtain a copy of the operating system before it can even bootstrap
the operating system.

Diskless booting in TCP/IP environments is typically accomplished by use of
the Reverse Address Resolution Protocol (RARP) and either the Trivial File
Transfer Protocol (TFTP) or the BOOTP protocol based on TFTP.
The diskless machine has some form of unique identification permanently
stored in read-only memory, e.g., Ethernet machines have the 48 bit Ethernet
address stored in the network controller.
With RARP, the diskless machine broadcasts a "who am I" query.
An RARP server on the local network receives the query, finds the internet
address in a table indexed by the unique identifier (e.g., 48 bit Ethernet
address) and returns the address to the diskless machine.
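
On many Unix systems the RARP server's table is simply the /etc/ethers
file, which maps hardware addresses to host names; the entries here are
illustrative:

8:0:20:1:9f:2a    diskless1
8:0:20:7:c4:33    diskless2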

Once the diskless client has its internet address, it can begin using IP
based protocols.
In particular, it can contact a "well known" (preconfigured) server to
obtain the operating system image via TFTP, or it can use the BOOTP
protocol to find a server that provides boot images and then obtain the
system image.
In either case, the system image is transferred directly into memory and
the bootstrap process begins much the same as on a machine with local disk.

Similar issues exist in other environments, and other protocols/services
exist for resolving those issues.
For example, in NetWare environments there is a NetWare specific diskless
boot protocol which is supported in the read-only memory of diskless
workstations.

File service

Almost certainly, the service most frequently used by clients is file service,
i.e., storage and retrieval of files on a file server in lieu of storage on
local disk.

In MS-DOS environments, file service is usually provided by Novell NetWare,
IBM LAN Server, Microsoft LAN Manager or Banyan VINES.
Though there are many differences in implementation and administration,
all of these are conceptually similar in providing additional lettered
disk drives that can be treated as if they were local.
For example, a file service implementation might provide a
virtual drive H:, resident on the file server, that is used as if it were a
local drive, e.g., the main C: drive.
Typically, though not exclusively, these products are oriented toward using
a dedicated server machine as the file server, i.e., the same machine is
typically not both a client and a server of file service.
Many (possibly hundreds of) client machines share the same file server.
The H: drive suggested above uses the same portion of the server file system
to support each of the client machines.

In small MS-DOS environments, so called "peer to peer" networks are common,
where the same machines are used as both clients and servers for file service.
Such file service systems may have advantages of economy, by avoiding the
machine cost of a dedicated server and the personnel cost of an administrator
for that server.

In Unix and similar environments, Sun's Network File System (NFS) is the
de facto standard for file service.
Architecturally, NFS allows general peer to peer capabilities, but is
typically used with (nearly) dedicated file servers.
In any case, the client machine "mounts" all or part of a server's file
hierarchy over part of its local hierarchy, just as it would mount a
local disk.
For example, if the server is to provide most applications, then initialization
of a client might include mounting a server's /usr/bin and /usr/lib
hierarchies over the corresponding local hierarchies.
Subsequent requests to open/read/write files in those hierarchies would
be redirected to the server.
In most cases, this would be transparent to the user/programs generating
the requests.
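
For instance, using the mount syntax that appears later in this chapter,
client initialization might issue commands such as (christy being a
hypothetical server name):

mount -F NFS christy:/usr/bin /usr/bin
mount -F NFS christy:/usr/lib /usr/lib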

Though predominant in actual usage, NFS has some well known limitations,
and some alternate products attempt to address those deficiencies.
Most promising of these, perhaps, is Transarc's
Andrew File System (AFS).
Since typical usage of file servers, especially in large scale environments,
is with dedicated servers, AFS was developed in a manner that enforced and
facilitated such environments.
Though more recent versions of AFS allow peer to peer characteristics, the
development history of separate roles has led to an architecture that
facilitates administration and support of large scale environments.

3. Prerequisite Technology and Concepts

With the perspective of the above introduction, we're ready to consider
how client/server applications are really constructed, that is, to
consider the technology that is prerequisite to building client/server
systems and to consider additional concepts associated with client/server
systems.

3.1 Programming Models - Remote Procedure Call

First, let's consider the programming models appropriate to client/server
systems. Many of the implementations of the examples described above
are based on "unstructured" client/server interaction, that is, there is
no obvious structure that is consistent from application to application,
outside of obvious primitives such as message passing between
threads/processes/machines.

However, an appropriate structure provides benefits of programmer efficiency
and program correctness, so it is desirable to at least look for higher level
structures.
By far, the most prevalent paradigm for more structured
client/server program development is the "remote procedure call" (RPC).

In the most simple form, the remote procedure call is an obvious generalization
of the traditional procedure call present in most programming languages.
The fundamental difference is that the calling procedure executes in one
thread/process/machine and the called procedure executes in another.
Because RPC is so analogous to the familiar mechanism, it is conceptually easy
to construct applications using RPC.

Nevertheless, there are at least two immediate consequences that must be
dealt with:
(1) There is no possibility for shared memory between the calling
procedure and the called procedure.
Thus any communication between calling
and called procedure must be via explicit parameters or return values; use
of global variables or analogous communication is not possible.
(2) The overhead of a remote procedure call is roughly four orders of
magnitude higher than that of a local call; the local overhead may be well
under a microsecond, but the remote overhead is likely on the order of several
milliseconds or more.
There is no known way to make either of these consequences really transparent
to the programmer.

The above implies that great attention must be paid to the handling
of parameters.
Simply adding all global/shared variables to a parameter list is not feasible.
Except for a few
simple variables, call by value is almost a necessity.
It is typically infeasible to use reference parameters.
Even with call by value, passing structures or arrays as parameters is likely
to be performance prohibitive.
Use of pointers as parameters is similarly problematic, for these pointers
imply access to memory on the remote machine.
Finally, if the calls are between heterogeneous processor architectures,
there needs to be provision for translating between the representations
used by the different processors, e.g., between different byte orderings
of multibyte data types.
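
As a small illustration in C, marshalling even a single 32 bit integer
requires explicit conversion to a network byte order that both ends agree
on; htonl and ntohl are the standard Internet conversion routines:

#include <arpa/inet.h>    /* htonl, ntohl */
#include <stdint.h>
#include <string.h>

/* Marshal a 32 bit value into a request buffer in network byte order,
   so big- and little-endian machines agree on its representation. */
size_t put_long(unsigned char *buf, uint32_t value)
{
    uint32_t net = htonl(value);
    memcpy(buf, &net, sizeof(net));
    return sizeof(net);
}

/* Unmarshal: convert from network byte order back to host order. */
uint32_t get_long(const unsigned char *buf)
{
    uint32_t net;
    memcpy(&net, buf, sizeof(net));
    return ntohl(net);
}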

There is another implementation characteristic that is not easily hidden from
programmers.
Programmers are used to referencing procedures symbolically.
However, symbolic references are typically reduced to numeric references
at compile time or module bind time.
In a simple remote context, without specialized support, resolution of symbolic
references is usually impractical, so programmers must know the numeric
identities of the procedures being called.

Interface Definition Languages

Because of all of the above issues, it is usually inappropriate to write
programs directly in C or equivalent languages, since
the programmer must handle the above issues in an unstructured manner.
Rather, it is more effective to use augmented versions of C (or equivalent),
typically called "interface definition languages," which provide specialized
constructs for specifying remote procedure calls and verifying that the above
issues are dealt with properly.

For example, a call using the HP/Apollo Network Interface Definition Language
might look like...

[uuid(334033030000.0d.000.00.87.84.00.00.00), version(1)]
interface bank {
    import "nbase.idl";

    typedef long int bank$acct_t;
    typedef char bank$acct_name_t[32];

    void bank$deposit(
        [in]  handle_t h,
        [in]  bank$acct_t acct,
        [in]  long int amount,
        [out] status_$t *status
    );
};

The "uuid" is a universally unique identifier, unique over all machines and
time, as discussed below.
The "import" statement is similar to the contextual "#include" of C and
other languages.
It does not result in a literal textual include, but allows predefined
declarations which can be used in several related programs.

Other RPC Issues

In a local procedure call, absent catastrophe, it is certain that the
called function will be executed exactly once.
Remote procedure calls are typically implemented with connectionless
(datagram) protocols such as UDP, so it is possible that networking
errors will cause a failure of the client machine to send the
request to the server, or of the server machine to return the results.
An RPC implementation will typically include a timer mechanism that will retry
calls/replies that are not properly acknowledged.
There is thus the potential that a call might be executed zero, one, two
or more times.

There are two common approaches to dealing with this:
1. Require called functions to be "idempotent," i.e., multiple calls to
the function have the same effect as a single call.
2. Require the RPC implementation to guarantee that any given function
will be called at most once.
Neither is fully satisfactory, and some RPC implementations allow for both.
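
The distinction can be sketched with two hypothetical remote operations;
the first may safely be retried, the second may not:

#include <stddef.h>

/* Idempotent: repeating the call rewrites the same block with the
   same data, so a duplicate execution does no harm. */
void write_block(int file_id, long block_no, const char *data, size_t len);

/* Not idempotent: a duplicated or retried call appends the record twice. */
void append_record(int file_id, const char *record, size_t len);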

In some applications, the synchronous delays of a strict call/return model
of RPC will result in unacceptable performance.
For example, file transfer applications would typically have low transfer
rates.
Thus, variations of RPC are often implemented and still considered RPC.
Rather than requiring 1-1 matching of calls/returns, an implementation may
allow multiple calls before a return is generated and corresponding coalescing
of returns.
The more an implementation deviates from a strict call/return model,
the less appropriate the name "remote procedure call."
Lacking a more suitable name, the name RPC is usually still used.

3.2 Location/Name Service

Program Views

For a client to actually use a service, it must know how to locate and name
it.
From a program point of view, locations and names will ultimately consist of
numeric identifiers.
Locations will typically be identified symbolically and numerically using
the conventions of the supporting network protocol.
For IP, symbolic names will be domain names (e.g., "cs.utexas.edu") and
numeric identifiers will be 32 bit integers in the classic formats.

Names of services may initially be symbolic, but ultimately, and perhaps
initially, will be numeric.
In any case, numeric identifiers are needed to efficiently name services
at the program level.
The numeric identifiers must be consistent, or at least not ambiguous,
across heterogeneous hardware, operating systems and networks.
Otherwise, attempts at interoperation between heterogeneous systems
and/or unification of distinct networks will be frustrated by the ambiguity
and/or inconsistency.
For example, if network A treats service 5 as a file transfer service
and network B treats service 5 as an authentication service, something
will have to change for the two networks to interoperate.

There are at least two approaches that are used to address this problem,
management of well known identifiers and use of universally unique identifiers.
With the well known identifier approach, identifiers are typically 32 bits
long and use of the identifiers is, at least partially, managed by some
central authority.
When a service is developed that is expected or demonstrated to be of general
interest, an available number is assigned to the service and that number is
publicized.
Some subset of the numbers is reserved for use without preregistration
(so that public services can be developed and so that private services can be
used without interference with public services) and some subset of the
numbers are reserved for temporary use.

With appropriate management and usage, use of well known identifiers as
described above is effective and appropriate.
However, there is inherent delay in reserving and publishing the identifier
of a new service. Further, it is often the case that human error causes two
services to have the same identifier.
This and other management problems become more troublesome as more services
become available and more networks become interconnected.

Universally unique identifiers are usually based on much larger integers,
say 128 bits, which are large enough to contain a network identification
field, e.g., a 48 bit Ethernet address, a time stamp and other information.
The time stamp must be fine enough in granularity (at least small number
of milliseconds, if not finer) that it is unlikely to be used more than once,
and large enough in potential value that it may span decades, if not more.
Such a time stamp is likely to be at least 40 bits (the number of seconds
in a year requires 25 bits to represent, plus, say, 10 bits to resolve to
milliseconds and 5 bits to cover 32 years) and, possibly, many more.
The remaining bits can be used to ensure uniqueness in case of colliding
time stamps and for classification purposes.
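
For illustration, the layout used for OSF DCE universally unique
identifiers is approximately the following (a sketch in C; field widths
follow the published DCE layout):

#include <stdint.h>

/* 128 bit universally unique identifier: a 60 bit time stamp split
   across three fields, a clock sequence to disambiguate colliding time
   stamps, and a 48 bit node (e.g., Ethernet) address for uniqueness
   across machines. */
struct uuid {
    uint32_t time_low;                 /* low 32 bits of the time stamp */
    uint16_t time_mid;                 /* middle 16 bits */
    uint16_t time_hi_and_version;      /* high 12 bits plus a version */
    uint8_t  clock_seq_hi_and_reserved;
    uint8_t  clock_seq_low;
    uint8_t  node[6];                  /* IEEE 802 (Ethernet) address */
};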

Though the unique identifier is generated on a particular machine, there
is nothing to say that another machine cannot use the same identifier
to describe the same service, as long as it is told to do so.
It is intended that the use of network addresses and time stamps is
primarily to ensure uniqueness, and that these will be opaque to
most uses of the identifiers.
However, there are clearly concerns for the overhead of storing, manipulating
and communicating 128 bit values, both amongst machines and programmers.
(If well known identifiers are chosen starting at 0, then it is likely that
a programmer can accurately remember the "32 bit" numeric identifier of
a service, but it is highly unlikely that a programmer can accurately
remember a 128 bit unique identifier.)
Also, most existing systems have started based on small, well known identifiers,
so the use of unique identifiers typically involves retrofitting,
reengineering and/or coexistence of the two approaches.

User Views

Users also need ways to at least name/locate objects and services.
This is usually done with symbolic references that are intended to be
relatively easy for people to use.
These names need to be general enough that they can be used across
interconnected networks.
For example, there are hundreds of thousands of machines connected on the
Internet, so giving each one a name that is both simple and unique is
impractical.

On the Internet, machine names are hierarchical, with the rightmost, highest
component either giving the organization type or the geographical entity
containing the machine.
Typical organization types include COM, EDU, GOV and ORG for commercial,
educational, governmental and non-profit organizations, respectively.
The next component in the hierarchy indicates the name of the organization,
e.g., Dell.COM, UTexas.EDU, LANL.GOV or OSF.ORG.
Depending on the size/structure of the organization, there may be additional
qualifiers within the organization, e.g., Austin.IBM.COM or CS.UTexas.EDU.
The full set of organization qualifiers is the "domain name."
Similarly, where the name is to be geographically based, locality qualifiers
are added to the major geographic entity to produce a domain name.
Finally, there is a simple machine name, e.g., CHS, which combined with the
domain name provides a unique name for the machine, e.g., CHS.Dell.COM.

Similarly, naming of files requires a hierarchical approach or some other
convention to simplify naming a diverse set of objects.
Besides syntactic issues and constraints on components of a hierarchical name,
one of the main distinctions amongst hierarchical approaches is whether
there exists one hierarchy or many parallel hierarchies.

In Unix systems and directly related systems, there is a single hierarchy
that spans different devices.
The top ("root") file directory of the hierarchy is named with a forward
slash "/" character.
Additional directories, possibly on different devices, are separated by
additional slashes until the actual file is named.
"/news/comp/arch/303" names a file known as "303" within the "arch" directory,
which is in turn found in the "comp" directory, which is in turn found in the
"news" directory, which is found in the root directory.

In MS-DOS and other systems, there are several disjoint hierarchies,
usually corresponding to distinct devices, such as the "C:", "D:", ...
convention used in MS-DOS.
Within a given hierarchy, things are conceptually the same as the above
example, e.g., "C:\WINDOWS\WIN.INI" names a Microsoft Windows configuration
file found in directory WINDOWS found in the root directory of the C: device.

Sometimes it is desirable and/or necessary for a user to name both an entity
on a machine and the machine itself.
A good example is naming another user on another machine for purposes of
sending mail, e.g., postmaster@Eng.Sun.COM.
In this case, explicit identification of both the entity and the machine
is appropriate.
However, it is usually desirable to just name an entity without needing to
know/cite the location.
Thus aliasing conventions are often provided to simplify use of a service.

Given an aliasing convention, it is necessary to be able to find the real
identity of an entity, for example to determine that 143.166.1.39 is the
Internet address of chs.dell.com.
It is frequently desirable to be able to find the aliases
for a given entity, e.g., to determine the name of a machine given
an Internet address.
It may also be desirable/necessary to find additional
information about an entity, say, whether a machine provides printing service.

For frequently used services, especially those with relatively stable
names, aliases and services, it is usually desirable to have the information
directly available on a given machine so that local activities can be
performed without network accesses.
For example, a machine using TCP/IP typically has, at least, its own name
and Internet address configured in a local "hosts" file.
(As discussed before, this may not be true of a diskless machine.)
Often, a hosts file will have the names and Internet addresses of other
frequently accessed machines, but this is a hazardous practice, because
the file will likely not be updated as quickly as machine names and addresses
change.
"Name" servers are used in such situations so that local files listing
name/addresses are not needed.
Typically a machine will be configured with the addresses of several
name servers, which are used when a name/address is not found in the
local hosts file.
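
In C on a typical TCP/IP system, this lookup path (local hosts file and/or
configured name servers) is hidden behind a single library call; the host
name here is the one used as an example above:

#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>

int main(void)
{
    struct hostent *h = gethostbyname("chs.dell.com");
    struct in_addr a;

    if (h == NULL) {
        fprintf(stderr, "lookup failed\n");
        return 1;
    }
    memcpy(&a, h->h_addr_list[0], sizeof(a));
    printf("%s is %s\n", h->h_name, inet_ntoa(a));  /* e.g., 143.166.1.39 */
    return 0;
}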

In general, name servers and similar servers can be used for a variety of
other purposes.
The Network Information Service associated with Sun's NFS/ONC is used, not
only for host names/addresses as described above, but also for user names/ids,
group names/ids, passwords, and so forth.

With file systems, other static mechanisms, often called "remote mounts" in
the sense of Unix device mounts, are used to graft parts of remote file
system hierarchies onto local hierarchies.
For example, instead of mounting the "/usr" hierarchy of applications and
libraries from a local disk, it might be mounted from a file server such
that reference to a seemingly local file, say, /usr/ucb/mail, actually
references a remote file.

Time Service

Just as a collection of clocks in a house is likely to have a different
time on each clock, a collection of computers is likely to have a different
time setting for each machine.
Differences of a few minutes may be insignificant for household clocks, but
may be quite significant for computer time keeping.
Some application programs, e.g., the Unix "make" utility, base important
decisions on time stamps of files.
("make" is used to create up to date executable programs with a minimum
of effort by comparing time stamps and only translating sources that
are more recently modified than the current copies of objects dependent
on those sources.)
Some security services, e.g., the Kerberos authentication system described
below, use time stamps to avoid certain kinds of attempts at breaching security.

There are a variety of protocols and services for synchronizing clocks
on a network of machines, none of which are pervasively used.
Perhaps the one most commonly used in the Internet community is the Network Time
Protocol (NTP).
Several public domain implementations of NTP exist.
Digital Equipment's Digital Time Service (DTS) provides a superset of NTP's
capabilities.
DTS is part of the OSF Distributed Computing Environment (DCE), so it may
become more widely used as DCE is deployed.

Authentication and Authorization

In single user, standalone systems, authentication and authorization of
users is usually entirely a matter of physical access security, typically
either access to the location of the system or mechanical/electrical
locking of the system. In some environments, boot passwords or other means
are used in addition to or instead of physical security, but if physical
security can be compromised, then these means are usually easily compromised.
In a single user system there is little need for authentication and
authorization at finer granularity, except, potentially, to protect
the user from accidentally causing harm to the system.

In a multiuser, standalone system, physical security is usually necessary
but not sufficient.
In addition, users must authenticate themselves, typically, using passwords.
Different users may have different privileges, e.g., an administrator
may have nearly unlimited powers and other users will have lesser
authority.
But these authentication and authorization mechanisms are likely to be
fairly simplistic, because the system security can rely on some level
of physical security.

In a distributed system, there is little, if any, physical security.
In most physical networks, it is sufficient to gain physical access to
a single system on the network, or just an unused network connection point,
to have very strong capabilities.
As of this writing, installed systems often depend on physical security that
does not really exist.
It is nearly trivial to get a personal workstation to misrepresent itself
as having a different Ethernet address, or a different Internet address,
than the one assigned to it.

Thus, more robust authentication mechanisms are usually desirable, if not
mandatory.
Usually, these systems are based on use of encryption.
In Internet environments, the most frequently used are the Sun "secure RPC"
mechanism and the "Kerberos" system from MIT Project Athena.
Both of these depend on encrypted exchanges between machines to ensure
that each machine has confidence it knows the identity of the other.
There are significant differences in both implementation approaches
and in generality, but both systems are oriented toward authenticating
users of client machines more than the machines themselves.
Thus the systems can depend on user supplied passwords as the basis
for the encryption keys.
The Kerberos system is gaining increased usage in the industry, nearly to
the point of gaining de facto standard status.
Research continues into both determining the limitations of the current
Kerberos architecture/implementation and into providing improved
architectures and implementations.

Once a user is authenticated, there remains the question of determining
what the user is authorized to do.
The authority of users typically revolves around files, but extends to
other objects as well.
There are two general approaches, either to use permission structures
based on granting coarse rights such as read/write/execute to users
and/or groups of users, or to use access control lists to gain fine
granularity control of the rights of each user with regard to each object.
Relatively coarse granularity is much simpler to explain and administer,
but usually does not provide adequate control for heterogeneous environments.
But finer granularity mechanisms are often not used at all, or not used
effectively.
Systems such as the Sun Network Information Service (AKA "yellow pages")
and MIT Project Athena depend on coarse granularity as is traditional in
Unix-based systems.
Access control lists are being provided in newer environments, such as the
OSF DCE, and will likely become more widely used in the future.

4. Service Implementation Issues

There are some implementation issues that are such a fundamental part of
client/server computing that further discussion is appropriate.
(Some of these issues have been the source of much controversy amongst
implementors and users of services.)

File Service Transparency of Naming/Administration

Naming of file hierarchies and administration of that naming are among the
most visible distinctions amongst systems.
Some systems require that locations of files be totally hidden from not
only the user, but even the administrator, in so far as possible.
In these systems, the implementation usually forces each system amongst a
group of systems to see exactly the same hierarchy.
This provides a very consistent view for the user, which is usually
desirable, but precludes administrators from creating custom hierarchies
visible only to a subset of machines.
The LOCUS architecture is one example of such a system.

Some systems require the administrator to be aware of the locations of
files, but allow machines to be configured such that end users are not
aware of whether files are local or remote.
The remote mount facilities of NFS and the mapping facilities of NetWare
are examples of this approach.
In principle, it is possible to create a totally uniform view such that
each machine in the network sees exactly the same hierarchy, just as described
above, but this is usually not easily accomplished.
For example, in NFS, the administrator might configure each machine such that
it mounts /usr from a file server, e.g.,

mount -F NFS christy:/usr /usr

but other parts of the hierarchy would differ unless such mounts were performed
to encompass all of the hierarchy.

Some systems, such as Apollo Domain or Transarc's AFS, explicitly include
location information in the user naming conventions, possibly with unique
syntax to distinguish machine names from directory names.
For example, in the OSF DCE variant of AFS, an explicit reference might be
of the form

/.../christy/usr

In systems with this approach, there are usually aliasing mechanisms or
other shortcuts that allow users to ignore location names most of the time.
For example, in a Unix system, /usr might be a symbolic link pointing to
/.../christy/usr.
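
In command form, an administrator might establish such an alias with:

ln -s /.../christy/usr /usr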

Caching/Consistency

As is often said, there is no such thing as a free lunch.
Even though a server may be able to provide a service more efficiently,
more consistently, ... than if a client attempted to handle the function
locally, there is inherent networking overhead in the client making the request,
in the server receiving the request, in the server sending a response and
in the client receiving a response.
For some requests, this overhead may dominate the time it actually takes
to execute the service.
Further, some of these requests may be repetitions of previous requests
which result in repetitions of the previous responses.

For such requests, it is desirable to try to have clients retain local copies
of the responses and reuse the local copies rather than repeatedly request
them from the server.
For example, a client might retain results of (name,address) queries of a
name server, and before querying the server for such lookups, search to
see if the results are available in a local cache.

For some cached items, where information is unlikely to change frequently
and/or it is relatively harmless if the cached information is incorrect,
it is reasonable not to require the cache to be consistent with
the server.
For example, if a (name,address) pair became invalid, using the invalid
information is likely to be quickly discovered, with little inconvenience
or problem.
If the cache is flushed periodically, causing acquisition of up to date
results, then this is probably sufficient.
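
A sketch in C of such a cache with periodic expiration (the structure and
time-to-live are illustrative):

#include <stdint.h>
#include <string.h>
#include <time.h>

#define TTL (30 * 60)                  /* flush entries after 30 minutes */

struct cache_entry {
    char     name[64];
    uint32_t address;                  /* 32 bit internet address */
    time_t   expires;                  /* time at which the entry is stale */
};

/* Return the cached address, or 0 if absent or expired, in which case
   the caller queries the name server and refreshes the cache. */
uint32_t cache_lookup(struct cache_entry *tab, int n, const char *name)
{
    time_t now = time(NULL);
    int i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0 && now < tab[i].expires)
            return tab[i].address;
    return 0;                          /* miss: ask the name server */
}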

However, for some cached items, it is crucial that the information be
correct.
For example, if data from a file server is cached, and several clients
try to update the same data, either there must be cache consistency
mechanisms in place or the concurrent updates may cause serious
data inconsistencies.
Thus file services usually include some policy or mechanism that forces
client data to be consistent with server copies.
Without such a policy, applications/users must be aware of potential
inconsistencies and limit usage accordingly.
Further, the client implementation is likely to be much more conservative
about caching, avoiding aggressive read-ahead and write-behind practices,
and performance is suboptimal.

One approach to file caching is to only cache data that is not being shared
amongst clients at the present time, i.e., only one client machine has
processes with the file open.
Another is to only cache data that is from files opened in a read-only mode.
With applications that are careful to only request write authority when
needed, the combination of these two approaches covers a wide range of usage.
However, to best cover all likely usage, many implementations include
mechanisms to allow caching in more general conditions.
One approach is to provide write access to a single machine on demand,
revoking that access (while demanding updated pages) from the machine
previously having write access.
This approach is usually implemented conceptually using tokens, as in the LOCUS
architecture.
(The Modified/Exclusive/Shared/Invalid (MESI) protocols used to enforce
processor cache consistency in multiprocessor hardware are conceptually
similar.)
An alternative approach is to require clients to synchronously send writes
to the server when multiple client machines seek write access.

Replication

Because clients cannot function without certain prerequisite services, it
is necessary that these services be made as robust and continuously available
as practical.
However, it is necessarily the case that due to network, hardware or software
failures, any given server can become inaccessible or otherwise unusable.
A variety of replication schemes are used to limit disruption from
loss of servers.

For some servers, it is easy to provide replicas.
For example, print service may depend on multiple printers and server
machines, with relatively little implementation effort or run time overhead.
But in most instances, there is need to provide some level of consistency
between the replicas.
In nearly all cases, there is a primary, updatable copy of the
server's stored information, plus several secondary read-only replicas.
Queries can be satisfied by either the primary or the secondary versions,
but updates can only be performed by the server with the primary version.
If the primary server is unavailable, then updates are not possible.

The hard questions arise as to whether the secondary copies must be kept
up to date atomically, i.e., whether it is acceptable or not for a query
to get different data from different servers at any given instant, and,
if the secondary copies must be kept fully synchronized with the primary
copies, how this can be accomplished most efficiently.
In addition, there must be mechanisms to allow one of the secondary versions
to become primary in case of extended unavailability of the original
primary.

For some services, atomic consistency is clearly not a requirement.
For a name server, it is entirely reasonable that it give a stale response
to a query, assuming that either (a) the stale response is still acceptable,
(b) upon using the stale response, the client will be directly informed of
the current response, or (c) the stale response will result in failure
recognizable by the client, and the client can retry the process until it
gets a usable response.
When atomic consistency is not required, then the primary server can provide
bulk updates to the secondary servers on a periodic basis.
If a secondary must take over for an unavailable primary, then there may
be some lost updates, but this loss is presumed recoverable and/or acceptable.

For other services, say a financial data base, inconsistency of replicas is
clearly unacceptable.
Much research has gone into algorithms for providing both atomic consistency
and performance, but there is no apparent way to avoid significant overhead
in maintaining consistency.
Thus even file service implementations will often depend on acceptability of
inconsistency between primary and secondary copies of primarily read only
files such as program executables.

5. Further Reading

One of the earlier influential papers on client/server computing is the
discussion of the Grapevine system at Xerox Palo Alto Research Center [BIRR82].
For another view, see [MICR91].

For more information on remote procedure call, see the original thesis by
Nelson [NELS81], the discussion of SUN RPC in [SAND85] and
the Apollo Network Computing Architecture [DINE87].

Most of the work on use of encryption for authentication traces back to
Needham and Schroeder [NEED78]. See, also, [TAYL86] for discussion of
Sun's Secure RPC and the original Kerberos paper [STEI88].

We have largely omitted discussion of implementations of file servers.
The LOCUS system is described in detail by Popek and Walker [POPE85].
See [SAND85] for the original description of NFS, and [HOWA88] and
[KAZA90] for discussion of the Andrew File System.
See [NOVE90] for discussion of Novell NetWare.

REFERENCES

[BIRR82] A.D. Birrell, R. Levin, R.M. Needham, M.D. Schroeder, "Grapevine:
An Exercise in Distributed Computing," Communications of the ACM 25, 4
(April 1982).

[BLOO86] J.M. Bloom and K.J. Dunlap, "A Distributed Name Server for the
DARPA Internet," Usenix Conference Proceedings (June 1986).

[DYER88] S.P. Dyer, "The Hesiod Name Server," Usenix Conference Proceedings
(February 1988).

[HOWA88] J.H. Howard, "An Overview of the Andrew File System," Usenix
Conference Proceedings (February 1988).

[KAZA90] M.L. Kazar, B.W. Leverett, O.T. Anderson, V. Apostolides,
B.A. Bottos, S. Chutani, C.F. Everhart, W.A. Mason, S. Tu, and E.R. Zayas,
"DEcorum File System Architectural Overview," Usenix Conference Proceedings
(June 1990).

[DINE87] T.H. Dineen, P.J. Leach, N.W. Mishkin, J.N. Pato, and G.L. Wyant,
"The Network Computing Architecture and System: An Environment for Developing
Distributed Applications," Usenix Conference Proceedings (June 1987).

[MICR91] Microsoft Corporation, "Downsizing Corporate Information Systems:
An Overview of Client-Server Computing in the Enterprise-Wide Environment,"
part number 098-19369, May 1991.

[NEED78] R.M. Needham and M.D. Schroeder, "Using Encryption for
Authentication in Large Networks of Computers," Communications of the ACM
21, 12 (December 1978).

[NELS81] B.J. Nelson, Remote Procedure Call, Ph.D. Dissertation, Report
CMU-CS-81-119, Carnegie-Mellon University, Pittsburgh, PA, 1981.

[NOVE90] Novell Corporation, NetWare System Interface Technical Overview,
Addison-Wesley, 1989.

[POPE85] G. Popek and B. Walker, The LOCUS Distributed System Architecture,
MIT Press, 1985.

[SAND85] R. Sandberg, D. Goldberg, S. Kleiman, D. Walsh and B. Lyon,
"Design and Implementation of the Sun Network File System," Usenix
Conference Proceedings (June 1985).

[SCHE87] R.W. Scheifler and J. Gettys, "The X Window System," ACM
Transactions on Graphics 5, 2 (April 1986).

[STEI88] J.G. Steiner, B.C. Neuman and J.J. Schiller, "Kerberos: An
Authentication Service for Open Network Systems," Usenix Conference
Proceedings (February 1988).

[STEV90] W.R. Stevens, Unix Network Programming, Prentice-Hall, 1990.

[SUN86] Sun Microsystems, Yellow Pages Protocol Specification, 1986.

[TAYL86] B. Taylor and D. Goldberg, "Secure Networking in the Sun
Environment," Usenix Conference Proceedings (June 1986).