Canonical argument for (and against) layered protocol design: RFC 817

Rohit Khare
Wed, 1 Jul 1998 04:31:53 -0700


Dave Clark had much of the argument down in July 1982 when this RFC was
published. It's one of the few cogent published discussions of the
tradeoffs in choosing implementation boundaries I've come across.

Is there a better, more canonical paper on layering as an abstract
technique in its own right?

To me, the argument for layering can be visualized as a lattice of muxes.
Consider a "monolithic" solution with N inputs and outputs, a fully
connected evaluator that simultaneously considers all of its inputs and
generates all N outputs. It's a long, wide rectangle: the box represents
N^2 complexity. Layering is about achieving that same computation by
"abstracting" some of the inputs: using them at higher layers to control
the muxing of the (now smaller) set of signals in from the bottom. The
trick is that the total "surface area" (complexity) of the muxers in a
stack is less than the one big one.

Think about how complex the monolithic code would have to be to react to
any IP packet pattern and discern that it was a filesystem action, a web
transfer, or a RealAudio stream; separation of concerns is the only way
such a system could be understandable at all.

          N bits out
      |  |  |  |  |  |
     __________________
    |                  |
    |     N^2 mux      |
    |__________________|
      |  |  |  |  |  |
          N bits in

Split into two layers:

           N/2 out
         ___________
    k --|  .25 N^2  |--   (where n, k are
        |___________|      "more abstract"
          |  |  |  |       control-flow bits)
         ___________
    n --|  .25 N^2  |--
        |___________|
          |  |  |  |
           N/2 in
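The surface-area claim checks out arithmetically. Under the toy cost model
above (a fully connected mux over n signals costs n^2; the model is mine,
not the RFC's), splitting one big mux into a stack of smaller ones shrinks
the total:

```python
def mux_cost(n):
    """Toy cost model: a fully connected mux over n signals costs n^2."""
    return n * n

def layered_cost(n_bits, layers):
    """Split n_bits evenly across `layers` stacked muxes and sum the cost."""
    per_layer = n_bits // layers
    return layers * mux_cost(per_layer)

N = 16
print(mux_cost(N))           # monolithic:  16^2        = 256
print(layered_cost(N, 2))    # two layers:  2 * 8^2     = 128
print(layered_cost(N, 4))    # four layers: 4 * 4^2     = 64
```

Each doubling of the stack halves the total surface area, at the price of
the fixed interfaces between layers that Clark warns about below.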

Ah, this is a lame explanation. All I can say is that in my intuition,
entropy is like toothpaste: you can squeeze it around, from data format to
interpreter, from client to server, but it's irreducible. Except by
layering, a divide-and-conquer that actually reduces the total volume of
complexity.
Anyway, on to the genius of Dave Clark. Highlights include:

In fact, this RFC will argue that modularity is
one of the chief villains in attempting to obtain good performance, so
that the designer is faced with a delicate and inevitable tradeoff
between good structure and good performance. Further, the single factor
which most strongly determines how well this conflict can be resolved is
not the protocol but the operating system.

[Here's why systems like ubernet (in Infospheres) can't ever be useful
outside laboratory settings]

Many operating systems take a substantial fraction of a millisecond just to
service an interrupt. If the protocol has been structured as a process, it
is necessary to go through a process scheduling before the protocol code
can even begin to run. If any piece of a protocol package or its data must
be fetched from disk, real time delays of between 30 to 100 milliseconds
can be expected. If the protocol must compete for cpu resources with other
processes of the system, it may be necessary to wait a scheduling quantum
before the protocol can run. Many systems have a scheduling quantum of 100
milliseconds or more. Considering these sorts of numbers, it becomes
immediately clear that the protocol must be fitted into the operating
system in a thorough and effective manner if anything like reasonable
throughput is to be achieved.

[Where can you build something like Telnet into a system? HTTP? A
stock-quote service piled even higher on the stack, atop HTTP? should any
of these be loadable kernel modules?]

There are normally three reasonable ways in which to add a protocol to an
operating system. The protocol can be in a process that is provided by the
operating system, or it can be part of the kernel of the operating system
itself, or it can be put in a separate communications processor or front
end machine.

[Of course, the risk is precisely that monolithic systems are more complex:
one muxer is more complex than two]

Thus, the programmer who is forced to implement all or part of his protocol
package as an interrupt handler must be the best sort of expert in the
operating system involved, and must be prepared for development sessions
filled with obscure bugs which crash not just the protocol package but the
entire operating system.

[Section 4 is recommended in general for its consideration of several
hypothetical slices: at IP, at the upper interface of TCP, above Telnet;
and beyond. Section 5 considers the scheduling of one process per network
interface, or one process per logical TCP connection; introduces the
interdependencies for decent application performance. For Telnet, the TCP
driver needs to wait a few milliseconds in case there's another character
to send to conserve packets by piggybacking its ack on it; the same driver
for file transfer should never dally, lest the sending window narrow. It's
just like the tradeoff in OS modularization, like filesystems -- which any
high-performance database system has to circumvent.]
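The Telnet-vs-file-transfer tension can be sketched as a single coalescing
policy with a per-connection dally knob (this is my toy model of the
behavior the RFC describes, not real TCP code): bytes arriving within the
dally window piggyback on the pending packet, while dally=0 sends each one
immediately.

```python
def packetize(events, dally):
    """events: list of (timestamp_seconds, byte). Coalesce bytes whose
    inter-arrival gap is within `dally` seconds into one packet;
    dally=0 emits one packet per byte (never wait, keep the window full)."""
    packets = []
    current = [events[0][1]]
    last_t = events[0][0]
    for t, b in events[1:]:
        if t - last_t <= dally:
            current.append(b)     # piggyback on the pending packet
        else:
            packets.append(current)
            current = [b]
        last_t = t
    packets.append(current)
    return packets

# Two quick keystrokes, then a pause before the newline.
keystrokes = [(0.000, "h"), (0.002, "i"), (0.050, "\n")]
print(packetize(keystrokes, dally=0.005))  # [['h', 'i'], ['\n']] -- 2 packets
print(packetize(keystrokes, dally=0.0))    # 3 packets, one per keystroke
```

A Telnet driver wants the 5 ms dally to conserve packets; a file-transfer
driver wants zero, since every delay narrows the effective sending window.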

[For MsgList:] "The concept of a protocol is still unknown and frightening
to most naive programmers." [Also] "TCP is a more complex protocol, and
presents many more opportunities for getting things wrong."
[on scaling to the future Internet:] "keep in mind that there may be as
many as a thousand network numbers in a typical configuration."
[From the Conclusion]
Most discussions of protocols begin by introducing the concept of
layering, which tends to suggest that layering is a fundamentally
wonderful idea which should be a part of every consideration of
protocols. In fact, layering is a mixed blessing. Clearly, a layer
interface is necessary whenever more than one client of a particular
layer is to be allowed to use that same layer. But an interface,
precisely because it is fixed, inevitably leads to a lack of complete
understanding as to what one layer wishes to obtain from another. This
has to lead to inefficiency. Furthermore, layering is a potential snare in
that one is tempted to think that a layer boundary, which was an
artifact of the specification procedure, is in fact the proper boundary to
use in modularizing the implementation. Again, in certain cases, an
architected layer must correspond to an implemented layer, precisely so
that several clients can have access to that layer in a reasonably
straightforward manner. In other cases, cunning rearrangement of the
implemented module boundaries to match with various functions, such as
the demultiplexing of incoming packets, or the sending of asynchronous
outgoing packets, can lead to unexpected performance improvements
compared to more traditional implementation strategies.

[A lot of the ideas here clearly led on to ALF, Clark's Application-Level
Framing, and MIT's exokernel work, led by Kaashoek (trained, in turn, under