[FoRK] This Man Wants To Control the Internet

Eugen Leitl <eugen at leitl.org> on Wed Oct 31 04:17:32 PDT 2007


This Man Wants To Control the Internet

10.25.2007 This man wants to control the Internet. And you should let him.

by Carl Zimmer, Photographs by Trujillo/Paumier 

John Doyle is worried about the Internet. In the next few years, millions
more people will gain access to it, and existing users will place ever higher
demands on our digital infrastructure, driven by applications like online
movie services and Internet telephony. Doyle predicts that this skyrocketing
traffic could cause the Internet to slow to a disastrous crawl, an endless
digital gridlock stifling our economies. But Doyle, a professor of control
and dynamic systems, electrical engineering, and bioengineering at Caltech,
also believes the Internet can be saved. He and his colleagues have created a
theory that has revealed some simple yet powerful ways to accelerate the flow
of information. Vastly accelerate the flow: Doyle and his colleagues can now
blast the entire text of all the books in the Library of Congress across the
United States in 15 minutes.

I travel to Pasadena to learn about Doyle's work and am a bit flummoxed
by his suggestion to meet not in his lab but in his gym. Doyle gets on a
treadmill and begins to pound away. I turn on my machine and try to keep up.
At 53, Doyle is long, lean, and hawk-faced. He is a championship athlete, and
he works out furiously twice a day at least. It's hard for me to catch
enough air to ask Doyle questions, but he has no problem holding forth in
response. As he talks I begin to realize his secret agenda in meeting me
here. A treadmill is the perfect place to start to understand his ideas,
because for Doyle, the world is filled with complex networks--and a body
in the middle of a workout is a very good example of what those networks are
all about.

A system of linked computers like the Internet is obviously a network, but so
are jetliners, human bodies, and even bacterial cells. They're all
networks because they are made up of lots and lots of parts that work
together. Robust networks have parts that continue to work together smoothly
even if conditions fluctuate unpredictably. In the case of the Internet, a
million people may try to send e-mail at once. In the case of Doyle's
body, here on his treadmill, its physiology holds steady even as he pushes
himself to his limit. "Inside of you, everything's going crazy,"
Doyle says, "but it's all keeping your body temperature steady and
your body upright."

Doyle knows, however, that networks that look perfectly sound can be headed
for collapse with little warning. He has found that in order to achieve
robustness, all systems must follow certain rules. Robustness doesn't
come cheap. As a system is tuned to become robust under one set of
conditions, that tuning makes the system fragile under other, sometimes
unexpected, conditions. Robustness and fragility go hand in hand. While Doyle
pounds away on his treadmill, he offers his own body as exhibit number one.
He has optimized his body to meet the grueling challenges of winning
triathlons, but in doing so he has made his body vulnerable to problems that
rarely plague a nonathlete. He has bad ankles, a groin injury, and other
injuries earned over a lifetime of playing sports. In August, he almost died
after falling down a rock face while hiking in Panama.

The treadmill slows to a stop. Doyle checks his pulse. "One of the
reasons I'm so interested in robustness," he said, "is that
I'm so fragile."

Doyle came to MIT in 1975 and fell in love with a science known as control
theory. Control theorists, roughly speaking, try to understand how
complicated things can run efficiently, quickly, and safely instead of
crashing, exploding, or otherwise grinding to a halt. They analyze systems by
modeling the variables that dictate how they will behave. But rather than
checking through every possible combination of variables to see if, say, a
plane will fly straight or stall when the wind picks up, control theorists
look for underlying laws of control that can predict how something will
behave using just a few key variables. "Control theory is at the center
of modern technology," Doyle explains.

As technology has become more complex over the past century, researchers have
had to find new ways to control airplanes, factories, computers, and the
like. Much of that progress has come by brute-force tinkering, but a lot of
it has come from a growing understanding of the basic laws of control. Doyle
began developing his own innovative ideas in control theory as an
undergraduate. By 1976 he was consulting for Honeywell. By age 32 he had been
hired with immediate tenure at Caltech.

Doyle made his mark by figuring out how to prove a system is robust. In the
early 1980s, NASA asked him to look at the space shuttle. Several shuttles
had already flown, but the agency wanted reassurance about their behavior
during reentry. Using wind tunnels and computer simulations, NASA had come up
with apparently stable designs, but there were too many variables to test
everything. "You had this obscenely large space of possibilities,"
Doyle says. "Somewhere lurking in there could be a crash, and you
don't know."

Doyle looked at all the forces that might be exerted on a space shuttle due
to atmospheric conditions, its velocity through the air, and so forth.
NASA's engineers had plotted these forces in a so-called multidimensional
space--say, a pitching torque along one axis and a longitudinal
acceleration along another. By developing new mathematical tools, Doyle
proved that there was a volume of this multidimensional space, inside of
which every combination of forces was certainly safe. Outside that region
lurked disaster. The space shuttle design was lodged comfortably inside the
safe region.

+++ Part of the metabolic network of the E. coli microbe. Image courtesy of
Ouzounis & Karp/Genome Research

"Not only could we show it was safe, we could prove it," Doyle says.

The techniques Doyle developed to test the space shuttle have become standard
tools for testing new designs of airplanes and helicopters. But Doyle had a
more fundamental question on his mind. Just how big could the volume of
safety be made?

It is certainly possible to make things more robust--in other words,
expand the safe region. A jetliner is far more robust than the world's
first airplane, the 1903 Wright Flyer. Doyle took up a question that had
concerned researchers since the birth of control theory in the 1930s and
'40s: Were there any fundamental limits to the growth of robustness? He
focused on one of the most important ways in which engineers make things more
robust: by adding feedback loops. A jet can keep track of its movement,
temperature, and a long list of other readings, and it can continually
correct every one, adjusting itself to bring variables back into line. But
Doyle showed how just cranking up robustness under some conditions creates
new opportunities for failure. A jet is far more stable in high winds than
the Wright Flyer, but on the other hand, it is vulnerable to software bugs
that the Wright brothers never had to worry about. "You replace
mechanical failure with lots of software failure," Doyle says.

In the 1990s, studying complex systems of all sorts became something of a fad
following the emergence of "chaos theory." Competing versions of this
theory were emerging left and right; chaos was being touted as the science of
the future. Doyle was unimpressed by most of the new ideas. "It was clear
to me that they were just so far off the mark," he says. Doyle made up a
name that combined all the trendy buzzwords he came across: "emergilent
chaoplexity."

One reason that Doyle loathes emergilent chaoplexity is that it relies on
superficial patterns. Doyle, by contrast, insists that his analyses draw from
the gritty details of how things actually work.

As an example, Doyle points to what are known as scale-free networks. Many of
these networks--interlinked sets of airports, friends, nerves in the body,
and so on--have the same basic structure. A few nodes are highly connected
hubs, while most other nodes have only a few connections. Any given small
city airport probably connects to just a few others. Passengers rely on being
able to transfer at a hub to reach most other places. But if you live in
Chicago, you can take a direct flight from O'Hare Airport to hundreds of

Some researchers, like Albert-László Barabási at the University of Notre
Dame, have argued that the Internet shares a similar structure and that
this accounts for why the Internet keeps humming even when some of its
systems fail. Since hubs are rare, failures involving them are even rarer.
But should a hub fail, researchers warned, it would lead to catastrophe.
Their warning made headlines, with CNN reporting in 2000: "Scientists
Spot Achilles' Heel of Internet."

Doyle was not impressed. "Everybody who knew how the Internet worked was
puzzled by all this," he says. He decided to test the Achilles' heel
theory by joining up with a group of collaborators and mapping a section of
the Internet in unprecedented detail.

In that map, they found no Achilles' heel. The Internet does have a few
large servers at its core, but those servers are actually not very well
connected. Each one has only a few links, mainly to other large servers
through high-bandwidth connections. Much of the activity that occurs on the
Internet actually lies out on its edges, where computers are linked by
relatively low-bandwidth connections to small servers; think about how many
e-mails office workers send to people in their building compared with how
many they send overseas. If one of the big links at the core of the Internet
crashed, Doyle and his colleagues discovered, it would not take the Internet
down with it. Traffic could simply be rerouted through other big links.
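
The rerouting argument can be checked on a toy model. The sketch below is only
an illustration: the four-router ring, the edge sites, and all the names in it
are hypothetical and are not taken from the map Doyle's group made. It asks
whether every node can still reach every other node after a single link fails.

```python
from collections import deque


def reachable(links, start):
    """Return the set of nodes reachable from start over undirected links."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical topology: four core routers in a ring, each serving one edge site.
CORE_LINKS = [("c1", "c2"), ("c2", "c3"), ("c3", "c4"), ("c4", "c1")]
EDGE_LINKS = [("c1", "e1"), ("c2", "e2"), ("c3", "e3"), ("c4", "e4")]
ALL_LINKS = CORE_LINKS + EDGE_LINKS


def survives_link_failure(links, failed):
    """True if the network stays fully connected after one link is removed."""
    remaining = [link for link in links if link != failed]
    nodes = {n for link in links for n in link}
    start = next(iter(nodes))
    return reachable(remaining, start) == nodes


if __name__ == "__main__":
    for link in CORE_LINKS:
        print(link, "fails ->", survives_link_failure(ALL_LINKS, link))
```

In this toy ring, losing any one core link leaves a second path around the
core, while losing an edge link cuts off only the site behind it.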

The Internet works
spectacularly well, despite the fact that over the past 30 years it has
expanded a million-fold, absorbing new technology from BlackBerries to the
iTunes music store with hardly any major changes to the basic rules it uses
to move data. Doyle now knows why. It's not just the physical arrangement
of cables and servers that makes the Net so robust. Doyle and his colleagues
showed that the software that runs the Internet uses feedback, in much the
same way a jetliner's computer does. The Internet can sense changing
conditions and adjust itself.

The Internet has two kinds of feedback. It maintains a constantly updated
picture of the entire network so that messages can be directed along the
fastest routes. It also breaks down those messages and encapsulates them
inside standardized packets of data, a little like using the standardized
waybills and boxes provided by FedEx. Each packet can take its own path
through the Internet. As packets arrive at the recipient's computer, the
message fragments in each packet are extracted and reassembled. Critically,
as each packet arrives, it sends back a receipt to the sender's computer.
In heavy traffic, some packets get lost. In response to lost packets,
computers slow down the rate at which they send their data, reducing

Together, these two types of feedback give the Internet a robustness more
powerful than anyone anticipated. "These Internet engineers weren't
control theorists, but they built this incredibly robust network," Doyle
says. "Man, that's awesome." Then again, the engineers were doing
something that evolution figured out long ago.

Back at Doyle's messy Caltech office, he props his robust yet fragile
body in a recliner by his desk and shifts the conversation from technology
back to biology. Around the time Doyle began to use control theory to
understand the Internet, he also began using it to explore the mechanism of
life. If his ideas about control really were universal, he realized, then a
cell ought to share some basic organizational principles with an airplane or
the Internet--although finding the similarities might require some
digging. "If you want to understand how airplanes fly, looking at birds
helps, but you may end up thinking it's all about flapping," he says.
"If you look at bats and insects, too, you'll see how it's lift
and drag and things like that. You use them to understand the deep stuff."

Control theorists have pondered living things for decades, but until
recently they lacked the mathematical tools to analyze them as they would a
technological system. Doyle and his colleagues have created some of those
tools. In keeping with Doyle's gritty real-world philosophy, he then set
out to see how they applied to a common bacterium, Escherichia coli. He soon
discovered remarkably precise parallels between living networks and
technological ones.

When E. coli is
heated to dangerous temperatures, for example, it can rapidly churn out
thousands of heat-shock proteins, molecules that help protect the
microbe's workings. When the temperature falls, the heat-shock proteins
quickly get dismantled. Doyle demonstrated that this behavior takes place
through a series of feedback loops inside the bacterium, akin to the feedback
loops that keep an airplane on autopilot steady even as the plane is buffeted
by gusts.

Doyle is now tackling a far bigger network of genes in E. coli: the master
network responsible for governing its metabolism. He and his team are probing
the control systems that allow the microbe to eat many different kinds of
sugar and transform them into the thousands of molecules that make up the
bacterium. E. coli's metabolism is nothing if not robust, able to easily
withstand significant environmental fluctuations.

The reason the bacterium works so well, Doyle finds, is that it is organized
in much the same way as the Internet. Both the Internet and E. coli are
conceptually organized like a bow tie, with a broad fan of incoming material
flowing into a central knot and then flowing into another broad fan of
outgoing material. On the Internet, the incoming fan is made up of data from
a huge range of sources--e-mail, YouTube videos, Skype phone calls, and
the like. In E. coli, the incoming fan is made up of the many sorts of food
it eats. As information and food move into their respective bow ties, they
get homogenized: E. coli breaks down its food into a few building blocks,
while the Internet breaks down its motley incoming data streams into streams
of standardized packets.

From the knot, both bow ties then fan out. E. coli turns its building blocks
into DNA, proteins, membrane molecules, and any other special ingredient it
needs. On the Internet, data packets reach a computer, where they can be
reassembled into the original e-mail, YouTube videos, Skype telephone calls,
and the like.

A bow-tie organization allows both the Internet and E. coli to run quickly
and efficiently. If E. coli (like all bacteria, indeed like all living
things) did not have a bow tie, it would have to use a different set of
enzymes to make each of the thousands of different molecules it needs from
each type of food. Rather than use such a huge, slow system, E. coli just
points all its metabolic pathways into the same bow-tie knot, making
everything from the same raw materials. Likewise, the Internet's bow-tie
architecture means that it doesn't have different ways to handle, say,
e-mail traffic and instant-message traffic. Everything passes through as the
same types of data packets.
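
A rough count shows why the knot saves so much machinery. The sketch below is
a back-of-the-envelope illustration, not a model of real metabolism or of real
Internet protocols: it assumes one dedicated pathway per (input, output) pair
without a knot, and one pathway per input plus one per output with a knot. The
figures of 20 inputs and 2,000 outputs are made up for the example.

```python
def pathways_without_knot(n_inputs, n_outputs):
    """Every input type needs its own route to every output type."""
    return n_inputs * n_outputs


def pathways_with_knot(n_inputs, n_outputs):
    """Each input is converted to the common form once; each output is built from it once."""
    return n_inputs + n_outputs


if __name__ == "__main__":
    foods, molecules = 20, 2000  # hypothetical counts
    print("without knot:", pathways_without_knot(foods, molecules))
    print("with knot:   ", pathways_with_knot(foods, molecules))
```

Under these assumptions the direct design needs 40,000 pathways and the bow
tie needs 2,020, and adding one new input costs one new pathway instead of
2,000.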

The bow-tie architecture also makes both the Internet and E. coli robust. If
the type of incoming material changes rapidly--say, a surge in video
traffic in the Internet's case, or a new food source for the E.
coli--the system can process that material without having to retool its
entire metabolism to cope.

Another advantage of a bow tie is that it makes feedback control easy.
Information travels back from a receiving computer to the sender, which can
speed up or slow down its packets in response. E. coli's metabolism is
loaded with analogous feedback loops. Normally E. coli can synthesize all the
amino acids it needs for making proteins. But if it can get a certain kind of
amino acid from the environment, that information shuts down its own
production line.

But as Doyle points out, improving robustness comes with a price. The bow-tie
structure opens the door to a vulnerability that could prove very hard to
fix. Because of the homogenization that occurs at the heart of the bow tie,
it's difficult to identify and block harmful agents. In the case of the
Internet, it takes only a short piece of code to produce a digital virus that
can spread quickly to millions of computers and cause billions of dollars of
damage. In living organisms, real viruses hijack cells in much the same way.

Doyle thinks the similarity between E. coli and the Internet is no accident.
As networks get big and complicated--either through the tinkering of
Internet engineers or through millions of years of evolution--they must
follow certain rules to stay robust. "There is an inevitable
architecture," Doyle says.

Over dinner, Doyle muses on how to deal with these fundamental
vulnerabilities. He hasn't found a way to improve biological reliability
(yet), but he does think he can help address the Internet's limits.

The current packet-receipt feedback system (known as TCP) has worked
wonderfully for years to control the flow of Internet traffic, but it
won't be able to cope with the coming jam, when fridges will scan the
RFID chip on a milk carton and send an alert when the expiration date
arrives. "Whether we like it or not, [Internet equipment giant] Cisco
will network everything. Soon our glasses will tell the kitchen they're
empty," Doyle says. That vast amount of traffic will make the Internet
catastrophically fragile. "We could wake up one morning and nothing
works."
Many Internet experts are also worried, and they've launched several
projects to save the network, including one led by Steven Low, another
Caltech professor. Doyle is working with Low on his project, which is
unusual in its
simplicity. Their plan to speed up the Internet is to simply do a better job
of paying attention to measurements of Internet traffic. Today computers
sense Internet congestion by noticing how many packets they lose. That's
like trying to drive down a highway by just looking at what's 20 feet
ahead of you, constantly accelerating and then slamming on the brakes as soon
as you see something.
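
The accelerate-then-brake pattern can be seen in a minimal sketch of
loss-based control. This is a simplified additive-increase,
multiplicative-decrease rule, not the full TCP algorithm; the function names,
the fixed capacity, and the rule that a packet is lost whenever the window
exceeds that capacity are all assumptions made for illustration.

```python
def aimd_step(window, packet_lost, increase=1.0, decrease=0.5, floor=1.0):
    """One round of loss-based control: creep up, or cut sharply after a loss."""
    if packet_lost:
        return max(floor, window * decrease)
    return window + increase


def simulate_aimd(capacity, rounds, start=1.0):
    """Assume (hypothetically) that a loss occurs whenever window > capacity."""
    window = start
    history = []
    for _ in range(rounds):
        lost = window > capacity
        window = aimd_step(window, lost)
        history.append(window)
    return history


if __name__ == "__main__":
    print(simulate_aimd(capacity=10, rounds=20))
```

The window climbs one step at a time until it overshoots the link, is halved,
and starts climbing again, so the sender only learns about congestion after it
has already caused some.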

Doyle and his coworkers enable computers to use more information about
traffic flow, noting how long it takes for their packets to get to their
destination. The less traffic, the shorter the time, and with these traffic
reports on hand, their computers make much smarter decisions. The result is a
string of victories in high-speed Internet communication competitions. In
the last face-off in 2006, they managed to send 17 gigabits--about a
full-length movie's worth--each second across the Internet. Doyle
smiles as he describes their success, a flash of the athlete's spirit in
his face. "You're not just proving theorems," he says. "It
beats anything anyone else can do."
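
The idea of steering by travel time rather than by lost packets can be
sketched as follows. The article does not give the team's actual update rule,
so everything here is an assumption: the window update is one simple
delay-based form, the parameters alpha and gamma are arbitrary, and the link
is a toy model in which round-trip time grows once the window exceeds what the
link can hold.

```python
def delay_based_step(window, rtt, base_rtt, alpha=20.0, gamma=0.5):
    """Move the window toward a target computed from measured delay.

    base_rtt is the round-trip time with empty queues; rtt is the current
    measurement. alpha is roughly how many packets the sender tries to keep
    queued in the network. Growth is capped at doubling per round.
    """
    target = (base_rtt / rtt) * window + alpha
    return min(2.0 * window, (1.0 - gamma) * window + gamma * target)


def simulate_delay_based(capacity, base_rtt, rounds, start=10.0):
    """Toy link: packets beyond capacity * base_rtt wait in a queue."""
    pipe = capacity * base_rtt  # packets the link holds with no queueing
    window = start
    history = []
    for _ in range(rounds):
        queued = max(0.0, window - pipe)
        rtt = base_rtt + queued / capacity
        window = delay_based_step(window, rtt, base_rtt)
        history.append(window)
    return history


if __name__ == "__main__":
    trace = simulate_delay_based(capacity=1000.0, base_rtt=0.1, rounds=60)
    print(round(trace[-1], 3))
```

In this toy model the window settles just above what the link can carry
instead of repeatedly overshooting and halving, because rising delay warns the
sender before any packet has to be dropped.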

Last year the Caltech team started operating a company, FastSoft, to market
their protocol. In March they started selling a box about the size of a DVD
player that you can plug into a server. In one test, a Fortune 500 company
was able to speed up its transmissions 30-fold. But Doyle stresses that a
real solution to the Internet crisis will require rethinking the control
process from the bottom up.

"If someone said, 'Do a radical redesign,' I'd say we're
not ready yet," Doyle confesses. "Going to the moon was trivial
compared to dealing with this. We've got a research path, but there's
some hard math to be done."
