The demographic attack on web servers

Rohit Khare (khare@pest.w3.org)
Thu, 18 Apr 96 16:51:35 -0400


April 15, 1996
Looking Forward
Proxy servers will change the Web
By _Mark L. Van Name_ and _Bill Catchings_

As the Internet and the Web continue to mature, new components and
technologies will rise to prominence. One technology that is about to gain
visibility and importance is the proxy server.

If you're not familiar with a proxy server, think of it as an Internet cache.
Like any cache, its primary goal is to reduce the traffic to its source--in
this case the Web or some other aspect of the Internet--by servicing requests
locally. This reduction can relieve bandwidth bottlenecks.

But proxy servers are not all good news. They have the potential to upset the
fragile area of server access accounting.

To see the possible win, consider a common Web activity, downloading a new
version of Netscape Navigator. Multiple folks in most organizations have
downloaded it, so those groups have wasted bandwidth moving the same bits.

Many groups solve this problem by having someone download the software late
at night and put it on an internal server from which others can fetch it.

A better solution would be to have a machine that caches the result of the
first download request and then services all subsequent downloads from a copy
in its local storage. That's a proxy server. Proxy servers can also cache Web
pages or anything else from the Internet. Proxy servers are invisible to
users, who just receive online items more quickly than usual.
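
As a rough illustration of the caching idea just described, here is a minimal Python sketch. The class name, the `fetch_from_origin` callback, and the `origin_requests` counter are assumptions made for this example, not part of any real proxy product. It shows only the core behavior: the first request goes out to the Internet, and later requests for the same URL come from local storage.

```python
from typing import Callable, Dict


class CachingProxy:
    """Hypothetical in-memory cache keyed by URL (illustration only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin stands in for a real network download.
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0  # how many times we actually went out

    def get(self, url: str) -> bytes:
        if url not in self._cache:
            # Cache miss: fetch once from the origin and keep a copy.
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        # Cache hit (or freshly stored copy): serve locally.
        return self._cache[url]
```

With this sketch, five users asking for the same file cause a single trip to the origin; the other four requests never leave the local network.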

Proxy servers do need to see all requests to manage their caches well. They
must also update their caches often enough that users always see adequately
current copies of the material they access.
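
One simple way to keep copies "adequately current" is to give every cached item a maximum age and refetch it once that age has passed. The sketch below assumes that policy; the names `ExpiringCache` and `max_age_seconds` are invented for the example, and real proxy servers may use quite different rules.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringCache:
    """Hypothetical cache that refetches entries older than a fixed age."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so the policy can be tested
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None:
            body, fetched_at = entry
            if now - fetched_at < self._max_age:
                return body  # still fresh enough: serve the local copy
        # Missing or stale: go back to the source and restamp the entry.
        body = self._fetch(url)
        self._entries[url] = (body, now)
        return body
```

A short maximum age keeps pages current at the cost of more traffic; a long one saves bandwidth but risks serving stale pages.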

A proxy server can do even more. For example, during times of light activity
it can pre-fetch all the pages a cached page references. Those pages will
then be on hand when users need them. Because all accesses flow through it, a
proxy server can also give managers a point of control for restricting access
to particular URLs.
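
Both extras can be sketched in a few lines of Python using only the standard library. The function names, the `blocked_hosts` set, and the `fetch` callback below are assumptions made for illustration; the sketch follows only `<a href>` links and blocks by host name, which is far simpler than what a real product would need.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List, Set
from urllib.parse import urljoin, urlsplit


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def referenced_urls(base_url: str, html: str) -> List[str]:
    """Return absolute URLs for the links found in a cached page."""
    collector = _LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, href) for href in collector.hrefs]


def is_allowed(url: str, blocked_hosts: Set[str]) -> bool:
    """Hypothetical access rule: refuse any URL on a blocked host."""
    return urlsplit(url).hostname not in blocked_hosts


def prefetch(
    base_url: str,
    html: str,
    cache: Dict[str, bytes],
    fetch: Callable[[str], bytes],
    blocked_hosts: Set[str],
) -> List[str]:
    """Fetch linked pages that are allowed and not already cached."""
    fetched = []
    for url in referenced_urls(base_url, html):
        if url in cache or not is_allowed(url, blocked_hosts):
            continue
        cache[url] = fetch(url)
        fetched.append(url)
    return fetched
```

A proxy would run something like `prefetch` during quiet hours and apply `is_allowed` to every user request as well.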

A few caveats

The most obvious problem with proxy servers arises from the immediacy of the
Internet: Who wants yesterday's pages just because a proxy server has cached
them?

What should bother businesses, however, is a more subtle problem: Proxy
servers can stop content providers from getting accurate information about
who is hitting their pages. If, for example, a proxy server hits a popular
page once at 2 a.m. and then provides that page to users throughout the
following day, the provider of the page sees only one hit.
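
The undercount is easy to reproduce in a toy simulation. In the Python sketch below, the host names and the figure of 200 users are made up for the example; the point is only to compare what the proxy's log records with what the provider's log records.

```python
from typing import Dict, List, Set, Tuple


def simulate(num_users: int) -> Tuple[int, int, Set[str]]:
    """Return (requests seen by proxy, hits seen by provider, clients seen by provider)."""
    origin_log: List[Tuple[str, str]] = []  # (client address, url) at the provider
    proxy_log: List[Tuple[str, str]] = []   # (user, url) at the proxy
    proxy_cache: Dict[str, str] = {}
    url = "http://provider.example/popular.html"  # hypothetical page

    def origin_serve(client_address: str, requested: str) -> str:
        origin_log.append((client_address, requested))
        return "<html>popular page</html>"

    def proxy_serve(user: str, requested: str) -> str:
        proxy_log.append((user, requested))
        if requested not in proxy_cache:
            # Only this first request ever reaches the provider, and it
            # arrives from the proxy's address, not the user's.
            proxy_cache[requested] = origin_serve("proxy.corp.example", requested)
        return proxy_cache[requested]

    for n in range(num_users):
        proxy_serve(f"user{n}", url)

    clients_seen = {client for client, _ in origin_log}
    return len(proxy_log), len(origin_log), clients_seen


print(simulate(200))  # (200, 1, {'proxy.corp.example'})
```

The proxy knows about all 200 readers; the provider's log shows one hit from one machine.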

The main option providers have to counter this problem is registration. Most
folks, though, won't bother to register and so will remain invisible behind
the proxy server. The owner of the proxy server will know its users, but
content providers will see only the server.

Another option is to change pages so frequently that any proxy server will
either fall out of date quickly or have to update its contents almost
constantly. This approach, though, reduces the value of the proxy servers, and
we all need those servers to improve response time as the Internet grows.

A better answer is for proxy server vendors and content providers to develop
standards for working together. Those standards should define the ways proxy
servers will return access statistics and information. The standards should
also define mechanisms by which providers can help the servers know when to
update their caches.
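
No such standard is described here, so the following Python sketch is purely hypothetical. It assumes two things a cooperative scheme might include: the provider attaches an expiry time to each page, and the proxy counts the hits it serves locally and hands those counts back in a periodic report. Every name in it (`ReportingProxy`, `expires_at`, `flush_report`) is invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CachedPage:
    body: str
    expires_at: float  # provider-supplied hint: refetch at or after this time


@dataclass
class ReportingProxy:
    """Hypothetical proxy that honors expiry hints and reports local hits."""

    pages: Dict[str, CachedPage] = field(default_factory=dict)
    hits: Counter = field(default_factory=Counter)

    def store(self, url: str, body: str, expires_at: float) -> None:
        self.pages[url] = CachedPage(body, expires_at)

    def serve(self, url: str, now: float) -> Optional[str]:
        page = self.pages.get(url)
        if page is None or now >= page.expires_at:
            return None  # not cached or expired: caller must refetch
        self.hits[url] += 1  # count the hit the provider never sees
        return page.body

    def flush_report(self) -> Dict[str, int]:
        """Return per-URL hit counts since the last report, then reset."""
        report = dict(self.hits)
        self.hits.clear()
        return report
```

In a scheme like this, the provider controls freshness through the expiry time and recovers its audience numbers from the reports, while the proxy still saves the bandwidth.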

With this type of cooperation, proxy servers can realize their potential
and content providers can get the business measurements they need.