Re: [Seventh Heaven] Who Killed Gopher?

Rohit Khare (rohit@fdr.ICS.uci.edu)
Wed, 06 Jan 1999 17:59:08 -0800


> By @@verify@@April 1994, Web usage passed Gopher's share as well as
> every other Internet application protocol as a fraction of NSFnet

It was April 1995, in fact. I ought to know that because it was the month I
joined W3C!

It took about three hours to verify this today, though. I knew I had
researched the original statistics for TimBL just before I resigned --
and that it was very painful. It was again, today, even though I knew
the answer was in the FoRK archive:

http://xent.ics.uci.edu/FoRK-archive/spring97/0281.html

In fact, I wrote with some foresight:

-----------
Subject: When did the Web surpass other TPs?
Date: Wed, 16 Apr 1997 14:41:05 -0400 (EDT)

So, I've been digging to recapture a certain chart Tim BL is very
proud of:
the month that Web traffic surpassed gopher, email, etc on the
internet backbone traffic statistics.

[the above sentence is for the benefit of search engines, so others
will not wander through the wilderness of "no hits" or a "million
hits" like I did today]
------------

That FoRKpost isn't in anyone's public indexes anymore, it seems. Not
AltaVista, not HotBot, not Sherlock's suite. I gave up and started
installing WebGlimpse, from the Harvest package, but was stymied by a
fatal compiler (not compilation!) error. I found a long essay at
C|Net's Builder.com on how to add a search engine, but curiously
enough there were no NeXTStep options :-)

Odd, given that the first search engine, WebCrawler, was indexed with
NeXT's DigitalLibrarian and IndexingKit. In fact, while searching NeXT
documentation to debug Glimpse, it occurred to me that all *I* needed to do was
drag FoRK-archive onto my bookshelf and it would be magically delicious!

And it was -- but it also triggered my memory to go try WebCrawler,
and lawdy if it didn't spit out the right answer on its first result:
a copy of the original MERIT announcement at
http://www.cc.gatech.edu/gvu/stats/NSF/merit.html

DigitalLibrarian, in the final coup de grace, also found my FoRKpost
cited above.

--------------------------------------------

http://www.htdig.org/ has an open-source search robot that you can aim
at other sites in an intranet crawl. Gotta look into this one!

--------------------------------------------

As for Joseph's incisive comments:

> Very cool! Do you have a URI? (I'd rather bookmark or store the URI
> than the whole thing.)

http://www.ics.uci.edu/~rohit/IEEE-L7-http-gopher.html

>> Science proves a blind alley, though. Their fates were decided not on
>> technical merits, but on economic and psychological advantages. The
>> 'Postellian' school of protocol design focused on engineering 'right'
>> solutions for core applications (batch file transfer, interactive
>> terminals, mail and news relays) anchored in unique transport layer
>> adaptations (slow-start, Nagle timers, and routing as respective
>> examples). Our two specimens are 'post-Postel', in their details and
>> in their adoption dynamics. They are stateless; they don't have
>> (Gopher) or dilute (HTTP) the theory of reply codes; they scale
>> poorly, imperiling the health of the Internet; and they are 'luxuries'
>> for publishing discretionary information, not Host Requirements which
>> must be compiled into every node.
>
> There's some neat design philosophy ideas in here -- I suspect there
> is a useful ontology --- but its a confusing exposition. What does
> "right" core solutions mean? Who wouldn't advocate that? What is a
> "unique" transport layer? I'd read this as specific to the
> application, but in parenthesis you mention characteristics which
> makes me think my reading is incorrect. Post-postel throws me. Is the
> statelessness of HTTP counter to "right" solution, the unique
> transport layer, in details or adoption dynamics?

Well, the confusion is from over-compression. It's only a paragraph in
the article, so let me unfold some of the thinking:

* 'right' solution implies that there was any thinking at all, by
  contrast to today's protocol development. That is to say, there was
  a commitment to elegance and even re-shipping a new product rather
  than slavish obeisance to installed bases. So, 'right' in the sense that
  it's forward-engineered rather than reversed from early market entrants;
  and theoretically grounded: the Network Virtual Terminal or NVFS, for example.

* 'core' as in "legitimate use of the net" -- supporting the formal uses of
the Internet as a cooperative computing platform rather than exploiting
it for 'frivolous' human communication. Arguably, the Postellian era was
focused on natural extensions of the interfaces built into single-host
OSes: the Internet versions of the terminal, disk, and other interfaces.

* 'unique' because each of those adaptations symbiotically coevolved with
the application. The original 'core' was so diverse -- interactive Telnet
vs. batch FTP or one-hop connections over the connected Internet
vs. relaying across the UUCPnet/BITNET/FIDOnet/... -- that application
requirements had to be accommodated back at the transport layer. Telnet
process interrupts required priority TCP delivery. HTTP only wants a
mere 8-bit stream by comparison. Within the IETF, the transport 'people'
were directly involved back in the Postellian days -- two ADs took the
time to comment on the article's thesis to remind me "HTTP wasn't done
by anyone who knew transport, unlike the good ol' days" :-)
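The Telnet-interrupt case above can be sketched in a few lines. This is a
minimal loopback demonstration of TCP urgent ("out-of-band") data -- the
transport mechanism Telnet's process interrupt rode on -- not Telnet's
actual IAC IP escape handling; all the names and payloads are illustrative:

```python
import socket
import time

# Sketch: TCP urgent data let a Telnet Ctrl-C "overtake" queued output.
# Loopback demo of the mechanism only; real Telnet wraps the interrupt
# in an IAC IP escape and a Synch.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.create_connection(srv.getsockname())
conn, _ = srv.accept()

cli.sendall(b"queued terminal output")   # ordinary in-band bytes
cli.send(b"!", socket.MSG_OOB)           # one urgent byte jumps the queue
time.sleep(0.1)                          # let loopback delivery settle

urgent = conn.recv(1, socket.MSG_OOB)    # receiver pulls it out of band
inband = conn.recv(100)                  # in-band stream is untouched
print(urgent, inband)

for s in (cli, conn, srv):
    s.close()
```

HTTP, by contrast, never touches any of this machinery -- the "mere 8-bit
stream" is the whole contract.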

* Statelessness in HTTP is literally unlike the Postel style, which always
included a state diagram to untangle ("first MAIL FROM, then RCPT TO,
then DATA, then CRLF.CRLF,..."). The Postellian style looked more like
RPCs -- and even made sense back when latency was the same order of
magnitude as transmission time. Now that latency is far larger on fat
pipes, stateless request-response beats a meandering conversation to set
parameters and execute commands.
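Back of the envelope, the latency argument comes down to round-trip counts.
The numbers below are made up for illustration (100 ms round trip, 10 KB
payload on a 1 MB/s pipe), not measurements:

```python
# Sketch: why statelessness wins once latency dominates transmission.
rtt = 0.100                  # seconds per round trip (illustrative)
xfer = 10_000 / 1_000_000    # 10 ms to push the bytes themselves

# A chatty, stateful dialogue (HELO, MAIL FROM, RCPT TO, DATA, body)
# pays the round trip once per command:
stateful = 5 * rtt + xfer

# A stateless request-response (one GET) pays it once:
stateless = 1 * rtt + xfer

print(f"stateful: {stateful:.2f}s  stateless: {stateless:.2f}s")
```

On a fat pipe the transfer term is noise; the conversation itself is
what you pay for.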

* 'adoption dynamics': the Internet wasn't always run on Web Years... while
the *artifact* was doubling in nodes/bits/packets at the same geometric
rate, I think that the social community wasn't tripping over itself like
today. That is to day, we can see intervals of years as new protocols are
tested, transitioned, and replaced, with some more forethought than 'the
next browser'. The converse is that marketshare meant less in standards
debates. You couldn't win a hissy fit over a feature by claiming to have
shipped X million copies -- adoption was a measured process guided by
the "wise men" of the IESG.

Not to say there *weren't* market dynamics and network effects going on
then! But try and convince me with a straight face that byte ranges or
cookies were added to HTTP in a principled and considered manner, rather than
rammed through by Netscape customers in early days (Adobe Acrobat
per-page download and MCI's online mall shopping carts, respectively).

Expansively,
Rohit