Generation of network effects on the Web

Jim Whitehead (ejw@ICS.uci.edu)
Sat, 19 Sep 1998 20:41:45 -0700


I've been thinking about how the Web generates network effects recently, as
part of a paper-writing effort where I'm comparing three generations of
hypertext systems (monolithic hypertext systems, open hypertext systems, and
the Web) to see if there is something about the architecture of these
systems which either helps or hinders the generation of network effects.
I'm writing this mostly to try and settle my thoughts, but also to see if
you have any insights you'd like to share.

My first attempt at this was this workshop paper:

http://www.ics.uci.edu/~ejw/papers/ejw_ohs98.html

Part of this effort has been an attempt to repeat the analysis in Rohlfs'
paper analyzing network effects in the phone system. In particular, I've
been thinking about the utility function for people who use the Web. In
Rohlfs, this utility function is specified in terms of the number of other
*people* who use the telephone (along with the utility they get from
consumption of other noncommunications goods). But, this utility function
doesn't describe how the Web generates network effects.

It strikes me that the typical user of the Web, a reader, derives utility
from the *information* available on the Web, and typically derives little
utility from other readers (this assumes that other readers passing along
relevant pointers is a minor effect, which is debatable). Thus, for readers
on the Web, the utility function is proportional to the amount of
*information* available, and isn't correlated strongly with the number of
other readers using the Web. In fact, when the network is congested, other
readers decrease the utility of the Web.

However, just making the utility function dependent on the amount of
information available doesn't seem entirely right either. If the Web were
to be frozen today, with no more content added, the utility of the Web would
decrease over time as the information became increasingly stale (but the
utility probably wouldn't drop to zero). This indicates that the utility of the
Web for a reader is related not just to the amount of information, but to
the freshness of that information. This makes intuitive sense, since one of
the significant advantages the Web has over existing "print" media is the
immediacy of the information.
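
To make the "frozen Web" thought experiment concrete, here is a minimal
sketch of a freshness-weighted utility function. The exponential decay, the
half-life, and the floor value are all assumptions of mine for illustration;
nothing above says what shape the decay actually takes.

```python
import math

def reader_utility(page_ages_days, half_life_days=180.0, floor=0.1):
    """Sum of per-page values, where each page's value decays with age.

    Assumptions (hypothetical, for illustration only): value decays
    exponentially with a fixed half-life, and never falls below `floor`,
    so stale information keeps some residual utility.
    """
    decay_rate = math.log(2) / half_life_days
    return sum(
        floor + (1.0 - floor) * math.exp(-decay_rate * age)
        for age in page_ages_days
    )

# A frozen Web: the same 1000 pages, none updated, simply getting older.
frozen_pages = [0.0] * 1000
for elapsed_days in (0, 180, 360, 3600):
    aged = [age + elapsed_days for age in frozen_pages]
    print(elapsed_days, round(reader_utility(aged), 1))
```

Under these assumptions the utility falls steadily as the pages age but
levels off at the floor rather than reaching zero.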

Of course, even this utility function is suspect, since Web characterization
research shows that the popularity of information on the Web is very
uneven -- two studies of multiple logs show that roughly 25% of servers
account for over 85% of traffic. This indicates that for any individual,
the utility of the Web is heavily weighted towards the sites they visit most
frequently with a smaller weighting for the rest of the information
available on the Web. Presumably the most frequently visited sites are ones
that are regularly updated.

What is clear from looking at the utility function for readers is that the
generation of network effects for the Web can be modeled as a
hardware/software system -- the "hardware" is the Web browser, while the
"software" is the set of Web pages available. In this respect, the Web is
somewhat analogous to a TV, or radio, in that a reader is motivated to
"purchase" a web browser to gain access to the free information available
(and in the case of WebTV, the reader is in fact purchasing a browser,
although this case is complicated by the bundling of email, which may
actually be more compelling than the Web access).

Of course, as the VHS vs. Beta case demonstrates, in a hardware/software
system, content is the key determinant of the generation of network effects.
As a result, authors' motivations are important. For authors on the Web,
the utility of the Web is directly related to the number of people who
actually view the author's content. Web characterization work gives a
complex picture, showing in one study
<http://www.eecs.harvard.edu/~vino/web/sits.97.html> that traffic growth for
Web sites falls into several categories:

- traffic is directly proportional to the number of readers on the Web
(free software site)
- traffic is related to advertising/site overhauls (business site)
- traffic is related to the amount of content on the site (education sites,
like Harvard)
- traffic is related to the number of documents each person views (i.e.,
they encourage deep travel within the site) (government, web site designer,
internet service provider)
- traffic is related to number of search engine hits (adult entertainment)
- traffic is related to cost (professional society site)

All sites gain utility from people reading their content. But the actual
relationship between the number of readers on the Web and the growth of the
number of readers of a particular site depends on the type of site. For
example, during the study, the adult entertainment site went out of business
because its traffic dropped off when it stopped getting hits from Internet
search engines.

I would like to say the incentive for an author to place content on the Web
is the total number of readers on the Web, hence creating a nice feedback
loop where content entices readers, who motivate more authors, who create
content which entices more readers. But since a direct correlation between
the number of readers on the Web and the number of readers of a particular
site doesn't hold for all types of sites, it's hard to make this assertion.
On the other hand, clearly there are more authors on the Web now than there
were previously, which requires some explanation.

In the study of web site growth, the web sites studied mostly experienced
traffic doubling periods between 1 month and 3 months, with longer durations
at 6 months, 1 year, and 3+ years. This suggests that, whatever this
growth may be correlated with, for most of these sites the underlying cause of
the increase in traffic is the rapid growth of the number of readers on the
Web. This then provides some justification for the feedback loop model of
generation of Web network effects.
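
For a sense of scale, the doubling periods above can be converted to annual
growth factors. This is plain arithmetic, assuming the doubling rate holds
steady for a full year; I use 36 months as a stand-in for the "3+ years"
category, so that row is an upper bound on the growth for those sites.

```python
def annual_growth_factor(doubling_period_months):
    """Factor by which traffic multiplies over 12 months of steady doubling."""
    return 2.0 ** (12.0 / doubling_period_months)

# Doubling periods reported in the study; 36 stands in for "3+ years".
for months in (1, 3, 6, 12, 36):
    factor = annual_growth_factor(months)
    print(f"doubling every {months:>2} months -> x{factor:.2f} per year")
```

A site doubling every month grows 4096-fold in a year, and one doubling every
3 months grows 16-fold, which is far more than site-specific factors alone
would plausibly explain.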

To summarize, network effects on the Web are generated by a feedback loop
where Web readers are enticed by (fresh, frequently-viewed) content, and
authors are motivated to provide this content by the number of readers of
the content. The number of readers of the content is (directly and
indirectly) related to the total number of readers on the Web. Growth of
the Web is ensured because existing content on the Web is more valuable than
non-Web content, which lures new readers, leading to more authors.
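
The feedback loop can be sketched as a toy simulation. Every coefficient
here is hypothetical and chosen only to show the shape of the dynamics: the
linear form, the decay term for stale content, and the starting numbers are
my assumptions, not measurements.

```python
def simulate_feedback(readers, pages, steps,
                      reader_pull=0.05, author_pull=0.02, decay=0.1):
    """Iterate a toy readers <-> content feedback loop.

    All coefficients are hypothetical:
      reader_pull  new readers attracted per unit of fresh content
      author_pull  new pages contributed per existing reader
      decay        fraction of content that goes stale each step
    """
    history = [(readers, pages)]
    for _ in range(steps):
        new_readers = reader_pull * pages   # content entices readers
        new_pages = author_pull * readers   # readers motivate authors
        readers = readers + new_readers
        pages = pages * (1.0 - decay) + new_pages
        history.append((readers, pages))
    return history

# Compare the loop with and without authors responding to readership.
with_loop = simulate_feedback(1000.0, 1000.0, steps=200)
frozen = simulate_feedback(1000.0, 1000.0, steps=200, author_pull=0.0)
print("readers with feedback:   ", round(with_loop[-1][0]))
print("readers with frozen Web: ", round(frozen[-1][0]))
```

With `author_pull` set to zero the content goes stale and readership levels
off; with the loop closed, readership keeps growing in this model.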

- Jim