Characteristics of WWW Client-based Traces

The final hyperbolic distribution in our data occurs as

an instance of Zipf's law [15] (discussed in [11]). Zipf's law was

originally applied to the relationship between a word's popularity in

terms of rank and its frequency of use. It states that if one ranks

the words used in a given text by popularity (rank denoted by p) and

measures each word's frequency of use (denoted by P) then

P ~ 1/p

Note that this distribution is parameterless, i.e., p is raised to

exactly -1, so that the nth most popular document is exactly twice as

popular as the 2nth most popular document. Zipf's law has

subsequently been applied to other examples of popularity in the

social sciences.
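
As a small illustration of the parameterless form above, the following Python sketch builds an idealized Zipf distribution and checks the factor-of-two property between rank n and rank 2n. The function name `zipf_frequencies` and the choice of 1000 items are assumptions made purely for illustration; they are not drawn from the trace data.

```python
def zipf_frequencies(num_items):
    """Return relative frequencies P proportional to 1/p for ranks p = 1..num_items."""
    weights = [1.0 / rank for rank in range(1, num_items + 1)]
    total = sum(weights)  # normalizing constant so the frequencies sum to 1
    return [w / total for w in weights]


# Hypothetical collection of 1000 ranked items (an assumed size).
freqs = zipf_frequencies(1000)

# With an exponent of exactly -1, rank n is twice as frequent as rank 2n.
# Lists are 0-indexed, so rank 10 is freqs[9] and rank 20 is freqs[19].
ratio = freqs[9] / freqs[19]
print(round(ratio, 6))
```

The same ratio holds for any pair of ranks n and 2n, because the normalizing constant cancels.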

Our data shows that Zipf's law applies quite strongly to documents on

the WWW. This is demonstrated in Figure 8 for all 46,830 documents

referenced in our logs. The figure shows a log-log plot of references

to each document as a function of the document's rank in overall

popularity. The tightness of the fit to a straight line is remarkable

(R^2 = 1.00), as is the slope of the line: -0.986. Thus the exponent

relating popularity to rank for WWW documents is very nearly -1, as

predicted by Zipf's law.
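
The kind of fit described above can be sketched as an ordinary least-squares regression of log(references) on log(rank). The code below is a minimal sketch under stated assumptions, not the procedure actually used to produce Figure 8: the function name `loglog_fit` is hypothetical, and the input is synthetic data generated to follow count = 10000/rank exactly, not the logged references.

```python
import math


def loglog_fit(counts):
    """Least-squares fit of log10(count) against log10(rank).

    `counts` must be positive and sorted in decreasing order, so the
    item at index i has rank i + 1. Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single-predictor linear fit, R^2 is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic, made-up counts (NOT the trace data): count = 10000 / rank.
synthetic_counts = [10000.0 / rank for rank in range(1, 501)]
slope, intercept, r_squared = loglog_fit(synthetic_counts)
print(slope, intercept, r_squared)
```

On data that follows Zipf's law exactly, the fitted slope is -1 and R^2 is 1, up to floating-point error. A measured slope such as -0.986 is read against that baseline.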

