There's a set of N machines, arbitrarily connected together. The rule, as
stated by TimBL, is to use no more than root-N storange at any node.
Each machine has an address A, chosen from a fixed set (0..maxIP), and an
identity string I.
The solution is a pseudorandom 'coloring' function that can hash an I into
one of K colors. The set of k-colored machines is randomly distributed about
the network. Each machine n, though, can discover via gossip the 'nearest'
machine of each of the other K-1 colors. Similarly, the machine n can find the
nearest other machine of color-k and join a gossip-tree for keeping complete
information on the machines of that color.
Then, to resolve an I' -> A, talk to any machine of the color hash(I'). It
could be the nearest, which you know, or a random machine, which distributes
the load in case of failure.
Scaling N. This is one of the biggest objections: if you have a thousand
colors in the world, but another 100k machines join the network, how can we
'renumber' the color partitions? One is to assume that local-node storage
increases with Moore's Law, and 1000 colors is good for all time. The problem
is that Moore's Law is a statistical property, not an invariable expansion of
every node.
We use a transition strategy of running a 1000-colored world of tables and a
1001-colored set for a few weeks/days/minutes, and propagate the new,
monotonically-increasing value to the world. Alternatively, David mentioned
adding new tiers to reduce memory load from N^1/2 to N^1/3 (at the cost of
going from O(2) hops to O(3))
Narrow Pipes. There's no direct use of the underlying network topology,
though my guess is this would 'just work' when faced with a narrow
internetwork connection; a node would fill out it's 'zone' of K-1 colored
hosts with notes from 'this side' of the link.
Dynamic host-up, host-down. Not clear at all yet how reasonably this mesh
react to new hosts and the transience of existing hosts. What is known is that
if a host goes down, it's not going to be hammered with requests, nor its
alternate, if we each query is aimed at a different host of that color. [Of
course, I'm a little unclear on how you find other same-color hosts in the
first place, without asking in advance]
Deployment... well, we just won't talk about that in public, OK? I would
guess, though, that we could consider running this color protocol only between
some subset of the machines in the world, where I is restricted to a domain
name and A is a domain server (sound familiar?) Simulations seem key to taking
this idea further
Bottom line: unless we can find a way to change the requirements on him or
punch a hole, we may have to go forward with Dr. K and his students on this
one...
Rohit
PS. Everyone should feel free to blame Dave Raggett for planting the silly
fix-DNS idea in Dr. K's head
PPS. Today's Theorist Tip-Off: Only a theorist can latch onto root-n as a
'real number'. As in "Hey, N's a number, and N^1/2 is a *real* number!"
(Thanks to Matt Levine)