[FoRK] PersonalWeb patents "unique" hashes of content

J. Andrew Rogers andrew at jarbox.org
Wed Sep 19 23:12:56 PDT 2012

On Sep 19, 2012, at 10:38 PM, "Stephen D. Williams" <sdw at lig.net> wrote:
> The test here is the kinds of applications it can be applied to. The canonical CAS application could not work well on any other kind of database method.  The canonical application is storing any number of any kind of data exactly once for each unique blob, creating a fixed-length short (hash length) key for each blob that is globally and repeatably unique.  Document store, deduplicating block/backup store, etc.  Store a bunch of files independently in a distributed fashion so every instance uses the same key for the same data, allowing any kind of sharing, merging, or federation to work.

No one I have ever run into defines the terminology so tightly and it is awfully close to what I have been doing for a long time. If that's the way you want to define it for the sake of argument then I won't argue it. I'm not wedded to the notion; this is the first time I have had any pushback on the terminology.
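
For concreteness, the tight definition quoted above (one copy per unique blob, a fixed-length key derived from the bytes) can be sketched roughly as follows. This is only an illustration: the `BlobStore` class and its methods are made-up names, SHA-256 is an assumed hash choice, and the store is an in-memory dict rather than anything distributed.

```python
import hashlib


class BlobStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # hex digest -> blob bytes

    def put(self, data: bytes) -> str:
        # The key depends only on the bytes, so every instance of the
        # store derives the same key for the same data.
        key = hashlib.sha256(data).hexdigest()
        # setdefault keeps the first copy, so identical blobs are stored once.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)
```

Two independent `BlobStore` instances handed the same bytes return the same key, which is the property that makes the sharing, merging, and federation mentioned in the quote workable.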

>> Space-filling curves were largely never used for indexing for a reason. Hilbert-based indexes were among the best but still not great.
> I understood that their most successful use was probabilistic clustering of record locations for databases.  Apparently someone implemented a color space mapper in hardware using the Hilbert curve somehow.

Sure, but that was like 20 years ago. The computer science would not support such an argument today. Hilbert curves haven't made sense in years. Even in their prime for indexing, the benefit was on the order of 10-15%.

My point, long lost, was that I can content-address arbitrary polygon data in a 12-dimensional space based on a hyper-rectangle intersection -- not simple equality -- relationship, for about the computational cost of a fancy crypto hash, while using a thousand-node cluster.
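
To make the relationship itself concrete, here is a minimal sketch of what "intersection, not equality" means for axis-aligned hyper-rectangles in any number of dimensions. It is emphatically not the indexing method referred to above, which is not described in this thread: it is a brute-force linear scan on one machine, and the names `Box`, `intersects`, and `query` are invented for the example.

```python
from typing import Dict, List, Sequence, Tuple

# A box is (lower corner, upper corner), one coordinate per dimension.
Box = Tuple[Sequence[float], Sequence[float]]


def intersects(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes overlap (touching counts) in every dimension."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(
        al <= bh and bl <= ah
        for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi)
    )


def query(index: Dict[str, Box], probe: Box) -> List[str]:
    """Brute-force lookup: names of all stored boxes that intersect the probe."""
    return sorted(name for name, box in index.items() if intersects(box, probe))
```

With a hash-based store the lookup key either equals a stored key or it does not; here the "key" is a region, and a lookup returns everything whose bounding region overlaps it.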

We can call it not true "content-addressing" but if that is not content-addressing then I don't want content-addressing. I want whatever the guys doing non-hash fancy content-addressing are doing. What do you suggest we call it, ignoring that it is literally content addressing?
