Haystack, a personal information repository.

I Find Karma (adam@cs.caltech.edu)
Wed, 5 Mar 97 03:34:52 PST


They're in mid-project, but who isn't?

> The Haystack project is aimed at the individual customization end of
> these more realistic ``living'' information retrieval systems. We are
> interested in building on customizable substrates, such as those
> provided by Harvest or Content Routing, to create a community of
> individual but interacting ``haystacks'': personal information
> repositories which archive not only base content but also user-specific
> meta-information, enabling them to adapt to the particular needs of
> their users.

Sounds like a noble goal.

> We believe that such a system will let us address several questions:
> How can individuals use an information retrieval system to organize
> their own personal collection of information?


> How might an information retrieval system learn from its users and
> evolve over time into a more effective system?

| To: khare@w3.org
| From: adam@cs.caltech.edu
| unsubscribe fork-l

> As individuals build up their own collections and information
> retrieval systems, how can they search for information that might be
> located in others' collections, especially when such information is
> organized by information retrieval systems that may differ greatly from
> their own?

AltaVista search with host:xent.w3.org

> Our first step towards this goal has been to design a simple and
> convenient user interface to and annotation format for an information
> retrieval system. Our current annotations emphasize user-independent
> text meta-information, but the format for and structure of these
> annotations are intended to encompass hand-generated and automatic
> user-specific annotations. The annotations themselves are first-class
> documents in our system, so that, for example, search information can be
> reified and treated as an indexable object.
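
To make "annotations are first-class documents" concrete, here is a rough sketch of how it might look. This is not the Haystack project's code: the Document and Haystack classes, the field names, and the word-level inverted index are all assumptions made up for illustration. The point is only that notes and queries land in the same index as base content.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    """Hypothetical record: content, annotations and queries share one type."""
    doc_id: str
    text: str
    kind: str = "content"          # "content", "annotation", or "query"
    about: Optional[str] = None    # doc_id being annotated, if any


class Haystack:
    """Toy personal repository with a word-level inverted index (assumed design)."""

    def __init__(self):
        self.docs = {}
        self.index = defaultdict(set)

    def add(self, doc):
        # Every document, whatever its kind, is indexed the same way.
        self.docs[doc.doc_id] = doc
        for word in doc.text.lower().split():
            self.index[word].add(doc.doc_id)

    def annotate(self, target_id, text):
        # An annotation is just another document that points at its target.
        note = Document(f"note-{len(self.docs)}", text, "annotation", target_id)
        self.add(note)
        return note

    def search(self, query):
        words = query.lower().split()
        if words:
            hits = set.intersection(*(self.index.get(w, set()) for w in words))
        else:
            hits = set()
        # Reify the query: store it as a document so later searches can find it.
        self.add(Document(f"query-{len(self.docs)}", query, "query"))
        return sorted(hits)


hs = Haystack()
hs.add(Document("paper-1", "personal information retrieval"))
hs.annotate("paper-1", "useful retrieval survey")
first = hs.search("retrieval")    # matches the paper and its annotation
second = hs.search("retrieval")   # now also matches the stored first query
```

Because the first search is itself stored and indexed, the second search returns it alongside the paper and the note.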

I like that annotations are first-class objects. But where are the


> Given that individuals are organizing the information they care about,
> it is natural to ask how one user can benefit from the work of other
> users. Consider that the typical way to search for a paper book is to
> ask one's office-neighbor for it. Analogously, we would like to let
> individuals search for information in other people's haystacks. Both to
> limit the costs of a search and to improve the filtering of what is
> returned, it is important for the system to learn over time which other
> individuals are most likely to have information that a given user finds
> relevant---these haystack ``neighbors'' are the systems that should be
> queried first and whose results should be most trusted.

This is cool. Trust networks are right on the ball.
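
One way the "learn which neighbors to ask first" idea could work, as a minimal sketch: keep a trust score per neighbor, nudge it up or down on relevance feedback, and query in descending order of trust. The excerpt does not say how Haystack learns this, so the NeighborTrust class, the moving-average update, and the 0.5 / 0.2 constants are all my assumptions.

```python
class NeighborTrust:
    """Hypothetical per-neighbor trust scores, updated from relevance feedback."""

    def __init__(self, neighbors, initial=0.5, rate=0.2):
        self.trust = {name: initial for name in neighbors}
        self.rate = rate  # how quickly new feedback outweighs old history

    def record_feedback(self, neighbor, relevant):
        # Exponential moving average toward 1.0 (relevant) or 0.0 (not relevant).
        target = 1.0 if relevant else 0.0
        old = self.trust[neighbor]
        self.trust[neighbor] = old + self.rate * (target - old)

    def query_order(self, limit=None):
        # Most trusted first; ties broken by name so the order is deterministic.
        ranked = sorted(self.trust, key=lambda n: (-self.trust[n], n))
        return ranked if limit is None else ranked[:limit]


nt = NeighborTrust(["alice", "bob", "carol"])
nt.record_feedback("bob", relevant=True)     # bob's result was useful
nt.record_feedback("alice", relevant=False)  # alice's result was not
order = nt.query_order()
```

Passing a limit to query_order caps how many haystacks get asked, which is the cost-limiting half of the idea.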

> Another opportunity that this linking of haystacks creates is in
> connecting individuals to other people who can address their information
> need. The information I have stored in my haystack is likely a good
> indicator of my knowledge and interests. A question that matches a lot
> of material in my haystack is likely to be a question I can usefully
> answer. The haystack system can therefore serve as an ``information
> brokerage'' connecting questioners to experts.

Much in the way that http://www.ffly.com/ isn't.
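
The brokerage idea boils down to scoring each person's haystack against a question and routing the question to whoever matches best. A toy version, assuming plain word overlap as the match score (the excerpt does not specify one) and made-up owners and documents:

```python
from collections import Counter


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def rank_experts(question, haystacks):
    """Rank haystack owners by how often the question's words occur in their documents.

    `haystacks` maps an owner name to a list of document texts (hypothetical layout).
    """
    q_words = set(tokenize(question))
    scores = {}
    for owner, docs in haystacks.items():
        counts = Counter(w for doc in docs for w in tokenize(doc))
        scores[owner] = sum(counts[w] for w in q_words)
    # Highest score first; ties broken by owner name.
    return sorted(scores, key=lambda o: (-scores[o], o))


haystacks = {
    "alice": ["HTTP caching notes", "HTTP proxy design"],
    "bob": ["distributed systems reading list"],
}
experts = rank_experts("How does HTTP caching work?", haystacks)
```

A real system would want something better than raw counts (stopwords, term weighting), but the routing step stays the same.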

> Sharing haystacks also raises the issue of generalizing from
> individuals' customization of their own haystacks to larger (pooled)
> data-sets. This provides another opportunity to test the adaptability of
> query strategies and a test of the generalization of the underlying
> learning algorithms.

So let me get this straight. Not only is the axiom "Links are
knowledge" true, but "Queries are knowledge" is true too?

This work sounds decent, but I couldn't get their software to run at
Caltech, so for now, I'll take their word for it.


Marketing is the creation of long-term demand, while sales is the
execution of marketing strategies. Marketing is buying the land,
choosing what crop to grow, planting the crop, fertilizing it, and then
deciding when to harvest. Sales is harvesting the crop.
-- Robert X. Cringely