The Nareau Project


From: Rahul Dave (rahul@reno.cis.upenn.edu)
Date: Tue Oct 24 2000 - 11:36:02 PDT


Folks,

I outlined some not terribly well formed notions in
http://www.egroups.com/message/decentralization/237 about the structure of a
next generation component system. At that time I mentioned thinking of working
on an umbrella p2p project structured around the notions laid out there.
My thoughts have since crystallized, so I wanted to share my notions for
this new browser-server-component project, ask for people's opinions, and
hopefully drum up development interest.. :-)

The ambition is to replace the user's regular browser, given that the
browser itself is today controlled by two companies firmly locked into a
"restrict users' choices so we can make money as gatekeepers" mentality.
The idea is to create a new browser, but not just a browser: also an
accompanying server and a new web-based object model.

This document is also available at
http://p2pmania.dyndns.org/index.pl?node_id=467 (the p2p web site I talked
about in my last post), and was also posted to the decentralization group..
The Nareau Project
------------------
Nareau is my name for an umbrella project for a new cross-platform
*open-source* web-wide browser+server+component application.

Nareau is the Spider-God and Creator in Kiribati Mythology.
In the beginning he walked alone in the oppressive darkness of the
primeval earth, Te Bo-ma-te-maki. From water and earth,
he made Na Atibu and Nei Teuke, man and woman respectively.
Together man and woman procreated the gods.

Many components of this kind of a system already exist. However, it is the
synergy of different metadata and the combining of information that
delivers real value. The applications I know of that presently deliver a
fair deal of similar functionality are Magi; Radio Userland with its
nodetypes; Everything (though Everything is not peer2peer); xns.org; and
Groove, which I just heard of and which comes nearest, with its COM
integration and shared spaces. However, as you'll see in the next section,
there is a fair bit that is new in what I am proposing.. a more web-like
object model, the elimination of files, a shell language..

Finally, being a Linux user, I feel disenfranchised by Windows-only
products and don't want to fall into that "80% of the market" trap. I
understand people have to do it, but I want to work on cross-platform apps.

What is it?
===========
The basic idea is a web-wide, *extensible*, lightweight component
system with a browser-server and an open mechanism to
integrate other applications, documents, and services. The server will
spawn a browser as a co-process where appropriate; the browser need not
always be up, and may not even be present on "server" sides.

The key ideas of the system, which IMHO are either somewhat new, not fully
developed elsewhere, or transplanted from their regular sphere of
operation, are:
 
(0) Peer2Peer system with "virtual" peers living in the cloud which handle
presence management, caching, and backup (OK, this is not new, but it is key).

(1) Component system with objects and tasks as first-class objects, i.e.
all methods are objects, with interface extension by inheritance and
implementation replacement at the user's choice.

(2) An object model which eliminates files and encapsulates storage,
enabling easy caching and backup, seamless to the user. Objects publish
metadata, and it is the synergy between multidimensional metadata from
different objects (what does the author of The Snows of Kilimanjaro do? I
see he's also a hunter.. what's his hunting advice?) that adds value.
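
To make that concrete, here is a rough Python sketch (Python being my
prototype language) of such a cross-object metadata join; the triples,
vocabulary, and query function are all invented for illustration:

    # Toy metadata store: (subject, predicate, object) triples, RDF-style.
    # All data and vocabulary here are made up for illustration.
    triples = [
        ("book:snows-of-kilimanjaro", "dc:creator", "person:hemingway"),
        ("person:hemingway", "hobby", "hunting"),
        ("article:hills-hunting-tips", "dc:creator", "person:hemingway"),
        ("article:hills-hunting-tips", "topic", "hunting"),
    ]

    def query(subject=None, predicate=None, obj=None):
        """Return all triples matching a pattern; None is a wildcard."""
        return [(s, p, o) for (s, p, o) in triples
                if subject in (None, s) and predicate in (None, p)
                and obj in (None, o)]

    # "What does the author of Snows of Kilimanjaro say about hunting?"
    author = query("book:snows-of-kilimanjaro", "dc:creator")[0][2]
    advice = [s for (s, p, o) in query(None, "dc:creator", author)
              if query(s, "topic", "hunting")]
    print(advice)   # -> ['article:hills-hunting-tips']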

(3) A namespace and security model flowing from user-centric cryptography
and capabilities with access control lists, programmed with the assumption
of a hostile network. Digital signatures decouple ownership from storage
and are a key component in ensuring the integrity of the data.

(4) Two-way links to URNs, scoped by namespacing rules. Links can be used
for event management and reputation calculation, as well as, well, the
standard web function of linking.

(5) Routing system which allows for the creation of ad-hoc network subnets,
and thus shared spaces, or group p2p networks, increasing the scalability
of the system and the signal-to-noise ratio of the data.

(6) Routing system which spreads queries on metadata in an
almost-but-not-quite Gnutella style, in which peers check their rumour
caches (see my previously mentioned post on the decentralization list) to
spread their queries, and which downloads data in a Freenet style, with
caching along the way saving bandwidth. So this system imposes a
distributed querying model, and a separation of the metadata of an object
from the data itself.
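
A rough sketch of the query-spreading side, assuming each peer keeps a
rumour cache mapping metadata topics to peers that recently advertised
them; the structure and all names here are hypothetical:

    # Sketch: spread a metadata query preferentially to peers whose rumour
    # cache entries suggest they hold matching metadata; fall back to
    # random neighbours. Names and data are invented.
    import random

    rumour_cache = {
        "topic:hunting": ["peer:alice", "peer:bob"],
        "topic:fishing": ["peer:carol"],
    }
    all_neighbours = ["peer:alice", "peer:bob", "peer:carol", "peer:dave"]

    def targets_for(query_topic, fanout=2):
        """Pick peers to forward a query to: rumoured matches first."""
        rumoured = rumour_cache.get(query_topic, [])
        extra = [p for p in all_neighbours if p not in rumoured]
        random.shuffle(extra)
        return (rumoured + extra)[:fanout]

    print(targets_for("topic:hunting"))   # -> ['peer:alice', 'peer:bob']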

(7) The routing system allows for explicit or serendipitous sharing of
information such as annotations, links, and bookmarks, which in conjunction
with an explicit reputation system allows people to become experts; other
people to benefit from their expertise; and the serendipitous matching of
people with publicly declared similar interests..

(8) Object and namespace resolution by querying, used in an extremely
simple shell scripting language which can be used by itself or embedded in
HTML/XUL web pages to enable regular users to script megawidgets such as
calendars. As much as possible, the idea is to enable authoring that makes
programmers more obsolete :-) Querying, namespace resolution, and URN
resolution are all built into the language and in fact enable the simple
working of the browser.

The basic application will be a component description and container layer
with access from a web server, an IM server (Jabber), an SMTP+POP+IMAP
client, and a browser/IM client.

The key software leveraged will be Mozilla, specifically Mozilla's XPCOM
component technology and XUL for user display; Apache 2.0 for the web; and
Jabber for IM. SOAP will be used as the component serialization/RPC
protocol and RDF as the component description language, with local
integration provided by technologies like COM and Bonobo, in conjunction
with XPCOM.

I have already started developing using Python, wxPython, and IE on
Windows. This will move to C++, wxWindows-C++, and Mozilla on Linux,
Windows, and Mac. The first app on Windows will allow people to share and
browse web sites together.. something useful for shopping together or
distance learning.. and very easy to program by just wrapping the IE
control and communicating changes in a SOAP message.
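
As a sketch of what such a shared-browsing message might look like: only
the Envelope/Body structure below is standard SOAP; the Navigate method,
its namespace, and its fields are my invention:

    # Hypothetical SOAP message announcing a navigation event to a peer.
    # Only the Envelope/Body skeleton is standard SOAP; the m:Navigate
    # method and its fields are made up for this sketch.
    navigate_msg = """<SOAP-ENV:Envelope
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
      <SOAP-ENV:Body>
        <m:Navigate xmlns:m="urn:nareau:cobrowse">
          <m:url>http://www.example.com/catalog</m:url>
          <m:initiator>rahul</m:initiator>
        </m:Navigate>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>"""

    # Each peer's server would POST this to its partners, whose wrapped
    # IE controls then navigate to the same URL.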

Mozilla + wxWindows will make it instantly portable to unix/mac/windows/..?
By replacing wxWindows with, say, FLTK, one can even make it portable to
embedded POSIX-ish OSes with Microwindows, where Mozilla has been ported,
and an entire useful system can be had in a 32MB (still big) footprint for
web-pads, etc.

Features:
=========
From a user's perspective, this will be a new browser which integrates the
locally available applications with those at his or her friends' machines
or at the user's favourite web site. Here are the features I am thinking
of. These mostly line up with the points made above..

(1) Component model with publishable metadata and first-class methods.
Thus the ability to swap a method from one provider with another's: e.g.
the print method on a document object, or the buy method on a palm-pilot
object. Using inheritance, third parties will be able to add methods to
interfaces and distinguish themselves.
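
A minimal sketch of how first-class, swappable methods could work; the
registry and all names are hypothetical:

    # Sketch: methods are objects in a registry, so the user can swap the
    # implementation bound to an interface slot. All names are invented.
    class MethodRegistry:
        def __init__(self):
            self.impls = {}            # (interface, method) -> callable
        def register(self, interface, method, impl):
            self.impls[(interface, method)] = impl
        def invoke(self, interface, method, obj, *args):
            return self.impls[(interface, method)](obj, *args)

    registry = MethodRegistry()
    registry.register("Document", "print", lambda doc: "local printer: " + doc)
    # The user prefers a third-party print service: swap the implementation.
    registry.register("Document", "print", lambda doc: "kinkos.example: " + doc)
    print(registry.invoke("Document", "print", "my-thesis"))
    # -> kinkos.example: my-thesis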

(2) Decoupling data from storage through RDF backend drivers. Thus, no
explicit notion of files in the system. Cloud caching onto storage servers
and backup of objects will be done automatically. This will enable
AnywhereAccess for a user to his or her data, whilst preserving the P2P
nature of the system. The caching protocol will take into account the
user's presence and the permanence of connections. E.g., a visitor will be
able to get published objects from a cache when a user's home machine is
not online, or resolve to the user's office machine when the user is
there; the storage may be a filesystem, or a database, or an
up-to-the-minute CGI call..
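
A sketch of the backend-driver idea, with the same object GUID resolving
through whichever driver currently holds it; the driver classes and the
storage path are invented:

    # Sketch: storage drivers behind a common interface, so objects carry
    # no notion of files. Driver classes and layout are hypothetical.
    class FileDriver:
        def get(self, guid):
            with open("/var/nareau/" + guid) as f:   # hypothetical layout
                return f.read()

    class CacheDriver:
        def __init__(self, cache):
            self.cache = cache
        def get(self, guid):
            return self.cache[guid]

    def resolve(guid, drivers):
        """Try each driver in turn: local store, cloud cache, ..."""
        for d in drivers:
            try:
                return d.get(guid)
            except (KeyError, IOError):
                continue
        raise KeyError(guid)

    # A visitor whose friend's machine is offline falls back to the cloud:
    data = resolve("guid123", [CacheDriver({"guid123": "<photo metadata>"})])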

So you could be at the office, or at home, or on the road, or at an
airport kiosk, and your info could be picked up from your virtual presence
in the cloud or, depending upon your on-the-road hardware, even locally..

(3) The data object model has only two roots: object and objecttype. There
is one type of container, called a bundle; a bundle representing an
object, its properties, and its methods is called a model. This is a very
simple model, based on RDF and the Everything system (www.everydevel.com),
but incredibly powerful. RDF enables the later layering of semantic web
operations and lets us draw on the considerable work going on around
extensibility and querying options.
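
In RDF-ish triples, a bundle might look like the sketch below; the
nareau: vocabulary is entirely invented, just to show how an object, its
properties, and its methods hang together:

    # Sketch: a "model" bundle for a document object, as triples.
    # The nareau: vocabulary is hypothetical.
    bundle = [
        ("urn:nareau:obj:42", "rdf:type",          "nareau:object"),
        ("urn:nareau:obj:42", "nareau:objecttype", "document"),
        ("urn:nareau:obj:42", "dc:title",          "Trip Report"),
        ("urn:nareau:obj:42", "nareau:method",     "urn:nareau:obj:77"),
        ("urn:nareau:obj:77", "rdf:type",          "nareau:object"),
        ("urn:nareau:obj:77", "nareau:objecttype", "method"),   # e.g. print
    ]
    # Note that the method is itself an object (idea (1) above), so it can
    # be described, queried, and replaced like any other object.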

(4) Addressing in a P2P system must deal with three aspects: presence, a
GUID, and naming using this GUID. The initial global namespace model will
give way to a local namespace model using SPKI cryptography. The
namespacing will provide the URN addressing scheme, query prefixing, and
method-calling schemes. That is, a GUID will be generated from the user's
SPKI public key, and the mapping to strings will live in the local
namespace.
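
A sketch of the GUID scheme, assuming (in the SPKI spirit, where a
principal is its key) that a hash of the public key makes a serviceable
GUID, with the local namespace as a per-user map of pet names to GUIDs:

    import hashlib

    # Sketch: derive a GUID from a user's public key, then bind local pet
    # names to GUIDs. The urn prefix and names are hypothetical.
    def guid_for(public_key_bytes):
        return "urn:nareau:" + hashlib.sha1(public_key_bytes).hexdigest()

    local_namespace = {}                      # pet name -> GUID, per user
    local_namespace["mom"] = guid_for(b"---mom's public key bytes---")

    # "mom/photos/beach" now resolves relative to mom's GUID, not to any
    # global directory; another user may call the same principal "grandma".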

(5) An SPKI- and capability-based access and execution control model with
an explicit notion of safe and unsafe methods. The security and
namespacing models flow from the same mechanism. In some cases, web sites
may want to use the app just to safely control first access before
providing cookies. The notion of safe methods means that a web site could
execute some of its functionality locally on the user's web server. In
addition, the SPKI-based authorization certificate system will determine
who can do exactly what on a local system, and this will enable safe
routing. Safer operations will be faster.
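
One hypothetical shape for the safe/unsafe distinction plus capability
checks; this is not real SPKI library code (real SPKI certificates are
s-expressions), just the outline of the idea:

    # Sketch: methods declare themselves safe (side-effect free) or
    # unsafe; unsafe calls additionally require an authorization cert.
    # All names and the dict-shaped "certs" are invented.
    SAFE_METHODS = {"read", "render", "query"}

    def authorize(caller_certs, method, obj):
        if method in SAFE_METHODS:
            return True                      # fast path: no cert check
        return any(c.get("tag") == (method, obj) for c in caller_certs)

    certs = [{"issuer": "alice-key", "subject": "bob-key",
              "tag": ("print", "doc:42")}]
    print(authorize(certs, "render", "doc:42"))  # -> True (safe method)
    print(authorize(certs, "print",  "doc:42"))  # -> True (cert grants it)
    print(authorize(certs, "delete", "doc:42"))  # -> False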

(6) Mozilla- and CLI-based interface with a focus on objects and tasks,
where tasks are a set of pipelined methods on objects, much like a unix
shell. A simple shell language whose design must be such that ordinary
users can use it. Deeper access comes through local component models and
XPCOM, and methods ought to be available from any language. Users don't
care about files or other representations; they want their data, and to do
something with it, or they want to do tasks, no matter what object
interaction that involves.
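
A toy sketch of the task-as-pipeline idea; the eventual shell syntax does
not exist yet, so this just shows methods chained over object streams,
unix-style, with invented data:

    # Sketch: a task is a pipeline of methods applied to a stream of
    # objects, much like a unix shell pipes text. Hypothetical methods.
    def pipeline(objects, *stages):
        for stage in stages:
            objects = stage(objects)
        return list(objects)

    photos = [{"name": "beach.jpg", "year": 2000},
              {"name": "office.jpg", "year": 1999}]

    recent = lambda objs: (o for o in objs if o["year"] >= 2000)
    titles = lambda objs: (o["name"] for o in objs)

    # Roughly "photos | recent | titles" in the eventual shell language.
    print(pipeline(photos, recent, titles))   # -> ['beach.jpg']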

(7) Authoring using HTML and XUL megawidgets, such as calendars written
with CSS and Mozilla's XBL. Authoring is meant to remain as simple as
possible without requiring users to know JavaScript, using instead the
object model itself as exposed by the simple shell. Since each client is
also a server, authoring publishes locally, which is picked up by the
cloud. The simple shell scripting will resolve automatically, hiding
presence complexity from the user.

(8) Since the browser and server are co-processes, their internals are
exposed to each other via XPCOM interfaces. Thus the server can access and
store the DOM tree in a browser, with all annotations. So instead of
parsing XML twice, once at the server and once at the client, it can be
done at the user where appropriate, and furthermore the user's
manipulations there may be useful information which the user might see fit
to publish. Historical link trails, bookmarks, and annotations may be
useful for sharing, as well as for simple scripting using links, events,
etc.. sorta like C-x-e Emacs macro recording, but somewhat higher level..

(9) The notion of groups and subnets based on group membership. A subnet
could be as large as a shopping network or as small as a family. ACLs will
determine who accesses what, and a particular machine could be on multiple
subnets simultaneously, without information leaking from one subnet into
another.

(10) Initially, a F2F (friend2friend) network with broadcast will be
implemented, rather than a generalized P2P one with routing. However,
broadcast does not scale: for larger subnets, Gnutella+Yenta-like metadata
spreading and Freenet-style data download caching will be needed.

(11) The publishing of objects to groups is a process analogous in some
ways to syndication, and can be used for email, where the metadata is sent
in the email and the data is looked at only when the email is read, even
without the Nareau browser. Essentially, syndication can be thought of as
the process of applying a query filter. Arbitrary collections of objects
can be made, and users can choose their display formats. When objects are
published, the metadata is spread to within many peers' horizons, whilst
the data itself is cached on download.
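
Since syndication here is just a standing query filter, a sketch could be
as simple as the following; the field names and send() are placeholders:

    # Sketch: a subscription is a standing query over published metadata;
    # syndication = applying the filter as new objects are published.
    subscriptions = [
        {"who": "peer:bob",
         "filter": lambda md: md.get("topic") == "investing"},
    ]

    def send(peer, metadata):
        print("notify", peer, metadata["title"])   # stands in for routing

    def publish(metadata):
        """Spread metadata to every subscriber whose filter matches."""
        for sub in subscriptions:
            if sub["filter"](metadata):
                send(sub["who"], metadata)

    publish({"topic": "investing", "title": "Why I sold my tech stocks"})
    # -> notify peer:bob Why I sold my tech stocks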

(12) Linking. Direct linking will be possible to URNs, as described, using
the namespace prefixes. The system will automatically expand the links at
runtime to a friend's computer or the nearest cache, depending on context.
The existence of a server now allows for two-way links, i.e. back-links to
the origin. The forward as well as backward link structure can be used for
implicit reputation management, and for link-change broadcast. Thus such
links can be used as event management tools and for caching and
synchronization.
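
A sketch of the context-dependent link expansion; the resolution order
(owner first, then nearest cache) is my guess at the behaviour, not a
spec, and all names are invented:

    # Sketch: expand a namespaced link at runtime to whichever replica is
    # reachable: the owner's machine if online, else the nearest cache.
    def expand(urn, presence, caches):
        owner = presence.get(urn)            # owner online right now?
        if owner:
            return owner + "/" + urn
        for cache in caches:                 # else nearest cloud cache
            if urn in cache["holds"]:
                return cache["host"] + "/" + urn
        raise LookupError(urn)

    presence = {}                            # owner offline in this example
    caches = [{"host": "cache1.example.org", "holds": {"urn:nareau:abc"}}]
    print(expand("urn:nareau:abc", presence, caches))
    # -> cache1.example.org/urn:nareau:abc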

(13) Explicit reputation management using a centralized service is useful
for ranking products and services along multiple metadata axes. Implicit
reputation management could be done based on link structure.
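
Implicit reputation from link structure could start as simply as a
weighted back-link count, as in this sketch with invented data:

    # Sketch: implicit reputation as a weighted count of two-way
    # back-links, where a link from a reputable object counts for more.
    backlinks = {
        "obj:review-1": ["obj:expert-page", "obj:random-page"],
    }
    base_score = {"obj:expert-page": 5.0, "obj:random-page": 1.0}

    def reputation(obj):
        return sum(base_score.get(src, 1.0)
                   for src in backlinks.get(obj, []))

    print(reputation("obj:review-1"))   # -> 6.0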

(14) Knowledge sharing can be used to implement a Yenta-like system where
the knowledge of experts, as encapsulated in their public objects, is
shared and consumed, and people can become experts through explicit
reputation management. Web page annotations and markings, bookmarks,
links, and published syndications can be used for this purpose. It may
even be possible to achieve some of this in a distributed fashion, though
I am not sure how.

Why Nareau is useful
====================
Or why I think I must work on it..

(1) In a web-service model, what happens to your data when your ASP dies?
This model provides a non-ad-hoc system for all ASP data.

(2) Brand doesn't make sense on the web unless you are already big. How
can you compete on quality rather than name? You publish the product as an
object, and the product and your service can be rated on factors such as
price, customer service, and delivery, all in one place. Since your
visitors may not be coming from a directory, you don't have to pay a
portal deeply for your visibility.

Standard brand building has been a failure on the web, since the web is
not TV, i.e. not push. Many internet companies overspent, not realizing
this simple notion and remaining stuck in the old marketing. For the first
time, the web lets the small player compete; so far this has been
squandered because, thanks to the centralizing powers that be, there has
been no mechanism integrated AT the user for competing through metadata
comparison and reputation.

(3) The browser is stuck in the dark ages and will be as long as the
companies controlling it have the portal's gateway-to-information
mentality. Presently both portal companies and browser companies treat
people like eyeballs. My hope is that a system like this puts the portal,
and whatever else, right at the user, and changes the power equation.

(4) Users don't manage organizing files well, and don't have web-wide rich
metadata options for description and query. In contrast, building a system
based on an object graph allows for deep description, caching, and
replication, seamlessly.
 
(5) Methods in an OS should be cognizant of presence and context, and
today's OSes don't do a good job of that. In a web world, methods should
not be tightly bound to objects, as the data provider may not be the data
operator.
 
(6) Dynamic web sites can export their own metadata, enabling the user to
use such a site's services directly, along with his other metadata.
Furthermore, this allows distributed searching and serendipitous
combinations of information that might not otherwise have happened.

(7) Since certain parts of a web site may execute on the user's local
server or a third-party server, this alleviates bandwidth and, more
importantly, delay problems, allowing for a lighter load on web sites.
  
(8) Knowledge sharing enables people to discover their commonalities and
organize into groups and, applying the same process inside a group, to
learn from experts and become experts themselves as their reputation
increases. People can filter content by relying on others.

Points (3) and (8) motivate me the most. As Dave Winer is apt to say..
let's get rid of the middlemen!

Examples of usage
=================
I will expand on these cases later, at a web site.
(1) A man publishing information for everyone and also something only his
family can see; that is, the ability to publish information, documents,
photos, etc. into different spaces.
(2) A scientist managing a project team and collaborating with them, where
the app can be used as a shared development project area.
(3) A young investor learning from other investors, especially an old pro,
by watching published objects.

Development Path
================
The initial prototype testing system is to be
python+IE+redfoot(RDF)+COM+pisces(SPKI).
The reason for this is that Mozilla is not yet scriptable from Python;
ActiveState will be out with this support soon.

Then onward to Mozilla, in C++ and wxWindows, with integration with local
component systems. The first project there, in my mind, is a
user-collaborative project workspace, a generalization of a prototype
dictionary-based project management system I wrote last year in my job as
a systems programmer here at Penn. At this stage there will be no metadata
spreading separate from the data, and only broadcast with no complex
routing.

From there onwards I want to generalize to a full consumer application. There
is an entire ecology of services that can be developed around such a system:
cloud storage services, inverse double blind marketing, customized
applications for intranets, vertical metadata extraction from document
services, namespacing, subnet specific services...

So, are you interested in developing a new, cross-platform, web-wide
object model and browser? What do you think? Let's get together and
brainstorm, then.. I can be emailed at rahul@reno.cis.upenn.edu. Whilst
this list is appropriate for architectural details, project details are
best carried offline, and I will set up a SourceForge project soon..

Thanks,
Rahul


