[FoRK] Q re: ConceptNet (also FluidDB)

J. Andrew Rogers andrew at ceruleansystems.com
Wed Oct 21 10:03:40 PDT 2009

On Oct 21, 2009, at 7:43 AM, Jeff Bone wrote:
> JAR asks:
>> Are we talking about ConceptNet/OpenMind specifically or semantic  
>> web technologies generally?  And how do you define "interesting"?
> ConceptNet specifically.  It's got some interesting attributes and  
> applications that most of the "strong" ontological systems don't  
> have (at least in my own uses of them.) "Interesting" in this case  
> means just that --- useful beyond what e.g. Cyc, various rdf /  
> semweb technologies, and so on have been in my experience.   
> Particularly when dealing with large and diverse natural language  
> corpuses, CN2/3 (with supplemental data) have proven more useful at  
> various extraction, classification, and extrapolation tasks than  
> other methods including statistical ones, naive bayes, etc.  (For  
> me.  YMMV.)

In my experience, you are entirely correct that most semantic web  
technologies are over-structured to the point of being not very  
useful.  Most of the interesting R&D currently is on very general  
graph analytic systems that subsume the rigid classical models but do  
not require them.  Rigid models are more susceptible to the numerous  
NP land mines that litter this theoretical landscape.

> The actual CN tools themselves aren't as useful (you're correct,  
> toys so far) as the knowledge base per se and its data model ---  
> which can be easily embedded into a slightly more robust model that  
> more easily and effectively handles meta-information such as  
> provenance, etc. --- and does so defeasibly if necessary, important  
> for real-world use.  IMHO, those are the major conceptual problems  
> w/ the rdf-like approaches: representing defeasible information as
> such and handling reification and self-referentiality.  Do-able, but  
> not in a satisfying or particularly practical way.

Yes, which is why a lot of the current R&D is focused on generalized  
graph-like computation, not the narrow case of RDF-like systems.  You  
can do it with RDF-like systems in principle, but not efficiently, and
efficiency is very important for most real work at the scales  
required. Current popular tools and models are badly designed for the  
actual markets for this kind of technology.
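
To make the reification point concrete, here is a minimal sketch in
Python. It assumes a toy model in which every edge gets its own
identifier, so an edge can be the subject or object of another edge.
It is not ConceptNet's API or any RDF store's API; the Graph class,
the "asserted_by" and "retracts" relations, and the contributor names
are all hypothetical, chosen only for illustration.

```python
from itertools import count


class Graph:
    """Toy edge-labelled graph in which edges are first-class (hypothetical)."""

    def __init__(self):
        self.edges = {}          # edge id -> (subject, relation, object)
        self._next_id = count()  # source of fresh edge ids

    def add(self, subj, rel, obj):
        """Store an edge and return its id, which can itself be used as a node."""
        eid = ("edge", next(self._next_id))
        self.edges[eid] = (subj, rel, obj)
        return eid

    def about(self, target):
        """Return all edges whose subject is `target` (e.g. provenance of an edge)."""
        return [(eid, e) for eid, e in self.edges.items() if e[0] == target]

    def holds(self, eid, _seen=frozenset()):
        """Defeasible check: an edge holds unless a retraction that itself holds targets it."""
        if eid in _seen:
            return True  # cycle guard: stop recursing on self-referential retractions
        seen = _seen | {eid}
        for rid, (_subj, rel, obj) in self.edges.items():
            if rel == "retracts" and obj == eid and self.holds(rid, seen):
                return False
        return True


g = Graph()
fly = g.add("bird", "CapableOf", "fly")
g.add(fly, "asserted_by", "contributor_1")        # provenance: an edge about an edge

penguin = g.add("penguin", "CapableOf", "fly")
veto = g.add("contributor_2", "retracts", penguin)
penguin_before = g.holds(penguin)                  # retracted, so it does not hold

g.add("contributor_3", "retracts", veto)           # the retraction is itself retracted
penguin_after = g.holds(penguin)                   # the original assertion holds again
```

Because provenance and retraction are ordinary edges, nothing special
is needed to talk about a statement; in a triple-only model the same
thing takes several extra triples per reified statement.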

The primary real (and "interesting") use case for these types of  
models in commercial and other systems is induction and prediction in  
highly dynamic data environments at large scales.  There are  
organizations starting to build more generalized graph systems for  
this purpose at very large scales, but this definitely isn't open source  
(or even shrink-wrap) technology at this point. It will probably be a  
few years before this creeps into the web at large.
