[FoRK] large scale dataset mailing list/resources?
Ken Meltsner
<meltsner at alum.mit.edu> on
Wed Feb 20 12:18:25 PST 2008
I've looked a bit at Virtuoso, an open-source "everything but the
kitchen sink" database system, including RDF, a virtual database
(distributed queries), web services, etc.
It would seem to scale up to the size of your friend's application:
http://virtuoso.openlinksw.com/blog/
Most straight RDF triplestores seem to hit the wall at millions of
triples. I believe the Holy Grail is 1 billion triples; that's not
huge by relational database standards, but it's definitely big enough
for most applications.
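
For anyone unfamiliar with the term, a triple is just a (subject,
predicate, object) statement. Here is a toy sketch in Python of what a
triplestore keeps track of. The class name TinyTripleStore and the ex:
identifiers are made up for illustration and are not the API of
Virtuoso or any other real product. It only hints at why storage grows
quickly: every triple is held once in the main set and once more in
each of three lookup indexes.

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        self.triples = set()
        # One index per position so a lookup can start from any bound term.
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, s, p, o):
        """Store one (subject, predicate, object) statement in every index."""
        t = (s, p, o)
        self.triples.add(t)
        self.by_subject[s].add(t)
        self.by_predicate[p].add(t)
        self.by_object[o].add(t)

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        bound = [
            (s, self.by_subject),
            (p, self.by_predicate),
            (o, self.by_object),
        ]
        # Collect the candidate set for each bound term, then scan the smallest.
        candidates = [index.get(term, set())
                      for term, index in bound if term is not None]
        pool = min(candidates, key=len) if candidates else self.triples
        return sorted(
            t for t in pool
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


# Made-up example data.
store = TinyTripleStore()
store.add("ex:Virtuoso", "rdf:type", "ex:Database")
store.add("ex:Virtuoso", "ex:supports", "ex:RDF")
store.add("ex:Calais", "rdf:type", "ex:WebService")

for triple in store.match(p="rdf:type"):
    print(triple)
```

A real store would keep these indexes on disk and compress them, which
is roughly where the scaling difficulty comes in.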
And another side note:
Reuters, via a subsidiary named ClearForest, is now offering free
entity tagging for text. There's a browser interface at:
http://autotagger.opensynapse.net/
if you want to try it out before applying for a key to use the web service.
The main developer site is:
http://opencalais.com/
Ken Meltsner
Excerpt:
What is Calais?
We want to make all the world's content more accessible, interoperable
and valuable. Some call it Web 2.0, Web 3.0, the semantic web or the
Giant Global Graph - we call our piece of it Calais.
The core of Calais is our web service. We're working to make this
service more accessible by developing sample applications, supporting
developers and offering bounties for specific capabilities.