FoRK and Spam..
Gerald Oskoboiny
gerald@impressive.net
Mon, 18 Mar 2002 17:20:32 -0500
On Mon, Mar 18, 2002 at 02:27:09PM -0500, Dan Brickley wrote:
:
> Coincidentally enough, I finally got around to moving to white-list based
> filtering this weekend. I've a list of 'known senders' harvested from
> various places (my sent-mail, addressbooks etc).
>
> I started out following Gerald's recipe:
>
> http://impressive.net/people/gerald/2000/12/spam-filtering.html
>
> ...and got to thinking about the possibility of white-list sharing, since
> my 'unknown senders' folder was initially at least still getting lots of
> false hits (mostly from people on mailing lists, but also from
> occasional correspondents who are known in the Webby community, but not
> in my sent-mail or addressbook).
very good idea imho; should eventually evolve into a big web of trust
thingy involving thousands of data sources... (and better tools so
you can say "this message got through but is really spam; decrease
the level of trust in the data source that told me it's legit"
with a single keypress)
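
A rough sketch of how that feedback loop might work, purely as an illustration. The names (TrustStore, report_spam), the 0..1 trust scale and the halving penalty are all assumptions of mine; nothing like this exists in the recipe or in Dan's script.

```python
from dataclasses import dataclass, field


@dataclass
class TrustStore:
    """Tracks which whitelist sources vouch for which hashed mailboxes."""
    # source name -> trust level in [0.0, 1.0] (hypothetical scale)
    trust: dict = field(default_factory=dict)
    # hashed mailbox -> set of source names that listed it
    vouchers: dict = field(default_factory=dict)

    def add_source(self, name, hashes, trust=0.5):
        """Register a data source and the mailbox hashes it vouches for."""
        self.trust[name] = trust
        for h in hashes:
            self.vouchers.setdefault(h, set()).add(name)

    def score(self, mailbox_hash):
        """Highest trust among the sources vouching for this mailbox."""
        names = self.vouchers.get(mailbox_hash, set())
        return max((self.trust[n] for n in names), default=0.0)

    def report_spam(self, mailbox_hash, penalty=0.5):
        """The 'single keypress': scale down every source that vouched."""
        for n in self.vouchers.get(mailbox_hash, set()):
            self.trust[n] *= penalty
```

A mail filter could then accept a message only when score() for the sender's hash is above some threshold, so a source that keeps vouching for spammers gradually stops counting.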
> I think Gerald and others mostly don't try to whitelist for mailing lists,
> and just pipe mailing lists into separate folders. I was wondering whether
> if one adopted conventions for scrambling mailboxes (sha1/md5 or
> whatever), it'd be possible to harvest lists of 'known senders' from FoRK
> etc's list management tool.
>
> For example, my whitelist exposed as RDF looks like:
>
> http://tux.w3.org/~danbri/rdfweb/foafwhite.xml
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:foaf="http://xmlns.com/foaf/0.1/">
> <foaf:NonSpamMailboxURI foaf:sha1Value="721ae0b3232bf1ce6486d952fa6629ff31e6edf6"/>
:
> </rdf:RDF>
cool!
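
For anyone who wants to pull the hashes out of a file like that without a full RDF parser, here is a minimal sketch using only the Python standard library. It assumes the attribute-style serialization shown above (one foaf:NonSpamMailboxURI element per entry); other RDF/XML spellings of the same data would need a real RDF parser. SAMPLE and extract_hashes are names made up for this example.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"

# Cut-down copy of the whitelist format quoted above (one entry only).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:NonSpamMailboxURI foaf:sha1Value="721ae0b3232bf1ce6486d952fa6629ff31e6edf6"/>
</rdf:RDF>"""


def extract_hashes(rdf_xml):
    """Return the set of sha1Value strings found in a whitelist document."""
    root = ET.fromstring(rdf_xml)
    hashes = set()
    # ElementTree spells namespaced names as "{namespace}local".
    for node in root.iter(f"{{{FOAF}}}NonSpamMailboxURI"):
        value = node.get(f"{{{FOAF}}}sha1Value")
        if value:
            hashes.add(value.strip().lower())
    return hashes
```

Calling extract_hashes(SAMPLE) gives a set holding the one hash from the example, ready for membership tests.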
> You can harvest this, and use it (by downcasing and sha1-ing) to see if a
> mailbox is known to me and believed to be a non-spammer. But it doesn't
> readily expose my contacts list, and since it carries no semantic other
> than 'mailboxes dan has heard of that he doesn't believe are used by
> spammers', I'm not compromising my privacy or that of my various
> correspondents.
It is a tiny privacy hole in that it lets others find out that
you have e.g. premium-subscribers@rdfporn.com on your whitelist.
(I don't think I care about that, but some might.)
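
To make both the lookup and the privacy hole concrete, a small sketch of the "downcase and sha1" check. One assumption to flag loudly: the thread doesn't say whether the hash covers the bare address or a "mailto:" URI, so the prefix parameter below is a guess to be adjusted to whatever the published hashes actually use. mailbox_digest and is_known_sender are illustrative names.

```python
import hashlib


def mailbox_digest(address, prefix=""):
    """SHA-1 hex digest of the lowercased mailbox.

    prefix is an assumption: pass "mailto:" if the published hashes
    turn out to cover the full mailbox URI rather than the bare address.
    """
    normalized = (prefix + address.strip()).lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def is_known_sender(address, whitelist_hashes, prefix=""):
    """True if the address hashes to an entry in the harvested whitelist."""
    return mailbox_digest(address, prefix) in whitelist_hashes


# A whitelist holding one (made-up) correspondent.
whitelist = {mailbox_digest("alice@example.org")}
print(is_known_sender("Alice@Example.ORG", whitelist))
print(is_known_sender("stranger@example.com", whitelist))
```

The same is_known_sender call is all it takes to exploit the hole: anyone who can guess an address can test whether it is on the published list, even though the list can't be read off directly.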
> So I was thinking I'd have a little whitelist harvesting script(*) pull in
> a few of these each day from friends and colleagues, making it that bit
> less likely that folk from (mumble) "the web community" would find their
> messages languishing in my unknown-senders folder.
This might make it cost-effective for me to start maintaining and
using blacklists as well as whitelists; also needed would be a
"refilter new mail in this mailbox" script (easily doable.)
> How's that sound? Anybody fancy trying this?
I'm all over it! (time/attention permitting; keep bugging me ;)
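
As a starting point for the "refilter new mail in this mailbox" script mentioned above, here is a rough sketch using Python's standard mailbox module. Assumptions: both folders are mbox files, the whitelist is a set of SHA-1 hex digests of lowercased bare addresses, and nothing else is writing to the folders while it runs (a real script should lock them first). refilter and sender_digest are hypothetical names.

```python
import hashlib
import mailbox
from email.utils import parseaddr


def sender_digest(message):
    """SHA-1 hex digest of the lowercased address in the From: header."""
    _, address = parseaddr(message.get("From", ""))
    return hashlib.sha1(address.lower().encode("utf-8")).hexdigest()


def refilter(unknown_path, known_path, whitelist_hashes):
    """Move whitelisted messages out of the unknown-senders mbox.

    Returns the number of messages moved. No locking is done here.
    """
    unknown = mailbox.mbox(unknown_path)
    known = mailbox.mbox(known_path)
    moved = 0
    try:
        for key in list(unknown.keys()):
            msg = unknown[key]
            if sender_digest(msg) in whitelist_hashes:
                known.add(msg)       # append to the known-senders folder
                unknown.remove(key)  # mark for deletion; applied on flush
                moved += 1
        known.flush()
        unknown.flush()
    finally:
        known.close()
        unknown.close()
    return moved
```

Run after each whitelist harvest, something like this would rescue messages from senders who only became "known" after their mail had already been filed.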
> (*) rough-cut Ruby code that implements much of this (requires external
> RDF parser) is at http://www.w3.org/2001/12/rubyrdf/util/foafwhite/foafwhite.rb
>
--
Gerald Oskoboiny <gerald@impressive.net>
http://impressive.net/people/gerald/