[Fwd: Mail lives forever (Re: Bill Joy kicks around XML for fun :-)]

From: Stephen D. Williams (sdw@lig.net)
Date: Mon Feb 26 2001 - 05:50:18 PST

sdw@lig.net                 sdw@insta.com                
Stephen D. Williams         Insta, Inc./Jabber.Com, Inc./CCI    
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 

attached mail follows:

On Sun, Feb 25, 2001 at 07:24:24PM -0500, Stephen D. Williams wrote:
> "Joseph S. Barrera III" wrote:
> >
> > Rohit Khare writes:
> > > BJ doesn't get it. Data lives forever. Programs are only mortal.
> >
> > Okay, may I do a little poll, and ask everyone in what format they
> > keep their megabytes (or gigabytes) of mail?
>
> Gigabytes (about 2), in mbox.
>

1.2 Gig, mostly in a combination of mbox and gzipped mbox. Procmail splits incoming into about forty folders; I have a short script which flips the most recent mail into a separate dated archive every Sunday or so.
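The weekly script itself isn't shown in this message. As a rough illustration only, here is a minimal Python sketch of that kind of rotation, assuming the folder is a single mbox file and the archive should be a gzipped mbox named after the folder and the date. The function name archive_mbox, the file naming scheme, and the stamp parameter are all hypothetical.

```python
import gzip
import mailbox
import shutil
import time
from pathlib import Path


def archive_mbox(folder, archive_dir, stamp=None):
    """Move every message from the mbox `folder` into a dated .mbox.gz file.

    Returns (archive_path, message_count), or None if the folder was empty.
    """
    stamp = stamp or time.strftime("%Y-%m-%d")
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    plain = archive_dir / f"{Path(folder).name}-{stamp}.mbox"

    inbox = mailbox.mbox(str(folder), create=False)
    dated = mailbox.mbox(str(plain))
    inbox.lock()  # keep the delivery agent from appending mid-move
    moved = 0
    try:
        for key in inbox.keys():
            dated.add(inbox[key])  # copy first, then drop from the live folder
            inbox.discard(key)
            moved += 1
        dated.flush()
        inbox.flush()
    finally:
        inbox.unlock()
        dated.close()
        inbox.close()

    if moved == 0:
        plain.unlink()
        return None

    # Append mode: a second run on the same day adds another gzip member
    # instead of overwriting the earlier archive.
    gz = plain.with_name(plain.name + ".gz")
    with open(plain, "rb") as src, gzip.open(gz, "ab") as dst:
        shutil.copyfileobj(src, dst)
    plain.unlink()
    return gz, moved
```

Run from cron every Sunday, something like archive_mbox("/path/to/folder", "/path/to/archive") per folder would leave the live mbox empty and the week's mail in a file that tools able to read gzipped mbox can still search.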

I use grepmail <http://grepmail.sourceforge.net/> and grepm <http://www.privat.schlund.de/b/barsnick/sw/grepm.html> for searching. They work well with mutt and with directories of gzipped mbox files.

I've just moved my home machine onto Linux 2.4/reiserfs and, just for the hell of it, switched to maildir format <http://www.qmail.org/man/man5/maildir.html>. I figure I can always recompose the files back into mbox if it gets awkward. So far the main problem is that grepmail and grepm don't support it, and command-line grep balks at the large number of separate files. This shouldn't be a problem long term (I'll get around to writing a script eventually), but to be honest I don't see much long-term advantage either.
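For what it's worth, the missing search script could be quite small. The following is only a sketch of one way to do it, using Python's standard mailbox module to walk a maildir one message at a time, so the number of files never reaches a command line. The name grep_maildir is made up for illustration, and the match runs against the raw message bytes, so base64 or quoted-printable bodies are not decoded first.

```python
import email
import mailbox
import re


def grep_maildir(path, pattern, flags=re.IGNORECASE):
    """Yield (key, subject) for each maildir message whose raw text matches."""
    regex = re.compile(pattern.encode(), flags)
    box = mailbox.Maildir(path, create=False)
    try:
        for key in box.iterkeys():
            raw = box.get_bytes(key)  # headers and body, undecoded
            if regex.search(raw):
                subject = email.message_from_bytes(raw)["Subject"]
                yield key, subject
    finally:
        box.close()
```

A call such as list(grep_maildir("/path/to/Maildir", r"reiserfs")) would return the keys and subjects of matching messages, which a wrapper could then hand to a mail reader.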

I suspect I'll settle on maildir for the incoming mail (better locking), and then turn the files into mbox for archiving. It sounds like if I were in your position, I'd stick with mbox too.
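Turning a maildir back into mbox is mechanical. As a hedged sketch (not a script from this thread), Python's standard mailbox module can do the conversion; the function name maildir_to_mbox is hypothetical, and sorting by key only approximates arrival order, on the assumption that maildir file names begin with the delivery time.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Append every message in a maildir to an mbox file; return the count."""
    src = mailbox.Maildir(maildir_path, create=False)
    dst = mailbox.mbox(mbox_path)
    dst.lock()  # guard the mbox while it is being written
    count = 0
    try:
        # Maildir names usually start with the delivery time in seconds,
        # so sorted keys give roughly the order of arrival.
        for key in sorted(src.keys()):
            # mboxMessage() carries over the delivery date and flags
            # from the MaildirMessage where it can.
            dst.add(mailbox.mboxMessage(src[key]))
            count += 1
        dst.flush()
    finally:
        dst.unlock()
        dst.close()
        src.close()
    return count
```

The source maildir is left untouched, so the incoming side can keep its per-message files while the mbox copy goes to the archive.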

And Rohit's right. Tag the fuck out of everything, then dump it before your vampire code starts eating it to survive.

Hope this helps,


This archive was generated by hypermail 2b29 : Fri Apr 27 2001 - 23:18:34 PDT