
While I was in college, my housemates ran a Linux mail server with individualized spam filtering. Every user had an IMAP "spam" folder and a "ham" folder: they could move false negatives from their inbox to the spam folder and false positives from spam to ham, and a nightly job would regenerate custom statistics for each user. It was remarkably slick, and for years I've been trying to figure out what that setup was. Does anybody have any ideas, or links to tutorials for something similar?



The "spam" and "ham" IMAP folders each correspond to a file or folder(1) on the server. Each email user has the learning commands in their crontab. These are as simple as "sa-learn --spam /path/to/spam_folder; sa-learn --ham /path/to/ham_folder".

(1) file if the backend is mbox, folder if the backend is Maildir.
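A rough sketch of what such a nightly job might assemble, in Python. The folder names (Maildir/.spam, mail/spam and so on) and the helper names are made up for illustration; the real paths depend on how the IMAP server lays out mail. It only builds the sa-learn command lines, adding --mbox when the folder is a single mbox file.

```python
import shlex
from pathlib import PurePosixPath


def sa_learn_command(kind, folder, mbox=False):
    """Build the argv for one sa-learn run.

    kind   -- "spam" or "ham"
    folder -- path to the mail folder (hypothetical, server-dependent)
    mbox   -- True if the folder is one mbox file rather than a directory
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    argv = ["sa-learn", "--" + kind]
    if mbox:
        argv.append("--mbox")  # target is an mbox file, not a directory
    argv.append(str(folder))
    return argv


def nightly_commands(home, mbox=False):
    """Return the two learning commands for one user's home directory."""
    home = PurePosixPath(home)
    # Assumed folder names; adjust to the actual server layout.
    if mbox:
        spam, ham = home / "mail" / "spam", home / "mail" / "ham"
    else:
        spam, ham = home / "Maildir" / ".spam", home / "Maildir" / ".ham"
    return [
        sa_learn_command("spam", spam, mbox),
        sa_learn_command("ham", ham, mbox),
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try.
    for argv in nightly_commands("/home/alice"):
        print(shlex.join(argv))
```

Each printed line could go into the user's crontab as-is, or the lists could be handed to subprocess.run from a single script.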


We did this for my old company, with Cyrus IMAPd and spamassassin. The mail directories are regular directories on the server, and then it's just a cronjob. I regret that we didn't think of Google's priority inbox, which would have been trivial to implement!


It was probably using a Bayesian filter of some sort. There are many; I use bogofilter because my email app comes with it, and there is HNer jgc's popfile.
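To give a feel for what "Bayesian filter" means here, this is a toy naive Bayes classifier in Python. It is not how bogofilter, popfile, or SpamAssassin actually compute their scores; the class name, tokenizer, and smoothing are my own simplifications. The point is just that per-user "statistics" are word counts from that user's spam and ham folders.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class ToyBayes:
    """Toy per-user filter: counts how many spam/ham messages contain each word."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record one message as "spam" or "ham"."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Combine per-word evidence as log-odds, with add-one smoothing."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for tok in set(tokenize(text)):
            p_spam = (self.counts["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


if __name__ == "__main__":
    f = ToyBayes()
    f.learn("cheap pills buy now", "spam")
    f.learn("meeting notes attached", "ham")
    print(f.spam_probability("cheap pills"))
```

Moving a message into the spam or ham folder amounts to another learn() call for that user, which is why the filtering ends up individualized.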

Actually setting up a multi-user SMTP and IMAP server on Linux is straightforward if you've done it before, and here-be-dragons stuff if you haven't. (Your distro might configure it all for you prepackaged, or you could be in for a lot of reading.) Integrating the Bayesian filter is probably the easy part.

If it is only for you, it is perhaps easier to integrate it with your email app rather than at the server level. Popfile will do that.


I do something similar for my customers via Roundcube/SpamAssassin. With a little bit of effort, you can get SpamAssassin to store its Bayes statistics in MySQL. Then you can have "mark as not spam" and "mark as spam" functions in Roundcube that tie back to sa-learn, which updates the per-user Bayes data in MySQL.

Getting it to work right was a little fiddly though.
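For the curious, the sa-learn side of a "mark as spam" button can be sketched like this in Python. It assumes SpamAssassin is already set up to keep Bayes data in SQL, and the function names and the way the message reaches the script are hypothetical (Roundcube itself is PHP, so the real hook would look different). The -u option tells sa-learn which user's data to update, and the message is fed on stdin.

```python
import subprocess


def relearn_command(user, is_spam):
    """Build the sa-learn argv for one user's correction."""
    return ["sa-learn", "--spam" if is_spam else "--ham", "-u", user]


def relearn(user, raw_message, is_spam, run=subprocess.run):
    """Feed one raw message (bytes) to sa-learn on stdin.

    `run` is injectable so the function can be exercised without
    SpamAssassin installed; by default it is subprocess.run.
    """
    return run(relearn_command(user, is_spam), input=raw_message, check=True)
```

A webmail plugin would call something like relearn("alice@example.com", message_bytes, True) when the user clicks "mark as spam"; the address is a placeholder.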


I used this setup for years, with dovecot and dspam, before switching to Google Apps.

From a quick search, this seems to be very close and up to date: http://www.owlfish.com/thoughts/dovecot-antispam-2011-03-21....


I'd just e-mail them and ask them. (And hope that your email doesn't fall into the spam box :) )



