DSPAM Notes
I've been using SpamAssassin for the last three years, and about six months
ago I decided to switch to DSPAM. There were several
reasons for the switch. First, SpamAssassin has a huge memory footprint, by
my standards. My email is processed on a Linode host, and the "machine" only
has 128MB to be shared among Apache, OpenSSH, Postfix, and all other services.
SpamAssassin hogged memory and forced the machine into swap thrashing.
Second, SpamAssassin's traditional detection methods are outdated, and it
seems to have incorporated statistical learning methods as an afterthought.
DSPAM focuses on the more successful techniques. Third, SpamAssassin has no
simple retraining mechanism for incorrect classification, while DSPAM has a
web interface and also accepts bounced messages.
Unfortunately, DSPAM is a beast to install. I estimate that it took me
about 10 hours to get things basically working. The problem is that DSPAM's
configurability is far more advanced than its documentation. Potential users
of DSPAM are already using Sendmail, Postfix, Exim, qmail, or some other mail
server. They may wish spam filtering to occur at any of a number of stages in
the mail delivery process. They may or may not wish to use the web management
interface, and they may or may not choose to allow retraining by bouncing or
retraining with command-line tools. Each combination of these configuration
choices produces a complex system with a variety of challenging subtleties,
particularly involving file ownership and permissions.
My Setup
I installed DSPAM on a CentOS box (almost identical to Red Hat Enterprise
Linux 4). The MTA is Postfix. Procmail is used for the last stage of
delivery. DSPAM's web interface is available, and users can email to
"spam@yourdomain.com" to retrain a message as spam (as opposed to DSPAM's
default of spam-username@yourdomain.com, which is impractical).
The following notes describe what I had to do to get this working. Most
of the details I've included are important. Usually it should be clear why (at
least to experienced system administrators). I've tried to be more descriptive
when discussing the more esoteric details (like anything involving suexec).
If the "why" isn't clear, please send me an email, so I can fix it.
I gathered a lot of information from the DSPAM Wiki. Go there for more
information on setting things up if your mail server configuration and
requirements are considerably different from mine.
Installation of DSPAM
- Compile DSPAM
- Download and unpack the DSPAM distribution (I got dspam-3.6.2, for the
record). I couldn't find an RPM, and it was easy to compile anyway. I did
"./configure --sysconfdir=/etc --with-dspam-home=/var/dspam". Some other
potentially useful options include "--enable-large-scale" and
"--enable-clamav". Make and make install work as expected.
- Create a "dspam" user and a "dspam" group.
- If you're going to be setting up the web management system (which I am
pleased with), the dspam user needs a UID of at least 500, and the dspam
group needs a GID of at least 100, because suexec refuses to run as anything
lower. I don't think you want to recompile suexec. If you aren't using RHEL,
and you think your particular suexec might have different requirements, just
run "suexec -V" and look at what AP_UID_MIN and AP_GID_MIN are set to.
- Create /var/dspam.
- The /var/dspam directory holds user quarantines, learning files, log
files, etc. It should be owned by dspam:dspam (user:group) and should be
chmodded to 771, 775, 2771, or similar. You might as well create
/var/dspam/data with the same ownership and permissions.
- Set ownership and permissions for dspam.
- /usr/local/bin/dspam (or wherever you put it) needs to be owned by
root:dspam and permissions should be 2755 (setgid dspam).
- Make /etc/dspam.conf.
- Unless you have a good reason for another storage driver, set
"StorageDriver /usr/local/lib/libhash_drv.so". You'll want to set
"HashAutoExtend on" for sure.
- Make /var/dspam/txt.
- Edit firstrun.txt, firstspam.txt, and quarantinefull.txt to suit your
needs (found in txt/ in the DSPAM distribution), and copy them to
/var/dspam/txt/.
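The ownership and permission steps above can be condensed into a short shell sketch. This is not part of the DSPAM distribution. The ID 600 and the /sbin/nologin shell are assumptions, picked only so the IDs clear the suexec minimums mentioned above, and the run helper is hypothetical. By default the script only prints the commands; run it as root with DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the steps that follow "make install".
# Prints each command by default; set DRY_RUN=0 (as root) to execute them.
DRY_RUN="${DRY_RUN:-1}"
DSPAM_HOME="${DSPAM_HOME:-/var/dspam}"
DSPAM_BIN="${DSPAM_BIN:-/usr/local/bin/dspam}"
DSPAM_ID="${DSPAM_ID:-600}"   # assumed value; must clear AP_UID_MIN and AP_GID_MIN

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_dspam() {
    # Dedicated user and group for DSPAM.
    run groupadd -g "$DSPAM_ID" dspam
    run useradd -u "$DSPAM_ID" -g dspam -d "$DSPAM_HOME" -s /sbin/nologin dspam
    # Home directory for quarantines, learning files, logs, and templates.
    run mkdir -p "$DSPAM_HOME/data" "$DSPAM_HOME/txt"
    run chown -R dspam:dspam "$DSPAM_HOME"
    run chmod 771 "$DSPAM_HOME" "$DSPAM_HOME/data"
    # The dspam binary is owned by root:dspam and setgid dspam.
    run chown root:dspam "$DSPAM_BIN"
    run chmod 2755 "$DSPAM_BIN"
}

setup_dspam
```

Reviewing the printed commands before running them for real is a cheap way to catch a path or ID that differs on your system.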
Configuring Delivery (Postfix and DSPAM)
- Make Procmail setuid root.
- DSPAM needs to call Procmail for delivery. If it's not setuid root, it
can't run as the correct user. The ownership for procmail should be
root:mail. Permissions should be 4755.
- Set Procmail as DSPAM's delivery agent.
- Put 'TrustedDeliveryAgent "/usr/bin/procmail"' and 'UntrustedDeliveryAgent
"/usr/bin/procmail -d %u"' into dspam.conf. While you're there, set
'Preference "spamAction=quarantine"' and any other defaults you want.
- Set DSPAM as Postfix's delivery agent.
- Put "mailbox_command = /usr/local/bin/dspam --deliver=innocent --user $USER -- -d %u" into
Postfix's main.cf.
- Set up the retraining addresses.
- Add 'transport_maps = hash:/etc/postfix/transport' and
'local_recipient_maps = proxy:unix:passwd.byname $alias_maps $transport_maps'
to Postfix's main.cf. Create /etc/postfix/transport (with lines
'spam@yourdomain.com dspam-retrain:spam' and 'ham@yourdomain.com
dspam-retrain:innocent'). Create the dspam-retrain transport method by adding
the following to Postfix's master.cf:
    dspam-retrain unix - n n - 10 pipe
      flags=Ru user=dspam argv=/usr/local/bin/dspam-retrain $nexthop $sender $recipient
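Because the transport table uses the hash: map type, Postfix reads the compiled /etc/postfix/transport.db rather than the text file, so the map has to be rebuilt with postmap after every edit. The following is a minimal sketch of that step. make_transport is a hypothetical helper, not a Postfix or DSPAM tool, and yourdomain.com is the same placeholder used above.

```shell
#!/bin/sh
# Hypothetical helper: print the two retraining entries for a mail domain.
make_transport() {
    domain="$1"
    printf 'spam@%s\tdspam-retrain:spam\n' "$domain"
    printf 'ham@%s\tdspam-retrain:innocent\n' "$domain"
}

# Typical use, as root:
#   make_transport yourdomain.com >> /etc/postfix/transport
#   postmap /etc/postfix/transport    # rebuilds /etc/postfix/transport.db
#   postfix reload
make_transport yourdomain.com
```

If retraining mail bounces as an unknown recipient, a stale or missing transport.db is one of the first things worth checking.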
Setting up the Web Interface
- Put the CGI files into /var/www/dspam.
- The CGI files must go somewhere under /var/www, and /var/www/dspam works
well for most people. If you try to put it somewhere else, suexec will refuse
to run. If you aren't on RHEL, run "suexec -V" and look for AP_DOC_ROOT to
find the directory under which your dspam directory will have to be. Note
that suexec looks for the full path (symlinks can't be used to avoid the
problem).
- Install mod_auth_something_that_works_for_you.
- Install mod_auth_shadow, mod_auth_imap, or some other Apache authentication
module that will allow your users to authenticate to DSPAM's CGI interface.
- Add /var/www/dspam to httpd.conf.
- Make sure to set "SuExecUserGroup dspam dspam" (I think this can only be
done on the VirtualHost level, so be careful about what else is in there).
Under "<Directory /var/www/dspam>", you'll want to have "DirectoryIndex
dspam.cgi", "Options ExecCGI", "AddHandler cgi-script .cgi", and the
particular authentication options that your mod_auth_whatever requires
(including "Require valid-user").
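Put together, the Apache side might look roughly like the output of the sketch below. The function name print_dspam_vhost, the server name dspam.yourdomain.com, the "*:80" address, and the AuthName string are all assumptions for illustration, and the directives that belong to your particular authentication module are left as a comment because they differ from module to module.

```shell
#!/bin/sh
# Hypothetical helper: print a VirtualHost block for the DSPAM CGI interface.
print_dspam_vhost() {
    server_name="$1"
    cat <<EOF
<VirtualHost *:80>
    ServerName $server_name
    DocumentRoot /var/www/dspam
    SuExecUserGroup dspam dspam

    <Directory /var/www/dspam>
        DirectoryIndex dspam.cgi
        Options ExecCGI
        AddHandler cgi-script .cgi

        AuthType Basic
        AuthName "DSPAM"
        # Directives specific to your authentication module go here.
        Require valid-user
    </Directory>
</VirtualHost>
EOF
}

print_dspam_vhost dspam.yourdomain.com
```

The output can be pasted into httpd.conf or redirected into a separate file that httpd.conf includes. Since Basic authentication sends passwords unencrypted, serving this virtual host over HTTPS is worth considering.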
Personal Procmail Rules for DSPAM
While I think the web quarantine is great, I personally prefer to use Mutt
for it. I use Procmail rules to separate the wheat from the tares (the ham
from the spam). For this to work, DSPAM needs to be told to mark spam messages
without quarantining them. Just add "spamAction=deliver" to your personal
dspam prefs file.
The following is a simple but incomplete rule for filtering spam. It will
put all spam into a mail box (maildir format, in this case, because of the
trailing slash) called "spam".
    :0
    * ^X-DSPAM-Result: Spam
    spam/
Unfortunately, not all spam is created equal. I personally prefer to dump
obvious spam to a separate mail box or to /dev/null. The following rule
delivers spam with a DSPAM confidence level of .85 or above to the "superspam"
box and sends all other spam to the "spam" box (both in maildir format).
    :0
    * ^X-DSPAM-Result: Spam
    {
        :0
        * ^X-DSPAM-Confidence: 0\.(9|8[5-9])
        superspam/

        :0
        spam/
    }
If you want the cutoff for superspam to be .8 instead of .85, the end of
the regex would become "0\.[89]". For .7, it would be "0\.[7-9]". To further
customize these rules, I recommend consulting the "procmailrc" and "egrep" man
pages.
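Before changing the cutoff, it can help to try the pattern against a few sample header values from the shell. The sketch below applies the same extended regex with grep -E. The classify function and the sample values are made up for illustration, and it only approximates how Procmail matches headers.

```shell
#!/bin/sh
# Hypothetical helper: apply the Procmail rule's pattern to one header line.
classify() {
    if printf '%s\n' "$1" | grep -Eq '^X-DSPAM-Confidence: 0\.(9|8[5-9])'; then
        echo superspam
    else
        echo spam
    fi
}

# Sample values chosen to sit on either side of the .85 cutoff.
for value in 0.99 0.85 0.84 0.70; do
    printf '%s -> %s\n' "$value" "$(classify "X-DSPAM-Confidence: $value")"
done
```

Note that the pattern requires a leading "0.", so a header value of exactly 1.0 would fall through to the plain "spam" box. If your DSPAM headers ever show such a value, adding an alternative for it to the regex would cover that case.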