Wednesday, September 22, 2004

SpamAssassin 3.0

Last night I upgraded our server from SpamAssassin 2.6x to 3.0. There are some new features that should help reduce the amount of spam that gets through.
  1. Built-in support for URI blacklists like SURBL - This allows SpamAssassin to mark messages as spam if they contain URLs that appear in many reported spam messages. Previously this was a plugin for the older SpamAssassin, which had to be installed separately. Here is my post where I described it.
  2. Built-in support for Sender Policy Framework - This helps SpamAssassin determine when messages may have a forged sender. Here is my post where I talked about SPF.
  3. Support for storing the Bayesian databases in a MySQL database - This one really interests me for two reasons:
    1. It allows backup mail hosts to use the same Bayesian database for spam checking. Most people who have a backup mail server only have the up-to-date Bayesian database on their primary mail server.
    2. It opens up the possibility of a centralized Bayesian database server. Imagine if everyone running SpamAssassin sent their Bayesian tokens to this centralized server. Then everyone could benefit from this large corpus of spam and ham data.

Here are the steps that I used to upgrade:
  1. I first stopped amavisd, so the bayesian database would not be modified
  2. As the user that amavisd runs as: "sa-learn --rebuild"
  3. As root: "perl -MCPAN -e 'shell'"
  4. cpan> install Mail::SpamAssassin
  5. I decided to run the network tests and the sql bayesian tests
    • The network tests failed, since some server was down, so I ran the install again without the network tests
    • I also needed to create the MySQL database first, so I followed the instructions here, but I used this script to initialize the correct tables.
  6. Then I ran "sa-learn --sync" as the user that amavisd runs as, to upgrade the bayesian database
  7. The installation was successful, so I ran spamassassin --lint -D, as the user that amavisd runs as
  8. I had to remove my file from /etc/mail/spamassassin/
  9. I also had to change some of the configuration options in, in the same directory
  10. Then when I ran "spamassassin --lint -D" the tests passed
Then I wanted to convert the Bayesian database to be MySQL-based, so I followed these steps:
  1. "sa-learn --backup > backup.txt" to back up the Bayesian database
  2. Told SpamAssassin to use SQL for the database. The directives required to turn on the SQL-based Bayesian storage are:

    bayes_store_module Mail::SpamAssassin::BayesStore::SQL

    bayes_sql_dsn DBI:driver:database:hostname[:port]
    bayes_sql_username dbusername
    bayes_sql_password dbpassword
  3. "sa-learn --restore backup.txt"
  4. Then when I ran "spamassassin --lint -D" the tests passed
  5. Restarted amavisd
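
For illustration, here is roughly what the directives from step 2 could look like once filled in for a MySQL server running on the same host. This is only a sketch: the database name (sa_bayes), user name (sa_user), password, and port are made-up placeholders rather than the values I actually used, and I am assuming the directives go into a site-wide config file such as local.cf in /etc/mail/spamassassin/.

```
# Hypothetical example values; substitute your own database, user, and password.
bayes_store_module  Mail::SpamAssassin::BayesStore::SQL

# DSN format is DBI:driver:database:hostname[:port]; "mysql" is the DBI driver name.
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306
bayes_sql_username  sa_user
bayes_sql_password  changeme
```

If a backup mail host should share the same tokens, its bayes_sql_dsn would presumably point at the primary's database host instead of localhost.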

Later on I noticed that DCC was not the latest version, so I upgraded to 1.2.54. I also switched to using dccifd, so that dccproc doesn't have to be launched for every message.