HowTos/SpamAssassin
Contents
- 1 Introduction
- 2 Installation
- 3 Scalix Configuration
- 4 Sendmail Configuration
- 5 Configure spamassassin and spamass-milter to start on boot
- 6 Restart sendmail
- 7 Restart the Scalix SMTP Relay
- 8 Confirm mail is being scanned
- 9 Configure spamassassin (header and score)
- 10 The Care and Feeding of your Bayes
Introduction
SpamAssassin is an open source package available for Red Hat and SuSE Linux. "Client" processes communicate with a daemon (spamd) to check a message. In most cases, the client hands the daemon a complete message to check, and the daemon returns the message with a series of header lines which indicate the “spamminess” of the contents. For more information about SpamAssassin, refer to the Apache Foundation website: http://spamassassin.apache.org.
Installation
Download and install sendmail-devel and spamass-milter RPMs. These RPMs are readily available on the Internet and can be easily located using one of the rpmfind web sites.
If you are using Scalix 11, you will already have installed the sendmail-devel package.
If you are using Debian 4.0 Etch, you can install the packages with:
apt-get install spamassassin spamass-milter
Using SLES 10.2 (msaavedra@pixelcom.ch):
[Update 05/20/2009 by tonysu@su-networking.com]
Install the following prerequisite packages from SuSE sources before anything else:
sendmail-devel
C++ GNU compiler
make
Also, the updated URL is
http://mirror.its.uidaho.edu/pub/savannah/spamass-milt/spamass-milter-0.3.1.tar.gz
For SUSE 10.2, download the milter, unpack the tarball and install spamass-milter-x.y.z:
(Download Area at http://download.savannah.nongnu.org/releases/spamass-milt/)
wget http://download.savannah.nongnu.org/releases/spamass-milt/spamass-milter-0.3.1.tar.gz
tar xvf spamass-milter-0.3.1.tar.gz
cd spamass-milter-0.3.1/
./configure
make
make install
cd /usr/local/sbin/
ls -la
spamass-milter -p /var/run/spamass.sock -D localhost -x
This produced the following error:
Milter (spamassassin): error connecting to filter: Connection refused by /var/spamass/spamass.sock
Changing the socket path fixed it:
spamass-milter -p /var/run/spamass-milter.sock -f
see result with:
tail -f /var/log/mail.xyz [whichever log file you want to watch]
Scalix Configuration
Scalix can now be configured to filter mail through Sendmail, and in turn SpamAssassin, by adding one option to the smtpd.cfg configuration file. The Linux commands to do this are as follows:
Make a copy of the current configuration file.
Scalix 10
cp /var/opt/scalix/sys/smtpd.cfg /var/opt/scalix/sys/smtpd.cfg.orig
Scalix 11
cp /var/opt/scalix/NN/s/sys/smtpd.cfg /var/opt/scalix/NN/s/sys/smtpd.cfg.orig
--> NN is the first and last characters of your hostname.
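As a rough illustration of the NN rule, the sketch below derives the two-character directory name from a host name and prints the matching Scalix 11 smtpd.cfg path. The function names scalix_nn and scalix_smtpd_cfg are hypothetical, and the sketch assumes that the short host name (without the domain part) is the one Scalix uses; listing /var/opt/scalix on your server is the reliable way to confirm the directory name.

```shell
#!/bin/sh
# Hypothetical helper: print the first and last characters of a host name.
scalix_nn() {
    host=$1
    first=${host%"${host#?}"}   # remove everything after the first character
    last=${host#"${host%?}"}    # remove everything before the last character
    printf '%s%s\n' "$first" "$last"
}

# Hypothetical helper: print the Scalix 11 smtpd.cfg path for a host name.
scalix_smtpd_cfg() {
    printf '/var/opt/scalix/%s/s/sys/smtpd.cfg\n' "$(scalix_nn "$1")"
}

# Example (assumes the short host name is the relevant one):
#   scalix_smtpd_cfg "$(hostname -s)"
```

For a host named scal4, this would point at /var/opt/scalix/s4/s/sys/smtpd.cfg.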
Open the configuration file with your favorite text editor; this example uses vi.
Scalix 10
vi /var/opt/scalix/sys/smtpd.cfg
Scalix 11
vi /var/opt/scalix/NN/s/sys/smtpd.cfg
Add this line:
SMTPFILTER=TRUE
above the lines beginning with:
RELAY accept 127.0.0.1
Save the file.
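If you prefer to script this edit rather than use vi, the following is a minimal sketch of one way to do it. The function name add_smtpfilter is hypothetical. It only prints the modified configuration, so you can review the result before replacing the real file. It assumes the RELAY lines start in the first column, and it inserts nothing if an SMTPFILTER line already appears above them or if there are no RELAY lines at all.

```shell
#!/bin/sh
# Hypothetical helper: print an smtpd.cfg-style file with SMTPFILTER=TRUE
# inserted above the first line that begins with "RELAY".
# The file itself is not modified.
add_smtpfilter() {
    awk '
        /^SMTPFILTER=/ { seen = 1 }                  # option already present
        /^RELAY/ && !seen && !inserted {
            print "SMTPFILTER=TRUE"
            inserted = 1
        }
        { print }
    ' "$1"
}

# Example (paths as used for Scalix 10 above):
#   add_smtpfilter /var/opt/scalix/sys/smtpd.cfg.orig > /tmp/smtpd.cfg.new
#   diff /var/opt/scalix/sys/smtpd.cfg.orig /tmp/smtpd.cfg.new
```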
Sendmail Configuration
Please note that, depending on the distribution, the socket file used for communication between spamass-milter and sendmail may not match the instructions below. To determine the correct socket file name, look at the file /etc/sysconfig/spamass-milter.
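The lookup can be sketched as a small shell function. This assumes the sysconfig file sets the socket with a shell-style SOCKET= assignment, which is an assumption about your distribution's packaging rather than something guaranteed. The function name milter_socket is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented SOCKET=
# assignment in a sysconfig-style file (assumed form: SOCKET=/path or
# SOCKET="/path"). Prints nothing if no such line exists.
milter_socket() {
    sed -n 's/^[[:space:]]*SOCKET=//p' "$1" | tail -n 1 | tr -d '"'
}

# Example:
#   milter_socket /etc/sysconfig/spamass-milter
```

If the function prints nothing, the file does not set the socket in this form, and the init script or the running process arguments are the next place to look.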
Scalix 10
Back up sendmail.cf:
cp /etc/mail/sendmail.cf /etc/mail/sendmail.cf.orig
If you are using SuSE, the file will be /etc/sendmail.cf
Uncomment the line:
#O InputMailFilters
and change it to:
O InputMailFilters=spamassassin
Immediately below that line, add the following:
# Milter options
#O Milter.LogLevel
O Milter.macros.connect=b, j, _, {daemon_name}, {if_name}, {if_addr}
O Milter.macros.helo={tls_version}, {cipher}, {cipher_bits}, {cert_subject}, {cert_issuer}
O Milter.macros.envfrom=i, {auth_type}, {auth_authen}, {auth_ssf}, {auth_author}, {mail_mailer}, {mail_host}, {mail_addr}
O Milter.macros.envrcpt={rcpt_mailer}, {rcpt_host}, {rcpt_addr}
In the section MAIL FILTER DEFINITIONS, add the following line:
Xspamassassin, S=local:/var/run/spamass.sock, F=, T=C:15m;S:4m;R:4m;E:10m
Scalix 11 or if you are using .m4 macros
First, check whether a sendmail.mc file exists (openSUSE 10.X doesn't have this file, as sendmail is configured by YaST by default). If sendmail.mc exists, continue with Step 1; otherwise jump to Step 2:
1. sendmail.mc exists
You can add the following line to the end of your sendmail.mc file
INPUT_MAIL_FILTER(`spamassassin',`S=local:/var/run/spamass.sock, F=, T=C:15m;S:4m;R:4m;E:10m')dnl
Note that each quoted string begins with a ` and ends in a '.
2. sendmail.mc does NOT exist
Some preparation is needed. First, stop YaST/SuSEconfig from generating the sendmail.cf file.
To do that, in /etc/sysconfig/mail set:
MAIL_CREATE_CONFIG="no"
Next, back up /etc/sendmail.cf and /etc/mail/linux.mc:
# cp /etc/sendmail.cf /etc/sendmail.cf.backup
# cp /etc/mail/linux.mc /etc/mail/linux.mc.backup
Afterwards, edit /etc/mail/linux.mc, adding the following line at the end of the file:
INPUT_MAIL_FILTER(`spamassassin',`S=local:/var/run/spamass.sock, F=, T=C:15m;S:4m;R:4m;E:10m')dnl
[edit by tonysu@su-networking.com] The new line added to linux.mc above should be preceded by a "dnl", like all other entries in the file.
Note that each quoted string begins with a ` and ends in a '.
For all versions:
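Now regenerate the sendmail.cf file. The command below uses the SuSE paths; if you edited sendmail.mc in Step 1, substitute that file and the sendmail.cf location used by your distribution: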
sudo sh -c "m4 /etc/mail/linux.mc > /etc/sendmail.cf"
If this produces the error "m4: INTERNAL ERROR: recursive push_string!", add a new empty line to the end of /etc/mail/linux.mc and regenerate the sendmail.cf file again; it should then work.
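After regenerating (or hand-editing) sendmail.cf, it can be worth confirming that the filter actually ended up in the file. The sketch below is one way to check. The function name cf_has_milter is hypothetical, and it assumes the filter definition appears as an "Xspamassassin," line and the filter list as an "O InputMailFilters=" line, as in the Scalix 10 instructions above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a sendmail.cf-style file both defines
# the named milter (a line starting "X<name>,") and lists it in
# InputMailFilters.
cf_has_milter() {
    cf=$1
    name=$2
    grep -q "^X${name}," "$cf" &&
        grep -q "^O InputMailFilters=.*${name}" "$cf"
}

# Example:
#   cf_has_milter /etc/sendmail.cf spamassassin && echo "milter configured"
```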
Configure spamassassin and spamass-milter to start on boot
To ensure that you have your spamassassin daemon running on reboot, you should use the commands:
chkconfig --add spamassassin
chkconfig --add spamass-milter
chkconfig --level 345 spamassassin on
chkconfig --level 345 spamass-milter on
/etc/init.d/spamassassin start
/etc/init.d/spamass-milter start
Under SuSE, spamd is configured by default not to apply any rules that require Internet access (such as Pyzor or blocklist lookups). To fix this, edit /etc/sysconfig/spamd, look for the following line and remove the "-L" switch:
SPAMD_ARGS="-d -c -L"
For SUSE 10.2, use spamd instead of spamassassin:
# chkconfig --add spamd
insserv: Warning, current runlevel(s) of script `spamd' overwrites defaults.
spamd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
# chkconfig --level 345 spamd on
# /etc/init.d/spamd start
(To start the SpamAssassin daemon in daemon mode, type: spamd -d)
On a Debian 4.0 Etch installation, the links for starting spamassassin and spamass-milter are set by default. However, spamassassin won't start until you enable it. To do so, edit the /etc/default/spamassassin file
vi /etc/default/spamassassin
Change the option ENABLED=0 to ENABLED=1.
Restart sendmail
Use the command
/etc/init.d/sendmail restart
On SUSE 10.2, the following warning appears if spamass-milter isn't installed:
Initializing SMTP port (sendmail)WARNING: Xspamassassin: local socket name /var/run/spamass.sock missing
To check which spam-related processes are running:
# ps aux | grep spam
Restart the Scalix SMTP Relay
Use the commands
omoff -w -d 0 smtpd
omon smtpd
Confirm mail is being scanned
Watch the mail log with the command
tail -f /var/log/mail.log
If SpamAssassin is configured correctly, the log file should show output of this type:
Nov 3 09:39:56 scal4 sendmail[27547]: jA3Hdueo027547: from=<Kent.Brake@scalix.com>, size=2089, class=0, nrcpts=1, msgid=<H00000b60014d0c8.1131039536.hagrid.scalix.local@MHS>, proto=ESMTP, daemon=MTA, relay=localhost [127.0.0.1]
Nov 3 09:39:56 scal4 spamd[24498]: connection from localhost [127.0.0.1] at port 59807
Nov 3 09:39:56 scal4 spamd[24498]: info: setuid to root succeeded
Nov 3 09:39:56 scal4 spamd[24498]: Still running as root: user not specified with -u, not found, or set to root. Fall back to nobody.
Nov 3 09:39:56 scal4 spamd[24498]: processing message <H00000b60014d0c8.1131039536.hagrid.scalix.local@MHS> for root:65534.
Nov 3 09:39:56 scal4 spamd[24498]: clean message (-1.0/5.0) for root:65534 in 0.1 seconds, 2338 bytes.
Nov 3 09:39:56 scal4 spamd[24498]: result: . -1 - ALL_TRUSTED,WEIRD_QUOTING scantime=0.1,size=2338,mid=<H00000b60014d0c8.1131039536.hagrid.scalix.local@MHS>,autolearn=failed
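On a busy server the verdict lines are easy to miss among the sendmail entries. The following sketch filters a log down to just the spamd verdicts and their scores. The function name spamd_verdicts is hypothetical. The "clean message (score/threshold)" form is taken from the sample above, while the "identified spam" wording for spam verdicts is an assumption based on typical spamd logs, since the sample only shows a clean message.

```shell
#!/bin/sh
# Hypothetical helper: read syslog-style mail log lines on stdin and print
# "<verdict> <score> <threshold>" for each spamd verdict line.
spamd_verdicts() {
    sed -nE 's/.*spamd\[[0-9]+\]: (clean message|identified spam) \((-?[0-9.]+)\/(-?[0-9.]+)\).*/\1 \2 \3/p'
}

# Example:
#   spamd_verdicts < /var/log/mail.log | tail
```

For the sample log above, the only line printed would be "clean message -1.0 5.0".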
For SUSE 10.2:
tail -f /var/log/mail
tail -f /var/log/mail.error
Configure spamassassin (header and score)
For SUSE 10.2:
If you need to change the score for system-wide processing, edit:
/etc/mail/spamassassin/local.cf
or
/etc/spamassassin/local.cf
Add your own customisations to this file.
# rewrite_header Subject ****SPAM(_SCORE_)****
rewrite_header Subject **** your company ANTI-SPAM(_SCORE_)****
# Set the score required before a mail is considered spam.
required_score 3.00
=> set it to: required_score 5.00
ALWAYS lint your rules:
spamassassin --lint
If the --lint output doesn't give you enough information, use:
spamassassin --lint -D
The Care and Feeding of your Bayes
This was initially contributed by Leigh. Many thanks
Spamassassin's Bayesian database needs a balanced supply of both Spam and Ham in order to function properly.
By feeding in false positives as Ham, and feeding false negatives as spam, we can keep the bayes database up to date.
SpamAssassin also provides a facility to report spam to various anti-spam sites such as Razor, Pyzor and SpamCop.
Using the mboxadmin facility of Scalix, we can automate this task quite easily.
However, we need to be careful about what we feed into the bayes. We can't always trust our users to put spam into the right folders, and we can't expect them to hand-feed ham into our bayes. Many people use a public folder for their spam. This allows everyone to dump their false negatives into a single folder, which is then fed automatically into the bayes. Unfortunately, this doesn't allow for feeding it ham as well, and bayes needs a balanced diet. The other problem with public folders is that they are just that: public. We can't expect users to place ham into a public folder for all to see.
Here is a method for ensuring your bayes gets fed a proper balanced diet, and only spam gets fed in as spam, and only ham gets fed in as ham.
Requirements
Perl
perl-Mail-IMAPClient (probably available in your package manager)
Before using the tool, a user must exist which has the mboxadmin capability. To add this capability to a user, one can use ommodu. For example:
ommodu -o TestUser -c +mboxadmin
Set up two cron jobs on your server. Run this script every hour:
#!/usr/bin/perl
use strict;
use warnings;
use Mail::IMAPClient;
my $host="your_mail_server_ip";
my $username="mboxadmin_user_name";
my $password="mboxadmin_password";
my @real_users=`/opt/scalix/bin/omshowu -m all -i`; # get all real user names.
foreach my $punter (@real_users) # Loop over them all.
{
    chomp $punter; # Remove trailing carriage return.
    print "$punter\n"; # Some output. Feel free to remove.
    my $user="mboxadmin:$username:$punter"; # Set up superuser login.
    my $imap = new Mail::IMAPClient( 'Server' => $host , 'User' => $user , 'Password' => $password ) or next; # connect to server.
    my @folders=$imap->folders; # list folders.
    foreach my $folder (@folders) # Look through each of them.
    {
        if (lc($folder) eq "junk e-mail") # "junk email" folder.
        {
            print "Found a spam folder: $folder\n";
            $imap->select($folder) or next; # Select the folder.
            print "Folder $folder selected.\n";
            my @list=$imap->messages or next; # List all messages in folder.
            print scalar(@list)." messages in folder.\n";
            foreach my $msg (reverse(@list)) # Loop over them all.
            {
                my @email=$imap->fetch($msg,'RFC822'); # Fetch message.
                open (SALEARN,"|/usr/bin/spamassassin -d | /usr/bin/sa-learn --spam") or print "$!\n"; # Feed to sa-learn.
                print SALEARN "$email[1]";
                close SALEARN;
                open (REPORT,"|/usr/bin/spamassassin -d | /usr/bin/spamassassin -r") or print "$!\n"; # Report it. (SpamCop and Pyzor).
                print REPORT "$email[1]";
                close REPORT;
                $imap->delete_message($msg) or next; # Delete it.
            }
            $imap->expunge($folder) or next; #Expunge folder.
        }
    }
}
And this one every week:
#!/usr/bin/perl
use strict;
use warnings;
use Mail::IMAPClient;
my $host="your_server_ip_address";
my $username="mboxadmin_user_name";
my $password="mboxadmin_password";
my @real_users=`/opt/scalix/bin/omshowu -m all -i`; # get all real user names.
foreach my $punter (@real_users) # Loop over them all.
{
    chomp $punter; # Remove trailing carriage return.
    print "$punter\n"; # Some output. Feel free to remove.
    my $user="mboxadmin:$username:$punter"; # Set up superuser login.
    my $imap = new Mail::IMAPClient( 'Server' => $host , 'User' => $user , 'Password' => $password ) or next; # connect to server.
    my @folders=$imap->folders; # list folders.
    foreach my $folder (@folders) # Look through each of them.
    {
        if (lc($folder) eq "inbox") # "Inbox" is guaranteed to only have ham in it.
        {
            print "Inbox found.\n"; # Some debug output.
            $imap->select($folder) or next; # Select folder.
            print "Folder $folder selected.\n";
            my @list=$imap->seen or next; # Get only messages which have been read. Saves the possibility of reading in false positives. Also stops us interfering with people's mail.
            print scalar(@list)." messages in folder.\n";
            my $counter=0; # Initialise counter. - we don't want the entire inbox.
            foreach my $msg (reverse(@list)) # Loop over each message.
            {
                my @email=$imap->fetch($msg,'RFC822'); # Fetch it.
                open (SALEARN,"|/usr/bin/spamassassin -d | /usr/bin/sa-learn --ham") or next; # Feed it to sa-learn.
                print SALEARN "$email[1]\n";
                close SALEARN;
                $counter +=1; # Increment counter.
                last if ($counter>100); # We only want 100 messages.
            }
        }
        elsif (lc($folder) eq "possible spam") # "Possible Spam" folder.
        {
            print "Found a spam folder: $folder\n";
            $imap->select($folder) or next; # Select the folder.
            print "Folder $folder selected.\n";
            my $lastweek=time()-604800; # Get timestamp for this time last week.
            my @list = $imap->before($lastweek) or next; # List all messages older than that.
            print scalar(@list)." messages in folder.\n";
            foreach my $msg (reverse(@list)) # Loop over them all.
            {
                my @email=$imap->fetch($msg,'RFC822'); # Fetch message.
                open (SALEARN,"|/usr/bin/spamassassin -d | /usr/bin/sa-learn --spam") or print "$!\n"; # Feed to sa-learn.
                print SALEARN "$email[1]";
                close SALEARN;
                open (REPORT,"|/usr/bin/spamassassin -d | /usr/bin/spamassassin -r") or print "$!\n"; # Report it. (SpamCop and Pyzor).
                print REPORT "$email[1]";
                close REPORT;
                $imap->delete_message($msg) or next; # Delete it.
            }
            $imap->expunge($folder) or next; #Expunge folder.
        }
        elsif(lc($folder) eq "non-spam")
        {
            $imap->select($folder) or next; # Select the folder.
            print "Folder $folder selected.\n";
            my @list=$imap->messages or next; # List all messages in folder.
            print scalar(@list)." messages in folder.\n";
            foreach my $msg (reverse(@list)) # Loop over them all.
            {
                my @email=$imap->fetch($msg,'RFC822'); # Fetch message.
                open (SALEARN,"|/usr/bin/spamassassin -d | /usr/bin/sa-learn --forget") or print "$!\n"; # Sa-learn forget this message if already seen.
                print SALEARN "$email[1]";
                close SALEARN or print "$!\n";
                open (SALEARN,"|/usr/bin/spamassassin -d | /usr/bin/sa-learn --ham") or next; # Feed to sa-learn as ham.
                print SALEARN "$email[1]";
                close SALEARN;
            }
        }
        elsif (lc($folder) eq "spam") # "spam" folder.
        {
            print "Found a spam folder: $folder\n";
            $imap->select($folder) or next; # Select the folder.
            print "Folder $folder selected.\n";
            my $lastweek=time()-604800; # Get timestamp for this time last week.
            my @list = $imap->before($lastweek) or next; # List all messages older than that.
            print scalar(@list)." messages in folder.\n";
            foreach my $msg (reverse(@list)) # Loop over them all.
            {
                my $subject=$imap->subject($msg); # Fetch subject for message.
                my @email=$imap->fetch($msg,'RFC822'); # Fetch message.
                unless ($subject=~m/\[SPAM\]/)
                {
                    print "Learning message with subject: $subject\n";
                    open (SALEARN,"|/usr/bin/spamassassin -d | /usr/bin/sa-learn --spam") or print "$!\n"; # Feed to sa-learn.
                    print SALEARN "$email[1]";
                    close SALEARN;
                }
                open (REPORT,"|/usr/bin/spamassassin -d | /usr/bin/spamassassin -r") or print "$!\n"; # Report it. (SpamCop and Pyzor).
                print REPORT "$email[1]";
                close REPORT;
                $imap->delete_message($msg) or next; # Delete it.
            }
            $imap->expunge($folder) or next; #Expunge folder.
        }
    }
}
The first script, run every hour, checks each user's "Junk E-mail" folder. Each message it finds has its spamassassin headers removed and is fed to sa-learn as spam. It is then submitted to spamassassin's reporting facility to be reported to SpamCop, Pyzor, etc.
For the second script to be effective, we need to set up some rules and educate our users a little.
To be as aggressive as possible with spam, but also as safe as possible, set up two folders for each user: "Spam" and "Possible Spam". Each user then needs two server-side rules: All mail marked as spam by spamassassin goes into the "spam" folder, and all mail not marked as spam, but with a score above 3, goes into the "possible spam" folder. Also create a "non-spam" folder for each user. This is where they are to place copies of legitimate email which gets incorrectly tagged as spam.
Our second script then does the following:
Each user's inbox is scanned. The newest 100 messages which have already been read are fed to sa-learn as ham. This assumes that nobody is going to read a piece of spam and then leave it in their inbox. If they do, they deserve to get more spam, quite frankly.
Each user's "possible spam" folder is also read. Messages which are older than a week are fed to sa-learn as spam and then deleted. There is no point reporting these after they are a week old.
If a user does not check their possible spam folder each week, they risk losing mail. This gives them the incentive to keep an eye on it.
The "non-spam" folder is also checked. Anything in here is fed to sa-learn as well. First, sa-learn is told to un-learn this message, in case auto-learn has already classified it as spam, and then it is learnt as ham.
The "spam" folder is then checked. It is handled almost the same way as the "possible spam" folder, except that anything which has already been tagged as spam by spamassassin is not fed to sa-learn again. This gives people the option of placing spam in the "spam" folder, which is a little more intuitive for them. Also, those not running Outlook 2K3 or later may not have a "junk email" folder. To check whether a message has already been tagged as spam, the script looks at the message subject. If it contains "[SPAM]", the message is reported and deleted without being learnt. Feeding lots of messages into your bayes database which have this tag in the header could do more harm than good. Spamassassin may start to think that all spam has the tag "[SPAM]" in its subject, and down-grade any messages which don't.
One thing about mboxadmin is worth noting. As of Scalix V10, an mboxadmin user cannot access the mailbox of another mboxadmin user. This means you must not have any other mboxadmins on your system, or our scripts will not be able to read their mail.
The 100-message limit on the inbox can be changed to suit your site by altering this line:
last if ($counter>100); # We only want 100 messages.
Season to taste.
Please note: If the setup described above makes a mess of your Bayes, I will not be held responsible. Use this method at your own risk. Make sure you understand the requirements and use your own judgement. Back up your Bayes database first.
Below Contributed by Mike Lee on 11-9-2006
- I recommend using a username that does not belong to a real user who will be checking email. If you use a real user's account and grant it mboxadmin permissions via ommodu -o username -c +mboxadmin, then that user's mailbox will not get checked. This has been my experience with this script. Therefore use an account such as admin@yourdomain.com as the mboxadmin, provided you do not receive email at that account or do not mind that it cannot be searched for spam.
Below Contributed by Mark Nikkels on 17-4-2009
If your spamassassin subject headers are NOT being rewritten as per your config file, check the following.
Check that your spamassassin config file has the following:
# Whether to change the subject of suspected spam
rewrite_subject 1
# Text to prepend to subject if rewrite_subject is used
rewrite_header Subject ****SPAM****
If you're using spamass-milter, make sure it's not using the -m flag (by default on Red Hat systems, it does).
Edit your /etc/init.d/spamass-milter file and change
EXTRA_FLAGS="-m -r 15"
to
EXTRA_FLAGS="-r 15"
The -m flag tells spamass-milter NOT to modify any header or body information, so remove it and you should be fine.
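The same change can be previewed from the command line before touching the init script. This is a minimal sketch. The function name strip_m_flag is hypothetical, and it assumes the flags are set on a single EXTRA_FLAGS="..." line with straight double quotes. It only prints the result, leaving the file itself alone.

```shell
#!/bin/sh
# Hypothetical helper: print a file with the -m flag removed from its
# EXTRA_FLAGS="..." line. All other lines are passed through unchanged,
# and the file itself is not modified.
strip_m_flag() {
    sed '/^EXTRA_FLAGS=/ { s/"-m"/""/; s/"-m /"/; s/ -m"/"/; s/ -m / /; }' "$1"
}

# Example: compare the current file with the proposed version.
#   strip_m_flag /etc/init.d/spamass-milter | diff /etc/init.d/spamass-milter -
```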