Internal sending delays and blackholed emails.

Posted: Wed Feb 21, 2007 3:44 pm
by rex007can
I have two sites running Scalix 10.1 on RHEL4: site A with ClamAV and MailWasher (although I'm planning on replacing it), and site B with ClamAV and SpamAssassin.

There are agreements on both sites and no apparent routing issues. The second site has been up since late November.

Lately, I've started to get complaints that users sometimes don't get certain messages. Meeting invitations also often just get lost. One time, an email kept getting blackholed whenever a specific attachment (a small PDF file) was present. It didn't matter who it was to or from, where it was sent from, or even what the file was called. If it was there...poof, gone, no logs.

It seems to have become a little more frequent in the past few weeks. Emails sent from the outside to multiple recipients inside the company don't reach some of them. Emails sent from site A to site B get delayed, sometimes for many hours, while emails sent from site B to site A show up immediately. Looking at the logs, it almost seems that messages from A to B go out in bursts that may or may not coincide with dirsynch. Not sure though. It's just very strange.
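
One rough way to see at which hop the A-to-B delay builds up is to compare the timestamps in a delayed message's Received headers. The following is only a sketch in Python, not anything Scalix provides: `hop_delays` is a made-up helper name, and it assumes the raw message source (headers included) has been saved as text and that each relay stamped a parseable date after the last ';' of its Received header.

```python
from email import message_from_string
from email.utils import mktime_tz, parsedate_tz


def hop_delays(raw_message):
    """Return (seconds_since_previous_hop, header_summary) pairs, oldest hop first."""
    msg = message_from_string(raw_message)
    stamps = []
    # Each relay prepends its Received header, so the last one is the oldest.
    for header in reversed(msg.get_all("Received", [])):
        if ";" not in header:
            continue  # no timestamp section in this header
        date_part = header.rpartition(";")[2].strip()
        parsed = parsedate_tz(date_part)
        if parsed is None:
            continue  # date could not be parsed; skip this hop
        stamps.append((mktime_tz(parsed), header))
    delays = []
    for (prev_time, _), (cur_time, header) in zip(stamps, stamps[1:]):
        summary = " ".join(header.split())[:60]  # collapse folded whitespace
        delays.append((cur_time - prev_time, summary))
    return delays


if __name__ == "__main__":
    # Hypothetical message: three hours pass between site A and site B.
    sample = (
        "Received: from siteA.example by siteB.example; Wed, 21 Feb 2007 18:44:00 -0500\n"
        "Received: from client by siteA.example; Wed, 21 Feb 2007 15:44:00 -0500\n"
        "Subject: test\n\nbody\n"
    )
    for seconds, hop in hop_delays(sample):
        print(f"{seconds:>8} s  {hop}")
```

A large gap on the hop where site B receives from site A would point at the sending queue on A (or whatever filter sits in front of it) rather than at delivery on B. The result is only as good as the clocks on the two servers.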

Any hints as to where I should look?

Posted: Thu Feb 22, 2007 12:39 pm
by rex007can
Replying to myself here.

Is it possible that running spam filters on my two servers causes some problems between them?

Let me explain.

I had one server with SpamAssassin and the other with MailWasher.
MailWasher was a resource hog and I was having the problems described in the above post.

So I removed MailWasher. The problems seem to have gone away immediately. Mail travels instantly between the servers, all delays are gone, and I suspect I won't be losing mail anymore, but it's too soon to tell.

Anyhow. I installed SpamAssassin on the server to replace MailWasher. And the delays started again, especially if an email for a US user comes in through the Canadian server.
I disabled Spamassassin, and problems went away again.
It just seems a bit weird.

Posted: Thu Feb 22, 2007 2:30 pm
by kanderson
If your servers are busy, I'd recommend running the spam filter on a different box. Things like Razor, Pyzor, DCC, etc. can take a very heavy toll on performance. That sounds like what you're seeing.

What is the load average on your box? (type w at a command line).
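
If you want something you can log over time instead of eyeballing `w`, here is a minimal sketch in Python. It assumes a reasonably recent Python 3 is available on the box (an assumption on my part, a stock RHEL4 install won't have one), and `load_per_cpu` is just an illustrative name.

```python
import os


def load_per_cpu():
    """Return the 1-, 5-, and 15-minute load averages divided by the CPU count."""
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    return tuple(avg / cpus for avg in os.getloadavg())


if __name__ == "__main__":
    one, five, fifteen = load_per_cpu()
    print(f"load per CPU: 1m={one:.2f} 5m={five:.2f} 15m={fifteen:.2f}")
```

As a rule of thumb, values that stay above roughly 1.0 per CPU mean work is queuing up, which would fit mail going out in delayed bursts.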

The biggest concern to me is the lost messages. Is that resolved with mailwasher gone?

Kev.

Posted: Thu Feb 22, 2007 2:36 pm
by rex007can
Well, everything seems to have gone back to normal. There were a few errors that I fixed in Spamassassin and that may have done it.
But all is good now.
Load average is 0.03.
MUCH better than it was with mailwasher running.