Hi.
Over the last few days I ran into the following problem, which I think
could affect anyone running Scalix/sendmail on a low-bandwidth internet
connection (let's say less than 1 Mbit/s). I have since run many tests
and read a lot, but have not found a suitable solution. I would be glad
if someone more experienced with sendmail could point me in the right
direction.
One user submitted a single large mail (about 10 MB) for delivery to
30 external recipients (BCC) via Scalix SMTPD. Looking at the sendmail
queue, I saw that this mail had been split into 30 separate mails and
that sendmail had spawned one child process per mail, ending up with
30 processes trying to deliver simultaneously. These delivery processes
share the bandwidth to the internet, so every single delivery takes a
very long time to finish. In testing I found that many receiving MTAs
on the internet use rather short timeouts for the whole mail
transaction to prevent DoS attacks. As a result, the deliveries were
terminated one after another with "dsn=4.0.0, stat=Deferred" and
requeued. The situation got even worse because I had a short queue
runner interval and ended up with multiple queue runners picking up the
queued mails for resending, which recreated the initial problem:
deferring and requeueing until the 5 day limit is reached.
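To put rough numbers on this, here is a back-of-the-envelope sketch in
Python. It is only an estimate under assumptions that are not measured
values: a 1 Mbit/s uplink, the link shared evenly between all parallel
deliveries, 1 MB counted as 8 Mbit, and no SMTP or TCP overhead. The
function name transfer_seconds is made up for this example.

```python
def transfer_seconds(size_mb: float, link_mbit_s: float, concurrent: int = 1) -> float:
    """Seconds one delivery needs when `concurrent` equal-sized
    deliveries share the link evenly (assumption: perfect fair share)."""
    size_mbit = size_mb * 8  # 1 MB taken as 8 Mbit, protocol overhead ignored
    # Each delivery only gets link_mbit_s / concurrent of the bandwidth.
    return size_mbit * concurrent / link_mbit_s


alone = transfer_seconds(10, 1.0)                  # one delivery at a time
shared = transfer_seconds(10, 1.0, concurrent=30)  # 30 parallel deliveries

print(f"one delivery alone:     {alone:.0f} s")
print(f"30 parallel deliveries: {shared:.0f} s each ({shared / 60:.0f} min)")
```

Under these assumptions a single delivery needs about 80 seconds on its
own, but about 40 minutes when 30 of them run in parallel, which would
explain why the receiving side gives up first.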
The easy solution would be to cap the number of child processes doing
delivery, but this does not seem to be configurable
(neither confMAX_DAEMON_CHILDREN nor confMAX_QUEUE_CHILDREN does this).
Another solution would be to set the delivery mode to queue-only,
use a short queue runner interval (say 3 minutes) and limit the number
of queue runners, e.g. to 5, with confMAX_QUEUE_CHILDREN and
confMAX_RUNNERS_PER_QUEUE.
This would have prevented the timeout/requeue problem; however,
until 25 of the 30 mails were delivered, not even very small mails
would have gone out. You should also make sure that
confMAX_DAEMON_CHILDREN is larger than confMAX_RUNNERS_PER_QUEUE if
sendmail also acts as the receiving MTA; otherwise inbound mails are
rejected until the number of queue runners drops below
confMAX_DAEMON_CHILDREN again (in this case, once 25 of the 30 mails
have been sent).
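How long small mails would be stuck can be estimated the same way. This
is a rough Python sketch, again assuming a saturated 1 Mbit/s uplink,
1 MB counted as 8 Mbit and no protocol overhead; the function name
seconds_until_delivered is made up. As long as the link is the
bottleneck, the time to push out a given number of large mails depends
on the total amount of data, not on how many runners share the link.

```python
def seconds_until_delivered(n_mails: int, size_mb: float, link_mbit_s: float) -> float:
    """Seconds until n_mails equal-sized mails have left over a saturated
    link (assumption: the link is always fully used)."""
    total_mbit = n_mails * size_mb * 8  # 1 MB taken as 8 Mbit
    return total_mbit / link_mbit_s


# 25 of the 30 large mails have to go out before a runner becomes free.
wait = seconds_until_delivered(25, 10, 1.0)
print(f"small mails wait about {wait:.0f} s ({wait / 60:.0f} min)")
```

With these numbers a small mail would sit in the queue for roughly half
an hour behind the large ones.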
The only really good solution I can see at the moment would be to work
in queue-only mode as described above and have two queue groups, one
for small messages and one for large messages, each with a different
number of queue runners assigned. The problem is how to write a rule
that assigns a message to a queue group according to its size.
As I'm not familiar with the whole sendmail ruleset thing, can anybody
show me how I could achieve this?
If all this works out, it might be good material for the wiki.
Regards,
grubi.