Nagios ERROR queue problem.
Posted: Tue Aug 09, 2005 11:31 am
by BigBirdy
I have set up most of the Nagios scripts and commands for checking the various Scalix services and queues. Things "appeared" to be working with regard to the queues: I knew I had one corrupted message in the ERROR queue, and the Nagios command returned "Scalix message queue ERROR has 1 message(s)". I then went into the SAC and deleted that corrupted message, so the SAC now correctly shows "0" messages in all the queues, but the Nagios check command below still reports a message in the ERROR queue. Where is this info coming from if the SAC shows all 0's for messages in the queues?
# /usr/local/nagios/libexec/check_queues.py -c 5 -w 1 -q ERROR
WARNING - Scalix message queue ERROR has 1 message(s)
Re: Nagios ERROR queue problem.
Posted: Wed Aug 10, 2005 1:00 am
by julio
The number of messages is coming from the output of omstat -q ERROR.
Do you observe this for the ERROR queue only, or for all the queues?
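For reference, here is a minimal sketch of how a message count could be turned into the status line shown above, given the -w and -c thresholds. The function names are hypothetical and this is not the actual check_queues.py source; the only fixed parts are the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and the "at or above the threshold" comparison is an assumption that happens to match the output with -w 1 and one message.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(count, warn, crit):
    """Map a queue depth to a (exit_code, label) pair.

    Assumption: a count at or above a threshold triggers that state.
    """
    if count >= crit:
        return CRITICAL, "CRITICAL"
    if count >= warn:
        return WARNING, "WARNING"
    return OK, "OK"


def status_line(queue, count, warn, crit):
    """Build the one-line plugin output for a queue (hypothetical helper)."""
    code, label = classify(count, warn, crit)
    text = "%s - Scalix message queue %s has %d message(s)" % (label, queue, count)
    return code, text


if __name__ == "__main__":
    # Same thresholds as the command in the first post: -c 5 -w 1
    code, text = status_line("ERROR", 1, warn=1, crit=5)
    print(text)
```

With these thresholds, a count of 0 would come back as OK, so the WARNING means the script itself believes the count is 1.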
Re: Nagios ERROR queue problem.
Posted: Wed Aug 10, 2005 6:25 pm
by julio
Also, note that SAC does the check on demand, whereas Nagios does the check on a scheduled polling interval. It is very likely that the next check from Nagios has not yet run, or that the HTML page is not getting updated.
Still a Problem
Posted: Fri Aug 12, 2005 11:35 am
by BigBirdy
There is still an inconsistency between what omstat -q ERROR reports and what the check_queues.py Python script reports, as seen below. Possibly the author of these Python scripts could assist?
[root@pbco-server2 ~]# omstat -q ERROR
omstat : There are no messages on the queue
[root@pbco-server2 ~]# cd /usr/local/nagios/libexec/
[root@pbco-server2 libexec]# ./check_queues.py -c 5 -w 1 -q ERROR
WARNING - Scalix message queue ERROR has 1 message(s)
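One possible explanation, offered only as a guess since the check_queues.py source is not shown in this thread: if the script counts lines of omstat output, the single "There are no messages on the queue" line would itself be counted as one message. The sketch below illustrates that hypothesis with hypothetical function names; it does not call omstat, and the format of omstat output for a non-empty queue (headers, one line per message, stdout versus stderr) is not assumed to be known.

```python
# Text taken from the omstat output shown in the post above.
NO_MESSAGES_MARKER = "There are no messages on the queue"


def naive_count(omstat_output):
    """Hypothetical buggy approach: every non-empty line is one message."""
    return len([line for line in omstat_output.splitlines() if line.strip()])


def safer_count(omstat_output):
    """Treat the 'no messages' notice as zero before counting lines.

    Assumption: for a non-empty queue, omstat prints one line per message.
    """
    if NO_MESSAGES_MARKER in omstat_output:
        return 0
    return naive_count(omstat_output)


if __name__ == "__main__":
    empty_queue = "omstat : There are no messages on the queue\n"
    print(naive_count(empty_queue))  # 1, the notice line is miscounted
    print(safer_count(empty_queue))  # 0
```

If the script does work this way, it would report 1 for every empty queue, not just ERROR, which is easy to confirm by running it against another queue that omstat shows as empty.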