Queue Manager aborting every 12 hours or more

Posted: Fri Jul 23, 2010 3:02 am
by deyjvu
I've never had many issues with the Queue Manager, and the few I have had were caused by a corrupt QP file in the ~/msgpool directory, until now. For the last week the Queue Manager has been aborting twice a day, and overnight it aborted about 5 times. The only queue with a lot of messages on it is the SMERR queue, where I moved a whole bunch of looping messages in order to stop the looping. There were about 13,000 messages on that queue. I've gradually been whittling that number down, but the situation hasn't improved as the number of messages on the queue has decreased.

Here are the types of errors I am getting from the QM:

SERIOUS ERROR Queue Manager (Queue Manager ) Fri Jul 16 16:02:02 2010
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 5854
Procedure trace follows:
<- ql_GetNextMsgDue
-> ql_GetNextMsgDue
<- ql_GetNextMsgDue
-> ql_GetNextMsgDue
<- ql_GetNextMsgDue
-> ql_GetNextMsgDue
<- ql_GetNextMsgDue
-> ql_GetNextMsgDue
<- ql_GetNextMsgDue
-> ql_GetNextMsgDue
<- ql_GetNextMsgDue
-> ql_GetNextMsgDue
<- ql_GetNextMsgDue
<- qm_RespondToExpectantReaders
-> qm_ProcessReleaseMsg
-> ql_AddMsgToMemList
Pid of logging process: 5854


SERIOUS ERROR Queue Manager (Queue Manager ) Fri Jul 16 16:02:02 2010
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xb5)[0xf7f98b25]
/opt/scalix/lib/libom_er.so[0xf7f98e13]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0xf7f98f9f]
[0xffffe500]
/opt/scalix/lib/libom_ql.so[0xf7f40a9f]
/opt/scalix/lib/libom_ql.so(ql_AddMsgToMemList+0x87)[0xf7f40baa]
queue.manager[0x804a567]
queue.manager[0x805171c]
queue.manager[0x804bc95]
queue.manager[0x804d601]
queue.manager[0x804de84]
/lib/libc.so.6(__libc_start_main+0xdc)[0x27cdec]
queue.manager[0x8049c71]
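
For anyone wanting to tally how often aborts like the one above occur, here is a rough Python sketch. It assumes the log text uses exactly the header and "Signal (...) trapped by process" lines shown in the entries above; the function names `find_aborts` and `aborts_per_day` are made up for illustration, and where the log file lives on a given install is not stated in this thread, so the text is passed in as a string.

```python
import re
from collections import Counter
from datetime import datetime

# Header format copied from the entries above, e.g.
# "SERIOUS ERROR Queue Manager (Queue Manager ) Fri Jul 16 16:02:02 2010"
HEADER = re.compile(
    r"^SERIOUS ERROR Queue Manager \(Queue Manager\s*\) "
    r"(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})$"
)
# Signal line format copied from the [OM 10270] entry above.
SIGNAL = re.compile(r"^Signal \((.+)\) trapped by process (\d+)$")


def find_aborts(log_text):
    """Return a list of (timestamp, signal name, pid), one per trapped signal."""
    aborts = []
    current_time = None
    for line in log_text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            # %a and %b assume English day and month names in the log.
            current_time = datetime.strptime(
                header.group(1), "%a %b %d %H:%M:%S %Y"
            )
            continue
        signal = SIGNAL.match(line)
        if signal and current_time is not None:
            aborts.append(
                (current_time, signal.group(1), int(signal.group(2)))
            )
    return aborts


def aborts_per_day(aborts):
    """Count trapped signals per calendar date."""
    return Counter(timestamp.date() for timestamp, _, _ in aborts)
```

Only entries that contain a trapped-signal line are counted, so the accompanying [OM 10272] BACKTRACE entry for the same crash is not counted twice. Comparing the per-day counts against the SMERR queue size over the same days would show whether whittling the queue down makes any difference.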

Anybody else seen anything like this?

We did have an issue with swap, but that has been resolved and the problem still continues.

Scalix is 11.4.5, running on Red Hat 5.

Re: Queue Manager aborting every 12 hours or more

Posted: Mon Aug 09, 2010 8:37 am
by BaldBoy
I'd suggest upgrading to 11.4.6.

Re: Queue Manager aborting every 12 hours or more

Posted: Tue Aug 10, 2010 7:40 pm
by deyjvu
We ended up building a whole new server and installing 11.4.6 on it, and the problem went away!!

We are pretty sure the old server was basically just running out of resources.

We won't mention the fact that we lost a disk on the new system on the first day due to hardware issues and did not have a backup of the new system... the backup was due to start an hour after the disk failed :-(.