URGENT - Scalix server keeps crashing

gregwatson
Posts: 40
Joined: Tue Aug 22, 2006 11:00 am

Postby gregwatson » Fri Dec 14, 2007 2:37 pm

Hi Guys.

Our Scalix 11 server has recently started crashing. The initial symptom is that SWA users get a message saying "The evaluation period for this time-limited product has expired" even though it is a permanent license.

The first time it happened I tried restarting the scalix services by running /etc/init.d/scalix-tomcat restart and (when that didn't work) /etc/init.d/scalix restart which also didn't help. I noticed there were a lot of index.browse and mime.browse processes running.

I tried to kill off all the scalix-related processes but just couldn't get them back up, so I rebooted the server, and it came up fine.

However this morning it happened again, and then again this evening.

I'm posting in a bit of a panic because it's down again now, but in essence there are a lot of OM10270 and OM10272 errors in the fatal log file, for all sorts of different modules (Browser, IMAP Server, Administration, Internet Mail etc).

When I run omstat -a or omstat -s I get nothing back at all, just a carriage return.
I have tried omshut but it doesn't seem to succeed in shutting scalix down; when I run ps -eaf | grep scalix I still see lots of processes running.
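
For anyone wanting to automate the ps -eaf | grep scalix check above, here is a minimal Python sketch. The function names (leftover_processes, run_ps) are made up for illustration and are not part of Scalix; the filter simply looks for the string "scalix" in each ps line, so it is only as precise as the grep it replaces.

```python
import subprocess


def leftover_processes(ps_output, needle="scalix"):
    """Return the ps lines that mention `needle`.

    The first line is assumed to be the ps column header and is skipped.
    Any line containing "grep" is skipped too, so a grep for the needle
    does not count itself.
    """
    matches = []
    for line in ps_output.splitlines()[1:]:
        if needle in line and "grep" not in line:
            matches.append(line)
    return matches


def run_ps():
    """Capture the output of `ps -eaf` as text (hypothetical helper)."""
    result = subprocess.run(
        ["ps", "-eaf"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling leftover_processes(run_ps()) after a shutdown attempt returns the lines for whatever is still running; an empty list would mean nothing matching "scalix" is left.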

Any ideas? To make things even worse when we rebooted the server this morning (and it's at a remote office) it hung on restart, and there's nobody there to push the power button so if there's any more troubleshooting I can do now, please let me know...

Prior to this the server has been running fine for almost a year.
I don't actually know which version of Scalix is on there because I don't know if my colleague managed to upgrade it, and because it's down now I can't check, but I think it's 11.0.2. If it would be a good idea to upgrade, will I need a new license file? (It's outside UK working hours now so I probably won't be able to get a license...)

(As an example of what is in the fatal log file right now:

SERIOUS ERROR IMAP Server Da(IMAP Server Pr) Fri Dec 14 18:33:17 20
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0x40117366]
/opt/scalix/lib/libom_er.so[0x40117665]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0x401177ef]
[0xffffe420]
/opt/scalix/lib/libom_da.so(da_AttachSMem+0x122)[0x403efa81]
/opt/scalix/lib/libom_da.so[0x403ef357]
/opt/scalix/lib/libom_da.so(da_GetMemAttribDefn+0x4c)[0x403efe60]
/opt/scalix/lib/libom_da.so(da_GetAttribDefn+0x4b)[0x403f000f]
/opt/scalix/lib/libom_da.so[0x40400b60]
/opt/scalix/lib/libom_da.so[0x404007c2]
/opt/scalix/lib/libom_da.so(da_PrepareFilter+0x28c)[0x403ffe00]
/opt/scalix/lib/libom_vi.so(vi_Get+0x131)[0x405c0949]
/opt/scalix/lib/libom_dr.so(dr_IL_Get+0x1be)[0x400f2f46]
/opt/scalix/lib/libom_dr.so(dr_Search+0x363)[0x40106b1d]
/opt/scalix/lib/libom_ul.so(ul_FindAuthId+0xc1)[0x4021cbb1]
/opt/scalix/lib/libom_uald.so(uald_FindPUorAlias+0x1b4)[0x4020fc5a]
in.imap41d[0x807009b]
in.imap41d[0x807169f]
/usr/lib/libsasl2.so.2(_sasl_canon_user+0x114)[0x402bac44]
/opt/scalix/lib/security/libplain.so[0x40d20686]
/usr/lib/libsasl2.so.2(sasl_server_step+0xca)[0x402c303a]
in.imap41d[0x8070aad]
in.imap41d[0x806945a]
in.imap41d[0x8062f7c]
Pid of logging process: 21555)
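
To get a feel for how often each error turns up, a rough Python sketch like the one below can tally the bracketed codes in a copy of the fatal log. It assumes every entry carries a tag shaped like the "[OM 10272]" line in the excerpt above; the function name count_om_codes is made up, and the log path is deliberately left as a command-line argument because the thread does not say where the file lives.

```python
import re
import sys
from collections import Counter

# Matches tags such as "[OM 10272]"; the shape is assumed from the excerpt.
OM_CODE_RE = re.compile(r"\[OM\s*(\d+)\]")


def count_om_codes(log_text):
    """Count how many times each OM error code appears in the log text."""
    return Counter(OM_CODE_RE.findall(log_text))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_om.py <path-to-fatal-log>
    with open(sys.argv[1], errors="replace") as handle:
        counts = count_om_codes(handle.read())
    for code, total in counts.most_common():
        print(f"OM {code}: {total}")
```

This only counts codes; it does not try to work out which module logged each entry.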

TIA

gregwatson
Posts: 40
Joined: Tue Aug 22, 2006 11:00 am

Postby gregwatson » Fri Dec 14, 2007 3:47 pm

OK I had to restart the server and fortunately it has come back up.

What can I run to check / repair integrity of the mailstore etc etc to try and stop this from happening again?

omcheck seems happy
Have run omscan -Aavfx to see if that helps
What else should I run to try and fix this recurring problem? How about rebuilding indexes etc?

mikethebike
Posts: 566
Joined: Mon Nov 28, 2005 4:16 pm
Location: England

Postby mikethebike » Fri Dec 14, 2007 3:59 pm

Greg,
there is another post on here somewhere about that 10272 error, have a quick search of the forum....actually I just did :wink:

viewtopic.php?p=6436

I would absolutely avoid killing scalix processes, use the "omoff" command to stop processes, and if all else fails use the "omreset" command (see the man pages).

Mick

gregwatson
Posts: 40
Joined: Tue Aug 22, 2006 11:00 am

Postby gregwatson » Fri Dec 14, 2007 4:05 pm

Hi Mick.

Thanks for the reply.

Problem is - once it gets in that state, none of the om... commands seem to do a thing. omoff, omon, omstat - none of them do anything, and omcheck fails with an error saying the message catalog could not be found and suggesting I try setting the LANG variable.

The only thing that works is a server reboot. Possibly the server is running out of some sort of resource or something... who knows. :-(
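
One resource that might be worth a look on a Linux box is System V shared memory, since the backtrace in the first post passes through da_AttachSMem. That link is only a guess based on the function name. The sketch below summarises the kernel's /proc/sysvipc/shm listing; the function names are made up, and the "size" and "nattch" columns are read by header name on the assumption that the kernel provides them.

```python
def parse_sysvipc_shm(text):
    """Turn the /proc/sysvipc/shm table into a list of dicts keyed by column name."""
    lines = text.splitlines()
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) == len(header):  # ignore malformed or partial lines
            rows.append(dict(zip(header, fields)))
    return rows


def summarize_shm(rows):
    """Return (segment count, total bytes, segments with no attached process)."""
    total_bytes = sum(int(row["size"]) for row in rows)
    unattached = sum(1 for row in rows if int(row["nattch"]) == 0)
    return len(rows), total_bytes, unattached


def read_shm_table(path="/proc/sysvipc/shm"):
    """Read the kernel's shared memory listing (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Running summarize_shm(parse_sysvipc_shm(read_shm_table())) while the server is healthy and again once it starts failing would show whether segments are piling up. It says nothing about what Scalix itself expects to find.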

mikethebike
Posts: 566
Joined: Mon Nov 28, 2005 4:16 pm
Location: England

Postby mikethebike » Fri Dec 14, 2007 5:04 pm

Greg,

do any commands work? Is the server being hammered? Are all the filesystems still available?
I wonder if it is related to the problem in the other thread.

Mick

gregwatson
Posts: 40
Joined: Tue Aug 22, 2006 11:00 am

Postby gregwatson » Fri Dec 14, 2007 5:33 pm

Well, by a fairly circuitous route, I have found this on bugzilla: http://bugzilla.scalix.com/show_bug.cgi?id=14709

Which I'm pretty sure is the problem. I am on 11.0.2 on SLES9 and I'm getting a lot of the errors shown in the bug...

gregwatson
Posts: 40
Joined: Tue Aug 22, 2006 11:00 am

Postby gregwatson » Fri Dec 14, 2007 5:39 pm

Can someone help me with a hotfix for this?
Looks like upgrading to 11.0.3 might sort it out but that is not an option for me right now, and we need to get this working over the weekend.

I am supposed to be taking a family holiday on Monday, which I will have to cancel if I can't get this sorted beforehand :roll:

gregwatson
Posts: 40
Joined: Tue Aug 22, 2006 11:00 am

Postby gregwatson » Sun Dec 16, 2007 5:13 pm

Thanks to some more excellent support from the Scalix team we are now upgraded to 11.2 which has eliminated the problem.

