SERIOUS ERROR: Local Delivery

kjakkanen
Posts: 125
Joined: Thu Dec 21, 2006 10:09 am
Location: Espoo - Finland

Postby kjakkanen » Wed May 16, 2007 5:37 am

Hello,

Can anyone interpret these messages we get in /var/opt/scalix/nn/s/logs/fatal?
How serious are they, and what actually happened to trigger this log entry?

They appear irregularly, but they look worrying.

Thanks,
Kimmo

---------------------------------------------------------------------------
SERIOUS ERROR Local Delivery(Local Delivery) Wed May 16 12:12:16 2007
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0xf7f2eee6]
/opt/scalix/lib/libom_cvc.so(cvc_enhCnvString+0x107)[0xf7c64230]
/opt/scalix/lib/libom_cvc.so(cvc_ConvertString+0x3d)[0xf7c64cc5]
/opt/scalix/lib/libom_rtfl.so(rtfl_BuildLine+0x3e9)[0xf7c2b897]
/opt/scalix/lib/libom_rtfl.so[0xf7c2da50]
/opt/scalix/lib/libom_rtfl.so(rtfl_Parse+0x175)[0xf7c2f31e]
/opt/scalix/lib/libom_rtfl.so(rtfl_search+0x109)[0xf7c2cd83]
/opt/scalix/lib/libom_flt.so[0xf7f1eef9]
/opt/scalix/lib/libom_flt.so(flt_ApplyTextMatch+0xe1)[0xf7f1f036]
/opt/scalix/lib/libom_flt.so(Test_TextBody_Att+0x1eb)[0xf7f1cafe]
/opt/scalix/lib/libom_flt.so(flt_ApplySingle+0x6c8)[0xf7f1b6c0]
/opt/scalix/lib/libom_flt.so(flt_ApplyNextFilter+0x228)[0xf7f1a9da]
/opt/scalix/lib/libom_flt.so(flt_ApplyOuterGroup+0xcb)[0xf7f1a663]
/opt/scalix/lib/libom_flt.so(flt_ApplyFC+0x140)[0xf7f1a184]
local.delivery[0x8057cd9]
local.delivery[0x80530a4]
local.delivery[0x805c2ab]
local.delivery[0x805dfa3]
local.delivery[0x805ec4b]
/lib/tls/libc.so.6(__libc_start_main+0xd3)[0xc0dde3]
local.delivery[0x804d925]
Pid of logging process: 7630
Last Msg Id: H000031d0054bad0.1179306725.server.name.domain
---------------------------------------------------------------------------
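
One way to make a backtrace like the one above easier to read is to pull out just the module and symbol of each frame. The following is a minimal Python sketch, not a Scalix tool; the names `FRAME_RE` and `parse_frames` are made up for illustration, and the frame format is assumed to be exactly what the log entry shows (`module(symbol+0xoffset)[0xaddress]` or `module[0xaddress]`).

```python
import re

# One frame looks like "/path/lib.so(symbol+0xoff)[0xaddr]" or "binary[0xaddr]".
# Format assumed from the log entry above, not from Scalix documentation.
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)"
    r"(?:\((?P<symbol>\w+)\+0x[0-9a-fA-F]+\))?"
    r"\[0x[0-9a-fA-F]+\]$"
)


def parse_frames(text):
    """Return a list of (module basename, symbol or None), one per frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            module = match.group("module").rsplit("/", 1)[-1]
            frames.append((module, match.group("symbol")))
    return frames


sample = """\
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0xf7f2eee6]
/opt/scalix/lib/libom_cvc.so(cvc_enhCnvString+0x107)[0xf7c64230]
/opt/scalix/lib/libom_rtfl.so[0xf7c2da50]
/opt/scalix/lib/libom_flt.so(flt_ApplyTextMatch+0xe1)[0xf7f1f036]
local.delivery[0x8057cd9]
Pid of logging process: 7630
"""

frames = parse_frames(sample)
for module, symbol in frames:
    print(module, symbol or "(no symbol)")
```

Going only by the symbol names, the named frames suggest the error was logged while a filter text match (`flt_*`) was searching a message body (`rtfl_*`) and converting a string (`cvc_*`). That reading is a guess from the names, not something confirmed in this thread.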

ScalixSupport
Scalix
Posts: 5503
Joined: Thu Mar 25, 2004 8:15 pm

Postby ScalixSupport » Wed May 16, 2007 8:33 am

Hi!

Have you restarted (omshut/omrc) your server? If so, does the error re-appear after the restart?

Which Scalix version and which OS are you using?

Thanks,
Subir

Postby kjakkanen » Wed May 16, 2007 8:48 am

Hello and thanks for your reply!

Yes, the server has been restarted, most recently last Wednesday when 11.0.4 was installed (a complete restart of the whole server). Come to think of it, that restart was actually done BEFORE starting the update to 11.0.4. Do you think it might help to run omshut and omrc now that the new version is up and running?

The OS is Red Hat Enterprise Linux 4. Not all of the latest Red Hat updates have been applied, but it's not too old either: it was updated less than a month ago when we went from Scalix 10 to 11.

There are also occasional messages like the ones below in /var/log/messages:
---
May 14 08:01:38 servername kernel: wvWare[18705]: segfault at 000000005a5a3c3e rip 00000000f7fb9023 rsp 00000000ffff97e0 error 4

May 16 08:03:13 servername kernel: xlhtml[11135]: segfault at 000000000000002c rip 000000000804b8d1 rsp 00000000ffffd520 error 6
---

Scalix internals or RH "externals"? :-)

KR;
Kimmo

Postby ScalixSupport » Wed May 16, 2007 9:26 am

Hi Kimmo!

Please try restarting (omshut/omrc) your server and see whether the error is gone.

Thanks,
Subir

Postby kjakkanen » Wed May 16, 2007 9:29 am

Hi!

OK - this will be done anyway, as we're installing extra RAM on Sunday morning. I will report by Wednesday at the latest whether the error has re-appeared or not. Thanks for your help!

KR;
Kimmo

Postby kjakkanen » Thu May 24, 2007 9:25 am

Hello,

The error has re-appeared about 10 times, even after a complete server reboot last Sunday. I've also noticed errors such as the one below in the fatal log file:
---
SERIOUS ERROR Administration(omstorepm ) Tue May 22 01:26:36 2007
[OM 28880] Bad magic number in an item header record.
Pid of logging process: 26598
---
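
As a starting point for debugging, it can help to count which error codes occur how often in the fatal log. Here is a minimal Python sketch under the assumption that every entry has the shape of the two quoted in this thread: a `SERIOUS ERROR Subsystem(process) date` header followed by a line starting with `[OM nnnnn]`. The function name `tally_errors` is hypothetical, and this is not part of Scalix.

```python
import re
from collections import Counter

# Header and code-line shapes are assumed from the entries quoted in this thread.
HEADER_RE = re.compile(r"^SERIOUS ERROR\s+(?P<subsystem>.+?)\((?P<process>[^)]*)\)")
CODE_RE = re.compile(r"^\[OM (?P<code>\d+)\]")


def tally_errors(text):
    """Count (subsystem, OM error number) pairs in fatal-log text."""
    counts = Counter()
    subsystem = None
    for line in text.splitlines():
        line = line.strip()
        header = HEADER_RE.match(line)
        if header:
            subsystem = header.group("subsystem").strip()
            continue
        code = CODE_RE.match(line)
        if code and subsystem is not None:
            counts[(subsystem, int(code.group("code")))] += 1
            subsystem = None  # count only the first code line per entry
    return counts


sample = """\
SERIOUS ERROR Local Delivery(Local Delivery) Wed May 16 12:12:16 2007
[OM 10272] BACKTRACE:
Pid of logging process: 7630
SERIOUS ERROR Administration(omstorepm ) Tue May 22 01:26:36 2007
[OM 28880] Bad magic number in an item header record.
Pid of logging process: 26598
"""

counts = tally_errors(sample)
for (subsystem, code), n in counts.most_common():
    print(f"{subsystem}: OM {code} x {n}")
```

To run it on a real log, read the fatal file into a string and pass it to `tally_errors`. Each resulting OM number can then be looked up individually.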

I have no clue where to start debugging this.

Thanks,
Kimmo

Postby ScalixSupport » Thu May 24, 2007 10:25 am

Hi Kimmo!

The above error indicates that there is possibly a corrupt message in the message store.
See for yourself:
[root@subir-rhel4 ~]# omsolve -n OM 28880
-------------------------------------------------------------------------------
Error Group: OM Error Number: 28880

Bad magic number in an item header record.

This message indicates that there is possibly a corrupt
file present in the message store.
Try running omscan.

I would recommend running omscan for all users with the command below. Note
that it is resource-intensive, so run it at a low-load time:

omscan -Aavfx

Hope this helps.

Thanks,
Subir

Postby kjakkanen » Fri May 25, 2007 3:52 am

I actually have that omscan run scheduled as a weekly cron task, but running it manually produced 9 errors while checking for "Missing Children". They were of two different types:
---
Parent Container : ~/data/00000l0/008glr1:1, RecNum : 0
Missing Child : ~/data/00000bj/008g803:4
Child Type : Data Item.
---
Parent Container : ~/data/00000p4/006pca1:1, RecNum : 1
Missing Child : ~/data/00000p8/006pbeo:3
Child Type : Transaction file.
---

I assume that omscan somehow fixed those, though it didn't say anything about any corrections.

Still, the xlhtml segfaults continue in /var/log/messages:
---
May 25 08:46:37 server kernel: xlhtml[2460]: segfault at 0000000000000015 rip 000000000804bb18 rsp 00000000ffffd8c0 error 6
May 25 09:00:13 server kernel: xlhtml[9133]: segfault at 0000000000000015 rip 000000000804bb18 rsp 00000000ffffd8c0 error 6
---
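
To see whether the segfaults keep hitting the same spot, the kernel lines can be grouped by program and instruction pointer. This is a minimal Python sketch that assumes the exact line format shown in this thread (`kernel: prog[pid]: segfault at ... rip ... rsp ... error N`); `SEGFAULT_RE` and `group_segfaults` are illustrative names only.

```python
import re
from collections import Counter

# Line format assumed from the /var/log/messages excerpts in this thread.
SEGFAULT_RE = re.compile(
    r"kernel: (?P<prog>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"segfault at (?P<addr>[0-9a-f]+) rip (?P<rip>[0-9a-f]+) "
    r"rsp (?P<rsp>[0-9a-f]+) error (?P<err>\d+)"
)


def group_segfaults(text):
    """Count segfaults per (program, instruction pointer)."""
    counts = Counter()
    for line in text.splitlines():
        match = SEGFAULT_RE.search(line)
        if match:
            counts[(match.group("prog"), int(match.group("rip"), 16))] += 1
    return counts


sample = """\
May 14 08:01:38 servername kernel: wvWare[18705]: segfault at 000000005a5a3c3e rip 00000000f7fb9023 rsp 00000000ffff97e0 error 4
May 16 08:03:13 servername kernel: xlhtml[11135]: segfault at 000000000000002c rip 000000000804b8d1 rsp 00000000ffffd520 error 6
May 25 08:46:37 server kernel: xlhtml[2460]: segfault at 0000000000000015 rip 000000000804bb18 rsp 00000000ffffd8c0 error 6
May 25 09:00:13 server kernel: xlhtml[9133]: segfault at 0000000000000015 rip 000000000804bb18 rsp 00000000ffffd8c0 error 6
"""

groups = group_segfaults(sample)
for (prog, rip), n in groups.most_common():
    print(f"{prog} at rip {rip:#x}: {n}")
```

On the lines quoted so far, the two May 25 xlhtml crashes share the same fault address and instruction pointer. That would be consistent with the same input being processed twice, though nothing in the thread confirms it.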

Probably not worth filing a bug in Bugzilla; it would just be closed as "WORKSFORME"?

Thanks,
Kimmo

gren
Scalix
Scalix
Posts: 264
Joined: Thu Mar 25, 2004 10:27 am

Postby gren » Wed May 30, 2007 7:39 am

Hi Kimmo,

If you do "omstat -q ERROR" on your Scalix server, are there any messages reported?
Also, for the POISON queue, does "omstat -q POISON" report anything?

If so, these may be messages we are encountering problems processing.

Which version of Scalix are you using? I know some fixes have been made in a similar area in recent 11.X fix releases.

Things you can try:
omresub -q ERROR
omresub -q POISON

Note that if the messages are still causing problems, then local.delivery will die again and need restarting.

Any messages that end up back on the ERROR queue or the POISON queue would be interesting.
It would be useful if you could send me examples of these messages dumped from the queues. The "omqdump" command can do this. The password is "A##E", where ## is today's day of the month + 10, so for 30th May the password is "A40E".
The "o" command will output a message to a set of files. If you could tar these up and send the result to gren dot elliot at scalix dot com (with the dots and ats replaced), that would be great. With these, we may be able to pinpoint the exact error and fix it :)
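
The omqdump password rule described above is simple arithmetic on the date. A minimal Python sketch, assuming the rule is exactly "A", then day of the month plus 10, then "E" as stated in this post; the function name `omqdump_password` is made up for illustration.

```python
from datetime import date


def omqdump_password(today: date) -> str:
    """Build the password as described in this post: 'A' + (day + 10) + 'E'."""
    return f"A{today.day + 10}E"


# The example from the post: 30th May gives "A40E".
print(omqdump_password(date(2007, 5, 30)))
# For the current day, pass date.today() instead.
```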

Regards,
Gren.

Postby kjakkanen » Wed May 30, 2007 8:26 am

Hi Gren,

Thanks for your reply.

It doesn't seem to be the queues though:
---
[root ~]# omstat -q ERROR
omstat : There are no messages on the queue
[root ~]# omstat -q POISON
omstat : There are no messages on the queue
---

Scalix is version 11.0.4, so all fixes should be there. This one seems like a tough one to tackle...

KR;
Kimmo

