I've just upgraded a Scalix 10.0.5 server to 11.0.1, and am experiencing some serious flakiness.
I've searched the forums for the issues and errors I'm seeing and have come up empty, so now I'll post my query here. If I've missed relevant forum entries, feel free to point them out to me and I'll carry on.
Anyway... After the upgrade, I'm seeing several issues:
1. Tomcat keeps randomly crapping out and leaving me with no webmail or SAC.
2. ldapmapper has hung twice in 36 hours; each time I have to kill -9 the process and restart it.
3. A bunch of Scalix services keep stopping; the daemons are running (omstat -a shows all happiness) but omstat -s shows:
Service Router Started 04:51:22 0
Local Delivery Partially Abor 04:50:49 1268
Internet Mail Gateway Started 04:51:27 0
Sendmail Interface Started 04:51:21 0
Local Client Interface Enabled 05:50:19 0
Remote Client Interface Enabled 04:51:05 3
Test Server Stopped 02.22.07 0
Request Server Stopped 02.22.07 0
Print Server Stopped 02.22.07 0
Directory Synchronization Stopped 02.22.07 0
Bulletin Board Server Stopped 02.22.07 0
Background Search Service Stopped 02.22.07 0
Dump Server Stopped 02.22.07 0
CDA Server Stopped 02.22.07 0
POP3 interface Started 04:50:57 0
Omscan Server Stopped 02.22.07 0
Archiver Stopped 02.22.07 0
I keep trying to restart the individual services and they run for a moment and die.
When this happens, all I can do is stop Scalix altogether and restart it. Then everything's happy for anywhere from an hour to a day.
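For anyone wanting to catch this state before users notice, here is a minimal Python sketch that picks the non-running services out of omstat -s text like the listing above. It assumes the single-space column layout shown in this post, treats the last column as an opaque number, and only knows the four states that appear here ("Partially Abor" is matched exactly as the tool truncates it). The function name unhealthy_services is made up for illustration, and which services are actually supposed to be running is site-specific, so the result still needs filtering.

```python
import re

# States taken from the omstat -s listing in this post; other states may exist.
HEALTHY_STATES = ("Started", "Enabled")
KNOWN_STATES = HEALTHY_STATES + ("Stopped", "Partially Abor")

# name, state, time-or-date, trailing number (meaning not assumed here)
LINE_RE = re.compile(
    r"^(?P<name>.+?)\s+(?P<state>"
    + "|".join(re.escape(s) for s in KNOWN_STATES)
    + r")\s+(?P<since>\S+)\s+(?P<count>\d+)\s*$"
)


def unhealthy_services(omstat_text):
    """Return (name, state, count) for every service not Started/Enabled."""
    result = []
    for line in omstat_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("state") not in HEALTHY_STATES:
            result.append(
                (match.group("name"), match.group("state"), int(match.group("count")))
            )
    return result


if __name__ == "__main__":
    sample = "\n".join([
        "Service Router Started 04:51:22 0",
        "Local Delivery Partially Abor 04:50:49 1268",
        "Test Server Stopped 02.22.07 0",
    ])
    for name, state, count in unhealthy_services(sample):
        print(name, state, count)
```

The text could come from running omstat -s under cron and piping the output into the script; lines that do not match the assumed layout are skipped rather than guessed at.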
Background: RHEL4, fully up2date (man do I miss urpmi)
Big ol' beefy Dell server, loads of RAM, RAIDed drives etc.
As a side note, the indexer is still running at 100% CPU even after nearly 3 days. This server has a 60 GB (!) message store, so I'm thinking that it's just ploughing through a crapload of mail.
Finally, omshowlog gives me this:
ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03
[OM 24070] Debug message for Lab use :
ct_convFigaroCRec: Failed to convert CreatorORN to UTF8
Current errno value: 2
Last Msg Id: 20070223131520.2605D137939(a)SMTPRelay11.na.blackberry.net
Last Msg DirectRef: 000fd113845139f7
ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03
[OM 28875] Attempt to read a block which does not exist from a blocked item.
Current errno value: 2
Last Msg Id: 20070223131520.2605D137939(a)SMTPRelay11.na.blackberry.net
Last Msg DirectRef: 000fd113845139f7
-> sfl_OpenItem
-> im_ItemRef2FName
<- im_ItemRef2FName
-> sfl_OpenSfl
-> im_OpenItem
-> im_ItemRef2FName
<- im_ItemRef2FName
<- im_OpenItem
<- sfl_OpenSfl
<- sfl_OpenItem
<- im_OpenItem
-> im_ItemRef2FName
<- im_ItemRef2FName
<- /build/11.0.1/src/lib/ombase/sfl/sfl_Blcked.c:1394[100,28875]
<- /build/11.0.1/src/lib/ombase/sfl/sfl_Blcked.c:1697[100,28875]
<- /build/11.0.1/src/lib/ct/ct_rdext.c:153[100,28875]
ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03
[OM 3539] Content Record 0 in container ~/data/00000h9/003v393:1 could not be upgraded.
Current errno value: 2
Last Msg Id: 20070223131520.2605D137939(a)SMTPRelay11.na.blackberry.net
Last Msg DirectRef: 000fd113845139f7
WARNING Local Delivery(Local Delivery) 02.23.07 04:52:03
[OM 3543] Failed to upgrade a Content Record to current container format.
Current errno value: 2
Last Msg Id: 20070223131520.2605D137939(a)SMTPRelay11.na.blackberry.net
Last Msg DirectRef: 000fd113845139f7
-> im_ItemRef2FName
<- im_ItemRef2FName
-> sfl_OpenSfl
-> im_OpenItem
-> im_ItemRef2FName
<- im_ItemRef2FName
<- im_OpenItem
<- sfl_OpenSfl
<- sfl_OpenItem
<- im_OpenItem
-> im_ItemRef2FName
<- im_ItemRef2FName
<- /build/11.0.1/src/lib/ombase/sfl/sfl_Blcked.c:1394[100,28875]
<- /build/11.0.1/src/lib/ombase/sfl/sfl_Blcked.c:1697[100,28875]
<- /build/11.0.1/src/lib/ct/ct_rdext.c:153[100,28875]
<- /build/11.0.1/src/lib/ct/ct_upgrade.c:1136[3,3543]
SERIOUS ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 29044
Procedure trace follows:
-> sfl_OpenSfl
-> im_OpenItem
-> im_ItemRef2FName
<- im_ItemRef2FName
<- im_OpenItem
<- sfl_OpenSfl
<- sfl_OpenItem
<- im_OpenItem
-> im_ItemRef2FName
<- im_ItemRef2FName
<- /build/11.0.1/src/lib/ombase/sfl/sfl_Blcked.c
<- /build/11.0.1/src/lib/ombase/sfl/sfl_Blcked.c
<- /build/11.0.1/src/lib/ct/ct_rdext.c
<- /build/11.0.1/src/lib/ct/ct_upgrade.c
<- /build/11.0.1/src/lib/ct/ct_pend.c
Current errno value: 2
Last Msg Id: 20070223131520.2605D137939(a)SMTPRelay11.na.blackberry.net
Last Msg DirectRef: 000fd113845139f7
SERIOUS ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0xbafee6]
/opt/scalix/lib/libom_er.so[0xbb01e6]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0xbb038f]
/lib/tls/libpthread.so.0[0x3a5898]
/opt/scalix/lib/libom_ct.so(PendOpenCtner+0x63c)[0x2fe4fc]
/opt/scalix/lib/libom_ct.so(PendDelete+0x1d1)[0x2fdad9]
/opt/scalix/lib/libom_ct.so(CloseCtner+0x3c2)[0x2e6c49]
/opt/scalix/lib/libom_ct.so(ct_CloseCtner+0x5d)[0x2d55b5]
local.delivery[0x804f01c]
local.delivery[0x8059ac7]
local.delivery[0x80534a6]
local.delivery[0x805cfa9]
local.delivery[0x805eca1]
local.delivery[0x805f949]
/lib/tls/libc.so.6(__libc_start_main+0xd3)[0x1acde3]
local.delivery[0x804dce1]
Current errno value: 2
Last Msg Id: 20070223131520.2605D137939(a)SMTPRelay11.na.blackberry.net
Last Msg DirectRef: 000fd113845139f7
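Since a log like this gets long quickly, a rough Python sketch for boiling omshowlog output down to one line per distinct error may help when comparing runs. It assumes the entry layout seen above: a header line with severity, subsystem, date and time, followed by a line starting with [OM nnnn]. The function name summarize is hypothetical, and nothing here is part of Scalix itself.

```python
import re
from collections import Counter

# Header as seen above, e.g. "ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03"
HEADER_RE = re.compile(
    r"^(SERIOUS ERROR|ERROR|WARNING)\s+(.+?)\s+(\d\d\.\d\d\.\d\d)\s+(\d\d:\d\d:\d\d)\s*$"
)
# Message line as seen above, e.g. "[OM 28875] Attempt to read a block ..."
CODE_RE = re.compile(r"^\[OM (\d+)\]\s*(.*)$")


def summarize(log_text):
    """Count log entries by (severity, OM code, first line of the message)."""
    counts = Counter()
    severity = None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            severity = header.group(1)
            continue
        code = CODE_RE.match(line)
        if code and severity is not None:
            counts[(severity, int(code.group(1)), code.group(2))] += 1
            severity = None  # ignore any further [OM ...] text until the next header
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03",
        "[OM 24070] Debug message for Lab use :",
        "SERIOUS ERROR Local Delivery(Local Delivery) 02.23.07 04:52:03",
        "[OM 10270] Process about to terminate due to error.",
    ])
    for (sev, om_code, text), n in summarize(sample).most_common():
        print(n, sev, om_code, text)
```

Run against the full log, this would show whether every Local Delivery crash follows the same OM 28875 / OM 3539 sequence or whether different messages trigger different failures.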
Ok... sorry for the long post, but I'm somewhat stumped at this point. Any ideas out there as to WTF is going on?
Thanks in advance!
Rubin
