IMAP server dying [solved]

Posted: Fri Sep 28, 2007 12:13 pm
by stephan.klein
Hi,

Running omshowlog, I see hundreds of these errors, affecting all users on the machine (on 11.1 and now on 11.2).

The IMAP server still works, but it has to spawn new processes for the users (Thunderbird has to open a new connection).

Is there a way to fix this?

Regards
Stephan


Code:

SERIOUS ERROR                  IMAP Server Da(IMAP Server Pr) 28.09.07 18:00:53
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 8558
Procedure trace follows:
  -> aud_StartTransaction
  <- aud_StartTransaction
  -> aud_LogCurrentTime
  -> aud_LogTime
  <- aud_LogTime
  <- aud_LogCurrentTime
  -> nm_ParseORN
  <- nm_ParseORN
  -> aud_LogStr
  <- aud_LogStr
  -> aud_LogInt
  <- aud_LogInt
  -> aud_LogInt
  <- aud_LogInt
  -> aud_EndTransaction
  <- aud_EndTransaction
User Name: Florian / GIGAtec/CN=Florian


SERIOUS ERROR                  IMAP Server Da(IMAP Server Pr) 28.09.07 18:00:53
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0xb7e2e1a6]
/opt/scalix/lib/libom_er.so[0xb7e2e4ba]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x27)[0xb7e2e647]
[0xffffe420]
/usr/lib/libldap_r.so.2(ldap_pvt_thread_mutex_destroy+0x1d)[0xb6e35afd]
/usr/lib/libldap_r.so.2(ldap_pvt_sasl_mutex_dispose+0x26)[0xb6e3bd36]
/usr/lib/libsasl2.so.2[0xb7ca9fd7]
/usr/lib/libsasl2.so.2(sasl_done+0x28)[0xb7ca4838]
in.imap41d[0x8064475]
in.imap41d[0x80627f6]
in.imap41d[0x8062292]
/lib/libc.so.6(__libc_start_main+0xe0)[0xb7b69030]
in.imap41d[0x804e031]
User Name: Florian / GIGAtec/CN=Florian


SERIOUS ERROR                  IMAP Server Da(IMAP Server Pr) 28.09.07 18:00:53
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 6186
Procedure trace follows:
  -> aud_StartTransaction
  <- aud_StartTransaction
  -> aud_LogCurrentTime
  -> aud_LogTime
  <- aud_LogTime
  <- aud_LogCurrentTime
  -> nm_ParseORN
  <- nm_ParseORN
  -> aud_LogStr
  <- aud_LogStr
  -> aud_LogInt
  <- aud_LogInt
  -> aud_LogInt
  <- aud_LogInt
  -> aud_EndTransaction
  <- aud_EndTransaction
User Name: Florian / GIGAtec/CN=Florian


SERIOUS ERROR                  IMAP Server Da(IMAP Server Pr) 28.09.07 18:00:53
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0xb7e2e1a6]
/opt/scalix/lib/libom_er.so[0xb7e2e4ba]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x27)[0xb7e2e647]
[0xffffe420]
/usr/lib/libldap_r.so.2(ldap_pvt_thread_mutex_destroy+0x1d)[0xb6e35afd]
/usr/lib/libldap_r.so.2(ldap_pvt_sasl_mutex_dispose+0x26)[0xb6e3bd36]
/usr/lib/libsasl2.so.2[0xb7ca9fd7]
/usr/lib/libsasl2.so.2(sasl_done+0x28)[0xb7ca4838]
in.imap41d[0x8064475]
in.imap41d[0x80627f6]
in.imap41d[0x8062292]
/lib/libc.so.6(__libc_start_main+0xe0)[0xb7b69030]
in.imap41d[0x804e031]
User Name: Florian / GIGAtec/CN=Florian
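
With hundreds of near-identical entries, it can help to tally them before reading them one by one. The following is a minimal Python sketch, assuming the log has been saved as plain text in the format pasted above. The function name `summarise` and the regular expressions are illustrative and not part of Scalix.

```python
import re
from collections import Counter

# Matches the "Signal (...) trapped by process N" line of an [OM 10270] entry.
SIGNAL_RE = re.compile(
    r"Signal \((?P<signal>[^)]+)\) trapped by process (?P<pid>\d+)"
)
# Matches symbolised backtrace frames such as
# /usr/lib/libsasl2.so.2(sasl_done+0x28)[0xb7ca4838]
FRAME_RE = re.compile(
    r"^(?P<lib>\S+?)\((?P<symbol>\w+)\+0x[0-9a-fA-F]+\)\[0x[0-9a-fA-F]+\]$"
)


def summarise(log_text):
    """Return (crash count per PID, count per (library, symbol) frame)."""
    crashes = Counter()
    symbols = Counter()
    for raw in log_text.splitlines():
        line = raw.strip()
        match = SIGNAL_RE.search(line)
        if match:
            crashes[match.group("pid")] += 1
            continue
        match = FRAME_RE.match(line)
        if match:
            symbols[(match.group("lib"), match.group("symbol"))] += 1
    return crashes, symbols


# A shortened excerpt of the log above, used only to exercise the parser.
SAMPLE = """\
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 8558
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0xb7e2e1a6]
/opt/scalix/lib/libom_er.so[0xb7e2e4ba]
/usr/lib/libldap_r.so.2(ldap_pvt_thread_mutex_destroy+0x1d)[0xb6e35afd]
/usr/lib/libsasl2.so.2(sasl_done+0x28)[0xb7ca4838]
Signal (Segmentation Violation) trapped by process 6186
"""

if __name__ == "__main__":
    crashes, symbols = summarise(SAMPLE)
    print(dict(crashes))
    for (lib, symbol), count in symbols.most_common():
        print(count, lib, symbol)
```

In the trace above, the symbolised frames nearest the crash are sasl_done, ldap_pvt_sasl_mutex_dispose and ldap_pvt_thread_mutex_destroy. A tally like this would show whether every crash goes through the same SASL/LDAP shutdown path.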

Posted: Sun Sep 30, 2007 1:49 pm
by chris
Stephan,

I don't see anything too obvious. If it's causing issues for your users, you should probably open a support case to take care of it.

Chris

Posted: Mon Oct 01, 2007 4:28 am
by stephan.klein
Chris,

thank you for your suggestion!

I am using the community edition, so I can't open a support case (can I?).

What I tried to fix this issue:

I updated the libc6 library as well as the LDAP and SASL2 libraries on the system (Debian).

Unfortunately there are still lots of dying IMAP processes in the log - if it continues like that, I will have to open a cemetery :-)

The problem with this issue is that every time the IMAP process dies, the client has to log on again, reopen the folder, etc. This takes a lot of time, especially on slow connections.

Thank you for any help!

Regards
Stephan

Posted: Mon Oct 01, 2007 12:24 pm
by chris
Have you guys both checked everything with omscan?

Read the manpage, but you'll want to make sure the server process has run correctly, then do an omscan -Aafxv over the whole mailstore.

If you search the forums you'll find lots of wisdom about using omscan effectively.

Chris

Posted: Mon Oct 01, 2007 1:49 pm
by stephan.klein
Hi Chris,

I did the suggested omscan and all the usual suspects ,-)

No errors were displayed.

I'm still getting these IMAP crashes. It goes well for a number of operations (for example, deleting messages via Thunderbird), and then I get the error in the log and Thunderbird has to reconnect.

Is there any information I can provide you to help me solve the problem?

Regards
Stephan

Posted: Mon Oct 01, 2007 1:53 pm
by chris
1) Have you tried deleting the imap-caches? There are a number of threads about that as well.

2) Do you see any relevant log message explaining why the indexer is partially aborted?

3) I'm starting to wonder whether the file system has issues when multiple services start to choke. Any chance you could do an fsck run?

Chris

Posted: Mon Oct 01, 2007 2:20 pm
by stephan.klein
I tried deleting the imap-cache and ran omtidyallu -M, but the error remains, and not just for one account but for all of them...

Any further suggestions?

Posted: Mon Oct 01, 2007 2:26 pm
by chris
neoroot wrote:centos 4.5 full upgrade

from /var/log/messages [restart service ]
Oct 1 14:17:38 mail scalix: Stopping Scalix services (mail): succeeded
Oct 1 14:17:42 mail omrc: omshowmn : [OM 6623]
Oct 1 14:17:42 mail omrc: No mailnodes currently configured
Oct 1 14:17:42 mail scalix: Starting Scalix services (mail): succeeded


First off, CentOS is not a supported operating system. Were any errors logged during the upgrade? The installer log is in /var/log/scalix-installer*

stephan.klein wrote:I tried deleting the imap-cache and ran omtidyallu -M, but the error remains, and not just for one account but for all of them...

Any further suggestions?


chris wrote:2) Do you see any relevant log message explaining why the indexer is partially aborted?

3) I'm starting to wonder whether the file system has issues when multiple services start to choke. Any chance you could do an fsck run?


I'd be curious about 2 and 3. I'd suggest looking into the indexer's partially aborted state first; maybe that will provide a clue to the general issues.

And I would definitely like to see a clean fsck run over the file system.

Chris

Posted: Mon Oct 01, 2007 2:35 pm
by stephan.klein
We are mixing topics, aren't we ,-)

Mine is about IMAP process crashes (only the spawned user processes, not the main IMAP process); everything else is working as expected on my box. I am running Debian, not CentOS.

neoroot seems to have different problems on his system, affecting more components of Scalix, one of which is the IMAP process.

Maybe we should split the thread?

Posted: Mon Oct 01, 2007 11:16 pm
by chris
stephan.klein wrote:maybe we should split the thread?


Agreed. Thread split to viewtopic.php?t=8929

Getting back to the IMAP-Satellites. Any update on the previous questions?

Posted: Wed Oct 03, 2007 6:05 am
by stephan.klein
Sorry for the delay. In a few hours I am going to visit the data center where the Scalix box is housed and run an fsck in single-user mode to check for issues.

Regards
Stephan

Posted: Thu Oct 04, 2007 10:43 am
by stephan.klein
No news after the fsck, sorry.

What was new to me: running omscan -a produces the following in the log:

Code:

WARNING                        Omscan Server (Omscan Tool   ) 04.10.07 16:40:32
[OM 2205] Invalid attempt to unlink directory
File Name: ~/temp/imap-admin.24462
        <- cvc_enhCnvString
        -> scn_ScanNameSubDir
        <- scn_ScanNameSubDir
        <- scn_ScanNameDirs
        -> scn_ScanTempDirs
        -> cvc_enhCnvString
        -> cvc_CnvStringTryIconv
        <- cvc_CnvStringTryIconv
        <- cvc_enhCnvString
        -> scn_DeleteOldFiles
        <- scn_DeleteOldFiles
        -> scn_DeleteOldFiles
        -> scn_DeleteOldFiles
        <- scn_DeleteOldFiles
        <- /build/11.2.0/src/lib/ombase/os/os_unlink.c:91[3,2205]
        <- /build/11.2.0/src/lib/ombase/os/os_unlink.c:91[3,2205]


Regards
Stephan

Posted: Fri Oct 05, 2007 2:22 am
by chris
Sounds like you might want to have a look at your directories:

Check the file /var/opt/scalix/*/s/sys/dir.index to find where your directories are located.

Then run /opt/scalix/diag/dbcheck -s vi_dir within each reported database directory to validate database consistency.

Also, have you already checked permissions with omcheck?

Chris
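
The dbcheck step above can be scripted once the directory paths are known. This is a minimal Python sketch, assuming the paths have already been read out of dir.index by hand (its format is not shown in this thread, so the sketch does not parse it). The function names and the example path are hypothetical; the command line is the one quoted above. By default it only prints what it would run.

```python
import subprocess

DBCHECK = "/opt/scalix/diag/dbcheck"  # path as quoted in the post above


def plan_dbcheck(directories):
    """Return (working directory, argv) pairs, one per database directory."""
    return [(directory, [DBCHECK, "-s", "vi_dir"]) for directory in directories]


def run_dbcheck(directories, dry_run=True):
    """Print each planned command; execute it only when dry_run is False."""
    for cwd, argv in plan_dbcheck(directories):
        print(f"(cd {cwd} && {' '.join(argv)})")
        if not dry_run:
            # Run inside the database directory, as the post describes.
            subprocess.run(argv, cwd=cwd, check=False)


if __name__ == "__main__":
    # Hypothetical placeholder; replace with the paths found via dir.index.
    run_dbcheck(["/path/from/dir.index"])
```

Keeping the dry run as the default makes it easy to review the list of directories before anything is executed.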

Posted: Fri Oct 05, 2007 5:28 am
by stephan.klein
Hi Chris,

I tried both, no errors were reported.

Strange :-(

Stephan

Posted: Fri Oct 05, 2007 5:46 am
by chris
stephan.klein wrote:no news after fsck. sorry.

Code:

WARNING                        Omscan Server (Omscan Tool   ) 04.10.07 16:40:32
[OM 2205] Invalid attempt to unlink directory
File Name: ~/temp/imap-admin.24462

Regards
Stephan


Why don't you nuke the caches? Empty out temp, and remove all the users' imap-cache folders. I should have thought of that earlier.

Chris
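
Before deleting anything, it may be worth listing what would be removed. This is a minimal Python sketch, assuming the per-user cache directories are literally named imap-cache (taken from the wording above) and live somewhere below a root directory you supply. The function name and the root are hypothetical. The sketch only lists directories; it deletes nothing.

```python
import os
import tempfile


def find_cache_dirs(root, name="imap-cache"):
    """Return sorted paths of directories called `name` below `root`."""
    matches = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if name in dirnames:
            matches.append(os.path.join(dirpath, name))
            # Pruning in place stops os.walk from descending into the cache.
            dirnames.remove(name)
    return sorted(matches)


if __name__ == "__main__":
    # Build a throwaway tree so the example runs without a Scalix install.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "user1", "imap-cache", "sub"))
        os.makedirs(os.path.join(root, "user2", "imap-cache"))
        os.makedirs(os.path.join(root, "user3", "other"))
        for path in find_cache_dirs(root):
            print(os.path.relpath(path, root))
```

Reviewing the printed list first keeps the actual removal a separate, deliberate step.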