
Continued corruption errors.. 11-GA

Posted: Mon Jan 15, 2007 3:52 am
by hindog
Scalix 11 release

Outlook clients complain about "invalid UID"; Thunderbird complains that "some of the requested messages no longer exist".

The log is filled with these:

WARNING Internet Mail (Incoming ) 01.15.07 00:30:29
[OM.UX 1401] Cannot read message data.
-> ux_InReadMailText
<- ux_InReadMailText
-> ux_InParseCommandLine
<- ux_InParseCommandLine
-> ux_InReadMailText
<- ux_InReadMailText
-> ux_InParseCommandLine
<- ux_InParseCommandLine
-> ux_InReadMailText
<- ux_InReadMailText
-> ux_InParseCommandLine
<- ux_InParseCommandLine
-> ux_InReadMailText
<- /build/scalix-MAIN/src/bin/ux/ux_in.c:1118[102,1401]
<- /build/scalix-MAIN/src/bin/ux/ux_in.c:4608[102,1401]
<- /build/scalix-MAIN/src/bin/ux/ux_in.c:2009[102,1401]

omscan -Aafx doesn't report any errors or problems.

I've had this problem with Scalix 11 Beta as well. The only fix I've found is to recreate the affected accounts and imapsync the old mailbox to the new one. But with some users holding 1-3 GB of mail in their inboxes, this takes a very long time, and I can't keep working around the problem this way.

Has anyone else had this problem? What causes it, and how do you fix it? I want to use Scalix because its open API will integrate nicely with our ERP, but if we grow to 50+ employees (currently 23) and this problem persists, we won't be able to sustain Scalix as a solution. Any help here would be greatly appreciated!

Aaron

Posted: Mon Jan 15, 2007 8:37 am
by ScalixSupport
Hi Aaron,

An efficient way to run omscan against the mailbox is as follows:

Code: Select all

omoff -d0 omscan
omscan -Z
omon omscan

You need to watch the status of the omscan process using the command:

Code: Select all

omscan -t

Wait until you get a message that says "Current server cycle not started; service reset or delayed".

Now try to fix the mailstore-related issues in active mode using the following command:

Code: Select all

omscan -Avfx -U <username>
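For anyone who wants to script the steps above, here is a rough sketch that runs them in order and polls the status until the idle message appears. The commands, flags and message text are the ones quoted in this post; the function name, POLL_SECONDS and MAX_POLLS are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch of the omscan sequence from this post.
# Assumptions: repair_mailbox, POLL_SECONDS and MAX_POLLS are made-up names;
# the commands and flags are taken verbatim from the steps above.

POLL_SECONDS=${POLL_SECONDS:-30}
MAX_POLLS=${MAX_POLLS:-120}
IDLE_TEXT="Current server cycle not started"

repair_mailbox() {
    user=$1
    if [ -z "$user" ]; then
        echo "usage: repair_mailbox <username>" >&2
        return 2
    fi

    # Steps 1-3, exactly as given in the post.
    omoff -d0 omscan || return 1
    omscan -Z        || return 1
    omon omscan      || return 1

    # Poll "omscan -t" until it reports the idle message, or give up.
    polls=0
    until omscan -t 2>&1 | grep -q "$IDLE_TEXT"; do
        polls=$((polls + 1))
        if [ "$polls" -ge "$MAX_POLLS" ]; then
            echo "omscan did not report the idle message in time" >&2
            return 1
        fi
        sleep "$POLL_SECONDS"
    done

    # Final step: scan the one mailbox.
    omscan -Avfx -U "$user"
}
```

After sourcing the file you would call it as `repair_mailbox <username>`. The poll cap is only there so the loop cannot spin forever if the message never shows up.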

Did you follow the above steps?

Regards,
Subir

Posted: Mon Jan 15, 2007 2:18 pm
by hindog
I ran the commands you provided, but the problem still persists. Same stack trace shows in log.

What else could I try?

Aaron

Re: Continued corruption errors.. 11-GA

Posted: Mon Jan 15, 2007 2:28 pm
by dkelly
hindog wrote:WARNING Internet Mail (Incoming ) 01.15.07 00:30:29
[OM.UX 1401] Cannot read message data.
-> ux_InReadMailText
<- ux_InReadMailText
-> ux_InParseCommandLine
<- ux_InParseCommandLine
-> ux_InReadMailText
<- ux_InReadMailText
-> ux_InParseCommandLine
<- ux_InParseCommandLine
-> ux_InReadMailText
<- ux_InReadMailText
-> ux_InParseCommandLine
<- ux_InParseCommandLine
-> ux_InReadMailText
<- /build/scalix-MAIN/src/bin/ux/ux_in.c:1118[102,1401]
<- /build/scalix-MAIN/src/bin/ux/ux_in.c:4608[102,1401]
<- /build/scalix-MAIN/src/bin/ux/ux_in.c:2009[102,1401]

I'll take this one as you haven't supplied any logging for the other 2 issues and this one isn't connected.

This error typically occurs if the sending end of the incoming SMTP conversation has closed down unexpectedly. You should set

Code: Select all

DEBUG_LOG=TRUE
in /var/opt/scalix/NN/s/sys/smtpd.cfg and restart the SMTP Relay to see what data is coming in that could be causing this. The file is /var/opt/scalix/NN/s/tmp/smtpd-SMTP.log. Be warned that, depending on your traffic, it can get to be a big file but, if this is happening as regularly as you describe, it won't be too difficult to get a problem message.

In the past, we've seen that some spam messages contain NULL characters which caused the incoming gateway to get confused. I believe this was fixed for Scalix 11 GA though.
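If you want a quick check for NUL characters, something like the sketch below counts them in a file. It assumes you have saved the suspect message (or a log excerpt containing the raw data) to a file of your own; count_nul_bytes is a hypothetical helper name, not a Scalix tool.

```shell
#!/bin/sh
# Sketch: count NUL bytes in a saved message or log excerpt.
# Assumption: count_nul_bytes is a made-up helper; the file is one you saved yourself.
count_nul_bytes() {
    # tr -cd deletes every byte except NUL; wc -c counts what is left.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}
```

A non-zero count would suggest the message is worth a closer look; zero means NUL characters are probably not the cause for that message.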

Cheers

Dave

Posted: Mon Jan 15, 2007 2:43 pm
by hindog
This problem seems to occur after I do a restart of the server (for instance, the other day I stopped Scalix, updated the server's time, and then restarted Scalix).

Also, when this problem happens, it affects anywhere from 20-1000 or more messages. In Outlook, it will display a message box for each affected message, and users need to hold down the ENTER key to acknowledge each message. So when the problem occurs, it does not occur one message at a time, it occurs with hundreds of messages at a time, and generally after a server restart.

Does this still seem to be linked to the SMTP server?

Posted: Mon Jan 15, 2007 3:03 pm
by dkelly
No, this isn't linked to the SMTP Relay which is why I'm trying to separate them out.

If it was a message store corruption, I'd expect to see more information from the command

Code: Select all

omshowlog -e
What logs have you checked so far?

Cheers

Dave

Posted: Mon Jan 15, 2007 3:16 pm
by hindog
As for which logs to view, I am only familiar with the smtpd log (with DEBUG_SMTP=true) and omshowlog. (By the way, where does the Tomcat log get written?)

There seem to be hundreds of errors like these in the logs:

ERROR Browser (Service 14 ) 01.12.07 16:00:32
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimepTo6zk 0001ab2681967e36


ERROR Browser (Service 14 ) 01.12.07 16:00:33
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimeXLH7xE 0001ab290ff36829


ERROR Browser (Service 14 ) 01.12.07 16:00:33
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimed9vv0q 0001ab2681967e36


ERROR Browser (Service 14 ) 01.12.07 16:00:34
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimezQfYxl 0001ab290ff36829


ERROR Browser (Service 14 ) 01.12.07 16:00:35
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimeHRIQxc 0001ab290ff36829


ERROR Browser (Service 14 ) 01.12.07 16:00:37
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mime9B749L 0001ab290ff36829


ERROR Browser (Service 14 ) 01.12.07 16:00:39
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimeda22OT 0001ab290ff36829


ERROR Browser (Service 14 ) 01.12.07 16:00:39
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mime9up6ty 0003519d824bc301

Some are slightly different (providing a "Last Msg Id" and "Last Msg DirectRef"):


ERROR Browser (Service 14 ) 01.12.07 16:16:59
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimebXxOz8 0001ab5fe32075d2
Last Msg Id: OLELKIMOAJINGIBDOCMLGEDACCAA.bsnell(a)edicorp.com
Last Msg DirectRef: 000367a7b57e74ec


I don't see anything else tagged "ERROR" in the logs.

Aaron

Posted: Mon Jan 15, 2007 3:33 pm
by dkelly
hindog wrote:There seem to be hundreds of errors like these in the logs:

ERROR Browser (Service 14 ) 01.12.07 16:00:32
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimepTo6zk 0001ab2681967e36


ERROR Browser (Service 14 ) 01.12.07 16:00:33
[OM.MIME 4000] Browser Args :index.browse -c -o /var/opt/scalix/el/s/temp/mime_cache/mimeXLH7xE 0001ab290ff36829


The direct reference at the end of each of those messages seems to repeat.
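To see how often each direct reference repeats, a small tally along these lines may help. It is only a sketch: it assumes the log lines look exactly like the ones quoted above, with the direct reference as the last field, and count_direct_refs is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: tally the direct references at the end of the OM.MIME 4000 lines.
# Assumption: input has the same layout as the log excerpt quoted above.
count_direct_refs() {
    # Keep only the "Browser Args" lines and print their last field,
    # then count duplicates and list the most frequent reference first.
    awk 'index($0, "[OM.MIME 4000] Browser Args") { print $NF }' "$@" |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

It reads the files given as arguments, or standard input if there are none, so piping in the output of `omshowlog -e` should work if that output matches the excerpt. Each output line is a count followed by the reference.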

From the command line, run the command

Code: Select all

index.browse -c 0001ab290ff36829
to see if that's returning anything. It should return the MIME output for the message.

If nothing is returned, it may be because that direct reference no longer exists after you recreated the mailbox. If that's the case, please take the direct reference from a more recent error message.
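If you have several direct references to try, a loop like the sketch below saves some typing. It assumes only what is said above, that `index.browse -c <ref>` prints the MIME output when the reference exists; check_direct_refs and BROWSE_CMD are hypothetical names.

```shell
#!/bin/sh
# Sketch: report which direct references return any output.
# Assumption: BROWSE_CMD and check_direct_refs are made-up names;
# "index.browse -c <ref>" is the command given in this post.
BROWSE_CMD=${BROWSE_CMD:-index.browse}

check_direct_refs() {
    for ref in "$@"; do
        # Non-empty output is treated as "message found".
        if [ -n "$($BROWSE_CMD -c "$ref" 2>/dev/null)" ]; then
            echo "$ref present"
        else
            echo "$ref empty"
        fi
    done
}
```

For example, `check_direct_refs 0001ab290ff36829 0001ab2681967e36` would print one line per reference.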

One other thing I'll have you check is that the permissions are all correct on the Scalix message store.

Run the command:

Code: Select all

omcheck -s -d > /tmp/omcheck.sh
and this will generate a script that will reset any permissions that are incorrect. You can run

Code: Select all

sh /tmp/omcheck.sh
to execute the script.
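If you would rather look at what the generated script will do before running it, a wrapper like this sketch prints it first. The `omcheck -s -d` call and the /tmp/omcheck.sh path are from the steps above; review_omcheck is a hypothetical name, and treating an empty file as "nothing to change" is my assumption.

```shell
#!/bin/sh
# Sketch: generate the omcheck script and show it instead of running it blindly.
# Assumption: review_omcheck is a made-up helper; an empty script is taken
# to mean omcheck found nothing to reset.
review_omcheck() {
    script=${1:-/tmp/omcheck.sh}
    omcheck -s -d > "$script" || return 1
    if [ -s "$script" ]; then
        echo "omcheck proposes the following changes ($script):"
        cat "$script"
    else
        echo "omcheck wrote an empty script: $script"
    fi
}
```

Once you are happy with the contents, run `sh /tmp/omcheck.sh` as described.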

Cheers

Dave

Posted: Mon Jan 15, 2007 4:01 pm
by hindog
I ran

Code: Select all

index.browse -c 0003f545b6a85d77

on the last error in the logs and the command immediately returned (no output).

I also ran the permissions script:

Code: Select all

email:~ # omcheck -s -d > /tmp/omcheck.sh
email:~ # sh /tmp/omcheck.sh


The command completed. Afterwards I ran the index.browse command again with the same result (no output). I restarted my email client and the problem still persists.

Anything else we can try?

I appreciate the quick responses...

Posted: Tue Jan 16, 2007 1:01 pm
by hindog
Should I report this as a bug to scalixzilla?

Another new install seeing this error

Posted: Thu Jan 18, 2007 3:15 am
by deyjvu
We have a new site that has just installed a new server with V11 and they are seeing this error OM.MIME 4000 (I'll get the specifics of their error) but they are also seeing Browser Service 14 OM 3010 "file to convert doesn't exist".

Don't know if the two errors are related but didn't want you to think this was an isolated case.

Have just found that they also got the OM3420 and POP 1047 errors that have been fixed in V11.0.1.1 (?) according to bugz.