Mails disappearing on omresub -q ERROR

hklygre
Posts: 28
Joined: Tue Jun 26, 2007 11:40 am

Post by hklygre » Wed Oct 31, 2007 11:48 am

Hi,

Yesterday three messages showed up on the ERROR queue, about 30 minutes apart. (Lots of other messages were delivered in the meantime.)

Each time there is a corresponding error in the FATAL-log:

Code:

ERROR                          Local Delivery(Local Delivery) 10.30.07 15:29:43
[OM 1001] Transaction File record size is out of bounds
  Last Msg Id: H00002f5000837a9.1193754583.mail.datalab.no
        -> tf_ReadRecord
        -> tf_GetINT32
        <- tf_GetINT32
        <- tf_ReadRecord 40000 141
        -> tf_GetINT32
        <- tf_GetINT32
        -> tf_ReadRecord
        -> tf_GetINT32
        <- tf_GetINT32
        <- tf_ReadRecord 40000 141
        -> tf_GetINT32
        <- tf_GetINT32
        -> tf_ReadRecord
        <- /build/11.1.0/src/lib/tf/tf_ReadRec.c:73[3,1001]
        <- /build/11.1.0/src/lib/tf/tf_ReadRec.c:88[3,1001]
        <- /build/11.1.0/src/lib/red/red_gen.c:1173[3,1001]
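
A minimal Python sketch for pulling the key fields out of an entry like the one above. The header layout (severity, service, subsystem in parentheses, then an MM.DD.YY timestamp) and the reading of the two numbers after tf_ReadRecord as record type and size are assumptions based only on the excerpts quoted in this thread, not on Scalix documentation. The names parse_entry, HEADER_RE and so on are illustrative.

```python
import re
from datetime import datetime

# Assumed header shape: SEVERITY  Service(Subsystem) MM.DD.YY HH:MM:SS
HEADER_RE = re.compile(
    r"^(?P<severity>[A-Z]+)\s+(?P<service>[^()]+)\((?P<subsystem>[^()]*)\)\s+"
    r"(?P<stamp>\d{2}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2})$"
)
CODE_RE = re.compile(r"^\[OM (?P<code>\d+)\]\s*(?P<text>.*)$")
MSGID_RE = re.compile(r"^Last Msg Id:\s*(?P<msgid>\S+)")
# Assumption: the two numbers are the type and size of the last record read.
RECORD_RE = re.compile(r"^<- tf_ReadRecord (?P<rtype>\d+) (?P<size>\d+)$")


def parse_entry(text):
    """Return a dict with the fields found in one event-log entry."""
    entry = {"trace_records": []}
    for line in text.splitlines():
        stripped = line.strip()
        header = HEADER_RE.match(stripped)
        code = CODE_RE.match(stripped)
        msgid = MSGID_RE.match(stripped)
        record = RECORD_RE.match(stripped)
        if header:
            entry["severity"] = header["severity"]
            entry["service"] = header["service"].strip()
            entry["subsystem"] = header["subsystem"].strip()
            entry["timestamp"] = datetime.strptime(
                header["stamp"], "%m.%d.%y %H:%M:%S"
            )
        elif code:
            entry["code"] = int(code["code"])
            entry["text"] = code["text"]
        elif msgid:
            entry["msgid"] = msgid["msgid"]
        elif record:
            entry["trace_records"].append(
                (int(record["rtype"]), int(record["size"]))
            )
    return entry


# Shortened copy of the entry quoted in this thread.
SAMPLE_ENTRY = """\
ERROR                          Local Delivery(Local Delivery) 10.30.07 15:29:43
[OM 1001] Transaction File record size is out of bounds
  Last Msg Id: H00002f5000837a9.1193754583.mail.datalab.no
        -> tf_ReadRecord
        <- tf_ReadRecord 40000 141
        <- /build/11.1.0/src/lib/tf/tf_ReadRec.c:73[3,1001]
"""

entry = parse_entry(SAMPLE_ENTRY)
print(entry["code"], entry["msgid"], entry["trace_records"])
```

Collecting the message IDs this way makes it easier to tell the affected senders which mails to resend.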


Server is 11.1. In all cases, senders are on Outlook with the 11.2 connector.

Now, we're in a transition period: some users are active Scalix users, and some are still on the old email system. The latter have accounts on Scalix, but redirect rules send all their mail to the old system via SMTP. In this case, at least one of the mails was delivered to all users actively using the Scalix system, while the redirected copies were not delivered. (There is nothing in the sendmail log or in the logs on the smarthost.) I don't know if this is relevant; in any case it is not consistent, since mail flows between the two servers all the time with no problems.


According to the omresub man-page, "If you resubmit messages on an error queue without having corrected the problem that caused the messages to be there, the messages will simply return to the error queue again". Not so. I did an 'omresub -q ERROR', getting

Code:

REPORT                         Administration(Resubmit      ) 10.30.07 15:42:19
[OM 4004] Resubmitted 3 messages


in the logs in return. And now the messages are gone. I have no idea where they went: they were not delivered, and they are not in the ERROR queue.

I realise that there is little to be done about the lost messages (except notifying the users who sent them, and turning on auditing before resubmitting next time), but where did they go? Is this an expected result of an omresub?


- Håvard

dannyt
Scalix
Posts: 140
Joined: Mon Aug 08, 2005 11:52 am
Location: UK

Post by dannyt » Thu Nov 08, 2007 1:35 pm

Hi,

This sounds similar to bug 15614 which is fixed in 11.2 - see bugzilla.scalix.com for details.

Regards,
Danny

fkienker
Posts: 79
Joined: Sat Nov 18, 2006 1:08 pm
Location: Atlanta GA USA

Post by fkienker » Fri Nov 09, 2007 11:28 am

Gone but not forgotten! I'm getting almost exactly the same error under exactly the same circumstances but this time in 11.2. Never had the problem in 11.1.

ERROR Local Delivery(Local Delivery) 11.09.07 10:09:32
[OM 1001] Transaction File record size is out of bounds
Last Msg Id: H0000072001a0784.1194620972.mail.gardnergroff.net
<- tf_GetINT32
<- tf_ReadRecord 40000 177
-> tf_GetINT32
<- tf_GetINT32
-> tf_ReadRecord
-> tf_GetINT32
<- tf_GetINT32
<- tf_ReadRecord 40000 177
-> tf_GetINT32
<- tf_GetINT32
-> tf_ReadRecord
-> tf_GetINT32
<- tf_GetINT32
<- /build/11.2.0/src/lib/tf/tf_ReadRec.c:96[3,1001]
<- /build/11.2.0/src/lib/tf/tf_ReadRec.c:112[3,1001]
<- /build/11.2.0/src/lib/red/red_gen.c:1173[3,1001]

hklygre
Posts: 28
Joined: Tue Jun 26, 2007 11:40 am

Post by hklygre » Thu Nov 15, 2007 10:40 am

dannyt wrote:
Hi,

This sounds similar to bug 15614 which is fixed in 11.2 - see bugzilla.scalix.com for details.

Regards,
Danny


OK, let's hope so. I've just upgraded to 11.2, so we'll see. I see another user still has the problem.

- Håvard

hklygre
Posts: 28
Joined: Tue Jun 26, 2007 11:40 am

Post by hklygre » Mon Nov 19, 2007 8:29 am

dannyt wrote:
Hi,

This sounds similar to bug 15614 which is fixed in 11.2 - see bugzilla.scalix.com for details.

Regards,
Danny


Unfortunately, the problem still exists, now with Scalix Enterprise 11.2 and Outlook with the Scalix plugin 11.2.

Code:

WARNING                        Local Delivery(Local Delivery) 11.19.07 09:44:37
[OM 24070] Debug message for Lab use :
tf_ReadRecord bad rec. offset 6627. Unexpected size=0. LastReadRecord Type=40000 size=133.
File=.
  Last Msg Id: H00002fb0019c154.1195461877.mail.datalab.no


ERROR                          Local Delivery(Local Delivery) 11.19.07 09:44:37
[OM 1001] Transaction File record size is out of bounds
  Last Msg Id: H00002fb0019c154.1195461877.mail.datalab.no
        <- tf_GetINT32
        <- tf_ReadRecord 40000 133
        -> tf_GetINT32
        <- tf_GetINT32
        -> tf_ReadRecord
        -> tf_GetINT32
        <- tf_GetINT32
        <- tf_ReadRecord 40000 133
        -> tf_GetINT32
        <- tf_GetINT32
        -> tf_ReadRecord
        -> tf_GetINT32
        <- tf_GetINT32
        <- /build/11.2.0/src/lib/tf/tf_ReadRec.c:96[3,1001]
        <- /build/11.2.0/src/lib/tf/tf_ReadRec.c:112[3,1001]
        <- /build/11.2.0/src/lib/red/red_gen.c:1173[3,1001]
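
The OM 24070 line carries the most specific detail (offset, unexpected size, last good record), so it may be worth extracting it mechanically when comparing incidents. A minimal Python sketch, assuming the debug message always has the wording shown above; the field names and parse_bad_record are illustrative, not part of any Scalix tool.

```python
import re

# Pattern copied from the wording of the OM 24070 message quoted above.
DEBUG_RE = re.compile(
    r"tf_ReadRecord bad rec\. offset (?P<offset>\d+)\. "
    r"Unexpected size=(?P<size>-?\d+)\. "
    r"LastReadRecord Type=(?P<last_type>\d+) size=(?P<last_size>\d+)"
)


def parse_bad_record(line):
    """Return the numeric fields of an OM 24070 debug line, or None."""
    match = DEBUG_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


SAMPLE_LINE = (
    "tf_ReadRecord bad rec. offset 6627. Unexpected size=0. "
    "LastReadRecord Type=40000 size=133."
)

info = parse_bad_record(SAMPLE_LINE)
print(info)
```

In this entry the last good record reported by the warning (type 40000, size 133) matches the last tf_ReadRecord line in the OM 1001 trace.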


As I mentioned in the first post, we're in a transition period, so for some users we have REDIRECT rules (created through sxaa), and this mail was delivered to all users except those with redirect rules. It was delivered both to external users and to "normal" Scalix users.

The rules work - all (other) mail gets delivered normally - from and to the same users.

Richard Hall
Scalix
Posts: 147
Joined: Fri May 20, 2005 5:37 am

Post by Richard Hall » Wed Nov 21, 2007 5:33 am

Hi,

As you say - it looks like a problem specifically to do with redirection.

I assume from what you say that this doesn't occur for all redirected msgs, just a few from time to time?
I'd like to get hold of a problem message from your ERROR queue, and also the auto-actions for the user associated with the failure. Do you know how to get this information?

Cheers - Richard

hklygre
Posts: 28
Joined: Tue Jun 26, 2007 11:40 am

Post by hklygre » Wed Nov 21, 2007 6:00 am

Richard Hall wrote:
Hi,

As you say - it looks like a problem specifically to do with redirection.

I assume from what you say that this doesn't occur for all redirected msgs, just a few from time to time?

That is correct. It happened on Oct 30, and again (twice) on Nov 19. It has happened previously as well. There are hundreds of redirected messages each day that work normally.

It appears to happen in "groups" - i.e. several messages during a span of a few hours. They didn't have anything in common - different senders and recipients.
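
To check the "groups" impression against the logs, the error timestamps can be clustered. A rough Python sketch, assuming the MM.DD.YY HH:MM:SS format seen in the log entries; the three-hour window is an arbitrary choice, and all sample timestamps except 10.30.07 15:29:43 and 11.19.07 09:44:37 are made up for illustration.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%m.%d.%y %H:%M:%S"  # as seen in the event log headers


def group_bursts(stamps, window=timedelta(hours=3)):
    """Group timestamps so that neighbours within `window` share a burst."""
    times = sorted(datetime.strptime(s, STAMP_FORMAT) for s in stamps)
    bursts = []
    for moment in times:
        if bursts and moment - bursts[-1][-1] <= window:
            bursts[-1].append(moment)  # close enough to the previous error
        else:
            bursts.append([moment])    # start a new burst
    return bursts


# Two timestamps come from this thread; the rest are hypothetical.
SAMPLE_STAMPS = [
    "10.30.07 14:29:05",  # hypothetical
    "10.30.07 14:58:10",  # hypothetical
    "10.30.07 15:29:43",
    "11.19.07 09:44:37",
    "11.19.07 11:02:00",  # hypothetical
]

bursts = group_bursts(SAMPLE_STAMPS)
for burst in bursts:
    print(burst[0].date(), len(burst), "errors")
```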

I did resubmit the messages (omresub), and this time they were re-delivered to the external users, but not to the internal redirected users or to the internal "normal" users.

Richard Hall wrote:
I'd like to get hold of a problem message from your ERROR queue, and also the auto-actions for the user associated with the failure. Do you know how to get this information?


The only auto-actions are the ones created with sxaa:

Code:

# sxaa --user 'User Name/example' --redirect username@oldmail.example.com
# sxaa --user username@example.com --info 600
Action
------
REDIRECT to S=username/OU1=internet/DDT1=RFC-822/DDV1=username@oldmail.example.com/CN=username/INTERNET-ADDR=username@oldmail.example.com
#



- it's the same for all users. (Slightly obfuscated for the forum)

Regarding the message - they're gone now, but next time I can send one to you. I might need a bit of hand-holding re. what to do - I've looked at the man-pages for omqdump and omcontain, but I don't know exactly what you're looking for.



- Håvard

Richard Hall
Scalix
Posts: 147
Joined: Fri May 20, 2005 5:37 am

Post by Richard Hall » Wed Nov 21, 2007 6:54 am

Hi Håvard,

I'll PM you later today with instructions on how to export a msg from the ERROR queue and also how to get the 'raw' auto-action file.

Richard

hklygre
Posts: 28
Joined: Tue Jun 26, 2007 11:40 am

Post by hklygre » Wed Nov 21, 2007 8:10 am

Richard Hall wrote:
Hi Håvard,

I'll PM you later today with instructions on how to export a msg from the ERROR queue and also how to get the 'raw' auto-action file.

Richard


There's no time like the present - it's happened again.

- Håvard

mikethebike
Posts: 566
Joined: Mon Nov 28, 2005 4:16 pm
Location: England

Post by mikethebike » Wed Nov 21, 2007 10:15 am

Hi,
this suggests to me that something is wrong with the redirect rule (either the 3d file or the 3e.xxx file in the user's "g" directory).
The reason I say this is that the error is occurring at local delivery rather than at the service router (local delivery checks auto-actions before anything else, and seems to be messing up here).

What do you get if you tfbrowse those two files (3d and 3e)?

It may also be worth increasing the event log level for local delivery (omconflvl local 11) to see if that shows any more info.

Mick

hklygre
Posts: 28
Joined: Tue Jun 26, 2007 11:40 am

Post by hklygre » Thu Nov 22, 2007 2:49 am

mikethebike wrote:
Hi,
this suggests to me that something is wrong with the redirect rule (either the 3d file or the 3e.xxx file in the user's "g" directory).
The reason I say this is that the error is occurring at local delivery rather than at the service router (local delivery checks auto-actions before anything else, and seems to be messing up here).

What do you get if you tfbrowse those two files (3d and 3e)?

It may also be worth increasing the event log level for local delivery (omconflvl local 11) to see if that shows any more info.

Mick


The thing is, in almost every case the redirect rule works. It's just once in a while that it doesn't, and then it fails for every recipient of that mail who has a redirect rule. The same users receive plenty of other mail without problems.

Per Scalix's suggestion I've turned on auditing, so now I'm just waiting for it to happen again.

- Håvard

fkienker
Posts: 79
Joined: Sat Nov 18, 2006 1:08 pm
Location: Atlanta GA USA

Post by fkienker » Fri Nov 30, 2007 7:21 pm

I don't know if this is helpful or not.

We have four premium users who have redirects to their BlackBerrys. The redirects were set up with sxaa. There are no conditionals on the redirects: every incoming message for all four is forwarded to their respective BlackBerrys. Only one user has the problem; the other three NEVER get the OM 1001 error message. And for the one user, it's maybe one message out of every 150 that creates an error. There is always an OM 24070 error preceding each OM 1001 error.
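
The "always an OM 24070 before each OM 1001" observation can be checked against a saved copy of the log. A minimal Python sketch, assuming entries start with a header line of the form seen in this thread and that both entries carry the same "Last Msg Id"; unpaired_errors is an illustrative name, and the second message ID in the sample is hypothetical.

```python
import re

# Assumed header shape: SEVERITY  Service(Subsystem) MM.DD.YY HH:MM:SS
HEADER_RE = re.compile(
    r"^[A-Z]+[ \t]+[^()\n]+\([^()\n]*\)[ \t]+"
    r"\d{2}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}[ \t]*$",
    re.MULTILINE,
)
CODE_RE = re.compile(r"^\[OM (\d+)\]", re.MULTILINE)
MSGID_RE = re.compile(r"Last Msg Id:\s*(\S+)")


def split_entries(log_text):
    """Cut the log into entries, each starting at a header line."""
    starts = [m.start() for m in HEADER_RE.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    return [log_text[s:e] for s, e in zip(starts, ends)]


def unpaired_errors(log_text):
    """Return message IDs of OM 1001 errors with no earlier OM 24070."""
    warned = set()
    unpaired = []
    for entry in split_entries(log_text):
        code = CODE_RE.search(entry)
        msgid = MSGID_RE.search(entry)
        if not code or not msgid:
            continue
        if code.group(1) == "24070":
            warned.add(msgid.group(1))
        elif code.group(1) == "1001" and msgid.group(1) not in warned:
            unpaired.append(msgid.group(1))
    return unpaired


SAMPLE_LOG = """\
WARNING                        Local Delivery(Local Delivery) 11.19.07 09:44:37
[OM 24070] Debug message for Lab use :
  Last Msg Id: H00002fb0019c154.1195461877.mail.datalab.no

ERROR                          Local Delivery(Local Delivery) 11.19.07 09:44:37
[OM 1001] Transaction File record size is out of bounds
  Last Msg Id: H00002fb0019c154.1195461877.mail.datalab.no

ERROR                          Local Delivery(Local Delivery) 11.19.07 10:15:00
[OM 1001] Transaction File record size is out of bounds
  Last Msg Id: H0000000000000000.0000000000.mail.example.com
"""

print(unpaired_errors(SAMPLE_LOG))
```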

I'm going to try completely clearing out the 3d file for the affected user and reconstructing the rules to see if this makes a difference.

If there is any more information I can provide, please contact me on or off the list.

