Problem after upgrade to 10 with moved /var/opt/scalix

BigBirdy
Posts: 133
Joined: Sun Mar 13, 2005 2:10 pm
Location: Squamish, BC

Postby BigBirdy » Sat Mar 11, 2006 3:04 am

Prior to upgrading to Scalix 10, I had moved the /var/opt/scalix folder to /home/scalix, since I had more space on the home drive. After the upgrade everything worked fine, but I had not rebooted. Today I rebooted, and now no mail is getting to the local accounts: the "local delivery" service won't start and mail is piling up in the "local delivery" queue. The symbolic link from /var/opt/scalix still exists, so I am not sure why the local delivery service isn't starting. I am confident that the mail is not lost, since it's in the queue, but I don't know how to get that service running again and mail delivery back to normal.
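
A quick way to rule the relocation in or out is to confirm that the link still resolves to the directory it was moved to. The following is a minimal sketch, not a Scalix tool: check_link is a hypothetical helper name, and the two paths in the usage note are simply the ones mentioned in this post.

```shell
#!/bin/sh
# Hypothetical helper: verify that LINK is a symlink resolving to EXPECTED.
# Usage: check_link LINK EXPECTED
check_link() {
    link=$1
    expected=$2
    # -L tests the link itself, without following it.
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link"
        return 1
    fi
    # -d follows the link, so a dangling link fails here.
    if [ ! -d "$link" ]; then
        echo "dangling or non-directory target: $link"
        return 1
    fi
    # Resolve both sides to physical paths in subshells (cwd is untouched).
    actual=$(cd "$link" >/dev/null 2>&1 && pwd -P)
    want=$(cd "$expected" >/dev/null 2>&1 && pwd -P)
    if [ "$actual" != "$want" ]; then
        echo "unexpected target: $actual"
        return 1
    fi
    echo "ok: $link -> $actual"
}
```

Run it as check_link /var/opt/scalix /home/scalix; a non-zero exit status means the link is missing, dangling, or points somewhere else. It says nothing about ownership or permissions on the target, which would need a separate look.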

Thanks

ScalixSupport
Scalix
Posts: 5503
Joined: Thu Mar 25, 2004 8:15 pm

Postby ScalixSupport » Sat Mar 11, 2006 11:22 am

What do the Scalix error logs show?

Cheers

Dave

BigBirdy

Postby BigBirdy » Sat Mar 11, 2006 12:07 pm

Which error logs? Where are they located?

ScalixSupport

Postby ScalixSupport » Sat Mar 11, 2006 12:19 pm

Please read through the sticky note at the top of this forum: http://www.scalix.com/community/viewtopic.php?t=1397

Cheers

Dave

BigBirdy

Postby BigBirdy » Sat Mar 11, 2006 12:20 pm

Here is what goes into the local delivery service error log after a restart:

SERIOUS ERROR Local Delivery(Local Delivery) 03.11.06 08:19:08
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 23400
Procedure trace follows:
-> nm_AppendFieldMem
-> nm_AddSeparators
<- nm_AddSeparators
-> nm_ParseORN
<- nm_ParseORN
<- nm_AppendFieldMem
<- nm_PutFieldMem
-> nm_PutFieldMem
-> nm_AppendFieldMem
<- nm_AppendFieldMem
<- nm_PutFieldMem
-> nm_ParseORN
<- nm_ParseORN
<- ul_utUnpackUserEnt
<- ul_FindPrimeUser
<- ul_IsEnu


SERIOUS ERROR Local Delivery(Local Delivery) 03.11.06 08:19:08
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0x51df06]
/opt/scalix/lib/libom_er.so[0x51e1d6]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0x51e37f]
/lib/tls/libpthread.so.0[0xc2a7c8]
[0x696c6e69]
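
The pasted log follows a regular shape: a header line beginning with SERIOUS ERROR and ending in a date and time, then a line beginning with an [OM nnnnn] code. Assuming the log file on disk looks exactly like the excerpt above (an assumption, not something confirmed in this thread), a short awk filter could reduce a long log to one line per serious error. summarise_serious is a hypothetical helper name, and the log path is left as an argument because its location is not given here.

```shell
#!/bin/sh
# Hypothetical helper: print "<date> <time> [OM code] message" for each
# SERIOUS ERROR entry. Reads the named files, or stdin if none are given.
summarise_serious() {
    awk '
        # Header line: remember the last two fields (date and time).
        /^SERIOUS ERROR/ { stamp = $(NF-1) " " $NF; pending = 1; next }
        # First [OM nnnnn] line after a header belongs to that entry.
        pending && /^\[OM [0-9]+\]/ { print stamp, $0; pending = 0 }
    ' "$@"
}
```

For the excerpt above this would print two lines, one for OM 10270 and one for OM 10272, both stamped 03.11.06 08:19:08. REPORT entries are skipped, so only the failures remain.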

BigBirdy

Postby BigBirdy » Sat Mar 11, 2006 3:00 pm

I am not sure if this is the cause or a related problem, but the first reboot of the server yesterday, a week or so after the upgrade to 10, was abrupt: it was caused by a power bar reset. I use Webmin for management, and when I go into the Sendmail configuration and try to open "Read User Mail" I get the message below.

HTTP/1.0 500 Perl execution failed
Server: MiniServ/0.01
Date: Sat, 11 Mar 2006 17:41:03 GMT
Content-type: text/html
Connection: close

Error - Perl execution failed
Illegal modulus zero at ./mailboxes-lib.pl line 471.

There is nothing in messages or maillog to indicate what the problem is, and I "seem" to be able to run mail -s "Test" user@localhost without any problems.

This is a serious problem for us right now.

florian
Scalix
Posts: 3852
Joined: Fri Dec 24, 2004 8:16 am
Location: Frankfurt, Germany

Postby florian » Sun Mar 12, 2006 11:03 am

Birdy,

I believe we'll need to check what is running and what is not. After a restart, what is the full status of the system as reported by omstat -s and omstat -a? The local delivery service is critical for mail to be delivered successfully, but it is not the most fundamental service: it depends on others, so we should check those first.
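
To make the status easy to post back here, the output of both commands can be captured in one file. This is only a convenience wrapper around the two omstat invocations named above; collect_status and the default file name scalix-status.txt are made up for illustration, and the commands are assumed to be run as a user allowed to query Scalix.

```shell
#!/bin/sh
# Hypothetical helper: save the output of "omstat -s" and "omstat -a"
# (stdout and stderr) into one file. Usage: collect_status [OUTFILE]
collect_status() {
    out=${1:-scalix-status.txt}
    # Bail out early if omstat is not available in this shell.
    if ! command -v omstat >/dev/null 2>&1; then
        echo "omstat not found in PATH" >&2
        return 127
    fi
    {
        echo "== omstat -s =="
        omstat -s
        echo "== omstat -a =="
        omstat -a
    } >"$out" 2>&1
    echo "wrote $out"
}
```

Calling collect_status right after a restart, and again once Local Delivery has died, gives two snapshots that can be compared side by side.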

The Webmin error message most likely has nothing to do with it. Scalix has its own message store and does not keep any data in the mbox or maildir directories that sendmail delivers its data to directly.

-- f.
Florian von Kurnatowski, Die Harder!

tom__b_

Got the same problem here

Postby tom__b_ » Fri Mar 24, 2006 4:56 pm

This is on a clean Scalix 10 install. When I delete all the messages in the queue and start 'Local Delivery', it runs until the next mail comes in, at which point it throws this error again.

This is a list of the components that are installed on this server.
scalix-res     version 10.0.0.354  release 1        installed on Mon 06 Mar 2006 08:38:50 PM GMT
scalix-server  version 10.0.0.175  release 1.rhel4  installed on Mon 06 Mar 2006 08:38:48 PM GMT
scalix-sac     version 10.0.0.354  release 1        installed on Mon 06 Mar 2006 08:38:50 PM GMT
scalix-swa     version 10.0.0.343  release 1        installed on Mon 06 Mar 2006 08:38:50 PM GMT


REPORT Local Delivery(Error Manager ) 03.24.06 20:22:16
[OM 8801] Error Manager Server Started Up


REPORT Local Delivery(Local Delivery) 03.24.06 20:22:16
[OM 7601] Local Delivery Started Up


SERIOUS ERROR Local Delivery(Local Delivery) 03.24.06 20:22:16
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 6470
Procedure trace follows:
-> nm_AppendFieldMem
-> nm_AddSeparators
<- nm_AddSeparators
-> nm_ParseORN
<- nm_ParseORN
<- nm_AppendFieldMem
<- nm_PutFieldMem
-> nm_PutFieldMem
-> nm_AppendFieldMem
<- nm_AppendFieldMem
<- nm_PutFieldMem
-> nm_ParseORN
<- nm_ParseORN
<- ul_utUnpackUserEnt
<- ul_FindPrimeUser
<- ul_IsEnu


SERIOUS ERROR Local Delivery(Local Delivery) 03.24.06 20:22:16
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0xd66f06]
/opt/scalix/lib/libom_er.so[0xd671d6]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0xd6737f]
/lib/tls/libpthread.so.0[0x143888]
[0x2e333231]

Has anybody come across this? A little nudge in the right direction would be really appreciated.

ScalixSupport

Postby ScalixSupport » Fri Mar 24, 2006 5:57 pm

First type:

omshowenu

and see if it's correctly set to a valid user. Assuming it is, log on as that user; it's likely you'll find that the mailbox has 65536 messages in its inbox. Clean out those messages (you can use omtidyu to bulk delete messages; see the man page for details) and restart Local Delivery (LD).

Thanks,
Rachel

tom__b_

Postby tom__b_ » Sat Mar 25, 2006 7:42 am

[root@mail # omshowenu
sxadmin /mail,xxxx/CN=sxadmin

Logged in and found only 2 mails. Deleted them. Still no luck.

Disabled calmd and spamassassin. Didn't help.

Uninstalled and reinstalled scalix but kept the message store. Didn't help.

Uninstalled and reinstalled all components, including the store. Yes, now it's working again (duh).

The only thing I can think of is the short power interruption last week.

And I was so close to moving our mail over to Scalix. If anybody manages to fix this error without reinstalling, please post it. Thanks.
Last edited by tom__b_ on Sat Apr 01, 2006 6:12 am, edited 1 time in total.

florian

Postby florian » Sat Mar 25, 2006 8:13 am

Hi Tom,

Sorry to hear that.

I would assume that the power outage caused a system crash, which in turn caused partial corruption of the message store, apparently affecting the mailbox of your sxadmin user in particular.

Good that it works for you again. I assume that, as this was still a test install, you didn't lose any important data?

Now, what I want to be very clear about is that there are a number of troubleshooting steps that we would normally have recommended before going that far. We've had almost no occasion with our product where a full reset (or restore from backup) of the message store was necessary. This is important for us as a lot of our server team's hard work and pride is put into the stability of this component.

So, if something like this or similar happens to you, you should...
... run omscan -A on the affected user's mailbox, also in verbose (-v) and fix (-f) modes
... see if you can log in to the mailbox using any other client type and find/see/delete the corrupted message
... use the low-level container editor omcontain to inspect and debug the mailbox
... delete and recreate (and restore from backup) this one user's mailbox while everybody else keeps using the system.

The whole theory is that if something is ever corrupted inside the message store, then, because the architecture is file-based and everything is spread across many files, the corruption is in almost all cases isolated to a single message, a single folder, or a single user. This is very different from a database-driven approach, where corruption of a key database file usually affects all users.

Hope this helps. If you still have a copy of your old store available, we can also discuss here how to walk you through the steps outlined above.

Cheers,
Florian.
Florian von Kurnatowski, Die Harder!

tom__b_

Again!

Postby tom__b_ » Thu Apr 13, 2006 8:08 am

I'm back:

SERIOUS ERROR Local Delivery(Local Delivery) 04.13.06 12:53:35
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 17856
Procedure trace follows:
-> nm_AppendFieldMem
-> nm_AddSeparators
<- nm_AddSeparators
-> nm_ParseORN
<- nm_ParseORN
<- nm_AppendFieldMem
<- nm_PutFieldMem
-> nm_PutFieldMem
-> nm_AppendFieldMem
<- nm_AppendFieldMem
<- nm_PutFieldMem
-> nm_ParseORN
<- nm_ParseORN
<- ul_utUnpackUserEnt
<- ul_FindPrimeUser
<- ul_IsEnu


SERIOUS ERROR Local Delivery(Local Delivery) 04.13.06 12:53:35
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0x15ef06]
/opt/scalix/lib/libom_er.so[0x15f1d6]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0x15f37f]
/lib/tls/libpthread.so.0[0xd6f888]
[0x63656c75]

Only this time, I noticed that this happened right after I tried to delete a user via SAC. I couldn't delete the user either. I tried omdelu -n S=Sxxxx/G=Ixxxxx: no luck, the user is still there. I then tried omdelent -e S=Sxxxx/G=Ixxxx, which deleted it from the directory. I tried omdelu -n S=Sxxxx/G=Ixxxxx again: still no luck.

Any ideas please?

ScalixSupport

Postby ScalixSupport » Thu Apr 13, 2006 10:06 am

You might want to look at the man pages for omdelu. The syntax you are using is not correct. It's more like

omdelu -n "user name/mailnode"

Please don't talk to me about consistency. ;-)

Regards,
Don

tom__b_

Postby tom__b_ » Thu Apr 13, 2006 10:47 am

omdelu -n "Ixxxxx Sxxxx/mail,crxxxxxxxx" done.

I do a # omsearch -e S=*sxxxx* -d userlist -t h
UL-PMN=448/S=Sxxxx/G=IxxxxOU1=mail/OU2=cxxxxx/CN=IxxxSxxxxr/OM-UID=103/INTERNET-ADDR="Ixxxxx Sxxxx" <is@cxxxxxxx>/UL-AUTHID=is@cxxxxxxxxUL-UXID=55001/UL-IL=AMERICAN/UL-CAPS=6/UL-FLAGS=0/UL-CLASS=Limited/\
UL-TYPE=PrimeRecip/UL-PWD=$1$.WCmR9SI$retf3ssU4oIdnAmeNfGPM./UL-SASL-PWD=$1$UI2eIyx1ZCA\=/UL-PWDCHDT=1143283528/UL-BADPWD=0

I think that if I manage to remove this user, my local delivery service will start again. This reminds me: it's exactly what happened the last time. I was trying to delete a user and couldn't. After that I never got the local delivery service to start again.

ScalixSupport

Postby ScalixSupport » Thu Apr 13, 2006 10:53 am

What? I thought the omdelent was successful. Can you try it again? Are you saying local delivery isn't running/delivering mail?

Thanks,
Don

