Tomcat crashing- unable to access SWA until restarted
Posted: Mon Feb 26, 2007 1:26 pm
by WPSTech
We just upgraded from 10.0.5 to 11.0.1 on SLES 10.
Everything works fine until we get some load on the server. Then Tomcat pegs one CPU at 100%, apparently stuck in some kind of loop, and stops responding. The process does not actually die, but in order to access SWA we have to restart or kill Tomcat. It then works for another 15-20 minutes before hanging again.
Not sure if this is related, but after a restart this appears in the log file:
Feb 26, 2007 12:05:06 PM org.apache.catalina.loader.WebappClassLoader validateJarFile
INFO: validateJarFile(/var/opt/scalix/pm/tomcat/webapps/caa/WEB-INF/lib/j2ee.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class
Posted: Mon Feb 26, 2007 3:47 pm
by kanderson
At the top of the forum there is a FAQ that discusses how to throttle back Tomcat's utilization. It will be at 100% because of the indexer service. Cut that back to 10-30% and you should be fine.
Actually, here's the link...
http://www.scalix.com/wiki/index.php?ti ... ix_11_FAQs
Kev.
Posted: Mon Feb 26, 2007 4:03 pm
by WPSTech
Tried the min and max load settings and they do not seem to work. We have restarted Scalix.
general.cfg
IDX_MAXLOAD=1.5
IDX_MINLOAD=.05
System load is at 2.5 and the indexer is still processing?
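For anyone checking the same thing, here is a minimal sketch that compares a load average against the two thresholds. It assumes general.cfg uses plain KEY=VALUE lines as quoted above and that the thresholds relate to the 1-minute load average; how the indexer actually applies them is not confirmed in this thread. The function names are hypothetical.

```python
import os

def parse_cfg(text):
    """Parse KEY=VALUE lines (assumed general.cfg layout) into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

def load_state(load, min_load, max_load):
    """Classify a load average against the two thresholds."""
    if load > max_load:
        return "above max"
    if load < min_load:
        return "below min"
    return "between"

def current_state(cfg_text):
    """Compare the live 1-minute load average with the configured limits."""
    cfg = parse_cfg(cfg_text)
    one_minute = os.getloadavg()[0]  # Unix only
    return load_state(one_minute,
                      float(cfg["IDX_MINLOAD"]),
                      float(cfg["IDX_MAXLOAD"]))
```

With the values in this post, load_state(2.5, 0.05, 1.5) reports "above max", which is the situation being described: the load exceeds the configured maximum, yet indexing continues.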
Posted: Mon Feb 26, 2007 4:12 pm
by WPSTech
I did start to find records that show sisrc=503.
The wiki says to delete these messages. Does anyone know the command?
I figure:
omdelete -u (user) -m (is the message number the same as the item number below?)
WAIT_FOR_SIS FLAGS 7b0a1aac-412e6615-45e30843-c001 user=2757
dref=001d4634d71eaa56 pdref=0008fe8daf0e0362 flags=0
sis=0x806b530 sisrc=503 attempt=0
omdref 001d4634d71eaa56
USER FOLDER: John Smith / wps/CN=Smith, John
IN TRAY: User 2757 Intray : RecNo : ItemNo
MESSAGE: 2.26 am attendance : 39 : 20858454
Thanks
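A minimal sketch for pulling the affected drefs out of a log instead of reading it by eye. It assumes records look like the one quoted above, with the dref= line appearing before the sisrc= line of the same record; the function name is hypothetical and nothing here is a Scalix tool.

```python
import re

# key=value tokens such as dref=001d... or sisrc=503
PAIR = re.compile(r"(\w+)=(\S+)")

def failed_drefs(log_text, code="503"):
    """Return drefs whose record carries sisrc=<code>, in order of appearance."""
    found = []
    last_dref = None
    for line in log_text.splitlines():
        fields = dict(PAIR.findall(line))
        if "dref" in fields:
            last_dref = fields["dref"]
        if fields.get("sisrc") == code and last_dref is not None:
            found.append(last_dref)
            last_dref = None  # do not reuse a dref for a later record
    return found
```

Each dref returned can then be passed to omdref, as shown above, to find the owning user and message.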
Posted: Tue Feb 27, 2007 11:05 am
by WPSTech
UPDATE:
Tomcat crashes continued after adjusting the MIN/MAX CPU load in general.cfg.
Our current settings are:
Max 0.5
Min 0.2
We ran another sxmkindex and Tomcat still has issues when the server is under load.
(Approx 500 ACTIVE SWA Users)
In the indexer.log we noticed a lot of the following errors:
/var/opt/scalix/pm/s/indexwork/1172545786.0 still has 100 active requests and end not found
It seems that when Tomcat crashes it corrupts the indexwork file (my guess is that this happened when we had to kill -9 the process). This may be what is confusing Tomcat.
This article suggests deleting the indexwork file and running sxmkindex. We are not sure we want to reindex, so we will first try deleting the older indexwork files and see if that works.
viewtopic.php?t=5727&highlight=indexwork
----
WPSTech
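To see which indexwork files keep being reported, the log lines can be summarised with a short script. This is a sketch only: it assumes the message wording quoted above ("still has N active requests and end not found") and the function name is hypothetical.

```python
import re

STUCK = re.compile(
    r"(\S+/indexwork/\S+) still has (\d+) active requests and end not found"
)

def stuck_workfiles(log_text):
    """Map each indexwork path to (times reported, last active-request count)."""
    summary = {}
    for match in STUCK.finditer(log_text):
        path = match.group(1)
        active = int(match.group(2))
        seen = summary.get(path, (0, 0))[0]
        summary[path] = (seen + 1, active)
    return summary
```

A file that is reported many times with an unchanging request count is a reasonable candidate for the stale work file discussed in the linked topic.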
UPDATE!
Posted: Tue Feb 27, 2007 12:14 pm
by WPSTech
FYI: Tomcat is still crashing despite the process above.
I am starting to think the indexer has nothing to do with Tomcat/Java crashing.
Anyone have any other ideas?
We just upgraded to SLES 10. When we were on RH3 we experienced Tomcat issues from time to time as well, but not every 10 minutes!
Posted: Sat Mar 03, 2007 5:51 am
by Katagia
I have the same problem here. Tomcat is crashing.
I use Opensuse 10.1 with Scalix 11.0.1
I don't see any errors in the log files themselves.
This is the error log:
http://rafb.net/p/nbYbEy82.html
Webmail hadn't been accessed for hours when it crashed.
Posted: Sat Mar 03, 2007 11:49 am
by kanderson
11.0.2 addresses some container issues encountered during an upgrade (Bugzilla issue 14734). Not sure if that's what you're seeing or not, but I'd advise an upgrade to 11.0.2 in any case as there are some compelling fixes included.
Thanks.
Kev.
Posted: Tue Mar 13, 2007 10:13 pm
by tpohl
I'm convinced that the Java portion of the SIS indexer has serious issues. In my situation, I've imported about 45GB of mail (600,000-plus messages) from Lotus Notes into my Scalix install.
I set up the indexer to throttle back, but when the Java portion of the indexer gets to messages with dodgy attachments (probably corrupt, or with character set issues or something), the non-Java side of the indexer receives 503 errors from the Tomcat side. I then look at the ~/sxx/s/tmp/indexer.log generated by following the instructions on
http://www.scalix.com/wiki/index.php?ti ... ix_11_FAQs
find the message with omdref, get the user to delete the message, and restart Tomcat, only to get stuck on another message a little while later.
Most of the messages are really old (like 2003) and, because of the mail import, filed away in folders other than the inbox (i.e. omdelete isn't an option for deleting the message from the command line). I've had to turn off the recovery folder system-wide, because when my users delete the offending message and empty their trash, the indexer still wants to index the file even though it's in the recovery folder. At any rate, once omdref can no longer find the message because it's been disposed of, restarting Tomcat allows the indexer to continue.
First off, this is a really annoying, convoluted procedure!
Question: Can you delete a message from the command line when it's filed away in a folder?
For example:
omdref 00159e9d1cfbc9c1
USER FOLDER: Joe User / scalix01/CN=Joe User
FILING AREA: User 142 Filing Tray : RecNo : ItemNo
FOLDER: Folder12345 : 56 : 2161604
MESSAGE: Re: some message : 18585 : 2194494
If this message were in the inbox, I understand that I could use omdelete, but what do I do since it isn't? This folder has a CRACK TON of messages in it and webmail can't even open it. It can only be opened in Outlook after much waiting. From browsing several threads, I believe that a LOT of folks who are experiencing high Tomcat loads have this same issue, because a single crapped-up message can peg all 4 of my CPU cores and run my load up to about 65 if I'm not watching it.
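When this has to be repeated for many messages, the RecNo and ItemNo values can be pulled out of the omdref listing with a few lines of script. The sketch below assumes only the "LABEL: name : RecNo : ItemNo" layout shown in the example above; it does not delete anything, and which command accepts these numbers is not something it settles. The function name is hypothetical.

```python
def parse_omdref(output):
    """Extract (label, name, rec_no, item_no) rows from omdref-style output."""
    rows = []
    for line in output.splitlines():
        label, sep, rest = line.partition(": ")
        # Split from the right so colons inside a subject are preserved.
        parts = rest.rsplit(" : ", 2)
        if not sep or len(parts) != 3:
            continue
        name, rec_no, item_no = (p.strip() for p in parts)
        # Skip the header row, whose last two columns are literal names.
        if rec_no.isdigit() and item_no.isdigit():
            rows.append((label.strip(), name, int(rec_no), int(item_no)))
    return rows
```

For the listing above this yields one row for the folder (56, 2161604) and one for the message (18585, 2194494).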
Posted: Wed Mar 14, 2007 12:59 pm
by KevinAnderson
You can use omcontain to delete messages directly.
I'm surprised that you're seeing that many errors. 45 gigs of mail really isn't a lot; I have a number of customers with drastically more, and the indexer had no issues for them. I suppose it's possibly due to the type of migration you did, but even there, I'd be surprised that you'd be seeing a significant number of screwed up messages.
Kev.
Posted: Wed Mar 14, 2007 1:43 pm
by tpohl
KevinAnderson wrote:You can use omcontain to delete messages directly.
I'm surprised that you're seeing that many errors. 45 gigs of mail really isn't a lot; I have a number of customers with drastically more, and the indexer had no issues for them. I suppose it's possibly due to the type of migration you did, but even there, I'd be surprised that you'd be seeing a significant number of screwed up messages.
Kev.
All of the screwed up messages are dated in the range of April 2003 and seem to be the same group of messages across a multitude of users. I've been chalking it up to the messages being corrupt in the old system before the move, but I'd still like to think that scalix could do more to not totally make my server choke on a single bad message!
Any thoughts about deleting a specific dref from the command line (when the message isn't in the inbox)?
-Tom
Posted: Tue Jan 08, 2008 12:04 pm
by slorente
Does anyone have an answer for Tomcat crashing when the server reaches 500 SWA users? I'm having 2 or 3 crashes on my Scalix server, and the java process always takes 95%-220% CPU (yes, 220%).
The solution always was the same:
Restart scalix services in MAIL server
Restart Tomcat-Scalix service in MAIL server
Restart Tomcat-Scalix service in RELAY server
Can you give a solution or a way to continue investigating the problem?
Thanks!
Re: Tomcat crashing- unable to access SWA until restarted
Posted: Mon Jun 01, 2009 10:54 am
by stephan.klein
Hi,
To bring this topic back: for the past few days I have had a similar problem. Java eats up all 4 CPU cores.
I tried to enable logging as mentioned in the wiki link above, but I get different output:
[2009-06-01 16:37:56] Logging Started
[2009-06-01 16:38:24] Awake liveRequests=0
[2009-06-01 16:38:24] /var/opt/scalix/m7/s/indexwork/1243866492.0 still has 0 active requests and end not found
[2009-06-01 16:39:27] Awake liveRequests=0
[2009-06-01 16:39:27] /var/opt/scalix/m7/s/indexwork/1243866492.0 still has 0 active requests and end not found
[2009-06-01 16:40:29] Awake liveRequests=0
[2009-06-01 16:40:29] /var/opt/scalix/m7/s/indexwork/1243866492.0 still has 0 active requests and end not found
[2009-06-01 16:41:32] Awake liveRequests=0
[2009-06-01 16:41:32] /var/opt/scalix/m7/s/indexwork/1243866492.0 still has 0 active requests and end not found
[2009-06-01 16:42:34] Awake liveRequests=0
[2009-06-01 16:42:34] /var/opt/scalix/m7/s/indexwork/1243866492.0 still has 0 active requests and end not found
I'm not sure whether this is really an indexer problem.
After restarting scalix-tomcat, everything is fine for about five minutes to half an hour, then the load increases again. At the moment, I am using cpulimit to keep the system usable.
Any idea how to debug this?
Thank you & regards
stephan
PS - nothing special in all the other logs
PPS - Scalix 11.4.3 on debian etch
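One general way to debug a spinning JVM, independent of Scalix: on Linux, top -H -p <pid> lists per-thread CPU usage with decimal thread ids, and a thread dump (from jstack <pid> or kill -3 <pid>) labels each thread with the same id in hex as nid=0x.... Matching the two shows which Java thread is burning CPU. The sketch below assumes a Sun/HotSpot-style dump format; the function names are hypothetical.

```python
import re

def nid_for_tid(tid):
    """Format a decimal thread id the way HotSpot dumps print it."""
    return "nid=" + hex(tid)

def thread_header(dump_text, tid):
    """Return the dump header line for the thread with this id, or None."""
    needle = re.escape(nid_for_tid(tid))
    # Require the token to end here so 0x1a2 does not match 0x1a22.
    pattern = re.compile(needle + r"(?![0-9a-fA-F])")
    for line in dump_text.splitlines():
        if pattern.search(line):
            return line
    return None
```

Taking two or three dumps a few seconds apart and checking whether the same thread stays at the same stack frame helps distinguish a real loop from ordinary load.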
Re: Tomcat crashing- unable to access SWA until restarted
Posted: Tue Jun 09, 2009 2:49 pm
by LeslieW
Do you have any errors in your tomcat log files?
/var/opt/scalix/??/tomcat/logs/*
Re: Tomcat crashing- unable to access SWA until restarted
Posted: Tue Jun 09, 2009 2:51 pm
by stephan.klein
Sorry, none... the same for the other tomcat logs. Everything seems to work fine.
regards
stephan