This is for Scalix 10.0.1 running on RHEL3. It's been up and running for over 2 years.
I don't think this is related to the time change anymore, but you never know. Anyhow.
I've narrowed it down to ONE thing. There is an error somewhere in a config file that I can't find. It causes the Tomcat server to start a multitude of useless threads. They all seem to be missing information, which may be what's causing the system to start so many (attempting to get one that actually works?). I know this because the thread line on my broken server looks like this, and there are over 30 of those with no users logged in:
root 29244 0.0 12.8 1074644 131672 pts/0 S 08:22 0:00 /usr/java/jre1.5.0_04/bin/java -server -Djava.net.preferIPv4Stack=true -Xms512m -Xmx768m -Djava.endorsed.dirs=/opt/scalix-tomcat/common/endorsed -classpath /usr/java/jre1.5.0_04/lib/tools.jar:/opt/scalix-tomcat/bin/bootstrap.jar:/opt/scalix-tomcat/bin/commo
And the server that works shows this line for the Tomcat thread, and there's only ONE of them with no users logged in:
root 3839 0.0 19.8 746220 205508 ? Sl Mar08 0:04 /usr/java/jre1.5.0_04/bin/java -server -Djava.net.preferIPv4Stack=true -Xms512m -Xmx512m -Djava.endorsed.dirs=/opt/scalix-tomcat/common/endorsed -classpath /usr/java/jre1.5.0_04/lib/tools.jar:/opt/scalix-tomcat/bin/bootstrap.jar:/opt/scalix-tomcat/bin/commons-logging-api.jar -Dcatalina.base=/opt/scalix-tomcat -Dcatalina.home=/opt/scalix-tomcat -Djava.io.tmpdir=/opt/scalix-tomcat/temp org.apache.catalina.startup.Bootstrap start
As you can see, it seems pretty obvious that the line gets cut off. Maybe it's an EL3 vs. EL4 difference, but I can't see why it would be. So I'd REALLY like it if someone, anyone, could tell me which config file(s) this line gets generated from, so that I can go into them and check that all the needed info is there.
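To rule out ps simply clipping the line on display (the broken listing was taken on pts/0, the working one was not), the full command line can be read from /proc/<pid>/cmdline, where the kernel stores the arguments separated by NUL bytes. Here is a rough sketch of that check, assuming a Python 3 interpreter is available (a stock RHEL3 box ships something much older). The helper names are made up for illustration, and 29244 is just the PID from my broken listing.

```python
def split_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into a list of arguments.

    Empty pieces (such as the one after the trailing NUL) are dropped.
    """
    return [part.decode("utf-8", "replace") for part in raw.split(b"\0") if part]


def read_cmdline(pid):
    """Return the full, untruncated argument list of a running process (Linux only)."""
    with open("/proc/%d/cmdline" % pid, "rb") as handle:
        return split_cmdline(handle.read())


def args_only_in(first, second):
    """Return the arguments that appear in `first` but not in `second`, in order."""
    seen = set(second)
    return [arg for arg in first if arg not in seen]


# Example use on the broken server (PID taken from the ps listing):
#   print(" ".join(read_cmdline(29244)))
# Example comparison of two argument lists copied from each server:
#   print(args_only_in(broken_args, working_args))
```

If the output for the broken server ends with org.apache.catalina.startup.Bootstrap start just like the working one, then the line is only being cut off on screen and the real difference is elsewhere, for example in the -Xmx value.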
My admin console has been down for three days now, and my Webmail crashes at least 6 times a day because all those useless threads cause Tomcat to hit a memory error: the threads take up all its memory. The solution is NOT to add more memory; that only delays the problem, because no matter how much I add, it eventually fills up (I tried).
My users are getting pissed.
So is my boss.
Thanks.