Any consultants? Ideas for performance issues?

BayFedCU · Postby **BayFedCU** » Wed Jun 13, 2007 10:54 pm

Hi Guys!

Since converting from Exchange 5.5 to Scalix we've been suffering from very poor performance.

At this point we're desperate. We need to get this under control and get our performance up.

The following problems are happening:

1. Poor performance on large mailboxes via webmail.

We've done a few things to alleviate this, but it's still a problem. So far we've found that the number of requests Apache could serve was being throttled by a small-site configuration. I've since increased the number of available processes and tweaked other Apache settings. This has improved things somewhat, but not enough.

2. Poor performance for Outlook users with large mailboxes.

All of our clients run a mixture of Office XP, 2002, and 2003. All users with large calendars and inboxes are noticing a slow-down.

3. Erratic Outlook behavior.

Some of our users get errors when doing things like pressing Ctrl-Z after deleting a message. Normally this just undoes the last action, but instead an error comes up saying that they don't have permissions.

4. High server loads during Outlook setups.

When a new user is added and Outlook is set up, the server often experiences a load above 4.00.

The server is an HP BL35p blade with 2x dual-core AMD Opteron 250 processors and 8 GB of RAM. The storage is an EVA3000 with "boot from SAN" and separate LUNs for the mailstore and OS. We're running RHEL 4 x86 (not x86_64). We have about 185 users, most using webmail.

Let me know if you have any suggestions. We're not afraid to pay some money if that's what it takes to fix this. Right now we're fighting overwhelmingly negative sentiment towards Scalix from our senior management.

Thanks for your time.

Shredder · Postby **Shredder** » Thu Jun 14, 2007 10:16 am

Can you explain how your SAN is set up (RAID level) and how the LUNs are mounted on the Scalix server?

Just as a note, Scalix does not recommend using NFS to mount the mail store and they do not recommend using RAID 5. (See viewtopic.php?t=3942&highlight=raid)

Shredder

BayFedCU · Postby **BayFedCU** » Thu Jun 14, 2007 11:34 am

Direct FCAL connection to the SAN, so the LUNs are presented to the host directly. The SAN volumes are RAID5.

Shredder · Postby **Shredder** » Thu Jun 14, 2007 12:00 pm

So the issue is probably the RAID 5. As the link says, can you set up a test RAID 1 volume or something similar to see if that helps?

Shredder

BayFedCU · Postby **BayFedCU** » Thu Jun 14, 2007 1:40 pm

Thanks for the suggestion; however, I'm pretty confident that we're not I/O-bound in that respect.

However I did make some more discoveries:

PostgreSQL was not properly tuned out of the box. I had to go in and adjust the maximum number of connections (max_connections), shared_buffers, effective_cache_size, sort_mem, and random_page_cost. These were all pretty close to defaults and heavily tuned towards low-memory systems. Adjusting these (and raising kernel.shmmax in sysctl.conf) greatly improved performance.

There is a wonderful tuning guide for PostgreSQL at http://www.varlena.com/GeneralBits/Tidbits/perf.html. It's a little dated but still very applicable. Before you do *any* tuning to PG you should read their docs thoroughly and be knowledgeable about what you're doing. :-)

So far, since tuning this, webmail is VERY responsive (even on large mailboxes) and Outlook's little quirks seem to have subsided. Most of our waits seemed to center around waiting for the database.

KevinAnderson · Postby **KevinAnderson** » Wed Jun 20, 2007 2:59 pm

I'd really be interested to know what changes you made. Could you post the config here so we could all see, and benefit from your findings?

Also, did you notice any performance benefit from moving to 11.1? (Or have you moved yet?)

Thanks
Kev.

Derek · Postby **Derek** » Mon Jul 09, 2007 1:53 pm

Pushing this one up for some attention.

kjakkanen · Postby **kjakkanen** » Wed Jul 11, 2007 6:27 am

Reporting in as another person very interested to see the changes that helped with the webmail performance. BayFedCU, would it be possible to post the exact changes made for the benefit of the forum readers? Thanks in advance!

I'm also sad if RAID-5 is the bottleneck in Scalix performance. RAID 1+0 is surely good, but with a storage efficiency of only 50% it is very expensive, and you basically need another disk chassis (e.g. for the HP ProLiant DL series) to get enough disks for a large site installation.

Is there anyone who's running on RAID-5 (with fast 15K RPM disks for example) and is HAPPY with the performance?

We're moving our installation to a brand new HP ProLiant DL380 G5 server with dual quad-core Xeons, 6 GB RAM and a fully populated set of 8x72 GB SAS disks (15,000 rpm) because of a hardware fault diagnosed by Scalix Support in our old server. During the purchase I had to decide whether to go with 10K 146 GB disks (and run RAID 1+0) or use 15K 72 GB disks on RAID-5 (15K 146 GB disks weren't available yet). Somehow I trust the RPM makes a bigger difference; hopefully I'm not that far off...

-Kimmo
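The capacity trade-off described in the post above can be worked out with a quick sketch. It assumes eight drive bays in both configurations (the post only states eight for the 72 GB option) and ignores hot spares and formatting overhead. The function name is purely illustrative.

```python
def usable_capacity_gb(disks: int, disk_gb: int, level: str) -> int:
    """Usable capacity of one array, ignoring hot spares and formatting overhead."""
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space is consumed by parity.
        return (disks - 1) * disk_gb
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
        # Every disk is mirrored, so half the raw space is usable.
        return (disks // 2) * disk_gb
    raise ValueError(f"unsupported level: {level}")


# Assumption: 8 bays in every case.
raid5_15k = usable_capacity_gb(8, 72, "raid5")      # 504 GB
raid10_15k = usable_capacity_gb(8, 72, "raid10")    # 288 GB
raid10_10k = usable_capacity_gb(8, 146, "raid10")   # 584 GB

print(f"8x72 GB 15K, RAID 5:    {raid5_15k} GB")
print(f"8x72 GB 15K, RAID 1+0:  {raid10_15k} GB")
print(f"8x146 GB 10K, RAID 1+0: {raid10_10k} GB")
```

Under these assumptions the larger 10K disks in RAID 1+0 give slightly more usable space than the 15K disks in RAID-5, so the choice is mostly about spindle speed versus write behavior rather than capacity.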

BayFedCU · Postby **BayFedCU** » Wed Jul 11, 2007 4:25 pm

Sorry guys, I realized that I had posted this and forgotten to respond about what exactly was done. First, these settings MAY NOT work for you. They are heavily dependent on the amount of RAM available to you as well as the hardware you're running on. I make no guarantees about this, YMMV, etc.

1. Increased kernel.shmmax (/etc/sysctl.conf) to 383450112, so that PostgreSQL can get more shared memory.

2. In the file '/var/opt/postgresql/data/postgresql.conf':

a. increased max_connections to 350 (default 100?)
b. shared_buffers increased to 32768
c. sort_mem increased to 4096
d. vacuum_mem increased to 4096
e. wal_buffers set to 128
f. effective_cache_size set to 9600
g. random_page_cost set to 2

3. Set up a cron job to actively vacuum and analyze the tables on a weekly basis.
crontab:

0 0 * * * vacuumdb -a -f -z -q -h{serverip} -p5733 > /dev/null 2>&1
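A note on reading that schedule: a standard crontab line has five time fields (minute, hour, day of month, month, day of week) followed by the command. As posted, the day-of-week field is `*`, so standard cron would fire it every night at midnight. Pinning that field (0 is Sunday) is one way to get a weekly run. The small sketch below just labels the fields; the helper name is made up for illustration and the command is shortened by leaving out the host and port options.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_entry(line: str) -> dict:
    """Label the five schedule fields of a crontab line; the rest is the command."""
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    entry = dict(zip(CRON_FIELDS, parts[:5]))
    entry["command"] = parts[5]
    return entry


# Schedule as posted: '*' in day_of_week means every day.
posted = split_cron_entry("0 0 * * * vacuumdb -a -f -z -q")
# Hypothetical weekly variant: day_of_week pinned to 0 (Sunday).
weekly = split_cron_entry("0 0 * * 0 vacuumdb -a -f -z -q")

for name, entry in (("posted", posted), ("weekly", weekly)):
    print(name, {field: entry[field] for field in CRON_FIELDS})
```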

These things are all covered in the PostgreSQL tuning guide available at http://www.varlena.com/GeneralBits/Tidbits/perf.html. Again, YMMV. Please read and understand what you're changing. In my case, this was tuned for fast disk (stock tuning is for IDE) and big memory (the server has 8 GB of RAM). About the only thing I'd still want to change is to possibly move the PG installation to another server where we already have a large PG installation.

One last thing: you must make sure that your shmmax is commensurate with your PostgreSQL memory parameters. If you don't, your PG installation will not start!
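To make the shmmax point concrete, here is a rough sanity check using the values listed above. It is only a sketch under stated assumptions: the units are those of the older PostgreSQL releases that still use sort_mem (shared_buffers, wal_buffers and effective_cache_size counted in 8 kB pages, sort_mem and vacuum_mem in kB), and the overhead coefficients are a rule of thumb recalled from documentation of that era, not something stated in this thread. Check the docs for your version before relying on them.

```python
PAGE_BYTES = 8192   # assumed default PostgreSQL block size
KB = 1024

# Values from the list above.
settings = {
    "max_connections": 350,
    "shared_buffers": 32768,        # counted in 8 kB buffers (assumed unit)
    "sort_mem": 4096,               # kB, per sort operation (assumed unit)
    "vacuum_mem": 4096,             # kB (assumed unit)
    "wal_buffers": 128,             # counted in 8 kB pages (assumed unit)
    "effective_cache_size": 9600,   # counted in 8 kB pages (assumed unit)
}
shmmax = 383450112                  # kernel.shmmax, in bytes

shared_buffers_bytes = settings["shared_buffers"] * PAGE_BYTES
effective_cache_bytes = settings["effective_cache_size"] * PAGE_BYTES
sort_mem_bytes = settings["sort_mem"] * KB

# ASSUMPTION: rule-of-thumb shared memory estimate of the form
# base + per-buffer + per-connection overhead, in kB.
ASSUMED_BASE_KB = 250
ASSUMED_PER_BUFFER_KB = 8.2
ASSUMED_PER_CONNECTION_KB = 14.2

estimated_segment_bytes = int(
    (ASSUMED_BASE_KB
     + ASSUMED_PER_BUFFER_KB * settings["shared_buffers"]
     + ASSUMED_PER_CONNECTION_KB * settings["max_connections"]) * KB
)
fits = estimated_segment_bytes <= shmmax

mb = 1024 * 1024
print(f"shared_buffers:        {shared_buffers_bytes / mb:.0f} MB")
print(f"effective_cache_size:  {effective_cache_bytes / mb:.0f} MB")
print(f"sort_mem (per sort):   {sort_mem_bytes / mb:.0f} MB")
print(f"estimated segment:     {estimated_segment_bytes / mb:.1f} MB")
print(f"kernel.shmmax:         {shmmax / mb:.1f} MB")
print("fits within shmmax" if fits else "shmmax too small, postmaster would refuse to start")
```

With these numbers the buffer pool alone is 256 MB, so a shmmax of roughly 366 MB leaves some headroom. Raising shared_buffers much further without also raising shmmax is the failure mode the warning above is about.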

-- BFCU

BayFedCU · Postby **BayFedCU** » Wed Jul 11, 2007 4:28 pm

KevinAnderson wrote: I'd really be interested to know what changes you made. Could you post the config here so we could all see, and benefit from your findings?

Also, did you notice any performance benefit from moving to 11.1? (Or have you moved yet?)

Thanks
Kev.

Kevin, we have not gone to 11.1 yet. We were waiting for the pioneers to move first. :-)

However, I'm looking at doing it one of these weekends coming up soon.

BayFedCU · Postby **BayFedCU** » Wed Jul 11, 2007 4:45 pm

Shredder wrote: Can you explain how your SAN is set up (RAID level) and how the LUNs are mounted on the Scalix server?

Just as a note, Scalix does not recommend using NFS to mount the mail store and they do not recommend using RAID 5. (See viewtopic.php?t=3942&highlight=raid)

Shredder

After reading this too, I'm going to try a RAID 1 volume (the EVA doesn't do 1+0).

Scalix Forums
