Heavy IMAP Usage is Killing the Server

dresdn
Posts: 92
Joined: Wed Apr 05, 2006 5:11 pm

Postby dresdn » Fri Jun 09, 2006 11:34 am

tenaciousC wrote:There are 4400 files in my temp which doesn't seem enough to start causing bad performance.


Out of curiosity, how do you have your /var/opt/scalix set up? In a separate partition? Using LVM? Formatted as what?

For mine, I have an LVM partition using EXT3. I'm going to do some testing when I get the time, but I'm wondering if my performance problems are due to Red Hat's LVM configuration/version.

-Mike

tenaciousC
Posts: 89
Joined: Thu Mar 30, 2006 5:41 pm
Location: Manchester, UK.

Postby tenaciousC » Sat Jun 10, 2006 7:44 am

Hi Mike,

Originally LVMs using ext3. (software RAID1)
Now I have /var/opt/scalix/temp on an LVM ext2 partition. This has made no difference to performance!

C

eckman00
Posts: 28
Joined: Fri Jan 06, 2006 7:11 pm

Postby eckman00 » Sat Jun 10, 2006 4:21 pm

We are also seeing the IMAP timeouts. Because of them, searching for mail in Thunderbird rarely works for most users.

At our site, our /var/opt/scalix/temp has 228000+ items, 179000 of which date from before the last time I restarted the services (for additional licenses).

One downside of so many files is that the directory becomes very large:
drwxrwx--x 5 scalix scalix 4636672 Jun 10 13:12 .

so even if I remove a lot of the items, searching through the directory will still be slow unless I create a new directory.
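To see both numbers at once, here is a minimal sketch that reports how many entries a directory holds and how large the directory file itself has grown. It assumes GNU find and GNU stat; `dir_stats` is a hypothetical helper name, not a Scalix tool.

```shell
# dir_stats DIR: print the number of entries directly inside DIR and the
# size in bytes of the directory itself (the number that ls -ld shows).
# Hypothetical helper; assumes GNU find and GNU stat.
dir_stats() {
    dir=$1
    entries=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
    dir_bytes=$(stat -c %s "$dir")
    echo "entries=$entries dir_bytes=$dir_bytes"
}

# Example: dir_stats /var/opt/scalix/temp
```

Running it before and after a cleanup should show the entry count dropping while the directory size stays where it was.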

Right now, the temp directory is 19G in size.

I am planning on adding another disk for the temp area (ext2). Are there some general sizing guidelines? Or is it okay to remove items that haven't been used in a while (a la the find script posted earlier)?

And my final question, what do people think of having a symlink for /var/opt/scalix/temp -> /new/disk/temp? Will the overhead of the symlink be too much?

thanks,
eric

tenaciousC
Posts: 89
Joined: Thu Mar 30, 2006 5:41 pm
Location: Manchester, UK.

Postby tenaciousC » Sun Jun 11, 2006 1:41 am

Hi Eric,

In the past (when Scalix was Openmail) I used a symlink on the /var/opt/openmail/data directory to point to another disk without issue. This was a relatively small installation, though: 30 users or so and a store of 4GB!

Sounds like yours is a bit of a monster! How many simultaneous IMAP users do you have and on what hardware / OS / disk setup are you running?

Cheers!

C

eckman00
Posts: 28
Joined: Fri Jan 06, 2006 7:11 pm

Postby eckman00 » Mon Jun 12, 2006 8:57 pm

hi,

we have about 500 users overall, with about 50% using IMAP on Thunderbird (and SWA for the calendar). The rest usually use Outlook. A quick look at our system shows 210 IMAP processes from 84 users. Right now though, the system is fairly quiet (less than 1 on the load average).

The system has 2 Xeon 3.00GHz CPUs with 4GB of memory. I have about 1TB of disk, and we are using ~200GB for Scalix (171GB for data, 19GB for temp).

If the temp directory is a bottleneck, what options do we have? Is there a list of best practices for this area? Or do we need a change in the temp structure so that items starting with 000 go to one directory, items starting with 001 go to the next, etc.?

I have no idea if it will help, but I am planning to purchase a 73GB 15K rpm disk to be /var/opt/scalix/temp.

Is there any reason to leave these files in place?

thanks,

eric

tenaciousC
Posts: 89
Joined: Thu Mar 30, 2006 5:41 pm
Location: Manchester, UK.

Postby tenaciousC » Tue Jun 13, 2006 3:09 am

Hi Eric,

I would be tempted to try Kev's solution on this to reduce the number of files in your temp directory.

kanderson wrote:
find /var/opt/scalix/temp/ -type f -ctime +7 -exec rm -f '{}' \;



The temp files can certainly be completely removed if Scalix is not running. Trashing the older ones whilst it is running would be slightly riskier, I imagine. I don't think I have seen anyone from Scalix confirm or deny this yet!

Are you experiencing the same issue noted at the beginning of this discussion, namely high iowait times when manipulating large folders (>1000 msgs) via IMAP?

Cheers!

C

kanderson

Postby kanderson » Tue Jun 13, 2006 6:17 pm

I'll just add that my find command there will only remove regular files (so sockets aren't touched), and it only touches things that are more than a week old. I tried to be as conservative as possible.

But a comment from Scalix support as to the wisdom of actually using this would be great...

I will say that *I* use it on production servers with no problems...

Kev.

florian
Scalix
Posts: 3852
Joined: Fri Dec 24, 2004 8:16 am
Location: Frankfurt, Germany

Postby florian » Thu Jun 15, 2006 5:00 am

When deleting stuff from the "temp" directory, you should be careful to leave stuff under temp/mime_cache intact - this is what is used by the IMAP server for rendering Scalix messages into MIME. The cache will be automatically aged by the omscan command as part of monthly message store maintenance.
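One way to combine that advice with the find command quoted earlier in the thread is to prune mime_cache before anything is matched. This is only a sketch: `clean_scalix_temp` is a hypothetical wrapper, it defaults to a dry run, and whether deleting these files on a live server is safe is still an open question here.

```shell
# clean_scalix_temp DIR [AGE] [print|delete]
# Hypothetical wrapper around the find command quoted earlier in the thread.
# AGE is passed straight to find's -ctime test (default +7, i.e. more than
# 7 days); any directory named mime_cache is pruned and never entered.
clean_scalix_temp() {
    temp_dir=$1
    age=${2:-+7}
    action=${3:-print}   # default is a dry run that only lists files

    case $action in
        print)
            find "$temp_dir" -type d -name mime_cache -prune -o \
                -type f -ctime "$age" -print
            ;;
        delete)
            find "$temp_dir" -type d -name mime_cache -prune -o \
                -type f -ctime "$age" -exec rm -f '{}' \;
            ;;
        *)
            echo "usage: clean_scalix_temp DIR [AGE] [print|delete]" >&2
            return 2
            ;;
    esac
}

# Example dry run: clean_scalix_temp /var/opt/scalix/temp
```

Reviewing the dry-run listing first, and only then repeating the call with `+7 delete`, keeps the conservative spirit of the original command.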

Out of curiosity - are the majority of these files found in temp or in temp/mime_cache?

Thx,
Florian.
Florian von Kurnatowski, Die Harder!

jch
Scalix
Posts: 202
Joined: Thu Mar 25, 2004 10:25 am

Postby jch » Thu Jun 15, 2006 6:25 am

Some comments in addition to the good stuff that's already here. I think I may be repeating what others have said as well to a certain extent.


I'm slightly surprised that you have so many files in ~/temp, but they can be cleaned up. A safe way to clean things up is

Code:

omscan -a

This removes files over a week old.

You're not going to get much performance difference between ext2 and ext3. If these files are genuine (that is, not as a result of something being broken) then you can enable directory hashing. I just did this:

Code:

tune2fs -O dir_index /dev/vg/build
umount /dev/vg/build
e2fsck -fD /dev/vg/build
mount /dev/vg/build

(/dev/vg/build is where I do my builds, I thought I might try it to see if it makes builds faster.) You'll notice that you have to take the file system off-line to build the indexes, but it doesn't seem to take long on my 3G build partition.
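To check whether a file system already has the feature before taking it offline, the feature list printed by `tune2fs -l` can be inspected. A small sketch, with `has_dir_index` as a hypothetical helper name; it only parses text on stdin, so it can be tried against saved output too.

```shell
# has_dir_index: read `tune2fs -l DEVICE` output on stdin and succeed if the
# "Filesystem features" line lists dir_index. Hypothetical helper name.
has_dir_index() {
    awk -F: '
        /^Filesystem features/ {
            n = split($2, feats, " ")
            for (i = 1; i <= n; i++)
                if (feats[i] == "dir_index") found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example (needs root):
#   tune2fs -l /dev/vg/build | has_dir_index && echo "dir_index is enabled"
```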

There are other options. You want ~/tmp and ~/temp to be as fast as possible but you don't care what happens to them if the machine crashes (or at least you shouldn't). Create fresh file systems for ~/tmp and ~/temp and mount them with options to make them "fast and dangerous". Good ones are noatime, nodiratime, data=writeback -- see the mount(8) man page and Google for details. Mount these file systems directly as ~/tmp and ~/temp (usually /var/opt/scalix/tmp and /var/opt/scalix/temp) -- please don't mount them somewhere else and use symlinks, it'll only slow things down.

It's best to re-create fast and dangerous partitions like these on reboot. You'll need to set the dir_index option when you create the file system, of course. If you have a ramdisk then you can use that for the journal and that will also speed things up -- it might make sense to use "data=journal" in this case so that writes to the disk are buffered through the journal; I don't know if this would be faster or not.
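Put together, the re-create-on-boot idea might look like the sketch below. It only prints the commands rather than running them. The device and mount point are whatever you pass in, `fast_temp_plan` is a hypothetical name, and the owner and mode are assumptions copied from the ls output posted earlier in this thread, so check them against your own install.

```shell
# fast_temp_plan DEVICE MOUNTPOINT: print (do not run) the commands for a
# throwaway ext3 file system with hashed directories and relaxed mount options.
# The owner and mode are assumptions taken from the ls output earlier in the
# thread (scalix:scalix, drwxrwx--x); verify them before using this.
fast_temp_plan() {
    dev=$1
    mnt=$2
    cat <<EOF
mke2fs -j -O dir_index $dev
mount -o noatime,nodiratime,data=writeback $dev $mnt
chown scalix:scalix $mnt
chmod 771 $mnt
EOF
}

# Example: fast_temp_plan /dev/vg/scalixtemp /var/opt/scalix/temp
# (/dev/vg/scalixtemp is a made-up volume name)
```

Printing the plan first makes it easy to review before pasting it into a boot script, which matters because mke2fs destroys whatever is on the device.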

If you have more memory than is good for you, you can try using tmpfs for these file systems, but bear in mind that, of course, they'll be competing for memory with running processes. I've used tmpfs for /tmp on Solaris and it's blindingly fast and falls back to swap when real memory runs short; I've not tried it at all on Linux (but perhaps I should). It could be that this is far and away the fastest way to get a file system, but, equally, since I/O from the swap partition(s) isn't optimised for a file system it could be rather slow unless you have loads of real memory.

If you're using an array for disks, then configure ~/tmp and ~/temp to be on striped but not mirrored volumes -- you'll get the most throughput that way but, of course, you're vulnerable to losing that valuable throwaway data :-) What we're after here is speed, not safety. There's a theme developing here. A single disk for any significant number of users is going to glow red hot and it'll be a bottleneck.

Dave mentioned UAL_SINGLE_TEMP_DIR and that's especially useful when you've arranged to have fast file systems for tmp and temp. Several people have mentioned ownership and permissions of those directories -- get it right or Scalix won't start.

I'm still slightly surprised that there are so many files in ~/temp though -- I'd love to know what they are.

I'm not a bit surprised about the speed of an IMAP search. You'll be in for a very pleasant surprise in the next release though.

jch

kanderson

Postby kanderson » Thu Jun 15, 2006 11:24 am

Files in temp are almost all *.mim or a file with no extension and a name like 00seucl or 00sa2c1.

The mime_cache subdirectory as well as tmp are close enough to empty that it isn't a concern. Less than 100 files combined.

Quickly checking one server, omscan last ran automatically 3 days ago, for about 2.5 hours, and didn't seem to do anything to clean that directory out. I ran it manually in January of this year. There were files back from a year and a half ago (when the server went live) when I started the crontab job to clean them out a few weeks ago.

Kev.

jch
Scalix
Posts: 202
Joined: Thu Mar 25, 2004 10:25 am

Postby jch » Thu Jun 15, 2006 12:11 pm

Unfortunately, the background omscan doesn't delete old files. We think it's probably a bug ... so you need to periodically run omscan -a explicitly to clean up files. If that doesn't clean them up either, it's a bug and the fix we did ages ago didn't work :-(

jch

jch
Scalix
Posts: 202
Joined: Thu Mar 25, 2004 10:25 am

Postby jch » Thu Jun 15, 2006 4:10 pm

Mark reminded me that the zillions of files in ~/temp are almost certainly from the RTF to HTML conversion. You can quite safely get rid of old ones, and running omscan -a from time to time will keep them under control, with any luck.
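For anyone who wants to schedule that, a minimal wrapper sketch follows. The only thing taken from this thread is the `omscan -a` invocation; the function name and the OMSCAN override variable are hypothetical, and which account should run it from cron is left for you to confirm on your own system.

```shell
# run_omscan_cleanup: run "omscan -a" if the command can be found.
# Hypothetical wrapper; set OMSCAN to a full path if omscan is not on PATH.
run_omscan_cleanup() {
    omscan_bin=${OMSCAN:-omscan}
    if ! command -v "$omscan_bin" >/dev/null 2>&1; then
        echo "omscan not found (looked for: $omscan_bin)" >&2
        return 127
    fi
    "$omscan_bin" -a
}

# Example: call run_omscan_cleanup from a script that cron runs once a week.
```

Failing loudly when omscan is missing means a broken PATH in cron shows up in the job's mail instead of silently skipping the cleanup.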

jch

paintbuoy
Posts: 14
Joined: Tue Oct 10, 2006 9:54 pm

Any progress on this problem?

Postby paintbuoy » Tue Oct 10, 2006 9:59 pm

I was wondering if there was any progress in identifying this problem or a concise set of steps to resolve it.

I am running Scalix 10, and a single OSX Mail client using an IMAP connection can take down the server if the user chooses to synchronise their account after being offline for a long period of time.

Here is the IMAP log:

Code:

ERROR IMAP Server Da(IMAP Server Pr) 10.11.06 14:39:21
[OM 24070] Debug message for Lab use :
imapSatAuthenticate:Could not register with Session Monitor.


Scalix is running on a SUSE 10 server with LVM and 1.5GB RAM.

I've read in this thread that the problem can be relieved but not solved by having the temp directory not located on LVM. Is this definitely the case and are there any other measures that can be taken to resolve this issue?

florian
Scalix
Posts: 3852
Joined: Fri Dec 24, 2004 8:16 am
Location: Frankfurt, Germany

Postby florian » Tue Oct 10, 2006 11:35 pm

We believe we have identified this as a bug in the IMAP server that currently limits the maximum number of connections to a single user account to 17.

Can you verify how many imapd processes are running for this user when this happens?

Cheers,
Florian.
Florian von Kurnatowski, Die Harder!

paintbuoy
Posts: 14
Joined: Tue Oct 10, 2006 9:54 pm

Postby paintbuoy » Wed Oct 11, 2006 12:02 am

florian wrote: We believe we have identified this as a bug in the IMAP server that currently limits the maximum number of connections to a single user account to 17.

Can you verify how many imapd processes are running for this user when this happens?

Thanks for your prompt reply.

What command lists the processes with relation to the client connection?

Code:

ps ax | grep in.imap41d

This shows the running processes but doesn't indicate which clients are being served.
Is this something available through the SAC or should I be looking for a shell command?
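One shell-level approximation is to count established connections to the IMAP port per client address. This is a sketch only: it groups by client IP, not by Scalix user, so it helps when each user sits on their own machine and is misleading behind NAT. `imap_conns_by_client` is a hypothetical helper that parses `netstat -tn` output, and port 143 is assumed.

```shell
# imap_conns_by_client [PORT]: read `netstat -tn` output on stdin and print
# "COUNT CLIENT_ADDRESS" for established connections to the local IMAP port
# (default 143), busiest client first. Hypothetical helper name.
imap_conns_by_client() {
    port=${1:-143}
    awk -v port="$port" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && $4 ~ (":" port "$") {
            client = $5
            sub(/:[0-9]+$/, "", client)   # strip the remote port
            count[client]++
        }
        END { for (c in count) print count[c], c }
    ' | sort -k1,1nr -k2,2
}

# Example: netstat -tn | imap_conns_by_client
```

If one address shows a count at or near the limit mentioned above while the OSX Mail client is synchronising, that would be worth reporting back.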

Looking through OSX Mail it doesn't seem like there is any option to limit the number of connections at the client end.

