Heavy IMAP Usage is Killing the Server


dresdn
Posts: 92
Joined: Wed Apr 05, 2006 5:11 pm

Heavy IMAP Usage is Killing the Server

Postby dresdn » Tue Jun 06, 2006 1:45 pm

I've been running Scalix for just over a week, and I've noticed a pattern that concerns me greatly. Basically, when a user accesses a folder with, say, 3,000+ messages, moves hundreds of messages around, or does any other really heavy IMAP work, the server can't handle it.

Once the operation gets to a certain point, the server's load just keeps climbing; I've stepped in and stopped it once it got to 47+.

What I do then is try an omshut, which hangs. Then I have to killall the in.imap41d and mime.browse processes, run omshut again, and after a while it finally shuts everything down.

I give it a few minutes, then run omrc, and it eventually starts up again.
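In shell terms, the recovery I keep having to do boils down to roughly this (a sketch; process names as they appear in ps on my box):

Code: Select all

omshut                 # hangs at this point
killall in.imap41d     # kill the stuck IMAP daemons
killall mime.browse    # and the mime.browse processes
omshut                 # this time it completes
# wait a few minutes, then bring everything back up
omrc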

Today, a user (Rich Wallace) accessed a folder with 3,000+ e-mails, and five minutes later the server went down hard. I had to kill pretty much every process that I couldn't omoff. A few minutes later (and after having everyone close down their e-mail), I started it back up and it worked.

What is interesting is the log it generated. Below is the output of showlog -f from today.

My biggest concern is the "Maximum IMAP connection rate exceeded" warning, along with the last two errors, which pertain to the user who "crashed" the server.

Any ideas on how I can tune my setup to prevent these errors from happening?

Thanks,
Mike

Code: Select all

SERIOUS ERROR                  IMAP Server Da(IMAP Server Pr) 06.06.06 09:03:07
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 4747
Procedure trace follows:
  -> imap_getTweaks
  -> imap_parseCapabilities
  -> imaplex_setUnsyncLiterals
  <- imaplex_setUnsyncLiterals
  -> imaplex_setLogging
  <- imaplex_setLogging
  <- imap_getTweaks
  -> imaplex_nextRequest
  <- imaplex_nextRequest
  -> imaplex_nextRequest
  <- imaplex_nextRequest
  -> imaplex_nextRequest
  <- imaplex_nextRequest
  -> usr_PidSignoff
  <- usr_PidSignoff
  -> imapFreeSession


SERIOUS ERROR                  IMAP Server Da(IMAP Server Pr) 06.06.06 09:03:08
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0x586f06]
/opt/scalix/lib/libom_er.so[0x5871d6]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0x58737f]
/lib/tls/libpthread.so.0[0x4bd888]
/usr/lib/libsasl2.so.2(sasl_dispose+0x54)[0x20fe43]
in.imap41d[0x805f782]
in.imap41d[0x8060022]
in.imap41d[0x8060c06]
/lib/tls/libc.so.6(__libc_start_main+0xd3)[0x6eee23]
in.imap41d[0x804d9c1]


WARNING                        Background Sea(Background Sea) 06.06.06 09:22:44
[OM 24142] Conversion from ISO8859_1 to OMKOREAN not found.


WARNING                        Remote Client (HTML Access   ) 06.06.06 09:24:24
[OM 8013] The name/mailnode contains invalid characters


        -> htuICE_SetProcessLang
        -> cvc_IsMappingAvail2
        -> uni_GetCharset
        -> cvc_IsMappingAvail3
        -> cvc_AttemptSMem
        <- cvc_AttemptSMem
        <- cvc_IsMappingAvail3
        <- uni_GetCharset
        <- cvc_IsMappingAvail2
        <- htuICE_SetProcessLang
        -> htuICE_ParseORNToTF
        -> htuICE_GetProfile
        -> htuICE_GetConfig
        <- htuICE_GetConfig
        <- htuICE_GetProfile
        <- /build/10.0.1.3/src/lib/ombase/cl/cl_parse.c:387[100,8013]


WARNING                        Remote Client (HTML Access   ) 06.06.06 09:24:56
[OM 8009] The initials are too long


        -> tf_PutINT16
        <- tf_PutINT16
        -> tf_PutINT32
        <- tf_PutINT32
        -> tf_PutINT16
        <- tf_PutINT16
        -> tf_PutINT32
        <- tf_PutINT32
        -> tf_PutINT32
        <- tf_PutINT32
        -> htuICE_ParseORNToTF
        -> htuICE_GetProfile
        -> htuICE_GetConfig
        <- htuICE_GetConfig
        <- htuICE_GetProfile
        <- /build/10.0.1.3/src/lib/ombase/cl/cl_parse.c:344[100,8009]


WARNING                        PC Monitor    (Socket Monitor) 06.06.06 09:51:07
[OM 9797] Cannot read initial request: Connection reset by peer


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:03:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:03:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:03:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:03:55
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:04:26
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:05:21
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:05:23
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:05:23
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:06:21
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:07:22
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:53
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


WARNING                        IMAP Server Da(IMAP Server Pr) 06.06.06 10:16:54
[OM.DMON 2108] Maximum IMAP connection rate exceeded, sleeping.


ERROR                          Service Router(Service Router) 06.06.06 10:23:54
[OM 5181] Reply timed out or invalid - Mapper protocol problem.
Command sent: SCAN:/var/opt/scalix/data/0000034/001ob3r
Reply received:


ERROR                          Service Router(Service Router) 06.06.06 10:24:42
[OM 5181] Reply timed out or invalid - Mapper protocol problem.
Command sent: QUIT Please Close This Session
Reply received:


ERROR                          Service Router(Service Router) 06.06.06 10:24:47
[OM 5183] A Mapper error has been detected.
        -> tf_GetINT32
        <- tf_GetINT32
        -> tf_GetINT16
        <- tf_GetINT16
        -> vs_CheckAndScanFile
        -> vs_ScanActive
        <- vs_ScanActive
        -> tf_GetINT32
        <- tf_GetINT32
        -> tf_GetINT32
        <- tf_GetINT32
        -> vs_omCheckAndScan
        -> vs_omScanFile
        -> vs_GenericScanFile
        <- /build/10.0.1.3/src/lib/rsl/rsl_match.c:397[100,5183]
        <- /build/10.0.1.3/src/lib/rsl/rsl_match.c:397[100,5183]


ERROR                          Service Router(Service Router) 06.06.06 10:25:15
[OM 5183] A Mapper error has been detected.
        <- tf_GetINT32
        -> tf_GetINT16
        <- tf_GetINT16
        -> vs_CheckAndScanFile
        -> vs_ScanActive
        <- vs_ScanActive
        -> tf_GetINT32
        <- tf_GetINT32
        -> tf_GetINT32
        <- tf_GetINT32
        -> vs_omCheckAndScan
        -> vs_omScanFile
        -> vs_GenericScanFile
        <- /build/10.0.1.3/src/lib/rsl/rsl_match.c:397[100,5183]
        <- /build/10.0.1.3/src/lib/rsl/rsl_match.c:1558[100,5183]
        <- /build/10.0.1.3/src/bin/sr/sr_main.c:3988[100,5183]


ERROR                          IMAP Server Da(IMAP Server Pr) 06.06.06 10:26:53
[OM 24070] Debug message for Lab use :
imapSatAuthenticate:Could not register with Session Monitor.
User Name: Richard Wallace / mail, contentconnections/CN=Richard Wallace


ERROR                          IMAP Server Da(IMAP Server Pr) 06.06.06 10:26:53
[OM 24070] Debug message for Lab use :
imapSatAuthenticate:Could not register with Session Monitor.
User Name: Richard Wallace / mail, contentconnections/CN=Richard Wallace

ScalixSupport
Scalix
Posts: 5503
Joined: Thu Mar 25, 2004 8:15 pm

Postby ScalixSupport » Tue Jun 06, 2006 2:25 pm

What client is this user using?

The connection rate warnings indicate that the number of connections per second to the IMAP server has exceeded the limit set in general.cfg by IMAP_CONNRATE_LIMIT, which is 10 by default.
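If you do decide to raise it, it's a one-line entry in general.cfg, along these lines (the path below is where installs typically keep it; double-check yours):

Code: Select all

# assumption: general.cfg lives under /var/opt/scalix/sys on this install
echo "IMAP_CONNRATE_LIMIT=20" >> /var/opt/scalix/sys/general.cfg
# restart the IMAP service afterwards so the new limit is picked up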

Cheers

Dave

dresdn
Posts: 92
Joined: Wed Apr 05, 2006 5:11 pm

Postby dresdn » Tue Jun 06, 2006 2:30 pm

ScalixSupport wrote:What client is this user using?



Most of the clients are Thunderbird 1.5.x. We have 2 people using Outlook with the MAPI connector.

The default for Thunderbird is to cache 5 IMAP connections, so I'm not sure how it's hitting that limit of 10. I set my own client to 1 a long time ago; I checked that user's, and it's still at the default of 5.

Thanks.

-Mike

tenaciousC
Posts: 89
Joined: Thu Mar 30, 2006 5:41 pm
Location: Manchester, UK.

IMAP usage increases %iowait

Postby tenaciousC » Tue Jun 06, 2006 5:50 pm

This is what I am noticing as well (see my post from earlier today regarding disk I/O).

Manipulating folders with large numbers of messages in them causes the server's iowait (the percentage of time the CPU or CPUs were idle while the system had an outstanding disk I/O request) to rise. This slows the whole server down. It only takes one person deleting or moving a large folder and the trouble starts.

High iowait times are indicative of a disk bottleneck, i.e. the CPU is waiting for the disk.
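You can watch it happening live with the standard sysstat tools, for example:

Code: Select all

sar -u 2 10     # CPU utilisation including %iowait: 10 samples, 2 seconds apart
iostat -x 2     # extended per-device stats, to see which disk is saturated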

Is this the downside of having a filesystem based message store???

C

dresdn
Posts: 92
Joined: Wed Apr 05, 2006 5:11 pm

Re: IMAP usage increases %iowait

Postby dresdn » Wed Jun 07, 2006 12:56 pm

tenaciousC wrote:High iowait times are indicative of a disk bottleneck, i.e. the CPU is waiting for the disk.


Tenacious, you hit the nail right on the head. I did some monitoring of the server itself (not of Scalix) and found that my I/O wait times get really high when I access or work with a large IMAP folder. I was too busy looking at Scalix rather than at the OS.

What OS are you running, by the way? I'm using RHEL4, and I've seen other people complain about high I/O waits with RHEL3/4.

tenaciousC wrote:Is this the downside of having a filesystem based message store???


As opposed to a proprietary database that no one but Scalix has the API to? It's a "pick your poison" decision I think ...

-Mike

tenaciousC
Posts: 89
Joined: Thu Mar 30, 2006 5:41 pm
Location: Manchester, UK.

%Iowait

Postby tenaciousC » Wed Jun 07, 2006 1:05 pm

Hi Mike,

I'm on FC4 here, with two big SATA disks in a software RAID1 mirror.

This is my sar for a typical day...

Code: Select all

[root@scalix2 ~]# sar
Linux 2.6.11-1.1369_FC4smp (scalix2.malmaison.com)      06/07/2006

12:00:01 AM       CPU     %user     %nice   %system   %iowait     %idle
12:10:01 AM       all      1.75      0.02      1.00      0.54     96.68
12:20:01 AM       all      1.29      0.02      0.80      0.59     97.30
12:30:01 AM       all      1.09      0.01      0.70      0.31     97.89
12:40:01 AM       all      1.18      0.02      0.68      0.29     97.83
12:50:01 AM       all      1.54      0.02      0.90      0.26     97.28
01:00:01 AM       all      1.69      0.01      0.80      0.10     97.39
01:10:02 AM       all      2.03      0.03      1.12      1.37     95.45
01:20:02 AM       all      1.29      0.01      0.76      0.19     97.75
01:30:02 AM       all      1.55      0.02      0.98      0.54     96.90
01:40:02 AM       all      1.24      0.01      0.74      0.11     97.91
01:50:02 AM       all      1.66      0.03      0.98      0.98     96.35
02:00:02 AM       all      1.27      0.02      0.83      0.17     97.71
02:10:02 AM       all      1.35      0.01      0.72      0.16     97.75
02:20:01 AM       all      1.22      0.02      0.76      0.14     97.87
02:30:01 AM       all      1.50      0.03      0.95      0.48     97.03
02:40:01 AM       all      1.74      0.02      0.91      0.39     96.94
02:50:01 AM       all      1.18      0.01      0.71      0.14     97.96
03:00:01 AM       all      1.33      0.02      0.77      0.23     97.66
03:10:02 AM       all      3.68      0.09      9.74     34.39     52.10
03:20:02 AM       all      3.20      0.06      7.82     28.46     60.46
03:30:02 AM       all      1.59      0.03      2.26      5.97     90.15
03:40:01 AM       all      1.11      0.01      0.75      0.92     97.22

03:40:01 AM       CPU     %user     %nice   %system   %iowait     %idle
03:50:01 AM       all      0.98      0.05      0.91      7.40     90.66
04:00:01 AM       all      1.12      0.02      0.72      0.58     97.57
04:10:03 AM       all      7.61      4.21      6.50     54.32     27.36
04:20:03 AM       all      1.54     17.69      5.95     33.81     41.01
04:30:02 AM       all      1.30      4.11      4.99     46.65     42.94
04:40:03 AM       all      1.82      1.28      5.02     52.65     39.23
04:50:02 AM       all      1.10      1.30      3.81     51.42     42.38
05:00:04 AM       all      1.60      1.07      4.48     54.27     38.58
05:10:03 AM       all      1.71      0.78      4.14     57.31     36.06
05:20:03 AM       all      1.16      2.28      5.80     52.19     38.56
05:30:02 AM       all      1.49      0.19      1.86     13.83     82.64
05:40:01 AM       all      1.37      0.02      0.79      2.01     95.80
05:50:01 AM       all      1.30      0.02      0.79      0.64     97.25
06:00:02 AM       all      0.92      0.01      0.65      0.07     98.36
06:10:01 AM       all      1.49      0.02      0.87      0.82     96.80
06:20:02 AM       all      1.15      0.02      0.75      0.33     97.75
06:30:02 AM       all      2.14      0.03      1.03      1.13     95.67
06:40:02 AM       all      1.60      0.02      0.95      0.61     96.82
06:50:02 AM       all      1.95      0.03      0.96      0.83     96.24
07:00:02 AM       all      1.76      0.03      1.03      0.79     96.40
07:10:01 AM       all      3.84      0.05      2.34      1.02     92.76
07:20:01 AM       all      1.68      0.03      1.04      0.78     96.47

07:20:01 AM       CPU     %user     %nice   %system   %iowait     %idle
07:30:01 AM       all      1.93      0.03      1.02      0.47     96.55
07:40:01 AM       all      2.13      0.02      1.10      0.69     96.06
07:50:02 AM       all      1.90      0.02      1.27      4.93     91.88
08:00:02 AM       all      2.96      0.02      1.73      7.51     87.77
08:10:02 AM       all      2.48      0.02      1.56      3.25     92.69
08:20:02 AM       all      2.53      0.03      1.43      2.93     93.08
08:30:02 AM       all      2.08      0.03      1.15      1.77     94.97
08:40:01 AM       all      2.24      0.05      1.21      2.32     94.18
08:50:03 AM       all      4.20      0.04      2.12      6.58     87.05
09:00:02 AM       all      3.52      0.04      2.15      8.17     86.12
09:10:02 AM       all      3.16      0.04      1.72      3.80     91.28
09:20:02 AM       all      3.02      0.04      1.52      3.86     91.56
09:30:02 AM       all      3.55      0.04      1.64      2.59     92.18
09:40:02 AM       all      3.93      0.04      2.41      8.83     84.79
09:50:02 AM       all      4.41      0.02      2.07      3.14     90.37
10:00:02 AM       all      3.20      0.01      1.55      1.46     93.78
10:10:01 AM       all      3.48      0.02      1.70      2.85     91.94
10:20:01 AM       all      3.65      0.03      1.73      1.23     93.36
10:30:01 AM       all      3.17      0.01      1.39      1.14     94.29
10:40:01 AM       all      3.39      0.04      1.53      1.54     93.50
10:50:01 AM       all      4.54      0.01      2.49      8.26     84.70
11:00:01 AM       all      4.16      0.02      2.48      4.53     88.81

11:00:01 AM       CPU     %user     %nice   %system   %iowait     %idle
11:10:02 AM       all      7.19      0.14      4.54     12.93     75.20
11:20:02 AM       all      4.44      0.04      2.62      8.50     84.40
11:30:03 AM       all      4.29      0.01      2.75     11.33     81.62
11:40:03 AM       all      3.78      0.01      2.56     11.24     82.40
11:50:02 AM       all      4.08      0.01      2.67     15.45     77.78
12:00:02 PM       all      4.07      0.01      2.78     12.81     80.33
12:10:02 PM       all      4.39      0.02      2.79     12.41     80.39
12:20:02 PM       all      3.95      0.03      2.44      9.95     83.64
12:30:02 PM       all      3.62      0.02      2.52     11.26     82.59
12:40:02 PM       all      5.96      0.04      2.58     10.10     81.31
12:50:02 PM       all      3.50      0.07      2.09      6.26     88.08
01:00:02 PM       all      3.01      0.03      1.57      2.11     93.28
01:10:02 PM       all      3.05      0.07      1.79      3.65     91.45
01:20:02 PM       all      2.61      0.04      1.47      3.59     92.29
01:30:02 PM       all      3.97      0.03      1.87      3.12     91.00
01:40:02 PM       all      5.40      0.05      2.21      4.35     88.00
01:50:02 PM       all      4.17      0.06      1.96      5.66     88.15
02:00:01 PM       all      4.40      0.03      2.04      2.94     90.59
02:10:01 PM       all      4.58      0.02      2.30      5.83     87.27
02:20:01 PM       all      3.02      0.01      1.66      2.08     93.22
02:30:02 PM       all      3.66      0.02      1.76      2.93     91.62
02:40:01 PM       all      3.66      0.08      1.95      3.30     91.01

02:40:01 PM       CPU     %user     %nice   %system   %iowait     %idle
02:50:01 PM       all      4.10      0.01      2.20      5.52     88.17
03:00:01 PM       all      3.47      0.02      1.86      3.31     91.34
03:10:02 PM       all      5.55      0.02      3.42     11.02     79.99
03:20:03 PM       all      4.23      0.01      1.98      5.29     88.48
03:30:02 PM       all      2.84      0.01      1.58      2.70     92.86
03:40:02 PM       all      3.54      0.02      2.04      3.56     90.84
03:50:02 PM       all      3.38      0.01      2.05      6.45     88.11
04:00:02 PM       all      4.23      0.03      2.09      4.20     89.46
04:10:02 PM       all      2.96      0.01      1.65      1.57     93.80
04:20:02 PM       all      4.79      0.03      2.37      6.25     86.56
04:30:02 PM       all      4.78      0.03      2.44      4.61     88.14
04:40:01 PM       all      4.50      0.04      2.38      5.57     87.51
04:50:01 PM       all      5.35      0.05      2.67      3.96     87.97
05:00:01 PM       all      5.00      0.04      2.64      2.49     89.84
05:10:01 PM       all      4.40      0.03      2.20      3.25     90.11
05:20:01 PM       all      4.37      0.02      2.11      3.30     90.21
05:30:02 PM       all      3.43      0.01      1.88      5.69     88.98
05:40:02 PM       all      7.28      0.02      4.29      7.28     81.13
05:50:01 PM       all      4.48      0.04      2.87      8.01     84.59
06:00:01 PM       all      2.34      0.03      1.31      1.40     94.92
Average:          all      2.98      0.33      2.08      7.87     86.74
[root@scalix2 ~]#



For me, iowait is bad at 0300hrs because I do an omcpoutu for each mailbox (to enable single-user restores) and an rsync of the whole Scalix directory to another server.

How does your sar report look??

C

dresdn
Posts: 92
Joined: Wed Apr 05, 2006 5:11 pm

Postby dresdn » Wed Jun 07, 2006 1:16 pm

Hi C,

I'm pretty new to Red Hat / Fedora (I'm a Gentoo server / Ubuntu desktop guy). I didn't have the sysstat package installed, so I've installed it and we'll see what happens.

Hopefully I'll have something interesting tomorrow. For now, I guess the only thing to do is find out whether there's a way to tune the box.

-Mike

ScalixSupport
Scalix
Posts: 5503
Joined: Thu Mar 25, 2004 8:15 pm

Postby ScalixSupport » Wed Jun 07, 2006 1:17 pm

You can run some empirical tests on throughput with the following scripts:

Code: Select all

----------------------------addfile----------------------------------------
#!/bin/sh
# addfile: create the given number of small test files in ~scalix/temp
if [ $# -eq 0 ]
then
   echo "Usage:  `basename $0` <number_of_~scalix/temp_files_to_be_created>"
   exit 100
fi

i=0
j=$1
while [ $i -lt $j ]
do
       i=`expr $i + 1`
       FILE=~scalix/temp/temp-file-$i
#       echo "Adding file: $FILE"
       echo "Content for $FILE" > $FILE
done

Code: Select all

----------------------------------------rmfile-------------------------------------------
#!/bin/sh
# rmfile: delete the test files created by addfile from ~scalix/temp
if [ $# -eq 0 ]
then
   echo "Usage:  `basename $0` <number_of_~scalix/temp_files_to_be_deleted>"
   exit 100
fi

i=0
j=$1
while [ $i -lt $j ]
do
       i=`expr $i + 1`
       FILE=~scalix/temp/temp-file-$i
#       echo "Deleting file: $FILE"
       rm -f $FILE
done


Run them using time against a Scalix filesystem and the root filesystem where possible by changing the FILE variable.
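For example, assuming you have saved them as ./addfile and ./rmfile and made them executable:

Code: Select all

chmod +x addfile rmfile
time ./addfile 50000    # create 50,000 small files under ~scalix/temp
time ./rmfile 50000     # and delete them again
# then point FILE in both scripts at a directory on the root filesystem
# and repeat, to compare the two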

These scripts came in useful when we were seeing throughput problems at a customer site. It turned out that a combination of VMware ESX, an HP SAN and a 2.4 kernel slowed things down considerably, but obviously that isn't the case in your specific environment.

Cheers

Dave

ScalixSupport
Scalix
Scalix
Posts: 5503
Joined: Thu Mar 25, 2004 8:15 pm

Postby ScalixSupport » Wed Jun 07, 2006 1:21 pm

There are some other alternatives if you're looking to eke out some more performance. Page 349 of the Administration Guide details a setting in general.cfg called UAL_SINGLE_TEMP_DIR which you might find useful.

Also, mounting /var/opt/scalix/temp on a separate filesystem has been known to offer some performance benefits.

In this case, we recommend that you mount that filesystem with "kamikaze" options, i.e. no journalling etc., to increase speed. This is based on the fact that if the server crashes, the files in that particular directory are not used again, so there is no need to recover them.
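As a rough sketch (the logical volume name here is just an example, and the Scalix services should be down while you switch over):

Code: Select all

omshut
mkfs.ext2 /dev/VolGroup00/scalixtemp        # ext2, so no journal
mount -t ext2 -o noatime /dev/VolGroup00/scalixtemp /var/opt/scalix/temp
omrc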

Cheers

Dave

dresdn
Posts: 92
Joined: Wed Apr 05, 2006 5:11 pm

Postby dresdn » Wed Jun 07, 2006 1:53 pm

ScalixSupport wrote:
Run them using time against a Scalix filesystem and the root filesystem where possible by changing the FILE variable.



This absolutely did the trick. The root partition performed beautifully for 50k files, but as soon as I ran them against the Scalix temp directory, the I/O wait shot up to 100% on each CPU.

The only thing I can think of that is causing the slowdown is that I am using LVM for the Scalix partition.

I'll move the temp directory according to your other post and it should clear the problem right up.

I'll report my success or failure here.

Thanks.

-Mike

tenaciousC
Posts: 89
Joined: Thu Mar 30, 2006 5:41 pm
Location: Manchester, UK.

Moving Scalix temp to a non journalling fs.

Postby tenaciousC » Wed Jun 07, 2006 5:25 pm

I created a 2GB Logical Volume and did an mkfs.ext2 to prevent journalling.

Then omshut followed by

mount /dev/VolGroup00/scalixtemp /var/opt/scalix/temp -t ext2

then omrc

But bad things happened. Local delivery and other services failed to start.

ScalixSupport wrote:This is based on the fact that if the server crashes, the files in that particular directory are not used again, so there is no need to recover them.


Perhaps, Dave, this temp directory is not as expendable as you thought?

All restarted OK after a swift

umount /var/opt/scalix/temp

Phew!

C :shock:

PS: I ran those test scripts for 50,000 files and could NOT get the iowait past 10%. The plot thickens!

pete
Posts: 111
Joined: Tue Nov 09, 2004 10:26 pm
Location: San Diego, CA

Postby pete » Thu Jun 08, 2006 11:39 am

When you mounted /var/opt/scalix/temp on an ext2 filesystem, did you make sure the permissions on the mount point were correct?

drwxrwx--x 4 scalix scalix 1097728 Jun 8 08:38 /var/opt/scalix/temp
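If the freshly made filesystem came up owned by root (a new ext2 volume usually does), something along these lines on the mounted directory should bring it in line with the above:

Code: Select all

chown scalix:scalix /var/opt/scalix/temp
chmod 771 /var/opt/scalix/temp     # drwxrwx--x, as shown above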

P

kanderson

Postby kanderson » Thu Jun 08, 2006 11:14 pm

Note too that directory access speeds drop off drastically if there is a REALLY large number of files in a single directory. Depending on usage, that temp directory can grow very large; I've seen well over 300,000 files in there on one install.

I generally add a cron job to my installs which does the following.

find /var/opt/scalix/temp/ -type f -ctime +7 -exec rm -f '{}' \;

This will delete anything which hasn't changed over the past week. Perhaps that would help in your case as well?
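As an /etc/cron.d entry it would look something like this (the time of day is arbitrary):

Code: Select all

# /etc/cron.d/scalix-temp-clean - nightly at 04:30, run as root
30 4 * * * root find /var/opt/scalix/temp/ -type f -ctime +7 -exec rm -f '{}' \;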

Dave, any thoughts on a better way to do this, or is this OK?

Thanks
Kev.

dresdn
Posts: 92
Joined: Wed Apr 05, 2006 5:11 pm

Postby dresdn » Fri Jun 09, 2006 12:28 am

Hi Dave,

Is there a good way to point the entire temp directory at the separate ext2 partition that has been created?

The reason I ask is that I *could* mount it at /var/opt/scalix/temp, but personally I really don't like sub-mounting partitions (there's no guarantee that /var/opt/scalix/temp will be there when the OS tries to boot and mount it).

I like Kev's idea of cleaning the temp directory out; are there any reasons why that would be a bad idea? Perhaps a socket gets created when a service starts and could technically be weeks old ... ?

Thanks,
Mike

tenaciousC
Posts: 89
Joined: Thu Mar 30, 2006 5:41 pm
Location: Manchester, UK.

Postby tenaciousC » Fri Jun 09, 2006 1:29 am

Putting a reference to /var/opt/scalix/temp in /etc/fstab would be the way to mount it at boot. If it did fail to mount, the system would just fall back to the original ext3 version of /var/opt/scalix/temp.
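Something like this in /etc/fstab (reusing the logical volume I created earlier) should do it:

Code: Select all

/dev/VolGroup00/scalixtemp   /var/opt/scalix/temp   ext2   defaults,noatime   0 0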

Pete's tip on permissions may be very pertinent here also.

There are 4,400 files in my temp directory, which doesn't seem like enough to start causing bad performance.

