Scalix Data Directory Growing Insanely

Posted: Wed Jan 30, 2008 7:40 am
by propagandhi
Yesterday I checked the size of the Scalix data directory. We mount /var/opt on a 2TB local drive, and when I checked, it was at exactly 59GB. Today we had a problem where POP connections were not being accepted, so I rebooted the Scalix server after restarting the Scalix services via the init scripts did not work.

After the reboot I found that the Scalix data directory is now at 259GB.

How on earth could the data directory have grown so suddenly since the previous day when I checked it? I guarantee we have not received 200GB of data overnight!!!

I ran sxdu and it shows that the actual mail size is in fact 31GB.

This is crazy!! How is the Scalix data directory 228GB larger than the actual mailbox sizes?

Please help, my job depends very much on fixing this up.....

Posted: Wed Jan 30, 2008 8:26 am
by Valerion
By data directory, I assume you mean ~/data, not the whole Scalix mailstore itself?

First run an active omscan and see if there are orphans taking up space. Then check your queues with omqdump and see if one of them is causing issues. I once had this when the archiver shut down (the drive it was archiving to had a hardware failure and Linux unmounted it). Also check whether omshowlog reports anything and whether there are hints in ~/logs/fatal.

I wonder if the failing POP3 connections didn't leave data behind? Or whether the POP3 issues are related to the data size (though that's unlikely).

Check p. 183 in the Administration Guide for some options for changing the behaviour of omscan. Be careful though, and make sure you understand the implications first, before changing it.

It's also not unusual for the data directory to be larger than the reported mailbox size. On my server the total mailbox size is 2.9GB, while ~/data takes up 5.6GB. However, growth of 200GB overnight is unusual.

Still Not Solved!

Posted: Thu Jan 31, 2008 6:41 am
by propagandhi
Yes, I mean everything under /var/opt, and yes, primarily the data directory itself.

I've still not located the source of the huge growth. It has not occurred again yet, but I am very concerned that the mailbox total that Scalix reports is approximately 30GB while 258GB is used on the drive.

I really need to solve this.

I have used omscan in just about all the variations I have found on the forums, but I don't want to be too aggressive. I need to find a way of shrinking the data directory back to a sensible size!!!

Please, I really need some feedback here. We are using the Enterprise Edition of 11.3 with update 1 applied.

More Info

Posted: Thu Jan 31, 2008 6:46 am
by propagandhi
In my fatal log, I see the following:

ERROR Administration(Scalix Admin C) Wed Jan 30 14:21:38 2008
[OM 3400] Invalid direct ref found
Pid of logging process: 4304

HUNDREDS of those.

Also these:

ERROR Directory Sync(Directory Sync) Wed Jan 30 13:30:07 2008
[OM.DS 1506] (DS IMPORT)
A REPLY Updates has been timed out and the
configured transaction retry count has been exceeded.


ERROR Directory Sync(Directory Sync) Wed Jan 30 12:33:03 2008
[OM.DS 1608] #(DS IMPORT) REPLY Updates with wrong Request_Id received. Message ignored.

Pid of logging process: 4304
Current errno value: 42
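
To get a feel for which errors dominate a log like this, the entries can be tallied by message code. The following is only a minimal sketch: it assumes the codes always appear in square brackets in the form shown above (for example [OM 3400] or [OM.DS 1506]), and the function name count_error_codes is made up for illustration, not part of any Scalix tool.

```python
import re
from collections import Counter

# Matches bracketed message codes such as "[OM 3400]" or "[OM.DS 1506]".
# Assumption: every code in the log follows this bracketed pattern.
CODE_RE = re.compile(r"\[(OM(?:\.[A-Z]+)? \d+)\]")


def count_error_codes(log_text):
    """Return a Counter mapping each message code to its number of occurrences."""
    return Counter(CODE_RE.findall(log_text))


def print_summary(log_text):
    """Print the codes found in log_text, most frequent first."""
    for code, n in count_error_codes(log_text).most_common():
        print(f"{n:8d}  {code}")
```

Reading the fatal log into a string and passing it to print_summary would list the most frequent codes first, which makes it easier to tell a flood of one error apart from a handful of unrelated ones.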

Getting there

Posted: Thu Jan 31, 2008 6:51 am
by propagandhi
OK

Even further on this problem.

There is a file in the data directory that is 200GB in size on its own!!!!!


/var/opt/scalix/sx/s/data/00000ha/006jd98

Its properties (ls -lah):
-rw-rw---- 1 scalix scalix 200G Jan 30 09:12 /var/opt/scalix/sx/s/data/00000ha/006jd98


That's the file right there.

If I cat that file I see this repeated thousands of times:
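
For anyone hunting a similar runaway file, here is a rough sketch of one way to list the largest files under a directory tree. It is not a Scalix utility; largest_files is a hypothetical helper name, and it reports the apparent size (st_size), which can overstate real disk usage for sparse files.

```python
import heapq
import os


def largest_files(root, top_n=10):
    """Return up to top_n (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat measures a symlink itself instead of following it.
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return heapq.nlargest(top_n, sizes)


def print_largest(root, top_n=10):
    """Print the largest files under root with sizes in GiB."""
    for size, path in largest_files(root, top_n):
        print(f"{size / 1024 ** 3:10.2f} GiB  {path}")
```

Pointing print_largest at the data directory would put a single oversized file like the one above at the top of the list.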

tent-Transfer-Encoding: base64
ename="N72061.pdf"
7759_16140_10311"
D

tent-Transfer-Encoding: base64
ename="N72061.pdf"
7759_16140_10311"
D

tent-Transfer-Encoding: base64
ename="N72061.pdf"
7759_16140_10311"
D

tent-Transfer-Encoding: base64
ename="N72061.pdf"
7759_16140_10311"
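
Rather than running cat over a 200GB file, the repetition can be confirmed by sampling only the beginning of it. This is a minimal sketch under assumptions: count_marker is a made-up helper, the marker is whatever byte string you expect to repeat (here the attachment name seen above), and the marker is assumed not to overlap with itself.

```python
def count_marker(path, marker, limit=64 * 1024 * 1024, chunk_size=1024 * 1024):
    """Count occurrences of marker (bytes) within the first `limit` bytes of path."""
    count = 0
    read = 0
    tail = b""
    with open(path, "rb") as fh:
        while read < limit:
            chunk = fh.read(min(chunk_size, limit - read))
            if not chunk:
                break  # reached end of file before the limit
            read += len(chunk)
            data = tail + chunk
            count += data.count(marker)
            # Keep the last len(marker) - 1 bytes so a marker that straddles
            # a chunk boundary is found in the next pass, but never twice.
            tail = data[-(len(marker) - 1):] if len(marker) > 1 else b""
    return count
```

A call such as count_marker(path, b"N72061.pdf") reads at most 64MB by default, so it finishes quickly and does not flood the terminal. A very high count in such a small sample would suggest the same message part was written over and over.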


What do I do? How can I safely remove it?

Your help is much anticipated and appreciated.

Posted: Thu Jan 31, 2008 8:54 am
by Valerion
Not sure. Since you have dirsync configured, I assume you're an EE customer. There seems to be an error there, so I would suggest you contact your reseller or Scalix support for help with this.