Setting up LVM

mrirh · Postby **mrirh** » Thu Apr 26, 2007 1:32 pm

Hello,

I am setting up a fresh install of Scalix and I created the LVM on /dev/sdb1. The fstab has this entry;

/dev/vgscalix/lvscalix /var/opt/scalix xfs defaults 1 2

and df-h;

/dev/mapper/vgscalix-lvscalix
100G 272K 100G 1% /var/opt/scalix

I have two questions about best practice for setting up the LVM for Scalix: 1) Is XFS ok to format the mount or is another fs recommended?
2) Do I need another volume to take the backup snapshots?

Thank you,

~James

Shredder · Postby **Shredder** » Thu Apr 26, 2007 2:56 pm

You could try looking at these.
http://portal.knowledgebase.net/display/1n/index.asp?c=4664&cpc=XobrFeR5Lsov624U2PCS4Eyx6TOWaiM7gw3an4YYt77&cid=4912&cat=&catURL=&r=3.574771E-02

Found these in the knowledge base under LVM

Shredder

mrirh · Postby **mrirh** » Thu Apr 26, 2007 3:21 pm

Hello,

Thanks for the response. I am actually looking at that doc, but it doesn't say which FS would be best. I am guessing that XFS would better than ext3 since Scalix is using postgres, although I really don't know.

Does anyone have a recommendation?

Thank you,

~James

craig · Postby **craig** » Thu Apr 26, 2007 3:42 pm

I would in my opinion use ext3, just for it's all purpose abilit, XFS is good for really large files.

So your choice should fall on your needs.

Craig

jaime.pinto · Postby **jaime.pinto** » Thu Apr 26, 2007 4:12 pm

There are a number of file systems better than ext3, since they allow for very easy growth and shrinkage of the file system, xfs being one of them, but under RHE4 it's just not available, so we have to settle for ext3 (a pain to grow, impossible to shrink!).

As far is snapshot goes, you do require a different volume or partition for it, anywhere from 10% to 30% of the whole file system, depending on how long each backup to tape will take. The longer it takes, the larger that partition will need to be. A good strategy is to start with 50% for normal use and 10% for snapshot. As time goes by and you learn a little better about your system's requirements/behavior you can grow the partitions on the direction you want. You should develop a script to turn snapshot on just prior to start the tape backup and off right after.

Under RHE4 (3,2,1,...) snapshot just doesn't work. It will surely crash and lock the whole machine the moment the snapshot reaches 2GB. In the RHE5 release notes they claim to have fixed this, and to have added xfs, but I had no change to check that yet. Under SUSE 10 and ubuntu breeze/dapper snapshot works fine, and so do the additional options of file systems. Under solaris/irix it's even better.

I hope this helps.

On a side note I'm looking into the option of not using local storage at all in the scalix server, neither under the cluster setup. I'd will keep all the mail folders on a 3rd server, independent of the scalix servers, and mounted as NFS. The NFS server will take care of the snaphot/backup.

Jaime

Shredder · Postby **Shredder** » Thu Apr 26, 2007 4:19 pm

Using NFS is really highly unrecommended because NFS is not optimized for high volume small transactions.

Shredder

jaime.pinto · Postby **jaime.pinto** » Thu Apr 26, 2007 4:41 pm

NFS is not optimized for high volume small transactions.

That is questinable. For example, I have a situation of about 300 hundred computer nodes rendering frames of animation/special effects 24 hours a day, week after week, to a single NFS mounting point, common to all the nodes. The renderfarm is constantly hitting the NFS server from every direction, with little tiny and/or large bits/fragments of data continuously being written to and re-read from filesystem without a problem. How can a mail server even compare to this?

Jaime

Shredder · Postby **Shredder** » Thu Apr 26, 2007 5:04 pm

NFS mounted data is not supported by Scalix, and I think the installer will not let you upgrade if your data is on a NFS mount point.

If you want to have your data elsewhere, it would be best to use a FC SAN.

Shredder

jaime.pinto · Postby **jaime.pinto** » Thu Apr 26, 2007 6:39 pm

FC SAN is probably a very good solution if you have a couple of thousand users, at least 2TB of storage, and who knows, maybe even 3-4 mail servers in the cluster, sharing that storage in some sort of load balancing serving system, in particular if your sole business is email.

But in our case, with just about 120 users in total. a SAN is an overkill, and not cost effective.

In my view, the only purpose of having a second mail server in a cluster setup is for it to be a fail-over to the first using heartbeat (CPU, power supply, system disk, etc) . Storage is a completely different issue. If you add a SAN to the equation you need to worry about a fail-over for the SAN as well.

We already have a very failsafe storage solution for the whole facility, available via NFS/(gigE,10gigE, infiniband). We are looking into further consolidating storage, common to several serving requirements, not pulverizing it. Adding yet another storage device just to make scalix happy goes against that prime directive.

There is a difference between supporting/not supporting a configuration and a solution that works well or not. I couldn't care less if scalix supports or not NFS. That's politics. What I do care about is to not waste money.

On that note, in another post I asked scalix if they could be more specific on what makes them unhappy about NFS. I hope they can take the time to answer considering the cost involved on setting up SANs. viewtopic.php?t=7032&highlight=

Jaime

dkelly · Postby **dkelly** » Thu Apr 26, 2007 11:56 pm

The answer is in the release notes at http://downloads.scalix.com/.community/11.0.3/RELEASE_NOTES.html#supported

Cheers

Dave

jaime.pinto · Postby **jaime.pinto** » Fri Apr 27, 2007 12:58 am

Also note that we do not support putting a Scalix Message Store on a NFS- or SMB-based storage device such as a NAS Filer. Performance issues and/or Message Store corruption can occur. This is most notable for Linux distributions using Linux 2.6.x kernels as the NFS client software in these kernels have known issues with file locking operations, which Scalix heavily relies on.

I've been hearing about such claims under linux for a few years now. That is probably the reason we held off most of our NFS servers still on Irix and Solaris until very recently. The scalix claim does mention the client side, which is still a point well taken on my part. But I've been using redhat and ubuntu NFS clients for 4 and 2 years respectively, in very large deployments, and I'm still to see a real issue personally that could be attributed to sync or locking operations. It just hasn't happen yet on my watch in practice!

So much so that since early January I deployed a 8TB RHE4 NFS server on a trial basis, and late March I made it officially production certified, with over 40 very heavy and hungry NFS clients. I must say most of our users are developers and extremely demanding, subjecting this server to a beating I have not seen in years. I'm still to hear a report of a problem related to IO operations (except for the snapshot attempt).

In conclusion, I'm sure there must have been problems out there that gave origin to the warning, but these days they seem overblown and exaggerated. More like a legend or a taboo. I did a google search on the subject and practically all the posts a came across date back to 2002 or earlier, or people just repeating what they heard from other people without real support to their claims.

So, my request to Scalix remains. What have you seen in your lab regarding NFS performance/behavior that justifies the above objection on the release notes?

Jaime

florian · Postby **florian** » Fri Apr 27, 2007 12:40 pm

Hi Jaime,

I believe you are very likely to be correct about running NFS servers on Linux - and I would assume that this is really the case, given that even a couple of storage vendors NAS/NFS implementations are based on Linux (not sure about Netapps, but others are).

Linux NFS clients are a more complex case. While I believe they have been well-tested and are mature for standard filesystem access, such as for sharing home directories, etc., the case of a heavy-duty server application running on a NFS client system is a very different story.

As an example, in RHEL4/Kernel 2.6 up until Update 3, the implementation of the "nolock" mount option had suddenly changed meaning. In 2.4 Kernels, the meaning of the option meant that there would be no NFS locking calls across the network, however, multiple processes running on the system mounting the NFS file system with this option would still see each others locks. Effectively, the semantics of the option were more like a "local locking" option - the original Linux NFS code 'copied' that behavior and naming from Solaris. In the initial 2.6 implementation, semantics changed and "nolock" was suddenly implemented as what the name might imply - no locking at all.

As Scalix relies on some level of file locking for coordinating file access between multiple processes, it fell over badly and the message store would be corrupted on those systems. The Linux-NFS folks had recognized this error at some point and RedHat took it in, after working on a bug I filed as a result, see https://bugzilla.redhat.com/bugzilla/sh ... ?id=167192 for details.

We recently had a customer try this on RHEL4U4 and, while it did not expose the basic problem described in this bug anymore, we did experience message store corruption as a result of some inadequate locking that we are still working to fully understand. The solution seems to be to rely on NFS network locking, however in that case, performance is affected dramatically.

This gets us to the final point - performance patterns. While providers of NFS-based storage solutions highlight the great scaleability of their products, very often, they do not make such statements specific to applications and use cases - and this is highly important for storage. NFS performs great for large-block-size, read-mostly, few-files, async disk access situation. Again, this is what you see in users sharing home directories or shared application installation instances. Both these are what NFS was originally designed for.

Scalix on the other hand, with it's file system based message store, is quite the opposite. We are pretty write-intensive (given that there are small changes such as marking messages read, etc., all the time), require synchronous writes for maximum data integrity and also have variable block sizes and I/O chunks, plus our store is spread across a large number of small files by design. The layout has a lot of advantages in terms of stability, data integrity and general scaleability, however, it constitutes a worst case scenario for a NFS situation.

Bottom line is that from a Scalix perspective we have enough evidence of issues, both data integrity and performance related, that we have decided to explicitly discourage people from using NFS-based storage and in most situations and with very few exceptions at this point won't commit to supporting customers doing this.

Hope this helps to provide some insight into the rationale behind these statements,
Florian.

Scalix Forums

Setting up LVM

Setting up LVM

Setting up LVM

Who is online