Diagnosing Dirsync

Dirsync is (relatively) easy to set up but, if you configure everything at once and then start it all running at the same time, a problem can be difficult to diagnose because so many things are moving in different directions.

Whenever I'm asked to help diagnose dirsync setups, the very first thing I suggest is to start again using small steps, i.e. configure one direction first and make sure that it works by setting up auditing on the dirsync service.

So, delete your existing dirsync agreements on serverA and serverB using omdelds -i nnn and omdelds -e nnn where nnn is the number returned from omlistds -i or omlistds -e.

It's best to work through the HowTo at HowTos/AddingAScalixServer to make sure that you have your routes set up correctly.

Configure auditing:

omoff -d 0 dirsync
omconfaud dirsync 15
omon dirsync

on both machines.

To explain a little about dirsync: it's a command-reply protocol between the servers. The standard mode of operation is:

  1. Importing server sends a REQUEST_UPDATES message with a timestamp, i.e. it asks for any directory updates AFTER the date specified.
  2. If the exporting server has never received a message from the importing server before, or the timestamp is 0, it will send a REPLY_RELOAD message back. Otherwise, it will send a REPLY_UPDATES message with all the updates.
  3. If the importing server receives a REPLY_RELOAD message, it will issue a REQUEST_ALL message back to the exporting server.
  4. If the importing server receives a REPLY_UPDATES message, it will apply those changes.
  5. When the exporting server receives a REQUEST_ALL, it will reply with a REPLY_ALL which includes every entry in its directory.
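
The exchange above can be sketched as two small shell functions. This is only an illustration of the rules in the list, not Scalix code: the function names and arguments are hypothetical, and the "load full directory" action for REPLY_ALL is my assumption about what the importing server does with the full set of entries.

```shell
# exporter_reply REQUEST SEEN_BEFORE TIMESTAMP
#   REQUEST      REQUEST_UPDATES or REQUEST_ALL
#   SEEN_BEFORE  yes/no: has the exporter heard from this importer before?
#   TIMESTAMP    timestamp carried by REQUEST_UPDATES (0 means "never synced")
exporter_reply() {
    case "$1" in
        REQUEST_ALL)
            echo REPLY_ALL ;;
        REQUEST_UPDATES)
            if [ "$2" != yes ] || [ "$3" = 0 ]; then
                echo REPLY_RELOAD
            else
                echo REPLY_UPDATES
            fi ;;
        *)
            echo "unknown request: $1" >&2
            return 1 ;;
    esac
}

# importer_action REPLY: what the importing server does next
importer_action() {
    case "$1" in
        REPLY_RELOAD)  echo "send REQUEST_ALL" ;;
        REPLY_UPDATES) echo "apply updates" ;;
        REPLY_ALL)     echo "load full directory" ;;  # assumed behaviour
        *)
            echo "unknown reply: $1" >&2
            return 1 ;;
    esac
}
```

For a brand new agreement you would therefore expect to see REQUEST_UPDATES, REPLY_RELOAD, REQUEST_ALL and REPLY_ALL in that order.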

By enabling auditing, you will be able to use tail -f on /var/opt/scalix/logs/audit in two windows (one for the importing server and one for the exporting server) and see the sync in "real time", with the messages described above.
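
The audit log can be noisy, so it may help to filter it down to the dirsync messages. The helper below is a minimal sketch: the function name is made up, and it assumes the message names appear literally in the audit lines, which you should confirm against your own log.

```shell
# Print only the lines that mention a dirsync request or reply.
# Assumption: the message names are written verbatim in the audit log.
dirsync_messages() {
    grep -E 'REQUEST_(UPDATES|ALL)|REPLY_(RELOAD|UPDATES|ALL)'
}

# Usage (one window per server):
#   tail -f /var/opt/scalix/logs/audit | dirsync_messages
```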

On serverA, configure the import agreement using omaddds -i and on serverB, configure the export agreement using omaddds -e.

There are a couple of tweaks you can add to /var/opt/scalix/sys/general.cfg to help the diagnosis:

DS_CUST_SEND_REQ_NOW=TRUE
DS_CUST_MSGQ_TIMEOUT=2

DS_CUST_SEND_REQ_NOW=TRUE causes the importing server to issue a REQUEST_UPDATES when dirsync is started using omon or omrc.

DS_CUST_MSGQ_TIMEOUT=2 tells the dirsync service to wake up more frequently to check for any import agreements that need to be started.
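
If you would rather script the change than edit the file by hand, something like the following works for simple KEY=VALUE lines. It is a sketch only: set_cfg is a hypothetical helper, not a Scalix tool, and it assumes each setting sits on its own line with no spaces around the equals sign, as in the two lines above. Take a copy of general.cfg first.

```shell
# set_cfg FILE KEY VALUE
# Set KEY=VALUE in FILE, replacing any existing line for KEY so that
# running it twice does not leave duplicate entries.
set_cfg() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Copy every line except existing settings for this key...
    grep -v "^${key}=" "$file" > "$tmp" 2>/dev/null || true
    # ...then append the new setting.
    echo "${key}=${value}" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage:
#   set_cfg /var/opt/scalix/sys/general.cfg DS_CUST_SEND_REQ_NOW TRUE
#   set_cfg /var/opt/scalix/sys/general.cfg DS_CUST_MSGQ_TIMEOUT 2
```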

You should then be able to restart dirsync

omoff -d 0 -w dirsync ; omon dirsync

Using the audit logs, you should see the REQUEST_UPDATES sent from serverA. Also check /var/log/mail to make sure that the message was sent through sendmail with the address scalix@other.host.com. On serverB, you should be able to see the REQUEST_UPDATES being received and a REPLY_UPDATES being sent back.
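
To pick the relevant entries out of the mail log, a fixed-string grep is enough. The helper name below is hypothetical, and other.host.com is the placeholder used above, so substitute the real name of serverB.

```shell
# Reads mail log lines on stdin and prints those that mention ADDRESS.
# -F treats the address as a fixed string, so the dots are not wildcards.
mail_log_for() {
    grep -F -- "$1"
}

# Usage on serverA:
#   mail_log_for scalix@other.host.com < /var/log/mail
```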

If there is no REQUEST_UPDATES received, there are a couple of things that you need to check:

  1. Whether you can mail serverB from serverA without needing Scalix to do it, i.e. send to root@other.host.com using the mail command.
  2. Whether you are using CNAME records for your servers. If you are, read through the HowTo at HowTos/AddingAScalixServer again to find out how to get that working.
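
Both checks can be done from the command line on serverA. The snippet below is a rough sketch: is_alias is a made-up helper, other.host.com is the placeholder from above, and it relies on the host utility reporting CNAMEs with the phrase "is an alias for", which is what the BIND host command prints.

```shell
# Reads `host NAME` output on stdin; succeeds if NAME is a CNAME (alias).
is_alias() {
    grep -q ' is an alias for '
}

# Usage on serverA:
#   echo "dirsync path test" | mail -s "test from serverA" root@other.host.com
#   host other.host.com | is_alias && echo "other.host.com is a CNAME"
```

If the test mail never arrives in root's mailbox on serverB, fix that before looking at dirsync any further.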

If the REQUEST_UPDATES was received, but no REPLY_UPDATES is sent back, this can mean one of two things:

  1. No agreement was set up between the two servers
  2. The agreement doesn't contain the correct +DIRSYNC address. Check that you've specified the mailnodes correctly.

If a REPLY_UPDATES or a REPLY_RELOAD is sent back but the importing server never receives it, the same rules apply as for the initial REQUEST_UPDATES. Check mail between both servers, the host names and also that the agreements are set up correctly.

If the reply is received successfully, the work is done. Communication is happening in both directions and both sides have their agreements set up.
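
As a summary, the checks in this section can be written down as a small decision helper. It only encodes the advice above and does not talk to Scalix in any way; the function name and its yes/no arguments are hypothetical.

```shell
# diagnose REQ_RECEIVED REPLY_SENT REPLY_RECEIVED   (each argument: yes or no)
#   REQ_RECEIVED    serverB's audit log shows the REQUEST_UPDATES arriving
#   REPLY_SENT      serverB's audit log shows a REPLY_UPDATES or REPLY_RELOAD sent
#   REPLY_RECEIVED  serverA's audit log shows that reply arriving
diagnose() {
    if [ "$1" != yes ]; then
        echo "check mail from serverA to serverB and the CNAME/host name setup"
    elif [ "$2" != yes ]; then
        echo "check that the agreement exists and its +DIRSYNC address and mailnodes are correct"
    elif [ "$3" != yes ]; then
        echo "check mail from serverB to serverA, the host names and the agreements"
    else
        echo "dirsync is communicating in both directions"
    fi
}

# Usage:
#   diagnose yes no no
```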