MySQL Master-Master desynchronization

Settings a master-master synchronization is pretty easy. You can find a quick guide to do this on google, just try.

What is a little bit more problematic is when you lose your loved sync. And that can happen. It happened to me yesterday, I upgraded my two servers from Debian 4.0 to Debian 5.0. The reason is that the old version (something like 5.0.32) used the Relay_Log_File ${hostname}-relay-bin.XXXXXX. And the new version (5.0.51a-24-log) decided to use mysqld-relay-bin.XXXXXX.

I tried to solve it the easy way with just a :

STOP SLAVE;
CHANGE MASTER TO RELAY_LOG_FILE="mysqld-relay-bin.000001";

but the server wasn’t happy with that. So I took the time to take the current slave settings :

mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: **HOST**
                Master_User: slave1
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000521
        Read_Master_Log_Pos: 306043
             Relay_Log_File: mysqld-relay-bin.000050
              Relay_Log_Pos: 306137
      Relay_Master_Log_File: mysql-bin.000521
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 306043
            Relay_Log_Space: 306137
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 0
1 row in set (0.00 sec)

On both server, I did something like that :

STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO RELAY_LOG_FILE="mysqld-relay-bin.000001", MASTER_LOG_FILE="mysql-bin.000521", MASTER_LOG_POS=306043,
RELAY_MASTER_LOG_FILE='mysql-bin.000521', RELAY_MASTER_LOG_POS=306043;
START SLAVE;

The position of the log is pretty important.

It might sometimes occurs that you RELAY_MASTER_LOG_POS on the slave is higher that the Position that the server possesses for the same file. The reason is that sometimes the master sends its logs, crashs and haven’t saved it to his logs. In this case, you might have a little bit more data on the slave than the server. That means that you should lower the MASTER_LOG_POS of your slave so that it isn’t higher than the Position shown by the “show master status” command.

I really recommend using this precise method instead of restarting the whole synchronization process. Which occasionally requires you to stop your servers to copy the files. On my servers, the databases take 15 GB. It isn’t a big database bug that already makes you think twice before doing a copy (or even a rsync).

By the way, I really talked about it, but InnoDB/MySQL is a very simple/efficient/sure choice for a database. The only big drawback of the InnoDB is that there isn’t a simple online backuping solution. I like to save the database and its logs regulary (a bad raw copy of a InnoDB database can be corrected by its logs) and to extract the whole database in SQL every night.

By the way, if you don’t understand everything and particularly why each slave must have two threads to read master logs, you should read this.