Sunday, February 1, 2015

MariaDB(MySQL) Master-Master Replication

The simplest and probably most common replication method is master-slave replication. Basically, data is replicated from master database to the slave. In case of master database failure you must get the slave database up-to-date before failover and then promote slave to be new master.

Another method is to set up replication in both directions called master-master replication. But you must be aware that this setup brings some potential issues as data changes are happening on both nodes. It could be problem if you have tables with auto_increment fields. If both servers are inserting or updating in the same table replication will break on one server due to “duplicate entry” error. To resolve this issue you have "auto_increment_increment" and "auto_increment_offset" settings.

In my case its best to use master-master setup as active-passive replication. If we know that only one node is performing data modifications we can avoid many possible problems. In case of the failover "slave" could be easily promoted to a new master. Data modifications are automatically replicated to failed node when it comes back up.

Of course, this simple setup is not suitable for all situations and it has it's drawbacks but luckily you have several other options at your disposal, like MariaDB Galera Cluster.



Servers setup:
OS: Debian 7.8
DB: MariaDB 10.0.16




Install MariaDB 10 (both nodes).

$ sudo apt-get install python-software-properties
$ sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xcbcb082a1bb943db
$ sudo add-apt-repository 'deb http://mirror3.layerjet.com/mariadb/repo/10.0/debian wheezy main'
$ sudo apt-get update
$ sudo apt-get install mariadb-server


Stop MariaDB on both nodes:
$ sudo service mysql stop


MASTER1

Edit /etc/mysql/my.cnf parameter file.

# bind-address          = 127.0.0.1
server-id               = 61
report_host             = master1
log_bin                 = /var/log/mysql/mariadb-bin
log_bin_index           = /var/log/mysql/mariadb-bin.index
relay_log               = /var/log/mysql/relay-bin
relay_log_index         = /var/log/mysql/relay-bin.index
# replicate-do-db       = testdb
auto_increment_increment = 5
auto_increment_offset = 1


# bind-address = 127.0.0.1
By default mysql will accept connections only from local host. We will comment this line to enable connections from other hosts. This is important for replication to work.

server-id = 61
report_host = master1
Choose ID that will uniquely identify your host. I will use last two digits of my IP address. Optionally you could set report_host parameter for servers to report each other their hostnames.

log_bin = /var/log/mysql/mariadb-bin
log_bin_index = /var/log/mysql/mariadb-bin.index
Enable binary logging.

relay_log = /var/log/mysql/relay-bin
relay_log_index = /var/log/mysql/relay-bin.index
Enable creating relay log files. Events that are read from master’s binary log are written to slave relay log.

replicate-do-db = testdb
With this parameter we are telling to MariaDB which databases to replicate. This parameter is optional.




Now we can start MariaDB server.

$ sudo service mysql start


Login as root and create user that will be used for replicating data between our servers. Grant appropriate privileges to the user.

$ sudo mysql -uroot -p
MariaDB [(none)]> create user 'replusr'@'%' identified by 'replusr';
MariaDB [(none)]> grant replication slave on *.* to 'replusr'@'%';


For the last step check status information about binary log files as we will use this information to start replication on another node.

MariaDB [(none)]> show master status;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000009 |      634 |              |                  |
+--------------------+----------+--------------+------------------+



MASTER2

Edit /etc/mysql/my.cnf parameter file.

# bind-address          = 127.0.0.1
server-id               = 62
report_host             = master2
log_bin                 = /var/log/mysql/mariadb-bin
log_bin_index           = /var/log/mysql/mariadb-bin.index
relay_log               = /var/log/mysql/relay-bin
relay_log_index         = /var/log/mysql/relay-bin.index
# replicate-do-db       = testdb
auto_increment_increment = 5
auto_increment_offset = 2


Start MariaDB server.

$ sudo service mysql start


Create user which will be used for replication and grant privileges to the user.

$ sudo mysql -uroot -p
MariaDB [(none)]> create user 'replusr'@'%' identified by 'replusr';
MariaDB [(none)]> grant replication slave on *.* to 'replusr'@'%';


To start replication enter following commands.

MariaDB [(none)]> STOP SLAVE;

MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='master1', MASTER_USER='replusr',
-> MASTER_PASSWORD='replusr', MASTER_LOG_FILE='mariadb-bin.000009', MASTER_LOG_POS=634;

MariaDB [(none)]> START SLAVE;

For MASTER_LOG_FILE and MASTER_LOG_POS I have used information from "show master status" on the first node.

Check status information of the slave threads.

MariaDB [(none)]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master1
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000009
          Read_Master_Log_Pos: 634
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 537
        Relay_Master_Log_File: mariadb-bin.000009
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 634
              Relay_Log_Space: 828
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:


Notice that Read_Master_Log_Pos and Exec_Master_Log_Pos are in sync which is good indicator that our databases are in sync.


Check status information about binary log files of the MASTER2 node. We will need this information to start replication on MASTER1 node.

MariaDB [(none)]> show master status;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000009 |      759 |              |                  |
+--------------------+----------+--------------+------------------+



MASTER1

Start replicating data from MASTER2 to MASTER1 node.

MariaDB [(none)]> STOP SLAVE;

MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='master2', MASTER_USER='replusr',
-> MASTER_PASSWORD='replusr', MASTER_LOG_FILE='mariadb-bin.000009', MASTER_LOG_POS=759;

MariaDB [(none)]> START SLAVE;


MariaDB [(none)]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master2
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000009
          Read_Master_Log_Pos: 759
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 537
        Relay_Master_Log_File: mariadb-bin.000009
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 759
              Relay_Log_Space: 828
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 62
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:


Everything seems to be OK.



Let’s create test table and insert some rows to test our replication.


MASTER1

MariaDB [(none)]> create database testdb;

MariaDB [(none)]> use testdb;
Database changed

MariaDB [testdb]> CREATE TABLE users (id INT AUTO_INCREMENT,
    -> name VARCHAR(30),
    -> datum TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -> PRIMARY KEY(id));
Query OK, 0 rows affected (0.50 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Marko');
Query OK, 1 row affected (0.06 sec)

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
+----+-------+---------------------+
1 row in set (0.00 sec)


MASTER2

MariaDB [testdb]> use testdb
Database changed

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
+----+-------+---------------------+
1 row in set (0.00 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES('John');
Query OK, 1 row affected (0.39 sec)

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
+----+-------+---------------------+
2 rows in set (0.00 sec)


MASTER1

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
+----+-------+---------------------+
2 rows in set (0.00 sec)

As we can see our table and rows are replicated successfully.



Let’s simulate crash of the MASTER1 node and power off the server.

$ sudo shutdown -h now

While server is down insert some rows on MASTER2 node.

MASTER2

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Eric');
Query OK, 1 row affected (0.41 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Clive');
Query OK, 1 row affected (0.08 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Maria');
Query OK, 1 row affected (0.09 sec)

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
|  3 | Eric  | 2015-01-31 16:19:49 |
|  4 | Clive | 2015-01-31 16:19:55 |
|  5 | Maria | 2015-01-31 16:20:01 |
+----+-------+---------------------+
5 rows in set (0.00 sec)


MariaDB [testdb]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Reconnecting after a failed master event read
                  Master_Host: master1
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000010
          Read_Master_Log_Pos: 1828
               Relay_Log_File: relay-bin.000012
                Relay_Log_Pos: 1083
        Relay_Master_Log_File: mariadb-bin.000010
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1828
              Relay_Log_Space: 1663
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 2003
                Last_IO_Error: error reconnecting to master 'replusr@master1:3306' - retry-time: 
                               60  retries: 86400  message: Can't connect to MySQL server 
                               on 'master1' (111 "Connection refused")
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:

Check Last_IO_Error message while MASTER1 is down.


Now turn on MASTER1 node again.
MariaDB server and replication will start automatically and MASTER1 should catch up MASTER2.


MASTER1

Check "users" table - it's synchronised again.

$ mysql -u root -p -D testdb

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
|  3 | Eric  | 2015-01-31 16:19:49 |
|  4 | Clive | 2015-01-31 16:19:55 |
|  5 | Maria | 2015-01-31 16:20:01 |
+----+-------+---------------------+
5 rows in set (0.00 sec)



Please let me know if you see possible problems in this configuration. I will update post gladly. Thanks for reading!

32 comments:

  1. Hi!

    if you use the same server-id's for both master serve like you describe in your tutorial, you get this error massage in slave status:

    Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).

    ReplyDelete
  2. This was typo in blog post. I have used different server-id's in my working example - already made correction in the post.

    Thank you for reporting me this.

    Regards,
    Marko

    ReplyDelete
  3. Very nice post, worked very well.

    ReplyDelete
  4. I think you forgot to mention about adding hosts to /etc/hosts file. Could confuse a newbie! Otherwise, thank you for a good detailed post!

    ReplyDelete
  5. I'm glad that you found this blog post useful.

    Best Regards,
    Marko

    ReplyDelete
  6. Nice! But what if I need more than 2 Masters?

    ReplyDelete
  7. Hi rafaeldb,

    I haven't tested this solution with more than 2 masters.

    For a multi-master replication with 3 and more servers I would use Galera Cluster.

    Regards,
    Marko

    ReplyDelete
  8. This is a fantastic tutorial. Very well written and accurate. I successfully created a multi-server environment to load balance two WordPress servers and your tutorial came in very handy. Thanks so much!

    ReplyDelete
  9. Install on CentOS 7 using MariaDB.repo and every thing worked after removing
    /var/log/mysql/ and letting MariaDB use /var/lib/mysql/ . Great tutorial. Thanks!!!

    ReplyDelete
  10. Very nice explanation step by step, thanks buddy.. Can u please provide link of galeria cluster also.

    Kishore Immadi

    ReplyDelete
  11. hi Marko,
    when I restarted the MySQL service from MASTER1 the log-file and position is changed , so every time I have to enter the updated log file and position, or can u help me for it?

    ReplyDelete
  12. This comment has been removed by the author.

    ReplyDelete
  13. Hi Marko,
    I appreciate your work providing us with this tutorial. I've just set up a pair of servers working as Freeradius db backend. The only weak point of the setup is case of simultaneous insertion to tables with auto_increment column on both servers, when the connection between them is broken. Maybe then unsafe, but saving your time workaround, when some unsynchronized records are irrelevant to you is additional directive in config on both masters slave_skip_errors=0 . Anyway this simple sollution does great job most the times.

    ReplyDelete
  14. I seem to get a lot of Error 'Duplicate entry'in my "lastmysql error" box showsing when I look at the slave config, this seems to stop the slaves from syncing and I have to manually find which one is more up to date, cart it over to the other server and then force the servers to re-sync to each others masters, any ideas?

    ReplyDelete
  15. For those that want to configure the replication with SSL (in case master1 and master2 are in different insecure networks), might consider to enable SSL by adding:

    [mysqld]
    ssl-ca=/etc/pki/tls/certs/root-ca-chain.crt
    ssl-cert=/etc/pki/tls/certs/host.ssl.pem.crt
    ssl-key=/etc/pki/tls/private/host.ssl.pem.key

    When the user "replusr" is created, add "REQUIRE SSL" to ensure that this user only connects via SSL. This results in something like:
    create user 'replusr'@'%' identified by 'replusr' REQUIRE SSL;

    Last but not least ensure that the replication slave connects via SSL by adding ",MASTER_SSL=1", which results in:

    CHANGE MASTER TO MASTER_HOST='master2',MASTER_SSL=1, MASTER_USER='replusr',
    -> MASTER_PASSWORD='replusr', MASTER_LOG_FILE='mariadb-bin.000009', MASTER_LOG_POS=759;

    ReplyDelete
  16. HI ,

    am using Open SUSE leap 42.1 maria database Vicidial 0n 02 server , my both server sync each other and all thing working fine but old data , user , table not sync master1 to master2 . new added table user sysnc accordingly. please advise

    ReplyDelete
  17. Tanver, before adding another server to your configuration you must create fresh backup, and using that backup create additional slave/master database. Then you will have fully synced databases.

    ReplyDelete
  18. This was one of the best articles for doing this. I like the little mini unit tests at the end :-)

    ReplyDelete
  19. Hi,
    you forgot some very important think ...
    You must ensure, that autoimcrement fields are correct incremented so when you got an disconnect the autoimcrement is ok around the Master Servers..
    Add following to the my.cnf
    First Master:
    auto_increment_increment = 5
    auto_increment_offset = 1

    Second Master
    auto_increment_increment = 5
    auto_increment_offset = 2

    ReplyDelete
  20. Thank you Holger for your comment.
    I have updated my.cnf files.

    Regards,
    Marko

    ReplyDelete
  21. for me it works still like master slave, making changes on slave has no result on master. I tested it with M an S down test it works the same way...
    It's not what i was looking for.

    ReplyDelete
  22. Dear Marko,

    Thank you for your tutorial, It seems to be not so hard to make that setup working. But I have a question (actually 2 questions).

    1) As you said, it is better to use Active-Passive mode in master-to-master replication. Suppose the "first" MASTER server (which was ACTIVE) fails, so, clients should now send their transactions to the "second" MASTER server (which was as Passive) that would begin to accept read / write operations in the database.
    If the "first" MASTER goes back on, I suppose It should start a replication process against the "second" master to have the information updated.
    Here the question: this "first" MASTER server will accept transactions in the database immediately as the service is activated or it will wait for the synchronization to end to accept new transactions ?.
    If it doesn't wait to finish the synchronization, is there any configuration to avoid the processing new mysql transactions until the databases between the two masters synchronize completely?

    Thanks in advance

    Alfredo

    ReplyDelete
  23. Hello Alfredo,

    the best is to test by yourself. I think that node will accept transactions without waiting for synchronization to finish.

    If you need virtually synchronous replication between nodes for production usage I would recommend MariaDB Galera Cluster.
    In Galera Cluster configuration node will be available for new transactions after synchronization finishes.

    Regards,
    Marko

    ReplyDelete
  24. Very usefull tutorial. Thanks!

    ReplyDelete
  25. Postovani, mozda je Vama profesionalcima pitanje smijesno, ali da li je moguce podesiti ovakvu replikaciju za dva servera koja su na windowsima, ako jest da li je ista procedura. I svaka cast na upustvu.

    ReplyDelete
  26. I have followed a number of posts (including this one) to set up Mariadb Multi-Master Replication. So far I have not been able to get any of them to work.
    Should the 2 dbs have been synced via mysqldump prior to using this post?

    ReplyDelete
  27. What would you do if you want to create master-slave configuration. You would copy master database using mysqldump or xtrabackup to the slave side than start replication from specified position. For master-master configuration both servers are master and slave at the same time so you need to start replication twice. You don't need to sync databases using mysqldump.

    For master-master configuration I would recommend Galera Cluster.

    Regards,
    Marko

    ReplyDelete
  28. This comment has been removed by a blog administrator.

    ReplyDelete
  29. How we can verify if both mariadb/mysql is having same data?

    ReplyDelete
  30. Hi Marko,
    Great article, thank you.

    For 3 node multi-master I woulds suggest to use GTID based replication with auto_increment_increment and auto_increment_offset parameters.

    Regards,
    Ravil Khalilov

    ReplyDelete
  31. Hello Ravil,
    Thank you for the comment. I agree that it is better to use GTID based replication.
    This is very old article, more like POC not production ready solution.
    For the multi-master configuration Galera Cluster is much better option.
    Maybe I should remove article not to confuse people.
    Best regards,
    Marko

    ReplyDelete