In the MySQL world it can cause confusion, or even synchronisation problems in a Galera cluster configuration.
Let’s check some examples.
I have a MySQL database with datadir=/data in the configuration file. I have deleted the lost+found directory and restarted the MySQL service.
When I list my databases, this is the result:
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| employees          |
| mysql              |
| performance_schema |
| pitrdb             |
| sbtest             |
| sys                |
| test               |
+--------------------+
8 rows in set (0.34 sec)
I will stop the MySQL service and recreate the lost+found directory.
$ sudo service mysql stop
$ cd /data
$ sudo mklost+found
mklost+found 1.42.9 (4-Feb-2014)
Restart the service and show the databases again.
$ sudo service mysql start

mysql> show databases;
+---------------------+
| Database            |
+---------------------+
| information_schema  |
| employees           |
| #mysql50#lost+found |
| mysql               |
| performance_schema  |
| pitrdb              |
| sbtest              |
| sys                 |
| test                |
+---------------------+
9 rows in set (0.01 sec)
Notice the new database: #mysql50#lost+found
If you have dedicated an entire filesystem to the MySQL datadir, MySQL will interpret everything under that directory as database-related, and every subdirectory is treated as a database.
SHOW DATABASES therefore lists lost+found, which is not a real database.
If you check the error log, you will notice this message:
[ERROR] Invalid (old?) table or database name 'lost+found'
In a single-server configuration the lost+found directory only causes confusion. I’m not aware of any negative effects on the database.
To avoid the confusion, move the database files to a sub-directory below the root directory of the filesystem. Also remove every directory that is not MySQL-related from the datadir location.
Stop the MySQL service on the database server.
$ sudo service mysql stop
Make a sub-directory and move the existing data into it.
$ sudo su -
root@galera1:~# cd /data
root@galera1:/data# mkdir mydata && mv !(mydata) mydata
root@galera1:/data# chown -R mysql:mysql /data
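The mv !(mydata) mydata step relies on bash's extglob option, which may not be enabled in every shell or script. A minimal sketch of the same move written as a plain loop follows; the function name move_into_subdir is made up for illustration, and like the !(mydata) pattern it leaves hidden files where they are.

```shell
# move_into_subdir DATADIR SUBDIR
# Hypothetical helper: create DATADIR/SUBDIR and move every other
# non-hidden top-level entry of DATADIR into it, without needing extglob.
move_into_subdir() {
    local datadir="$1" subdir="$2" entry
    mkdir -p "$datadir/$subdir" || return 1
    for entry in "$datadir"/*; do
        [ -e "$entry" ] || continue                      # empty directory: glob did not match
        [ "$entry" = "$datadir/$subdir" ] && continue    # never move the target into itself
        mv -- "$entry" "$datadir/$subdir/" || return 1
    done
}
```

With the MySQL service stopped, it would be called as move_into_subdir /data mydata, followed by the same chown as above.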
Update the configuration file with the new datadir location.
# vi /etc/mysql/my.cnf
...
datadir=/data/mydata
...
Remove non-database directories.
# rm -rf mydata/lost+found
# mklost+found
mklost+found 1.42.9 (4-Feb-2014)
# pwd
/data
# ls -l
total 56
drwx------ 2 root root 49152 Oct 4 16:48 lost+found
drwxr-xr-x 9 mysql mysql 4096 Oct 4 16:48 mydata
Restart the service.
$ sudo service mysql start
From version 5.6 on, you can tell the server to ignore non-database directories with the ignore-db-dir option.
$ sudo vi /etc/mysql/my.cnf
...
ignore-db-dir=lost+found
...
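To confirm that the option really is in the file the server reads, a small check like the one below can help. It is only a sketch: the function names ignored_dirs and is_ignored are made up for illustration, it looks at a single file and does not follow !include or !includedir directives, and it accepts both the dash and the underscore spelling of the option.

```shell
# ignored_dirs CNF
# Hypothetical helper: print the value of every ignore-db-dir line in CNF,
# one per line (surrounding whitespace is stripped).
ignored_dirs() {
    sed -n -E 's/^[[:space:]]*ignore[-_]db[-_]dir[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*$/\1/p' "$1"
}

# is_ignored CNF DIR
# Succeeds when DIR appears as an ignore-db-dir value in CNF.
is_ignored() {
    ignored_dirs "$1" | grep -Fxq -- "$2"
}
```

For example, is_ignored /etc/mysql/my.cnf lost+found && echo "lost+found is ignored" prints the message only when the entry is present.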
Let’s test how the lost+found directory affects a Galera cluster configuration.
For this test I’m using Percona XtraDB Cluster 5.6 with 3 nodes.
# dpkg -l | grep percona-xtradb-cluster-server
ii percona-xtradb-cluster-server-5.6 5.6.25-25.12-1.trusty amd64 Percona XtraDB Cluster database server binaries

mysql> select version();
+--------------------+
| version()          |
+--------------------+
| 5.6.25-73.1-56-log |
+--------------------+
1 row in set (0.00 sec)

mysql> show global status like 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
1 row in set (0.01 sec)
In this configuration, datadir points to /data, which contains the lost+found directory.
As this is version 5.6, I’ve included the ignore-db-dir option in the configuration file.
I don’t see any issues in the SHOW DATABASES list or in the error log.
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| employees          |
| mysql              |
| performance_schema |
| pitrdb             |
| sbtest             |
| sys                |
| test               |
+--------------------+
8 rows in set (0.00 sec)
For the SST method I’m using xtrabackup-v2, which is the default and the one Percona recommends.
So, what will happen if I initiate SST for one of the nodes in the cluster?
$ sudo service mysql stop
 * Stopping MySQL (Percona XtraDB Cluster) mysqld [OK]
$ sudo rm /data/grastate.dat
$ sudo service mysql start
[sudo] password for marko:
 * Starting MySQL (Percona XtraDB Cluster) database server mysqld
 * State transfer in progress, setting sleep higher mysqld
 * The server quit without updating PID file (/data/galera2.pid).
It appears that SST failed with errors:
WSREP_SST: [ERROR] Cleanup after exit with status:1 (20151004 12:01:00.936)
2015-10-04 12:01:02 16136 [Note] WSREP: (cf98f684, 'tcp://0.0.0.0:4567') turning message relay requesting off
2015-10-04 12:01:12 16136 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.56.102' --datadir '/data/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '16136' --binlog 'percona-bin' : 1 (Operation not permitted)
2015-10-04 12:01:12 16136 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2015-10-04 12:01:12 16136 [ERROR] WSREP: SST script aborted with error 1 (Operation not permitted)
2015-10-04 12:01:12 16136 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2015-10-04 12:01:12 16136 [ERROR] Aborting
2015-10-04 12:01:12 16136 [Warning] WSREP: 0.0 (galera3): State transfer to 1.0 (galera2) failed: -22 (Invalid argument)
2015-10-04 12:01:12 16136 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():731: Will never receive state. Need to abort.
The cause of the SST failure is the lost+found directory, yet the error log never mentions it.
SST fails because xtrabackup ignores the ignore-db-dir option and tries to synchronise the lost+found directory, which is owned by root and not accessible to the mysql user.
What will happen if I (as a test) change the ownership of the lost+found directory on the donor nodes?
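Since the error log gives no hint about which entry is the problem, it can be worth checking the datadir for anything the mysql user does not own before starting a state transfer. The following is a minimal sketch, assuming GNU stat is available; the function name list_foreign_entries is made up for illustration and is not part of MySQL or Percona tooling.

```shell
# list_foreign_entries DATADIR [OWNER]
# Hypothetical helper: print every top-level entry of DATADIR whose owner
# is not OWNER (default: mysql). Uses GNU stat (-c %U prints the owner name).
list_foreign_entries() {
    local datadir="$1" owner="${2:-mysql}" entry
    for entry in "$datadir"/* "$datadir"/.[!.]*; do
        [ -e "$entry" ] || continue      # skip glob patterns that matched nothing
        if [ "$(stat -c %U "$entry")" != "$owner" ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

On the nodes in this test, list_foreign_entries /data would be expected to print /data/lost+found while the directory is still owned by root.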
drwx------ 2 root root 49152 Oct 4 11:50 lost+found

marko@galera3:/data# sudo chown -R mysql:mysql /data/lost+found
marko@galera1:/data$ sudo chown -R mysql:mysql /data/lost+found

marko@galera2:/data$ sudo service mysql start
 * Stale sst_in_progress file in datadir mysqld
 * Starting MySQL (Percona XtraDB Cluster) database server mysqld
 * State transfer in progress, setting sleep higher mysqld [OK]

NODE2
...
drwxrwx--x 2 mysql mysql 4096 Oct 4 12:07 lost+found
...
SST succeeded and the node has joined and synced with the cluster.
To avoid these inconveniences, just move the databases out of the root directory of the filesystem.
Some of you will simply delete the lost+found directory, but be aware that fsck may recreate it, and your cluster synchronisation will then fail when you least expect it ;)