Wednesday, November 11, 2020

ProxySQL - Throttle for MySQL queries

ProxySQL is a great high-availability and load-balancing solution, and it is mostly used for those purposes. But ProxySQL offers much more.

One of the nice features is the throttling mechanism for queries to the backends.

Imagine you have a very active system and applications are executing queries at a very high rate, which is not so unusual nowadays. If just one of those queries slows down, you could easily end up with many active sessions running the same query. Just one problematic query could cause high resource usage and general slowness.

Usually, the DBA is called, but the DBA cannot modify a query, disable the problematic application, or change the database model without detailed analysis.

But ProxySQL can help.
Using ProxySQL we can delay execution of the problematic queries.
Yes, the specific application request would still be slow, but we would avoid a general problem/downtime and "buy" some time for the fix.


Let's simulate such a situation in the test environment.

Run a benchmark test using sysbench.

NUM_THREADS=1
TEST_DIR=/usr/share/sysbench/tests/include/oltp_legacy

sysbench \
  --test=${TEST_DIR}/oltp_simple.lua \
  --oltp-table-size=2000000 \
  --time=300 \
  --max-requests=0 \
  --mysql-table-engine=InnoDB \
  --mysql-user=sbtest \
  --mysql-password=sbtest \
  --mysql-port=3307 \
  --mysql-host=192.168.56.25 \
  --mysql-engine-trx=yes \
  --num-threads=$NUM_THREADS \
  prepare
  
sysbench \
  --test=${TEST_DIR}/oltp_simple.lua \
  --oltp-table-size=2000000 \
  --time=180 \
  --max-requests=0 \
  --mysql-table-engine=InnoDB \
  --mysql-user=sbtest \
  --mysql-password=sbtest \
  --mysql-port=3307 \
  --mysql-host=192.168.56.25 \
  --mysql-engine-trx=yes \
  --num-threads=$NUM_THREADS \
  run

Enable the throttling mechanism and delay execution of all queries globally by setting "mysql-default_query_delay=100".

  
ProxySQLServer> set mysql-default_query_delay=100;
ProxySQLServer> LOAD MYSQL VARIABLES TO RUNTIME; 
ProxySQLServer> SAVE MYSQL VARIABLES TO DISK;
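As a quick sanity check, you can confirm the value actually in effect by querying the ProxySQL admin interface (a sketch; runtime_global_variables holds the variables currently loaded to runtime):

```sql
-- Value currently in effect (after LOAD ... TO RUNTIME)
SELECT variable_name, variable_value
FROM runtime_global_variables
WHERE variable_name = 'mysql-default_query_delay';
```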

Run the test again and check the latency (ms).

  
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:                            1774
        write:                           0
        other:                           0
        total:                           1774
    transactions:                        1774   (9.85 per sec.)
    queries:                             1774   (9.85 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          180.0942s
    total number of events:              1774

Latency (ms):
         min:                                  100.76 <<<<<<<<<<<<<<<<<
         avg:                                  101.51 <<<<< Throttling
         max:                                  129.17 <<<<<<<<<<<<<<<<<
         95th percentile:                      102.97
         sum:                               180083.66

Threads fairness:
    events (avg/stddev):           1774.0000/0.00
    execution time (avg/stddev):   180.0837/0.00

Disable throttling and reset ProxySQL counters.

    
ProxySQLServer> set mysql-default_query_delay=0;
ProxySQLServer> LOAD MYSQL VARIABLES TO RUNTIME; SAVE MYSQL VARIABLES TO DISK;
ProxySQLServer> select * from stats_mysql_query_digest_reset;

Check the latency (ms).

  
Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:                            641413
        write:                           0
        other:                           0
        total:                           641413
    transactions:                        641413 (3563.38 per sec.)
    queries:                             641413 (3563.38 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          180.0004s
    total number of events:              641413

Latency (ms):
         min:                                    0.19
         avg:                                    0.28
         max:                                   44.45
         95th percentile:                        0.43
         sum:                               179252.76

Threads fairness:
    events (avg/stddev):           641413.0000/0.00
    execution time (avg/stddev):   179.2528/0.00

Enable throttling for just a specific query using ProxySQL mysql_query_rules.

 
-- Find problematic query

ProxySQLServer> select hostgroup,username,count_star,
(count_star/(select Variable_Value from stats_mysql_global where Variable_Name='ProxySQL_Uptime')) 
as avg_per_sec, digest, digest_text from stats_mysql_query_digest order by count_star desc limit 10;

+-----------+----------+------------+-------------+--------------------+----------------------------------+
| hostgroup | username | count_star | avg_per_sec | digest             | digest_text                      |
+-----------+----------+------------+-------------+--------------------+----------------------------------+
| 2         | sbtest   | 641413     | 78          | 0xBF001A0C13781C1D | SELECT c FROM sbtest1 WHERE id=? |
+-----------+----------+------------+-------------+--------------------+----------------------------------+
1 row in set (0.00 sec)


-- Reset counters

ProxySQLServer> select * from stats_mysql_query_digest_reset;
+-----------+------------+----------+----------------+--------------------+----------------------------------+------------+------------+------------+-----------+----------+----------+-------------------+---------------+
| hostgroup | schemaname | username | client_address | digest             | digest_text                      | count_star | first_seen | last_seen  | sum_time  | min_time | max_time | sum_rows_affected | sum_rows_sent |
+-----------+------------+----------+----------------+--------------------+----------------------------------+------------+------------+------------+-----------+----------+----------+-------------------+---------------+
| 2         | sbtest     | sbtest   |                | 0xBF001A0C13781C1D | SELECT c FROM sbtest1 WHERE id=? | 641413     | 1601934890 | 1601935070 | 153023170 | 159      | 44349    | 0                 | 214399        |
+-----------+------------+----------+----------------+--------------------+----------------------------------+------------+------------+------------+-----------+----------+----------+-------------------+---------------+
1 row in set (0.00 sec)

Insert a mysql_query_rules row to enable throttling just for that specific query.

   
ProxySQLServer> insert into mysql_query_rules(rule_id,active,digest,delay,apply) values (1,1,'0xBF001A0C13781C1D',100,1);
Query OK, 1 row affected (0.00 sec)

LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
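To verify that the rule is actually matching traffic, the hits counter in stats_mysql_query_rules should grow while the benchmark runs (a quick check from the ProxySQL admin interface):

```sql
-- hits increases for every query matched by the rule
SELECT rule_id, hits FROM stats_mysql_query_rules;
```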

Compare "min_time" between executions.

 
Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:                            1773
        write:                           0
        other:                           0
        total:                           1773
    transactions:                        1773   (9.85 per sec.)
    queries:                             1773   (9.85 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          180.0325s
    total number of events:              1773

Latency (ms):
         min:                                  100.78 <<<<<<<<<<<<<<<<<
         avg:                                  101.53 <<<<< Throttling
         max:                                  104.77 <<<<<<<<<<<<<<<<<
         95th percentile:                      102.97
         sum:                               180021.34

Threads fairness:
    events (avg/stddev):           1773.0000/0.00
    execution time (avg/stddev):   180.0213/0.00

 
ProxySQLServer> select * from stats_mysql_query_digest_reset;
+-----------+------------+----------+----------------+--------------------+----------------------------------+------------+------------+------------+-----------+----------+----------+-------------------+---------------+
| hostgroup | schemaname | username | client_address | digest             | digest_text                      | count_star | first_seen | last_seen  | sum_time  | min_time | max_time | sum_rows_affected | sum_rows_sent |
+-----------+------------+----------+----------------+--------------------+----------------------------------+------------+------------+------------+-----------+----------+----------+-------------------+---------------+
| 2         | sbtest     | sbtest   |                | 0xBF001A0C13781C1D | SELECT c FROM sbtest1 WHERE id=? | 1773       | 1601935408 | 1601935588 | 179697522 | 100681   | 104195   | 0                 | 594           |
+-----------+------------+----------+----------------+--------------------+----------------------------------+------------+------------+------------+-----------+----------+----------+-------------------+---------------+
1 row in set (0.01 sec)

Wednesday, January 24, 2018

Galera Cluster Schema Changes, Row Based Replication and Data Inconsistency

Galera Cluster is a virtually synchronous multi-master replication plug-in. With Galera Cluster an application can write to any node, and transactions are then applied on all servers via row-based replication events.

This is built-in MySQL row-based replication, which supports replication with differing table definitions between master and slave.
So, when using row-based replication, the source and target tables do not have to be identical. A table on the master can have more or fewer columns, or use different data types.

But there are limitations you must watch out for, depending on the MySQL version you are running:
- The database and table names must be the same on both master and slave
- Columns must be in the same order before any additional column
- Each extra column must have a default value
- ...

Newer MySQL versions may tolerate more differences between source and target table - check documentation for your version.


I want to show you what could happen to your data if you do not pay attention to these limitations.


Suppose I have a 3-node MariaDB Galera Cluster with table t.
I want to add several columns to the table while the database is being used by an application.

For this task I will use the built-in Rolling Schema Upgrade (RSU) method, which enables me to perform schema changes on one node without impacting the rest of the cluster.

Add column c4 to table t, following the rules above for row-based replication.

Table t has three columns and one row inserted.
NODE1

MariaDB [testdb]> create table t (c1 varchar(10), c2 varchar(10), c3 varchar(10));
Query OK, 0 rows affected (0.37 sec)

MariaDB [testdb]> insert into t values ('n1-1','n1-1','n1-1');
Query OK, 1 row affected (0.00 sec)


NODE2

MariaDB [testdb]> select * from t;
+------+------+------+
| c1   | c2   | c3   |
+------+------+------+
| n1-1 | n1-1 | n1-1 |
+------+------+------+
1 row in set (0.00 sec)


NODE3

MariaDB [testdb]> select * from t;
+------+------+------+
| c1   | c2   | c3   |
+------+------+------+
| n1-1 | n1-1 | n1-1 |
+------+------+------+
1 row in set (0.01 sec)

I will enable RSU mode, which ensures that this server will not impact the rest of the cluster during the ALTER command execution.

Add column c4 and INSERT a row, simulating application activity.
MariaDB [testdb]> set session wsrep_OSU_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]> alter table t add column c4 varchar(10);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [testdb]> set session wsrep_OSU_method='TOI';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]> insert into t(c1,c2,c3) values ('n1-1','n1-1','n1-1');
Query OK, 1 row affected (0.13 sec)

While the table definition differs between Node1 and the rest of the cluster, INSERT a few more rows on the other nodes.

NODE2

insert into t(c1,c2,c3) values ('n2-1','n2-1','n2-1');


NODE3

insert into t(c1,c2,c3) values ('n3-1','n3-1','n3-1');

Check rows from table t.
NODE1

MariaDB [testdb]> select * from t;
+------+------+------+------+
| c1   | c2   | c3   | c4   |
+------+------+------+------+
| n1-1 | n1-1 | n1-1 | NULL |
| n1-1 | n1-1 | n1-1 | NULL |
| n2-1 | n2-1 | n2-1 | NULL |
| n3-1 | n3-1 | n3-1 | NULL |
+------+------+------+------+
4 rows in set (0.00 sec)

NODE2

MariaDB [testdb]> select * from t;
+------+------+------+
| c1   | c2   | c3   |
+------+------+------+
| n1-1 | n1-1 | n1-1 |
| n1-1 | n1-1 | n1-1 |
| n2-1 | n2-1 | n2-1 |
| n3-1 | n3-1 | n3-1 |
+------+------+------+
4 rows in set (0.00 sec)


NODE3

MariaDB [testdb]> select * from t;
+------+------+------+
| c1   | c2   | c3   |
+------+------+------+
| n1-1 | n1-1 | n1-1 |
| n1-1 | n1-1 | n1-1 |
| n2-1 | n2-1 | n2-1 |
| n3-1 | n3-1 | n3-1 |
+------+------+------+
4 rows in set (0.01 sec)

As you can see, everything is OK with my data.

Add the new column on Node2 and Node3, following the same steps as on Node1.

NODE2

MariaDB [testdb]> set session wsrep_OSU_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]> alter table t add column c4 varchar(10);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [testdb]> set session wsrep_OSU_method='TOI';
Query OK, 0 rows affected (0.00 sec)


NODE3

MariaDB [testdb]> set session wsrep_OSU_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]> alter table t add column c4 varchar(10);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [testdb]> set session wsrep_OSU_method='TOI';
Query OK, 0 rows affected (0.00 sec)

And my task is completed. I have successfully changed the table definition.


But what can happen if I add a new column between existing columns?
Remember, this is not permitted for row-based replication and can cause replication to break, or something even worse.

Enable RSU mode on Node1 and add new column c11 after column c1.
INSERT a row, simulating an active application during the schema change.

NODE1

MariaDB [testdb]> set session wsrep_OSU_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]>
MariaDB [testdb]> alter table t add column c11 varchar(10) after c1;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [testdb]> set session wsrep_OSU_method='TOI';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]> insert into t(c1,c2,c3) values ('n1-1','n1-1','n1-1');
Query OK, 1 row affected (0.01 sec)

MariaDB [testdb]> select * from t;
+------+------+------+------+------+
| c1   | c11  | c2   | c3   | c4   |
+------+------+------+------+------+
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n2-1 | NULL | n2-1 | n2-1 | NULL |
| n3-1 | NULL | n3-1 | n3-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
+------+------+------+------+------+
5 rows in set (0.00 sec)

INSERT rows on the other nodes, because Galera Cluster allows us to write to any node in the cluster.

NODE2

MariaDB [testdb]> insert into t(c1,c2,c3) values ('n2-1','n2-1','n2-1');
Query OK, 1 row affected (0.01 sec)

MariaDB [testdb]> select * from t;
+------+------+------+------+
| c1   | c2   | c3   | c4   |
+------+------+------+------+
| n1-1 | n1-1 | n1-1 | NULL |
| n1-1 | n1-1 | n1-1 | NULL |
| n2-1 | n2-1 | n2-1 | NULL |
| n3-1 | n3-1 | n3-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 |
| n2-1 | n2-1 | n2-1 | NULL |
+------+------+------+------+
6 rows in set (0.00 sec)


NODE3

MariaDB [testdb]> insert into t(c1,c2,c3) values ('n3-1','n3-1','n3-1');
Query OK, 1 row affected (0.01 sec)

MariaDB [testdb]> select * from t;
+------+------+------+------+
| c1   | c2   | c3   | c4   |
+------+------+------+------+
| n1-1 | n1-1 | n1-1 | NULL |
| n1-1 | n1-1 | n1-1 | NULL |
| n2-1 | n2-1 | n2-1 | NULL |
| n3-1 | n3-1 | n3-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 |
| n2-1 | n2-1 | n2-1 | NULL |
| n3-1 | n3-1 | n3-1 | NULL |
+------+------+------+------+
7 rows in set (0.00 sec)

The INSERT commands were executed successfully and everything seems OK with my replication.
There are no errors in the error.log suggesting any problem.

But check the contents of table t on the first node, where the new column was added.

NODE1

MariaDB [testdb]> select * from t;
+------+------+------+------+------+
| c1   | c11  | c2   | c3   | c4   |
+------+------+------+------+------+
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n2-1 | NULL | n2-1 | n2-1 | NULL |
| n3-1 | NULL | n3-1 | n3-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n2-1 | n2-1 | n2-1 | NULL | NULL |
| n3-1 | n3-1 | n3-1 | NULL | NULL |
+------+------+------+------+------+
7 rows in set (0.00 sec)

Notice how the rows differ between nodes, even though we should have exactly the same data on all three nodes.


Let's complete the schema change on the other two nodes.
NODE2

MariaDB [testdb]> set session wsrep_OSU_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]> alter table t add column c11 varchar(10) after c1;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [testdb]> set session wsrep_OSU_method='TOI';
Query OK, 0 rows affected (0.00 sec)


NODE3

MariaDB [testdb]> set session wsrep_OSU_method='RSU';
Query OK, 0 rows affected (0.00 sec)

MariaDB [testdb]> alter table t add column c11 varchar(10) after c1;
Query OK, 0 rows affected (0.34 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [testdb]> set session wsrep_OSU_method='TOI';
Query OK, 0 rows affected (0.00 sec)

I have successfully added the new column and did not break replication, and everything seems OK, but my data is not consistent between the nodes.

NODE1

MariaDB [testdb]> select * from t;
+------+------+------+------+------+
| c1   | c11  | c2   | c3   | c4   |
+------+------+------+------+------+
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n2-1 | NULL | n2-1 | n2-1 | NULL |
| n3-1 | NULL | n3-1 | n3-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n2-1 | n2-1 | n2-1 | NULL | NULL |
| n3-1 | n3-1 | n3-1 | NULL | NULL |
+------+------+------+------+------+
7 rows in set (0.00 sec)


NODE2

MariaDB [testdb]> select * from t;
+------+------+------+------+------+
| c1   | c11  | c2   | c3   | c4   |
+------+------+------+------+------+
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n2-1 | NULL | n2-1 | n2-1 | NULL |
| n3-1 | NULL | n3-1 | n3-1 | NULL |
| n1-1 | NULL | NULL | n1-1 | n1-1 |
| n2-1 | NULL | n2-1 | n2-1 | NULL |
| n3-1 | NULL | n3-1 | n3-1 | NULL |
+------+------+------+------+------+
7 rows in set (0.00 sec)


NODE3

MariaDB [testdb]> select * from t;
+------+------+------+------+------+
| c1   | c11  | c2   | c3   | c4   |
+------+------+------+------+------+
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n1-1 | NULL | n1-1 | n1-1 | NULL |
| n2-1 | NULL | n2-1 | n2-1 | NULL |
| n3-1 | NULL | n3-1 | n3-1 | NULL |
| n1-1 | NULL | NULL | n1-1 | n1-1 |
| n2-1 | NULL | n2-1 | n2-1 | NULL |
| n3-1 | NULL | n3-1 | n3-1 | NULL |
+------+------+------+------+------+
7 rows in set (0.00 sec)


Data inconsistency is the worst problem that can happen in a synchronous cluster configuration.
It can happen without any notice, but sooner or later it will stop the replication process and the failing node will be excluded from the cluster.
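One simple way to detect such divergence, assuming the table is small enough to scan, is to compare table checksums on every node; on a healthy cluster the values must match:

```sql
-- Run on each node and compare the checksum values
CHECKSUM TABLE testdb.t;
```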





REFERENCE
https://dev.mysql.com/doc/refman/5.7/en/replication-features-differing-tables.html


Saturday, January 20, 2018

Global variable wsrep_OSU_method=RSU ignored on MariaDB Galera Cluster 10.0.32

Galera Cluster is a great solution for database clustering, but it has some limitations. One of the most critical is how Galera Cluster performs schema changes.

From Galera Cluster documentation:
http://galeracluster.com/documentation-webpages/clusterstallonalter.html
What’s happening is a side effect of a multi-master cluster with several appliers.
The cluster needs to control when a DDL statement ends in relation to other 
transactions, in order to deterministically detect conflicts and schedule
parallel appliers. Effectively, the DDL statement must execute in isolation.

Galera Cluster has a 65K window of tolerance for transactions applied in parallel,
but the cluster must wait when ALTER commands take too long.

Basically, if you run a DDL statement on a slightly bigger table you will probably freeze your cluster for the duration of the DDL statement. And you cannot cancel this operation.

The built-in method to perform such changes without blocking the whole cluster is the "Rolling Schema Upgrade" (RSU) method.

The MariaDB Galera Cluster documentation has a note on how to perform DDL changes on Galera Cluster.
https://mariadb.com/kb/en/library/tips-on-converting-to-galera/
A "fast" DDL may as well be done via TOI. This is a tentative list of such:
* CREATE/DROP/RENAME DATABASE/TABLE
* ALTER to change DEFAULT
* ALTER to change definition of ENUM or SET (see caveats in manual)
* Certain PARTITION ALTERs that are fast.
* DROP INDEX (other than PRIMARY KEY)
* ADD INDEX?
* Other ALTERs on 'small' tables.
* With 5.6 and especially 5.7 having a lot of ALTER ALGORITHM=INPLACE cases, 
check which ALTERs should be done which way.
Otherwise, use RSU. Do the following separately for each node: 
1. SET GLOBAL wsrep_OSU_method='RSU'; -- This also takes the node out of the cluster. 
2. ALTER TABLE ... 
3. SET GLOBAL wsrep_OSU_method='TOI'; -- Puts back in, leading to resync 
(hopefully a quick IST, not a slow SST)

Let’s test this solution.

My testing environment:
Server version: 10.0.32-MariaDB-1~trusty-wsrep mariadb.org binary distribution, wsrep_25.20.rc3fc46e

I will add a column to the sbtest table.

Node1:
SET GLOBAL wsrep_OSU_method='RSU';

alter table sbtest add column text varchar(20);

In the error.log I didn’t notice any information about the node being desynced from the cluster.

Node2 lock check:
MariaDB [(none)]> select trx_query,trx_tables_locked
                  from information_schema.innodb_trx;
+------------------------------------------------------+-------------------+
| trx_query                                            | trx_tables_locked |
+------------------------------------------------------+-------------------+
| alter table sbtest add column text varchar(20)       |                 1 |
+------------------------------------------------------+-------------------+
1 row in set (0.00 sec)

Even though I globally enabled the RSU method on Node1 with the statement noted in the documentation, Node1 was not desynced from the cluster and the DDL statement was executed on the other nodes.
If this were a production database, I would have big problems during the ALTER command.

But what if I enable the RSU method using the session-level variable?

Node1:
SET SESSION wsrep_OSU_method=RSU;

alter table sbtest add column text1 varchar(20);

SET SESSION wsrep_OSU_method=TOI;

From error.log:
180120 11:26:34 [Note] WSREP: Member 0.0 (mariadb-test-1) desyncs itself from group
180120 11:26:34 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 8)
180120 11:26:34 [Note] WSREP: Provider paused at b95703dd-fd37-11e7-a31f-1af15b5193fa:8 (21)
180120 11:35:03 [Note] WSREP: resuming provider at 21
180120 11:35:03 [Note] WSREP: Provider resumed.
180120 11:35:03 [Note] WSREP: Member 0.0 (mariadb-test-1) resyncs itself to group
180120 11:35:03 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 8)
180120 11:35:03 [Note] WSREP: Member 0.0 (mariadb-test-1) synced with group.
180120 11:35:03 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 8)
180120 11:35:03 [Note] WSREP: Synchronized with group, ready for connections

Node2 lock check:
MariaDB [(none)]> select trx_query,trx_tables_locked 
                  from information_schema.innodb_trx;
Empty set (0.00 sec)

Now Node1 is desynced from the cluster and I am running the ALTER statement locally, without impact on the rest of the cluster.
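Besides watching the error.log, you can confirm the node state from SQL while the ALTER is running (a quick check; during an RSU change the node should report the Donor/Desynced state):

```sql
-- Expected to show 'Donor/Desynced' on the node running the RSU ALTER
SHOW STATUS LIKE 'wsrep_local_state_comment';
```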


I don’t know if this behaviour is expected for this version, or if it is a bug.

When running Galera Cluster in production it is VERY important to take some time and perform tests, especially if you need to run ALTER statements on big tables.
Expect the unexpected, even when running statements from the documentation.


Monday, November 13, 2017

HASH GROUP BY not used when using more than 354 aggregate functions

A few days ago we had a performance problem with one of our main application views. It was a complex view that used a lot of aggregate functions. The functions were used to transpose rows into columns.

When the developer added a few more aggregate functions for new columns, query performance changed significantly and we had a performance problem.

After a quick analysis I noticed one change in the execution plan.

The HASH GROUP BY aggregation was replaced with the less performant SORT GROUP BY. I tried to force HASH GROUP BY using hints, but nothing helped.


We tried to reproduce the problem using dummy tables, and then a colleague found what was triggering the plan change.

In this example I have a query with 354 unique aggregate functions, which uses HASH GROUP BY.

SELECT
          *
      FROM (SELECT LEVEL ID
             FROM DUAL CONNECT BY LEVEL < 1000) VANJSKI,
          (  SELECT
                    123 UNUTARNJI_ID,
                     sum(1) kolona0,
                     sum(1) kolona1,
                     sum(2) kolona2,
...
...
...
                     sum(350) kolona350    ,
                     sum(351) kolona351    ,
                     sum(352) kolona352    ,
                     sum(353) kolona353    ,
                     sum(354) kolona354
               FROM DUAL
           GROUP BY 123) UNUTARNJI
    WHERE     VANJSKI.ID = UNUTARNJI.UNUTARNJI_ID(+);

Plan hash value: 2294628051
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |      1 |        |       |     5 (100)|    999 |00:00:00.01 |       |       |          |
|*  1 |  HASH JOIN OUTER               |      |      1 |      1 |  4631 |     5  (20)|    999 |00:00:00.01 |  2293K|  2293K| 1549K (0)|
|   2 |   VIEW                         |      |      1 |      1 |    13 |     2   (0)|    999 |00:00:00.01 |       |       |          |
|   3 |    CONNECT BY WITHOUT FILTERING|      |      1 |        |       |            |    999 |00:00:00.01 |       |       |          |
|   4 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   5 |   VIEW                         |      |      1 |      1 |  4618 |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   6 |    HASH GROUP BY               |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |   677K|   677K|  723K (0)|
|   7 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

Notice what happens if I change the "sum(1) kolona0" function and add one more unique function.

SELECT
          *
      FROM (SELECT LEVEL ID
             FROM DUAL CONNECT BY LEVEL < 1000) VANJSKI,
          (  SELECT
                    123 UNUTARNJI_ID,
                     sum(355) kolona0,
                     sum(1) kolona1,
                     sum(2) kolona2,
...
...
...
                     sum(350) kolona350    ,
                     sum(351) kolona351    ,
                     sum(352) kolona352    ,
                     sum(353) kolona353    ,
                     sum(354) kolona354
               FROM DUAL
           GROUP BY 123) UNUTARNJI
    WHERE     VANJSKI.ID = UNUTARNJI.UNUTARNJI_ID(+);

Plan hash value: 2326946862
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |      1 |        |       |     5 (100)|    999 |00:00:00.01 |       |       |          |
|*  1 |  HASH JOIN OUTER               |      |      1 |      1 |  4631 |     5  (20)|    999 |00:00:00.01 |  2293K|  2293K| 1645K (0)|
|   2 |   VIEW                         |      |      1 |      1 |    13 |     2   (0)|    999 |00:00:00.01 |       |       |          |
|   3 |    CONNECT BY WITHOUT FILTERING|      |      1 |        |       |            |    999 |00:00:00.01 |       |       |          |
|   4 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   5 |   VIEW                         |      |      1 |      1 |  4618 |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   6 |    SORT GROUP BY               |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 | 20480 | 20480 |18432  (0)|
|   7 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

The query execution plan changed - HASH GROUP BY was replaced with SORT GROUP BY.


This was obviously a limitation of HASH GROUP BY, but I couldn't find more information using the Oracle docs or Google, so I asked Oracle Support for confirmation.

From Oracle Support I received an answer that a similar case was a bug closed as "not a bug", without a workaround. With the default DB_BLOCK_SIZE, the limit is 354 aggregate functions.

To solve the performance problem we changed the view to avoid the HASH GROUP BY limitation.
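For illustration, one way to stay under the limit is to split the aggregates across two inline views and join them on the grouping key, so each view holds fewer than 354 aggregate functions and both can still use HASH GROUP BY (a hypothetical sketch, not the exact fix we applied):

```sql
SELECT a.grp_id, a.kolona1, b.kolona200
  FROM (SELECT 123 grp_id, SUM(1) kolona1   /* first half of the aggregates */
          FROM DUAL GROUP BY 123) a
  JOIN (SELECT 123 grp_id, SUM(200) kolona200 /* second half of the aggregates */
          FROM DUAL GROUP BY 123) b
    ON a.grp_id = b.grp_id;
```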

Testing environment - Oracle Database 12c Enterprise Edition Release 12.1.0.2.0



Saturday, October 21, 2017

Beware of intensive slow query logging when using - log_queries_not_using_indexes

The MySQL slow query log is great for identifying slow queries that are good candidates for optimisation. Slow query logging is disabled by default, but it is activated by DBAs or developers on most environments.

You can use the slow query log to record all the traffic, but be careful, as logging all traffic can be very I/O intensive and can have a negative impact on general performance. It is recommended to record all traffic only for specific time periods.

This is why slow query logging is controlled with the long_query_time parameter, so that only slow queries are logged.
But there is another parameter to think about - log_queries_not_using_indexes.

By default log_queries_not_using_indexes is disabled. If this parameter is turned on, you will log queries that don’t use an index, or that perform a full index scan where the index doesn't limit the number of rows - regardless of the time taken.

If you have long_query_time configured to a reasonable value and still notice that queries are being logged intensively in the slow query log file, then you probably have log_queries_not_using_indexes enabled.

By enabling this parameter you’re practically saying that full scans are "evil" and should be considered for optimisation. But a full scan doesn’t always mean that a query is slow. In some situations the query optimizer chooses a full table scan as a better option than an index, or you are simply querying a very small table.
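To check whether the option is enabled and to turn it off at runtime, something like the following works (a sketch; min_examined_row_limit is a related variable that can also keep queries examining only a few rows out of the log):

```sql
SHOW GLOBAL VARIABLES LIKE 'log_queries_not_using_indexes';
SET GLOBAL log_queries_not_using_indexes = OFF;
-- Optionally, skip logging queries that examine fewer than N rows
SET GLOBAL min_examined_row_limit = 1000;
```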


For instance, on several occasions I've noticed slow query logs flooded with queries like this:

# Time: 171021 17:51:45
# User@Host: monitor[monitor] @ localhost []
# Thread_id: 1492974  Schema:   QC_hit: No
# Query_time: 0.000321  Lock_time: 0.000072  Rows_sent: 0  Rows_examined: 1
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: No
SET timestamp=1508608305;
SELECT
      SCHEMA_NAME
    FROM information_schema.schemata
    WHERE SCHEMA_NAME NOT IN ('mysql', 'performance_schema', 'information_schema');

+------+-------------+----------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | schemata | ALL  | NULL          | NULL | NULL    | NULL | NULL | Using where |
+------+-------------+----------+------+---------------+------+---------+------+------+-------------+
Notice the Query_time: 0.000321.

Should I optimize a query that runs in 0.000321 seconds by adding indexes? Probably not. But my log is still flooded with this and similar queries.

I don't find this parameter very useful, and I would leave it at its default value to avoid problems with intensive query logging.
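If you suspect this parameter is the culprit, you can check and disable it at runtime with standard MySQL/MariaDB statements. As a middle ground, min_examined_row_limit skips logging for queries that examine fewer rows than the threshold, which filters out exactly the kind of tiny full scans shown above:

-- Check whether the parameter is enabled
SHOW GLOBAL VARIABLES LIKE 'log_queries_not_using_indexes';

-- Disable it at runtime (add it to my.cnf to persist across restarts)
SET GLOBAL log_queries_not_using_indexes = OFF;

-- Alternatively, keep it enabled but skip queries examining few rows
SET GLOBAL min_examined_row_limit = 1000;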


Tuesday, October 17, 2017

Enable SSL-encryption for MariaDB Galera Cluster

Imagine you have a MariaDB Galera Cluster with nodes running in different data centers, and the data centers are not connected via a secured VPN tunnel.
As database security is very important, you must ensure that traffic between nodes is fully secured.

Galera Cluster supports encrypted connections between nodes using the SSL protocol, and in this post I want to show how to encrypt all cluster communication with SSL.


Check current SSL configuration.
MariaDB [(none)]> SHOW VARIABLES LIKE 'have_ssl';
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| have_ssl      | DISABLED |  ###==> SSL Disabled
+---------------+----------+
1 row in set (0.01 sec)

MariaDB [(none)]> status
--------------
mysql  Ver 15.1 Distrib 10.0.29-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Connection id:          56
Current database:
Current user:           marko@localhost
SSL:                    Not in use   ###==> SSL is not used
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server:                 MariaDB
Server version:         10.0.17-MariaDB-1~trusty-wsrep-log mariadb.org binary distribution, wsrep_25.10.r4144
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:            /var/run/mysqld/mysqld.sock
Uptime:                 7 days 42 min 29 sec

Threads: 52  Questions: 10  Slow queries: 0  Opens: 0  Flush tables: 1  Open tables: 63  Queries per second avg: 0.000
--------------
SSL is currently disabled.

To fully secure all cluster communication we must SSL-encrypt the replication traffic within the Galera Cluster, State Snapshot Transfers (SST), and the traffic between database server and client.

We will create the SSL certificates and keys using openssl.

# Create new folder for certificates
mkdir -p /etc/mysql/ssl
cd /etc/mysql/ssl

# Create CA certificate
# Generate CA key
openssl genrsa 2048 > ca-key.pem

# Using the CA key, generate the CA certificate
openssl req -new -x509 -nodes -days 3600 \
> -key ca-key.pem -out ca-cert.pem
-----
Country Name (2 letter code) [AU]:HR
State or Province Name (full name) [Some-State]:Zagreb
Locality Name (eg, city) []:Zagreb
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Dummycorp
Organizational Unit Name (eg, section) []:IT
Common Name (e.g. server FQDN or YOUR name) []:myu1.localdomain
Email Address []:marko@dummycorp.com


# Create server certificate, remove passphrase, and sign it
# Create the server key
openssl req -newkey rsa:2048 -days 3600 \
>         -nodes -keyout server-key.pem -out server-req.pem
-----
Country Name (2 letter code) [AU]:HR
State or Province Name (full name) [Some-State]:Zagreb
Locality Name (eg, city) []:Zagreb
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Dummycorp
Organizational Unit Name (eg, section) []:IT
##==> Use the ".localdomain" only on the first certificate.
Common Name (e.g. server FQDN or YOUR name) []:myu1
Email Address []:marko@dummycorp.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:secretpassword
An optional company name []:

# Process the server RSA key
openssl rsa -in server-key.pem -out server-key.pem

# Sign the server certificate
openssl x509 -req -in server-req.pem -days 3600 \
>         -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out server-cert.pem


# Create client certificate, remove passphrase, and sign it
# Create the client key
openssl req -newkey rsa:2048 -days 3600 \
>         -nodes -keyout client-key.pem -out client-req.pem
-----
Country Name (2 letter code) [AU]:HR
State or Province Name (full name) [Some-State]:Zagreb
Locality Name (eg, city) []:Zagreb
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Dummycorp
Organizational Unit Name (eg, section) []:IT
Common Name (e.g. server FQDN or YOUR name) []:myu1
Email Address []:marko@dummycorp.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:secretpassword
An optional company name []:

# Process client RSA key
openssl rsa -in client-key.pem -out client-key.pem

# Sign the client certificate
openssl x509 -req -in client-req.pem -days 3600 \
>         -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out client-cert.pem


# Verify certificates

openssl verify -CAfile ca-cert.pem server-cert.pem client-cert.pem

server-cert.pem: OK
client-cert.pem: OK

If verification succeeds, copy the certificates to all nodes in the cluster and set mysql as the owner of the files.

# Copy
scp -r /etc/mysql/ssl node1:/etc/mysql
scp -r /etc/mysql/ssl node2:/etc/mysql
scp -r /etc/mysql/ssl node3:/etc/mysql

# Change owner
node1: chown -R mysql:mysql /etc/mysql/ssl
node2: chown -R mysql:mysql /etc/mysql/ssl
node3: chown -R mysql:mysql /etc/mysql/ssl


Secure database and client connections.

Add the following lines to the my.cnf configuration file.
# MySQL Server
[mysqld]
ssl-ca=/etc/mysql/ssl/ca-cert.pem
ssl-cert=/etc/mysql/ssl/server-cert.pem
ssl-key=/etc/mysql/ssl/server-key.pem

# MySQL Client
[client]
ssl-ca=/etc/mysql/ssl/ca-cert.pem
ssl-cert=/etc/mysql/ssl/client-cert.pem
ssl-key=/etc/mysql/ssl/client-key.pem


Secure replication traffic.

Define the paths to the key, certificate, and certificate authority files. Galera Cluster will use these files for encrypting and decrypting replication traffic.

wsrep_provider_options="socket.ssl_key=/etc/mysql/ssl/server-key.pem;socket.ssl_cert=/etc/mysql/ssl/server-cert.pem;socket.ssl_ca=/etc/mysql/ssl/ca-cert.pem"


Enable SSL for mysqldump and Xtrabackup.

Create a user which requires SSL for connections.

MariaDB [(none)]> CREATE USER 'sstssl'@'localhost' IDENTIFIED BY 'sstssl';
Query OK, 0 rows affected (0.03 sec)

MariaDB [(none)]> GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstssl'@'localhost' REQUIRE ssl;
Query OK, 0 rows affected (0.02 sec)

MariaDB [(none)]> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

I will use this user for State Snapshot Transfers.
Change wsrep_sst_auth in the my.cnf configuration file.

wsrep_sst_auth="sstssl:sstssl"
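If you use xtrabackup as the wsrep_sst_method, the SST script reads its SSL settings from an [sst] section in my.cnf. A sketch reusing the certificates created above; the option names (encrypt, tkey, tcert, tca) come from the Percona XtraBackup-based SST scripts, so verify them against your version before relying on them:

# Hypothetical [sst] section for an XtraBackup-based SST over SSL
[sst]
encrypt=3
tkey=/etc/mysql/ssl/server-key.pem
tcert=/etc/mysql/ssl/server-cert.pem
tca=/etc/mysql/ssl/ca-cert.pem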


Now we must restart the whole cluster.
If I restart only one node while the others are running, the node won't join the existing cluster.
You will notice errors like these in the MySQL error log:

171017  3:20:29 [ERROR] WSREP: handshake with remote endpoint ssl://192.168.56.22:4567 failed: asio.ssl:336031996: 'unknown protocol' ( 336031996: 'error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol')
171017  3:20:29 [ERROR] WSREP: handshake with remote endpoint ssl://192.168.56.23:4567 failed: asio.ssl:336031996: 'unknown protocol' ( 336031996: 'error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol')
Shut down the whole cluster and bootstrap it again.


Check.
MariaDB [(none)]> status
--------------
mysql  Ver 15.1 Distrib 10.0.29-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Connection id:          87
Current database:
Current user:           marko@localhost
SSL:                    Cipher in use is DHE-RSA-AES256-SHA  ###==> SSL is used
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server:                 MariaDB
Server version:         10.0.17-MariaDB-1~trusty-wsrep-log mariadb.org binary distribution, wsrep_25.10.r4144
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:            /var/run/mysqld/mysqld.sock
Uptime:                 1 min 4 sec

Threads: 52  Questions: 676  Slow queries: 16  Opens: 167  Flush tables: 1  Open tables: 31  Queries per second avg: 10.562
--------------


MariaDB [(none)]> SHOW VARIABLES LIKE 'have_ssl';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| have_ssl      | YES   |
+---------------+-------+
1 row in set (0.01 sec)
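Besides the client status output, you can confirm the negotiated cipher and the server-side SSL settings directly; these are standard status and system variables:

-- Cipher negotiated for the current client connection
SHOW SESSION STATUS LIKE 'Ssl_cipher';

-- Server-side SSL configuration (certificate, key, CA paths)
SHOW GLOBAL VARIABLES LIKE 'ssl%';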



REFERENCES
6.4.3.2 Creating SSL Certificates and Keys Using openssl
MySQL : Configure SSL Connections

Thursday, September 28, 2017

Delete large amounts of data on Galera Cluster using pt-archiver

Galera Cluster is an excellent, virtually synchronous, multi-master database cluster. It has many benefits, which you can check on GaleraCluster.
But besides the benefits it has some limitations, and one of them is the handling of large transactions.

Large replication data sets can degrade the performance of the whole cluster, causing cluster freezes, increased memory consumption, crashing nodes, etc. To avoid these issues it is recommended to split large transactions into smaller chunks.

In this post I want to show you how to safely delete large amounts of data on a Galera Cluster. You can perform this task using several tools, or by writing custom procedures that split a large transaction into chunks. In this example I will use the pt-archiver tool from Percona.


Imagine you have received a task to perform a data cleanup in the devices table for several schemas.
It looks like a very simple task - delete rows from the devices table where device_cookie is 0.
delete from devices where device_cookie = 0

But although the statement looks simple, it could potentially freeze the whole cluster, so before executing the delete, count how many rows you need to delete.

In my case I have to delete a few million rows, which is too much for one transaction, so I need to split it into smaller chunks.

mysql> select count(*) from devices;
+----------+
| count(*) |
+----------+
|  2788504 |
+----------+

mysql> select count(*) - (select count(*) from devices where device_cookie = 0) 
       from devices;
+----------+
| count(*) |
+----------+
|      208 |
+----------+

I have to delete around 2.8 million rows.

This is the command I will use:
pt-archiver --source h=localhost,u=marko,p="passwd",D=sch_testdb,t=devices \
--purge --where "device_cookie = 0" --sleep-coef 1.0 --txn-size 1000

--purge - delete the rows.
--where "device_cookie = 0" - filter the rows you want to delete.
--sleep-coef 1.0 - throttle the delete process so the cluster's flow control can keep up.
--txn-size 1000 - the number of rows per transaction (chunk size).
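Conceptually, each pt-archiver transaction is equivalent to a small DELETE followed by a COMMIT. pt-archiver itself walks the table by primary key and deletes the rows it has fetched; the simplified LIMIT-based sketch below is not its exact statements, but it shows the idea and can serve as a manual fallback:

-- Simplified sketch of chunked deleting (not pt-archiver's exact SQL)
START TRANSACTION;
DELETE FROM devices WHERE device_cookie = 0 LIMIT 1000;
COMMIT;
-- Repeat while ROW_COUNT() > 0, sleeping between chunks
-- so Galera flow control is not triggered.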

# time pt-archiver --source h=localhost,u=marko,p="passwd",D=sch_testdb,t=devices \
--purge --where "device_cookie = 0" --sleep-coef 1.0 --txn-size 1000

real 3m32.532s
user 0m17.268s
sys 0m2.460s

Check after the delete has finished.
mysql> select count(*) from devices;
+----------+
| count(*) |
+----------+
|      208 |
+----------+
1 row in set (0.00 sec)

As I have to perform the delete for several schemas, I created a simple shell script that iterates through the schema list and executes the pt-archiver command.

# cat delete_rows.sh
#!/bin/bash

LOGFILE=/opt/skripte/schema/table_delete_rows.log
SCHEMA_LIST=/opt/skripte/schema/schema_list.conf

# Get schema list and populate conf file
mysql -B -u marko -ppasswd --disable-column-names --execute "select schema_name from information_schema.schemata where schema_name like 'sch_%' and schema_name <> 'sch_sys'" > $SCHEMA_LIST

while IFS= read -r schema; do

  START=`date +%s`

  echo "`date`=> Deleting rows from table in schema: $schema"

  pt-archiver --source h=localhost,u=marko,p="passwd",D=$schema,t=devices --purge --where "device_cookie = 0" --sleep-coef 1.0 --txn-size 500

  SPENT=$(((`date +%s` - $START) / 60))

  echo "`date`=> Finished deleting in schema - spent: $SPENT mins"
  echo "*************************************************************************"

done <$SCHEMA_LIST >> $LOGFILE

exit 0