Restore to Restore Point on Standard Edition (no Flashback technology)

Saturday, February 28, 2015
Restore points and Flashback Database are nice features, introduced in the 10g database, that provide efficient point-in-time recovery to reverse unwanted data changes.

But what if you have a Standard Edition database:

SQL> shutdown immediate;

SQL> startup mount;

SQL> alter database flashback on;
alter database flashback on
*
ERROR at line 1:
ORA-00439: feature not enabled: Flashback Database

In Standard Edition you don't have the Flashback Database feature, but you can still create restore points and perform incomplete recovery to a restore point.


Create a test table and insert a status row.

SQL> create table admin.test_restore (datum date, komentar varchar2(100));
Table created.

SQL> insert into admin.test_restore values (sysdate, 'Before Restore Point');
1 row created.

SQL> commit;
Commit complete.


Create a restore point here.

SQL> create restore point RP_UPGRADE;

Restore point created.


SQL> select scn, to_char(time,'dd.mm.yyyy hh24:mi:ss') time, name
  2  from v$restore_point;

       SCN TIME                NAME
---------- ------------------- ---------------------
    580752 27.02.2015 10:31:19 RP_UPGRADE

Notice how the name of the restore point is associated with an SCN of the database.
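Because a normal restore point is just a named SCN, the same incomplete recovery could also be expressed directly against that SCN. A sketch, using the SCN reported by v$restore_point above:

RMAN> restore database until scn 580752;
RMAN> recover database until scn 580752;

The restore point simply saves you from having to write the SCN down.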


Now you can perform potentially dangerous operations like database upgrades, table modifications, truncating data and the like.

I will enter some status data for later checks.

SQL> insert into admin.test_restore values (sysdate, 'After Restore Point');
1 row created.

SQL> insert into admin.test_restore values (sysdate, 'Upgrade actions performed');
1 row created.

SQL> commit;
Commit complete.


Check table.

SQL> alter session set nls_date_format='dd.mm.yyyy hh24:mi:ss';
Session altered.

SQL> select datum, komentar from admin.test_restore order by datum;

DATUM               KOMENTAR
------------------- ------------------------------
27.02.2015 10:30:39 Before Restore Point
27.02.2015 10:31:45 After Restore Point
27.02.2015 10:31:55 Upgrade actions performed


Suppose we had some problems and want to "rewind" the database to the restore point. In EE we would perform Flashback Database to the restore point, but in SE we will use a different approach.


Shut down the database and start it in mount mode.

RMAN> shutdown immediate;

using target database control file instead of recovery catalog
database closed
database dismounted
Oracle instance shut down

RMAN> startup mount;

connected to target database (not started)
Oracle instance started
database mounted

Total System Global Area     471830528 bytes

Fixed Size                     2254344 bytes
Variable Size                247466488 bytes
Database Buffers             213909504 bytes
Redo Buffers                   8200192 bytes


Restore and recover the database until the restore point RP_UPGRADE.

RMAN> restore database until restore point RP_UPGRADE;

Starting restore at 27.02.2015 10:36:26
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=247 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to +DATA1/ora11gr2/datafile/system.291.872722799
channel ORA_DISK_1: restoring datafile 00002 to +DATA1/ora11gr2/datafile/sysaux.292.872722847
channel ORA_DISK_1: restoring datafile 00003 to +DATA1/ora11gr2/datafile/undotbs1.278.872722879
channel ORA_DISK_1: restoring datafile 00004 to +DATA1/ora11gr2/datafile/users.296.872722925
channel ORA_DISK_1: reading from backup piece +FRA1/ora11gr2/backupset/2015_02_27/nnndf0_tag20150227t102559_0.1164.872763961
channel ORA_DISK_1: piece handle=+FRA1/ora11gr2/backupset/2015_02_27/nnndf0_tag20150227t102559_0.1164.872763961 tag=TAG20150227T102559
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:35
Finished restore at 27.02.2015 10:38:02

RMAN> recover database until restore point RP_UPGRADE;

Starting recover at 27.02.2015 10:38:45
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 27.02.2015 10:38:49


Open the database with the RESETLOGS option.

RMAN> sql 'alter database open resetlogs';

sql statement: alter database open resetlogs


Final check.

SQL> alter session set nls_date_format='dd.mm.yyyy hh24:mi:ss';

Session altered.

SQL> select datum, komentar
  2  from admin.test_restore
  3  order by datum;

DATUM               KOMENTAR
------------------- --------------------------------------------------
27.02.2015 10:30:39 Before Restore Point


We "rewound" database to state that existed before RP_UPGRADE restore point is created.
This was incomplete recovery and RP_UPGRADE restore point was used just to mark location in time.




12c Migrate Database from non-ASM to ASM using online relocation of data files

Wednesday, February 18, 2015
There are many articles explaining how to migrate a database from a file system into ASM. You could use RMAN to create an image copy of the database in ASM and switch to that copy, restore the database from backup sets into ASM, or create a duplicate database.

All of these RMAN features are available in Oracle versions before 12c.

In this post I will use a slightly different approach - online relocation of data files into ASM.

Renaming and relocating online data files is available from 12c onwards and enables us to rename or relocate a data file while the database is open and users are accessing it. This simplifies data file management and avoids downtime even when moving the SYSTEM or UNDO tablespaces.
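The move itself is a single ALTER DATABASE statement. A minimal sketch of the syntax and its options as I understand them from the 12c documentation (with OMF/ASM the target file name is generated when you specify only a diskgroup; the /u02 path below is just a hypothetical file-system target):

-- move the file and remove the source when finished (the default)
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/users01.dbf' TO '+DATA';

-- KEEP retains the original file as a copy on the file system
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/users01.dbf' TO '+DATA' KEEP;

-- REUSE allows overwriting an existing file with the target name
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/users01.dbf' TO '/u02/oradata/users01.dbf' REUSE;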


This is the 12.1 single-tenant pluggable database (one CDB with one PDB) I want to migrate from the file system to ASM:

RMAN> report schema;

using target database control file instead of recovery catalog
Report of database schema for database with db_unique_name CDB12C

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    800      SYSTEM               YES     /u01/app/oracle12/oradata/cdb12c/system01.dbf
2    260      PDB$SEED:SYSTEM      NO      /u01/app/oracle12/oradata/cdb12c/pdbseed/system01.dbf
3    740      SYSAUX               NO      /u01/app/oracle12/oradata/cdb12c/sysaux01.dbf
4    585      PDB$SEED:SYSAUX      NO      /u01/app/oracle12/oradata/cdb12c/pdbseed/sysaux01.dbf
5    740      UNDOTBS1             YES     /u01/app/oracle12/oradata/cdb12c/undotbs01.dbf
6    5        USERS                NO      /u01/app/oracle12/oradata/cdb12c/users01.dbf
7    270      PDB:SYSTEM           NO      /u01/app/oracle12/oradata/cdb12c/pdb/system01.dbf
8    605      PDB:SYSAUX           NO      /u01/app/oracle12/oradata/cdb12c/pdb/sysaux01.dbf
9    5        PDB:USERS            NO      /u01/app/oracle12/oradata/cdb12c/pdb/pdb_users01.dbf
10   50       PDB:MARKO            NO      /u01/app/oracle12/oradata/cdb12c/pdb/marko01.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    100      TEMP                 100         /u01/app/oracle12/oradata/cdb12c/temp01.dbf
2    62       PDB$SEED:TEMP        32767       /u01/app/oracle12/oradata/cdb12c/pdbseed/temp01.dbf
3    100      PDB:TEMP             100         /u01/app/oracle12/oradata/cdb12c/pdb/temp01.dbf



Create a script to move the data files:

$ sqlplus -s / as sysdba
set lines 200
set pages 50
set feed off
set head off
spool /tmp/move_dbfiles.sql
select 'ALTER DATABASE MOVE DATAFILE '''||name||''' TO ''+DATA'';' from v$datafile
order by con_id;

ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/system01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/undotbs01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/users01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/sysaux01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdbseed/sysaux01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdbseed/system01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/marko01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/pdb_users01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/system01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/sysaux01.dbf' TO '+DATA';
spool off
exit

Now, if you just run this script you will get some errors, because you can't move a PDB's files unless your session is in that PDB's container. Also, the PDB must be open.


After slightly editing the script, this is the final version:

$ cat /tmp/move_dbfiles.sql

ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/system01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/undotbs01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/users01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/sysaux01.dbf' TO '+DATA';

ALTER SESSION SET CONTAINER=pdb$seed;

ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdbseed/sysaux01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdbseed/system01.dbf' TO '+DATA';

ALTER SESSION SET CONTAINER=pdb;
ALTER DATABASE OPEN;

ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/marko01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/pdb_users01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/system01.dbf' TO '+DATA';
ALTER DATABASE MOVE DATAFILE '/u01/app/oracle12/oradata/cdb12c/pdb/sysaux01.dbf' TO '+DATA';


Let’s execute this script:

$ sqlplus / as sysdba

SQL> @/tmp/move_dbfiles.sql

Database altered.
Database altered.
Database altered.
Database altered.
Session altered.
Database altered.
Database altered.
Session altered.
Database altered.
Database altered.
Database altered.
Database altered.
Database altered.

The data files are now migrated to ASM without any downtime. Very simple.
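For large data files the move can take a while. Progress should be visible in V$SESSION_LONGOPS; a quick sketch (the 'Online data file move' OPNAME is what I'd expect for this operation - check it on your version):

SQL> select opname, sofar, totalwork, units
  2  from v$session_longops
  3  where opname like 'Online data file move%'
  4  and sofar <> totalwork;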


Notice that my temporary tablespaces are still on the file system.

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    100      TEMP                 100         /u01/app/oracle12/oradata/cdb12c/temp01.dbf
2    62       PDB$SEED:TEMP        32767       /u01/app/oracle12/oradata/cdb12c/pdbseed/temp01.dbf
3    100      PDB:TEMP             100         /u01/app/oracle12/oradata/cdb12c/pdb/temp01.dbf


I cannot use the MOVE DATAFILE command as above to move temp files, so I will use a slightly different approach - add a new temp file in ASM and drop the old one from the file system.

SQL> alter tablespace TEMP add tempfile '+DATA';
Tablespace altered.

SQL> alter tablespace TEMP drop tempfile '/u01/app/oracle12/oradata/cdb12c/temp01.dbf';
Tablespace altered.


SQL> alter session set container=pdb;
Session altered.

SQL> alter tablespace TEMP add tempfile '+DATA';
Tablespace altered.

SQL> alter tablespace TEMP drop tempfile '/u01/app/oracle12/oradata/cdb12c/pdb/temp01.dbf';
Tablespace altered.


The tricky part is moving the temp file of the TEMP tablespace in PDB$SEED, which is used for the creation of pluggable databases. I found this solution in a blog post by Jože Senegačnik.

SQL> alter session set container=CDB$ROOT;
Session altered.

SQL> alter session set "_oracle_script"=TRUE;
Session altered.

SQL> alter pluggable database pdb$seed close;
Pluggable database altered.

SQL> alter pluggable database pdb$seed open read write;
Pluggable database altered.

SQL> alter session set container=pdb$seed;
Session altered.

SQL> alter tablespace temp add tempfile '+DATA';
Tablespace altered.

SQL>  alter tablespace temp drop tempfile '/u01/app/oracle12/oradata/cdb12c/pdbseed/temp01.dbf';
Tablespace altered.

SQL> alter session set container=CDB$ROOT;
Session altered.

SQL> alter pluggable database pdb$seed close;
Pluggable database altered.

SQL> alter pluggable database pdb$seed open read only;
Pluggable database altered.



All my data files and temporary files are now migrated into ASM.

RMAN> report schema;

using target database control file instead of recovery catalog
Report of database schema for database with db_unique_name CDB12C

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    800      SYSTEM               YES     +DATA/CDB12C/DATAFILE/system.259.871936371
2    260      PDB$SEED:SYSTEM      NO      +DATA/CDB12C/0C366F6BD5E01C14E055000000000001/DATAFILE/system.265.871936527
3    740      SYSAUX               NO      +DATA/CDB12C/DATAFILE/sysaux.263.871936455
4    585      PDB$SEED:SYSAUX      NO      +DATA/CDB12C/0C366F6BD5E01C14E055000000000001/DATAFILE/sysaux.256.871936497
5    740      UNDOTBS1             YES     +DATA/CDB12C/DATAFILE/undotbs1.257.871936413
6    5        USERS                NO      +DATA/CDB12C/DATAFILE/users.258.871936451
7    270      PDB:SYSTEM           NO      +DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/system.269.871936549
8    605      PDB:SYSAUX           NO      +DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/sysaux.266.871936565
9    5        PDB:USERS            NO      +DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/users.270.871936547
10   50       PDB:MARKO            NO      +DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/marko.271.871936543

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
2    100      PDB$SEED:TEMP        32767       +DATA/CDB12C/0C366F6BD5E01C14E055000000000001/TEMPFILE/temp.264.871937389
4    100      TEMP                 32767       +DATA/CDB12C/TEMPFILE/temp.268.871936871
5    100      PDB:TEMP             32767       +DATA/CDB12C/0C38EB3894542836E055000000000001/TEMPFILE/temp.267.871936909
7    100      PDB$SEED:TEMP        32767       +DATA/CDB12C/0C366F6BD5E01C14E055000000000001/TEMPFILE/temp.262.871937843



Move redo log files into ASM:

SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
/u01/app/oracle12/oradata/cdb12c/redo01.log
/u01/app/oracle12/oradata/cdb12c/redo02.log
/u01/app/oracle12/oradata/cdb12c/redo03.log


SQL> select group#, status from v$log;

    GROUP# STATUS
---------- ----------------
         1 INACTIVE
         2 INACTIVE
         3 CURRENT

SQL> alter database drop logfile group 1;

Database altered.

SQL> alter database add logfile group 1 '+DATA';

Database altered.

SQL> alter database drop logfile group 2;

Database altered.

SQL> alter database add logfile group 2 '+DATA';

Database altered.

SQL> alter system switch logfile;

System altered.

SQL> alter system checkpoint;

System altered.

SQL> alter database drop logfile group 3;

Database altered.

SQL> alter database add logfile group 3 '+DATA';

Database altered.


SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
+DATA/CDB12C/ONLINELOG/group_1.261.871938327
+DATA/CDB12C/ONLINELOG/group_2.260.871938351
+DATA/CDB12C/ONLINELOG/group_3.272.871938381


All these operations were done online, while the database was up and running.

Now comes the part where some downtime is needed - moving the control files from the file system into ASM.


$ srvctl stop database -d cdb12c


RMAN> startup nomount;

Oracle instance started

Total System Global Area     838860800 bytes

Fixed Size                     2929936 bytes
Variable Size                570428144 bytes
Database Buffers             260046848 bytes
Redo Buffers                   5455872 bytes

RMAN> restore controlfile to '+DATA' from '/u01/app/oracle12/oradata/cdb12c/control01.ctl';

Starting restore at 17-FEB-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=12 device type=DISK

channel ORA_DISK_1: copied control file copy
Finished restore at 17-FEB-15

RMAN> restore controlfile to '+FRA' from '/u01/app/oracle12/oradata/cdb12c/control01.ctl';

Starting restore at 17-FEB-15
using channel ORA_DISK_1

channel ORA_DISK_1: copied control file copy
Finished restore at 17-FEB-15




ASMCMD [+] > find --type CONTROLFILE +DATA *
+DATA/CDB12C/CONTROLFILE/current.273.871940441
ASMCMD [+] > find --type CONTROLFILE +FRA *
+FRA/CDB12C/CONTROLFILE/current.256.871940457



SQL> alter system set control_files='+DATA/CDB12C/CONTROLFILE/current.273.871940441','+FRA/CDB12C/CONTROLFILE/current.256.871940457' scope=spfile;

System altered.

SQL> shutdown abort;
ORACLE instance shut down.



Put the parameter file into ASM as well, and start the database.

$ srvctl start database -d cdb12c -o mount


RMAN> run
2> {
3> BACKUP AS BACKUPSET SPFILE;
4> RESTORE SPFILE TO "+DATA/CDB12C/spfilecdb12c.ora";
5> }

Starting backup at 17-FEB-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=14 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 17-FEB-15
channel ORA_DISK_1: finished piece 1 at 17-FEB-15
piece handle=/u01/app/oracle12/fast_recovery_area/CDB12C/backupset/2015_02_17/o1_mf_nnsnf_TAG20150217T214744_bg7b7jkg_.bkp tag=TAG20150217T214744 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 17-FEB-15

Starting Control File and SPFILE Autobackup at 17-FEB-15
piece handle=/u01/app/oracle12/fast_recovery_area/CDB12C/autobackup/2015_02_17/o1_mf_s_871939236_bg7b7m1p_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 17-FEB-15

Starting restore at 17-FEB-15
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: restoring SPFILE
output file name=+DATA/CDB12C/spfilecdb12c.ora
channel ORA_DISK_1: reading from backup piece /u01/app/oracle12/fast_recovery_area/CDB12C/autobackup/2015_02_17/o1_mf_s_871939236_bg7b7m1p_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle12/fast_recovery_area/CDB12C/autobackup/2015_02_17/o1_mf_s_871939236_bg7b7m1p_.bkp tag=TAG20150217T214745
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 17-FEB-15



$ srvctl modify database -db cdb12c -spfile '+DATA/CDB12C/spfilecdb12c.ora';
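Before bouncing the database, you can verify what Oracle Restart now has registered (the output should list the new spfile location):

$ srvctl config database -db cdb12c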



$ srvctl stop database -db cdb12c
$ srvctl start database -db cdb12c



Final checks:

SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DATA/CDB12C/spfilecdb12c.ora

SQL> set lines 200
SQL> set pages 999
SQL> select name from v$datafile union all
  2  select name from v$tempfile union all
  3  select member from v$logfile union all
  4  select name from v$controlfile;

NAME
-----------------------------------------------------------------------------
+DATA/CDB12C/DATAFILE/system.259.871936371
+DATA/CDB12C/0C366F6BD5E01C14E055000000000001/DATAFILE/system.265.871936527
+DATA/CDB12C/DATAFILE/sysaux.263.871936455
+DATA/CDB12C/0C366F6BD5E01C14E055000000000001/DATAFILE/sysaux.256.871936497
+DATA/CDB12C/DATAFILE/undotbs1.257.871936413
+DATA/CDB12C/DATAFILE/users.258.871936451
+DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/system.269.871936549
+DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/sysaux.266.871936565
+DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/users.270.871936547
+DATA/CDB12C/0C38EB3894542836E055000000000001/DATAFILE/marko.271.871936543
+DATA/CDB12C/0C366F6BD5E01C14E055000000000001/TEMPFILE/temp.264.871937389
+DATA/CDB12C/TEMPFILE/temp.268.871936871
+DATA/CDB12C/0C38EB3894542836E055000000000001/TEMPFILE/temp.267.871936909
+DATA/CDB12C/0C366F6BD5E01C14E055000000000001/TEMPFILE/temp.262.871937843
+DATA/CDB12C/ONLINELOG/group_1.261.871938327
+DATA/CDB12C/ONLINELOG/group_2.260.871938351
+DATA/CDB12C/ONLINELOG/group_3.272.871938381
+DATA/CDB12C/CONTROLFILE/current.273.871940441
+FRA/CDB12C/CONTROLFILE/current.256.871940457

19 rows selected.




REFERENCES
http://docs.oracle.com/database/121/ADMIN/dfiles.htm#ADMIN012
http://joze-senegacnik.blogspot.com/2015/02/maintaining-tempfile-in-temp-tablespace.html



MariaDB - Measure Replication Lag and Check / Fix Replication Inconsistencies using Percona tools

Thursday, February 5, 2015
Percona Toolkit is a collection of command-line tools that perform many MySQL tasks, like creating backups, finding duplicate indexes, and managing replication.

In this post I will talk about how to measure replication lag and check/fix replication inconsistencies with these tools:
pt-heartbeat
pt-table-checksum
pt-table-sync


I am using the environment from my previous blog post:
master-master replication with a MariaDB 10.0.16 database on Debian 7.


Install Percona Toolkit on both nodes:

$ sudo wget percona.com/get/percona-toolkit.deb

$ sudo apt-get install libterm-readkey-perl
$ sudo dpkg -i percona-toolkit.deb  


I will create a percona database to store the tables needed for various checks. I will also create a percona user which will be used with the Percona tools.



MASTER1

MariaDB [(none)]> create database percona;

MariaDB [(none)]> grant all privileges on *.* to 'percona'@'master1.localdomain' identified by 'percona';

MariaDB [(none)]> grant all privileges on *.* to 'percona'@'localhost' identified by 'percona';

MariaDB [(none)]> flush privileges;


MASTER2

MariaDB [(none)]> grant all privileges on *.* to 'percona'@'master2.localdomain' identified by 'percona';

MariaDB [(none)]> grant all privileges on *.* to 'percona'@'localhost' identified by 'percona';

MariaDB [(none)]> flush privileges;




MONITOR REPLICATION LAG


So, I have replication running and I want to be sure that everything is working fine.
The typical method to monitor replication lag is to run SHOW SLAVE STATUS and look at Seconds_Behind_Master. But Seconds_Behind_Master is not always accurate.

Percona Toolkit has a tool to monitor replication delay called pt-heartbeat.

We must create a heartbeat table on the master, either manually or using the --create-table option, and the heartbeat table must contain one heartbeat row. pt-heartbeat updates this table at the interval we specify; the slave actively checks the table and calculates the time delay.
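Under the hood the lag is simply the difference between the slave's clock and the last ts value written on the master. A rough manual approximation on the slave - just a sketch, assuming the default heartbeat table layout and that both servers' clocks and time zones match (pt-heartbeat also has a --utc option):

MariaDB [(none)]> SELECT TIMESTAMPDIFF(SECOND, ts, NOW()) AS lag_seconds
    -> FROM percona.heartbeat
    -> ORDER BY ts DESC LIMIT 1;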


Create the heartbeat table and start a daemonized process to update the percona.heartbeat table.


MASTER1

$ pt-heartbeat -upercona -ppercona -D percona --update master1 --daemonize --create-table


MASTER2
Start pt-heartbeat.

$ pt-heartbeat -upercona -ppercona --update --database percona


MASTER1
Monitor replication slave lag.

$ pt-heartbeat -upercona -ppercona -D percona --monitor -h master2
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.01s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]





CHECK REPLICATION INCONSISTENCIES


If we want to check replication integrity we can use the pt-table-checksum tool.

Run the tool on the master server. It will automatically detect the slave servers and connect to them to do some safety checks. After that it runs checksums on the tables of the master database and writes the results to the checksum table - in this example percona.checksums, which you can inspect at any time. These results are then compared with the results on the slave to determine whether the data differs.

If there are no differing rows in a table between the master and the slave database, the DIFFS column will show 0.


$ pt-table-checksum -upercona -ppercona --create-replicate-table --replicate percona.checksums --databases testdb -h master2
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
02-02T20:58:15      0      0        5       1       0   1.134 testdb.users


MASTER2

MariaDB [testdb]> create table address (id int auto_increment primary key, city varchar(30));
Query OK, 0 rows affected (0.06 sec)

MariaDB [testdb]> insert into address (city) values ('New York');
Query OK, 1 row affected (0.07 sec)

MariaDB [testdb]> insert into address (city) values ('LA');
Query OK, 1 row affected (0.06 sec)

MariaDB [testdb]> insert into address (city) values ('Zagreb');
Query OK, 1 row affected (0.13 sec)


MASTER1

$ pt-table-checksum -upercona -ppercona --replicate percona.checksums --databases testdb -h master2
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
02-02T20:59:16      0      0        3       1       0   1.032 testdb.address
02-02T20:59:17      0      0        5       1       0   1.120 testdb.users


$ pt-table-checksum -upercona -ppercona --replicate=percona.checksums --replicate-check-only --databases=testdb master1


No output was produced, which means that the testdb database is in sync with the slave.


Insert some test data:

MariaDB [testdb]> create table animals (id int not null auto_increment,
    -> name char(30) not null,
    -> primary key(id));
Query OK, 0 rows affected (0.04 sec)

MariaDB [testdb]> insert into animals (name) values ('dog'),('cat'),('whale');
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0


MariaDB [testdb]> create table countries (id int not null auto_increment,
    -> name varchar(30),
    -> primary key(id));
Query OK, 0 rows affected (0.09 sec)

MariaDB [testdb]> insert into countries(name) values ('Croatia'),('England'),('USA'),('Island');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

MariaDB [testdb]> select * from animals;
+----+-------+
| id | name  |
+----+-------+
|  1 | dog   |
|  2 | cat   |
|  3 | whale |
+----+-------+
3 rows in set (0.00 sec)

MariaDB [testdb]>  select * from countries;
+----+---------+
| id | name    |
+----+---------+
|  1 | Croatia |
|  2 | England |
|  3 | USA     |
|  4 | Island  |
+----+---------+
4 rows in set (0.00 sec)


Check if the database is in sync:

$ pt-table-checksum -upercona -ppercona --create-replicate-table --replicate percona.checksums --databases testdb -h master1
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
02-02T21:03:49      0      0        3       1       0   0.177 testdb.address
02-02T21:03:49      0      0        3       1       0   0.045 testdb.animals
02-02T21:03:49      0      0        4       1       0   0.049 testdb.countries
02-02T21:03:49      0      0        5       1       0   0.037 testdb.users





RESYNC REPLICA FROM THE MASTER


Let's make the database on MASTER2 out of sync and create some differences between the databases.


MASTER2

Instead of stopping the replication process, I will temporarily disable binary logging on the MASTER2 server.

MariaDB [testdb]> SET SQL_LOG_BIN=0;
Query OK, 0 rows affected (0.00 sec)


Make some data modifications.

MariaDB [testdb]> insert into animals (name) values ('Ostrich'),('Penguin');
Query OK, 2 rows affected (0.04 sec)
Records: 2  Duplicates: 0  Warnings: 0

MariaDB [testdb]> delete from countries where id=2;
Query OK, 1 row affected (0.01 sec)


MariaDB [testdb]> create table colors (name varchar(30));
Query OK, 0 rows affected (0.10 sec)

MariaDB [testdb]> insert into colors(name) values ('Red'),('Blue');
Query OK, 2 rows affected (0.02 sec)
Records: 2  Duplicates: 0  Warnings: 0


Enable binary logging again.

MariaDB [testdb]> SET SQL_LOG_BIN=1;
Query OK, 0 rows affected (0.00 sec)



MASTER1

MariaDB [testdb]> select * from animals;
+----+-------+
| id | name  |
+----+-------+
|  1 | dog   |
|  2 | cat   |
|  3 | whale |
+----+-------+
3 rows in set (0.00 sec)

MariaDB [testdb]> select * from countries;
+----+---------+
| id | name    |
+----+---------+
|  1 | Croatia |
|  2 | England |
|  3 | USA     |
|  4 | Island  |
+----+---------+
4 rows in set (0.00 sec)

MariaDB [testdb]> show tables;
+------------------+
| Tables_in_testdb |
+------------------+
| address          |
| animals          |
| countries        |
| users            |
+------------------+
4 rows in set (0.00 sec)


MASTER2

MariaDB [testdb]> select * from animals;
+----+---------+
| id | name    |
+----+---------+
|  1 | dog     |
|  2 | cat     |
|  3 | whale   |
|  4 | Ostrich |
|  5 | Penguin |
+----+---------+
5 rows in set (0.00 sec)

MariaDB [testdb]> select * from countries;
+----+---------+
| id | name    |
+----+---------+
|  1 | Croatia |
|  3 | USA     |
|  4 | Island  |
+----+---------+
3 rows in set (0.00 sec)

MariaDB [testdb]> show tables;
+------------------+
| Tables_in_testdb |
+------------------+
| address          |
| animals          |
| colors           |
| countries        |
| users            |
+------------------+
5 rows in set (0.00 sec)


Notice that there are now inconsistencies between the databases, and there isn't any built-in tool that will notify us about that. Replication keeps working fine, even though the replica has different data than the master.

With pt-table-checksum we will check for data differences between the databases.


MASTER1

$ pt-table-checksum -upercona -ppercona --create-replicate-table --replicate percona.checksums --databases testdb -h master1
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
02-02T21:11:23      0      0        3       1       0   0.106 testdb.address
02-02T21:11:23      0      1        3       1       0   0.053 testdb.animals
02-02T21:11:24      0      1        4       1       0   0.046 testdb.countries
02-02T21:11:24      0      0        5       1       0   0.042 testdb.users


$ pt-table-checksum -upercona -ppercona --replicate=percona.checksums --replicate-check-only --databases=testdb master1
Differences on master2
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
testdb.animals 1 2 1
testdb.countries 1 -1 1

Notice how the tool reported the differences in the DIFFS column.


Synchronizing data between servers in a master-master configuration is not a trivial task. You have to think about which process is changing data where, and be very careful to avoid data corruption.

In a master-master configuration data changes are replicated between the nodes, so statements executed on the "slave" node are replicated back to the master.

Maybe the best approach would be to stop replication, restore the replica from a backup or re-clone the whole server, and start replication again. You can also dump only the affected data with mysqldump and reload it.


As this is my testing environment, I will try to resolve the differences using the pt-table-sync tool from Percona Toolkit.


First I will run the tool with the --print option, which only displays the queries that would resolve the differences. I will inspect those queries before executing them on the slave server.
These queries could also be executed manually.

$ pt-table-sync -upercona -ppercona --sync-to-master --databases testdb --transaction --lock=1 --verbose master2 --print

# Syncing h=master2,p=...,u=percona
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      0      0 Chunk     22:13:17 22:13:17 0    testdb.address
DELETE FROM `testdb`.`animals` WHERE `id`='4' LIMIT 1 /*percona-toolkit src_db:testdb src_tbl:animals src_dsn:P=3306,h=master1,p=...,u=percona dst_db:testdb dst_tbl:animals dst_dsn:h=master2,p=...,u=percona lock:1 transaction:1 changing_src:1 replicate:0 bidirectional:0 pid:7723 user:msutic host:master1*/;
DELETE FROM `testdb`.`animals` WHERE `id`='5' LIMIT 1 /*percona-toolkit src_db:testdb src_tbl:animals src_dsn:P=3306,h=master1,p=...,u=percona dst_db:testdb dst_tbl:animals dst_dsn:h=master2,p=...,u=percona lock:1 transaction:1 changing_src:1 replicate:0 bidirectional:0 pid:7723 user:msutic host:master1*/;
#      2       0      0      0 Chunk     22:13:17 22:13:17 2    testdb.animals
REPLACE INTO `testdb`.`countries`(`id`, `name`) VALUES ('2', 'England') /*percona-toolkit src_db:testdb src_tbl:countries src_dsn:P=3306,h=master1,p=...,u=percona dst_db:testdb dst_tbl:countries dst_dsn:h=master2,p=...,u=percona lock:1 transaction:1 changing_src:1 replicate:0 bidirectional:0 pid:7723 user:msutic host:master1*/;
#      0       1      0      0 Chunk     22:13:17 22:13:17 2    testdb.countries
#      0       0      0      0 Chunk     22:13:17 22:13:17 0    testdb.users


Set the --execute option to execute those queries.
With the --sync-to-master option the MASTER2 server is treated as a slave.


$ pt-table-sync -upercona -ppercona --sync-to-master --databases testdb --transaction --lock=1 --verbose master2 --execute

# Syncing h=master2,p=...,u=percona
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      0      0 Chunk     22:19:51 22:19:51 0    testdb.address
#      2       0      0      0 Chunk     22:19:51 22:19:51 2    testdb.animals
#      0       1      0      0 Chunk     22:19:51 22:19:51 2    testdb.countries
#      0       0      0      0 Chunk     22:19:51 22:19:51 0    testdb.users


The output shows that the differences were successfully resolved with two DELETE operations and one REPLACE operation on the specified tables.

Let's run another check to verify whether any differences still exist.

$ pt-table-checksum -upercona -ppercona --create-replicate-table --replicate percona.checksums --databases testdb -h master1

            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
02-02T22:21:30      0      0        3       1       0   0.549 testdb.address
02-02T22:21:30      0      0        3       1       0   0.048 testdb.animals
02-02T22:21:30      0      0        4       1       0   0.043 testdb.countries
02-02T22:21:30      0      0        5       1       0   0.049 testdb.users

The DIFFS column shows only 0s, which means that the tables are in sync.




What if I run the checksums on the MASTER2 server?


MASTER2

$ pt-table-checksum -upercona -ppercona --create-replicate-table --replicate percona.checksums --databases testdb -h master2

            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
02-02T22:24:16      0      0        3       1       0   0.072 testdb.address
02-02T22:24:16      0      0        3       1       0   0.048 testdb.animals
02-02T22:24:16 Skipping table testdb.colors because it has problems on these replicas:
Table testdb.colors does not exist on replica master1
This can break replication.  If you understand the risks, specify --no-check-slave-tables to disable this check.
02-02T22:24:16 Error checksumming table testdb.colors: DBD::mysql::db selectrow_hashref failed: Table 'testdb.colors' doesn't exist [for Statement "EXPLAIN SELECT * FROM `testdb`.`colors` WHERE 1=1"] at /usr/bin/pt-table-checksum line 6595.

02-02T22:24:16      1      0        0       0       0   0.003 testdb.colors
02-02T22:24:16      0      0        4       1       0   0.044 testdb.countries
02-02T22:24:16      0      0        5       1       0   0.043 testdb.users


The output shows an error because the table testdb.colors exists on MASTER2 but not on MASTER1.

I know that MASTER1 has the "correct" data, so I will just drop the testdb.colors table on the MASTER2 node.

MariaDB [testdb]> drop table if exists testdb.colors;
Query OK, 0 rows affected (0.05 sec)


Run the check again:

$ pt-table-checksum -upercona -ppercona --create-replicate-table --replicate percona.checksums --databases testdb -h master2
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
02-02T22:26:43      0      0        3       1       0   0.322 testdb.address
02-02T22:26:43      0      0        3       1       0   0.056 testdb.animals
02-02T22:26:43      0      0        4       1       0   0.050 testdb.countries
02-02T22:26:43      0      0        5       1       0   0.045 testdb.users


Now the databases are in sync.



If we use the --quiet option, the tool prints a row for a table only if there are differences. This is a nice way to run the tool from a cron job and send mail only when there is a non-zero exit status.

$ pt-table-checksum -upercona -ppercona --create-replicate-table --replicate percona.checksums --databases testdb -h master1 --quiet
(no rows)
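A minimal cron sketch along those lines - the schedule, user, and mail address are hypothetical:

# /etc/cron.d/pt-checksum - nightly consistency check, mail only on a non-zero exit
0 2 * * * root pt-table-checksum -upercona -ppercona --replicate percona.checksums --databases testdb -h master1 --quiet || echo "pt-table-checksum reported differences or errors" | mail -s "replication checksum alert" dba@example.com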




REFERENCES
http://www.percona.com/doc/percona-toolkit/2.2/pt-table-sync.html
http://www.percona.com/doc/percona-toolkit/2.2/pt-table-checksum.html
http://www.percona.com/software/percona-toolkit



MariaDB(MySQL) Master-Master Replication

Sunday, February 1, 2015
The simplest and probably most common replication method is master-slave replication. Basically, data is replicated from the master database to the slave. In case of a master database failure you must get the slave database up to date before failover, and then promote the slave to be the new master.

Another method is to set up replication in both directions, called master-master replication. But you must be aware that this setup brings some potential issues, as data changes happen on both nodes. It can be a problem if you have tables with AUTO_INCREMENT columns: if both servers insert or update rows in the same table, replication will break on one server due to a "duplicate entry" error. To resolve this issue you have the auto_increment_increment and auto_increment_offset settings, as shown below.
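For example, with two masters these settings make each node generate values from its own series - a sketch for /etc/mysql/my.cnf (node 1 would produce 1,3,5,... and node 2 would produce 2,4,6,...):

# MASTER1
auto_increment_increment = 2
auto_increment_offset    = 1

# MASTER2
auto_increment_increment = 2
auto_increment_offset    = 2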

In my case it's best to use the master-master setup as active-passive replication. If we know that only one node is performing data modifications, we can avoid many possible problems. In case of failover the "slave" can easily be promoted to the new master, and data modifications are automatically replicated to the failed node when it comes back up.

Of course, this simple setup is not suitable for all situations and it has its drawbacks, but luckily you have several other options at your disposal, like MariaDB Galera Cluster.



Servers setup:
OS: Debian 7.8
DB: MariaDB 10.0.16



Install MariaDB 10 (both nodes).

$ sudo apt-get install python-software-properties
$ sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xcbcb082a1bb943db
$ sudo add-apt-repository 'deb http://mirror3.layerjet.com/mariadb/repo/10.0/debian wheezy main'
$ sudo apt-get update
$ sudo apt-get install mariadb-server


Stop MariaDB on both nodes:
$ sudo service mysql stop


MASTER1

Edit the /etc/mysql/my.cnf parameter file.

# bind-address          = 127.0.0.1
server-id               = 61
report_host             = master1
log_bin                 = /var/log/mysql/mariadb-bin
log_bin_index           = /var/log/mysql/mariadb-bin.index
relay_log               = /var/log/mysql/relay-bin
relay_log_index         = /var/log/mysql/relay-bin.index
# replicate-do-db       = testdb


# bind-address = 127.0.0.1
By default MySQL accepts connections only from the local host. We comment this line out to enable connections from other hosts. This is important for replication to work.

server-id = 61
report_host = master1
Choose an ID that will uniquely identify your host. I will use the last two digits of each server's IP address. Optionally you can set the report_host parameter so that the servers report their hostnames to each other.

log_bin = /var/log/mysql/mariadb-bin
log_bin_index = /var/log/mysql/mariadb-bin.index
Enable binary logging.

relay_log = /var/log/mysql/relay-bin
relay_log_index = /var/log/mysql/relay-bin.index
Enable the creation of relay log files. Events that are read from the master's binary log are written to the slave relay log.

replicate-do-db = testdb
With this parameter we tell MariaDB which databases to replicate. This parameter is optional.




Now we can start the MariaDB server.

$ sudo service mysql start


Log in as root and create the user that will be used for replicating data between our servers. Grant the appropriate privileges to the user.

$ sudo mysql -uroot -p
MariaDB [(none)]> create user 'replusr'@'%' identified by 'replusr';
MariaDB [(none)]> grant replication slave on *.* to 'replusr'@'%';


As the last step, check the status information about the binary log files, as we will use this information to start replication on the other node.

MariaDB [(none)]> show master status;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000009 |      634 |              |                  |
+--------------------+----------+--------------+------------------+



MASTER2

Edit the /etc/mysql/my.cnf parameter file. Note the different server-id and report_host values:

# bind-address          = 127.0.0.1
server-id               = 62
report_host             = master2
log_bin                 = /var/log/mysql/mariadb-bin
log_bin_index           = /var/log/mysql/mariadb-bin.index
relay_log               = /var/log/mysql/relay-bin
relay_log_index         = /var/log/mysql/relay-bin.index
# replicate-do-db       = testdb


Start the MariaDB server.

$ sudo service mysql start


Create the user which will be used for replication and grant it the appropriate privileges.

$ sudo mysql -uroot -p
MariaDB [(none)]> create user 'replusr'@'%' identified by 'replusr';
MariaDB [(none)]> grant replication slave on *.* to 'replusr'@'%';


To start replication, enter the following commands.

MariaDB [(none)]> STOP SLAVE;

MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='master1', MASTER_USER='replusr',
-> MASTER_PASSWORD='replusr', MASTER_LOG_FILE='mariadb-bin.000009', MASTER_LOG_POS=634;

MariaDB [(none)]> START SLAVE;

For MASTER_LOG_FILE and MASTER_LOG_POS I used the information from "show master status" on the first node.

Check the status information of the slave threads.

MariaDB [(none)]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master1
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000009
          Read_Master_Log_Pos: 634
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 537
        Relay_Master_Log_File: mariadb-bin.000009
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 634
              Relay_Log_Space: 828
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:


Notice that Read_Master_Log_Pos and Exec_Master_Log_Pos are in sync, which is a good indicator that our databases are in sync.


Check the status information about the binary log files on the MASTER2 node. We will need this information to start replication on the MASTER1 node.

MariaDB [(none)]> show master status;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-bin.000009 |      759 |              |                  |
+--------------------+----------+--------------+------------------+



MASTER1

Start replicating data from the MASTER2 node to the MASTER1 node.

MariaDB [(none)]> STOP SLAVE;

MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='master2', MASTER_USER='replusr',
-> MASTER_PASSWORD='replusr', MASTER_LOG_FILE='mariadb-bin.000009', MASTER_LOG_POS=759;

MariaDB [(none)]> START SLAVE;


MariaDB [(none)]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master2
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000009
          Read_Master_Log_Pos: 759
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 537
        Relay_Master_Log_File: mariadb-bin.000009
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 759
              Relay_Log_Space: 828
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 62
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:


Everything seems to be OK.



Let's create a test table and insert some rows to test our replication.


MASTER1

MariaDB [(none)]> create database testdb;

MariaDB [(none)]> use testdb;
Database changed

MariaDB [testdb]> CREATE TABLE users (id INT AUTO_INCREMENT,
    -> name VARCHAR(30),
    -> datum TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -> PRIMARY KEY(id));
Query OK, 0 rows affected (0.50 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Marko');
Query OK, 1 row affected (0.06 sec)

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
+----+-------+---------------------+
1 row in set (0.00 sec)


MASTER2

MariaDB [testdb]> use testdb
Database changed

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
+----+-------+---------------------+
1 row in set (0.00 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES('John');
Query OK, 1 row affected (0.39 sec)

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
+----+-------+---------------------+
2 rows in set (0.00 sec)


MASTER1

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
+----+-------+---------------------+
2 rows in set (0.00 sec)

As we can see, our table and rows were replicated successfully.



Let's simulate a crash of the MASTER1 node and power off the server.

$ sudo shutdown -h now

While the server is down, insert some rows on the MASTER2 node.

MASTER2

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Eric');
Query OK, 1 row affected (0.41 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Clive');
Query OK, 1 row affected (0.08 sec)

MariaDB [testdb]> INSERT INTO users(name) VALUES ('Maria');
Query OK, 1 row affected (0.09 sec)

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
|  3 | Eric  | 2015-01-31 16:19:49 |
|  4 | Clive | 2015-01-31 16:19:55 |
|  5 | Maria | 2015-01-31 16:20:01 |
+----+-------+---------------------+
5 rows in set (0.00 sec)


MariaDB [testdb]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Reconnecting after a failed master event read
                  Master_Host: master1
                  Master_User: replusr
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.000010
          Read_Master_Log_Pos: 1828
               Relay_Log_File: relay-bin.000012
                Relay_Log_Pos: 1083
        Relay_Master_Log_File: mariadb-bin.000010
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1828
              Relay_Log_Space: 1663
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 2003
                Last_IO_Error: error reconnecting to master 'replusr@master1:3306' - retry-time: 
                               60  retries: 86400  message: Can't connect to MySQL server 
                               on 'master1' (111 "Connection refused")
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:

Check the Last_IO_Error message while MASTER1 is down.


Now turn the MASTER1 node back on.
The MariaDB server and replication will start automatically, and MASTER1 should catch up with MASTER2.


MASTER1

Check "users" table - it's synchronised again.

$ mysql -u root -p -D testdb

MariaDB [testdb]> select * from users;
+----+-------+---------------------+
| id | name  | datum               |
+----+-------+---------------------+
|  1 | Marko | 2015-02-01 00:41:41 |
|  2 | John  | 2015-01-31 16:17:55 |
|  3 | Eric  | 2015-01-31 16:19:49 |
|  4 | Clive | 2015-01-31 16:19:55 |
|  5 | Maria | 2015-01-31 16:20:01 |
+----+-------+---------------------+
5 rows in set (0.00 sec)



Please let me know if you see possible problems in this configuration. I will gladly update the post. Thanks for reading!


ORA-19599 block corruption when filesystemio_options=SETALL on ext4 file system using Linux

Monday, December 22, 2014
A few days ago I experienced a strange issue in my development environment running on OEL 5.8 with an EXT4 filesystem. Note - the EXT4 filesystem is supported from OEL 5.6 onwards.

This was a virtual machine running an oldish 10.2.0.5.0 Oracle database.

I noticed that the backup of my database was failing because of archive log corruption. As this is a development database, I simply deleted the corrupted archive logs and initiated a full backup again. But the backup failed because the new archive logs were corrupted as well.

A weird issue...

I forced a log file switch a few times and validated the new archive logs - everything was OK. The redo logs were multiplexed and everything was fine with them. I also validated the database for physical and logical corruption - everything was OK.
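For reference, this is roughly how such a validation can be run - a sketch (CHECK LOGICAL also looks for logical block corruption, and any findings end up in V$DATABASE_BLOCK_CORRUPTION):

RMAN> backup validate check logical database archivelog all;

SQL> select * from v$database_block_corruption;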

Then I initiated the backup again, and it failed.
This is an excerpt from the RMAN log (I've changed it slightly):


RMAN> connect target *
2> run
3> {
7>
8> ALLOCATE CHANNEL d1 DEVICE TYPE DISK;
9> BACKUP INCREMENTAL LEVEL 0 FORMAT '/u01/backup_db/QAS/fullbkp_dir/FULL_%d_%u' DATABASE TAG "weekly_full";
10> RELEASE CHANNEL d1;
11> sql 'alter system archive log current';
12> ALLOCATE CHANNEL d1 DEVICE TYPE DISK;
13> BACKUP (ARCHIVELOG ALL FORMAT '/u01/backup_db/QAS/fullbkp_dir/ARCH_%d_%T_%u_s%s_p%p' DELETE INPUT TAG "archivelogs");
14> RELEASE CHANNEL d1;
15>
16> DELETE OBSOLETE;
17>
18> BACKUP CURRENT CONTROLFILE FORMAT '/u01/backup_db/QAS/fullbkp_dir/controlf_%d_%u_%s_%T';
19> }
20>
connected to target database: QAS (DBID=2203246509)
using target database control file instead of recovery catalog

allocated channel: d1
channel d1: sid=43 devtype=DISK

Starting backup at 17.12.2014 08:17:02
channel d1: starting compressed incremental level 0 datafile backupset
channel d1: specifying datafile(s) in backupset
input datafile fno=00035 name=/u01/oradata/qas700.data1
input datafile fno=00036 name=/u01/oradata/qas700.data2
input datafile fno=00037 name=/u01/oradata/qas700.data3
input datafile fno=00002 name=/u01/oradata/undo.data1
...
...
...
channel d1: starting piece 1 at 17.12.2014 08:17:03
channel d1: finished piece 1 at 17.12.2014 09:45:48
piece handle=/u01/backup_db/QAS/fullbkp_dir/FULL_QAS_26pqchvu tag=WEEKLY_FULL comment=NONE
channel d1: backup set complete, elapsed time: 01:28:46
Finished backup at 17.12.2014 09:45:48

Starting Control File and SPFILE Autobackup at 17.12.2014 09:45:48
piece handle=/u01/app/oracle10/product/10.2.0/db_1/dbs/c-2203246509-20141217-13 comment=NONE
Finished Control File and SPFILE Autobackup at 17.12.2014 09:45:53

released channel: d1

sql statement: alter system archive log current

allocated channel: d1
channel d1: sid=43 devtype=DISK

Starting backup at 17.12.2014 09:45:54
current log archived
channel d1: starting compressed archive log backupset
channel d1: specifying archive log(s) in backup set
input archive log thread=1 sequence=11350 recid=39 stamp=866540753
input archive log thread=1 sequence=11351 recid=40 stamp=866540754
channel d1: starting piece 1 at 17.12.2014 09:45:55
released channel: d1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on d1 channel at 12/17/2014 09:45:56
ORA-19599: block number 6144 is corrupt in archived log /u01/oradata/QAS/QASarch/1_11350_826737654.dbf

Recovery Manager complete.


Notice that the full backup finished successfully, but when RMAN tried to back up the new archive logs it failed due to corruption.

I mentioned this issue on Twitter and got responses from Ronald Rood (@Ik_zelf) and Philippe Fierens (@pfierens), who helped me find the resolution.
Thanks guys!


Check this note:
ORA-1578 ORA-353 ORA-19599 Corrupt blocks with zeros when filesystemio_options=SETALL on ext4 file system using Linux (Doc ID 1487957.1)

I had filesystemio_options configured as SETALL, and resetting this parameter to its default value solved my corruption problem.
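A sketch of the check and the fix - filesystemio_options is not dynamic, so the reset takes effect only after an instance restart:

SQL> show parameter filesystemio_options

SQL> alter system reset filesystemio_options scope=spfile sid='*';

SQL> shutdown immediate
SQL> startup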


As this was a development machine I wasn't thinking much about the filesystem, but next time it will be ASM or XFS - probably not EXT4 :-)



Mount ASM diskgroups with new ASM instance

Wednesday, October 29, 2014
Imagine you have an 11gR2 Oracle Restart configuration with database files located in ASM.

After a server crash you realize that the local disks are corrupted, and with the local disks you have lost all the Oracle installations. Even though this is an important system, you don't have a database backup (always take backups!).

But you managed to save all the ASM disks, as they were located on separate storage.


This will be a small beginner guide on how to help yourself in such a situation.


As the old server crashed, you must create a new server configuration identical to the old one. A nice thing about ASM is that it keeps its metadata in the disk headers. If the disks are intact and the headers are not damaged, you should be able to mount the diskgroups with a new ASM instance. But this new instance must be compatible with your diskgroups.


The Grid Infrastructure and database software were version 11.2.0.1, and that is the version I will install on the new server.

To keep this post short enough, steps like creating users, installing ASMLib and other packages, configuring kernel parameters, etc. are excluded.


List the Oracle ASM disks attached to the new server.
With the "scandisks" command I will find devices which have been labeled as ASM disks.

# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...

# oracleasm listdisks
DISK1
DISK2
DISK3
DISK4
DISK5
FRA1

Install "Oracle Grid Infrastructure software only" option to avoid automatic Oracle Restart and ASM configuration. This configuration will be performed later manually.

After the installation finishes, run the noted perl script as root to configure Grid Infrastructure for a stand-alone server.
For my configuration the note looks like this:
To configure Grid Infrastructure for a Stand-Alone Server run the following command as the root user:
/u01/app/11.2.0.1/grid/perl/bin/perl -I/u01/app/11.2.0.1/grid/perl/lib -I/u01/app/11.2.0.1/grid/crs/install /u01/app/11.2.0.1/grid/crs/install/roothas.pl
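
Once the script completes, it's worth checking that the High Availability Services stack is up before going further - a quick sanity check from the Grid home bin directory:

# cd /u01/app/11.2.0.1/grid/bin
# ./crsctl check has
CRS-4638: Oracle High Availability Services is online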


Start cssd if it’s not running.

# ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        OFFLINE OFFLINE
ora.diskmon
      1        OFFLINE OFFLINE

# ./crs_start ora.cssd
Attempting to start `ora.cssd` on member `asterix`
Attempting to stop `ora.diskmon` on member `asterix`
Stop of `ora.diskmon` on member `asterix` succeeded.
Attempting to start `ora.diskmon` on member `asterix`
Start of `ora.diskmon` on member `asterix` succeeded.
Start of `ora.cssd` on member `asterix` succeeded.


Create a parameter file for the ASM instance in the $ORACLE_HOME/dbs directory of the Grid Infrastructure home.

init+ASM.ora
*.asm_diskstring='/dev/oracleasm/disks'
*.asm_power_limit=1
*.diagnostic_dest='/u01/app/grid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'


Register and start ASM instance.

$ export ORACLE_SID=+ASM
$ export ORACLE_HOME=/u01/app/11.2.0.1/grid
$ srvctl add asm -p $ORACLE_HOME/dbs/init+ASM.ora

$ srvctl start asm
$ srvctl status asm
ASM is running on asterix
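
Before mounting anything, you can also verify that the new ASM instance really discovers the old disks - a sketch querying V$ASM_DISK (output shortened; the paths follow the asm_diskstring set above, and intact member disks should report MEMBER):

$ sqlplus / as sysasm

SQL> select path, header_status from v$asm_disk;

PATH                           HEADER_STATU
------------------------------ ------------
/dev/oracleasm/disks/DISK1     MEMBER
/dev/oracleasm/disks/DISK2     MEMBER
...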


Now notice what I see when I start the ASM Configuration Assistant.

$ ./asmca



These are the diskgroups holding my database and recovery files.
Click "Mount all" to mount them all.






Install the Oracle database software and create a parameter file in "$ORACLE_HOME/dbs" to start the database.

$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ export ORACLE_SID=ora11gr2

$ cd $ORACLE_HOME/dbs
$ cat initora11gr2.ora
*.spfile='+DATA1/ora11gr2/spfileora11gr2.ora'

$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Wed Oct 29 14:29:37 2014

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  668082176 bytes
Fixed Size                  2216344 bytes
Variable Size             222301800 bytes
Database Buffers          436207616 bytes
Redo Buffers                7356416 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DATA1/ora11gr2/datafile/system.297.844627929
+DATA1/ora11gr2/datafile/sysaux.265.844627967
+DATA1/ora11gr2/datafile/undotbs1.266.844627991
+DATA1/ora11gr2/datafile/users.267.844628031
+DATA2/ora11gr2/datafile/marko.261.859213577


The database opened successfully and you can now register the instance using the srvctl command.

$ srvctl add database -d $ORACLE_SID -o $ORACLE_HOME -p $ORACLE_HOME/dbs/initora11gr2.ora
$ srvctl start database -d $ORACLE_SID


Final status.

$ ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA1.dg
               ONLINE  ONLINE       asterix
ora.DATA2.dg
               ONLINE  ONLINE       asterix
ora.FRA1.dg
               ONLINE  ONLINE       asterix
ora.asm
               ONLINE  ONLINE       asterix                  Started
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       asterix
ora.diskmon
      1        ONLINE  ONLINE       asterix
ora.ora11gr2.db
      1        ONLINE  ONLINE       asterix                  Open


Be aware that this demo was performed in a virtual environment on my notebook.


Increase disk space for VM running Linux

Friday, October 24, 2014
When I create virtual machines on my notebook I always make the disk too small for the root partition or for the partition where I put Oracle binaries. After a while, when I want to perform an upgrade or install more Oracle software, there is not enough space. This time I want to note the steps for increasing disk free space.

I can easily extend or shrink my logical volumes because I am using LVM in my virtual machines. Consider using LVM in production also, because it gives you more flexibility than normal hard drive partitions.

In this demo I'm using Oracle Linux 6.4.


Check disk free space after OS installation.

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_linuxtest-lv_root
                      4.9G  2.8G  2.0G  59% /
tmpfs                 770M  100K  770M   1% /dev/shm
/dev/sda1             485M   55M  405M  12% /boot


Add "/u01" mount and assign some disk space for Oracle installation files.


Shutdown VM and add disk.


Partition new disk "/dev/sdb" using fdisk command.

# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0xa07249dd.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-391, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-391, default 391):
Using default value 391

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): L

 0  Empty           24  NEC DOS         81  Minix / old Lin bf  Solaris
 1  FAT12           39  Plan 9          82  Linux swap / So c1  DRDOS/sec (FAT-
 2  XENIX root      3c  PartitionMagic  83  Linux           c4  DRDOS/sec (FAT-
 3  XENIX usr       40  Venix 80286     84  OS/2 hidden C:  c6  DRDOS/sec (FAT-
 4  FAT16 <32M      41  PPC PReP Boot   85  Linux extended  c7  Syrinx
 5  Extended        42  SFS             86  NTFS volume set da  Non-FS data
 6  FAT16           4d  QNX4.x          87  NTFS volume set db  CP/M / CTOS / .
 7  HPFS/NTFS       4e  QNX4.x 2nd part 88  Linux plaintext de  Dell Utility
 8  AIX             4f  QNX4.x 3rd part 8e  Linux LVM       df  BootIt
 9  AIX bootable    50  OnTrack DM      93  Amoeba          e1  DOS access
 a  OS/2 Boot Manag 51  OnTrack DM6 Aux 94  Amoeba BBT      e3  DOS R/O
 b  W95 FAT32       52  CP/M            9f  BSD/OS          e4  SpeedStor
 c  W95 FAT32 (LBA) 53  OnTrack DM6 Aux a0  IBM Thinkpad hi eb  BeOS fs
 e  W95 FAT16 (LBA) 54  OnTrackDM6      a5  FreeBSD         ee  GPT
 f  W95 Ext'd (LBA) 55  EZ-Drive        a6  OpenBSD         ef  EFI (FAT-12/16/
10  OPUS            56  Golden Bow      a7  NeXTSTEP        f0  Linux/PA-RISC b
11  Hidden FAT12    5c  Priam Edisk     a8  Darwin UFS      f1  SpeedStor
12  Compaq diagnost 61  SpeedStor       a9  NetBSD          f4  SpeedStor
14  Hidden FAT16 <3 63  GNU HURD or Sys ab  Darwin boot     f2  DOS secondary
16  Hidden FAT16    64  Novell Netware  af  HFS / HFS+      fb  VMware VMFS
17  Hidden HPFS/NTF 65  Novell Netware  b7  BSDI fs         fc  VMware VMKCORE
18  AST SmartSleep  70  DiskSecure Mult b8  BSDI swap       fd  Linux raid auto
1b  Hidden W95 FAT3 75  PC/IX           bb  Boot Wizard hid fe  LANstep
1c  Hidden W95 FAT3 80  Old Minix       be  Solaris boot    ff  BBT
1e  Hidden W95 FAT1
Hex code (type L to list codes): 8e
Changed system type of partition 1 to 8e (Linux LVM)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.


Notice that I identified the partition as "Linux LVM" by choosing hex code "8e".


Using the pvcreate command, create a physical volume for later use by LVM.

# pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created

Create new volume group "vg_orabin". Later I can add or remove disks from this volume group.

# vgcreate vg_orabin /dev/sdb1
  Volume group "vg_orabin" successfully created


Information about volume group.

# vgdisplay vg_orabin
  --- Volume group ---
  VG Name               vg_orabin
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               2.99 GiB
  PE Size               4.00 MiB
  Total PE              766
  Alloc PE / Size       0 / 0
  Free  PE / Size       766 / 2.99 GiB
  VG UUID               h3N1o5-AlYF-9nkL-PXiB-P8HK-tGAa-GlXPa5


Create logical volume using disk space from volume group.

# lvcreate --extents 766 -n lv_orabin vg_orabin
  Logical volume "lv_orabin" created


Create and mount filesystem.

# mkfs.ext4 /dev/mapper/vg_orabin-lv_orabin
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
196224 inodes, 784384 blocks
39219 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=805306368
24 block groups
32768 blocks per group, 32768 fragments per group
8176 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 25 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.




# mkdir /u01
# mount /dev/mapper/vg_orabin-lv_orabin /u01
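
Note that the mount above will not survive a reboot. To make it persistent you would add an /etc/fstab entry - a sketch, assuming ext4 and default mount options:

# echo "/dev/mapper/vg_orabin-lv_orabin /u01 ext4 defaults 1 2" >> /etc/fstab
# mount -a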

Check disk space available.


# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_linuxtest-lv_root
                      4.9G  2.8G  2.0G  59% /
tmpfs                 770M   88K  770M   1% /dev/shm
/dev/sda1             485M   55M  405M  12% /boot
/dev/mapper/vg_orabin-lv_orabin
                      3.0G   69M  2.8G   3% /u01


Hm, 2.8G is not enough free space for me. Let's extend this mount by adding another disk.



Shutdown VM and add disk.


Partition new disk and create physical volume for LVM.

# fdisk /dev/sdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x16953397.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-652, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-652, default 652):
Using default value 652

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 8e
Changed system type of partition 1 to 8e (Linux LVM)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.




# pvcreate /dev/sdc1
  Physical volume "/dev/sdc1" successfully created

Check current status of volume group “vg_orabin”.

# vgdisplay vg_orabin
  --- Volume group ---
  VG Name               vg_orabin
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               2.99 GiB
  PE Size               4.00 MiB
  Total PE              766
  Alloc PE / Size       766 / 2.99 GiB
  Free  PE / Size       0 / 0
  VG UUID               h3N1o5-AlYF-9nkL-PXiB-P8HK-tGAa-GlXPa5

Extend volume group by adding physical volume "/dev/sdc1" using vgextend command.

# vgextend vg_orabin /dev/sdc1
  Volume group "vg_orabin" successfully extended


Check volume group size - it is extended from 2.99G to 7.98G.

# vgdisplay vg_orabin
  --- Volume group ---
  VG Name               vg_orabin
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               7.98 GiB
  PE Size               4.00 MiB
  Total PE              2044
  Alloc PE / Size       766 / 2.99 GiB
  Free  PE / Size       1278 / 4.99 GiB
  VG UUID               h3N1o5-AlYF-9nkL-PXiB-P8HK-tGAa-GlXPa5


Using the pvscan command, scan all disks and notice the physical volumes with free space.

# pvscan
  PV /dev/sdb1   VG vg_orabin      lvm2 [2.99 GiB / 0    free]
  PV /dev/sdc1   VG vg_orabin      lvm2 [4.99 GiB / 4.99 GiB free]
  PV /dev/sda2   VG vg_linuxtest   lvm2 [6.51 GiB / 0    free]
  Total: 3 [14.49 GiB] / in use: 3 [14.49 GiB] / in no VG: 0 [0   ]


With the lvdisplay command, display the logical volume properties.
Notice LV Size = 2.99 GiB.

# lvdisplay /dev/vg_orabin/lv_orabin
  --- Logical volume ---
  LV Path                /dev/vg_orabin/lv_orabin
  LV Name                lv_orabin
  VG Name                vg_orabin
  LV UUID                ypw9X1-vIsM-4rVF-NtVB-ACrf-f5nh-25p2sn
  LV Write Access        read/write
  LV Creation host, time linuxtest.localdomain, 2014-10-23 13:19:56 +0200
  LV Status              available
  # open                 0
  LV Size                2.99 GiB
  Current LE             766
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:2


I will add only 2G (of 5G) using lvextend command.

# lvextend -L +2G /dev/mapper/vg_orabin-lv_orabin /dev/sdc1
  Extending logical volume lv_orabin to 4.99 GiB
  Logical volume lv_orabin successfully resized
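
As a side note, newer LVM versions can grow the filesystem in the same step using the --resizefs (-r) flag, which would make the separate resize2fs call below unnecessary - a sketch:

# lvextend -r -L +2G /dev/mapper/vg_orabin-lv_orabin /dev/sdc1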


Mount volume and check for free space.

# mount /dev/mapper/vg_orabin-lv_orabin /u01

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_linuxtest-lv_root
                      4.9G  2.8G  2.0G  59% /
tmpfs                 770M   88K  770M   1% /dev/shm
/dev/sda1             485M   55M  405M  12% /boot
/dev/mapper/vg_orabin-lv_orabin
                      3.0G   69M  2.8G   3% /u01


Resize filesystem using resize2fs command:
# resize2fs /dev/mapper/vg_orabin-lv_orabin
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/mapper/vg_orabin-lv_orabin is mounted on /u01; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/mapper/vg_orabin-lv_orabin to 1308672 (4k) blocks.
The filesystem on /dev/mapper/vg_orabin-lv_orabin is now 1308672 blocks long.


Now I have 4.6G free space for "/u01" mount.

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_linuxtest-lv_root
                      4.9G  2.8G  2.0G  59% /
tmpfs                 770M   88K  770M   1% /dev/shm
/dev/sda1             485M   55M  405M  12% /boot
/dev/mapper/vg_orabin-lv_orabin
                      5.0G   70M  4.6G   2% /u01



===========================================

Now I will try to extend the root partition.

Newer Oracle Linux releases use LVM by default during installation.
Let's see whether I can increase my root partition using the commands above.


Display information about logical volumes using lvs command.

# lvs
  LV        VG           Attr      LSize Pool Origin Data%  Move Log Cpy%Sync Convert
  lv_root   vg_linuxtest -wi-ao--- 4.97g
  lv_swap   vg_linuxtest -wi-ao--- 1.54g
  lv_orabin vg_orabin    -wi-a---- 4.99g

Check free space.
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_linuxtest-lv_root
                      4.9G  2.8G  2.0G  59% /
tmpfs                 770M   88K  770M   1% /dev/shm
/dev/sda1             485M   55M  405M  12% /boot


Shutdown VM and add disk for extending root partition.


Partition new disk and create physical volume for LVM.

# fdisk /dev/sdd
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0xf0608435.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-652, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-652, default 652):
Using default value 652

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 8e
Changed system type of partition 1 to 8e (Linux LVM)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.



# pvcreate /dev/sdd1
  Physical volume "/dev/sdd1" successfully created


Check information about volume group.

# vgdisplay vg_linuxtest

  --- Volume group ---
  VG Name               vg_linuxtest
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               6.51 GiB
  PE Size               4.00 MiB
  Total PE              1666
  Alloc PE / Size       1666 / 6.51 GiB
  Free  PE / Size       0 / 0
  VG UUID               TXkKYl-PIxu-s2xk-LsEB-sgTZ-TdcO-8wapCV

Extend volume group using new physical volume.

# vgextend vg_linuxtest /dev/sdd1
  Volume group "vg_linuxtest" successfully extended

Logical volume status.
# lvdisplay  /dev/vg_linuxtest/lv_root
  --- Logical volume ---
  LV Path                /dev/vg_linuxtest/lv_root
  LV Name                lv_root
  VG Name                vg_linuxtest
  LV UUID                VNgeT7-4yhd-XqRi-2da1-XTqT-qTvm-oVK2pz
  LV Write Access        read/write
  LV Creation host, time linuxtest.localdomain, 2014-10-23 10:30:21 +0200
  LV Status              available
  # open                 1
  LV Size                4.97 GiB
  Current LE             1272
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:0


Extend logical volume.
# lvextend /dev/mapper/vg_linuxtest-lv_root /dev/sdd1
  Extending logical volume lv_root to 9.96 GiB
  Logical volume lv_root successfully resized

Resize filesystem.

# resize2fs /dev/mapper/vg_linuxtest-lv_root
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/mapper/vg_linuxtest-lv_root is mounted on /; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/mapper/vg_linuxtest-lv_root to 2611200 (4k) blocks.
The filesystem on /dev/mapper/vg_linuxtest-lv_root is now 2611200 blocks long.


Check disk free space. Notice that I have 6.6G of free space for my root partition.

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_linuxtest-lv_root
                      9.9G  2.8G  6.6G  30% /
tmpfs                 770M   88K  770M   1% /dev/shm
/dev/sda1             485M   55M  405M  12% /boot



WARNING! Be very careful when using commands from this blog post on your production system. These are dangerous commands which can cause data loss or many other problems. I've used these commands in my test environment for educational purposes, and it is possible that I made mistakes in this demo. After all, I am only a simple Oracle DBA, not a Linux SA :-)


REFERENCES
http://www.linuxuser.co.uk/features/resize-your-disks-on-the-fly-with-lvm
http://www.rootusers.com/how-to-increase-the-size-of-a-linux-lvm-by-adding-a-new-disk/
https://wiki.archlinux.org/index.php/LVM


Using Oracle Flex ASM with single instance database

Friday, July 25, 2014
Oracle Flex ASM was introduced in version 12c. In my opinion, it is one of the best features of the new release.

I won't cover Flex ASM in detail because you can find more information in the documentation. In this post I will concentrate on how Flex ASM handles a crash of an ASM instance.

For this test I've created a 2-node cluster - 12c Grid Infrastructure with Flex ASM enabled.

$ asmcmd showclustermode
ASM cluster : Flex mode enabled

$ srvctl config asm
ASM home: /u01/app/12.1.0/grid_1
Password file: +OCRVOTE/ASM/PASSWORD/pwdasm.256.853771307
ASM listener: LISTENER
ASM instance count: ALL
Cluster ASM listener: ASMNET1LSNR_ASM

$ srvctl status asm
ASM is running on cluster1,cluster2


Install single instance database on one of the nodes.

$ ./dbca -silent \
> -createDatabase \
> -templateName General_Purpose.dbc \
> -gdbName singl12 \
> -sid singl12 \
> -sysPassword oracle \
> -SystemPassword oracle \
> -emConfiguration none \
> -recoveryAreaDestination FRA \
> -storageType ASM \
> -asmSysPassword oracle \
> -diskGroupName DATA \
> -characterSet AL32UTF8 \
> -nationalCharacterSet AL16UTF16 \
> -totalMemory 768

Copying database files
1% complete
3% complete
10% complete
17% complete
24% complete
31% complete
35% complete
Creating and starting Oracle instance
37% complete
42% complete
47% complete
52% complete
53% complete
56% complete
58% complete
Registering database with Oracle Restart
64% complete
Completing Database Creation
68% complete
71% complete
75% complete
85% complete
96% complete
100% complete
Look at the log file "/u01/app/orcl12/cfgtoollogs/dbca/singl12/singl12.log" for further details.


Single instance database is registered to the OCR.

$ srvctl config database -d singl12
Database unique name: singl12
Database name: singl12
Oracle home: /u01/app/orcl12/product/12.1.0/dbhome_1
Oracle user: orcl12
Spfile: +DATA/singl12/spfilesingl12.ora
Password file:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: singl12
Database instance: singl12
Disk Groups: DATA
Mount point paths:
Services:
Type: SINGLE  <<<<<-------
Database is administrator managed

V$ASM_CLIENT shows that my database is managed by the Oracle ASM instance.

SQL> select instance_name, db_name, status
  2  from v$asm_client
  3  where db_name='singl12';

INSTANCE_NAME        DB_NAME  STATUS
-------------------- -------- ------------
singl12              singl12  CONNECTED


Check that ASM instances are running on both nodes.

$ ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE            , ONLINE
STATE=ONLINE on cluster2, ONLINE on cluster1


My database is running on cluster1 node.

$ srvctl status database -d singl12
Instance singl12 is running on node cluster1

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME   HOST_NAME
--------------- --------------------
singl12         cluster1.localdomain

Now I will simulate a crash of the ASM instance on the cluster1 node where my database is running.

# ps -ef|grep asm_pmon|grep -v grep
oracle    3072     1  0 10:12 ?        00:00:01 asm_pmon_+ASM1

# kill -9 3072

Without Flex ASM, a crash of the ASM instance would also crash the database instance, but with Flex ASM my database stays up and running.

Check alert log of database instance:
...
NOTE: ASMB registering with ASM instance as client 0x10005 (reg:2156157897)
NOTE: ASMB connected to ASM instance +ASM2 (Flex mode; client id 0x10005)
NOTE: ASMB rebuilding ASM server state
NOTE: ASMB rebuilt 1 (of 1) groups
NOTE: ASMB rebuilt 13 (of 13) allocated files
NOTE: fetching new locked extents from server
NOTE: 0 locks established; 0 pending writes sent to server
SUCCESS: ASMB reconnected & completed ASM server state

Check line - "NOTE: ASMB connected to ASM instance +ASM2 (Flex mode; client id 0x10005)"

As the +ASM1 instance crashed, ASMB connected to ASM instance +ASM2.
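
To confirm the relocation from the ASM side, re-run the earlier V$ASM_CLIENT query against the surviving instance - a sketch, assuming a connection to +ASM2 on cluster2:

$ export ORACLE_SID=+ASM2
$ sqlplus / as sysasm

SQL> select instance_name, db_name, status
  2  from v$asm_client
  3  where db_name='singl12';

INSTANCE_NAME        DB_NAME  STATUS
-------------------- -------- ------------
singl12              singl12  CONNECTED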


Check status:
# ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE            , ONLINE
STATE=ONLINE on cluster2, INTERMEDIATE on cluster1


SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME   HOST_NAME
--------------- --------------------
singl12         cluster1.localdomain

Oracle Clusterware restarted the crashed ASM instance and both instances were up within a minute.

# ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE            , ONLINE
STATE=ONLINE on cluster2, ONLINE on cluster1

Now let's crash the ASM instance on the second node.

SQL> select instance_name from v$instance;

INSTANCE_NAME
----------------
+ASM2

SQL> shutdown abort;
ASM instance shutdown

Excerpt from alertlog:

...
Fri Jul 25 12:44:33 2014
NOTE: ASMB registering with ASM instance as client 0x10005 (reg:4169355750)
NOTE: ASMB connected to ASM instance +ASM1 (Flex mode; client id 0x10005)
NOTE: ASMB rebuilding ASM server state
NOTE: ASMB rebuilt 1 (of 1) groups
NOTE: ASMB rebuilt 13 (of 13) allocated files
NOTE: fetching new locked extents from server
NOTE: 0 locks established; 0 pending writes sent to server
SUCCESS: ASMB reconnected & completed ASM server state


Again, users connected to the database instance didn't even notice that something was happening with ASM.

Flex ASM enables ASM instances to run on nodes separate from the database servers. If an ASM instance fails, the database will fail over to another available ASM instance.

In case you are running pre-12c databases on your cluster you can still configure Flex ASM, but you are required to configure local ASM instances on the nodes. ASM instance failover won't work for 10g or 11g databases.

Good reason to move towards 12c? ;-)




ORA-19909: datafile 1 belongs to an orphan incarnation

Thursday, July 17, 2014
I love to read Oracle related blogs, forum posts and mailing lists, much more often than books. Why? Because there, many Oracle DBAs and developers share their experiences, problems, and "best practices", which are very valuable to me.

It's great that we have such a big and active Oracle community.

Today I noticed a mail on the Oracle-L list where someone asked for help with recovery after overwriting production controlfiles. Check Oracle-L for more info.

It reminded me that I haven't played with controlfile recoveries for a while.
Mistakes or various disasters can happen when you least expect them. When a problem occurs, it is essential that the DBA is confident with the recovery procedure.
Confidence comes only with practice. Create a test environment and enjoy destroying/recovering databases.
Of course - be creative with disasters :)


So I've imagined one scenario and decided to share it in the form of a blog post.


Create backup of current controlfile.

RMAN> backup current controlfile;

Starting backup at 17.07.2014 13:45:29
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
channel ORA_DISK_1: starting piece 1 at 17.07.2014 13:45:30
channel ORA_DISK_1: finished piece 1 at 17.07.2014 13:45:33
piece handle=/u01/oradata/fra/ORAKL/backupset/2014_07_17/o1_mf_ncnnf_TAG20140717T142921_9whjf2sc_.bkp tag=TAG20140717T134529 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 17.07.2014 13:45:33

After that I performed an incomplete recovery and opened the database with the resetlogs option.

...
...
...
SQL> alter database open resetlogs;
Database altered.

Create small status table and insert one row.

SQL> create table admin.test (datum date);
Table created.

SQL> insert into admin.test values(sysdate);
1 row created.

SQL> commit;
Commit complete.

SQL> select to_char(datum,'dd.mm.yyyy hh24:mi:ss') from admin.test;

TO_CHAR(DATUM,'DD.M
-------------------
17.07.2014 14:44:44


Where are my controlfiles:

SQL> select name from v$controlfile;

NAME
--------------------------------------------------------------------------------
/u01/oradata/orakl/orakl/control01.ctl
/u01/oradata/orakl/orakl/control02.ctl
/u01/oradata/orakl/orakl/control03.ctl

Overwrite all controlfiles with a random file to simulate the problem.

$ cp o1_mf_s_853163282_9whfzmk1_.bkp /u01/oradata/orakl/orakl/control01.ctl
$ cp o1_mf_s_853163282_9whfzmk1_.bkp /u01/oradata/orakl/orakl/control02.ctl
$ cp o1_mf_s_853163282_9whfzmk1_.bkp /u01/oradata/orakl/orakl/control03.ctl

After that the instance crashed, with this error in the alert log:

Thu Jul 17 14:45:30 2014
Errors in file /u01/app/oracle/diag/rdbms/orakl/orakl/trace/orakl_ckpt_8103.trc:
ORA-00201: control file version  incompatible with ORACLE version
ORA-00202: control file: '/u01/oradata/orakl/orakl/control01.ctl'
CKPT (ospid: 8103): terminating the instance due to error 201
Instance terminated by CKPT, pid = 8103

To perform a quick recovery I restored the previously backed up controlfile and mounted the database.

RMAN> run
2> {
3> restore controlfile from '/u01/oradata/fra/ORAKL/backupset/2014_07_17/o1_mf_ncnnf_TAG20140717T142921_9whjf2sc_.bkp';
4> }

Starting restore at 17.07.2014 14:46:34
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=153 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
output file name=/u01/oradata/orakl/orakl/control01.ctl
output file name=/u01/oradata/orakl/orakl/control02.ctl
output file name=/u01/oradata/orakl/orakl/control03.ctl
Finished restore at 17.07.2014 14:46:37

RMAN> alter database mount;
database mounted

Recover database using backup controlfile.

SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 1 belongs to an orphan incarnation
ORA-01110: data file 1: '/u01/oradata/orakl/orakl/system01.dbf'

The recovery failed with an ORA-19909 error.
If you remember, I performed an incomplete recovery after creating the controlfile backup.
The controlfile backup belongs to another incarnation, and I don't have a fresh controlfile backup.
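
You can see the incarnation mismatch for yourself while the database is mounted - a sketch querying V$DATABASE_INCARNATION; the restored controlfile still considers the pre-resetlogs incarnation to be the current one, so the datafiles appear to belong to an orphan incarnation:

SQL> select incarnation#, resetlogs_change#, status
  2  from v$database_incarnation;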


I will use this "old" controlfile and create a SQL script to reproduce a new controlfile.

SQL> alter database backup controlfile to trace as '/tmp/ctlfile.sql';

Database altered.

Delete unnecessary lines and create SQL script "/tmp/ctlfile.sql":

STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "ORAKL" RESETLOGS  ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 '/u01/oradata/orakl/orakl/redo01.log'  SIZE 50M,
  GROUP 2 '/u01/oradata/orakl/orakl/redo02.log'  SIZE 50M,
  GROUP 3 '/u01/oradata/orakl/orakl/redo03.log'  SIZE 50M
DATAFILE
  '/u01/oradata/orakl/orakl/system01.dbf',
  '/u01/oradata/orakl/orakl/sysaux01.dbf',
  '/u01/oradata/orakl/orakl/undotbs01.dbf',
  '/u01/oradata/orakl/orakl/users01.dbf'
CHARACTER SET UTF8
;

Shutdown instance and create controlfile.

SQL> shutdown abort;
ORACLE instance shut down.
SQL>
SQL>
SQL> @/tmp/ctlfile.sql
ORACLE instance started.

Total System Global Area  626327552 bytes
Fixed Size                  2162280 bytes
Variable Size             171966872 bytes
Database Buffers          444596224 bytes
Redo Buffers                7602176 bytes

Control file created.

Use information from redologs for recovery.

SQL> select member, status from v$logfile;

MEMBER                                                                 STATUS
---------------------------------------------------------------------- ---------
/u01/oradata/orakl/orakl/redo03.log                                    STALE
/u01/oradata/orakl/orakl/redo02.log                                    STALE
/u01/oradata/orakl/orakl/redo01.log                                    STALE

SQL> recover database using backup controlfile;
ORA-00279: change 7015200 generated at 07/17/2014 14:43:54 needed for thread 1
ORA-00289: suggestion :
/u01/oradata/fra/ORAKL/archivelog/2014_07_17/o1_mf_1_1_%u_.arc
ORA-00280: change 7015200 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oradata/orakl/orakl/redo01.log
Log applied.
Media recovery complete.

Open database with resetlogs.

SQL> alter database open resetlogs;

Database altered.

Check status table.

SQL> select to_char(datum,'dd.mm.yyyy hh24:mi:ss') from admin.test;

TO_CHAR(DATUM,'DD.M
-------------------
17.07.2014 14:44:44

Check incarnations:

RMAN> list incarnation;

using target database controlfile instead of recovery catalog

List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- ------- ---------- ----------
2       2       ORAKL    3724279545       PARENT  641001     12.10.2013 15:30:46
1       1       ORAKL    3724279545       PARENT  7015197    17.07.2014 14:43:44
3       3       ORAKL    3724279545       CURRENT 7015386    17.07.2014 14:55:50



Cheers!
