NOTE!
You should use 'flash recovery area' as primary location for your backups whenever you can. To complicate this test case a little I won’t use 'flash recovery area' for my backups.
Oracle 10.2.0.4.0
Linux OEL5
For taking backup I will use this RMAN script:
connect target / run { sql 'alter system checkpoint'; allocate channel c1 device type disk format '/home/ora10g/backup/%U.dbf'; backup as compressed backupset check logical incremental level 1 for recover of copy with tag DAILY_FULL database filesperset 1; recover copy of database with tag DAILY_FULL; release channel c1; allocate channel c1 device type disk; backup (archivelog all format '/home/ora10g/backup/ARCH_%d_%T_%u_s%s_p%p' delete input TAG "archivelogs"); backup current controlfile format '/home/ora10g/backup/controlf_%d_%u_%s_%T'; release channel c1; }(I've borrowed first part of the script from great DBA Husnu Sensoy - Thanks!)
After first execution of this script RMAN will create initial image copy of the database, backup archived redo logs and current controlfile to directory ‘/home/ora10g/backup’.
Now I will create small table just for later checks and run script above again.
$ sqlplus admin SQL> create table test123 as select * from dba_objects; Table created. SQL> alter system switch logfile; System altered.
Next time RMAN will find image copies of data files and make just incremental backups. Additional step we will be updating image copies of datafiles applying the incremental backups to them.
RMAN script in usual circumstances works perfectly but this time I will interfere and kill RMAN process during recovery using ‘kill -9 [rman pid]’ command to simulate disaster.
To simulate disk crash I will drop original database using rm command.
So to summarize:
- I’ve lost my original database due to the 'disk crash'
- I have:
Image copies of datafiles partially recovered
Backups of archived redo logs
Multiplexed redo logs
Backup of controlfile
Can I fully recover my original database using this files? I can try... :)
I have already created empty directories where I will put database files, controlfiles, etc.
Restore controlfile and mount database:
RMAN> run 2> { 3> allocate channel c1 device type disk; 4> restore controlfile from '/home/ora10g/backup/controlf_RASAPP_khls5prr_3729_20101104'; 5> } using target database control file instead of recovery catalog allocated channel: c1 channel c1: sid=157 devtype=DISK Starting restore at 04-NOV-10 channel c1: restoring control file channel c1: restore complete, elapsed time: 00:00:02 output filename=/u01/oradata/rasapp/control01.ctl output filename=/u01/oradata/rasapp/control02.ctl output filename=/u02/oradata/rasapp/control03.ctl Finished restore at 04-NOV-10 released channel: c1 RMAN> alter database mount; database mounted
Catalog backup pieces from backup directory ‘/home/ora10g/backup’:
RMAN> catalog start with '/home/ora10g/backup'; ... ... -- Check copy of database: RMAN> list copy of database; ... ...
Now I will try to switch database to the latest image copy.
RMAN> switch database to copy; RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of switch to copy command at 11/04/2010 15:30:15 RMAN-06571: datafile 9 does not have recoverable copy
As I’ve killed RMAN process during recovery operation my image copy of database is not valid.
How to overcome this error...
I will run ‘RECOVER COPY OF DATABASE WITH TAG ‘DAILY_FULL’’ to apply incremental backups to datafile copies with the same tag.
RMAN> recover copy of database with tag 'DAILY_FULL'; Starting recover at 04-NOV-10 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: sid=157 devtype=DISK no copy of datafile 9 found to recover no copy of datafile 12 found to recover no copy of datafile 13 found to recover no copy of datafile 17 found to recover channel ORA_DISK_1: starting incremental datafile backupset restore channel ORA_DISK_1: specifying datafile copies to recover recovering datafile copy fno=00018 name=/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-UNDOTBS2_FNO-18_k4ls5pee.dbf channel ORA_DISK_1: reading from backup piece /home/ora10g/backup/krls5qrn_1_1.dbf channel ORA_DISK_1: restored backup piece 1 piece handle=/home/ora10g/backup/krls5qrn_1_1.dbf tag=TAG20101104T145544 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backupset restore channel ORA_DISK_1: specifying datafile copies to recover recovering datafile copy fno=00008 name=/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SAP_MERGE_FNO-8_k3ls5p99.dbf channel ORA_DISK_1: reading from backup piece /home/ora10g/backup/kqls5qrm_1_1.dbf channel ORA_DISK_1: restored backup piece 1 piece handle=/home/ora10g/backup/kqls5qrm_1_1.dbf tag=TAG20101104T145544 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backupset restore channel ORA_DISK_1: specifying datafile copies to recover recovering datafile copy fno=00006 name=/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-BILL_DATA_FNO-6_k8ls5pnh.dbf channel ORA_DISK_1: reading from backup piece /home/ora10g/backup/kvls5qrs_1_1.dbf channel ORA_DISK_1: restored backup piece 1 piece handle=/home/ora10g/backup/kvls5qrs_1_1.dbf tag=TAG20101104T145544 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 ... ... Finished recover at 04-NOV-10
Notice this part of the log:
no copy of datafile 9 found to recover
no copy of datafile 12 found to recover
no copy of datafile 13 found to recover
no copy of datafile 17 found to recover
From alertlog: ... Thu Nov 4 15:26:51 2010 datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf Thu Nov 4 15:27:06 2010 datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf datafilecopy header validation failure for file /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf ... ...
ORA-19563:
%s header validation failed for file string
Cause: When opening the file to be placed in a copy or backup set, to be inspected, or used as the target for an incremental restore, its header was not recognized as a valid file header for a file of the indicated type (data file, archived log, or control file) belonging to the current database.
Action: Ensure that the correct files are being specified for the copy or backup operation.
When I run ‘switch database to copy’ command it fails again with same error as before:
RMAN> switch database to copy; RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of switch to copy command at 11/04/2010 18:29:03 RMAN-06571: datafile 9 does not have recoverable copy
To overcome this error I will uncatalog noted datafiles:
RMAN> change copy of datafile 9 uncatalog; using target database control file instead of recovery catalog uncataloged datafile copy datafile copy filename=/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf recid=95 stamp=734192788 Uncataloged 1 objects RMAN> change copy of datafile 12 uncatalog; uncataloged datafile copy datafile copy filename=/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf recid=93 stamp=734192037 Uncataloged 1 objects RMAN> change copy of datafile 13 uncatalog; uncataloged datafile copy datafile copy filename=/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf recid=92 stamp=734191568 Uncataloged 1 objects RMAN> change copy of datafile 17 uncatalog; uncataloged datafile copy datafile copy filename=/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf recid=94 stamp=734192505 Uncataloged 1 objects
Catalog them again:
RMAN> catalog start with '/home/ora10g/backup'; searching for all files that match the pattern /home/ora10g/backup List of Files Unknown to the Database ===================================== File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf Do you really want to catalog the above files (enter YES or NO)? yes cataloging files... cataloging done List of Cataloged Files ======================= File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf File Name: /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf
Let’s try switch database to copy:
RMAN> switch database to copy; datafile 1 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SYSTEM_FNO-1_k6ls5plb.dbf" datafile 2 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SAPBI_DATA_FNO-2_k7ls5pmo.dbf" datafile 3 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SYSAUX_FNO-3_k5ls5piv.dbf" datafile 4 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-APP_DATA_FNO-4_k2ls5p4o.dbf" datafile 5 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-USERS_FNO-5_kbls5ppj.dbf" datafile 6 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-BILL_DATA_FNO-6_k8ls5pnh.dbf" datafile 7 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-OT_DATA_FNO-7_kcls5pq2.dbf" datafile 8 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SAP_MERGE_FNO-8_k3ls5p99.dbf" datafile 9 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf" datafile 10 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PHONEBOOK_FNO-10_kals5pp4.dbf" datafile 11 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-IVR_DATA_FNO-11_kdls5pqh.dbf" datafile 12 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf" datafile 13 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf" datafile 14 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SIPDIR_DATA_FNO-14_k9ls5pob.dbf" datafile 16 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-INV_RN_INTERFACE_FNO-16_kels5pr0.dbf" datafile 17 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf" datafile 18 switched to datafile copy "/home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-UNDOTBS2_FNO-18_k4ls5pee.dbf"
It worked...
Recover database now:
RMAN> recover database; Starting recover at 04-NOV-10 allocated channel: ORA_DISK_1 channel ORA_DISK_1: sid=157 devtype=DISK starting media recovery channel ORA_DISK_1: starting archive log restore to default destination channel ORA_DISK_1: restoring archive log archive log thread=1 sequence=10 channel ORA_DISK_1: restoring archive log archive log thread=1 sequence=11 channel ORA_DISK_1: reading from backup piece /home/ora10g/backup/ARCH_RASAPP_20101104_kjls5q8e_s3731_p1 channel ORA_DISK_1: restored backup piece 1 piece handle=/home/ora10g/backup/ARCH_RASAPP_20101104_kjls5q8e_s3731_p1 tag=ARCHIVELOGS channel ORA_DISK_1: restore complete, elapsed time: 00:00:02 archive log filename=/u01/oradata/rasapp/archlogs/1_10_734173548.dbf thread=1 sequence=10 archive log filename=/u01/oradata/rasapp/archlogs/1_11_734173548.dbf thread=1 sequence=11 unable to find archive log archive log thread=1 sequence=12 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of recover command at 11/04/2010 18:31:26 RMAN-06054: media recovery requesting unknown log: thread 1 seq 12 lowscn 48367782959
I’m missing some redo entries - this is because I forgot to copy multiplexed redo log files to the correct location.
Copy multiplexed redo logs to specified locations:
SQL> select member from v$logfile; MEMBER -------------------------------------------------------------------------------- /u01/oradata/rasapp/redo01a.rdo /u02/oradata/rasapp/redo01b.rdo /u01/oradata/rasapp/redo02a.rdo /u02/oradata/rasapp/redo02b.rdo /u01/oradata/rasapp/redo03a.rdo /u02/oradata/rasapp/redo03b.rdo 6 rows selected. $ cp redo01b.rdo /u02/oradata/rasapp/ ... ...
Everything should be OK now during recovery:
$ rman target / Recovery Manager: Release 10.2.0.4.0 - Production on Thu Nov 4 18:50:25 2010 Copyright (c) 1982, 2007, Oracle. All rights reserved. connected to target database: RASAPP (DBID=1650089695, not open) RMAN> recover database; Starting recover at 04-NOV-10 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: sid=157 devtype=DISK starting media recovery archive log thread 1 sequence 12 is already on disk as file /u01/oradata/rasapp/redo03b.rdo archive log filename=/u02/oradata/rasapp/redo03b.rdo thread=1 sequence=12 media recovery complete, elapsed time: 00:00:06 Finished recover at 04-NOV-10 RMAN> alter database open resetlogs; database opened
Let’s check data:
SQL> col file_name for a100 SQL> set lines 200 SQL> set pages 999 SQL> select file_name from dba_data_files; FILE_NAME ---------------------------------------------------------------------------------------------------- /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SYSTEM_FNO-1_k6ls5plb.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SAPBI_DATA_FNO-2_k7ls5pmo.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SYSAUX_FNO-3_k5ls5piv.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-OT_DATA_FNO-7_kcls5pq2.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-APP_DATA_FNO-4_k2ls5p4o.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-USERS_FNO-5_kbls5ppj.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-BILL_DATA_FNO-6_k8ls5pnh.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SAP_MERGE_FNO-8_k3ls5p99.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-LNP_FNO-9_k1ls5orq.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PHONEBOOK_FNO-10_kals5pp4.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-IVR_DATA_FNO-11_kdls5pqh.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-PSUB_DATA_FNO-12_jvls5nuo.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-DSLAM_STATS_FNO-13_juls5nco.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-SIPDIR_DATA_FNO-14_k9ls5pob.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-INV_RN_INTERFACE_FNO-16_kels5pr0.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-TS_DATA_FNO-17_k0ls5od9.dbf /home/ora10g/backup/data_D-RASAPP_I-1650089695_TS-UNDOTBS2_FNO-18_k4ls5pee.dbf 17 rows selected. SQL> select count(*) from dba_objects where status<>'VALID'; COUNT(*) ---------- 0 SQL> select count(*) from admin.test123; COUNT(*) ---------- 50118
In your own test environment, you can try out many different recovery scenarios without doing any harm. Performing test cases is very important for DBA’s because you must be well prepared when production database encounters disaster ;)
Thanks,
ReplyDeleteVery very helpful
Appreciate your publishing this test case on the web.