
Saturday, September 16, 2017

Beware of ORA-19721 on 12c using Transportable Tablespace (Oracle changed behavior)

Almost every big database has its hot data, which is used often, and its cold data, which is rarely touched. Since version 9i I have used the transportable tablespace feature to move cold (archive) data out of the database and keep it on cheap storage or tape.

Whenever someone needed to query the archive tables, it was very easy to plug the tablespace back in for a few days and drop it again once the archive data was no longer needed. So I was plugging in the same tablespaces more than once.

But when I tried the same process on a 12c database I was unpleasantly surprised: Oracle changed the behaviour and I could not reattach the tablespace.

Let's demonstrate this with a simple demo case.

Create tablespace and set it to be read only.
create tablespace ARCHIVE01 datafile '/oradata1/data/ora12c/archive01.dbf' size 50M;
Tablespace created.

create table archtab tablespace ARCHIVE01 as select * from dba_objects;
Table created.

alter tablespace ARCHIVE01 read only;
Tablespace altered.

create directory export_tts as '/oradata1/export';
Directory created.

Export tablespace metadata.
$ expdp '" / as sysdba "' directory=EXPORT_TTS dumpfile=exp_archive01.dmp logfile=exp_archive01.log transport_tablespaces=ARCHIVE01 transport_full_check=Y

Export: Release 12.1.0.2.0 - Production on Sat Sep 16 18:07:27 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

WARNING: Oracle Data Pump operations are not typically needed when connected to the root or seed of a container database.

Starting "SYS"."SYS_EXPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" directory=EXPORT_TTS dumpfile=exp_archive01.dmp logfile=exp_archive01.log transport_tablespaces=ARCHIVE01 transport_full_check=Y
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TRANSPORTABLE_01 is:
  /oradata1/export/exp_archive01.dmp
******************************************************************************
Datafiles required for transportable tablespace ARCHIVE01:
  /oradata1/data/ora12c/archive01.dbf
Job "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at Sat Sep 16 18:08:06 2017 elapsed 0 00:00:3

Drop tablespace but keep datafile.
SQL> drop tablespace ARCHIVE01 including contents keep datafiles;
Tablespace dropped.

Let’s plug in tablespace.
$ impdp '" /as sysdba "' directory=EXPORT_TTS dumpfile=exp_archive01.dmp logfile=imp_archive01.log transport_datafiles='/oradata1/data/ora12c/archive01.dbf'

Import: Release 12.1.0.2.0 - Production on Sat Sep 16 18:11:32 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

WARNING: Oracle Data Pump operations are not typically needed when connected to the root or seed of a container database.

Master table "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" directory=EXPORT_TTS dumpfile=exp_archive01.dmp logfile=imp_archive01.log transport_datafiles=/oradata1/data/ora12c/archive01.dbf
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at Sat Sep 16 18:11:51 2017 elapsed 0 00:00:18
Check alert log.
Plug in tablespace ARCHIVE01 with datafile
  '/oradata1/data/ora12c/archive01.dbf'
TABLE SYS.WRI$_OPTSTAT_HISTHEAD_HISTORY: ADDED INTERVAL PARTITION SYS_P451 (42993) VALUES LESS THAN (TO_DATE(' 2017-09-17 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
ALTER TABLESPACE "ARCHIVE01" READ WRITE
Completed: ALTER TABLESPACE "ARCHIVE01" READ WRITE
ALTER TABLESPACE "ARCHIVE01" READ ONLY
Sat Sep 16 18:11:51 2017
 Converting block 0 to version 10 format
Completed: ALTER TABLESPACE "ARCHIVE01" READ ONLY

Notice that Oracle altered the tablespace (the datafile headers) to READ WRITE: Completed: ALTER TABLESPACE "ARCHIVE01" READ WRITE.

Quote from Oracle Support site:
Oracle Development declared it as "Expected Behavior". Starting from 12.1, during the TTS import operation, the tablespaces (datafile headers) are put into read-write mode intermittently in order to fix up TSTZ table columns and clean up unused segments in the datafiles. This functionality was implemented on many customer's request basis. And, hence, this cannot be reversed. Note that, it intermittently only changes the status to "read-write" and the final status will still be "read-only" only.
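If you want to double-check that the final state really stays read only after the import, a quick query against the standard dictionary views is enough (just a sketch; ARCHIVE01 and the datafile path are the ones from this demo):

select tablespace_name, status from dba_tablespaces where tablespace_name = 'ARCHIVE01';

select name, enabled from v$datafile where name = '/oradata1/data/ora12c/archive01.dbf';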

Now let's drop the tablespace and try to reattach it.

Drop tablespace and keep datafile.
SQL> drop tablespace ARCHIVE01 including contents keep datafiles;
Tablespace dropped.
Import tablespace metadata.
$ impdp '" /as sysdba "' directory=EXPORT_TTS dumpfile=exp_archive01.dmp logfile=imp_archive01.log transport_datafiles='/oradata1/data/ora12c/archive01.dbf'

Import: Release 12.1.0.2.0 - Production on Sat Sep 16 18:13:51 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

WARNING: Oracle Data Pump operations are not typically needed when connected to the root or seed of a container database.

Master table "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" directory=EXPORT_TTS dumpfile=exp_archive01.dmp logfile=imp_archive01.log transport_datafiles=/oradata1/data/ora12c/archive01.dbf
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
ORA-39123: Data Pump transportable tablespace job aborted
ORA-19721: Cannot find datafile with absolute file number 14 in tablespace ARCHIVE01

Job "SYS"."SYS_IMPORT_TRANSPORTABLE_01" stopped due to fatal error at Sat Sep 16 18:13:55 2017 elapsed 0 00:00:02

I received an error and failed to plug in the tablespace.


The workaround for this "expected" behaviour is to make the datafile read only at the OS level.
There is also a workaround if you are using ASM, so check the Oracle Support site.

Let's repeat the steps from the demo, this time using the workaround.


Create tablespace.
SQL> create tablespace ARCHIVE02 datafile '/oradata1/data/ora12c/archive02.dbf' size 50M;
Tablespace created.

SQL> create table archtab tablespace ARCHIVE02 as select * from dba_objects;
Table created.

SQL> alter tablespace ARCHIVE02 read only;
Tablespace altered.

Export tablespace metadata.
$ expdp '" / as sysdba "' directory=EXPORT_TTS dumpfile=exp_archive02.dmp logfile=exp_archive02.log transport_tablespaces=ARCHIVE02 transport_full_check=Y

Export: Release 12.1.0.2.0 - Production on Sat Sep 16 18:18:25 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

WARNING: Oracle Data Pump operations are not typically needed when connected to the root or seed of a container database.

Starting "SYS"."SYS_EXPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" directory=EXPORT_TTS dumpfile=exp_archive02.dmp logfile=exp_archive02.log transport_tablespaces=ARCHIVE02 transport_full_check=Y
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TRANSPORTABLE_01 is:
  /oradata1/export/exp_archive02.dmp
******************************************************************************
Datafiles required for transportable tablespace ARCHIVE02:
  /oradata1/data/ora12c/archive02.dbf
Job "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at Sat Sep 16 18:18:44 2017 elapsed 0 00:00:18

Drop tablespace and keep datafile.
SQL> drop tablespace ARCHIVE02 including contents keep datafiles;
Tablespace dropped.


Change permissions for datafile to be read only.
$ chmod 0440 /oradata1/data/ora12c/archive02.dbf
$ ls -l /oradata1/data/ora12c/archive02.dbf
-r--r-----. 1 oracle oinstall 52436992 Sep 16 18:17 /oradata1/data/ora12c/archive02.dbf

Import tablespace metadata.
$ impdp '" /as sysdba "' directory=EXPORT_TTS dumpfile=exp_archive02.dmp logfile=imp_archive02.log transport_datafiles='/oradata1/data/ora12c/archive02.dbf'

Import: Release 12.1.0.2.0 - Production on Sat Sep 16 18:20:23 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

WARNING: Oracle Data Pump operations are not typically needed when connected to the root or seed of a container database.

Master table "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" directory=EXPORT_TTS dumpfile=exp_archive02.dmp logfile=imp_archive02.log transport_datafiles=/oradata1/data/ora12c/archive02.dbf
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at Sat Sep 16 18:20:28 2017 elapsed 0 00:00:03

In the alert log you can notice ORA-1114 I/O errors because Oracle cannot modify the datafile.
Plug in tablespace ARCHIVE02 with datafile
  '/oradata1/data/ora12c/archive02.dbf'
ALTER TABLESPACE "ARCHIVE02" READ WRITE
ORA-1114 signalled during: ALTER TABLESPACE "ARCHIVE02" READ WRITE...

Drop tablespace and reattach it again.
SQL> drop tablespace ARCHIVE02 including contents keep datafiles;
Tablespace dropped.

Plug in tablespace.
$ impdp '" /as sysdba "' directory=EXPORT_TTS dumpfile=exp_archive02.dmp logfile=imp_archive02.log transport_datafiles='/oradata1/data/ora12c/archive02.dbf'

Import: Release 12.1.0.2.0 - Production on Sat Sep 16 18:22:01 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

WARNING: Oracle Data Pump operations are not typically needed when connected to the root or seed of a container database.

Master table "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" directory=EXPORT_TTS dumpfile=exp_archive02.dmp logfile=imp_archive02.log transport_datafiles=/oradata1/data/ora12c/archive02.dbf
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at Sat Sep 16 18:22:05 2017 elapsed 0 00:00:03

This time I didn't receive an error and I was able to plug in the tablespace.
I have to remind myself to change the datafile permissions before plugging in tablespaces on 12c.



REFERENCES
Doc ID 2094476.1


Wednesday, February 24, 2016

Slow full table scan due to row chaining

A few days ago I received a complaint that a simple count on a 2-million-row table was running forever.

This was the statement:
select count(1)
from CLIENT k
where k.expires is null;

I've used fake names for the table and its columns.
Database version: 11.2.0.4.0

Indeed, the query was running longer than I would expect. Oracle was doing a FULL SCAN of the table with "db file sequential read" wait events. This was a little odd to me, as I would expect "direct path read" or "db file scattered read" waits.


It was a partitioned table with 4 partitions and 294 columns.

select count(*) from dba_tab_columns where table_name = 'CLIENT';
 
 COUNT(*)
----------
      294

select owner, segment_name, partition_name, bytes, blocks
from dba_segments
where segment_name in ('CLIENT');
 
OWNER      SEGMENT_NAME    PARTITION_NAME            BYTES     BLOCKS
---------- --------------- -------------------- ---------- ----------
SCOTT      CLIENT          CLIENT_OTHER            8388608       1024
SCOTT      CLIENT          CLIENT_CITY          1643118592     200576
SCOTT      CLIENT          CLIENT_CNTR           591396864      72192
SCOTT      CLIENT          CLIENT_STRNG           52428800       6400


select table_name, partition_name, NUM_ROWS, AVG_ROW_LEN
from dba_tab_partitions
where table_name='CLIENT';

TABLE_NAME                     PARTITION_NAME           NUM_ROWS    AVG_ROW_LEN
------------------------------ ----------------------- ----------- ---------------
CLIENT                          CLIENT_OTHER             0            0
CLIENT                          CLIENT_CITY              1469420      572
CLIENT                          CLIENT_CNTR              592056       495
CLIENT                          CLIENT_STRNG             48977        565


select table_name, data_type, count(*)
from dba_tab_cols
where table_name='CLIENT'
group by table_name, data_type
order by 3 desc;
 
TABLE_NAME DATA_TYPE                                  COUNT(*)
---------- ---------------------------------------- ----------
CLIENT   NUMBER                                          191
CLIENT   VARCHAR2                                         70
CLIENT   DATE                                             32
CLIENT   TIMESTAMP(6)                                      3
CLIENT   RAW                                               2
CLIENT   CL_UTR                                            1
CLIENT   O_TIP_KAR                                         1
CLIENT   O_ZA_NA                                           1
CLIENT   O_PO_OSO                                          1

Some of the columns were collections.

select type_name, typecode
from dba_types
where type_name in (select data_type
                   from dba_tab_cols
                   where table_name='CLIENT'
                   and data_type not in ('NUMBER','VARCHAR2',
                   'DATE','TIMESTAMP(6)','RAW'));
 
TYPE_NAME                      TYPECODE                     
------------------------------ ------------------------------
CL_UTR                         COLLECTION                    
O_TIP_KAR                      COLLECTION                    
O_ZA_NA                        COLLECTION                    
O_PO_OSO                       COLLECTION                    

These were varrays used to store multivalued attributes.


In the trace I saw lots of disk reads and an elapsed time of over 2400 seconds.

select count(1)
  from CLIENT k
where k.expires is null

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2    203.96    2450.19    5455717    8240323          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4    203.97    2450.20    5455717    8240323          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 369  (MSUTIC)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=8240323 pr=5455717 pw=0 time=1349733885 us)
   1905617    1905617    1905617   PARTITION LIST ALL PARTITION: 1 4 (cr=8240323 pr=5455717 pw=0 time=2449532855 us cost=164110 size=3801914 card=1900957)
   1905617    1905617    1905617    TABLE ACCESS FULL CLIENT PARTITION: 1 4 (cr=8240323 pr=5455717 pw=0 time=2448530798 us cost=164110 size=3801914 card=1900957)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
1905617    PARTITION LIST (ALL) PARTITION: START=1 STOP=4
1905617     TABLE ACCESS   MODE: ANALYZED (FULL) OF 'CLIENT' (TABLE) 
                PARTITION: START=1 STOP=4

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to CLIENT                       2        0.00          0.00
  Disk file operations I/O                       29        0.00          0.00
  direct path read                             2048        0.19          9.78
  db file sequential read                   5178860        0.23       2241.08
  resmgr:internal state change                    2        0.11          0.21
  SQL*Net message from CLIENT                     1        0.00          0.00

Object statistics showed that all reads were from the table partitions.

Session Objects Statistics
Object/Event                     % Time      Seconds        Calls         Avg        Min        Max
Obj#(299564)
    db file sequential read       78.1%  1,757.0600s    3,677,752     0.0005s    0.0001s    0.2333s
    direct path read               0.4%      8.8314s        1,706     0.0052s    0.0004s    0.1953s
    resmgr:internal state change   0.0%      0.2162s            2     0.1081s    0.1000s    0.1162s
    Disk file operations I/O       0.0%      0.0014s           23     0.0001s    0.0000s    0.0002s
Obj#(299565)
    db file sequential read       20.5%    462.5006s    1,416,370     0.0003s    0.0001s    0.1794s
    direct path read               0.0%      0.8966s          304     0.0029s    0.0001s    0.0479s
    Disk file operations I/O       0.0%      0.0003s            6     0.0000s    0.0000s    0.0000s
Obj#(299566)
    db file sequential read        1.0%     21.5203s       84,738     0.0003s    0.0001s    0.0552s
    direct path read               0.0%      0.0587s           38     0.0015s    0.0000s    0.0206s


Hm... why am I seeing so many db file sequential reads when direct path reads are also happening?
This is a table with lots of columns, so I might have a problem with chained or migrated rows.
Oracle is probably using single-block reads to fetch the pieces of each row.

As the table has more than 255 columns I would expect intra-block chaining, but that shouldn't cause sequential reads. Only if a row doesn't fit into a block would I get regular row chaining.
I'm probably having problems with row migration.

A chained row is a row that is too large to fit into a single block, and if that is the root cause of the problem there isn't much I can do to improve performance. If a row is too big to fit into a block now, it will still be too big after rebuilding the table.

Row migration occurs when a row is updated and the amount of free space in its block is not adequate to store all of the row's data, so the row is migrated to another physical block.
This usually happens when PCTFREE is set too low.

What is important for migrated rows is that you can improve performance by reorganizing the table/partition or simply by deleting and re-inserting the migrated rows.
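If you only need a quick per-partition overview, a plain ANALYZE ... COMPUTE STATISTICS also populates the CHAIN_CNT column (a sketch; unlike DBMS_STATS it overwrites the optimizer statistics, so on a real system you would re-gather them afterwards):

analyze table SCOTT.CLIENT compute statistics;

select partition_name, num_rows, chain_cnt, avg_row_len
  from dba_tab_partitions
 where table_name = 'CLIENT'
 order by partition_name;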

Tanel Poder wrote a blog post on the subject, "Detect chained and migrated rows in Oracle – Part 1", and I decided to use his great Snapper tool to gather some diagnostic info.

SQL> @sn 60 6596
@snapper all 60 1 "6596"
Sampling SID 6596 with interval 60 seconds, taking 1 snapshots...
 
-- Session Snapper v4.06 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)
 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   SID, USERNAME  , TYPE, STATISTIC                                                 ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  6596, MSUTIC    , STAT, session logical reads                                     ,        283813,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, user I/O wait time                                        ,          5719,      95.46,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, non-idle wait time                                        ,          5719,      95.46,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, non-idle wait count                                       ,        193388,      3.23k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, session pga memory                                        ,         -8400,    -140.21,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, enqueue requests                                          ,             2,        .03,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, enqueue releases                                          ,             2,        .03,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read total IO requests                           ,        193740,      3.23k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read total multi block requests                  ,           353,       5.89,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read total bytes                                 ,    1630494720,     27.21M,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, cell physical IO interconnect bytes                       ,    1630494720,     27.21M,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, consistent gets                                           ,        283812,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, consistent gets direct                                    ,        283810,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical reads                                            ,        199034,      3.32k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical reads direct                                     ,        199034,      3.32k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read IO requests                                 ,        193739,      3.23k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, physical read bytes                                       ,    1630486528,     27.21M,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, file io wait time                                         ,      57195780,    954.66k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, Number of read IOs issued                                 ,           353,       5.89,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, no work - consistent read gets                            ,        283808,      4.74k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, table scan rows gotten                                    ,       2881106,     48.09k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, table scan blocks gotten                                  ,         83578,       1.4k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, table fetch continued row                                 ,        200188,      3.34k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , STAT, buffer is not pinned count                                ,        200226,      3.34k,         ,             ,          ,           ,          ~ per execution
  6596, MSUTIC    , TIME, DB CPU                                                    ,       5620720,    93.82ms,     9.4%, [@         ],          ,           ,
  6596, MSUTIC    , TIME, sql execute elapsed time                                  ,      60270147,      1.01s,   100.6%, [##########],          ,           ,
  6596, MSUTIC    , TIME, DB time                                                   ,      60270147,      1.01s,   100.6%, [##########],          ,           ,          ~ unaccounted time
  6596, MSUTIC    , WAIT, Disk file operations I/O                                  ,           123,     2.05us,      .0%, [          ],         2,        .03,     61.5us average wait
  6596, MSUTIC    , WAIT, db file sequential read                                   ,      57234629,   955.31ms,    95.5%, [WWWWWWWWWW],    192888,      3.22k,   296.72us average wait
 
--  End of Stats snap 1, end=2016-02-23 13:23:19, seconds=59.9
 
----------------------------------------------------------------------------------------------------
Active% | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
----------------------------------------------------------------------------------------------------
   97% |    1 | 2q92xdvxjj712   | 0         | db file sequential read             | User I/O
    3% |    1 | 2q92xdvxjj712   | 0         | ON CPU                              | ON CPU
 
--  End of ASH snap 1, end=2016-02-23 13:23:19, seconds=60, samples_taken=99
 
PL/SQL procedure successfully completed.

Notice "table fetch continued row" statistic. Tanel wrote that this counter is usually increasing when rows are accessed with index access paths.
In my case I have full scan that is increasing the value. This count is number of chained pieces Oracle had to go through in order to find the individual pieces of the rows.
I won’t go any further in detail - just check Tanel’s blog post.
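If you want to watch this counter for a specific session while the statement runs, here is a minimal sketch (:sid being the session you are tracing):

select sn.name, st.value
  from v$sesstat st
  join v$statname sn on sn.statistic# = st.statistic#
 where st.sid = :sid
   and sn.name = 'table fetch continued row';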


Let's identify the chained rows by running the ANALYZE command with the LIST CHAINED ROWS option. This command collects information about each migrated or chained row.
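Note that LIST CHAINED ROWS writes its findings into a CHAINED_ROWS table, which has to exist beforehand; it is normally created with the utlchain.sql script shipped in $ORACLE_HOME/rdbms/admin:

SQL> @?/rdbms/admin/utlchain.sql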

SQL> analyze table SCOTT.CLIENT list chained rows;
 
Table analyzed.
 
SQL> select count(*) from chained_rows;
 
 COUNT(*)
----------
  2007045

SQL> select partition_name, count(*) from chained_rows group by partition_name;
 
PARTITION_NAME                   COUNT(*)
------------------------------ ----------
CLIENT_CITY                       1411813
CLIENT_CNTR                       552873
CLIENT_STRNG                      42359

A table with 2097647 rows had 2007045 chained/migrated rows. This is what caused so many reads for a simple full scan of a small table.


I decided to rebuild the table partitions, without changing the PCTFREE parameter, so that the migrated rows would fit back into single blocks.
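One way to do such a rebuild (a sketch, not necessarily exactly what I ran; owner and partition names are the ones from this demo, and moving a partition leaves its local index partitions UNUSABLE, so they need to be rebuilt afterwards):

alter table SCOTT.CLIENT move partition CLIENT_CITY;
alter table SCOTT.CLIENT move partition CLIENT_CNTR;
alter table SCOTT.CLIENT move partition CLIENT_STRNG;

select 'alter index ' || index_owner || '.' || index_name ||
       ' rebuild partition ' || partition_name || ';' cmd
  from dba_ind_partitions
 where status = 'UNUSABLE';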


After the rebuild the number of chained rows decreased.

SQL> analyze table SCOTT.CLIENT list chained rows;
 
Table analyzed.
 
SQL> select count(*) from chained_rows;
 
 COUNT(*)
----------
    37883

Now the query finished in 14 seconds, without sequential reads.
select  count(1)
  from CLIENT k
  where k.expires is null

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.34      13.96     185802     185809          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.34      13.96     185802     185809          0           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 369  (MSUTIC)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=185809 pr=185802 pw=0 time=13965941 us)
   1905617    1905617    1905617   PARTITION LIST ALL PARTITION: 1 4 (cr=185809 pr=185802 pw=0 time=13560379 us cost=109526 size=3811234 card=1905617)
   1905617    1905617    1905617    TABLE ACCESS FULL CLIENT PARTITION: 1 4 (cr=185809 pr=185802 pw=0 time=12848619 us cost=109526 size=3811234 card=1905617)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (AGGREGATE)
1905617    PARTITION LIST (ALL) PARTITION: START=1 STOP=4
1905617     TABLE ACCESS   MODE: ANALYZED (FULL) OF 'CLIENT' (TABLE) 
                PARTITION: START=1 STOP=4

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to CLIENT                       2        0.00          0.00
  direct path read                             3569        0.11          8.99
  SQL*Net message from CLIENT                     2        0.00          0.01


Snapper also showed that I no longer had a problem with row chaining.
SQL> @sn 15 6601
@snapper all 15 1 "6601"
Sampling SID 6601 with interval 15 seconds, taking 1 snapshots...
 
-- Session Snapper v4.06 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 SID, USERNAME  , TYPE, STATISTIC                                                 ,      DELTA, HDELTA/SEC, %TIME, GRAPH    , NUM_WAITS,  WAITS/SEC,   AVERAGES
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   6601, MSUTIC , STAT, Requests to/from CLIENT                                ,          1,     .07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, user calls                                                ,          1,        .07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, pinned cursors current                                 ,         -1,    -.07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, session logical reads                                  ,     149590,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, CPU used when call started                             ,        227,   15.02,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, CPU used by this session                               ,        227,   15.02,      ,             ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, DB time                                                   ,       1047,   69.29,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, user I/O wait time                                     ,        424,   28.06,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, non-idle wait time                                     ,        424,   28.06,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, non-idle wait count                                    ,       3216,  212.84,      ,          ,       ,           ,       ~ per execution
   6601, MSUTIC , STAT, session uga memory                                     ,     135248,   8.95k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read total IO requests                        ,       9354,  619.07,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read total multi block requests               ,       9333,  617.68,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read total bytes                              , 1225228288,  81.09M,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, cell physical IO interconnect bytes                    , 1225228288,  81.09M,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets                                        ,     149578,    9.9k,      ,             ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets from cache                             ,          5,     .33,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets from cache (fastpath)                  ,          5,     .33,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, consistent gets direct                                 ,     149572,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, logical read bytes from cache                          ,      40960,   2.71k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical reads                                            ,     149548,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical reads direct                                  ,     149548,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read IO requests                              ,       9353,  619.01,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, physical read bytes                                    , 1225097216,  81.08M,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, calls to kcmgcs                                           ,          5,     .33,      ,          ,          ,        ,       ~ per execution
   6601, MSUTIC , STAT, file io wait time                                      ,        304,   20.12,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, total number of slots                                  ,         -2,    -.13,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, Effective IO time                                      ,    4239980, 280.61k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, Number of read IOs issued                              ,       9354,  619.07,      ,          ,       ,        ,          ~ per execution
   6601, MSUTIC , STAT, no work - consistent read gets                         ,     149564,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, Cached Commit SCN referenced                           ,     149132,   9.87k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scans (cache partitions)                         ,          3,      .2,         ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scans (direct read)                              ,          3,      .2,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scan rows gotten                                 ,    3518684, 232.88k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, table scan blocks gotten                               ,     149559,    9.9k,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, bytes sent via SQL*Net to CLIENT                       ,        211,   13.96,      ,          ,       ,           ,   105.5 bytes per roundtrip
   6601, MSUTIC , STAT, bytes received via SQL*Net from CLIENT                 ,          8,     .53,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , STAT, SQL*Net roundtrips to/from CLIENT                      ,          2,     .13,      ,          ,       ,        ,       ~ per execution
   6601, MSUTIC , TIME, DB CPU                                                    ,    2000964,   132.43ms, 13.2%, [@@     ],       ,        ,
   6601, MSUTIC , TIME, sql execute elapsed time                               ,    8500210,   562.57ms,    56.3%, [###### ],       ,        ,
   6601, MSUTIC , TIME, DB time                                                   ,    8500269,   562.57ms,    56.3%, [###### ],       ,        ,     14.62s unaccounted time
   6601, MSUTIC , WAIT, direct path read                                          ,    4059380,   268.66ms, 26.9%, [WWW    ],      3064,  202.78,  1.32ms average wait
   6601, MSUTIC , WAIT, SQL*Net message to CLIENT                              ,          4,   .26us,   .0%, [       ],      1,     .07,        4us average wait
   6601, MSUTIC , WAIT, SQL*Net message from CLIENT                            ,    8006127,   529.87ms,    53.0%, [WWWWWW ],      1,     .07,   8.01s average wait
 
--  End of Stats snap 1, end=2016-02-24 08:23:59, seconds=15.1
 
 
----------------------------------------------------------------------------------------------------
Active% | INST | SQL_ID       | SQL_CHILD | EVENT                               | WAIT_CLASS
----------------------------------------------------------------------------------------------------
 29% | 1 | gg54c4j6b9jb0   | 0         | direct path read                    | User I/O
 21% | 1 | gg54c4j6b9jb0   | 0         | ON CPU                           | ON CPU
 
--  End of ASH snap 1, end=2016-02-24 08:23:59, seconds=15, samples_taken=96


Reorganizing the table solved my problem. Finally, full scans on the table were running much faster.

There is an interesting support note, "Updating a Row with More Than 255 Columns Causes Row Chaining (Doc ID 238519.1)", which states that trailing NULLs do not take up space in the row piece: initially the row fits in one row piece.
If a column beyond position 255 is then populated, all the NULL columns between the last populated column and this new column now take up space.
The row has to be split into two row pieces and the new row piece is migrated to a new block - the row becomes chained.

In our table we have trailing NULL columns, so this probably caused such migration.
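A check one could start with is to see which of the columns beyond position 255 actually hold data (a sketch; COLUMN_ID is only an approximation of the physical column order in the row pieces):

select column_id, column_name, num_nulls, num_distinct
  from dba_tab_columns
 where table_name = 'CLIENT'
   and column_id > 255
 order by column_id;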


Unfortunately I don't have time to perform a more detailed investigation.




REFERENCES
http://blog.tanelpoder.com/2009/11/04/detect-chained-and-migrated-rows-in-oracle/
Updating a Row with More Than 255 Columns Causes Row Chaining (Doc ID 238519.1)

Saturday, February 20, 2016

Detecting Soft Corruption in 12c - V$NONLOGGED_BLOCK, ORA-01578/ORA-26040

Last week we created a standby database in our dev environment and performed some ETL actions on the primary side. Data loading and index rebuilds were performed with the NOLOGGING option. After a few days we noticed lots of ORA-01578/ORA-26040 errors.
The corruption happened because we forgot to enable force logging.

As this was a new dev database there was no backup, but maybe not everything was lost. If the only corrupted segments were indexes, we could easily rebuild them.

Then I learnt something new.
After performing a BACKUP VALIDATE CHECK LOGICAL we noticed lots of corrupted blocks, but I was puzzled why the v$database_block_corruption view was empty. Then my colleague told me that Oracle changed the behaviour of reporting soft corrupted blocks in 12c (we were using 12.1.0.2). A different view is updated now - V$NONLOGGED_BLOCK.

So I created a little demo case on how to detect (and repair) soft corrupted blocks on a 12c database.



Create tablespace and small table.
SQL> create tablespace DEMO1 datafile '/oradata1/data/ora12c/demo01.dbf' size 50M;
Tablespace created.

SQL> create table objects tablespace DEMO1 as select * from dba_objects;
Table created.

SQL> alter table objects add constraint pk_obj primary key (object_id);
Table altered.

SQL> create index idx_obj_name on objects(object_name) tablespace demo1;
Index created.

Backup tablespace.
RMAN> backup tablespace DEMO1;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=50 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: starting piece 1 at 23-AUG-15
channel ORA_DISK_1: finished piece 1 at 23-AUG-15
piece handle=/oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T060639_bxlkpj3j_.bkp tag=TAG20150823T060639 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 23-AUG-15

Starting Control File and SPFILE Autobackup at 23-AUG-15
piece handle=/oradata1/fra/ORA12C/autobackup/2015_08_23/o1_mf_s_888473201_bxlkpktg_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 23-AUG-15

Rebuild index with NOLOGGING option to simulate soft corruption later.
RMAN> alter index idx_obj_name rebuild nologging;
Statement processed

Confirm that we have datafiles that require backup because they have been affected by NOLOGGING operations.
RMAN> report unrecoverable;

Report of files that need backup due to unrecoverable operations
File Type of Backup Required Name
---- ----------------------- -----------------------------------
2    full or incremental     /oradata1/data/ora12c/demo01.dbf
5    full or incremental     /oradata1/data/ora12c/example01.dbf

Simulate corruption.
RMAN> alter database datafile 2 offline;
Statement processed

RMAN> restore datafile 2;

Starting restore at 23-AUG-15
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00002 to /oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: reading from backup piece /oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T060639_bxlkpj3j_.bkp
channel ORA_DISK_1: piece handle=/oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T060639_bxlkpj3j_.bkp tag=TAG20150823T060639
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished restore at 23-AUG-15

RMAN> recover datafile 2;

Starting recover at 23-AUG-15
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 23-AUG-15

RMAN> alter database datafile 2 online;
Statement processed

Query table with corrupted index and notice error.
SQL> select count(*) from objects where object_name like 'A%';
select count(*) from objects where object_name like 'A%'
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 2617)
ORA-01110: data file 2: '/oradata1/data/ora12c/demo01.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option

Let’s perform validation of datafile to check block corruption.
RMAN> backup validate check logical datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=40 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     460            129          6401            1776280
  File Name: /oradata1/data/ora12c/demo01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              1537
  Index      0              462
  Other      0              4272

Finished backup at 23-AUG-15

Notice that we have 460 blocks marked corrupt, but the v$database_block_corruption view is empty.
SQL> select count(*) from v$database_block_corruption;

  COUNT(*)
----------
  0

Let’s query v$nonlogged_block view.
SQL> set lines 200
SQL> set pages 999
SQL> select file#, block#, blocks,object#,reason from v$nonlogged_block;

     FILE#     BLOCK#     BLOCKS    OBJECT# REASON
---------- ---------- ---------- ---------- -------
         2       2308         12            UNKNOWN
         2       2321         15            UNKNOWN
         2       2337         15            UNKNOWN
         2       2353         15            UNKNOWN
         2       2369         15            UNKNOWN
         2       2385         15            UNKNOWN
         2       2401         15            UNKNOWN
         2       2417         15            UNKNOWN
         2       2434        126            UNKNOWN
         2       2562        126            UNKNOWN
         2       2690         91            UNKNOWN

11 rows selected.


Will RMAN detect that we have corrupted blocks?
RMAN> backup datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=54 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: starting piece 1 at 23-AUG-15
channel ORA_DISK_1: finished piece 1 at 23-AUG-15
piece handle=/oradata1/fra/ORA12C/backupset/2015_08_23/o1_mf_nnndf_TAG20150823T061602_bxll8275_.bkp tag=TAG20150823T061602 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 23-AUG-15
The RMAN backup won't fail because of the NOLOGGING corrupt blocks, so our backup will contain the soft corrupted blocks.

Let's identify the corrupt segments using the v$nonlogged_block view.
set lines 2000
set pages 9999
col owner for a20
col partition_name for a10
col segment_name for a20

SELECT e.owner, e.segment_type, e.segment_name, e.partition_name, c.file#
     , greatest(e.block_id, c.block#) corr_start_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1)
       - greatest(e.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_extents e, V$NONLOGGED_BLOCK c
 WHERE e.file_id = c.file#
   AND e.block_id <= c.block# + c.blocks - 1
   AND e.block_id + e.blocks - 1 >= c.block#
UNION
SELECT s.owner, s.segment_type, s.segment_name, s.partition_name, c.file#
     , header_block corr_start_block#
     , header_block corr_end_block#
     , 1 blocks_corrupted
  FROM dba_segments s, V$NONLOGGED_BLOCK c
 WHERE s.header_file = c.file#
   AND s.header_block between c.block# and c.block# + c.blocks - 1
UNION
SELECT null owner, null segment_type, null segment_name, null partition_name, c.file#
     , greatest(f.block_id, c.block#) corr_start_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1)
       - greatest(f.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_free_space f, V$NONLOGGED_BLOCK  c
 WHERE f.file_id = c.file#
   AND f.block_id <= c.block# + c.blocks - 1
   AND f.block_id + f.blocks - 1 >= c.block#
order by file#, corr_start_block#
/



OWNER      SEGMENT_TYPE       SEGMENT_NAME         PARTITION_      FILE# CORR_START_BLOCK# CORR_END_BLOCK# BLOCKS_CORRUPTED
---------- ------------------ -------------------- ---------- ---------- ----------------- --------------- ----------------
SYS        INDEX              IDX_OBJ_NAME                             2              2308            2311                4
SYS        INDEX              IDX_OBJ_NAME                             2              2312            2319                8
SYS        INDEX              IDX_OBJ_NAME                             2              2321            2327                7
SYS        INDEX              IDX_OBJ_NAME                             2              2328            2335                8
SYS        INDEX              IDX_OBJ_NAME                             2              2337            2343                7
SYS        INDEX              IDX_OBJ_NAME                             2              2344            2351                8
SYS        INDEX              IDX_OBJ_NAME                             2              2353            2359                7
SYS        INDEX              IDX_OBJ_NAME                             2              2360            2367                8
SYS        INDEX              IDX_OBJ_NAME                             2              2369            2375                7
SYS        INDEX              IDX_OBJ_NAME                             2              2376            2383                8
SYS        INDEX              IDX_OBJ_NAME                             2              2385            2391                7
SYS        INDEX              IDX_OBJ_NAME                             2              2392            2399                8
SYS        INDEX              IDX_OBJ_NAME                             2              2401            2407                7
SYS        INDEX              IDX_OBJ_NAME                             2              2408            2415                8
SYS        INDEX              IDX_OBJ_NAME                             2              2417            2423                7
SYS        INDEX              IDX_OBJ_NAME                             2              2424            2431                8
SYS        INDEX              IDX_OBJ_NAME                             2              2434            2559              126
SYS        INDEX              IDX_OBJ_NAME                             2              2562            2687              126
SYS        INDEX              IDX_OBJ_NAME                             2              2690            2780               91

19 rows selected.

This is the best outcome you can get when you notice corruption errors. All errors are related to index corruption, so we can fix the problem by rebuilding the index.

SQL> alter index idx_obj_name rebuild;
alter index idx_obj_name rebuild
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 2308)
ORA-01110: data file 2: '/oradata1/data/ora12c/demo01.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option

Simply issuing an "alter index rebuild" command won't work.
We should either mark the index unusable, which drops its segment before the rebuild, or just rebuild the index with the ONLINE option.

Marking the index unusable is the better choice because you don't need additional space then, but I will simply rebuild the index with the ONLINE option and see what happens.
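For reference, the mark-unusable path would look roughly like this (a sketch, not executed in this demo; UNUSABLE drops the index segment, so the rebuild allocates fresh, properly formatted blocks):

SQL> alter index idx_obj_name unusable;
SQL> alter index idx_obj_name rebuild;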
SQL> alter index idx_obj_name rebuild online;
Index altered.

SQL> select count(*) from objects where object_name like 'A%';

  COUNT(*)
----------
      2079

No errors... but let's validate the datafile for corruption.
RMAN> backup validate check logical datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=40 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     460            94           6402            1779294
  File Name: /oradata1/data/ora12c/demo01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              1537
  Index      0              587
  Other      0              4182

Finished backup at 23-AUG-15
Notice "Marked Corrupt" column. Hm... 460 like before.

Don't worry, this is not new corruption. These are FREE blocks which will be reused, and Oracle will automatically re-format them when that happens.
set lines 2000
set pages 9999
col owner for a20
col partition_name for a10
col segment_name for a20

SELECT e.owner, e.segment_type, e.segment_name, e.partition_name, c.file#
     , greatest(e.block_id, c.block#) corr_start_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(e.block_id+e.blocks-1, c.block#+c.blocks-1)
       - greatest(e.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_extents e, V$NONLOGGED_BLOCK c
 WHERE e.file_id = c.file#
   AND e.block_id <= c.block# + c.blocks - 1
   AND e.block_id + e.blocks - 1 >= c.block#
UNION
SELECT s.owner, s.segment_type, s.segment_name, s.partition_name, c.file#
     , header_block corr_start_block#
     , header_block corr_end_block#
     , 1 blocks_corrupted
  FROM dba_segments s, V$NONLOGGED_BLOCK c
 WHERE s.header_file = c.file#
   AND s.header_block between c.block# and c.block# + c.blocks - 1
UNION
SELECT null owner, null segment_type, null segment_name, null partition_name, c.file#
     , greatest(f.block_id, c.block#) corr_start_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block#
     , least(f.block_id+f.blocks-1, c.block#+c.blocks-1)
       - greatest(f.block_id, c.block#) + 1 blocks_corrupted
  FROM dba_free_space f, V$NONLOGGED_BLOCK  c
 WHERE f.file_id = c.file#
   AND f.block_id <= c.block# + c.blocks - 1
   AND f.block_id + f.blocks - 1 >= c.block#
order by file#, corr_start_block#
/


OWNER      SEGMENT_TYPE       SEGMENT_NAME         PARTITION_      FILE# CORR_START_BLOCK# CORR_END_BLOCK# BLOCKS_CORRUPTED
---------- ------------------ -------------------- ---------- ---------- ----------------- --------------- ----------------
                                                                        2              2308            2319               12
                                                                        2              2321            2335               15
                                                                        2              2337            2351               15
                                                                        2              2353            2367               15
                                                                        2              2369            2383               15
                                                                        2              2385            2399               15
                                                                        2              2401            2415               15
                                                                        2              2417            2431               15
                                                                        2              2434            2559              126
                                                                        2              2562            2687              126
                                                                        2              2690            2780               91

11 rows selected.

We can force the re-formatting by creating a dummy table and inserting data into it until the tablespace is full.
Check Doc ID 336133.1.
create table s (
       n number,
       c varchar2(4000)
     ) nologging tablespace DEMO1;


SQL> BEGIN
FOR i IN 1..1000000 LOOP
INSERT /*+ APPEND */ INTO sys.s select i, lpad('REFORMAT',3092, 'R') from dual;
commit ;
END LOOP;
END;
/


BEGIN
*
ERROR at line 1:
ORA-01653: unable to extend table SYS.S by 128 in tablespace DEMO1
ORA-06512: at line 3


SQL> drop table sys.s purge;
Table dropped.

Notice that we don't have corrupted blocks any more.
RMAN> backup validate check logical datafile 2;

Starting backup at 23-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=67 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata1/data/ora12c/demo01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     0              3929         14593           1818933
  File Name: /oradata1/data/ora12c/demo01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              9851
  Index      0              461
  Other      0              351

Finished backup at 23-AUG-15



Recovering a corrupted index is easy, but recovering data blocks can be much harder or sometimes impossible.
Perform validation and backups regularly, because corruption will hit you when you least expect it ;)
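A minimal recurring check can reuse the two RMAN commands from this post, scheduled regularly against the whole database:

RMAN> backup validate check logical database;
RMAN> report unrecoverable;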



Saturday, May 9, 2015

ASM not starting with ORA-00845 - how to fix ASM parameter file

A few days ago I saw a great post from Norman Dunbar on how to fix a broken ASM spfile.

Since version 11gR2 the ASM spfile can be stored in an ASM diskgroup, and by default the Oracle installer puts it there. So if you want to create a pfile from the spfile, your ASM instance has to be up and running.
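For comparison, when the ASM instance is up this is a one-liner, and it is worth keeping such a pfile copy around as a cheap backup (the path below is just an example):

SQL> create pfile='/tmp/initASM_backup.ora' from spfile;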

If you have an incorrect parameter in the ASM spfile which is preventing ASM from starting, then you have a slight problem. You cannot simply create a pfile from the spfile, correct the parameter in the pfile and recreate the spfile, as you would do for a database.

But don't worry, there are several well explained options available on the net. I would recommend practicing all the scenarios in your test environment if you want to avoid big stress in production later.


When I had problems with a broken ASM parameter file (mostly in test/dev environments), I would always end up searching my notes or blog posts on how to solve this problem.

I knew that the parameters are written directly in the ASM disk header and I could extract them from there, or maybe check the parameters in the ASM alert log, but in the back of my mind I was always thinking that there must be a simpler way.

Thanks to Norman, I now know how to quickly change the incorrect parameter and keep the other parameters intact.


I used this trick a few days ago and it worked perfectly. This blog post is just a reminder which I know will be useful for me in the future.



In my environment I have Oracle Restart with Oracle Database 12.1.0.2.0.

After starting my test server I noticed that something was wrong because ASM was unable to start.
$ ./srvctl status asm
ASM is not running.

When I tried to start ASM manually I received an error:
$ ./srvctl start asm
PRCR-1079 : Failed to start resource ora.asm
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-00845: MEMORY_TARGET not supported on this system
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/obelix/crs/trace/ohasd_oraagent_grid.trc".

CRS-2674: Start of 'ora.asm' on 'obelix' failed


Let's check alert log.
alert+ASM.log
Fri May 01 19:40:16 2015
MEMORY_TARGET defaulting to 1128267776.
* instance_number obtained from CSS = 1, checking for the existence of node 0...
* node 0 does not exist. instance_number = 1
Starting ORACLE instance (normal) (OS id: 4136)
Fri May 01 19:40:16 2015
CLI notifier numLatches:3 maxDescs:222
Fri May 01 19:40:16 2015
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file system to be mounted for at least 1140850688 bytes. /dev/shm is either not mounted or is mounted with available space less than this size. Please fix this so that MEMORY_TARGET can work as expected. Current available is 1051975680 and used is 208896 bytes. Ensure that the mount point is /dev/shm for this directory.


So there is a problem with the MEMORY_TARGET parameter, but how can I disable AMM when my ASM instance is down?

First I had to find the location of the ASM parameter file. I don't have a GPnP profile as this is a single instance setup, so I extracted the ASM parameter file location from the "ora.asm" resource information.
$ crsctl stat res ora.asm -p | egrep "ASM_DISKSTRING|SPFILE"
ASM_DISKSTRING=
SPFILE=+DATA/ASM/ASMPARAMETERFILE/registry.253.822856169


Create a new parameter file with the corrected MEMORY_TARGET parameter.
$ vi /tmp/initASM.ora
spfile="+DATA/asm/asmparameterfile/registry.253.862145335"
MEMORY_TARGET=0


Start ASM instance using new parameter file.

$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.2.0 Production on Fri May 1 20:04:39 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup pfile=/tmp/initASM.ora
ASM instance started

Total System Global Area  197132288 bytes
Fixed Size                  2922520 bytes
Variable Size             169043944 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted

And voilà!
The new parameter was applied and I was able to start the ASM instance.
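
Before touching the spfile it doesn't hurt to confirm that the instance really picked up the spfile stored in the diskgroup, so the ALTER SYSTEM below writes to the right place (a quick sanity check):

SQL> show parameter spfile
SQL> show parameter memory_target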


Change parameter in ASM spfile.
SQL> alter system set memory_target=0 scope=spfile;

System altered.

Restart ASM.
SQL> shutdown immediate;
ASM diskgroups dismounted
ASM instance shutdown

[grid@obelix bin]$ ./srvctl start asm
[grid@obelix bin]$ ./srvctl status asm
ASM is running on obelix


The ASM instance successfully started with the corrected parameter file.



Saturday, February 28, 2015

Restore to Restore Point on Standard Edition (no Flashback technology)

Restore points and Flashback Database are nice features introduced in the 10g database that provide efficient point-in-time recovery to reverse unwanted data changes.

But what if you have a Standard Edition database:

SQL> shutdown immediate;

SQL> startup mount;

SQL> alter database flashback on;
alter database flashback on
*
ERROR at line 1:
ORA-00439: feature not enabled: Flashback Database

In Standard Edition you don't have the Flashback Database feature, but you can still create restore points and perform incomplete recovery to a restore point.


Create test table and insert status row.

SQL> create table admin.test_restore (datum date, komentar varchar2(100));
Table created.

SQL> insert into admin.test_restore values (sysdate, 'Before Restore Point');
1 row created.

SQL> commit;
Commit complete.


Create restore point here.

SQL> create restore point RP_UPGRADE;

Restore point created.


SQL> select scn, to_char(time,'dd.mm.yyyy hh24:mi:ss') time, name
  2  from v$restore_point;

       SCN TIME                NAME
---------- ------------------- ---------------------
    580752 27.02.2015 10:31:19 RP_UPGRADE

Notice how the name of the restore point is associated with an SCN of the database.
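
Because a restore point is really just a named SCN, you can use that SCN directly as well - for example in a flashback query, which works in Standard Edition too as long as the undo data is still available (a small illustrative sketch):

SQL> select datum, komentar
  2  from admin.test_restore as of scn 580752;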


Now you can perform potentially dangerous operations like database upgrades, table modifications, truncating data and the like.

I will enter some status data for later checks.

SQL> insert into admin.test_restore values (sysdate, 'After Restore Point');
1 row created.

SQL> insert into admin.test_restore values (sysdate, 'Upgrade actions performed');
1 row created.

SQL> commit;
Commit complete.


Check table.

SQL> alter session set nls_date_format='dd.mm.yyyy hh24:mi:ss';
Session altered.

SQL> select datum, komentar from admin.test_restore order by datum;

DATUM               KOMENTAR
------------------- ------------------------------
27.02.2015 10:30:39 Before Restore Point
27.02.2015 10:31:45 After Restore Point
27.02.2015 10:31:55 Upgrade actions performed


Suppose we had some problems and want to "rewind" the database to the restore point. In EE we would perform a flashback of the database to the restore point, but in SE we will use a different approach.


Shutdown database and startup mount.

RMAN> shutdown immediate;

using target database control file instead of recovery catalog
database closed
database dismounted
Oracle instance shut down

RMAN> startup mount;

connected to target database (not started)
Oracle instance started
database mounted

Total System Global Area     471830528 bytes

Fixed Size                     2254344 bytes
Variable Size                247466488 bytes
Database Buffers             213909504 bytes
Redo Buffers                   8200192 bytes


Restore and recover database until restore point RP_UPGRADE.

RMAN> restore database until restore point RP_UPGRADE;

Starting restore at 27.02.2015 10:36:26
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=247 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to +DATA1/ora11gr2/datafile/system.291.872722799
channel ORA_DISK_1: restoring datafile 00002 to +DATA1/ora11gr2/datafile/sysaux.292.872722847
channel ORA_DISK_1: restoring datafile 00003 to +DATA1/ora11gr2/datafile/undotbs1.278.872722879
channel ORA_DISK_1: restoring datafile 00004 to +DATA1/ora11gr2/datafile/users.296.872722925
channel ORA_DISK_1: reading from backup piece +FRA1/ora11gr2/backupset/2015_02_27/nnndf0_tag20150227t102559_0.1164.872763961
channel ORA_DISK_1: piece handle=+FRA1/ora11gr2/backupset/2015_02_27/nnndf0_tag20150227t102559_0.1164.872763961 tag=TAG20150227T102559
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:35
Finished restore at 27.02.2015 10:38:02

RMAN> recover database until restore point RP_UPGRADE;

Starting recover at 27.02.2015 10:38:45
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 27.02.2015 10:38:49


Open database with resetlogs option.

RMAN> sql 'alter database open resetlogs';

sql statement: alter database open resetlogs


Final check.

SQL> alter session set nls_date_format='dd.mm.yyyy hh24:mi:ss';

Session altered.

SQL> select datum, komentar
  2  from admin.test_restore
  3  order by datum;

DATUM               KOMENTAR
------------------- --------------------------------------------------
27.02.2015 10:30:39 Before Restore Point


We "rewound" the database to the state that existed when the RP_UPGRADE restore point was created.
This was an incomplete recovery, and the RP_UPGRADE restore point was used just to mark a location in time.
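
If the restore point is still listed in V$RESTORE_POINT after the resetlogs, you can drop it once it is no longer needed (a hedged housekeeping step):

SQL> select name from v$restore_point;
SQL> drop restore point RP_UPGRADE;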



Monday, December 22, 2014

ORA-19599 block corruption when filesystemio_options=SETALL on ext4 file system using Linux

A few days ago I experienced a strange issue in my development environment running on OEL 5.8 with an EXT4 filesystem. Note - the EXT4 filesystem is supported from OEL 5.6.

This was a virtual machine running an oldish 10.2.0.5.0 Oracle database.

I noticed that the backup for my database was failing because of archive log corruption. As this is a development database, I simply deleted the corrupted archive logs and initiated a full backup again. But the backup failed because the new archive logs were corrupted as well.

Weird issue...

I forced a log switch a few times and validated the new archive logs - everything was OK. The redo logs were multiplexed and everything was fine with them. I validated the database for physical and logical corruption - everything was OK.

Then I initiated the backup again and it failed.
This is an excerpt from the RMAN log (I've changed the log slightly):


RMAN> connect target *
2> run
3> {
7>
8> ALLOCATE CHANNEL d1 DEVICE TYPE DISK;
9> BACKUP INCREMENTAL LEVEL 0 FORMAT '/u01/backup_db/QAS/fullbkp_dir/FULL_%d_%u' DATABASE TAG "weekly_full";
10> RELEASE CHANNEL d1;
11> sql 'alter system archive log current';
12> ALLOCATE CHANNEL d1 DEVICE TYPE DISK;
13> BACKUP (ARCHIVELOG ALL FORMAT '/u01/backup_db/QAS/fullbkp_dir/ARCH_%d_%T_%u_s%s_p%p' DELETE INPUT TAG "archivelogs");
14> RELEASE CHANNEL d1;
15>
16> DELETE OBSOLETE;
17>
18> BACKUP CURRENT CONTROLFILE FORMAT '/u01/backup_db/QAS/fullbkp_dir/controlf_%d_%u_%s_%T';
19> }
20>
connected to target database: QAS (DBID=2203246509)
using target database control file instead of recovery catalog

allocated channel: d1
channel d1: sid=43 devtype=DISK

Starting backup at 17.12.2014 08:17:02
channel d1: starting compressed incremental level 0 datafile backupset
channel d1: specifying datafile(s) in backupset
input datafile fno=00035 name=/u01/oradata/qas700.data1
input datafile fno=00036 name=/u01/oradata/qas700.data2
input datafile fno=00037 name=/u01/oradata/qas700.data3
input datafile fno=00002 name=/u01/oradata/undo.data1
...
...
...
channel d1: starting piece 1 at 17.12.2014 08:17:03
channel d1: finished piece 1 at 17.12.2014 09:45:48
piece handle=/u01/backup_db/QAS/fullbkp_dir/FULL_QAS_26pqchvu tag=WEEKLY_FULL comment=NONE
channel d1: backup set complete, elapsed time: 01:28:46
Finished backup at 17.12.2014 09:45:48

Starting Control File and SPFILE Autobackup at 17.12.2014 09:45:48
piece handle=/u01/app/oracle10/product/10.2.0/db_1/dbs/c-2203246509-20141217-13 comment=NONE
Finished Control File and SPFILE Autobackup at 17.12.2014 09:45:53

released channel: d1

sql statement: alter system archive log current

allocated channel: d1
channel d1: sid=43 devtype=DISK

Starting backup at 17.12.2014 09:45:54
current log archived
channel d1: starting compressed archive log backupset
channel d1: specifying archive log(s) in backup set
input archive log thread=1 sequence=11350 recid=39 stamp=866540753
input archive log thread=1 sequence=11351 recid=40 stamp=866540754
channel d1: starting piece 1 at 17.12.2014 09:45:55
released channel: d1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on d1 channel at 12/17/2014 09:45:56
ORA-19599: block number 6144 is corrupt in archived log /u01/oradata/QAS/QASarch/1_11350_826737654.dbf

Recovery Manager complete.


Notice that the full backup finished successfully, but when RMAN tried to back up the new archive logs it failed due to corruption.

I mentioned this issue on Twitter and got responses from Ronald Rood (@Ik_zelf) and Philippe Fierens (@pfierens), who helped me find the resolution.
Thanks guys!


Check this note:
ORA-1578 ORA-353 ORA-19599 Corrupt blocks with zeros when filesystemio_options=SETALL on ext4 file system using Linux (Doc ID 1487957.1)

I had filesystemio_options configured as SETALL, and resetting this parameter to its default value solved my corruption problem.
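
A minimal sketch of the fix, assuming an spfile is in use (on older releases ALTER SYSTEM RESET also needs the SID clause):

SQL> show parameter filesystemio_options
SQL> alter system reset filesystemio_options scope=spfile sid='*';
SQL> shutdown immediate
SQL> startup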


As this was a development machine I wasn't thinking much about the filesystem, but next time it will be ASM or XFS - probably not EXT4 :-)


Wednesday, October 29, 2014

Mount ASM diskgroups with new ASM instance

Imagine you have an 11gR2 Oracle Restart configuration with database files located in ASM.

After a server crash you realize that the local disks are corrupted, and with the local disks you lost all Oracle installations. Even though this is an important system, you don't have a database backup (always take backups!).

But you managed to save all the ASM disks as they were located on separate storage.


This will be a small beginner guide on how to help yourself in such a situation.


As the old server crashed, you must create a new server configuration identical to the old one. The nice thing about ASM is that it keeps its metadata in the disk headers. If the disks are intact and the headers are not damaged, you should be able to mount the diskgroups with a new ASM instance. But this new instance must be compatible with your diskgroups.


The Grid Infrastructure and database software were version 11.2.0.1, and this is the version I will install on the new server.

To keep this post short enough, steps like creating users, installing ASMLib and other packages, configuring kernel parameters, ... are excluded.


List the Oracle ASM disks attached to the new server.
With the "scandisks" command I will find devices which have been labeled as ASM disks.

# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...

# oracleasm listdisks
DISK1
DISK2
DISK3
DISK4
DISK5
FRA1

Install using the "Oracle Grid Infrastructure software only" option to avoid automatic Oracle Restart and ASM configuration. This configuration will be performed manually later.

After the installation finishes, run the noted perl script as root to configure Grid Infrastructure for a stand-alone server.
For my configuration the script looks like this:
To configure Grid Infrastructure for a Stand-Alone Server run the following command as the root user:
/u01/app/11.2.0.1/grid/perl/bin/perl -I/u01/app/11.2.0.1/grid/perl/lib -I/u01/app/11.2.0.1/grid/crs/install /u01/app/11.2.0.1/grid/crs/install/roothas.pl


Start cssd if it’s not running.

# ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        OFFLINE OFFLINE
ora.diskmon
      1        OFFLINE OFFLINE

# ./crs_start ora.cssd
Attempting to start `ora.cssd` on member `asterix`
Attempting to stop `ora.diskmon` on member `asterix`
Stop of `ora.diskmon` on member `asterix` succeeded.
Attempting to start `ora.diskmon` on member `asterix`
Start of `ora.diskmon` on member `asterix` succeeded.
Start of `ora.cssd` on member `asterix` succeeded.


Create a parameter file for the ASM instance in the $ORACLE_HOME/dbs directory of the Grid Infrastructure home.

init+ASM.ora
*.asm_diskstring='/dev/oracleasm/disks'
*.asm_power_limit=1
*.diagnostic_dest='/u01/app/grid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'


Register and start ASM instance.

$ export ORACLE_SID=+ASM
$ export ORACLE_HOME=/u01/app/11.2.0.1/grid
$ srvctl add asm -p $ORACLE_HOME/dbs/init+ASM.ora

$ srvctl start asm
$ srvctl status asm
ASM is running on asterix
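
Before mounting anything, you can sanity-check from the new ASM instance that the disk headers survived - HEADER_STATUS should show MEMBER for intact disks (a quick check):

SQL> select path, header_status
  2  from v$asm_disk
  3  order by path;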


Now notice what I see when I start the ASM Configuration Assistant.

$ ./asmca



These are the diskgroups with my database and recovery files.
Click "Mount All" to mount them all.
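
If you prefer the command line over asmca, a hedged equivalent is to mount the diskgroups from SQL*Plus connected as SYSASM (the diskgroup names below are the ones from this environment):

SQL> select name, state from v$asm_diskgroup;
SQL> alter diskgroup DATA1 mount;
SQL> alter diskgroup DATA2 mount;
SQL> alter diskgroup FRA1 mount;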






Install the Oracle database software and create a parameter file in "$ORACLE_HOME/dbs" to start the database.

$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ export ORACLE_SID=ora11gr2

$ cd $ORACLE_HOME/dbs
$ cat initora11gr2.ora
*.spfile='+DATA1/ora11gr2/spfileora11gr2.ora'

$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Wed Oct 29 14:29:37 2014

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  668082176 bytes
Fixed Size                  2216344 bytes
Variable Size             222301800 bytes
Database Buffers          436207616 bytes
Redo Buffers                7356416 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DATA1/ora11gr2/datafile/system.297.844627929
+DATA1/ora11gr2/datafile/sysaux.265.844627967
+DATA1/ora11gr2/datafile/undotbs1.266.844627991
+DATA1/ora11gr2/datafile/users.267.844628031
+DATA2/ora11gr2/datafile/marko.261.859213577


The database is successfully opened and you can register the instance using the SRVCTL command.

$ srvctl add database -d $ORACLE_SID -o $ORACLE_HOME -p $ORACLE_HOME/dbs/initora11gr2.ora
$ srvctl start database -d $ORACLE_SID


Final status.

$ ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA1.dg
               ONLINE  ONLINE       asterix
ora.DATA2.dg
               ONLINE  ONLINE       asterix
ora.FRA1.dg
               ONLINE  ONLINE       asterix
ora.asm
               ONLINE  ONLINE       asterix                  Started
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       asterix
ora.diskmon
      1        ONLINE  ONLINE       asterix
ora.ora11gr2.db
      1        ONLINE  ONLINE       asterix                  Open


Be aware that this demo was performed in a virtual environment on my notebook.

Friday, July 25, 2014

Using Oracle Flex ASM with single instance database

Oracle Flex ASM was introduced in version 12c. In my opinion, it is one of the best features introduced with the new version.

I won't go into detail about Flex ASM because you can find more information in the documentation. In this post I will concentrate on how Flex ASM handles a crash of an ASM instance.

For this test I've created a 2-node cluster - 12c Grid Infrastructure with Flex ASM enabled.

$ asmcmd showclustermode
ASM cluster : Flex mode enabled

$ srvctl config asm
ASM home: /u01/app/12.1.0/grid_1
Password file: +OCRVOTE/ASM/PASSWORD/pwdasm.256.853771307
ASM listener: LISTENER
ASM instance count: ALL
Cluster ASM listener: ASMNET1LSNR_ASM

$ srvctl status asm
ASM is running on cluster1,cluster2


Install a single instance database on one of the nodes.

$ ./dbca -silent \
> -createDatabase \
> -templateName General_Purpose.dbc \
> -gdbName singl12 \
> -sid singl12 \
> -sysPassword oracle \
> -SystemPassword oracle \
> -emConfiguration none \
> -recoveryAreaDestination FRA \
> -storageType ASM \
> -asmSysPassword oracle \
> -diskGroupName DATA \
> -characterSet AL32UTF8 \
> -nationalCharacterSet AL16UTF16 \
> -totalMemory 768 \

Copying database files
1% complete
3% complete
10% complete
17% complete
24% complete
31% complete
35% complete
Creating and starting Oracle instance
37% complete
42% complete
47% complete
52% complete
53% complete
56% complete
58% complete
Registering database with Oracle Restart
64% complete
Completing Database Creation
68% complete
71% complete
75% complete
85% complete
96% complete
100% complete
Look at the log file "/u01/app/orcl12/cfgtoollogs/dbca/singl12/singl12.log" for further details.


The single instance database is registered in the OCR.

$ srvctl config database -d singl12
Database unique name: singl12
Database name: singl12
Oracle home: /u01/app/orcl12/product/12.1.0/dbhome_1
Oracle user: orcl12
Spfile: +DATA/singl12/spfilesingl12.ora
Password file:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: singl12
Database instance: singl12
Disk Groups: DATA
Mount point paths:
Services:
Type: SINGLE  <<<<<-------
Database is administrator managed

V$ASM_CLIENT shows that my database is managed by the Oracle ASM instance.

SQL> select instance_name, db_name, status
  2  from v$asm_client
  3  where db_name='singl12';

INSTANCE_NAME        DB_NAME  STATUS
-------------------- -------- ------------
singl12              singl12  CONNECTED


Check that ASM instances are running on both nodes.

$ ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE            , ONLINE
STATE=ONLINE on cluster2, ONLINE on cluster1


My database is running on the cluster1 node.

$ srvctl status database -d singl12
Instance singl12 is running on node cluster1

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME   HOST_NAME
--------------- --------------------
singl12         cluster1.localdomain

Now I will simulate a crash of the ASM instance on the cluster1 node, where my database is running.

# ps -ef|grep asm_pmon|grep -v grep
oracle    3072     1  0 10:12 ?        00:00:01 asm_pmon_+ASM1

# kill -9 3072

Without Flex ASM I would expect a crash of the ASM instance to bring down the database instance as well, but with Flex ASM my database stays up and running.

Check the alert log of the database instance:
...
NOTE: ASMB registering with ASM instance as client 0x10005 (reg:2156157897)
NOTE: ASMB connected to ASM instance +ASM2 (Flex mode; client id 0x10005)
NOTE: ASMB rebuilding ASM server state
NOTE: ASMB rebuilt 1 (of 1) groups
NOTE: ASMB rebuilt 13 (of 13) allocated files
NOTE: fetching new locked extents from server
NOTE: 0 locks established; 0 pending writes sent to server
SUCCESS: ASMB reconnected & completed ASM server state

Check line - "NOTE: ASMB connected to ASM instance +ASM2 (Flex mode; client id 0x10005)"

As the +ASM1 instance crashed, ASMB connected to ASM instance +ASM2.


Check status:
# ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE            , ONLINE
STATE=ONLINE on cluster2, INTERMEDIATE on cluster1


SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME   HOST_NAME
--------------- --------------------
singl12         cluster1.localdomain

Oracle Clusterware restarted the crashed ASM instance and both instances were up within a minute.

# ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE            , ONLINE
STATE=ONLINE on cluster2, ONLINE on cluster1

Now let's test a crash of the ASM instance on the second node.

SQL> select instance_name from v$instance;

INSTANCE_NAME
----------------
+ASM2

SQL> shutdown abort;
ASM instance shutdown

Excerpt from the alert log:

...
Fri Jul 25 12:44:33 2014
NOTE: ASMB registering with ASM instance as client 0x10005 (reg:4169355750)
NOTE: ASMB connected to ASM instance +ASM1 (Flex mode; client id 0x10005)
NOTE: ASMB rebuilding ASM server state
NOTE: ASMB rebuilt 1 (of 1) groups
NOTE: ASMB rebuilt 13 (of 13) allocated files
NOTE: fetching new locked extents from server
NOTE: 0 locks established; 0 pending writes sent to server
SUCCESS: ASMB reconnected & completed ASM server state


Again, a user connected to the database instance didn't even notice that something was happening with ASM.

Flex ASM enables ASM instances to run on separate nodes from the database servers. If an ASM instance fails, the database will fail over to another available ASM instance.
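
A quick way to see which ASM instance is serving each database across the cluster is GV$ASM_CLIENT (a small check you can run from any instance that exposes the GV$ views):

SQL> select inst_id, instance_name, db_name, status
  2  from gv$asm_client;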

In case you are running pre-12c databases on your cluster, you can still configure Flex ASM, but you are required to configure local ASM instances on those nodes. ASM instance failover won't work for 10g or 11g databases.

Good reason to move towards 12c? ;-)



Thursday, July 17, 2014

ORA-19909: datafile 1 belongs to an orphan incarnation

I love to read Oracle-related blogs, forum posts and mailing lists, much more often than books. Why? Because there many Oracle DBAs and developers share their experiences, problems, "best practices", ... which are very valuable to me.

It's great that we have such a big and active Oracle community.

Today I noticed a mail on the Oracle-L list where someone asked for help with recovery after overwriting production controlfiles. Check Oracle-L for more info.

It reminded me that I haven't played with controlfile recoveries for a while.
Mistakes or various disasters can happen when you least expect them. When a problem occurs, it is essential that the DBA is confident with the recovery procedure.
Confidence comes only with practice. Create a test environment and enjoy destroying/recovering databases.
Of course - be creative with disasters :)


So I've imagined one scenario and decided to share it in the form of a blog post.


Create a backup of the current controlfile.

RMAN> backup current controlfile;

Starting backup at 17.07.2014 13:45:29
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
channel ORA_DISK_1: starting piece 1 at 17.07.2014 13:45:30
channel ORA_DISK_1: finished piece 1 at 17.07.2014 13:45:33
piece handle=/u01/oradata/fra/ORAKL/backupset/2014_07_17/o1_mf_ncnnf_TAG20140717T142921_9whjf2sc_.bkp tag=TAG20140717T134529 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 17.07.2014 13:45:33

After that I performed an incomplete recovery and opened the database with the resetlogs option.

...
...
...
SQL> alter database open resetlogs;
Database altered.

Create a small status table and insert one row.

SQL> create table admin.test (datum date);
Table created.

SQL> insert into admin.test values(sysdate);
1 row created.

SQL> commit;
Commit complete.

SQL> select to_char(datum,'dd.mm.yyyy hh24:mi:ss') from admin.test;

TO_CHAR(DATUM,'DD.M
-------------------
17.07.2014 14:44:44


Where are my controlfiles:

SQL> select name from v$controlfile;

NAME
--------------------------------------------------------------------------------
/u01/oradata/orakl/orakl/control01.ctl
/u01/oradata/orakl/orakl/control02.ctl
/u01/oradata/orakl/orakl/control03.ctl

Overwrite all controlfiles with a random file to simulate the problem.

$ cp o1_mf_s_853163282_9whfzmk1_.bkp /u01/oradata/orakl/orakl/control01.ctl
$ cp o1_mf_s_853163282_9whfzmk1_.bkp /u01/oradata/orakl/orakl/control02.ctl
$ cp o1_mf_s_853163282_9whfzmk1_.bkp /u01/oradata/orakl/orakl/control03.ctl

After that the instance crashed with an error in the alert log:

Thu Jul 17 14:45:30 2014
Errors in file /u01/app/oracle/diag/rdbms/orakl/orakl/trace/orakl_ckpt_8103.trc:
ORA-00201: control file version  incompatible with ORACLE version
ORA-00202: control file: '/u01/oradata/orakl/orakl/control01.ctl'
CKPT (ospid: 8103): terminating the instance due to error 201
Instance terminated by CKPT, pid = 8103

To perform a quick recovery I restored the previously backed up controlfile and mounted the database.

RMAN> run
2> {
3> restore controlfile from '/u01/oradata/fra/ORAKL/backupset/2014_07_17/o1_mf_ncnnf_TAG20140717T142921_9whjf2sc_.bkp';
4> }

Starting restore at 17.07.2014 14:46:34
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=153 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
output file name=/u01/oradata/orakl/orakl/control01.ctl
output file name=/u01/oradata/orakl/orakl/control02.ctl
output file name=/u01/oradata/orakl/orakl/control03.ctl
Finished restore at 17.07.2014 14:46:37

RMAN> alter database mount;
database mounted

Recover database using backup controlfile.

SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 1 belongs to an orphan incarnation
ORA-01110: data file 1: '/u01/oradata/orakl/orakl/system01.dbf'

Recovery failed with the ORA-19909 error.
If you remember, I performed an incomplete recovery after creating the controlfile backup.
The controlfile backup belongs to another incarnation, and I don't have a fresh controlfile backup.


I will use this "old" controlfile and create a SQL script to reproduce a new controlfile.

SQL> alter database backup controlfile to trace as '/tmp/ctlfile.sql';

Database altered.

Delete the unnecessary lines and create the SQL script "/tmp/ctlfile.sql":

STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "ORAKL" RESETLOGS  ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 '/u01/oradata/orakl/orakl/redo01.log'  SIZE 50M,
  GROUP 2 '/u01/oradata/orakl/orakl/redo02.log'  SIZE 50M,
  GROUP 3 '/u01/oradata/orakl/orakl/redo03.log'  SIZE 50M
DATAFILE
  '/u01/oradata/orakl/orakl/system01.dbf',
  '/u01/oradata/orakl/orakl/sysaux01.dbf',
  '/u01/oradata/orakl/orakl/undotbs01.dbf',
  '/u01/oradata/orakl/orakl/users01.dbf'
CHARACTER SET UTF8
;

Shut down the instance and create the controlfile.

SQL> shutdown abort;
ORACLE instance shut down.
SQL>
SQL>
SQL> @/tmp/ctlfile.sql
ORACLE instance started.

Total System Global Area  626327552 bytes
Fixed Size                  2162280 bytes
Variable Size             171966872 bytes
Database Buffers          444596224 bytes
Redo Buffers                7602176 bytes

Control file created.

Use the information from the redo logs for recovery.

SQL> select member, status from v$logfile;

MEMBER                                                                 STATUS
---------------------------------------------------------------------- ---------
/u01/oradata/orakl/orakl/redo03.log                                    STALE
/u01/oradata/orakl/orakl/redo02.log                                    STALE
/u01/oradata/orakl/orakl/redo01.log                                    STALE

SQL> recover database using backup controlfile;
ORA-00279: change 7015200 generated at 07/17/2014 14:43:54 needed for thread 1
ORA-00289: suggestion :
/u01/oradata/fra/ORAKL/archivelog/2014_07_17/o1_mf_1_1_%u_.arc
ORA-00280: change 7015200 for thread 1 is in sequence #1


Specify log: {=suggested | filename | AUTO | CANCEL}
/u01/oradata/orakl/orakl/redo01.log
Log applied.
Media recovery complete.

Open database with resetlogs.

SQL> alter database open resetlogs;

Database altered.

Check status table.

SQL> select to_char(datum,'dd.mm.yyyy hh24:mi:ss') from admin.test;

TO_CHAR(DATUM,'DD.M
-------------------
17.07.2014 14:44:44

Check incarnations:

RMAN> list incarnation;

using target database controlfile instead of recovery catalog

List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- --- ---------- ----------
2       2       ORAKL    3724279545       PARENT  641001     12.10.2013 15:30:46
1       1       ORAKL    3724279545       PARENT  7015197    17.07.2014 14:43:44
3       3       ORAKL    3724279545       CURRENT 7015386    17.07.2014 14:55:50



Cheers!

Friday, June 27, 2014

DBMS_SPM.LOAD_PLAN_FROM_CURSOR_CACHE typo

I've played a little with Oracle SQL Plan Management and something caught my eye.

Let's create a simple test.
SQL> select /* test_spm */ count(*) from admin.objekti;

  COUNT(*)
----------
     76378

SQL> set lines 200
SQL> col sql_text for a50 wrapped
SQL> select sql_id, sql_text from v$sql
  2  where sql_text like '%test_spm%';

SQL_ID        SQL_TEXT
------------- --------------------------------------------------
cpzgdw9swdvzk select /* test_spm */ count(*) from admin.objekti
ap9td7vafq26n select sql_id, sql_text from v$sql where sql_text
              like '%test_spm%'

I will try to load plan from the cursor cache using SQL_ID.
SQL> var cnt number
SQL> execute :cnt := DBMS_SPM.LOAD_PLAN_FROM_CURSOR_CACHE(sql_id => 'cpzgdw9swdvzk');
BEGIN :cnt := DBMS_SPM.LOAD_PLAN_FROM_CURSOR_CACHE(sql_id => 'cpzgdw9swdvzk'); END;

                       *
ERROR at line 1:
ORA-06550: line 1, column 24:
PLS-00302: component 'LOAD_PLAN_FROM_CURSOR_CACHE' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored

Hm... PLS-00302 error when I use LOAD_PLAN_FROM_CURSOR_CACHE.

Change the function name and use PLANS instead of PLAN.
SQL> execute :cnt := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(sql_id => 'cpzgdw9swdvzk');

PL/SQL procedure successfully completed.

SQL> print cnt

       CNT
----------
         1

Now everything works as it should.
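
To verify that the plan really ended up in the SQL Management Base, you can check DBA_SQL_PLAN_BASELINES (a small sanity check; the LIKE filter just narrows the output to this test statement):

SQL> select sql_handle, plan_name, origin, enabled, accepted
  2  from dba_sql_plan_baselines
  3  where sql_text like '%test_spm%';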

The documentation clearly says to use PLANS:

LOAD_PLANS_FROM_CURSOR_CACHE Functions

This function loads one or more plans present in the cursor cache for a SQL statement, or a set of SQL statements. It has four overloads: using SQL statement text, using SQL handle, using SQL ID, or using attribute_name and attribute_value pair.

But try searching for LOAD_PLAN_FROM_CURSOR_CACHE on Google.

There are many sites mentioning that procedure.

Even in the Oracle documentation - "Oracle® Database Upgrade Guide 12c Release 1 (12.1)" (and earlier releases) - this procedure is mentioned:

1. In the source release of Oracle Database, use the DBMS_SPM.LOAD_PLAN_FROM_CURSOR_CACHE procedure or Oracle Enterprise Manager to load all of the execution plans in the cursor cache into the SQL Management Base.


Now I'm a bit confused.

Let's check procedure names on three database releases.


12.1.0.1.0
SQL> select procedure_name from dba_procedures
  2  where procedure_name like 'LOAD_PLAN%';

PROCEDURE_NAME
--------------------------------------------------------------------------------------------------------------------------------
LOAD_PLANS_SET
LOAD_PLANS_FROM_SQLSET
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE

6 rows selected.

11.2.0.4.0
SQL> select procedure_name from dba_procedures
  2  where procedure_name like 'LOAD_PLAN%';

PROCEDURE_NAME
------------------------------
LOAD_PLANS_SET
LOAD_PLANS_FROM_SQLSET
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE

6 rows selected.

11.1.0.7.0
SQL> select procedure_name from dba_procedures
  2  where procedure_name like 'LOAD_PLAN%';

PROCEDURE_NAME
------------------------------
LOAD_PLANS_SET
LOAD_PLANS_FROM_SQLSET
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE
LOAD_PLANS_FROM_CURSOR_CACHE

6 rows selected.


Still no sign of LOAD_PLAN_FROM_CURSOR_CACHE, or maybe I'm looking in the wrong direction.

In the end, I think this is just an often repeated typo.