I won’t go into detail about Flex ASM, because the documentation covers it well. In this post I will concentrate on how Flex ASM handles the crash of an ASM instance.
For this test I’ve created a 2-node cluster running 12c Grid Infrastructure with Flex ASM enabled.
$ asmcmd showclustermode
ASM cluster : Flex mode enabled

$ srvctl config asm
ASM home: /u01/app/12.1.0/grid_1
Password file: +OCRVOTE/ASM/PASSWORD/pwdasm.256.853771307
ASM listener: LISTENER
ASM instance count: ALL
Cluster ASM listener: ASMNET1LSNR_ASM

$ srvctl status asm
ASM is running on cluster1,cluster2
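If you want to script the Flex mode check rather than read the output by eye, a minimal sketch could look like this. The function name is_flex_asm is my own invention, and it assumes the output contains the exact text "Flex mode enabled" as shown above.

```shell
# Hypothetical helper (not an Oracle tool): reads the output of
# `asmcmd showclustermode` on stdin and succeeds only if it reports
# that Flex mode is enabled.
is_flex_asm() {
    grep -q 'Flex mode enabled'
}
```

It would be used as: asmcmd showclustermode | is_flex_asm && echo "Flex ASM is on"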
Install a single instance database on one of the nodes.
$ ./dbca -silent \
> -createDatabase \
> -templateName General_Purpose.dbc \
> -gdbName singl12 \
> -sid singl12 \
> -sysPassword oracle \
> -SystemPassword oracle \
> -emConfiguration none \
> -recoveryAreaDestination FRA \
> -storageType ASM \
> -asmSysPassword oracle \
> -diskGroupName DATA \
> -characterSet AL32UTF8 \
> -nationalCharacterSet AL16UTF16 \
> -totalMemory 768 \

Copying database files
1% complete
3% complete
10% complete
17% complete
24% complete
31% complete
35% complete
Creating and starting Oracle instance
37% complete
42% complete
47% complete
52% complete
53% complete
56% complete
58% complete
Registering database with Oracle Restart
64% complete
Completing Database Creation
68% complete
71% complete
75% complete
85% complete
96% complete
100% complete
Look at the log file "/u01/app/orcl12/cfgtoollogs/dbca/singl12/singl12.log" for further details.
The single instance database is registered in the OCR.
$ srvctl config database -d singl12
Database unique name: singl12
Database name: singl12
Oracle home: /u01/app/orcl12/product/12.1.0/dbhome_1
Oracle user: orcl12
Spfile: +DATA/singl12/spfilesingl12.ora
Password file:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: singl12
Database instance: singl12
Disk Groups: DATA
Mount point paths:
Services:
Type: SINGLE <<<<<-------
Database is administrator managed
V$ASM_CLIENT shows that my database is managed by the Oracle ASM instance.
SQL> select instance_name, db_name, status
  2  from v$asm_client
  3  where db_name='singl12';

INSTANCE_NAME        DB_NAME  STATUS
-------------------- -------- ------------
singl12              singl12  CONNECTED
Check that ASM instances are running on both nodes.
$ ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on cluster2, ONLINE on cluster1
My database is running on the cluster1 node.
$ srvctl status database -d singl12
Instance singl12 is running on node cluster1

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME   HOST_NAME
--------------- --------------------
singl12         cluster1.localdomain
Now I will simulate a crash of the ASM instance on the cluster1 node, where my database is running.
# ps -ef|grep asm_pmon|grep -v grep
oracle 3072 1 0 10:12 ? 00:00:01 asm_pmon_+ASM1

# kill -9 3072
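The grep | grep -v grep chain used to find the PMON process can be folded into a single awk filter. This is only a sketch: asm_pmon_pid is a hypothetical name, and it assumes the ps -ef column layout shown above, with the PID in the second column and the process name in the last.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and prints the PID
# (second column) of every process whose last column starts with
# "asm_pmon_". A "grep asm_pmon" line is skipped on its own, because its
# last column is "asm_pmon" without the trailing underscore.
asm_pmon_pid() {
    awk '$NF ~ /^asm_pmon_/ { print $2 }'
}
```

It would be used as: ps -ef | asm_pmon_pid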
Without Flex ASM, I would expect the crash of the ASM instance to take the database instance down with it, but with Flex ASM my database stays up and running.
Check alert log of database instance:
...
NOTE: ASMB registering with ASM instance as client 0x10005 (reg:2156157897)
NOTE: ASMB connected to ASM instance +ASM2 (Flex mode; client id 0x10005)
NOTE: ASMB rebuilding ASM server state
NOTE: ASMB rebuilt 1 (of 1) groups
NOTE: ASMB rebuilt 13 (of 13) allocated files
NOTE: fetching new locked extents from server
NOTE: 0 locks established; 0 pending writes sent to server
SUCCESS: ASMB reconnected & completed ASM server state
Note the line "NOTE: ASMB connected to ASM instance +ASM2 (Flex mode; client id 0x10005)".
Because the +ASM1 instance crashed, ASMB connected to ASM instance +ASM2.
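To see which ASM instance the database is currently using without scrolling through the alert log, something like the sketch below might help. The function name asmb_current_asm is hypothetical, and it relies only on the "ASMB connected to ASM instance" message format shown in the excerpt above.

```shell
# Hypothetical helper: reads database alert log text on stdin and prints
# the ASM instance name from the most recent
# "NOTE: ASMB connected to ASM instance <name>" line.
# Prints nothing if no such line is present.
asmb_current_asm() {
    sed -n 's/.*NOTE: ASMB connected to ASM instance \([^ ]*\).*/\1/p' | tail -n 1
}
```

It would be used as: asmb_current_asm < "$ALERT_LOG", where ALERT_LOG is a placeholder for the path to your database alert log.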
Check status:
# ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on cluster2, INTERMEDIATE on cluster1

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME   HOST_NAME
--------------- --------------------
singl12         cluster1.localdomain
Oracle Clusterware restarted the crashed ASM instance, and both instances were up within a minute.
# ./crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on cluster2, ONLINE on cluster1
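If you want to watch the recovery from a script instead of rerunning the command by hand, the STATE line can be parsed to list the nodes where ASM is not yet ONLINE. This is a rough sketch under the assumption that the line always has the "STATE=<state> on <node>, ..." shape seen above. The function name asm_nodes_not_online is hypothetical.

```shell
# Hypothetical helper: reads `crsctl status resource ora.asm` output on
# stdin and prints "<node> <state>" for each entry on the STATE line whose
# state is not ONLINE. Prints "-" as the node if an entry names no node.
asm_nodes_not_online() {
    awk -F= '$1 == "STATE" {
        n = split($2, parts, /, */)
        for (i = 1; i <= n; i++) {
            # each part is expected to look like "ONLINE on cluster1"
            split(parts[i], f, " ")
            node = (f[3] != "" ? f[3] : "-")
            if (f[1] != "ONLINE") print node " " f[1]
        }
    }'
}
```

It would be used as: ./crsctl status resource ora.asm | asm_nodes_not_online. Empty output means every listed node reports ONLINE.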
Now let’s test a crash of the ASM instance on the second node.
SQL> select instance_name from v$instance;

INSTANCE_NAME
----------------
+ASM2

SQL> shutdown abort;
ASM instance shutdown
Excerpt from the database alert log:
...
Fri Jul 25 12:44:33 2014
NOTE: ASMB registering with ASM instance as client 0x10005 (reg:4169355750)
NOTE: ASMB connected to ASM instance +ASM1 (Flex mode; client id 0x10005)
NOTE: ASMB rebuilding ASM server state
NOTE: ASMB rebuilt 1 (of 1) groups
NOTE: ASMB rebuilt 13 (of 13) allocated files
NOTE: fetching new locked extents from server
NOTE: 0 locks established; 0 pending writes sent to server
SUCCESS: ASMB reconnected & completed ASM server state
Again, a user connected to the database instance didn’t even notice that something was happening with ASM.
Flex ASM allows ASM instances to run on different nodes than the database servers. If an ASM instance fails, the database will fail over to another available ASM instance.
If you are running pre-12c databases on your cluster, you can still configure Flex ASM, but you are required to configure local ASM instances on the nodes. ASM instance failover won’t work for 10g or 11g databases.
Good reason to move towards 12c? ;-)
As if the only issue with RAC instances crashing were the crashing of ASM.
This feature is hardly useful. It's another set of complicated code, which means more bugs.
RAC is becoming almost unnecessary in this virtualization era.
Are you running 12c RAC in production, and have you had problems with Flex ASM?
Can you please show me some proof of a RAC instance crashing due to a bug/problem with Flex ASM?
I'm not running 12c RAC with Flex ASM in production and I would be glad to learn more about possible problems.
Please share a blog post, forum topic...
Regards,
Marko