Oracle RAC Split Brain and Enhancements in 12c

What is Split Brain?



In an Oracle RAC cluster, a number of servers are connected to each other to form a cluster. These servers communicate with each other via the private interconnect.

If the nodes fail to communicate with each other (for various reasons, e.g. a network interface failure), they form sub-clusters and start operating independently as separate clusters. This situation is called split brain; it is really dangerous and leads to data integrity problems.
Oracle does not let the independent sub-clusters keep running and managing the database.

How does the Oracle Grid Infrastructure Clusterware resolve a “split brain” situation?


This is where the voting disks come into the picture. Each node in the cluster periodically sends its heartbeat to the voting disk. The nodes mark their attendance in the voting disk, recording that they are alive in the cluster and which nodes they can communicate with.
In a split-brain situation, the voting disk is used to determine which node/sub-cluster will survive and which will be evicted.

The algorithm is as follows:

  1. If the sub-clusters are of different sizes, the clusterware identifies the largest sub-cluster, which survives, and evicts the smaller sub-cluster.
  2. If all the sub-clusters are of the same size, the sub-cluster containing the lowest numbered node survives.
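As a rough illustration (not Oracle code), the pre-12c decision can be sketched in a few lines of Python, with each sub-cluster represented as a list of node numbers:

```python
def surviving_subcluster(subclusters):
    """Pick the sub-cluster that survives a split brain (pre-12c rule).

    subclusters: list of sub-clusters, each a list of node numbers.
    Rule 1: the largest sub-cluster survives.
    Rule 2: on a size tie, the sub-cluster containing the lowest
            numbered node survives.
    """
    # Sort key: descending size first, then ascending lowest node number.
    return max(subclusters, key=lambda sc: (len(sc), -min(sc)))

# A 5-node cluster splits 3-vs-2: the larger side survives.
print(surviving_subcluster([[1, 2, 3], [4, 5]]))   # [1, 2, 3]

# A 2-node cluster splits 1-vs-1: node 1 (lowest number) survives.
print(surviving_subcluster([[1], [2]]))            # [1]
```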

What Changed in 12c?


The algorithm has been changed a bit, and the concept of weight-based node eviction has been introduced.

Weight-Based Node Eviction?


Getting curious?

Here you go:

  1. If the sub-clusters are of different sizes, the functionality is the same as earlier: the bigger one survives and the smaller one is evicted.
  2. If the sub-clusters have unequal node weights, the sub-cluster with the higher weight survives, so that in a 2-node cluster the node with the lowest node number might be evicted if it has a lower weight.
  3. If the sub-clusters have equal node weights, the sub-cluster containing the lowest numbered node survives, so that in a 2-node cluster the node with the lowest node number will survive.
The best thing here is that you can use the crsctl command to assign weights, instructing the clusterware to take your preferences into account when making the eviction decision.
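The three 12c rules can be sketched the same way (again, just an illustration of the decision order, not Oracle's actual implementation); here each sub-cluster is assumed to carry a total weight:

```python
def surviving_subcluster_12c(subclusters):
    """Pick the surviving sub-cluster under the 12c rules (sketch).

    subclusters: list of (node_numbers, weight) tuples.
    Rule 1: the larger sub-cluster wins.
    Rule 2: on a size tie, the higher weight wins.
    Rule 3: on a weight tie, the lowest node number wins.
    """
    return max(subclusters,
               key=lambda sc: (len(sc[0]), sc[1], -min(sc[0])))[0]

# 1-vs-1 split, node 2 carries more weight: node 2 survives.
print(surviving_subcluster_12c([([1], 0), ([2], 1)]))  # [2]

# 1-vs-1 split, equal weights: node 1 (lowest number) survives.
print(surviving_subcluster_12c([([1], 0), ([2], 0)]))  # [1]
```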

You can assign weights to various components as follows:

• To assign weight to database instances or services, use the -css_critical yes parameter with the srvctl add database or srvctl add service commands when adding a database instance or service. You can also use the parameter with the srvctl modify database and srvctl modify service commands.

• To assign weight to non ora.* resources, use the -attr "CSS_CRITICAL=yes" parameter with the crsctl add resource and crsctl modify resource commands when you are adding or modifying resources.
• To assign weight to a server, use the -css_critical yes parameter with the crsctl set server command.

You can also check the current setting:

crsctl get server css_critical
CRS-5092: Current value of the server attribute CSS_CRITICAL is no.

Enough theory... let's get practical.

No Manual Weight Assigned

crsctl get server css_critical
CRS-5092: Current value of the server attribute CSS_CRITICAL is no.


enp0s8 is used as the private interconnect:

oifcfg getif
enp0s3  192.9.1.0  global  public
enp0s8  10.0.0.0  global  cluster_interconnect,asm

It is a 2-node cluster and both nodes are active:

olsnodes -s -n
node1   1       Active
node2   2       Active

Let's bring down enp0s8 to simulate a communication failure between node1 and node2 and see what happens:

ifdown enp0s8

olsnodes -s -n
node1   1       Active
node2   2       Inactive

The eviction decision is visible in node1's OCSSD trace (ocssd.trc):

2017-06-18 16:08:21.220 :    CSSD:1825834752: clssnmrCheckNodeWeight: node(1) has weight stamp(393228187) pebbles (0) goldstars (0) flags (3) SpoolVersion (0)
2017-06-18 16:08:21.220 :    CSSD:1825834752: clssnmrCheckNodeWeight: node(2) has weight stamp(0) pebbles (0) goldstars (0) flags (0) SpoolVersion (0)
2017-06-18 16:08:21.727 :    CSSD:1825834752: clssnmrCheckNodeWeight: node(1) has weight stamp(393228187) pebbles (0) goldstars (0) flags (3) SpoolVersion (0)
2017-06-18 16:08:21.727 :    CSSD:1825834752: clssnmrCheckNodeWeight: node(2) has weight stamp(0) pebbles (0) goldstars (0) flags (0) SpoolVersion (0)
2017-06-18 16:08:21.727 :    CSSD:1825834752: clssnmrCheckNodeWeight: Server pool version not consistent
2017-06-18 16:08:21.727 :    CSSD:1825834752: clssnmrCheckNodeWeight: stamp(393228187), completed (1/2)
2017-06-18 16:08:21.791 :    CSSD:1833719552: clssnmvDiskKillCheck: not evicted, file /dev/oracleasm/disks/DISK1 flags 0x00000000, kill block unique 0, my unique 1497779187
2017-06-18 16:08:21.791 :    CSSD:1843181312: clssnmvDiskKillCheck: not evicted, file /dev/oracleasm/disks/DISK3 flags 0x00000000, kill block unique 0, my unique 1497779187
2017-06-18 16:08:21.792 :    CSSD:1838450432: clssnmvDiskKillCheck: not evicted, file /dev/oracleasm/disks/DISK2 flags 0x00000000, kill block unique 0, my unique
2017-06-18 16:08:24.267 :    CSSD:1825834752: clssnmRemove: Start
2017-06-18 16:08:24.267 :    CSSD:1825834752: (:CSSNM00007:)clssnmrRemoveNode: Evicting node 2, node2, from the cluster in incarnation 393228188, node birth incarnation 393228179, death incarnation 393228188, stateflags 0x264000 uniqueness value 1497745656
2017-06-18 16:08:24.267 : default:1825834752: kgzf_gen_node_reid2: generated reid
cid=fbf86955c586cfcbbf3d85761fe601d7,icin=393228178,nmn=2,lnid=393228179,gid=0,gin=0,gmn=0,umemid=0,opid=0,opsn=0,lvl=node hdr=0xfece0100
2017-06-18 16:08:24.267 :    CSSD:1825834752: clssscFenceSage: Fenced node node2, number 2, with EXADATA, handle 0
2017-06-18 16:08:24.267 :    CSSD:1825834752: clssnmrFenceSage: Fenced node node2, number 2, with EXADATA, handle 0
2017-06-18 16:08:24.267 :    CSSD:1825834752: clssnmrFenceCLSFA: clsfaFence request issued for node(2), name(node2), fence type(3), handle(0x7f1a74274880)
2017-06-18 16:08:24.267 :    CSSD:1825834752: clssnmSendShutdown: Sending shutdown to node node2, number 2, with kill time 36297354 and reason clssnmKillReasonEvicted
2017-06-18 16:08:24.267 :    CSSD:1825834752: clssnmsendmsg: not connected to node 2


Conclusion: Based on the services running on each server, the clusterware calculated the node weights itself and evicted the node (node2) that had the lower weight, as we can see from scanning the OCSSD.trc logs.
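If you want to pull the weight information out of a trace like the one above instead of eyeballing it, a quick scan with a regular expression is enough. This is a hypothetical helper, and the pattern assumes the trace lines match the clssnmrCheckNodeWeight format shown here:

```python
import re

# Matches the clssnmrCheckNodeWeight lines in the trace above and
# captures the node number plus its stamp/pebbles/goldstars counters.
WEIGHT_RE = re.compile(
    r"clssnmrCheckNodeWeight: node\((\d+)\) has weight "
    r"stamp\((\d+)\) pebbles \((\d+)\) goldstars \((\d+)\)")

def node_weights(trace_lines):
    """Return {node_number: goldstars} parsed from OCSSD trace lines."""
    weights = {}
    for line in trace_lines:
        m = WEIGHT_RE.search(line)
        if m:
            weights[int(m.group(1))] = int(m.group(4))
    return weights

sample = [
    "2017-06-18 16:08:21.220 :    CSSD:1825834752: "
    "clssnmrCheckNodeWeight: node(1) has weight stamp(393228187) "
    "pebbles (0) goldstars (0) flags (3) SpoolVersion (0)",
]
print(node_weights(sample))  # {1: 0}
```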

