RAC Failover Test 10.2.0.2 64bit on linux 4.5
################ Environment #############################
OS: Oracle Enterprise Linux 4.5 64bit
DBMS: Oracle 10.2.0.2 Enterprise Edition 64bit
CRS: 10.2.0.2 64bit
OPatch: none
VM: VirtualBox
Nodes: 2
Shared Storage: ocfs2
Time: NTP in use
service: web (idb1 preferred, idb2 available)
intra (idb1 available, idb2 preferred)
Other CRS and service settings: defaults
################ public line disconnect ######################
Normal state. The current master node is idb1.
[oracle@idb1 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb2
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
Result after disconnecting idb1's public line
[oracle@idb1 cssd]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE OFFLINE
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb2
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE OFFLINE
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb2
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
Reconnecting idb1's public line changes nothing. The web service is not relocated, so the web service is unavailable. The VIP did fail over to idb2.
Start the instance manually
[oracle@idb2 ~]$ srvctl start instance -d idb -i idb1
[oracle@idb2 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb2
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
The listener came up automatically, the VIP automatically moved back to idb1, and the web service moved back to idb1 as well.
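The before/after comparison in this test can be done mechanically. The following is a minimal sketch, assuming the three-column `crsstat` layout shown in these notes (`crsstat` appears to be a local wrapper script, so its format may differ elsewhere). The function names `parse_crsstat` and `diff_states` are hypothetical helpers, not part of any Oracle tool.

```python
def parse_crsstat(text):
    """Return {resource: (target, state, node_or_None)} from crsstat-style output."""
    resources = {}
    for line in text.splitlines():
        parts = line.split()
        # Resource rows start with "ora."; this skips prompt, header and separator lines.
        if len(parts) < 3 or not parts[0].startswith("ora."):
            continue
        name, target, state = parts[0], parts[1], parts[2]
        # "ONLINE on idb1" carries the hosting node as the last token.
        node = parts[4] if len(parts) >= 5 and parts[3] == "on" else None
        resources[name] = (target, state, node)
    return resources


def diff_states(before, after):
    """List resources whose state or hosting node changed between two snapshots."""
    changes = []
    for name in sorted(before):
        if name in after and before[name][1:] != after[name][1:]:
            changes.append((name, before[name][1:], after[name][1:]))
    return changes


# Rows copied from the snapshots taken before and after the public line was disconnected.
BEFORE = """\
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
"""
AFTER = """\
ora.idb.idb1.inst ONLINE OFFLINE
ora.idb1.LISTENER_IDB1.lsnr ONLINE OFFLINE
ora.idb1.vip ONLINE ONLINE on idb2
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
"""

for name, old, new in diff_states(parse_crsstat(BEFORE), parse_crsstat(AFTER)):
    print(name, old, "->", new)
```

With these rows the diff reports the instance, the listener and the VIP, but not the web service row, which matches the observation that the service was never relocated.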
##################### interconnect disconnect ########################
idb1 is currently the master node.
[oracle@idb1 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb2
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
Disconnect idb1's interconnect
ocssd.log on idb2
[ CSSD]2013-06-28 14:15:11.422 [213005] >TRACE: clssnmPollingThread: node idb1 (1) missed(59) checkin(s)
[ CSSD]2013-06-28 14:15:12.424 [213005] >TRACE: clssnmPollingThread: node idb1 (1) is impending reconfig
[ CSSD]2013-06-28 14:15:12.424 [213005] >TRACE: clssnmPollingThread: Eviction started for node idb1 (1), flags 0x000f, state 3, wt4c 0
[ CSSD]2013-06-28 14:15:12.424 [245775] >TRACE: clssnmDoSyncUpdate: Initiating sync 2
[ CSSD]2013-06-28 14:15:12.424 [245775] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (57000)ms
[ CSSD]2013-06-28 14:15:12.424 [245775] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2013-06-28 14:15:12.424 [245775] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[ CSSD]2013-06-28 14:15:12.424 [245775] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[ CSSD]2013-06-28 14:15:12.424 [245775] >TRACE: clssnmSendSync: syncSeqNo(2)
[ CSSD]2013-06-28 14:15:12.424 [131080] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[idb2] seq[1] sync[2]
[ CSSD]2013-06-28 14:15:12.424 [131080] >TRACE: clssnmHandleSync: diskTimeout set to (57000)ms
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(1)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmWaitForAcks: node(1) is expiring, msg type(11)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmDoSyncUpdate: node(0) missCount(7677) state(0)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmDoSyncUpdate: node(1) missCount(60) state(3)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmSendVote: syncSeqNo(2)
[ CSSD]2013-06-28 14:15:12.425 [131080] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(2)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(0)
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2013-06-28 14:15:12.425 [245775] >TRACE: clssnmCheckDskInfo: node(1) timeout(570) state_network(0) state_disk(3) missCount(60)
[ CSSD]2013-06-28 14:15:12.489 [16384] >USER: NMEVENT_SUSPEND [00][00][00][06]
[ CSSD]2013-06-28 14:15:12.859 [81925] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7679) LATS(6867374) Disk lastSeqNo(7679)
[ CSSD]2013-06-28 14:15:13.427 [245775] >TRACE: clssnmCheckDskInfo: node(1) disk HB found, network state 0, disk state(3) missCount(61)
[ CSSD]2013-06-28 14:15:13.862 [81925] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7680) LATS(6868374) Disk lastSeqNo(7680)
[ CSSD]2013-06-28 14:15:14.429 [245775] >TRACE: clssnmCheckDskInfo: node(1) disk HB found, network state 0, disk state(3) missCount(62)
[ CSSD]2013-06-28 14:15:14.866 [81925] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7681) LATS(6869374) Disk lastSeqNo(7681)
[ CSSD]2013-06-28 14:15:15.429 [245775] >TRACE: clssnmCheckDskInfo: node(1) disk HB found, network state 0, disk state(3) missCount(63)
[ CSSD]2013-06-28 14:15:15.868 [81925] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7682) LATS(6870384) Disk lastSeqNo(7682)
[ CSSD]2013-06-28 14:15:16.431 [245775] >ERROR: clssnmCheckDskInfo: Terminating local instance to avoid splitbrain.
[ CSSD]2013-06-28 14:15:16.431 [245775] >ERROR: : Node(2), Leader(2), Size(1) VS Node(1), Leader(1), Size(1)
[ CSSD]2013-06-28 14:15:16.431 [245775] >TRACE: clssscctx: dump of 0x0x5d7750, len 3808
ocssd.log on idb1
[ CSSD]2013-06-28 14:15:11.307 [213005] >TRACE: clssnmPollingThread: node idb2 (2) missed(59) checkin(s)
[ CSSD]2013-06-28 14:15:12.309 [213005] >TRACE: clssnmPollingThread: node idb2 (2) is impending reconfig
[ CSSD]2013-06-28 14:15:12.309 [213005] >TRACE: clssnmPollingThread: Eviction started for node idb2 (2), flags 0x000d, state 3, wt4c 0
[ CSSD]2013-06-28 14:15:12.309 [245775] >TRACE: clssnmDoSyncUpdate: Initiating sync 2
[ CSSD]2013-06-28 14:15:12.309 [245775] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (57000)ms
[ CSSD]2013-06-28 14:15:12.309 [245775] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmSendSync: syncSeqNo(2)
[ CSSD]2013-06-28 14:15:12.310 [131080] >TRACE: clssnmHandleSync: Acknowledging sync: src[1] srcName[idb1] seq[5] sync[2]
[ CSSD]2013-06-28 14:15:12.310 [131080] >TRACE: clssnmHandleSync: diskTimeout set to (57000)ms
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(1)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmWaitForAcks: node(2) is expiring, msg type(11)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmDoSyncUpdate: node(0) missCount(7682) state(0)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmDoSyncUpdate: node(2) missCount(60) state(3)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmSetupAckWait: node(1) is ACTIVE
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmSendVote: syncSeqNo(2)
[ CSSD]2013-06-28 14:15:12.310 [131080] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(2)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(0)
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2013-06-28 14:15:12.310 [245775] >TRACE: clssnmCheckDskInfo: node(2) timeout(390) state_network(0) state_disk(3) missCount(60)
[ CSSD]2013-06-28 14:15:12.320 [16384] >USER: NMEVENT_SUSPEND [00][00][00][06]
[ CSSD]2013-06-28 14:15:12.922 [81925] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7672) LATS(6867784) Disk lastSeqNo(7672)
[ CSSD]2013-06-28 14:15:13.312 [245775] >TRACE: clssnmCheckDskInfo: node(2) disk HB found, network state 0, disk state(3) missCount(61)
[ CSSD]2013-06-28 14:15:13.925 [81925] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7673) LATS(6868784) Disk lastSeqNo(7673)
[ CSSD]2013-06-28 14:15:14.313 [245775] >TRACE: clssnmCheckDskInfo: node(2) disk HB found, network state 0, disk state(3) missCount(62)
[ CSSD]2013-06-28 14:15:14.927 [81925] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7674) LATS(6869784) Disk lastSeqNo(7674)
[ CSSD]2013-06-28 14:15:15.315 [245775] >TRACE: clssnmCheckDskInfo: node(2) disk HB found, network state 0, disk state(3) missCount(63)
[ CSSD]2013-06-28 14:15:15.930 [81925] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7675) LATS(6870784) Disk lastSeqNo(7675)
[ CSSD]2013-06-28 14:15:16.318 [245775] >TRACE: clssnmCheckDskInfo: node(2) missCount(64) state(0). Smaller(1) cluster node 2. mine is 1. (2/1)
[ CSSD]2013-06-28 14:15:16.318 [245775] >TRACE: clssnmEvict: Start
[ CSSD]2013-06-28 14:15:16.318 [245775] >TRACE: clssnmEvict: Evicting node 2, birth 1, death 2, killme 1
[ CSSD]2013-06-28 14:15:16.318 [245775] >TRACE: clssnmSendShutdown: req to node 2, kill time 6871174
[ CSSD]2013-06-28 14:15:16.318 [245775] >TRACE: clssnmDiscHelper: node idb2 (2) connection failed
[ CSSD]2013-06-28 14:15:16.319 [81925] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7676) LATS(6871174) Disk lastSeqNo(7676)
[ CSSD]2013-06-28 14:15:16.321 [245775] >TRACE: clssnmWaitOnEvictions: Start
[ CSSD]2013-06-28 14:15:46.380 [245775] >WARNING: clssnmWaitOnEvictions: Unconfirmed dead node count 1
[ CSSD]2013-06-28 14:15:46.380 [245775] >TRACE: clssnmSetupAckWait: Ack message type (15)
[ CSSD]2013-06-28 14:15:46.380 [245775] >TRACE: clssnmSetupAckWait: node(1) is ACTIVE
[ CSSD]2013-06-28 14:15:46.380 [245775] >TRACE: clssnmSendUpdate: syncSeqNo(2)
[ CSSD]2013-06-28 14:15:46.381 [131080] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2013-06-28 14:15:46.381 [131080] >TRACE: clssnmDeactivateNode: node 0 () left cluster
[ CSSD]2013-06-28 14:15:46.381 [131080] >TRACE: clssnmUpdateNodeState: node 1, state (3/3) unique (1372388811/1372388811) prevConuni(0) birth (1/1) (old/new)
[ CSSD]2013-06-28 14:15:46.381 [131080] >TRACE: clssnmUpdateNodeState: node 2, state (0/0) unique (1372388818/1372388818) prevConuni(1372388818) birth (1/0) (old/new)
[ CSSD]2013-06-28 14:15:46.381 [131080] >TRACE: clssnmDeactivateNode: node 2 (idb2) left cluster
[ CSSD]2013-06-28 14:15:46.381 [131080] >USER: clssnmHandleUpdate: SYNC(2) from node(1) completed
[ CSSD]2013-06-28 14:15:46.381 [131080] >USER: clssnmHandleUpdate: NODE 1 (idb1) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2013-06-28 14:15:46.381 [131080] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[ CSSD]2013-06-28 14:15:46.381 [245775] >TRACE: clssnmWaitForAcks: Ack message type(15), ackCount(0)
[ CSSD]2013-06-28 14:15:46.381 [245775] >TRACE: clssnmDoSyncUpdate: Sync Complete!
[ CSSD]2013-06-28 14:15:46.400 [278544] >TRACE: clssgmReconfigThread: started for reconfig (2)
[ CSSD]2013-06-28 14:15:46.400 [278544] >USER: NMEVENT_RECONFIG [00][00][00][02]
[ CSSD]2013-06-28 14:15:46.401 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock crs_version type 2
[ CSSD]2013-06-28 14:15:46.401 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(crs_version) birth(1/1)
[ CSSD]2013-06-28 14:15:46.401 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 2
[ CSSD]2013-06-28 14:15:46.401 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 3
[ CSSD]2013-06-28 14:15:46.401 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 2
[ CSSD]2013-06-28 14:15:46.402 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_2_idb) birth(1/1)
[ CSSD]2013-06-28 14:15:46.402 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 3
[ CSSD]2013-06-28 14:15:46.402 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_2_idb) birth(1/1)
[ CSSD]2013-06-28 14:15:46.402 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock DBIDB type 2
[ CSSD]2013-06-28 14:15:46.403 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(DBIDB) birth(1/1)
[ CSSD]2013-06-28 14:15:46.403 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock DGIDB type 2
Each node tried to evict the other, but the node that was not the master (idb2) was the one evicted.
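The split-brain line in idb2's log (`Node(2), Leader(2), Size(1) VS Node(1), Leader(1), Size(1)`) can be read programmatically. The sketch below parses that line and applies a tie-break rule that is an assumption, not something stated in these notes: the larger sub-cluster wins, and with equal sizes the sub-cluster with the lower leader number wins. That rule is consistent with this log, where node 1 survived, but it has not been verified here against Oracle documentation. `parse_cohorts` and `surviving_leader` are hypothetical names.

```python
import re

# Matches each "Node(n), Leader(n), Size(n)" group in the split-brain message.
COHORT_RE = re.compile(r"Node\((\d+)\), Leader\((\d+)\), Size\((\d+)\)")


def parse_cohorts(line):
    """Extract (node, leader, size) for the local and the remote sub-cluster."""
    found = [tuple(int(x) for x in m) for m in COHORT_RE.findall(line)]
    if len(found) != 2:
        raise ValueError("expected exactly two cohorts in the line")
    return found[0], found[1]


def surviving_leader(local, remote):
    """ASSUMED rule: larger sub-cluster wins; on a tie the lower leader number wins."""
    (_, l_leader, l_size), (_, r_leader, r_size) = local, remote
    if l_size != r_size:
        return l_leader if l_size > r_size else r_leader
    return min(l_leader, r_leader)


# Message text copied from idb2's ocssd.log above.
LINE = ">ERROR: : Node(2), Leader(2), Size(1) VS Node(1), Leader(1), Size(1)"

local, remote = parse_cohorts(LINE)
print("surviving leader:", surviving_leader(local, remote))
```

Under this assumed rule the outcome depends on node numbering rather than on which node happens to be the master, so repeating the test with idb2 as master would help tell the two explanations apart.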
crsstat status
[oracle@idb1 cssd]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE OFFLINE
ora.idb.intra.cs ONLINE ONLINE on idb1
ora.idb.intra.idb2.srv ONLINE ONLINE on idb1
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE OFFLINE
ora.idb2.gsd ONLINE OFFLINE
ora.idb2.ons ONLINE OFFLINE
ora.idb2.vip ONLINE ONLINE on idb1
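idb2 then reboots and comes back up, but nothing changes because the interconnect is still disconnected.
If OCFS is in use and its cluster interface is the interconnect, idb2 cannot even mount the shared disk.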
[root@idb2 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 17G 7.7G 8.0G 49% /
none 754M 0 754M 0% /dev/shm
/dev/sda3 373M 92M 262M 27% /var
[root@idb2 ~]# cat /etc/fstab
# This file is edited by fstab-sync - see 'man fstab-sync' for details
LABEL=/ / ext3 defaults 1 1
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
LABEL=/var /var ext3 defaults 1 2
LABEL=SWAP-sda2 swap swap defaults 0 0
/dev/sdb /oradata ocfs2 _netdev,datavolume,nointr 0 0
/dev/hdc /media/cdrom auto pamconsole,exec,noauto,managed 0 0
Reconnect the interconnect
After waiting a while, run mount -a or mount the volume manually
[root@idb2 ~]# mount -a
[root@idb2 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 17G 7.7G 8.0G 49% /
none 754M 0 754M 0% /dev/shm
/dev/sda3 373M 92M 262M 27% /var
/dev/sdb 9.8G 1.4G 8.5G 14% /oradata
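The check done by eye above (an ocfs2 entry present in /etc/fstab but absent from `df -h`) can be sketched as follows. This is a minimal illustration, assuming the `df` output has a single header line and no wrapped rows, as in the listings in these notes. `unmounted_ocfs2` is a hypothetical helper name.

```python
def unmounted_ocfs2(fstab_text, df_text):
    """Return (device, mountpoint) for ocfs2 fstab entries missing from df output."""
    mounted = set()
    for line in df_text.splitlines()[1:]:  # skip the "Filesystem ..." header
        parts = line.split()
        if len(parts) >= 6:
            mounted.add(parts[5])  # last column is the mount point
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # fstab columns: device, mount point, filesystem type, options, ...
        if len(parts) >= 3 and parts[2] == "ocfs2" and parts[1] not in mounted:
            missing.append((parts[0], parts[1]))
    return missing


# Lines copied from the listings above.
FSTAB = """\
LABEL=/ / ext3 defaults 1 1
/dev/sdb /oradata ocfs2 _netdev,datavolume,nointr 0 0
"""
DF_BEFORE = """\
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 17G 7.7G 8.0G 49% /
/dev/sda3 373M 92M 262M 27% /var
"""
DF_AFTER = DF_BEFORE + "/dev/sdb 9.8G 1.4G 8.5G 14% /oradata\n"

print(unmounted_ocfs2(FSTAB, DF_BEFORE))
print(unmounted_ocfs2(FSTAB, DF_AFTER))
```

The first call reports `/dev/sdb` on `/oradata` as missing, and the second, using the listing taken after `mount -a`, reports nothing.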
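Shortly afterwards CRSD starts automatically. However, although the VIP moves back, the services are not restored to their original nodes.
In other words, the web and intra services still cannot be reached through idb2.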
[oracle@idb2 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb1
ora.idb.intra.idb2.srv ONLINE ONLINE on idb1
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
Move the intra service back to idb2
[oracle@idb2 ~]$ srvctl relocate service -d idb -s intra -i idb1 -t idb2
[oracle@idb2 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb1
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
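Since services do not fail back on their own in this setup, the relocation command can be derived from the `crsstat` rows. The sketch below rests on two assumptions: that `.srv` resources are named `ora.<db>.<service>.<preferred instance>.srv`, as they appear in these notes, and that node names equal instance names, which holds here only because both are idb1/idb2. It only builds the command string in the form used above and does not run it. `relocate_commands` is a hypothetical helper.

```python
def relocate_commands(crsstat_text):
    """Build srvctl commands for services running off their preferred instance."""
    commands = []
    for line in crsstat_text.splitlines():
        parts = line.split()
        # Only rows like "<name>.srv ONLINE ONLINE on <node>" are of interest.
        if len(parts) != 5 or not parts[0].endswith(".srv") or parts[3] != "on":
            continue
        fields = parts[0].split(".")
        if len(fields) != 5:
            continue
        _, db, service, preferred, _ = fields
        current = parts[4]  # ASSUMPTION: node name == instance name
        if current != preferred:
            commands.append(
                f"srvctl relocate service -d {db} -s {service} -i {current} -t {preferred}"
            )
    return commands


# Rows copied from the snapshot taken after idb2 rejoined the cluster.
SNAPSHOT = """\
ora.idb.intra.idb2.srv ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
"""

for cmd in relocate_commands(SNAPSHOT):
    print(cmd)
```

For this snapshot the only command produced is the intra relocation from idb1 to idb2. The web service is already on its preferred instance and is left alone.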
##################################### master node power off #######################################
The current master node is idb1
[oracle@idb2 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb1
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb1
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb1
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
Power off idb1
ocssd.log on idb2
[ CSSD]2013-06-28 14:42:12.569 [213005] >TRACE: clssnmPollingThread: node idb1 (1) missed(59) checkin(s)
[ CSSD]2013-06-28 14:42:13.571 [213005] >TRACE: clssnmPollingThread: node idb1 (1) is impending reconfig
[ CSSD]2013-06-28 14:42:13.571 [213005] >TRACE: clssnmPollingThread: Eviction started for node idb1 (1), flags 0x000f, state 3, wt4c 0
[ CSSD]2013-06-28 14:42:13.571 [245775] >TRACE: clssnmDoSyncUpdate: Initiating sync 4
[ CSSD]2013-06-28 14:42:13.571 [245775] >TRACE: clssnmDoSyncUpdate: diskTimeout set to (57000)ms
[ CSSD]2013-06-28 14:42:13.571 [245775] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2013-06-28 14:42:13.571 [245775] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[ CSSD]2013-06-28 14:42:13.571 [245775] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[ CSSD]2013-06-28 14:42:13.571 [245775] >TRACE: clssnmSendSync: syncSeqNo(4)
[ CSSD]2013-06-28 14:42:13.571 [131080] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[idb2] seq[1] sync[4]
[ CSSD]2013-06-28 14:42:13.571 [131080] >TRACE: clssnmHandleSync: diskTimeout set to (57000)ms
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(1)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmWaitForAcks: node(1) is expiring, msg type(11)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmDoSyncUpdate: node(0) missCount(734) state(0)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmDoSyncUpdate: node(1) missCount(60) state(3)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmSendVote: syncSeqNo(4)
[ CSSD]2013-06-28 14:42:13.572 [131080] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(4)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(0)
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2013-06-28 14:42:13.572 [245775] >TRACE: clssnmCheckDskInfo: node(1) timeout(58670) state_network(0) state_disk(3) missCount(60)
[ CSSD]2013-06-28 14:42:13.643 [16384] >USER: NMEVENT_SUSPEND [00][00][00][06]
[ CSSD]2013-06-28 14:42:14.573 [245775] >TRACE: clssnmCheckDskInfo: node(1) timeout(59680) state_network(0) state_disk(3) missCount(61)
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmEvict: Start
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmEvict: Evicting node 1, birth 1, death 4, killme 1
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmEvict: Evicting Node(1), timeout(60000)
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmSendShutdown: req to node 1, kill time 659124
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmDiscHelper: node idb1 (1) connection failed
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmWaitOnEvictions: Start
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmWaitOnEvictions: Node(1) down, LATS(599124),timeout(60000)
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmSetupAckWait: Ack message type (15)
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2013-06-28 14:42:14.896 [245775] >TRACE: clssnmSendUpdate: syncSeqNo(4)
[ CSSD]2013-06-28 14:42:14.897 [245775] >TRACE: clssnmWaitForAcks: Ack message type(15), ackCount(1)
[ CSSD]2013-06-28 14:42:14.897 [131080] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2013-06-28 14:42:14.897 [131080] >TRACE: clssnmDeactivateNode: node 0 () left cluster
[ CSSD]2013-06-28 14:42:14.897 [131080] >TRACE: clssnmUpdateNodeState: node 1, state (0/0) unique (1372388811/1372388811) prevConuni(1372388811) birth (1/0) (old/new)
[ CSSD]2013-06-28 14:42:14.897 [131080] >TRACE: clssnmDeactivateNode: node 1 (idb1) left cluster
[ CSSD]2013-06-28 14:42:14.897 [131080] >TRACE: clssnmUpdateNodeState: node 2, state (3/3) unique (1372397396/1372397396) prevConuni(0) birth (3/3) (old/new)
[ CSSD]2013-06-28 14:42:14.897 [131080] >USER: clssnmHandleUpdate: SYNC(4) from node(2) completed
[ CSSD]2013-06-28 14:42:14.897 [131080] >USER: clssnmHandleUpdate: NODE 2 (idb2) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2013-06-28 14:42:14.897 [131080] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[ CSSD]2013-06-28 14:42:14.897 [245775] >TRACE: clssnmWaitForAcks: done, msg type(15)
[ CSSD]2013-06-28 14:42:14.897 [245775] >TRACE: clssnmDoSyncUpdate: Sync Complete!
[ CSSD]2013-06-28 14:42:14.969 [278544] >TRACE: clssgmReconfigThread: started for reconfig (4)
[ CSSD]2013-06-28 14:42:14.969 [278544] >USER: NMEVENT_RECONFIG [00][00][00][04]
[ CSSD]2013-06-28 14:42:14.970 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock crs_version type 2
[ CSSD]2013-06-28 14:42:14.970 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(crs_version) birth(1/1)
[ CSSD]2013-06-28 14:42:14.970 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 3
[ CSSD]2013-06-28 14:42:14.970 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_1_idb) birth(1/1)
[ CSSD]2013-06-28 14:42:14.971 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 2
[ CSSD]2013-06-28 14:42:14.971 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_1_idb) birth(1/1)
[ CSSD]2013-06-28 14:42:14.971 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 2
[ CSSD]2013-06-28 14:42:14.971 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 3
[ CSSD]2013-06-28 14:42:14.971 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock DBIDB type 2
[ CSSD]2013-06-28 14:42:14.971 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(DBIDB) birth(1/1)
[ CSSD]2013-06-28 14:42:14.972 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock DGIDB type 2
[ CSSD]2013-06-28 14:42:14.972 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(DGIDB) birth(1/1)
[ CSSD]2013-06-28 14:42:14.972 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock IGIDBALL type 2
[ CSSD]2013-06-28 14:42:14.972 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(IGIDBALL) birth(1/1)
[ CSSD]2013-06-28 14:42:14.973 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock DAALL_DB type 2
[ CSSD]2013-06-28 14:42:14.973 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(DAALL_DB) birth(1/1)
[ CSSD]2013-06-28 14:42:14.973 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock EVMDMAIN type 2
[ CSSD]2013-06-28 14:42:14.973 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(EVMDMAIN) birth(1/1)
[ CSSD]2013-06-28 14:42:14.975 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock CRSDMAIN type 2
[ CSSD]2013-06-28 14:42:14.975 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(CRSDMAIN) birth(1/1)
[ CSSD]2013-06-28 14:42:14.977 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock _ORA_CRS_MEMBER_idb1 type 3
[ CSSD]2013-06-28 14:42:14.977 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(_ORA_CRS_MEMBER_idb1) birth(1/1)
[ CSSD]2013-06-28 14:42:14.979 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock _ORA_CRS_MEMBER_idb2 type 3
[ CSSD]2013-06-28 14:42:14.980 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock ocr_crs type 2
[ CSSD]2013-06-28 14:42:14.980 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(ocr_crs) birth(1/1)
[ CSSD]2013-06-28 14:42:14.987 [278544] >TRACE: clssgmCleanupGrocks: cleaning up grock #CSS_CLSSOMON type 2
[ CSSD]2013-06-28 14:42:14.988 [278544] >TRACE: clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(#CSS_CLSSOMON) birth(1/1)
[ CSSD]2013-06-28 14:42:14.994 [278544] >TRACE: clssgmEstablishConnections: 1 nodes in cluster incarn 4
[ CSSD]2013-06-28 14:42:14.996 [196620] >TRACE: clssgmPeerDeactivate: node 1 (idb1), death 4, state 0x80000000 connstate 0xa
[ CSSD]2013-06-28 14:42:14.996 [196620] >TRACE: clssgmPeerListener: connects done (1/1)
[ CSSD]2013-06-28 14:42:14.996 [278544] >TRACE: clssgmEstablishMasterNode: MASTER for 4 is node(2) birth(3)
[ CSSD]2013-06-28 14:42:14.996 [278544] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[ CSSD]2013-06-28 14:42:14.999 [278544] >TRACE: clssgmMasterCMSync: Synchronizing group/lock status
[ CSSD]2013-06-28 14:42:15.007 [278544] >TRACE: clssgmMasterSendDBDone: group/lock status synchronization complete
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 4 with 1 nodes
[ CSSD]CLSS-3001: local node number 2, master node number 2
The master node changed to idb2.
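The log excerpts can also be used to time a reconfiguration and to read off the new master. The sketch below measures from the first `missed(` line in the supplied excerpt (not necessarily the first missed check-in overall) to `Sync Complete!`, and takes the master node number from the `CLSS-3001` line. `timestamp` and `reconfig_summary` are hypothetical helpers, and the regular expressions assume the line layout shown in these notes.

```python
import re
from datetime import datetime

# Timestamp directly after the "[ CSSD]" tag, e.g. "]2013-06-28 14:42:12.569 ".
TS_RE = re.compile(r"\](\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")
MASTER_RE = re.compile(r"CLSS-3001: local node number (\d+), master node number (\d+)")


def timestamp(line):
    """Return the line's timestamp as a datetime, or None if it has none."""
    m = TS_RE.search(line)
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f") if m else None


def reconfig_summary(log_text):
    """Return (seconds from first missed check-in line to sync complete, master node)."""
    first_miss = sync_done = master = None
    for line in log_text.splitlines():
        if first_miss is None and "missed(" in line:
            first_miss = timestamp(line)
        elif "Sync Complete!" in line:
            sync_done = timestamp(line)
        m = MASTER_RE.search(line)
        if m:
            master = int(m.group(2))
    seconds = None
    if first_miss and sync_done:
        seconds = (sync_done - first_miss).total_seconds()
    return seconds, master


# Three lines copied from idb2's ocssd.log for the power-off test.
LOG = """\
[ CSSD]2013-06-28 14:42:12.569 [213005] >TRACE: clssnmPollingThread: node idb1 (1) missed(59) checkin(s)
[ CSSD]2013-06-28 14:42:14.897 [245775] >TRACE: clssnmDoSyncUpdate: Sync Complete!
[ CSSD]CLSS-3001: local node number 2, master node number 2
"""

print(reconfig_summary(LOG))
```

Applied to the interconnect test's log from idb1, the same measurement would span 14:15:11.307 to 14:15:46.381, which reflects the roughly 30-second wait in `clssnmWaitOnEvictions` that does not occur in the power-off case.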
[oracle@idb2 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb2
ora.idb.idb1.inst ONLINE OFFLINE
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb2
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb2
ora.idb.web.idb1.srv ONLINE ONLINE on idb2
ora.idb1.LISTENER_IDB1.lsnr ONLINE OFFLINE
ora.idb1.gsd ONLINE OFFLINE
ora.idb1.ons ONLINE OFFLINE
ora.idb1.vip ONLINE ONLINE on idb2
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
The VIP and the services failed over to idb2.
Power idb1 back on
[oracle@idb2 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb2
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb2
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb2
ora.idb.web.idb1.srv ONLINE ONLINE on idb2
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2
The cluster ends up in the state shown above. The services are not failed back automatically.
Relocate the service with a command
[oracle@idb2 ~]$ srvctl relocate service -d idb -s web -i idb2 -t idb1
[oracle@idb2 ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.idb.db ONLINE ONLINE on idb2
ora.idb.idb1.inst ONLINE ONLINE on idb1
ora.idb.idb2.inst ONLINE ONLINE on idb2
ora.idb.intra.cs ONLINE ONLINE on idb2
ora.idb.intra.idb2.srv ONLINE ONLINE on idb2
ora.idb.web.cs ONLINE ONLINE on idb2
ora.idb.web.idb1.srv ONLINE ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr ONLINE ONLINE on idb1
ora.idb1.gsd ONLINE ONLINE on idb1
ora.idb1.ons ONLINE ONLINE on idb1
ora.idb1.vip ONLINE ONLINE on idb1
ora.idb2.LISTENER_IDB2.lsnr ONLINE ONLINE on idb2
ora.idb2.gsd ONLINE ONLINE on idb2
ora.idb2.ons ONLINE ONLINE on idb2
ora.idb2.vip ONLINE ONLINE on idb2