
RAC Failover Test 10.2.0.2 64bit on linux 4.5

neo-orcl 2013. 6. 28. 15:25

################ Environment #############################

OS: Oracle Enterprise Linux 4.5 64bit

DBMS: Oracle 10.2.0.2 Enterprise Edition 64bit

CRS: 10.2.0.2 64bit

Opatch: none

VM: Virtual Box

Node: 2ea

Shared Storage: ocfs2

Time: NTP in use

service: web: idb1 preferred, idb2 available

         intra: idb1 available, idb2 preferred

CRS and service settings: defaults

 

################ public line disconnection ######################

 

Normal state. The current master node is idb1.

 

[oracle@idb1 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb2   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

Result after disconnecting the idb1 public line

 

[oracle@idb1 cssd]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     OFFLINE          

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb2   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     OFFLINE          

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb2   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

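Reconnecting the idb1 public line changes nothing. The web service is not relocated, so the web service is unavailable. Only the VIP failed over (ora.idb1.vip is now on idb2).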
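The two symptoms read off the crsstat output above (resources whose Target is ONLINE but whose State is not, and a node VIP running away from its home node) can be picked out mechanically. The following is a minimal sketch, not part of the original test. It assumes the three-column crsstat layout shown in this post and assumes that node-level resources are named ora.<node>.<type>. The helper names parse_crsstat and find_problems are hypothetical.

```python
import re

# Rows copied from the crsstat output above (idb1 public line disconnected).
CRSSTAT_SAMPLE = """\
ora.idb.idb1.inst                             ONLINE     OFFLINE
ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1
ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     OFFLINE
ora.idb1.vip                                  ONLINE     ONLINE on idb2
ora.idb2.vip                                  ONLINE     ONLINE on idb2
"""

# name, target, state, and an optional "on <node>" suffix
ROW = re.compile(r"^(ora\.\S+)\s+(\S+)\s+(\S+)(?:\s+on\s+(\S+))?\s*$")


def parse_crsstat(text):
    """Turn crsstat rows into dicts; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            name, target, state, node = m.groups()
            rows.append({"name": name, "target": target, "state": state, "node": node})
    return rows


def find_problems(rows, nodes=("idb1", "idb2")):
    """Return (offline resource names, displaced node-level resources)."""
    offline = [r["name"] for r in rows
               if r["target"] == "ONLINE" and r["state"] != "ONLINE"]
    displaced = []
    for r in rows:
        parts = r["name"].split(".")
        # Assumption: ora.<node>.<...> resources belong to <node>.
        home = parts[1] if len(parts) > 2 and parts[1] in nodes else None
        if home and r["node"] and r["node"] != home:
            displaced.append((r["name"], home, r["node"]))
    return offline, displaced
```

Calling find_problems(parse_crsstat(CRSSTAT_SAMPLE)) on the sample reports the idb1 instance and listener as offline and ora.idb1.vip as running on idb2 instead of idb1.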

 

Start the instance manually:

[oracle@idb2 ~]$ srvctl start instance -d idb -i idb1

[oracle@idb2 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb2   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

The listener also came up automatically, the VIP automatically moved back to idb1, and the web service was served from idb1 again.

 

##################### interconnect disconnection ########################

idb1 is currently the master node.

 

[oracle@idb1 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb2   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

Disconnect the idb1 interconnect

 

idb2 ocssd.log

 

[    CSSD]2013-06-28 14:15:11.422 [213005] >TRACE:   clssnmPollingThread: node idb1 (1) missed(59) checkin(s)

[    CSSD]2013-06-28 14:15:12.424 [213005] >TRACE:   clssnmPollingThread: node idb1 (1) is impending reconfig

[    CSSD]2013-06-28 14:15:12.424 [213005] >TRACE:   clssnmPollingThread: Eviction started for node idb1 (1), flags 0x000f, state 3, wt4c 0

[    CSSD]2013-06-28 14:15:12.424 [245775] >TRACE:   clssnmDoSyncUpdate: Initiating sync 2

[    CSSD]2013-06-28 14:15:12.424 [245775] >TRACE:   clssnmDoSyncUpdate: diskTimeout set to (57000)ms

[    CSSD]2013-06-28 14:15:12.424 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (11)

[    CSSD]2013-06-28 14:15:12.424 [245775] >TRACE:   clssnmSetupAckWait: node(1) is ALIVE

[    CSSD]2013-06-28 14:15:12.424 [245775] >TRACE:   clssnmSetupAckWait: node(2) is ALIVE

[    CSSD]2013-06-28 14:15:12.424 [245775] >TRACE:   clssnmSendSync: syncSeqNo(2)

[    CSSD]2013-06-28 14:15:12.424 [131080] >TRACE:   clssnmHandleSync: Acknowledging sync: src[2] srcName[idb2] seq[1] sync[2]

[    CSSD]2013-06-28 14:15:12.424 [131080] >TRACE:   clssnmHandleSync: diskTimeout set to (57000)ms

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(11), ackCount(1)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmWaitForAcks: node(1) is expiring, msg type(11)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmWaitForAcks: done, msg type(11)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmDoSyncUpdate: node(0) missCount(7677) state(0)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmDoSyncUpdate: node(1) missCount(60) state(3)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (13)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmSetupAckWait: node(2) is ACTIVE

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmSendVote: syncSeqNo(2)

[    CSSD]2013-06-28 14:15:12.425 [131080] >TRACE:   clssnmSendVoteInfo: node(2) syncSeqNo(2)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(13), ackCount(0)

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmCheckDskInfo: Checking disk info...

[    CSSD]2013-06-28 14:15:12.425 [245775] >TRACE:   clssnmCheckDskInfo: node(1) timeout(570) state_network(0) state_disk(3) missCount(60)

[    CSSD]2013-06-28 14:15:12.489 [16384] >USER:    NMEVENT_SUSPEND [00][00][00][06]

[    CSSD]2013-06-28 14:15:12.859 [81925] >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7679) LATS(6867374) Disk lastSeqNo(7679)

[    CSSD]2013-06-28 14:15:13.427 [245775] >TRACE:   clssnmCheckDskInfo: node(1) disk HB found, network state 0, disk state(3) missCount(61)

[    CSSD]2013-06-28 14:15:13.862 [81925] >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7680) LATS(6868374) Disk lastSeqNo(7680)

[    CSSD]2013-06-28 14:15:14.429 [245775] >TRACE:   clssnmCheckDskInfo: node(1) disk HB found, network state 0, disk state(3) missCount(62)

[    CSSD]2013-06-28 14:15:14.866 [81925] >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7681) LATS(6869374) Disk lastSeqNo(7681)

[    CSSD]2013-06-28 14:15:15.429 [245775] >TRACE:   clssnmCheckDskInfo: node(1) disk HB found, network state 0, disk state(3) missCount(63)

[    CSSD]2013-06-28 14:15:15.868 [81925] >TRACE:   clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(7682) LATS(6870384) Disk lastSeqNo(7682)

[    CSSD]2013-06-28 14:15:16.431 [245775] >ERROR:   clssnmCheckDskInfo: Terminating local instance to avoid splitbrain.

[    CSSD]2013-06-28 14:15:16.431 [245775] >ERROR:                 : Node(2), Leader(2), Size(1) VS Node(1), Leader(1), Size(1)

[    CSSD]2013-06-28 14:15:16.431 [245775] >TRACE:   clssscctx:  dump of 0x0x5d7750, len 3808

 

idb1 ocssd.log

 

[    CSSD]2013-06-28 14:15:11.307 [213005] >TRACE:   clssnmPollingThread: node idb2 (2) missed(59) checkin(s)

[    CSSD]2013-06-28 14:15:12.309 [213005] >TRACE:   clssnmPollingThread: node idb2 (2) is impending reconfig

[    CSSD]2013-06-28 14:15:12.309 [213005] >TRACE:   clssnmPollingThread: Eviction started for node idb2 (2), flags 0x000d, state 3, wt4c 0

[    CSSD]2013-06-28 14:15:12.309 [245775] >TRACE:   clssnmDoSyncUpdate: Initiating sync 2

[    CSSD]2013-06-28 14:15:12.309 [245775] >TRACE:   clssnmDoSyncUpdate: diskTimeout set to (57000)ms

[    CSSD]2013-06-28 14:15:12.309 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (11)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmSetupAckWait: node(1) is ALIVE

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmSetupAckWait: node(2) is ALIVE

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmSendSync: syncSeqNo(2)

[    CSSD]2013-06-28 14:15:12.310 [131080] >TRACE:   clssnmHandleSync: Acknowledging sync: src[1] srcName[idb1] seq[5] sync[2]

[    CSSD]2013-06-28 14:15:12.310 [131080] >TRACE:   clssnmHandleSync: diskTimeout set to (57000)ms

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(11), ackCount(1)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmWaitForAcks: node(2) is expiring, msg type(11)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmWaitForAcks: done, msg type(11)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmDoSyncUpdate: node(0) missCount(7682) state(0)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmDoSyncUpdate: node(2) missCount(60) state(3)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (13)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmSetupAckWait: node(1) is ACTIVE

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmSendVote: syncSeqNo(2)

[    CSSD]2013-06-28 14:15:12.310 [131080] >TRACE:   clssnmSendVoteInfo: node(1) syncSeqNo(2)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(13), ackCount(0)

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmCheckDskInfo: Checking disk info...

[    CSSD]2013-06-28 14:15:12.310 [245775] >TRACE:   clssnmCheckDskInfo: node(2) timeout(390) state_network(0) state_disk(3) missCount(60)

[    CSSD]2013-06-28 14:15:12.320 [16384] >USER:    NMEVENT_SUSPEND [00][00][00][06]

[    CSSD]2013-06-28 14:15:12.922 [81925] >TRACE:   clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7672) LATS(6867784) Disk lastSeqNo(7672)

[    CSSD]2013-06-28 14:15:13.312 [245775] >TRACE:   clssnmCheckDskInfo: node(2) disk HB found, network state 0, disk state(3) missCount(61)

[    CSSD]2013-06-28 14:15:13.925 [81925] >TRACE:   clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7673) LATS(6868784) Disk lastSeqNo(7673)

[    CSSD]2013-06-28 14:15:14.313 [245775] >TRACE:   clssnmCheckDskInfo: node(2) disk HB found, network state 0, disk state(3) missCount(62)

[    CSSD]2013-06-28 14:15:14.927 [81925] >TRACE:   clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7674) LATS(6869784) Disk lastSeqNo(7674)

[    CSSD]2013-06-28 14:15:15.315 [245775] >TRACE:   clssnmCheckDskInfo: node(2) disk HB found, network state 0, disk state(3) missCount(63)

[    CSSD]2013-06-28 14:15:15.930 [81925] >TRACE:   clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7675) LATS(6870784) Disk lastSeqNo(7675)

[    CSSD]2013-06-28 14:15:16.318 [245775] >TRACE:   clssnmCheckDskInfo: node(2) missCount(64) state(0). Smaller(1) cluster node 2. mine is 1. (2/1)

[    CSSD]2013-06-28 14:15:16.318 [245775] >TRACE:   clssnmEvict: Start

[    CSSD]2013-06-28 14:15:16.318 [245775] >TRACE:   clssnmEvict: Evicting node 2, birth 1, death 2, killme 1

[    CSSD]2013-06-28 14:15:16.318 [245775] >TRACE:   clssnmSendShutdown: req to node 2, kill time 6871174

[    CSSD]2013-06-28 14:15:16.318 [245775] >TRACE:   clssnmDiscHelper: node idb2 (2) connection failed

[    CSSD]2013-06-28 14:15:16.319 [81925] >TRACE:   clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(7676) LATS(6871174) Disk lastSeqNo(7676)

[    CSSD]2013-06-28 14:15:16.321 [245775] >TRACE:   clssnmWaitOnEvictions: Start

[    CSSD]2013-06-28 14:15:46.380 [245775] >WARNING: clssnmWaitOnEvictions: Unconfirmed dead node count 1

[    CSSD]2013-06-28 14:15:46.380 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (15)

[    CSSD]2013-06-28 14:15:46.380 [245775] >TRACE:   clssnmSetupAckWait: node(1) is ACTIVE

[    CSSD]2013-06-28 14:15:46.380 [245775] >TRACE:   clssnmSendUpdate: syncSeqNo(2)

[    CSSD]2013-06-28 14:15:46.381 [131080] >TRACE:   clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)

[    CSSD]2013-06-28 14:15:46.381 [131080] >TRACE:   clssnmDeactivateNode: node 0 () left cluster

 

[    CSSD]2013-06-28 14:15:46.381 [131080] >TRACE:   clssnmUpdateNodeState: node 1, state (3/3) unique (1372388811/1372388811) prevConuni(0) birth (1/1) (old/new)

[    CSSD]2013-06-28 14:15:46.381 [131080] >TRACE:   clssnmUpdateNodeState: node 2, state (0/0) unique (1372388818/1372388818) prevConuni(1372388818) birth (1/0) (old/new)

[    CSSD]2013-06-28 14:15:46.381 [131080] >TRACE:   clssnmDeactivateNode: node 2 (idb2) left cluster

 

[    CSSD]2013-06-28 14:15:46.381 [131080] >USER:    clssnmHandleUpdate: SYNC(2) from node(1) completed

[    CSSD]2013-06-28 14:15:46.381 [131080] >USER:    clssnmHandleUpdate: NODE 1 (idb1) IS ACTIVE MEMBER OF CLUSTER

[    CSSD]2013-06-28 14:15:46.381 [131080] >TRACE:   clssnmHandleUpdate: diskTimeout set to (200000)ms

[    CSSD]2013-06-28 14:15:46.381 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(15), ackCount(0)

[    CSSD]2013-06-28 14:15:46.381 [245775] >TRACE:   clssnmDoSyncUpdate: Sync Complete!

[    CSSD]2013-06-28 14:15:46.400 [278544] >TRACE:   clssgmReconfigThread:  started for reconfig (2)

[    CSSD]2013-06-28 14:15:46.400 [278544] >USER:    NMEVENT_RECONFIG [00][00][00][02]

[    CSSD]2013-06-28 14:15:46.401 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock crs_version type 2

[    CSSD]2013-06-28 14:15:46.401 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(crs_version) birth(1/1)

[    CSSD]2013-06-28 14:15:46.401 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 2

[    CSSD]2013-06-28 14:15:46.401 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 3

[    CSSD]2013-06-28 14:15:46.401 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 2

[    CSSD]2013-06-28 14:15:46.402 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_2_idb) birth(1/1)

[    CSSD]2013-06-28 14:15:46.402 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 3

[    CSSD]2013-06-28 14:15:46.402 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_2_idb) birth(1/1)

[    CSSD]2013-06-28 14:15:46.402 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock DBIDB type 2

[    CSSD]2013-06-28 14:15:46.403 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(DBIDB) birth(1/1)

[    CSSD]2013-06-28 14:15:46.403 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock DGIDB type 2

 

Each node tried to evict the other, but the node that was not the master (idb2) was the one evicted.
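To compare the two ocssd.log excerpts side by side, it can help to reduce each log to its key events: missed check-ins, the start of eviction, the eviction itself, and the self-termination on the losing node. The sketch below is only an illustration built from the message texts quoted in this post. The function name scan_events and the event labels are hypothetical, and other CRS versions may word these messages differently.

```python
import re

# Lines copied from the idb2 ocssd.log excerpt above.
LOG_SAMPLE = """\
[    CSSD]2013-06-28 14:15:11.422 [213005] >TRACE:   clssnmPollingThread: node idb1 (1) missed(59) checkin(s)
[    CSSD]2013-06-28 14:15:12.424 [213005] >TRACE:   clssnmPollingThread: Eviction started for node idb1 (1), flags 0x000f, state 3, wt4c 0
[    CSSD]2013-06-28 14:15:16.431 [245775] >ERROR:   clssnmCheckDskInfo: Terminating local instance to avoid splitbrain.
[    CSSD]2013-06-28 14:15:16.431 [245775] >ERROR:                 : Node(2), Leader(2), Size(1) VS Node(1), Leader(1), Size(1)
"""

# Event label -> pattern, taken from the log messages shown in this post.
PATTERNS = {
    "missed_checkin": re.compile(r"node (\S+) \((\d+)\) missed\((\d+)\) checkin"),
    "eviction_started": re.compile(r"Eviction started for node (\S+) \((\d+)\)"),
    "evicting": re.compile(r"clssnmEvict: Evicting node (\d+),"),
    "self_terminate": re.compile(r"Terminating local instance to avoid splitbrain"),
}
TIMESTAMP = re.compile(r"\](\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[")


def scan_events(text):
    """Return a list of (timestamp, event label, captured groups)."""
    events = []
    for line in text.splitlines():
        ts = TIMESTAMP.search(line)
        for kind, pattern in PATTERNS.items():
            m = pattern.search(line)
            if m:
                events.append((ts.group(1) if ts else None, kind, m.groups()))
    return events
```

Run against the idb2 excerpt, the last event is self_terminate, while the same scan of the idb1 excerpt would end with an evicting event for node 2, which matches the outcome described above.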

 

crsstat status

[oracle@idb1 cssd]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     OFFLINE          

ora.idb.intra.cs                              ONLINE     ONLINE on idb1   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb1   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     OFFLINE          

ora.idb2.gsd                                  ONLINE     OFFLINE          

ora.idb2.ons                                  ONLINE     OFFLINE          

ora.idb2.vip                                  ONLINE     ONLINE on idb1   

 

idb2 then reboots and comes back up, but nothing changes because the interconnect is still disconnected.

 

If OCFS is used and it runs over the interconnect interface, the shared disk on idb2 is not mounted either.

 

[root@idb2 ~]# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/sda1              17G  7.7G  8.0G  49% /

none                  754M     0  754M   0% /dev/shm

/dev/sda3             373M   92M  262M  27% /var

[root@idb2 ~]# cat /etc/fstab

# This file is edited by fstab-sync - see 'man fstab-sync' for details

LABEL=/                 /                       ext3    defaults        1 1

none                    /dev/pts                devpts  gid=5,mode=620  0 0

none                    /dev/shm                tmpfs   defaults        0 0

none                    /proc                   proc    defaults        0 0

none                    /sys                    sysfs   defaults        0 0

LABEL=/var              /var                    ext3    defaults        1 2

LABEL=SWAP-sda2         swap                    swap    defaults        0 0

/dev/sdb                /oradata                ocfs2   _netdev,datavolume,nointr 0 0

/dev/hdc                /media/cdrom            auto    pamconsole,exec,noauto,managed 0 0

 

 

Restore the interconnect

After waiting a while, run mount -a or mount the volume manually.

 

[root@idb2 ~]# mount -a

[root@idb2 ~]# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/sda1              17G  7.7G  8.0G  49% /

none                  754M     0  754M   0% /dev/shm

/dev/sda3             373M   92M  262M  27% /var

/dev/sdb              9.8G  1.4G  8.5G  14% /oradata
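The check done by eye above (an ocfs2 entry exists in /etc/fstab but its mount point is absent from df -h) can be expressed as a small comparison. This is a rough sketch that works on captured text rather than on the live system. The function names are hypothetical, and it assumes df prints each filesystem on a single line, as in the output shown here.

```python
# Relevant lines copied from the /etc/fstab and df -h output above.
FSTAB_SAMPLE = """\
LABEL=/                 /                       ext3    defaults        1 1
/dev/sdb                /oradata                ocfs2   _netdev,datavolume,nointr 0 0
"""

DF_SAMPLE = """\
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              17G  7.7G  8.0G  49% /
none                  754M     0  754M   0% /dev/shm
/dev/sda3             373M   92M  262M  27% /var
"""


def fstab_mounts(fstab_text, fstype="ocfs2"):
    """Return (device, mount point) pairs for fstab entries of the given type."""
    mounts = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[2] == fstype:
            mounts.append((fields[0], fields[1]))
    return mounts


def mounted_points(df_text):
    """Return the set of mount points listed in df output (header skipped)."""
    points = set()
    for line in df_text.splitlines()[1:]:
        fields = line.split()
        if fields:
            points.add(fields[-1])
    return points


def missing_mounts(fstab_text, df_text, fstype="ocfs2"):
    """List fstab entries of the given type that df does not show as mounted."""
    mounted = mounted_points(df_text)
    return [(dev, mp) for dev, mp in fstab_mounts(fstab_text, fstype)
            if mp not in mounted]
```

With the df output taken before mount -a, missing_mounts reports /dev/sdb on /oradata; once the /oradata line appears in df, the list is empty.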

 

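Shortly afterwards CRSD starts automatically. However, although the VIP moves back to idb2, the services are not restored.

Connecting to idb2 through the web or intra service is not possible.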

 

[oracle@idb2 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb1   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb1   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

Relocate the intra service back to idb2

 

[oracle@idb2 ~]$ srvctl relocate service -d idb -s intra -i idb1 -t idb2

[oracle@idb2 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb1   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   
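The manual relocation above can be derived from the crsstat rows themselves: a service resource named for one instance but running on another node is a candidate for srvctl relocate service. The sketch below only builds the command strings and does not run anything. It assumes the resource naming ora.<db>.<service>.<instance>.srv seen in this post, and the node-to-instance mapping passed in is an assumption the caller must supply (here instance and node names happen to be identical). The function name relocate_commands is hypothetical.

```python
import re

# Service rows copied from the crsstat output taken before the relocation.
CRSSTAT_SAMPLE = """\
ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb1
ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1
"""

# ora.<db>.<service>.<instance>.srv  <target>  ONLINE on <node>
SRV_ROW = re.compile(r"^ora\.(\w+)\.(\w+)\.(\w+)\.srv\s+\S+\s+ONLINE on (\S+)\s*$")


def relocate_commands(crsstat_text, node_to_instance):
    """Build srvctl commands for services running away from their named instance."""
    commands = []
    for line in crsstat_text.splitlines():
        m = SRV_ROW.match(line.strip())
        if not m:
            continue
        db, service, named_inst, node = m.groups()
        current_inst = node_to_instance.get(node)
        if current_inst and current_inst != named_inst:
            commands.append(
                f"srvctl relocate service -d {db} -s {service} "
                f"-i {current_inst} -t {named_inst}"
            )
    return commands
```

For the sample rows and the mapping {"idb1": "idb1", "idb2": "idb2"}, the only command produced is the same relocate command for the intra service that was run by hand above. Review the generated commands before running them.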

 

##################################### master node power off #######################################

The current master node is idb1.

 

[oracle@idb2 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb1   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb1   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb1   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

Power off idb1

 

idb2 ocssd.log

 

[    CSSD]2013-06-28 14:42:12.569 [213005] >TRACE:   clssnmPollingThread: node idb1 (1) missed(59) checkin(s)

[    CSSD]2013-06-28 14:42:13.571 [213005] >TRACE:   clssnmPollingThread: node idb1 (1) is impending reconfig

[    CSSD]2013-06-28 14:42:13.571 [213005] >TRACE:   clssnmPollingThread: Eviction started for node idb1 (1), flags 0x000f, state 3, wt4c 0

[    CSSD]2013-06-28 14:42:13.571 [245775] >TRACE:   clssnmDoSyncUpdate: Initiating sync 4

[    CSSD]2013-06-28 14:42:13.571 [245775] >TRACE:   clssnmDoSyncUpdate: diskTimeout set to (57000)ms

[    CSSD]2013-06-28 14:42:13.571 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (11)

[    CSSD]2013-06-28 14:42:13.571 [245775] >TRACE:   clssnmSetupAckWait: node(1) is ALIVE

[    CSSD]2013-06-28 14:42:13.571 [245775] >TRACE:   clssnmSetupAckWait: node(2) is ALIVE

[    CSSD]2013-06-28 14:42:13.571 [245775] >TRACE:   clssnmSendSync: syncSeqNo(4)

[    CSSD]2013-06-28 14:42:13.571 [131080] >TRACE:   clssnmHandleSync: Acknowledging sync: src[2] srcName[idb2] seq[1] sync[4]

[    CSSD]2013-06-28 14:42:13.571 [131080] >TRACE:   clssnmHandleSync: diskTimeout set to (57000)ms

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(11), ackCount(1)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmWaitForAcks: node(1) is expiring, msg type(11)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmWaitForAcks: done, msg type(11)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmDoSyncUpdate: node(0) missCount(734) state(0)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmDoSyncUpdate: node(1) missCount(60) state(3)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (13)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmSetupAckWait: node(2) is ACTIVE

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmSendVote: syncSeqNo(4)

[    CSSD]2013-06-28 14:42:13.572 [131080] >TRACE:   clssnmSendVoteInfo: node(2) syncSeqNo(4)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(13), ackCount(0)

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmCheckDskInfo: Checking disk info...

[    CSSD]2013-06-28 14:42:13.572 [245775] >TRACE:   clssnmCheckDskInfo: node(1) timeout(58670) state_network(0) state_disk(3) missCount(60)

[    CSSD]2013-06-28 14:42:13.643 [16384] >USER:    NMEVENT_SUSPEND [00][00][00][06]

[    CSSD]2013-06-28 14:42:14.573 [245775] >TRACE:   clssnmCheckDskInfo: node(1) timeout(59680) state_network(0) state_disk(3) missCount(61)

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmEvict: Start

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmEvict: Evicting node 1, birth 1, death 4, killme 1

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmEvict: Evicting Node(1), timeout(60000)

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmSendShutdown: req to node 1, kill time 659124

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmDiscHelper: node idb1 (1) connection failed

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmWaitOnEvictions: Start

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmWaitOnEvictions: Node(1) down, LATS(599124),timeout(60000)

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmSetupAckWait: Ack message type (15)

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmSetupAckWait: node(2) is ACTIVE

[    CSSD]2013-06-28 14:42:14.896 [245775] >TRACE:   clssnmSendUpdate: syncSeqNo(4)

[    CSSD]2013-06-28 14:42:14.897 [245775] >TRACE:   clssnmWaitForAcks: Ack message type(15), ackCount(1)

[    CSSD]2013-06-28 14:42:14.897 [131080] >TRACE:   clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)

[    CSSD]2013-06-28 14:42:14.897 [131080] >TRACE:   clssnmDeactivateNode: node 0 () left cluster

 

[    CSSD]2013-06-28 14:42:14.897 [131080] >TRACE:   clssnmUpdateNodeState: node 1, state (0/0) unique (1372388811/1372388811) prevConuni(1372388811) birth (1/0) (old/new)

[    CSSD]2013-06-28 14:42:14.897 [131080] >TRACE:   clssnmDeactivateNode: node 1 (idb1) left cluster

 

[    CSSD]2013-06-28 14:42:14.897 [131080] >TRACE:   clssnmUpdateNodeState: node 2, state (3/3) unique (1372397396/1372397396) prevConuni(0) birth (3/3) (old/new)

[    CSSD]2013-06-28 14:42:14.897 [131080] >USER:    clssnmHandleUpdate: SYNC(4) from node(2) completed

[    CSSD]2013-06-28 14:42:14.897 [131080] >USER:    clssnmHandleUpdate: NODE 2 (idb2) IS ACTIVE MEMBER OF CLUSTER

[    CSSD]2013-06-28 14:42:14.897 [131080] >TRACE:   clssnmHandleUpdate: diskTimeout set to (200000)ms

[    CSSD]2013-06-28 14:42:14.897 [245775] >TRACE:   clssnmWaitForAcks: done, msg type(15)

[    CSSD]2013-06-28 14:42:14.897 [245775] >TRACE:   clssnmDoSyncUpdate: Sync Complete!

[    CSSD]2013-06-28 14:42:14.969 [278544] >TRACE:   clssgmReconfigThread:  started for reconfig (4)

[    CSSD]2013-06-28 14:42:14.969 [278544] >USER:    NMEVENT_RECONFIG [00][00][00][04]

[    CSSD]2013-06-28 14:42:14.970 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock crs_version type 2

[    CSSD]2013-06-28 14:42:14.970 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(crs_version) birth(1/1)

[    CSSD]2013-06-28 14:42:14.970 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 3

[    CSSD]2013-06-28 14:42:14.970 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_1_idb) birth(1/1)

[    CSSD]2013-06-28 14:42:14.971 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_1_idb type 2

[    CSSD]2013-06-28 14:42:14.971 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(ORA_CLSRD_1_idb) birth(1/1)

[    CSSD]2013-06-28 14:42:14.971 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 2

[    CSSD]2013-06-28 14:42:14.971 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ORA_CLSRD_2_idb type 3

[    CSSD]2013-06-28 14:42:14.971 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock DBIDB type 2

[    CSSD]2013-06-28 14:42:14.971 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(DBIDB) birth(1/1)

[    CSSD]2013-06-28 14:42:14.972 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock DGIDB type 2

[    CSSD]2013-06-28 14:42:14.972 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(DGIDB) birth(1/1)

[    CSSD]2013-06-28 14:42:14.972 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock IGIDBALL type 2

[    CSSD]2013-06-28 14:42:14.972 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(IGIDBALL) birth(1/1)

[    CSSD]2013-06-28 14:42:14.973 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock DAALL_DB type 2

[    CSSD]2013-06-28 14:42:14.973 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(DAALL_DB) birth(1/1)

[    CSSD]2013-06-28 14:42:14.973 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock EVMDMAIN type 2

[    CSSD]2013-06-28 14:42:14.973 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(EVMDMAIN) birth(1/1)

[    CSSD]2013-06-28 14:42:14.975 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock CRSDMAIN type 2

[    CSSD]2013-06-28 14:42:14.975 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(CRSDMAIN) birth(1/1)

[    CSSD]2013-06-28 14:42:14.977 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock _ORA_CRS_MEMBER_idb1 type 3

[    CSSD]2013-06-28 14:42:14.977 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(0) grock(_ORA_CRS_MEMBER_idb1) birth(1/1)

[    CSSD]2013-06-28 14:42:14.979 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock _ORA_CRS_MEMBER_idb2 type 3

[    CSSD]2013-06-28 14:42:14.980 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock ocr_crs type 2

[    CSSD]2013-06-28 14:42:14.980 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(ocr_crs) birth(1/1)

[    CSSD]2013-06-28 14:42:14.987 [278544] >TRACE:   clssgmCleanupGrocks: cleaning up grock #CSS_CLSSOMON type 2

[    CSSD]2013-06-28 14:42:14.988 [278544] >TRACE:   clssgmCleanupOrphanMembers: cleaning up remote mbr(1) grock(#CSS_CLSSOMON) birth(1/1)

[    CSSD]2013-06-28 14:42:14.994 [278544] >TRACE:   clssgmEstablishConnections: 1 nodes in cluster incarn 4

[    CSSD]2013-06-28 14:42:14.996 [196620] >TRACE:   clssgmPeerDeactivate: node 1 (idb1), death 4, state 0x80000000 connstate 0xa

[    CSSD]2013-06-28 14:42:14.996 [196620] >TRACE:   clssgmPeerListener: connects done (1/1)

[    CSSD]2013-06-28 14:42:14.996 [278544] >TRACE:   clssgmEstablishMasterNode: MASTER for 4 is node(2) birth(3)

[    CSSD]2013-06-28 14:42:14.996 [278544] >TRACE:   clssgmChangeMasterNode: requeued 0 RPCs

[    CSSD]2013-06-28 14:42:14.999 [278544] >TRACE:   clssgmMasterCMSync: Synchronizing group/lock status

[    CSSD]2013-06-28 14:42:15.007 [278544] >TRACE:   clssgmMasterSendDBDone: group/lock status synchronization complete

[    CSSD]CLSS-3000: reconfiguration successful, incarnation 4 with 1 nodes

 

[    CSSD]CLSS-3001: local node number 2, master node number 2

 

The master node has changed to idb2.
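The master node can be read straight off the CLSS-3001 line instead of scanning the whole trace. A minimal sketch: the helper name css_master_node is hypothetical, and it only assumes the log contains lines in the CLSS-3001 format shown above.

```shell
# Hypothetical helper: read a CSSD log on stdin and print the master node
# number from the most recent CLSS-3001 line.
css_master_node() {
  sed -n 's/.*CLSS-3001: local node number [0-9]*, master node number \([0-9]*\).*/\1/p' |
    tail -n 1
}
```

Usage, assuming the CSS daemon log in the cssd directory is named ocssd.log: css_master_node < ocssd.log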

 

[oracle@idb2 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb2   

ora.idb.idb1.inst                             ONLINE     OFFLINE          

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb2   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb2   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb2   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     OFFLINE          

ora.idb1.gsd                                  ONLINE     OFFLINE          

ora.idb1.ons                                  ONLINE     OFFLINE          

ora.idb1.vip                                  ONLINE     ONLINE on idb2   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

The idb1 VIP and the web service have failed over to idb2.

 

Power idb1 back on.

 

[oracle@idb2 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb2   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb2   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb2   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb2   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2   

 

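The cluster ends up in the state shown above. The idb1 VIP returns to idb1, but the web service stays on idb2; it is not failed back automatically.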
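A service left on the wrong node is easy to miss in a long crsstat listing. The sketch below flags it by comparing the instance name embedded in each ora.<db>.<service>.<instance>.srv resource with the node in the State column. The helper name misplaced_services is hypothetical, and it assumes instance names match node names, as they do in this test (idb1, idb2).

```shell
# Hypothetical helper: read crsstat-style output on stdin and print every
# service member resource that is ONLINE on a node other than the instance
# named in the resource (ora.<db>.<service>.<instance>.srv).
# Assumption: instance names equal node names (idb1, idb2 in this test).
misplaced_services() {
  awk '$1 ~ /\.srv$/ && $3 == "ONLINE" && $4 == "on" {
         n = split($1, part, ".")
         if (part[n-1] != $5)
           printf "%s: named for %s, running on %s\n", part[n-2], part[n-1], $5
       }'
}
```

Usage: crsstat | misplaced_services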

 

Relocate the service manually with srvctl.

 

[oracle@idb2 ~]$ srvctl relocate service -d idb -s web -i idb2 -t idb1

[oracle@idb2 ~]$ crsstat

HA Resource                                   Target     State            

-----------                                   ------     -----            

ora.idb.db                                    ONLINE     ONLINE on idb2   

ora.idb.idb1.inst                             ONLINE     ONLINE on idb1   

ora.idb.idb2.inst                             ONLINE     ONLINE on idb2   

ora.idb.intra.cs                              ONLINE     ONLINE on idb2   

ora.idb.intra.idb2.srv                        ONLINE     ONLINE on idb2   

ora.idb.web.cs                                ONLINE     ONLINE on idb2   

ora.idb.web.idb1.srv                          ONLINE     ONLINE on idb1   

ora.idb1.LISTENER_IDB1.lsnr                   ONLINE     ONLINE on idb1   

ora.idb1.gsd                                  ONLINE     ONLINE on idb1   

ora.idb1.ons                                  ONLINE     ONLINE on idb1   

ora.idb1.vip                                  ONLINE     ONLINE on idb1   

ora.idb2.LISTENER_IDB2.lsnr                   ONLINE     ONLINE on idb2   

ora.idb2.gsd                                  ONLINE     ONLINE on idb2   

ora.idb2.ons                                  ONLINE     ONLINE on idb2   

ora.idb2.vip                                  ONLINE     ONLINE on idb2
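In the relocate command above, -d is the database, -s the service, -i the instance the service is currently running on, and -t the target instance. A minimal sketch of a wrapper around that same command follows; the function name relocate_service and the DRY_RUN variable are hypothetical and not part of srvctl.

```shell
# Hypothetical wrapper around the relocate command used above.
# With DRY_RUN=1 it only prints the srvctl command instead of running it.
relocate_service() {
  db=$1 svc=$2 from=$3 to=$4
  if [ "$from" = "$to" ]; then
    echo "source and target instance are the same: $from" >&2
    return 1
  fi
  # Build the command once so the dry run prints exactly what would run.
  set -- srvctl relocate service -d "$db" -s "$svc" -i "$from" -t "$to"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}
```

Usage: DRY_RUN=1 relocate_service idb web idb2 idb1 prints the command for review; without DRY_RUN the command is executed.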