Knowledge/RAC

11.2.0.4 crsd.bin kill test

neo-orcl 2016. 10. 19. 18:20

개요

정상적으로 운영중인 11.2.0.4 grid 환경에서 crsd.bin을 kill하면 바로 다시 시작되는데

이럴 상황은 거의 없겠지만 crsd를 다시 시작되지 않을 count만큼 kill 하고 어떻게 되는지 봅니다.

가끔 crsd만 죽어있는 상황에 적용할 수 있겠습니다.


 

kill 진행 - crsd.bin을 너무 빨리 kill하면 crsd startup hang이 걸리니 2초 간격으로 진행

[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep
root     16554     1  2 11:56 ?        00:00:01 /u01/app/11.2.0.4/grid/bin/crsd.bin reboot
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
[root@node01 ~]# ps -ef | grep crsd.bin | grep -v grep|  awk '{print $2}' | xargs kill -9
usage: kill [ -s signal | -p ] [ -a ] pid ...
       kill -l [ signal ]                                       --프로세스가 없어서 fail


=> 11번 죽였더니 안살아납니다. crsd의 maximun restart attempts은 확인을 못하겠네요. 테스트로 11번이 max 라는건 알게 되었습니다. 확인하는 다른 방법 아시는분은 알려주시면 감사하겠습니다.


alertcrs로그의 상태 중 마지막 영역

.....생략... 

2016-10-12 11:58:09.533:
[crsd(17859)]CRS-1201:CRSD started on node node01.
2016-10-12 11:58:10.123:
[ohasd(15872)]CRS-2765:Resource 'ora.crsd' has failed on server 'node01'.
2016-10-12 11:58:10.123:
[ohasd(15872)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.

 

리소스 상태 체크

[root@node01 ~]# crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
[root@node01 ~]# crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       node01                   Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       node01
ora.crf
      1        ONLINE  ONLINE       node01
ora.crsd
      1        ONLINE  OFFLINE

ora.cssd
      1        ONLINE  ONLINE       node01
ora.cssdmonitor
      1        ONLINE  ONLINE       node01
ora.ctssd
      1        ONLINE  ONLINE       node01                   ACTIVE:0
ora.diskmon
      1        OFFLINE OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       node01
ora.evmd
      1        ONLINE  ONLINE       node01
ora.gipcd
      1        ONLINE  ONLINE       node01
ora.gpnpd
      1        ONLINE  ONLINE       node01
ora.mdnsd
      1        ONLINE  ONLINE       node01

 

 


 

정상화 시키기 위해 stop 및 start 시도

[root@node01 ~]# crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.
-- crsd가 실행중이 아니라고 합니다

 

[root@node01 ~]# crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

-- 이번엔 OHAS는 이미 active라며 안됩니다.

 

crsd만 시작

[root@node01 ~]# crsctl start resource ora.crsd -init
CRS-2672: Attempting to start 'ora.crsd' on 'node01'
CRS-2676: Start of 'ora.crsd' on 'node01' succeeded      

 

확인

[root@node01 ~]# crsctl stat res -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       node01                   Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       node01
ora.crf
      1        ONLINE  ONLINE       node01
ora.crsd
      1        ONLINE  ONLINE       node01

ora.cssd
      1        ONLINE  ONLINE       node01
ora.cssdmonitor
      1        ONLINE  ONLINE       node01
ora.ctssd
      1        ONLINE  ONLINE       node01                   ACTIVE:0
ora.diskmon
      1        OFFLINE OFFLINE
ora.drivers.acfs
      1        ONLINE  ONLINE       node01
ora.evmd
      1        ONLINE  ONLINE       node01
ora.gipcd
      1        ONLINE  ONLINE       node01
ora.gpnpd
      1        ONLINE  ONLINE       node01
ora.mdnsd
      1        ONLINE  ONLINE       node01