
Oracle 11.2.0.1 RAC Grid fails to start: Oracle High Availability Services startup failed

2012-11-21 13:26
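I installed an 11.2.0.1 RAC on virtual machines. I chose 11.2.0.1 because of an issue with the public IP and private network segments. While the instance was being created, the computer froze, and after the reboot CRS would not start.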

[root@rac1 bin]# ./crsctl start crs

CRS-4124: Oracle High Availability Services startup failed.

CRS-4000: Command Start failed, or completed with errors.

[root@rac1 bin]# ps -ef|grep has

root 8081 1 0 03:14 ? 00:00:00 /u01/app/grid/11.2.0/bin/ohasd.bin reboot

root 8137 4230 1 03:23 pts/0 00:00:00 grep has

[root@rac1 bin]# kill -9 8081

[root@rac1 bin]# ./crsctl start crs

CRS-4124: Oracle High Availability Services startup failed.

CRS-4000: Command Start failed, or completed with errors.

Check the logs:

[grid@rac2 rac2]$ ll

total 72

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:38 admin

drwxrwxr-t 4 root oinstall 4096 Nov 21 00:38 agent

-rw-rw-r-- 1 root root 9693 Nov 21 02:26 alertrac2.log

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:43 client

drwxr-x--- 2 root oinstall 4096 Nov 21 00:42 crsd

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:39 cssd

drwxr-x--- 2 root oinstall 4096 Nov 21 00:41 ctssd

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:39 diskmon

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:42 evmd

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:38 gipcd

drwxr-x--- 2 root oinstall 4096 Nov 21 00:38 gnsd

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:40 gpnpd

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:38 mdnsd

drwxr-x--- 2 root oinstall 4096 Nov 21 00:39 ohasd

drwxrwxr-t 5 grid oinstall 4096 Nov 21 00:38 racg

drwxr-x--- 2 grid oinstall 4096 Nov 21 00:42 srvm

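Apart from alertrac2.log, which was updated around the time of the crash, none of the other files had been updated. I went to node 1 and tried a restart there: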
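Rather than comparing timestamps in `ll` output by eye, the same check can be sketched with find. This is only an illustration: the function name `recently_modified` is made up for this example, and the log directory is passed in as an argument because the post does not show its full path. `-mmin` is a GNU/BSD find extension, which is an assumption about the platform (it is available on Linux).

```shell
#!/bin/sh
# recently_modified DIR MINUTES
# Print regular files under DIR whose contents were modified within the
# last MINUTES minutes. (Hypothetical helper, not part of Oracle's tooling.)
recently_modified() {
    dir=$1
    minutes=$2
    # -mmin -N matches files modified less than N minutes ago.
    find "$dir" -type f -mmin "-$minutes" -print
}
```

For example, `recently_modified /path/to/grid/log/rac2 30` would list whatever the stack wrote in the last half hour. An empty result right after a failed `crsctl start crs` suggests the daemons never got far enough to log anything.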

[root@rac1 client]# ll

total 124

-rw-r--r-- 1 root root 193 Nov 21 00:31 clscfg.log

-rw-rw-rw- 1 root root 28635 Nov 21 00:32 crsctl.log

-rw-r--r-- 1 root root 114 Nov 21 00:32 crsctl.trc

-rw-r--r-- 1 grid oinstall 663 Nov 21 03:08 css.log

-rw-r--r-- 1 grid oinstall 1051 Nov 21 00:28 gpnptool_11653.log

-rw-r--r-- 1 grid oinstall 114 Nov 21 00:28 gpnptool_11653.trc

-rw-r--r-- 1 grid oinstall 1461 Nov 21 00:28 gpnptool_11660.log

-rw-r--r-- 1 grid oinstall 114 Nov 21 00:28 gpnptool_11660.trc

-rw-r--r-- 1 grid oinstall 551 Nov 21 00:35 oclskd.log

-rw-r----- 1 root root 6100 Nov 21 00:27 ocrconfig_11312.log

-rw-r--r-- 1 root root 3170 Nov 21 00:31 ocrconfig_12191.log

-rw-r----- 1 root root 342 Nov 21 00:37 ocrconfig_13798.log

-rw-r--r-- 1 grid oinstall 33862 Nov 21 00:45 oifcfg.log

-rw-r--r-- 1 grid oinstall 114 Nov 21 00:45 oifcfg.trc

-rw-r--r-- 1 root root 1067 Nov 21 00:36 olsnodes.log

-rw-r--r-- 1 grid oinstall 114 Nov 21 00:37 olsnodes.trc

-- css.log contains only the following errors:

[root@rac1 client]# cat css.log

Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.

2012-11-21 03:08:22.764: [CSSCLNT][4171966208]clssscConnect: gipc request failed with 29 (0x13)

2012-11-21 03:08:22.764: [ CSSCLNT][4171966208]clsssInitNative: connect failed, rc 29

2012-11-21 03:08:28.140: [CSSCLNT][4171966208]clssscConnect: gipc request failed with 29 (0x13)

2012-11-21 03:08:28.140: [CSSCLNT][4171966208]clsssInitNative: connect failed, rc 29

2012-11-21 03:08:37.908: [CSSCLNT][4171966208]clssscConnect: gipc request failed with 29 (0x13)

2012-11-21 03:08:37.908: [ CSSCLNT][4171966208]clsssInitNative: connect failed, rc 29

According to the MOS note:

How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]

/article/1363201.html

1. ocssd is fully up

If ocssd.bin is not fully up, crsd.log will show messages like following:

2010-02-03 22:37:51.638: [CSSCLNT][1548456880]clssscConnect: gipc request failed with 29 (0x16)
2010-02-03 22:37:51.638: [ CSSCLNT][1548456880]clsssInitNative: connect failed, rc 29
2010-02-03 22:37:51.639: [ CRSRTI][1548456880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
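To check a log for this signature without reading it line by line, a grep count is enough. The sketch below is an assumption-laden convenience: `count_css_connect_failures` is a name invented here, and the pattern is taken only from the log lines quoted in this post.

```shell
#!/bin/sh
# count_css_connect_failures FILE
# Print how many lines in FILE report a failed clsssInitNative connect.
# The pattern tolerates an optional space after the colon.
# Exit status follows grep: non-zero when the count is 0.
count_css_connect_failures() {
    grep -c 'clsssInitNative: *connect failed' "$1"
}
```

Running `count_css_connect_failures css.log` a few minutes apart shows whether the count is still growing, that is, whether clients are still failing to reach CSS.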

So the OCSSD process cannot start. But why not? Let's run strace against the ohasd process:

[root@rac1 client]# ps -ef|grep has

root 12192 1 0 12:44 ? 00:00:00 /u01/app/grid/11.2.0/bin/ohasd.bin reboot

root 12281 8085 0 13:05 pts/2 00:00:00 grep has

[root@rac1 client]# strace -p 12192 -o dave.log

Process 12192 attached - interrupt to quit

quit

Process 12192 detached

[root@rac1 client]#

[root@rac1 client]# ls

clscfg.log dave.log gpnptool_11660.trc ocrconfig_13798.log olsnodes.trc

crsctl.log gpnptool_11653.log oclskd.log oifcfg.log

crsctl.trc gpnptool_11653.trc ocrconfig_11312.log oifcfg.trc

css.log gpnptool_11660.log ocrconfig_12191.log olsnodes.log

[root@rac1 client]# cat dave.log

open("/var/tmp/.oracle/npohasd", O_WRONLY <unfinished ...>

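This is a very important clue. ohasd is stuck in an open() call on /var/tmp/.oracle/npohasd that never completes. npohasd is a named pipe, and opening a named pipe write-only blocks until some other process opens it for reading. We also run into this file when installing 11.2.0.1 RAC; it is arguably an 11.2.0.1 bug.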
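The blocking behaviour seen in the strace output can be reproduced with a throwaway named pipe, without touching anything Oracle-related. This is a minimal sketch of generic FIFO semantics only: the function name `fifo_demo` and the temporary pipe are made up for this example, and it assumes `mktemp` and `mkfifo` are available.

```shell
#!/bin/sh
# fifo_demo: show that a write-only open on a FIFO blocks until a reader
# opens the other end. Uses a temporary pipe, never npohasd itself.
fifo_demo() {
    tmpdir=$(mktemp -d) || return 1
    fifo="$tmpdir/demo_pipe"
    mkfifo "$fifo" || return 1

    # Writer: the redirection opens the FIFO write-only and blocks there,
    # much like the open(..., O_WRONLY) seen in the strace output.
    ( echo "hello" > "$fifo"; echo "writer unblocked" ) &
    writer=$!

    sleep 1
    if kill -0 "$writer" 2>/dev/null; then
        echo "writer still blocked after 1s"
    fi

    # Reader: a dd of the same form as the one used on npohasd.
    dd if="$fifo" of=/dev/null bs=1024 count=1 2>/dev/null

    wait "$writer"
    rm -rf "$tmpdir"
}
```

Calling `fifo_demo` prints "writer still blocked after 1s" and then, once dd has opened the pipe for reading, "writer unblocked". That is the same mechanism by which a reader on npohasd lets ohasd continue.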

Reference:

Oracle 11g RAC: ohasd failed to start at /u01/app/11.2.0/grid/crs/install/rootcrs.pl line 443 (solution)

/article/1448705.html

So before starting CRS, first run the following dd command on both nodes:

[root@rac1 client]# /bin/dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
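If this workaround ends up in a script, it may be worth confirming that the path really is a named pipe before reading from it. The wrapper below is a hedged sketch: `drain_pipe_once` is a hypothetical helper name, and the only Oracle-specific detail it relies on is the pipe path shown above.

```shell
#!/bin/sh
# drain_pipe_once PATH
# Open PATH for reading and consume at most one 1024-byte block, but only
# if PATH is a FIFO. Note that dd itself blocks until a writer opens the pipe.
drain_pipe_once() {
    pipe=$1
    if [ ! -p "$pipe" ]; then
        echo "not a named pipe: $pipe" >&2
        return 1
    fi
    dd if="$pipe" of=/dev/null bs=1024 count=1 2>/dev/null
}
```

On each node this would be invoked as root with `drain_pipe_once /var/tmp/.oracle/npohasd`. The `-p` test keeps the command from silently reading an ordinary file if the path is wrong or the pipe has not been created.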

Then start CRS. This time there is no problem:

[root@rac1 bin]# ./crsctl start crs

CRS-4123: Oracle High Availability Services has been started.

[root@rac2 bin]# ./crsctl start crs

CRS-4123: Oracle High Availability Services has been started.

[root@rac2 bin]# ./crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

CRS-4534: Cannot communicate with Event Manager

[root@rac1 bin]# ./crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

CRS-4534: Cannot communicate with Event Manager

[root@rac1 bin]# ./crsctl start cluster -all

CRS-5702: Resource 'ora.crsd' is already running on 'rac1'

CRS-5702: Resource 'ora.crsd' is already running on 'rac2'

[root@rac1 bin]# ./crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online

[root@rac2 bin]# ./crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online

-- Check the resources: they have all been brought up. Note that 11g processes start rather slowly, so wait a while.

[root@rac2 u01]# sh crs_stat.sh

Name                           Target     State      Host
------------------------------ ---------- ---------  -------
ora.DATA.dg                    ONLINE     ONLINE     rac1
ora.FRA.dg                     ONLINE     ONLINE     rac1
ora.LISTENER.lsnr              ONLINE     ONLINE     rac1
ora.LISTENER_SCAN1.lsnr        ONLINE     ONLINE     rac2
ora.OCRVOTING.dg               ONLINE     ONLINE     rac1
ora.asm                        ONLINE     ONLINE     rac1
ora.dave.db                    OFFLINE    OFFLINE
ora.eons                       ONLINE     ONLINE     rac1
ora.gsd                        OFFLINE    OFFLINE
ora.net1.network               ONLINE     ONLINE     rac1
ora.oc4j                       OFFLINE    OFFLINE
ora.ons                        ONLINE     ONLINE     rac1
ora.rac1.ASM1.asm              ONLINE     ONLINE     rac1
ora.rac1.LISTENER_RAC1.lsnr    ONLINE     ONLINE     rac1
ora.rac1.gsd                   OFFLINE    OFFLINE
ora.rac1.ons                   ONLINE     ONLINE     rac1
ora.rac1.vip                   ONLINE     ONLINE     rac1
ora.rac2.ASM2.asm              ONLINE     ONLINE     rac2
ora.rac2.LISTENER_RAC2.lsnr    ONLINE     ONLINE     rac2
ora.rac2.gsd                   OFFLINE    OFFLINE
ora.rac2.ons                   ONLINE     ONLINE     rac2
ora.rac2.vip                   ONLINE     ONLINE     rac2
ora.scan1.vip                  ONLINE     ONLINE     rac2
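Since the 11g stack comes up slowly, re-running the check by hand gets tedious. A polling loop along the following lines is one way to wait for it. This is a sketch under stated assumptions: `wait_until` and `crs_all_online` are names invented for this example, and the health test relies only on the message wording quoted in this post ("is online", "Cannot communicate", "Communications failure"), not on crsctl's exit status, whose behaviour is not assumed here.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD [ARGS...]
# Run CMD up to TRIES times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as CMD succeeds, 1 if it never does.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# crs_all_online CRSCTL
# Succeeds when "CRSCTL check crs" prints at least one "is online" line
# and no "Cannot communicate" / "Communications failure" line.
# (Hypothetical helper; it judges health from the message text only.)
crs_all_online() {
    out=$("$1" check crs 2>&1) || true
    printf '%s\n' "$out" | grep -q 'is online' || return 1
    if printf '%s\n' "$out" | grep -Eq 'Cannot communicate|Communications failure'; then
        return 1
    fi
    return 0
}
```

From the Grid bin directory, `wait_until 30 10 crs_all_online ./crsctl` would poll every 10 seconds for up to about five minutes. Both numbers are arbitrary and should be tuned to how slowly the stack starts on the machine in question.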

Now we can get back to our instance. Once that is sorted out, I will upgrade to 11.2.0.3.4 so that I do not run into this problem every time.

---------------------------------------------------------------------------------------

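All rights reserved. This article may be reproduced, but the source must be credited with a link; otherwise legal action will be pursued.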

Skype: tianlesoftware

QQ: tianlesoftware@gmail.com

Email: tianlesoftware@gmail.com

Blog: http://blog.csdn.net/tianlesoftware

Weibo: http://weibo.com/tianlesoftware

Twitter: http://twitter.com/tianlesoftware

Facebook: http://www.facebook.com/tianlesoftware

Linkedin: http://cn.linkedin.com/in/tianlesoftware