ORA-12547: TNS:lost contact when adding a node on 11.2.0.4
2015-12-30 00:00
Environment:
A two-node 11.2.0.4 RAC on RHEL 6 Update 5.
[root@rac2 ~]# uname -a
Linux rac2 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@rac2 ~]# uname -r
2.6.32-431.el6.x86_64
[oracle@rac2 ~]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.188.18 rac1
192.168.188.19 rac2
192.168.188.20 rac3
192.168.188.118 rac1-vip
192.168.188.119 rac2-vip
192.168.188.120 rac3-vip
192.168.182.18 rac1-priv
192.168.182.19 rac2-priv
192.168.182.20 rac3-priv
192.168.188.105 scan
[oracle@rac2 ~]$
While running dbca to add the third node, the errors below appeared and the third database instance could not be added.
Relevant errors from /u01/app/11.2.0/grid/log/rac3/agent/crsd/oraagent_oracle/oraagent_oracle.log:
2015-09-10 01:38:21.978: [ora.orcl.db][3571566336]{1:28142:484} [start] crsHome = /u01/app/11.2.0/grid
2015-09-10 01:38:21.978: [ora.orcl.db][3571566336]{1:28142:484} [start] oracleHome = /u02/app/oracle/product/11.2.0/dbhome_1
2015-09-10 01:38:21.978: [ora.orcl.db][3571566336]{1:28142:484} [start] command = '/u01/app/11.2.0/grid/bin/setasmgidwrap oracle_binary_path=/u02/app/oracle/product/11.2.0/dbhome_1/bin/oracle'
2015-09-10 01:38:21.979: [ora.orcl.db][3571566336]{1:28142:484} [start] start dependency = hard(ora.DATA.dg) weak(type:ora.listener.type,global:type:ora.scan_listener.type,uniform:ora.ons,global:ora.gns,ora.FRA.dg) pullup(ora.DATA.dg)
2015-09-10 01:38:21.979: [ora.orcl.db][3571566336]{1:28142:484} [start] ASM disk group dependency found
2015-09-10 01:38:21.979: [ora.orcl.db][3571566336]{1:28142:484} [start] Utils:execCmd action = 1 flags = 6 ohome = /u01/app/11.2.0/grid cmdname = setasmgidwrap.
2015-09-10 01:38:23.937: [ AGFW][3567363840]{1:28142:484} Agent received the message: RESOURCE_MODIFY_ATTR[ora.orcl.db 3 1] ID 4355:671
2015-09-10 01:38:50.992: [ora.orcl.db][3571566336]{1:28142:484} [start] execCmd ret = 0
2015-09-10 01:38:50.992: [ USRTHRD][3571566336]{1:28142:484} InstConnection::initMutex AttachLock 00ae3210 DetachLock 00ae3228
2015-09-10 01:38:50.994: [ora.orcl.db][3571566336]{1:28142:484} [start] clsnInstConnection::makeConnectStr UsrOraEnv m_oracleHome /u02/app/oracle/product/11.2.0/dbhome_1 Crshome /u01/app/11.2.0/grid
2015-09-10 01:38:50.994: [ora.orcl.db][3571566336]{1:28142:484} [start] makeConnectStr = (DESCRIPTION=(ADDRESS=(PROTOCOL=beq)(PROGRAM=/u02/app/oracle/product/11.2.0/dbhome_1/bin/oracle)(ARGV0=oracleorcl3)(ENVS='ORACLE_HOME=/u02/app/oracle/product/11.2.0/dbhome_1,ORACLE_SID=orcl3,LD_LIBRARY_PATH=')(ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))'))(CONNECT_DATA=(SID=orcl3)))
2015-09-10 01:38:51.223: [ora.orcl.db][3571566336]{1:28142:484} [start] Container:start oracle home /u02/app/oracle/product/11.2.0/dbhome_1
2015-09-10 01:38:51.224: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt: server not attached
2015-09-10 01:38:52.996: [ora.orcl.db][3571566336]{1:28142:484} [start] ORA-12547: TNS:lost contact
2015-09-10 01:38:53.030: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt (1) Exception OCIException
2015-09-10 01:38:53.032: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection:connect:excp OCIException OCI error 12547
2015-09-10 01:38:53.033: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt: server not attached
2015-09-10 01:38:53.712: [ora.orcl.db][3571566336]{1:28142:484} [start] ORA-12547: TNS:lost contact
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] InstConnection::connectInt (1) Exception OCIException
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start: 1 errcode 12547
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::resetConnection s_statusOfConnectionMap 00ae9760
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::resetConnection sid orcl3 status 2
2015-09-10 01:38:53.713: [ora.orcl.db][3571566336]{1:28142:484} [start] Gimh::check OH /u02/app/oracle/product/11.2.0/dbhome_1 SID orcl3
2015-09-10 01:38:53.754: [ora.orcl.db][3571566336]{1:28142:484} [start] GIMH: GIM-00104: Health check failed to connect to instance.
GIM-00090: OS-dependent operation:open failed with status: 2
GIM-00091: OS failure message: No such file or directory
GIM-00092: OS failure occurred at: sskgmsmr_7
2015-09-10 01:38:53.754: [ora.orcl.db][3571566336]{1:28142:484} [start] (:CLSN00007:)DbAgent::check failed gimh state 0
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] clsnDbAgent:checkCbk clsagfw_res_status ret 5
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::stopConnection
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::removeConnection connection count 0
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::removeConnection freed 0
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] ConnectionPool::stopConnection sid orcl3 status 1
2015-09-10 01:38:53.763: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::check 1 prev clsagfw_res_status 0 current clsagfw_res_status 5
2015-09-10 01:38:53.764: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start not logged on check state details Abnormal Termination
2015-09-10 01:38:53.764: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start: ORA-1012 or Lost Contact try cleanOracleIpc and start force
2015-09-10 01:38:53.764: [ USRTHRD][3571566336]{1:28142:484} InstConnection:~InstConnection: this b00070c0
2015-09-10 01:38:53.766: [ora.orcl.db][3571566336]{1:28142:484} [start] InstAgent::start call sysresv
2015-09-10 01:38:53.766: [ora.orcl.db][3571566336]{1:28142:484} [start] Container:start scls_clean_oracle_ipc Container orcl3 dbHome /u02/app/oracle/product/11.2.0/dbhome_1
Searching MOS for these errors turned up nothing useful.
So I changed tack and tried logging in with sqlplus / as sysdba to see what error it would report:
[oracle@rac3 oracle]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Sep 10 12:09:13 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
ERROR:
ORA-12547: TNS:lost contact
Enter user-name:
ERROR:
ORA-12547: TNS:lost contact
Enter user-name:
ERROR:
ORA-12547: TNS:lost contact
SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus
[oracle@rac3 oracle]$
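Following the hint in the MOS note SYSDBA Connections Fail With ORA-12547 Error (Doc ID 782276.1), I found many .trc files under $ORACLE_HOME/rdbms/log.
(You may wonder why I did not look under bdump. The instance had not been created yet at this point, so there was no bdump directory.)
An excerpt from one of the trace files: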
[oracle@rac3 log]$ more orcl3_ora_14292.trc
Dump file /u02/app/oracle/product/11.2.0/dbhome_1/rdbms/log/orcl3_ora_14292.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u02/app/oracle/product/11.2.0/dbhome_1
System name: Linux
Node name: rac3
Release: 2.6.32-431.el6.x86_64
Version: #1 SMP Sun Nov 10 22:19:54 EST 2013
Machine: x86_64
Instance name: orcl3
Redo thread mounted by this instance: 0 <none>
Oracle process number: 0
Unix process pid: 14292, image: oracle@rac3
*** 2015-09-10 11:32:38.641
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [500], [1200], [], [], [], [], [], [], [], [], []
----- SQL Statement (None) -----
Current SQL information unavailable - no SGA.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFFB8AFF650 ? 7FFFB8AFF728 ?
                                                   7FFFB8B041D0 ? 000000002 ?
ksedst1()+103        call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFFB8AFF650 ? 7FFFB8AFF728 ?
                                                   7FFFB8B041D0 ? 000000002 ?
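With many trace files in the directory, paging through them one at a time is slow. The following is a minimal sketch, not the method used in the original troubleshooting, of pulling the ORA-00600 line out of every .trc file in one pass. The function name scan_traces is hypothetical, and the directory argument is assumed to be $ORACLE_HOME/rdbms/log as described above.

```shell
# Print every ORA-00600 line found in the .trc files of a log directory,
# prefixed with the name of the file it came from.
# scan_traces is an illustrative helper name, not an Oracle tool.
scan_traces() (
    log_dir=$1
    # -H forces the file name onto each match even when only one file exists;
    # stderr is discarded so an empty directory produces no noise.
    grep -H 'ORA-00600' "$log_dir"/*.trc 2>/dev/null
)

# Example call (assumes ORACLE_HOME is set in the environment):
#   scan_traces "$ORACLE_HOME/rdbms/log"
```

Files without an ORA-00600 entry produce no output, so the result is a short list of the traces worth opening.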
The key error in the trace is:
[spstp: ORACLE_HOME uid does not match euid], [500], [1200], [], [], [], [], [], [], [], [], []
A MOS search for it led to the note ORA-600 [spstp: ORACLE_HOME uid does not match euid] When Changing Permissions On $ORACLE_HOME/bin/oracle (Doc ID 747456.1).
According to that note, the 500 in the error is the uid and the 1200 is the euid.
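To make that reading of the arguments concrete, here is a small sketch that extracts the two numbers from the error line so they can be compared with what the operating system reports. The helper name parse_spstp is hypothetical, and the labels uid and euid simply follow the interpretation given above.

```shell
# Extract the first two numeric arguments from an
# "ORA-00600 ... [spstp: ORACLE_HOME uid does not match euid], [N], [M]" line.
# Prints "uid=N euid=M", or nothing if the line does not match.
# parse_spstp is an illustrative helper name, not an Oracle tool.
parse_spstp() {
    printf '%s\n' "$1" |
        sed -n 's/.*\[spstp: ORACLE_HOME uid does not match euid], \[\([0-9][0-9]*\)], \[\([0-9][0-9]*\)].*/uid=\1 euid=\2/p'
}

# Values to compare against on the affected node (GNU stat, as on RHEL):
#   stat -c %u "$ORACLE_HOME"    # numeric owner of the Oracle home directory
#   id -u                        # uid of the user running sqlplus
```

If one of the two extracted numbers matches neither the current user nor any known account, the ownership of the files under ORACLE_HOME is the natural next thing to inspect.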
So I checked the IDs of the oracle and grid users on this node:
[oracle@rac3 oracle]$ id oracle
uid=1200(oracle) gid=1000(oinstall) groups=1000(oinstall),1200(dba),1201(oper),1300(asmdba)
[oracle@rac3 oracle]$ id grid
uid=1100(grid) gid=1000(oinstall) groups=1000(oinstall),1200(dba),1100(asmadmin),1301(asmoper),1300(asmdba)
[oracle@rac3 oracle]$
Neither output contains 500. So where did the 500 come from? Checking the ownership of the database ORACLE_HOME revealed the problem:
[oracle@rac3 ~]$ pwd
/home/oracle
[oracle@rac3 ~]$ cd /u02/app/oracle/product/11.2.0/
[oracle@rac3 11.2.0]$ ls -lrt
total 4
drwxrwxr-x 74 500 oinstall 4096 Sep 10 01:12 dbhome_1
[oracle@rac3 11.2.0]$ cd ..
[oracle@rac3 product]$ ls -lrt
total 4
drwxrwxr-x 3 500 oinstall 4096 Sep 9 21:46 11.2.0
[oracle@rac3 product]$ cd ..
[oracle@rac3 oracle]$ ls -lrt
total 12
drwxrwxr-x 3 500 oinstall 4096 Sep 9 21:36 product --------->product is owned by uid 500 here; this pinpoints the problem
drwxr-xr-x 3 oracle oinstall 4096 Sep 10 01:37 cfgtoollogs
drwxr-xr-x 3 oracle oinstall 4096 Sep 10 11:31 admin
[oracle@rac3 oracle]$ pwd
/u02/app/oracle
[oracle@rac3 oracle]$
After changing the owner to oracle, adding the node succeeded.
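The post does not record the exact command used for the ownership change. One possible way to do it, shown here only as a sketch, is to let find select the entries still owned by the old numeric uid and hand them to chown. The function name reown_stale and its dry-run default are assumptions made for illustration; the real change has to be run as root.

```shell
# Print (default) or run (with a fourth argument of "apply") a chown for
# every entry under a tree that is still owned by a stale numeric uid.
# Arguments: <tree> <stale_uid> <new_owner> [apply]
# reown_stale is an illustrative helper name, not an Oracle tool.
reown_stale() (
    tree=$1 stale_uid=$2 new_owner=$3 mode=${4:-dry-run}
    # find -user treats a number as a uid when no account has that name.
    if [ "$mode" = apply ]; then
        # -h changes a symlink itself rather than the file it points to.
        find "$tree" -user "$stale_uid" -exec chown -h "$new_owner" {} +
    else
        find "$tree" -user "$stale_uid" -exec echo chown -h "$new_owner" {} \;
    fi
)

# Example dry run with the values from this case:
#   reown_stale /u02/app/oracle/product 500 oracle
```

On Linux, chown can clear the setuid and setgid bits of executables, so it is worth checking the permissions of $ORACLE_HOME/bin/oracle afterwards and comparing them with a healthy node.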
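To summarize: /u02/app/oracle/product showed 500 as its owner because the oracle user on rac3 originally had uid 500, while on the other two nodes the oracle user has uid 1200. As is well known, RAC does not work when a user's uid differs between nodes. So the uid on rac3 was changed to match, but the node addition was started before the ownership of /u02/app/oracle/product had been updated. The rest is the story told above.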
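A mismatch like this can be caught before a node is added. The sketch below assumes the uids have already been collected as "node uid" pairs, for example over the passwordless ssh that a RAC installation normally sets up, and flags any node that disagrees with the first one. The function name check_uid_consistency is hypothetical.

```shell
# Read "node uid" pairs on stdin and report every node whose uid differs
# from the first node listed. Exit status: 0 if all agree, 1 otherwise.
# check_uid_consistency is an illustrative helper name, not an Oracle tool.
check_uid_consistency() {
    awk '
        NR == 1   { ref = $2 }
        $2 != ref { printf "MISMATCH: %s has uid %s, expected %s\n", $1, $2, ref; bad = 1 }
        END       { exit bad + 0 }
    '
}

# Example collection step (assumes ssh access to each node):
#   for n in rac1 rac2 rac3; do
#       printf '%s %s\n' "$n" "$(ssh "$n" id -u oracle)"
#   done | check_uid_consistency
```

This only compares account uids. As this case shows, file ownership under the Oracle home has to be checked separately after a uid is changed.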