GIPCHA down 0, OLR fetch for parameter logsize (8) failed with rc 21
2015-01-14 15:06
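I recently hit a problem on a RAC One Node system: after both CRS and the database had been stopped, startup failed.
I looked at the CSS log and could not make sense of it, so I worked through the problem step by step.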
[grid@11g ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.DATA.dg ora....up.type OFFLINE OFFLINE
ora.DATADG.dg ora....up.type OFFLINE OFFLINE
ora....ER.lsnr ora....er.type ONLINE ONLINE 11g
ora.asm ora.asm.type OFFLINE OFFLINE
ora.asmdb.db ora....se.type OFFLINE OFFLINE
ora.cssd ora.cssd.type OFFLINE OFFLINE
ora.diskmon ora....on.type OFFLINE OFFLINE
ora.evmd ora.evm.type ONLINE ONLINE 11g
ora.ons ora.ons.type OFFLINE OFFLINE
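When the table gets longer, it can help to list only the resources that are not ONLINE. The filter below is a minimal sketch, assuming the five-column `crs_stat -t` layout shown above (Name, Type, Target, State, Host); the function name `not_online` is made up for illustration. Note that `crs_stat` is deprecated in 11.2 in favor of `crsctl status resource`.

```shell
# Print the names of resources whose State column is not ONLINE.
# Reads `crs_stat -t` style output on stdin: Name Type Target State [Host].
not_online() {
    # Skip the header row; the dashed separator has too few fields to match.
    awk '$1 != "Name" && NF >= 4 && $4 != "ONLINE" { print $1 }'
}

# Usage on the cluster node:
#   crs_stat -t | not_online
```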
From this output the problem looked like ASM, so I tried restarting everything:
[grid@11g cssd]$ crs_start -all
CRS-5702: Resource 'ora.LISTENER.lsnr' is already running on '11g'
CRS-5702: Resource 'ora.evmd' is already running on '11g'
CRS-2501: Resource 'ora.ons' is disabled
Attempting to start `ora.cssd` on member `11g`
Attempting to start `ora.diskmon` on member `11g`
Start of `ora.diskmon` on member `11g` succeeded.
Start of `ora.cssd` on member `11g` succeeded.
Attempting to start `ora.asm` on member `11g`
ORA-01031: insufficient privileges
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-01031: insufficient privileges
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/11g/agent/ohasd/oraagent_grid/oraagent_grid.log".
Start of `ora.asm` on member `11g` failed.
Attempting to stop `ora.asm` on member `11g`
ORA-01031: insufficient privileges
Stop of `ora.asm` on member `11g` succeeded.
Attempting to start `ora.asm` on member `11g`
ORA-01031: insufficient privileges
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-01031: insufficient privileges
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/11g/agent/ohasd/oraagent_grid/oraagent_grid.log".
Start of `ora.asm` on member `11g` failed.
Attempting to stop `ora.asm` on member `11g`
ORA-01031: insufficient privileges
Stop of `ora.asm` on member `11g` succeeded.
CRS-0215: Could not start resource 'ora.DATADG.dg'.
CRS-0223: Resource 'ora.LISTENER.lsnr' has placement error.
CRS-0215: Could not start resource 'ora.asmdb.db'.
CRS-0223: Resource 'ora.evmd' has placement error.
CRS-2660: Resource 'ora.ons' or all of its instances are disabled
ocssd.log showed the following messages:
2015-01-14 10:37:07.760: [ CSSD][3330419984]clsu_load_ENV_levels: Module = OLR, LogLevel = 0, TraceLevel = 0
[ CSSD][3330419984]clsugetconf : Configuration type [3].
2015-01-14 10:37:07.760: [ CSSD][3330419984]clssscmain: Starting CSS daemon, version 11.2.0.3.0, in (local-only) mode with uniqueness value 1421203027
2015-01-14 10:37:07.761: [ CSSD][3330419984]clssscmain: Environment is production
2015-01-14 10:37:07.761: [ CSSD][3330419984]clssscmain: Core file size limit extended
2015-01-14 10:37:07.766: [ CSSD][3330419984]clssscmain: GIPCHA down 0
2015-01-14 10:37:07.767: [ CSSD][3330419984]clssscGetParameterOLR: OLR fetch for parameter logsize (8) failed with rc 21
2015-01-14 10:37:07.767: [ CSSD][3330419984]clssscExtendLimits: The current soft limit for file descriptors is 65536, hard limit is 65536
2015-01-14 10:37:07.767: [ CSSD][3330419984]clssscExtendLimits: The current soft limit for locked memory is 4294967295, hard limit is 4294967295
2015-01-14 10:37:07.767: [ CSSD][3330419984]clssscmain: Running as user grid
Connecting to ASM with sqlplus / as sysdba failed with ORA-01031: insufficient privileges,
which pointed to an operating system authentication problem.
[grid@11g admin]$ more sqlnet.ora
# sqlnet.ora Network Configuration File: /u01/app/11.2.0/grid/network/admin/sqlnet.ora
# Generated by Oracle configuration tools.
NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)
ADR_BASE = /u01/app/grid    <-- problem here
Sqlnet.authentication_services=(nts)
[grid@11g admin]$ ls
listener1412302PM5934.bak listener.ora.bak listener.ora.bak.11g samples shrept.lst sqlnet1412302PM5934.bak sqlnet1.ora sqlnet.ora
[grid@11g admin]$ mv sqlnet.ora sqlnet.ora.bak    <-- just take the file out of use
Tried again, and this time it worked.
[grid@11g cssd]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Jan 14 11:01:06 2015
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORA-01031: insufficient privileges
SQL> conn / as sysasm
Connected to an idle instance.
SQL> startup mount
ASM instance started
Total System Global Area 283930624 bytes
Fixed Size 2227664 bytes
Variable Size 256537136 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
At this point ASM has started successfully (note that startup only worked after connecting as sysasm).
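Moving the whole sqlnet.ora aside worked here, but it does not show which line was at fault. A likely culprit (an assumption, the steps above did not isolate it) is `Sqlnet.authentication_services=(nts)`, because NTS is the Windows native authentication adapter and this host is Linux. The sketch below only reads that setting from the file; the function name `auth_services` and the `SQLNET_ORA` variable are made up for illustration, and the default path is the one from the header comment of the file shown earlier.

```shell
# Print the value of SQLNET.AUTHENTICATION_SERVICES from a sqlnet.ora file.
# Comment lines are ignored; prints nothing if the parameter is not set.
auth_services() {
    grep -i '^[[:space:]]*sqlnet\.authentication_services' "$1" |
        sed 's/^[^=]*=[[:space:]]*//'
}

# Assumed location; override SQLNET_ORA for a different Grid home.
SQLNET_ORA=${SQLNET_ORA:-/u01/app/11.2.0/grid/network/admin/sqlnet.ora}

if [ -r "$SQLNET_ORA" ]; then
    value=$(auth_services "$SQLNET_ORA")
    case "$value" in
        *[Nn][Tt][Ss]*) echo "authentication_services=$value (NTS is Windows-only)" ;;
        "")             echo "authentication_services is not set" ;;
        *)              echo "authentication_services=$value" ;;
    esac
fi
```

If the setting turns out to be the cause, removing or correcting that one line would keep the rest of the file (for example NAMES.DIRECTORY_PATH) in effect.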
Start the database:
[oracle@11g ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Jan 14 11:06:56 2015
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 839282688 bytes
Fixed Size 2233000 bytes
Variable Size 541068632 bytes
Database Buffers 293601280 bytes
Redo Buffers 2379776 bytes
Database mounted.
Database opened.
Checked the cluster; it was healthy.
Restarted once more to verify:
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_stop -all
CRS-2500: Cannot stop resource 'ora.diskmon' as it is not running
CRS-2500: Cannot stop resource 'ora.ons' as it is not running
Attempting to stop `ora.LISTENER.lsnr` on member `11g`
Attempting to stop `ora.evmd` on member `11g`
Attempting to stop `ora.DATA.dg` on member `11g`
Attempting to stop `ora.DATADG.dg` on member `11g`
Attempting to stop `ora.asmdb.db` on member `11g`
ORA-01031: insufficient privileges
Stop of `ora.evmd` on member `11g` succeeded.
Stop of `ora.LISTENER.lsnr` on member `11g` succeeded.
Stop of `ora.DATADG.dg` on member `11g` succeeded.
Attempting to stop `ora.cssd` on member `11g`
Stop of `ora.cssd` on member `11g` succeeded.
CRS-5017: The resource action "ora.DATA.dg stop" encountered the following error:
DgpAgent::getConnxn aborted. For details refer to "(:CLSN00108:)" in "/u01/app/11.2.0/grid/log/11g/agent/ohasd/oraagent_grid/oraagent_grid.log".
CRS-5017: The resource action "ora.DATA.dg check" encountered the following error:
DgpAgent::getConnxn aborted. For details refer to "(:CLSN00109:)" in "/u01/app/11.2.0/grid/log/11g/agent/ohasd/oraagent_grid/oraagent_grid.log".
Attempting to stop `ora.DATA.dg` on member `11g`
`ora.DATA.dg` on member `11g` has experienced an unrecoverable failure.
Attempting to start `ora.cssd` on member `11g`
Start of `ora.cssd` on member `11g` succeeded.
CRS-0216: Could not stop resource 'ora.cssd'.
CRS-0216: Could not stop resource 'ora.diskmon'.
CRS-0216: Could not stop resource 'ora.ons'.
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.DATA.dg ora....up.type OFFLINE UNKNOWN 11g
ora.DATADG.dg ora....up.type OFFLINE OFFLINE
ora....ER.lsnr ora....er.type OFFLINE OFFLINE
ora.asm ora.asm.type OFFLINE OFFLINE
ora.asmdb.db ora....se.type OFFLINE OFFLINE
ora.cssd ora.cssd.type OFFLINE ONLINE 11g
ora.diskmon ora....on.type OFFLINE OFFLINE
ora.evmd ora.evm.type OFFLINE OFFLINE
ora.ons ora.ons.type OFFLINE OFFLINE
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_stop ora.cssd
Attempting to stop `ora.cssd` on member `11g`
Stop of `ora.cssd` on member `11g` succeeded.
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.DATA.dg ora....up.type OFFLINE UNKNOWN 11g
ora.DATADG.dg ora....up.type OFFLINE OFFLINE
ora....ER.lsnr ora....er.type OFFLINE OFFLINE
ora.asm ora.asm.type OFFLINE OFFLINE
ora.asmdb.db ora....se.type OFFLINE OFFLINE
ora.cssd ora.cssd.type OFFLINE OFFLINE
ora.diskmon ora....on.type OFFLINE OFFLINE
ora.evmd ora.evm.type OFFLINE OFFLINE
ora.ons ora.ons.type OFFLINE OFFLINE
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_stop ora.DATA.dg
Attempting to stop `ora.DATA.dg` on member `11g`
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_stop -f ora.DATA.dg
CRS-2545: Cannot operate on 'instance of ora.DATA.dg assigned to 11g'. It is locked by 'root' for command 'Stop Resource' issued from '11g'
CRS-0233: Resource or relatives are currently involved with another operation.
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_stop ora.DATA.dg -f
CRS-2545: Cannot operate on 'instance of ora.DATA.dg assigned to 11g'. It is locked by 'root' for command 'Stop Resource' issued from '11g'
CRS-0233: Resource or relatives are currently involved with another operation.
[root@11g ~]# /u01/app/11.2.0/grid/bin/crs_start -all
CRS-2501: Resource 'ora.ons' is disabled
Attempting to start `ora.evmd` on member `11g`
Attempting to start `ora.LISTENER.lsnr` on member `11g`
Attempting to start `ora.cssd` on member `11g`
Attempting to start `ora.diskmon` on member `11g`
Start of `ora.diskmon` on member `11g` succeeded.
Start of `ora.cssd` on member `11g` succeeded.
Start of `ora.LISTENER.lsnr` on member `11g` succeeded.
Attempting to stop `ora.asm` on member `11g`
Stop of `ora.asm` on member `11g` succeeded.
Attempting to start `ora.asm` on member `11g`
Start of `ora.evmd` on member `11g` succeeded.
Start of `ora.asm` on member `11g` succeeded.
Attempting to start `ora.DATA.dg` on member `11g`
Attempting to start `ora.DATADG.dg` on member `11g`
Start of `ora.DATA.dg` on member `11g` succeeded.
Attempting to stop `ora.asmdb.db` on member `11g`
Start of `ora.DATADG.dg` on member `11g` succeeded.
Stop of `ora.asmdb.db` on member `11g` succeeded.
Attempting to start `ora.asmdb.db` on member `11g`
Start of `ora.asmdb.db` on member `11g` succeeded.
CRS-2660: Resource 'ora.ons' or all of its instances are disabled
=============
I also found a write-up about diskmon and am recording it here:
[grid@vm11gr2] /home/grid> sqlplus "/as sysasm"
SQL*Plus: Release 11.2.0.1.0 Production on Sun Oct 25 10:16:21 2009
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORA-01078: failure in processing system parameters
ORA-29701: unable to connect to Cluster Synchronization Service
SQL>
The instance cannot connect to the CSS service. Check at the operating system level:
[grid@vm11gr2] /home/grid> crsctl check css
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
[grid@vm11gr2] /home/grid>
[grid@vm11gr2] /home/grid> ps -ef|grep cssd
Sure enough, there is no CSS daemon process. Next, check the state of HAS (High Availability Service):
[grid@vm11gr2] /home/grid> crsctl check has
CRS-4638: Oracle High Availability Services is online
[grid@vm11gr2] /home/grid> ps -ef|grep d.bin
grid 5886 1 0 10:06 ? 00:00:01 /u01/app/grid/product/11.2/grid/bin/ohasd.bin reboot
[grid@vm11gr2] /home/grid>
HAS is indeed running, and the ora.cssd and ora.diskmon resources depend on HAS to manage them.
Look further at the state of each resource:
[grid@vm11gr2] /home/grid> crs_stat -t
Name Type Target State Host
--------------------------------------------------------------
ora.FLASH_DATA.dg ora.diskgroup.type OFFLINE OFFLINE vm11gr2
ora.SYS_DATA.dg ora.diskgroup.type OFFLINE OFFLINE vm11gr2
ora.asm ora.asm.type OFFLINE OFFLINE vm11gr2
ora.cssd ora.cssd.type OFFLINE OFFLINE vm11gr2
ora.diskmon ora.diskmon.type OFFLINE OFFLINE vm11gr2
[grid@vm11gr2] /home/grid>
[grid@vm11gr2] /home/grid> crsctl status resource -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_ DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.FLASH_DATA.dg
OFFLINE OFFLINE vm11gr2
ora.SYS_DATA.dg
OFFLINE OFFLINE vm11gr2
ora.asm
OFFLINE OFFLINE vm11gr2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
1 OFFLINE OFFLINE
ora.diskmon
1 OFFLINE OFFLINE
Now look at the attributes of ora.cssd and ora.diskmon:
[grid@vm11gr2] /home/grid> crs_stat -p ora.cssd
NAME=ora.cssd
TYPE=ora.cssd.type
ACTION_SCRIPT=
ACTIVE_PLACEMENT=0
AUTO_START=never
CHECK_INTERVAL=30
DESCRIPTION="Resource type for CSSD"
FAILOVER_DELAY=0
FAILURE_INTERVAL=3
FAILURE_THRESHOLD=5
HOSTING_MEMBERS=
PLACEMENT=balanced
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=600
START_TIMEOUT=600
STOP_TIMEOUT=900
UPTIME_THRESHOLD=1m
[grid@vm11gr2] /home/grid> crs_stat -p ora.diskmon
NAME=ora.diskmon
TYPE=ora.diskmon.type
ACTION_SCRIPT=
ACTIVE_PLACEMENT=0
AUTO_START=never
CHECK_INTERVAL=20
DESCRIPTION="Resource type for Diskmon"
FAILOVER_DELAY=0
FAILURE_INTERVAL=3
FAILURE_THRESHOLD=5
HOSTING_MEMBERS=
PLACEMENT=balanced
RESTART_ATTEMPTS=10
SCRIPT_TIMEOUT=60
START_TIMEOUT=60
STOP_TIMEOUT=60
UPTIME_THRESHOLD=5s
[grid@vm11gr2] /home/grid>
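The attribute that matters in the two dumps above is AUTO_START. To pull just that value out of the `NAME=value` listing, a small filter is enough. This is a sketch; the function name `auto_start_of` is made up for illustration.

```shell
# Print the AUTO_START value from `crs_stat -p <resource>` style output on stdin.
auto_start_of() {
    awk -F= '$1 == "AUTO_START" { print $2 }'
}

# Usage on the node:
#   crs_stat -p ora.cssd    | auto_start_of
#   crs_stat -p ora.diskmon | auto_start_of
```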
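That essentially explains it: the AUTO_START attribute of both resources defaults to never, so they are not started automatically when HAS starts,
even though HAS itself starts at boot by default. So start them manually: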
[grid@vm11gr2] /home/grid> crsctl start resource ora.cssd
CRS-2672: Attempting to start 'ora.cssd' on 'vm11gr2'
CRS-2679: Attempting to clean 'ora.diskmon' on 'vm11gr2'
CRS-2681: Clean of 'ora.diskmon' on 'vm11gr2' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'vm11gr2'
CRS-2676: Start of 'ora.diskmon' on 'vm11gr2' succeeded
CRS-2676: Start of 'ora.cssd' on 'vm11gr2' succeeded
[grid@vm11gr2] /home/grid>
Note: ora.cssd and ora.diskmon depend on each other, so starting either one brings up both.
[grid@vm11gr2] /home/grid> crs_stat -t
Name Type Target State Host
--------------------------------------------------------------
ora.FLASH_DATA.dg ora.diskgroup.type OFFLINE OFFLINE vm11gr2
ora.SYS_DATA.dg ora.diskgroup.type OFFLINE OFFLINE vm11gr2
ora.asm ora.asm.type OFFLINE OFFLINE vm11gr2
ora.cssd ora.cssd.type ONLINE ONLINE vm11gr2
ora.diskmon ora.diskmon.type ONLINE ONLINE vm11gr2
[grid@vm11gr2] /home/grid>
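The check-then-start sequence above can be wrapped in a small function. This is a sketch under stated assumptions: it keys on the CRS-4530 message seen in the `crsctl check css` output earlier rather than on an exit code, and the function name `ensure_css` and the `CRSCTL` override variable are made up for illustration.

```shell
# Start ora.cssd only when `crsctl check css` reports CRS-4530.
# CRSCTL may be set to a full path; by default crsctl is assumed to be on PATH.
ensure_css() {
    crsctl_cmd=${CRSCTL:-crsctl}
    if "$crsctl_cmd" check css 2>&1 | grep -q 'CRS-4530'; then
        echo "CSS not reachable, starting ora.cssd"
        "$crsctl_cmd" start resource ora.cssd
    else
        echo "CSS check did not report CRS-4530, nothing to do"
    fi
}

# Usage as the grid user:
#   ensure_css
```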
CSS is up; now restart the ASM instance:
[grid@vm11gr2] /home/grid> sqlplus "/as sysasm"
SQL*Plus: Release 11.2.0.1.0 Production on Sun Oct 25 10:30:03 2009
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ASM instance started
Total System Global Area 284565504 bytes
Fixed Size 1336036 bytes
Variable Size 258063644 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Automatic Storage Management option
[grid@vm11gr2] /home/grid> crs_stat -t
Name Type Target State Host
--------------------------------------------------------------
ora.FLASH_DATA.dg ora.diskgroup.type ONLINE ONLINE vm11gr2
ora.SYS_DATA.dg ora.diskgroup.type ONLINE ONLINE vm11gr2
ora.asm ora.asm.type ONLINE ONLINE vm11gr2
ora.cssd ora.cssd.type ONLINE ONLINE vm11gr2
ora.diskmon ora.diskmon.type ONLINE ONLINE vm11gr2
[grid@vm11gr2] /home/grid>
Tips
1) By default HAS (High Availability Service) starts automatically. Automatic startup can be disabled and re-enabled with:
crsctl disable has
crsctl enable has
2) Starting and stopping HAS manually:
crsctl start has
crsctl stop has
3) Checking the status of HAS:
crsctl check has
4) To have ora.cssd and ora.diskmon start automatically with HAS, change the AUTO_START attribute of the two resources:
crsctl modify resource "ora.cssd" -attr "AUTO_START=1"
or
crsctl modify resource "ora.diskmon" -attr "AUTO_START=1"
5) To turn automatic startup of ora.cssd and ora.diskmon off again:
crsctl modify resource "ora.cssd" -attr "AUTO_START=never"
crsctl modify resource "ora.diskmon" -attr "AUTO_START=never"
==============
Wed Jan 14 11:35:45 2015
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "DATADG" precludes its dismount
ERROR: ALTER DISKGROUP DATADG DISMOUNT /* asm agent *//* {0:0:153} */
Wed Jan 14 11:35:55 2015
SQL> ALTER DISKGROUP DATA DISMOUNT /* asm agent *//* {0:0:153} */
Wed Jan 14 11:35:55 2015
SQL> ALTER DISKGROUP DATADG DISMOUNT /* asm agent *//* {0:0:153} */
NOTE: Active use of SPFILE in group
Wed Jan 14 11:35:55 2015
GMON querying group 1 at 15 for pid 13, osid 3306
Wed Jan 14 11:35:55 2015
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "DATADG" precludes its dismount
ERROR: ALTER DISKGROUP DATADG DISMOUNT /* asm agent *//* {0:0:153} */
GMON querying group 2 at 16 for pid 13, osid 3306
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
Wed Jan 14 11:36:05 2015
ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "DATA" precludes its dismount
ERROR: ALTER DISKGROUP DATA DISMOUNT /* asm agent *//* {0:0:153} */
Wed Jan 14 11:36:08 2015
Errors in file /u01/app/grid/diag/asm/+asm/+ASM/trace/+ASM_gmon_3308.trc:
ORA-29746: Cluster Synchronization Service is being shut down.
GMON (ospid: 3308): terminating the instance due to error 29746
Wed Jan 14 11:36:09 2015
System state dump requested by (instance=1, osid=3308 (GMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM/trace/+ASM_diag_3292.trc
Dumping diagnostic data in directory=[cdmp_20150114113609], requested by (instance=1, osid=3308 (GMON)), summary=[abnormal instance termination].
Instance terminated by GMON, pid = 3308
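In the ASM alert log excerpt above, the agent's DISMOUNT attempts fail with ORA-15032/ORA-15027 because the diskgroups are still in use, and GMON then terminates the instance with ORA-29746 once CSS is shut down. To get that overview quickly from a long log, the error codes can be counted. This is a minimal sketch; the function name `ora_error_summary` is made up, and the alert log file name in the usage comment is an assumption (only the trace directory appears in the log above).

```shell
# List each distinct ORA- error code found on stdin with its number of occurrences.
ora_error_summary() {
    grep -o 'ORA-[0-9][0-9]*' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage (file name assumed):
#   ora_error_summary < /u01/app/grid/diag/asm/+asm/+ASM/trace/alert_+ASM.log
```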