RAC Instance Failure: Troubleshooting Report
2014-11-21 10:38
Engineering Log
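Date filed: 2014/11/20
[Purpose]
1. Fix the problem where the instance on one RAC node can only start as far as NOMOUNT.
[Project environment]
Operating system | Linux |
Hostname | zgcrac1 |
Database version | Oracle 11.2.0 |
Character set | |
Production instance name | PROD |
Listener | LISTENER/1521 |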
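[Implementation steps]
1. Analyze the cause of the error
1.1 Check the listener and analyze:
According to Li Xin, the listener could not be started, so the listener status was checked first.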
[grid@zgcrac1 ~]$ lsnrctl status
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 19-11月-2014 09:55:30
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                21-10月-2014 20:22:13
Uptime                    28 days 13 hr. 33 min. 16 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /g01/11ggrid/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /g01/11ggrid/app/11.2.0/grid/log/diag/tnslsnr/zgcrac1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.88.11)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.88.12)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "PROD" has 1 instance(s).
  Instance "PROD1", status BLOCKED, has 1 handler(s) for this service...
Resource status, as the grid user:
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATADG.dg  ora....up.type ONLINE    ONLINE    zgcrac1
ora.FDG.dg     ora....up.type ONLINE    ONLINE    zgcrac2
ora....ER.lsnr ora....er.type ONLINE    ONLINE    zgcrac2
ora....N1.lsnr ora....er.type ONLINE    ONLINE    zgcrac1
ora....EMDG.dg ora....up.type ONLINE    ONLINE    zgcrac1
ora.asm        ora.asm.type   ONLINE    ONLINE    zgcrac1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    zgcrac1
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    zgcrac1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    zgcrac1
ora.ons        ora.ons.type   ONLINE    ONLINE    zgcrac1
ora.prod.db    ora....se.type ONLINE    ONLINE    zgcrac2
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    zgcrac1
ora....SM1.asm application    ONLINE    ONLINE    zgcrac1
ora....C1.lsnr application    OFFLINE   OFFLINE
ora....ac1.gsd application    OFFLINE   OFFLINE
ora....ac1.ons application    ONLINE    ONLINE    zgcrac1
ora....ac1.vip ora....t1.type ONLINE    ONLINE    zgcrac1
ora....SM2.asm application    ONLINE    ONLINE    zgcrac2
ora....C2.lsnr application    ONLINE    ONLINE    zgcrac2
ora....ac2.gsd application    OFFLINE   OFFLINE
ora....ac2.ons application    ONLINE    ONLINE    zgcrac2
ora....ac2.vip ora....t1.type ONLINE    ONLINE    zgcrac2
The listener output shows the fault: instance PROD1 is registered with status BLOCKED.
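Spotting a BLOCKED instance by eye is easy to miss in longer listener output. The following is a minimal sketch of how the check could be scripted, assuming the `lsnrctl status` output has the same layout as in this log. The function name `not_ready_instances` is made up for illustration and is not part of any Oracle tool.

```shell
# Read `lsnrctl status` output on stdin and print "instance status" for every
# instance whose status is anything other than READY.
not_ready_instances() {
  # Pull the instance name and status out of each 'Instance "X", status Y,' line.
  sed -n 's/^[[:space:]]*Instance[[:space:]]*"\([^"]*\)",[[:space:]]*status[[:space:]]*\([A-Z]*\),.*/\1 \2/p' |
    awk '$2 != "READY"'   # keep only the instances that are not READY
}

# Hypothetical usage on a cluster node, as the grid user:
#   lsnrctl status | not_ready_instances
```

Run against the output above, this would report only PROD1 with status BLOCKED, since +ASM1 is READY.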
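1.2 Assess the listeners:
Checked the listener on the .12 address (172.16.88.12): no problem found.
Checked the listener on the .22 address (172.16.88.22): no problem found either.
Next, determine whether this is a network or IP problem.
1.3 The hosts file: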
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.88.11 zgcrac1 zgcrac1.com
172.16.88.12 zgcrac1-vip
172.16.88.21 zgcrac2 zgcrac2.com
172.16.88.22 zgcrac2-vip
172.16.88.10 zgcrac-cluster zgcrac-cluster-scan
10.10.1.1 zgcrac1-priv
10.10.1.2 zgcrac2-priv
The IP addresses of both servers are fine.
1.4 Check the database instance status:
select instance_name,host_name,status from v$instance
*
ERROR at line 1:
ORA-01034: ORACLE not available
Check the status of all instances:
INSTANCE_NAME HOST_NAME STATUS
------------------------------ --------------------------------------------------------
+ASM1 zgcrac1 STARTED
+ASM2 zgcrac2 STARTED
INSTANCE_NAME HOST_NAME STATUS
---------------- -------------------- ------------
PROD2 zgcrac2 OPEN
PROD1 zgcrac1 STARTED
Instance PROD1 cannot be opened: a STATUS of STARTED means it has only reached NOMOUNT.
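The STATUS column can be confusing because its values do not use the same words as the STARTUP command. As a small illustration, the mapping can be written down as a lookup. The function name `startup_stage` is hypothetical, and only the three values relevant to this log are covered.

```shell
# Map a v$instance STATUS value to the startup stage it corresponds to.
startup_stage() {
  case "$1" in
    STARTED) echo "NOMOUNT" ;;   # instance started, control file not yet read
    MOUNTED) echo "MOUNT" ;;     # control file read, datafiles not yet open
    OPEN)    echo "OPEN" ;;      # database open for normal use
    *)       echo "UNKNOWN" ;;   # any value not handled by this sketch
  esac
}

# Example: startup_stage STARTED   prints NOMOUNT
```

This matches the symptom in the purpose statement: PROD1 is stuck at the stage before the control file is read.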
1.5 Check the status of the RAC resources:
[grid@zgcrac1 admin]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE zgcrac1
ONLINE ONLINE zgcrac2
ora.FDG.dg
ONLINE OFFLINE zgcrac1
ONLINE ONLINE zgcrac2
ora.LISTENER.lsnr
ONLINE ONLINE zgcrac1
ONLINE ONLINE zgcrac2
ora.SYSTEMDG.dg
ONLINE ONLINE zgcrac1
ONLINE ONLINE zgcrac2
ora.asm
ONLINE ONLINE zgcrac1 Started
ONLINE ONLINE zgcrac2 Started
ora.gsd
OFFLINE OFFLINE zgcrac1
OFFLINE OFFLINE zgcrac2
ora.net1.network
ONLINE ONLINE zgcrac1
ONLINE ONLINE zgcrac2
ora.ons
ONLINE ONLINE zgcrac1
ONLINE ONLINE zgcrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE zgcrac1
ora.cvu
1 ONLINE ONLINE zgcrac1
ora.oc4j
1 ONLINE ONLINE zgcrac1
ora.prod.db
1 ONLINE OFFLINE Instance Shutdown
2 ONLINE ONLINE zgcrac2 Open
ora.scan1.vip
1 ONLINE ONLINE zgcrac1
ora.zgcrac1.vip
1 ONLINE ONLINE zgcrac1
ora.zgcrac2.vip
1 ONLINE ONLINE zgcrac2
The ora.FDG.dg resource turns out to be OFFLINE on node zgcrac1.
This points to a problem with the ASM disks.
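The observation above comes from scanning for rows whose TARGET is ONLINE while the STATE is OFFLINE. A minimal sketch of that scan follows, assuming the `crsctl stat res -t` layout shown in this log (resource name on its own line, status rows indented beneath it). The function name `find_offline_resources` is made up for illustration.

```shell
# Read `crsctl stat res -t` output on stdin and print every status row whose
# TARGET is ONLINE but whose STATE is OFFLINE, prefixed with the resource name.
find_offline_resources() {
  awk '
    /^ora\./ { name = $1; next }       # remember the current resource name
    {
      # Cluster resource rows start with an instance number, local rows do not.
      i = ($1 ~ /^[0-9]+$/) ? 2 : 1
      if ($i == "ONLINE" && $(i + 1) == "OFFLINE") {
        $1 = $1                        # rebuild the row with single spaces
        print name ": " $0
      }
    }
  '
}

# Hypothetical usage, as the grid user:
#   crsctl stat res -t | find_offline_resources
```

Resources such as ora.gsd, whose TARGET is also OFFLINE, are deliberately skipped because they are not expected to be running.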
1.6 Check the ASM disk group status:
On node zgcrac1:
SQL> select name,state from v$asm_diskgroup;
NAME
--------------------------------------------------------------------------------
STATE
---------------------------------
DATADG
MOUNTED
SYSTEMDG
MOUNTED
On node zgcrac2:
SQL> select name, state from v$asm_diskgroup;
NAME STATE
-----------------------------------------
DATADG CONNECTED
FDG CONNECTED
SYSTEMDG MOUNTED
Comparing the two nodes shows that the instance on zgcrac1 cannot start because ASM on that node does not recognize disk group FDG.
SQL> alter diskgroup FDG check all;
alter diskgroup FDG check all
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15001: diskgroup "FDG" does not exist or is not mounted
SQL> alter diskgroup FDG mount;
alter diskgroup FDG mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "FDG" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "FDG"
ORA-15080: synchronous I/O operation to a disk failed
ORA-15080: synchronous I/O operation to a disk failed
ORA-15080: synchronous I/O operation to a disk failed
ORA-15080: synchronous I/O operation to a disk failed
The problem can therefore be pinned on ASM disk group FDG, which also accounts for the missing control file error reported earlier when starting the instance.
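The node-to-node comparison in this step can be reduced to a set difference. The sketch below assumes the disk group names from each node have been saved to two text files, one name per line; how they get there (for example by spooling the `v$asm_diskgroup` query) is left open, and the function name and file names are hypothetical.

```shell
# Print the disk groups listed in the second file that are missing from the first.
# Each file holds one disk group name per line.
missing_diskgroups() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
  # grep exits non-zero when nothing is printed, so that case is tolerated.
  grep -Fxv -f "$1" "$2" || true
}

# Hypothetical usage:
#   missing_diskgroups node1_diskgroups.txt node2_diskgroups.txt
```

With the two result sets from this log as input, the only name reported would be FDG.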
1.7 Check the earliest alert log entries:
SQL> CREATE DISKGROUP FDG EXTERNAL REDUNDANCY DISK '/dev/mapper/mpathop1' SIZE 446462M ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
ERROR: failed to update diskgroup resource ora.FDG.dg
WARNING: failed to online diskgroup resource ora.FDG.dg (unable to communicate with CRSD/OHASD)
ORA-15032: not all alterations performed
The disk group already had problems when it was created. Based on past experience, this is probably a permissions problem on the disk group's devices.
1.8 Check the disk permissions:
[root@zgcrac1 mapper]# ll mpa*
lrwxrwxrwx 1 grid asmadmin 7 Oct 21 11:02 mpathb -> ../dm-5
lrwxrwxrwx 1 grid asmadmin 7 Oct 21 11:02 mpathbp1 -> ../dm-6
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathc -> ../dm-20
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathcp1 -> ../dm-22
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathd -> ../dm-19
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathdp1 -> ../dm-21
lrwxrwxrwx 1 grid asmadmin 7 Oct 21 11:02 mpathe -> ../dm-9
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathep1 -> ../dm-11
lrwxrwxrwx 1 grid asmadmin 7 Oct 21 11:02 mpathf -> ../dm-7
lrwxrwxrwx 1 grid asmadmin 7 Oct 21 11:02 mpathfp1 -> ../dm-8
lrwxrwxrwx 1 grid asmadmin 7 Oct 21 11:02 mpathg -> ../dm-3
lrwxrwxrwx 1 grid asmadmin 7 Oct 21 11:02 mpathgp1 -> ../dm-4
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathh -> ../dm-23
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathhp1 -> ../dm-24
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathi -> ../dm-27
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathip1 -> ../dm-29
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathj -> ../dm-28
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:02 mpathjp1 -> ../dm-30
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathk -> ../dm-25
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathkp1 -> ../dm-26
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathl -> ../dm-14
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathlp1 -> ../dm-16
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathm -> ../dm-17
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathmp1 -> ../dm-18
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathn -> ../dm-10
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathnp1 -> ../dm-13
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpatho -> ../dm-12
lrwxrwxrwx 1 grid asmadmin 8 Oct 21 11:34 mpathop1 -> ../dm-15
The ownership shown for every entry looks correct. Note, however, that these entries are symbolic links, so the listing shows the owner of the links, not of the dm-N devices they point to.
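Because a plain long listing reports the symlink rather than its target, it can hide a wrong owner on the underlying device. The following is a small sketch for printing both side by side. It assumes GNU coreutils (`stat -c` and `readlink -f`), and the function name `show_link_and_target_owner` is made up for illustration.

```shell
# For each path given, print the owner of the link itself and the owner of
# the node it resolves to, together with the resolved path.
show_link_and_target_owner() {
  for p in "$@"; do
    # Without -L, stat reports on the symlink itself.
    printf 'link:   %s\n' "$(stat -c '%U:%G %n' "$p")"
    # With -L, stat follows the link; readlink -f gives the resolved path.
    printf 'target: %s %s\n' "$(stat -L -c '%U:%G' "$p")" "$(readlink -f "$p")"
  done
}

# Hypothetical usage, as root:
#   show_link_and_target_owner /dev/mapper/mpathk /dev/mapper/mpathkp1
```

`ls -lL` gives a similar view, since `-L` makes `ls` report on the file a symlink points to.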
Check the disk information with kfod:
[root@zgcrac1 mapper]# kfod disk=all
WARNING: Using brute force method to determine the size of /dev/raw/rawctl.
There will be performance issues. Please check configuration to determine the cause for the failure of ioctl
--------------------------------------------------------------------------------
Disk Size Path User Group
================================================================================
1: 524288 Mb /dev/mapper/mpathb grid asmadmin
2: 524285 Mb /dev/mapper/mpathbp1 grid asmadmin
3: 524288 Mb /dev/mapper/mpathc grid asmadmin
4: 524285 Mb /dev/mapper/mpathcp1 grid asmadmin
5: 524288 Mb /dev/mapper/mpathd grid asmadmin
6: 524285 Mb /dev/mapper/mpathdp1 grid asmadmin
7: 10240 Mb /dev/mapper/mpathe grid asmadmin
8: 10236 Mb /dev/mapper/mpathep1 grid asmadmin
9: 10240 Mb /dev/mapper/mpathf grid asmadmin
10: 10236 Mb /dev/mapper/mpathfp1 grid asmadmin
11: 10240 Mb /dev/mapper/mpathg grid asmadmin
12: 10236 Mb /dev/mapper/mpathgp1 grid asmadmin
13: 524288 Mb /dev/mapper/mpathh grid asmadmin
14: 524285 Mb /dev/mapper/mpathhp1 grid asmadmin
15: 524288 Mb /dev/mapper/mpathi grid asmadmin
16: 524285 Mb /dev/mapper/mpathip1 grid asmadmin
17: 524288 Mb /dev/mapper/mpathj grid asmadmin
18: 524285 Mb /dev/mapper/mpathjp1 grid asmadmin
19: 524288 Mb /dev/mapper/mpathk root disk
20: 524285 Mb /dev/mapper/mpathkp1 root disk
21: 524288 Mb /dev/mapper/mpathl root disk
22: 524285 Mb /dev/mapper/mpathlp1 root disk
23: 524288 Mb /dev/mapper/mpathm root disk
24: 524285 Mb /dev/mapper/mpathmp1 root disk
25: 524288 Mb /dev/mapper/mpathn root disk
26: 524285 Mb /dev/mapper/mpathnp1 root disk
27: 446464 Mb /dev/mapper/mpatho root disk
28: 446462 Mb /dev/mapper/mpathop1 root disk
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
+ASM2 /g01/11ggrid/app/11.2.0/grid
+ASM1 /g01/11ggrid/app/11.2.0/grid
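Some of the disks turn out to be owned by root (user root, group disk); this was judged to be why the instance could not recognize the disk group.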
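With 28 rows, picking out the wrongly owned disks by eye is error-prone. A minimal sketch of a filter follows, assuming the `kfod disk=all` column layout shown in this log (number, size, "Mb", path, user, group). The function name `disks_with_wrong_owner` and its default owner of grid:asmadmin are choices made for this illustration.

```shell
# Read `kfod disk=all` output on stdin and print "path user:group" for every
# disk row whose owner is not the expected user and group.
# Usage: disks_with_wrong_owner [user] [group]   (defaults: grid asmadmin)
disks_with_wrong_owner() {
  awk -v user="${1:-grid}" -v group="${2:-asmadmin}" '
    # Disk rows look like: "19: 524288 Mb /dev/mapper/mpathk root disk"
    $1 ~ /^[0-9]+:$/ && $3 == "Mb" && ($5 != user || $6 != group) {
      print $4, $5 ":" $6
    }
  '
}

# Hypothetical usage, as root:
#   kfod disk=all | disks_with_wrong_owner
```

For the output above, this would list the ten paths from mpathk through mpathop1, each with root:disk.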
Changing the owner of the affected devices is enough:
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-10
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-12
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-13
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-14
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-15
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-16
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-17
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-18
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-25
[root@zgcrac1 dev]# chown -R grid:asmadmin dm-26
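Typing ten dm-N names by hand invites mistakes, since each has to be matched to its mpath name first. The sketch below generates the commands from the mapper paths instead and only prints them for review. It assumes GNU `readlink -f`; the function name `print_chown_commands` is hypothetical.

```shell
# Read device paths (for example /dev/mapper/mpathk) on stdin, resolve each
# symlink to the node it points at, and print the matching chown command.
# Nothing is executed here; the output is meant to be checked first.
# Usage: print_chown_commands [owner:group]   (default: grid:asmadmin)
print_chown_commands() {
  owner="${1:-grid:asmadmin}"
  while IFS= read -r path; do
    [ -n "$path" ] || continue   # skip blank lines
    printf 'chown %s %s\n' "$owner" "$(readlink -f "$path")"
  done
}

# Hypothetical usage, as root:
#   printf '%s\n' /dev/mapper/mpathk /dev/mapper/mpathkp1 | print_chown_commands
```

Ownership set by hand on /dev nodes may be reset when the nodes are recreated, for example after a reboot, so it is worth rechecking with kfod afterwards.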
2. RAC status after the change
[root@zgcrac1 dev]# kfod disk=all
WARNING: Using brute force method to determine the size of /dev/raw/rawctl.
There will be performance issues. Please check configuration to determine the cause for the failure of ioctl
--------------------------------------------------------------------------------
Disk Size Path User Group
================================================================================
1: 524288 Mb /dev/mapper/mpathb grid asmadmin
2: 524285 Mb /dev/mapper/mpathbp1 grid asmadmin
3: 524288 Mb /dev/mapper/mpathc grid asmadmin
4: 524285 Mb /dev/mapper/mpathcp1 grid asmadmin
5: 524288 Mb /dev/mapper/mpathd grid asmadmin
6: 524285 Mb /dev/mapper/mpathdp1 grid asmadmin
7: 10240 Mb /dev/mapper/mpathe grid asmadmin
8: 10236 Mb /dev/mapper/mpathep1 grid asmadmin
9: 10240 Mb /dev/mapper/mpathf grid asmadmin
10: 10236 Mb /dev/mapper/mpathfp1 grid asmadmin
11: 10240 Mb /dev/mapper/mpathg grid asmadmin
12: 10236 Mb /dev/mapper/mpathgp1 grid asmadmin
13: 524288 Mb /dev/mapper/mpathh grid asmadmin
14: 524285 Mb /dev/mapper/mpathhp1 grid asmadmin
15: 524288 Mb /dev/mapper/mpathi grid asmadmin
16: 524285 Mb /dev/mapper/mpathip1 grid asmadmin
17: 524288 Mb /dev/mapper/mpathj grid asmadmin
18: 524285 Mb /dev/mapper/mpathjp1 grid asmadmin
19: 524288 Mb /dev/mapper/mpathk grid asmadmin
20: 524285 Mb /dev/mapper/mpathkp1 grid asmadmin
21: 524288 Mb /dev/mapper/mpathl grid asmadmin
22: 524285 Mb /dev/mapper/mpathlp1 grid asmadmin
23: 524288 Mb /dev/mapper/mpathm grid asmadmin
24: 524285 Mb /dev/mapper/mpathmp1 grid asmadmin
25: 524288 Mb /dev/mapper/mpathn grid asmadmin
26: 524285 Mb /dev/mapper/mpathnp1 grid asmadmin
27: 446464 Mb /dev/mapper/mpatho grid asmadmin
28: 446462 Mb /dev/mapper/mpathop1 grid asmadmin
RAC resource status:
[root@zgcrac1 dev]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATADG.dg  ora....up.type ONLINE    ONLINE    zgcrac1
ora.FDG.dg     ora....up.type ONLINE    ONLINE    zgcrac1
ora....ER.lsnr ora....er.type ONLINE    ONLINE    zgcrac1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    zgcrac2
ora....EMDG.dg ora....up.type ONLINE    ONLINE    zgcrac1
ora.asm        ora.asm.type   ONLINE    ONLINE    zgcrac1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    zgcrac2
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    zgcrac1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    zgcrac2
ora.ons        ora.ons.type   ONLINE    ONLINE    zgcrac1
ora.prod.db    ora....se.type ONLINE    ONLINE    zgcrac1
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    zgcrac2
ora....SM1.asm application    ONLINE    ONLINE    zgcrac1
ora....C1.lsnr application    ONLINE    ONLINE    zgcrac1
ora....ac1.gsd application    OFFLINE   OFFLINE
ora....ac1.ons application    ONLINE    ONLINE    zgcrac1
ora....ac1.vip ora....t1.type ONLINE    ONLINE    zgcrac1
ora....SM2.asm application    ONLINE    ONLINE    zgcrac2
ora....C2.lsnr application    ONLINE    ONLINE    zgcrac2
ora....ac2.gsd application    OFFLINE   OFFLINE
ora....ac2.ons application    ONLINE    ONLINE    zgcrac2
ora....ac2.vip ora....t1.type ONLINE    ONLINE    zgcrac2
Listener:
[grid@zgcrac1 ~]$ lsnrctl status
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 20-11月-2014 17:18:52
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                21-10月-2014 20:22:13
Uptime                    29 days 20 hr. 56 min. 38 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /g01/11ggrid/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /g01/11ggrid/app/11.2.0/grid/log/diag/tnslsnr/zgcrac1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.88.12)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.88.11)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "PROD" has 1 instance(s).
  Instance "PROD1", status READY, has 1 handler(s) for this service...
The command completed successfully
Instance status:
SQL> select instance_name,host_name,status from gv$instance;
INSTANCE_NAME
----------------
HOST_NAME STATUS
----------------------------------------------------------------------------
PROD1
zgcrac1 OPEN
PROD2
zgcrac2 OPEN
Disk group status:
SQL> select group_number,name,state from v$asm_diskgroup;
GROUP_NUMBER NAME STATE
------------------------------------------ -----------
1 DATADG CONNECTED
2 FDG CONNECTED
3 SYSTEMDG MOUNTED