11g R2: Deleting and Re-adding a RAC Node After an OS Reinstall, with Problems and Solutions
2016-07-05 23:19
Symptoms:
http://www.santongit.com/thread-12327-1-1.html
A two-node RAC database on RedHat 6.3 x86_64. Because of a business issue, the OS on the node 2 server was reinstalled, so node 2 now has to be rebuilt.
Rebuilding the node
Part 1: Remove node 2's information from the cluster
Because the OS on node 2 has been reinstalled, the local cleanup steps that would normally run on that node are no longer needed. Everything is done from node 1.
(1): Check the cluster nodes and unpin node 2
[root@racdb1 ~]# olsnodes -t -s             ##### list the nodes in the cluster
[root@racdb1 ~]# crsctl unpin css -n racdb2 ##### run on every node that remains
(2): Delete node 2's database instance with DBCA
[oracle@racdb1 ~]$ dbca    # GUI
Verify that the racdb2 instance has been deleted.
Check the active instances:
[oracle@racdb1 ~]$ sqlplus / as sysdba
SQL> select thread#,status,instance from v$thread;
Note: this step may report errors, because the OS on node 2 has been reinstalled and DBCA cannot find the corresponding files there. It is enough that the racdb2 instance can no longer be seen in the database.
Check the database configuration:
[root@racdb1 ~]# srvctl config database -d orcl
(3): From node racdb1, stop the NodeApps of node racdb2
[oracle@racdb1 bin]$ srvctl stop nodeapps -n racdb2 -f
(4): Update the cluster node list as the oracle user on the remaining nodes
Run on every remaining node (this cluster has only two nodes, so running it on racdb1 is enough):
[root@racdb1 ~]# su - oracle
[oracle@racdb1 ~]$ $ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={racdb1}"
Note: this reports an error, because in a cluster the corresponding command also has to run on racdb2. The error is:
SEVERE: oracle.sysman.oii.oiip.oiipg.OiipgRemoteOpsException: Error occured while trying to run Unix command /u01/app/11.2.0/grid/oui/bin/../bin/runInstaller -paramFile /u01/app/11.2.0/grid/oui/bin/../clusterparam.ini -silent -ignoreSysPrereqs -updateNodeList
-noClusterEnabled ORACLE_HOME=/u01/app/11.2.0/grid CLUSTER_NODES=racdb1,racdb2 CRS=true "INVENTORY_LOCATION=/u01/app/oraInventory" LOCAL_NODE=racdb2 -remoteInvocation -invokingNodeName racdb1 -logFilePath "/u01/app/oraInventory/logs" -timestamp 2014-12-03_11-23-57PM
on nodes racdb2. [PRKC-1044 : Failed to check remote command execution setup for node racdb2 using shells /usr/bin/ssh and /usr/bin/rsh
File "/usr/bin/rsh" does not exist on node "racdb2"
No RSA host key is known for racdb2 and you have requested strict checking.Host key verification failed.]
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runCmdOnUnix(OiipgClusterRunCmd.java:276)
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runAnyCmdOnNodes(OiipgClusterRunCmd.java:369)
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runCmd(OiipgClusterRunCmd.java:314)
at oracle.sysman.oii.oiic.OiicBaseInventoryApp.runRemoteInvOpCmd(OiicBaseInventoryApp.java:281)
at oracle.sysman.oii.oiic.OiicUpdateNodeList.clsCmdUpdateNodeList(OiicUpdateNodeList.java:296)
at oracle.sysman.oii.oiic.OiicUpdateNodeList.doOperation(OiicUpdateNodeList.java:240)
at oracle.sysman.oii.oiic.OiicBaseInventoryApp.main_helper(OiicBaseInventoryApp.java:890)
at oracle.sysman.oii.oiic.OiicUpdateNodeList.main(OiicUpdateNodeList.java:401)
Caused by: oracle.ops.mgmt.cluster.ClusterException: PRKC-1044 : Failed to check remote command execution setup for node racdb2 using shells /usr/bin/ssh and /usr/bin/rsh
File "/usr/bin/rsh" does not exist on node "racdb2"
No RSA host key is known for racdb2 and you have requested strict checking.Host key verification failed.
at oracle.ops.mgmt.cluster.ClusterCmd.runCmd(ClusterCmd.java:2149)
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runCmdOnUnix(OiipgClusterRunCmd.java:270)
... 7 more
SEVERE: Remote 'UpdateNodeList' failed on nodes: 'racdb2'. Refer to '/u01/app/oraInventory/logs/UpdateNodeList2014-12-03_11-23-57PM.log' for details.
It is recommended that the following command needs to be manually run on the failed nodes:
/u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList -noClusterEnabled ORACLE_HOME=/u01/app/11.2.0/grid CLUSTER_NODES=racdb1,racdb2 CRS=true "INVENTORY_LOCATION=/u01/app/oraInventory" LOCAL_NODE=<node on which command is to be run>.
Please refer 'UpdateNodeList' logs under central inventory of remote nodes where failure occurred for more details.
The command fails because racdb2 was reinstalled and the corresponding files no longer exist there. At this point, open inventory.xml (the file that records the RAC node and CRS configuration) and remove the racdb2 entries by hand.
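As a rough, hypothetical sketch of that manual edit: the real file lives at <central inventory>/ContentsXML/inventory.xml (the central inventory is /u01/app/oraInventory in this post); the HOME name and IDX value below are made up for illustration. The edit amounts to deleting the racdb2 <NODE> line after taking a backup.

```shell
# Hypothetical inventory.xml fragment; on the real system edit
# /u01/app/oraInventory/ContentsXML/inventory.xml (path taken from this post).
INV=$(mktemp)
cat > "$INV" <<'EOF'
<HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11.2.0/db_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="racdb1"/>
      <NODE NAME="racdb2"/>
   </NODE_LIST>
</HOME>
EOF
cp "$INV" "$INV.bak"                       # always back up before hand-editing
sed -i '/<NODE NAME="racdb2"\/>/d' "$INV"  # drop the racdb2 node entry
cat "$INV"
```

After the edit, only the racdb1 <NODE> entry should remain in the home's NODE_LIST.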
(5): Delete node racdb2's VIP
[root@racdb1 ~]# crs_stat -t
If the VIP resource for node racdb2 still exists, run:
[root@racdb1 ~]# srvctl stop vip -i ora.racdb2.vip -f
[root@racdb1 ~]# srvctl remove vip -i ora.racdb2.vip -f
[root@racdb1 ~]# crsctl delete resource ora.racdb2.vip -f
(6): Delete node racdb2 from any remaining node (racdb1)
[root@racdb1 ~]# crsctl delete node -n racdb2
[root@racdb1 ~]# olsnodes -t -s
(7): Update the cluster node list as the grid user on the remaining node (racdb1)
Run on every remaining node:
[grid@racdb1 ~]$ $ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={racdb1}" CRS=true
Note: this reports an error, because in a cluster the corresponding command also has to run on racdb2. The error is:
SEVERE: oracle.sysman.oii.oiip.oiipg.OiipgRemoteOpsException: Error occured while trying to run Unix command /u01/app/11.2.0/grid/oui/bin/../bin/runInstaller -paramFile /u01/app/11.2.0/grid/oui/bin/../clusterparam.ini -silent -ignoreSysPrereqs -updateNodeList
-noClusterEnabled ORACLE_HOME=/u01/app/11.2.0/grid CLUSTER_NODES=racdb1,racdb2 CRS=true "INVENTORY_LOCATION=/u01/app/oraInventory" LOCAL_NODE=racdb2 -remoteInvocation -invokingNodeName racdb1 -logFilePath "/u01/app/oraInventory/logs" -timestamp 2014-12-03_11-23-57PM
on nodes racdb2. [PRKC-1044 : Failed to check remote command execution setup for node racdb2 using shells /usr/bin/ssh and /usr/bin/rsh
File "/usr/bin/rsh" does not exist on node "racdb2"
No RSA host key is known for racdb2 and you have requested strict checking.Host key verification failed.]
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runCmdOnUnix(OiipgClusterRunCmd.java:276)
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runAnyCmdOnNodes(OiipgClusterRunCmd.java:369)
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runCmd(OiipgClusterRunCmd.java:314)
at oracle.sysman.oii.oiic.OiicBaseInventoryApp.runRemoteInvOpCmd(OiicBaseInventoryApp.java:281)
at oracle.sysman.oii.oiic.OiicUpdateNodeList.clsCmdUpdateNodeList(OiicUpdateNodeList.java:296)
at oracle.sysman.oii.oiic.OiicUpdateNodeList.doOperation(OiicUpdateNodeList.java:240)
at oracle.sysman.oii.oiic.OiicBaseInventoryApp.main_helper(OiicBaseInventoryApp.java:890)
at oracle.sysman.oii.oiic.OiicUpdateNodeList.main(OiicUpdateNodeList.java:401)
Caused by: oracle.ops.mgmt.cluster.ClusterException: PRKC-1044 : Failed to check remote command execution setup for node racdb2 using shells /usr/bin/ssh and /usr/bin/rsh
File "/usr/bin/rsh" does not exist on node "racdb2"
No RSA host key is known for racdb2 and you have requested strict checking.Host key verification failed.
at oracle.ops.mgmt.cluster.ClusterCmd.runCmd(ClusterCmd.java:2149)
at oracle.sysman.oii.oiip.oiipg.OiipgClusterRunCmd.runCmdOnUnix(OiipgClusterRunCmd.java:270)
... 7 more
SEVERE: Remote 'UpdateNodeList' failed on nodes: 'racdb2'. Refer to '/u01/app/oraInventory/logs/UpdateNodeList2014-12-03_11-23-57PM.log'
for details.
It is recommended that the following command needs to be manually run on the failed nodes:
/u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList -noClusterEnabled ORACLE_HOME=/u01/app/11.2.0/grid CLUSTER_NODES=racdb1,racdb2 CRS=true "INVENTORY_LOCATION=/u01/app/oraInventory" LOCAL_NODE=<node on which command is to be run>.
Please refer 'UpdateNodeList' logs under central inventory of remote nodes where failure occurred for more details.
Again, this fails because racdb2 was reinstalled and the files are missing. Open inventory.xml on racdb1 (the RAC node and CRS configuration file) and manually remove racdb2's CRS information.
(8): Verify that node racdb2 has been deleted
On any remaining node:
[grid@racdb1 ~]$ cluvfy stage -post nodedel -n racdb2
[grid@racdb1 ~]$ crsctl status resource -t
Check the active instances:
[oracle@racdb1 ~]$ sqlplus / as sysdba
SQL> select thread#,status,instance from v$thread;
At this point node 2's information has been completely removed from the cluster.
Because racdb2's OS had been reinstalled, the node-list and cluster-information updates during node deletion report errors. They can be corrected by hand, but if the cleanup is incomplete, the later installation will complain that racdb2 was not removed cleanly.
Part 2: Re-add racdb2
(1): Create the required users and groups, identical to those on the existing node
(2): Configure the /etc/hosts file; the new node and the existing node must have identical entries
(3): Configure the system parameters and user limits to match the existing node
(4): Create the required directories
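As a sketch of step (4), the layout below recreates the home directories that this post's own commands assume (/u01/app/... for the grid stack, /app/oracle/... for the database home). BASE is a stand-in prefix so the sketch can be exercised in a scratch directory without root; on the real node BASE would be empty (the filesystem root), and the grid/oracle ownership lines would apply.

```shell
# Sketch only: recreate the directory layout this post's commands assume.
# BASE defaults to a scratch directory so the sketch runs anywhere;
# on the real node it would be "" (i.e. paths rooted at /).
BASE="${BASE:-$(mktemp -d)}"
mkdir -p "$BASE/u01/app/oraInventory" \
         "$BASE/u01/app/grid/product/11.2.0/grid" \
         "$BASE/app/oracle/product/11.2.0/db_1"
# Ownership as in a standard 11gR2 install; skipped when the users are absent.
id grid   >/dev/null 2>&1 && chown -R grid:oinstall   "$BASE/u01/app"        || true
id oracle >/dev/null 2>&1 && chown -R oracle:oinstall "$BASE/app/oracle"     || true
ls -d "$BASE/u01/app/grid/product/11.2.0/grid"
```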
(5): Check whether racdb2 meets the RAC installation prerequisites (run from an existing node as the grid user)
[root@racdb1 ~]# su - grid
[grid@racdb1 ~]$ cluvfy stage -pre nodeadd -n racdb2 -fixup -verbose
[grid@racdb1 ~]$ cluvfy stage -post hwos -n racdb2
(6): Add the clusterware software to the new node
Run this from an existing node to install the clusterware on the new node (as the grid user):
[root@racdb1 ~]# su - grid
[grid@racdb1 ~]$ /u01/app/grid/product/11.2.0/grid/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={racdb2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={racdb2-vip}" "CLUSTER_NEW_PRIVATE_NODE_NAMES={racdb2-priv}"
Note: because node 2's OS had been reinstalled, the deletion steps could not be run on node 2 itself; its information was cleaned out by hand on node 1. After that manual cleanup the verification checks come back clean, but addNode.sh still fails with:
Performing tests to see whether nodes racdb2,racdb2 are available
............................................................... 100% Done.
Error ocurred while retrieving node numbers of the existing nodes. Please
check if clusterware home is properly configured.
SEVERE:Error ocurred while retrieving node numbers of the existing nodes. Please check if clusterware home is properly configured.
The fix found in Oracle's documentation is to detach and re-attach the Oracle home:
[grid@racdb1 bin]$ ./detachHome.sh
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB. Actual 2986 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'DetachHome' was successful.
[grid@racdb1 bin]$ ./attachHome.sh
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB. Actual 2986 MB Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2010-06-01_08-53-48PM. Please wait ...[grid@racdb1 bin]$ The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'AttachHome' was successful.
These two steps simply rebuild inventory.xml from the current cluster information.
Some sources suggest running the script below instead. In my opinion it must not be run here: it would strip racdb1's cluster information from the inventory, which would be fatal to the cluster!
[grid@racdb1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={racdb1}" -local
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB. Actual 2671 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
(7): Run the root scripts when prompted
/u01/app/oraInventory/orainstRoot.sh         # run as root on the new node racdb2
/u01/app/grid/product/11.2.0/grid/root.sh    # run as root on the new node racdb2
(8): Verify that the clusterware was added successfully
[grid@racdb1 bin]$ cluvfy stage -post nodeadd -n racdb2 -verbose
(9): Add the database software on the new node
Install the database software onto the new node (run from an existing node as the oracle user):
[root@racdb1 ~]# su - oracle
[oracle@racdb1 ~]$ /app/oracle/product/11.2.0/db_1/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={racdb2}"
Run the root script when prompted:
/app/oracle/product/11.2.0/db_1/root.sh      # run as root on the new node racdb2
Note: when adding the database software, the copy step may do nothing yet report no error. In that case the Oracle software can be copied directly from racdb1 to racdb2; when the home is copied by hand this way, root.sh does not need to be run.
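A hedged sketch of that manual copy, demonstrated locally between two scratch directories so it runs anywhere. On the real system the receiving tar would run on racdb2 through ssh (roughly `tar -C /app/oracle/product/11.2.0 -cpf - db_1 | ssh racdb2 'tar -C /app/oracle/product/11.2.0 -xpf -'`, with the path taken from this post); a tar pipe is preferred over plain cp because it preserves permissions and symlinks.

```shell
# Demonstration of the tar-pipe copy using scratch directories standing in
# for the Oracle home's parent on racdb1 (SRC) and racdb2 (DST).
SRC=$(mktemp -d)   # stands in for /app/oracle/product/11.2.0 on racdb1
DST=$(mktemp -d)   # stands in for /app/oracle/product/11.2.0 on racdb2
mkdir -p "$SRC/db_1/bin"
echo demo > "$SRC/db_1/bin/oracle"
# -p keeps permissions; as root it also keeps ownership
tar -C "$SRC" -cpf - db_1 | tar -C "$DST" -xpf -
ls "$DST/db_1/bin"
```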
(10): Add the instance
[oracle@racdb1 ~]$ dbca
Or add the instance directly from the command line (run from an existing node as the oracle user):
[oracle@racdb1 ~]$ dbca -silent -addInstance -nodeList racdb2 -gdbName orcl -instanceName orcldb2 -sysDBAUserName sys -sysDBAPassword "***"
Note: after the instance has been added, if the database software was copied by hand, the new instance may fail to start with:
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA1/orcl/spfileorcl.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA1/orcl/spfileorcl.ora
These errors occur because two binaries in the hand-copied software have the wrong permissions: the oracle executable under the grid home and under the database home. Fix them with:
cd $GRID_HOME/bin
chmod 6751 oracle
cd $ORACLE_HOME/bin
chmod 6751 oracle
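For context (my reading, not spelled out in the post): mode 6751 sets the setuid and setgid bits on the oracle binary, and the instance relies on those bits to reach ASM-managed files such as the spfile above. What 6751 looks like, demonstrated on a scratch file (GNU stat assumed, matching the RedHat system in this post):

```shell
# Demonstrate mode 6751 (setuid + setgid + rwxr-s--x) on a scratch file;
# on the real node the targets are $GRID_HOME/bin/oracle and
# $ORACLE_HOME/bin/oracle.
f=$(mktemp)
chmod 6751 "$f"
stat -c '%a %A' "$f"           # prints: 6751 -rwsr-s--x
ls -l "$f" | awk '{print $1}'  # the s bits mark setuid/setgid
```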
(11): Verify the added instance
Check the active instances:
[oracle@racdb1 ~]$ sqlplus / as sysdba
SQL> select thread#,status,instance from gv$thread;