Installing Oracle 11g R2 on Linux under VMware: "Timed out" when running root.sh (unresolved)
2012-07-09 15:06
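I did not manage to solve this problem, but I did work out its cause, and I tried quite a few things along the way, so I am recording them here as personal notes.
This time, running root.sh on the second node also hit "Timed out waiting for the CRS stack to start." The checks and attempted fixes are as follows.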
[root@dbserver2 ~]# ?/apps/oracle/11.2.0/grid/root.sh
-bash: ? command not found
[root@dbserver2 ~]# /apps/oracle/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /apps/oracle/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2012-06-07 09:25:48: Parsing the host name
2012-06-07 09:25:48: Checking for super user privileges
2012-06-07 09:25:48: User has super user privileges
Using configuration parameter file: /apps/oracle/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on centos-release-4-8
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node dbserver1, number 1, and is terminating
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'dbserver2'
CRS-2677: Stop of 'ora.cssdmonitor' on 'dbserver2' succeeded
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'dbserver2'
CRS-2676: Start of 'ora.mdnsd' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'dbserver2'
CRS-2676: Start of 'ora.gipcd' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'dbserver2'
CRS-2676: Start of 'ora.gpnpd' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'dbserver2'
CRS-2676: Start of 'ora.cssdmonitor' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'dbserver2'
CRS-2672: Attempting to start 'ora.diskmon' on 'dbserver2'
CRS-2676: Start of 'ora.diskmon' on 'dbserver2' succeeded
CRS-2676: Start of 'ora.cssd' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'dbserver2'
CRS-2676: Start of 'ora.ctssd' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'dbserver2'
CRS-2676: Start of 'ora.asm' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'dbserver2'
CRS-2676: Start of 'ora.crsd' on 'dbserver2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'dbserver2'
CRS-2676: Start of 'ora.evmd' on 'dbserver2' succeeded
Timed out waiting for the CRS stack to start.
[root@dbserver2 ~]# su - grid
[grid@dbserver2 ~]$ asmcmd
ASMCMD> ls
CRS/
ASMCMD> exit
[grid@dbserver2 ~]$
Check the log:
[root@dbserver2 dbserver2]# tail -100 alertdbserver2.log
ctssd(13516)]CRS-2403:The Cluster Time Synchronization Service on host dbserver2 is in observer mode.
2012-06-07 09:29:30.448
[ctssd(13516)]CRS-2407:The new Cluster Time Synchronization Service reference node is host dbserver1.
2012-06-07 09:29:30.462
[ctssd(13516)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log.
2012-06-07 09:29:30.462
[ctssd(13516)]CRS-2409:The clock on host dbserver2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2012-06-07 09:29:30.930
[ctssd(13516)]CRS-2401:The Cluster Time Synchronization Service started on host dbserver2.
2012-06-07 09:30:16.981
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:19.320
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:21.479
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:23.641
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:26.077
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:28.650
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:31.091
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:33.262
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:36.844
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:39.175
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:41.394
[ohasd(12955)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 09:30:41.395
[ohasd(12955)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.
2012-06-07 10:00:04.931
[ctssd(13516)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log.
2012-06-07 10:00:04.933
[ctssd(13516)]CRS-2409:The clock on host dbserver2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2012-06-07 10:31:19.688
[ctssd(13516)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log.
2012-06-07 10:31:19.690
[ctssd(13516)]CRS-2409:The clock on host dbserver2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
[root@dbserver2 dbserver2]#
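When the alert log is this repetitive, a per-code tally makes the sequence easier to read. The following is a minimal sketch and was not part of the original session; the function name `count_crs_codes` is made up for illustration, and it assumes only that each message carries a `CRS-NNNN` code as in the excerpt above.

```shell
#!/bin/bash
# Tally CRS message codes read from standard input.
# count_crs_codes is a hypothetical helper, not an Oracle tool.
count_crs_codes() {
    # -o prints each matched code on its own line; uniq -c needs sorted input.
    grep -o 'CRS-[0-9][0-9]*' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example use against the alert log from the session above:
#   tail -100 /apps/oracle/11.2.0/grid/log/dbserver2/alertdbserver2.log | count_crs_codes
```

In the excerpt above, CRS-2765 appears eleven times within about 25 seconds before the single CRS-2771, so ohasd gave up on ora.crsd very quickly, while the ctssd warnings (CRS-2412, CRS-2409) keep recurring afterwards.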
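So this time the failure was caused by a clock synchronization problem.
I tried stopping the time service (ntpd) and relying on the cluster's own time synchronization: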
[root@dbserver1 ~]# service ntpd stop
[root@dbserver2 dbserver2]# service ntpd stop
Deconfigure and run root.sh again, as follows:
[root@dbserver2 ~]# cd /apps/oracle/11.2.0/grid/crs/install/
[root@dbserver2 install]# /apps/oracle/11.2.0/grid/crs/install/rootcrs.pl -verbose -deconfig -force
……
error: package cvuqdisk is not installed
Successfully deconfigured Oracle clusterware stack on this node
[root@dbserver2 ~]# /apps/oracle/11.2.0/grid/root.sh
……
CRS-2676: Start of 'ora.evmd' on 'dbserver2' succeeded
Timed out waiting for the CRS stack to start.
[root@dbserver2 dbserver2]# tail -100 /apps/oracle/11.2.0/grid/log/dbserver2/alertdbserver2.log
The time problem is still there, as follows:
2012-06-07 10:42:47.726
[ctssd(18229)]CRS-2407:The new Cluster Time Synchronization Service reference node is host dbserver1.
2012-06-07 10:42:47.746
[ctssd(18229)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log.
2012-06-07 10:42:47.746
[ctssd(18229)]CRS-2409:The clock on host dbserver2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2012-06-07 10:42:48.651
[ctssd(18229)]CRS-2401:The Cluster Time Synchronization Service started on host dbserver2.
2012-06-07 10:43:15.823
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:18.045
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:20.207
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:22.771
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:24.988
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:27.495
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:29.708
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:32.015
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:34.427
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:36.730
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:38.913
[ohasd(17697)]CRS-2765:Resource 'ora.crsd' has failed on server 'dbserver2'.
2012-06-07 10:43:38.914
[ohasd(17697)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.
I turned the time service back on on both machines and ran it once more. It still failed, and the log reported:
[ctssd(19775)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log.
2012-06-09 10:04:36.307
[ctssd(19775)]CRS-2409:The clock on host dbserver2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2012-06-09 10:35:09.642
[ctssd(19775)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log.
2012-06-09 10:35:09.644
[ctssd(19775)]CRS-2409:The clock on host dbserver2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
Checking the clocks showed that they were out of sync yet again:
[grid@dbserver2 ~]$ ssh dbserver1 date > a.txt|ssh dbserver2 date >> a.txt
[grid@dbserver2 ~]$ cat a.txt
Sat Jun 9 19:14:25 CST 2012
Sat Jun 9 18:58:10 CST 2012
[grid@dbserver2 ~]$
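The check above joins the two `ssh` commands with a pipe, so both start at once and write to `a.txt` concurrently; which line lands first depends on timing. A sketch of a more direct comparison follows. It assumes bash and the same passwordless ssh between the nodes that the session already relies on; `hms_to_seconds` and `clock_gap` are hypothetical helper names, and readings that straddle midnight are not handled.

```shell
#!/bin/bash
# Convert an HH:MM:SS clock reading to seconds since midnight.
hms_to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so that fields like "08" are not parsed as octal.
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Print how many seconds the first reading is ahead of the second.
clock_gap() {
    echo $(( $(hms_to_seconds "$1") - $(hms_to_seconds "$2") ))
}

# The two readings from a.txt above:
clock_gap 19:14:25 18:58:10

# Against live nodes, take the readings one after the other instead of piping:
#   clock_gap "$(ssh dbserver1 date +%T)" "$(ssh dbserver2 date +%T)"
```

For the pair of readings above this prints 975. Assuming the first line of `a.txt` came from dbserver1, dbserver2 was 16 minutes 15 seconds behind.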
After one synchronization:
[root@dbserver2 ~]# service ntpd restart
Shutting down ntpd: [ OK ]
ntpd: Synchronizing with time server: [ OK ]
Starting ntpd: [ OK ]
[root@dbserver2 ~]#
Then observe:
[grid@dbserver1 ~]$ ssh dbserver1 date > a.txt|ssh dbserver2 date >> a.txt
[grid@dbserver1 ~]$ cat a.txt
Sat Jun 9 19:15:42 CST 2012
Sat Jun 9 19:15:42 CST 2012
[grid@dbserver1 ~]$ ssh dbserver1 date > a.txt|ssh dbserver2 date >> a.txt
[grid@dbserver1 ~]$ cat a.txt
Sat Jun 9 19:15:52 CST 2012
Sat Jun 9 19:15:51 CST 2012
[grid@dbserver1 ~]$ ssh dbserver1 date > a.txt|ssh dbserver2 date >> a.txt
[grid@dbserver1 ~]$ cat a.txt
Sat Jun 9 19:16:02 CST 2012
Sat Jun 9 19:16:00 CST 2012
[grid@dbserver1 ~]$ ssh dbserver1 date > a.txt|ssh dbserver2 date >> a.txt
[grid@dbserver1 ~]$ cat a.txt
Sat Jun 9 19:16:37 CST 2012
Sat Jun 9 19:16:31 CST 2012
[grid@dbserver1 ~]$
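As you can see, the gap between the two clocks slowly widens, so at least one of them is inaccurate.
I found the cause and a suggested fix in this article: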
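The four samples above are enough for a rough drift estimate. Between 19:15:42 and 19:16:37 on the first clock (55 seconds), the gap grew from 0 to 6 seconds. The sketch below turns that into a percentage; the helper name `drift_percent` is hypothetical, and because `date` only has one-second resolution here the result is approximate.

```shell
#!/bin/bash
# Print the drift as a percentage: seconds of gap gained per 100 seconds elapsed.
drift_percent() {
    awk -v gap="$1" -v elapsed="$2" 'BEGIN { printf "%.1f\n", 100 * gap / elapsed }'
}

# The gap grew by 6 s over the 55 s between the first and last samples.
drift_percent 6 55
```

This prints 10.9, so one clock is losing roughly 11% relative to the other, far more than an occasional correction can absorb.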
http://hi.baidu.com/shayusir/blog/item/74ac1df68d85852e5d600845.html
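The cause is:
The Linux 2.6 kernel raised the system timer frequency to 1000 Hz. VMware cannot really deliver a signal to the guest OS every 1 ms, so Linux 2.6 in the guest does not reliably receive the timer signals. That alone should not cause a problem, but the 2.6 kernel's code for handling this "tick loss" is flawed, so the system clock of a Linux 2.6 guest falls a second behind for every second it advances: two seconds pass outside while only one passes inside.
My workaround is to synchronize once a minute.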
Shutting down ntpd: [ OK ]
[root@dbserver2 ~]# ntpdate dbserver1
9 Jun 19:29:55 ntpdate[13122]: step time server 10.27.76.183 offset 80.589517 sec
[root@dbserver2 ~]#
[root@dbserver2 cron]# crontab -l
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
* * * * * ntpdate dbserver1>/dev/null 2>&1 &
[root@dbserver2 ~]# service crond restart
Stopping crond: [ OK ]
Starting crond: [ OK ]
[root@dbserver2 ~]#
The result:
[grid@dbserver1 ~]$ ssh dbserver1 date > a.txt|ssh dbserver2 date >> a.txt
[grid@dbserver1 ~]$ cat a.txt
Sat Jun 9 19:57:08 CST 2012
Sat Jun 9 19:55:51 CST 2012
[grid@dbserver1 ~]$ ssh dbserver1 date > a.txt|ssh dbserver2 date >> a.txt
[grid@dbserver1 ~]$ cat a.txt
Sat Jun 9 19:57:39 CST 2012
Sat Jun 9 19:57:37 CST 2012
[grid@dbserver1 ~]$
Unfortunately, this still does not solve the problem.
[ctssd(16953)]CRS-2412:The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time. Details in /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log.
2012-06-09 20:07:44.619
[ctssd(16953)]CRS-2409:The clock on host dbserver2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
[root@dbserver2 cron]# tail -f /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log
odenum [1] hostname [dbserver1] )
2012-06-09 20:13:28.725: [ CTSS][119737248]ctsselect_msm: Sync interval returned in [4]
2012-06-09 20:13:29.727: [ CTSS][119737248]ctsselect_msm: CTSS mode is [66]
2012-06-09 20:13:29.730: [ CTSS][119737248]ctssslave_swm17: LT [1339244009sec 730191usec], MT [1339244011sec 303569usec], Delta [3007usec]
2012-06-09 20:13:29.730: [ CTSS][119737248]ctssslave_swm19: The offset is [-1573378 usec] and sync interval set to [4]
2012-06-09 20:13:29.730: [ CTSS][119737248]ctssslave_sync_with_master: Received from master (mode [0x6e] nodenum [1] hostname [dbserver1] )
2012-06-09 20:13:29.730: [ CTSS][119737248]ctsselect_msm: Sync interval returned in [4]
2012-06-09 20:13:30.732: [ CTSS][119737248]ctsselect_msm: CTSS mode is [66]
2012-06-09 20:13:30.734: [ CTSS][119737248]ctssslave_swm17: LT [1339244010sec 734589usec], MT [1339244012sec 225815usec], Delta [2037usec]
2012-06-09 20:13:30.734: [ CTSS][119737248]ctssslave_swm19: The offset is [-1491226 usec] and sync interval set to [4]
2012-06-09 20:13:30.734: [ CTSS][119737248]ctssslave_sync_with_master: Received from master (mode [0x6e] nodenum [1] hostname [dbserver1] )
20
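The `ctssslave_swm19` lines carry the measured offset in microseconds. A small filter, sketched here on the assumption that the line format matches the excerpt above (the name `ctss_offsets` is hypothetical), pulls those values out and prints them in seconds:

```shell
#!/bin/bash
# Extract "The offset is [N usec]" values from octssd.log lines on standard
# input and print each one in seconds. ctss_offsets is a hypothetical helper.
ctss_offsets() {
    sed -n 's/.*The offset is \[\(-\{0,1\}[0-9][0-9]*\) usec\].*/\1/p' |
        awk '{ printf "%.3f\n", $1 / 1000000 }'
}

# Example use against the log from the session above:
#   tail -100 /apps/oracle/11.2.0/grid/log/dbserver2/ctssd/octssd.log | ctss_offsets
```

For the two offset lines shown, this gives -1.573 and -1.491. That matches the LT and MT values on the preceding `ctssslave_swm17` lines: the local clock was about a second and a half behind dbserver1.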
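On Yang's blog I found this: Oracle's time synchronization can only work in the background within an acceptable time interval, roughly 0.13 seconds. I am now synchronizing once a minute, which is far larger than that value, and one minute appears to be the smallest interval crontab allows.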
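To put numbers on why a one-minute cron job falls short, here is a rough calculation. It is a sketch under two assumptions that are not established facts: that the roughly 0.13-second figure quoted above is the largest gap that can be tolerated, and that the guest keeps drifting at the rate seen in the earlier samples (6 seconds of gap over the 55 seconds between 19:15:42 and 19:16:37). The helper names are hypothetical.

```shell
#!/bin/bash
# Gap that builds up over one sync interval, given a gap/elapsed drift sample.
gap_after() {
    awk -v interval="$1" -v gap="$2" -v elapsed="$3" \
        'BEGIN { printf "%.2f\n", interval * gap / elapsed }'
}

# Longest sync interval that keeps the gap within a tolerance.
max_sync_interval() {
    awk -v tol="$1" -v gap="$2" -v elapsed="$3" \
        'BEGIN { printf "%.2f\n", tol * elapsed / gap }'
}

gap_after 60 6 55            # seconds of gap accumulated in one minute
max_sync_interval 0.13 6 55  # seconds between syncs to stay within 0.13 s
```

Under those assumptions the clocks drift apart by about 6.55 seconds between one-minute syncs, and a sync would be needed roughly every 1.19 seconds, which cron cannot schedule.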