Repairing or Restoring an Inconsistent OCR in RAC

2015-08-25 15:53
Note 268937.1

PURPOSE
-------

To provide a method of repairing or restoring an inconsistent OCR.

SCOPE & APPLICATION
-------------------

This article is intended for DBAs and Support Engineers who need to correct
an inconsistent OCR (Oracle Cluster Registry).

REPAIRING OR RESTORING AN INCONSISTENT OCR (ORACLE CLUSTER REGISTRY)
--------------------------------------------------------------------

If you have encountered a condition where you are unable to add or remove a CRS
resource, your OCR file may be inconsistent.  Example error when removing a CRS
resource with SRVCTL:

  PRKS-1028 : Configuration for <CRS resource> does not exist in cluster registry.

  or

  CRS-0210: Could not find resource  

Example error when adding a resource with SRVCTL:

  PRKS-1003 : Failed to register CRS resource...

POSSIBLE CAUSES OF OCR INCONSISTENCY
------------------------------------

1. The "-f" option of srvctl.  Do not use the -f (force) option in srvctl.  The
-f flag causes srvctl to ignore any errors and remove whatever pieces it can.
Without "-f", srvctl remove stops when it encounters an error and does not
remove the OCR entries, so you can run srvctl remove without -f as many times
as you need without causing an inconsistency.  The only time the "-f" option
should be used is when you have verified that there is an inconsistency in the
OCR and you are trying to correct it.

2. crs_unregister was used when it was not needed.  Use srvctl remove to remove
CRS resources.  crs_unregister should only be used when you are trying to
repair an inconsistency.

3. The rootdeletenode.sbs script has a "-f" srvctl command in it.  This is a
known issue.  Best practice is to modify rootdeletenode.sbs so that it does not
use "-f".

4. Another potential cause of inconsistency is running multiple commands that
configure the same object at the same time, for example several "srvctl add
service" commands that add different services to the same database at once.

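
One way to avoid the concurrent commands described in cause 4 is to run
configuration commands through a wrapper that takes a lock first.  The
following is only a sketch under assumptions: with_config_lock and LOCK_DIR are
hypothetical names, not part of srvctl, and a lock directory on one node only
serializes commands issued from that node.

```shell
#!/bin/sh
# Hypothetical lock location; any path the administrators agree on would do.
LOCK_DIR="${LOCK_DIR:-/tmp/srvctl-config.lock}"

# with_config_lock <command> [args...]
# Runs the command only while holding the lock.  mkdir either creates the
# directory or fails, so two callers cannot both acquire the lock.
with_config_lock() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another configuration command is in progress ($LOCK_DIR exists)" >&2
        return 1
    fi
    status=0
    "$@" || status=$?
    # Release the lock whether or not the command succeeded.
    rmdir "$LOCK_DIR"
    return $status
}
```

A call such as "with_config_lock srvctl add service ..." (arguments as you
would normally pass them) then refuses to start while another wrapped command
is still running on the same node.
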
FILES TO GATHER FOR OCR INCONSISTENCY
-------------------------------------

If you have a reproducible test case that creates inconsistency in the OCR,
please open a service request and provide the exact steps to Oracle Support
Services.  The following are some files that may need to be reviewed for issues
involving OCR inconsistencies:

- CRS Log files:

  cd $ORA_CRS_HOME
  tar cf /var/backup/crs.tar crs/init crs/log css/init css/log evm/init evm/log srvm/log

- OCR dump file.  To get this, cd to $ORA_CRS_HOME/bin as the root user and run
"ocrdump".  This generates two files (ocrdump.log and OCRDUMPFILE).

- $ORA_CRS_HOME/bin/crs_stat -u and $ORA_CRS_HOME/bin/crs_stat -p output

- srvctl config output:

  srvctl config database -d <db_name> -a
  srvctl config service -d <db_name> -a
  srvctl config nodeapps -n <node name>
  srvctl config nodeapps -n <node name> -a -g -s -l
  srvctl config asm -n <node name>

- Trace output of SRVCTL commands.  Set the environment variable SRVM_TRACE to
true prior to running srvctl commands.  
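
The items above can be collected into a single command list.  The sketch below
only prints the commands so that they can be reviewed before anything is run.
The function name ocr_diag_commands is hypothetical; the commands themselves
are the ones listed in this section.

```shell
#!/bin/sh
# ocr_diag_commands <db_name> <node_name>
# Prints (does not run) the diagnostic commands from this section.
# The function name is illustrative; it is not an Oracle tool.
ocr_diag_commands() {
    db=$1
    node=$2
    # \$ keeps $ORA_CRS_HOME literal so it is expanded when the list is run.
    cat <<EOF
SRVM_TRACE=true; export SRVM_TRACE
cd \$ORA_CRS_HOME && tar cf /var/backup/crs.tar crs/init crs/log css/init css/log evm/init evm/log srvm/log
cd \$ORA_CRS_HOME/bin && ./ocrdump
\$ORA_CRS_HOME/bin/crs_stat -u
\$ORA_CRS_HOME/bin/crs_stat -p
srvctl config database -d $db -a
srvctl config service -d $db -a
srvctl config nodeapps -n $node
srvctl config nodeapps -n $node -a -g -s -l
srvctl config asm -n $node
EOF
}
```

For example, "ocr_diag_commands <db_name> <node name>" prints ten lines.  Once
reviewed, the list can be piped to "sh -x" as the root user (ocrdump requires
root) with the output redirected to a file.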

FIXING INCONSISTENT OCR FILES
-----------------------------

The following methods can be used to correct an inconsistent OCR file:

Method 1: Repair the OCR
Method 2: Restore the OCR from a backup
Method 3: Re-install CRS

METHOD 1 - REPAIR THE OCR
-------------------------

If you cannot use srvctl to remove a CRS resource, find out if there is
information missing in CRS or in SRVM.  To do this, run
$ORA_CRS_HOME/bin/crs_stat to see if the CRS resource exists.  Also use
one of the following srvctl commands to see if the resource exists in
SRVM:

  srvctl config database -d <db_name> -a
  srvctl config service -d <db_name> -a
  srvctl config nodeapps -n <node name>
  srvctl config nodeapps -n <node name> -a -g -s -l
  srvctl config asm -n <node name>

Information may be missing either from the srvctl config output or from the
crs_stat output.  If you cannot find an inconsistency and you cannot remove
the resource with srvctl, open a service request or proceed to method 2.
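
To make the comparison easier, the resource names can be pulled out of the
crs_stat output.  This is a minimal sketch: crs_names and crs_names_for_db are
hypothetical helper names, and the "ora.<db_name>." prefix is an assumption
based on the resource name ora.V10SN.V10SN2.inst shown in this note.

```shell
#!/bin/sh
# crs_names: read crs_stat output on stdin and print one resource name per
# line, taken from the NAME= lines.
crs_names() {
    sed -n 's/^[[:space:]]*NAME=//p'
}

# crs_names_for_db <db_name>: keep only the names that start with
# "ora.<db_name>." (assumed naming pattern, see the lead-in).
crs_names_for_db() {
    crs_names | awk -v prefix="ora.$1." 'index($0, prefix) == 1'
}
```

For example, "$ORA_CRS_HOME/bin/crs_stat | crs_names_for_db <db_name>" lists
the CRS resources for one database, which can then be checked by hand against
the srvctl config output for the same database.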

SRVCTL CONFIG DATA IS MISSING BUT CRS_STAT INFORMATION IS PRESENT
-----------------------------------------------------------------

If information is missing from srvctl config, you can attempt to use
crs_unregister.  You should only use crs_unregister commands with the
understanding that your OCR may need to be restored anyway.  If
crs_unregister does not work you may need to either restore or re-create
your OCR.  To use crs_unregister:

1. Get the resource name of the resource you are trying to remove with
   $ORA_CRS_HOME/bin/crs_stat.  Example:

   cd $ORA_CRS_HOME/bin
   ./crs_stat

   You will see the CRS resources.  Example of a CRS resource:

   NAME=ora.V10SN.V10SN2.inst
   TYPE=application
   TARGET=ONLINE
   STATE=ONLINE on opcbsol2

2. Attempt to unregister the CRS resource with crs_unregister.  Example:

   cd $ORA_CRS_HOME/bin
   ./crs_unregister ora.V10SN.V10SN2.inst

3. If your original goal was to add a CRS resource, try to use srvctl to add
   the resource.  If your original goal was to remove a CRS resource, verify
   that it is removed from crs_stat and srvctl config.  
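
The crs_stat half of the verification in step 3 can be sketched as a small
check.  crs_has_resource is a hypothetical helper name; it reads crs_stat
output and succeeds only when a NAME= line matches the given resource name
exactly.

```shell
#!/bin/sh
# crs_has_resource <resource_name>
# Reads crs_stat output on stdin.  Exit status 0 means the resource is still
# registered; non-zero means no NAME= line matches it exactly.
crs_has_resource() {
    sed -n 's/^[[:space:]]*NAME=//p' | grep -Fqx -- "$1"
}
```

For example, "./crs_stat | crs_has_resource ora.V10SN.V10SN2.inst" returns a
non-zero exit status once the resource has been unregistered.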

If there is still a problem with the OCR, proceed to method 2.

CRS_STAT DATA IS MISSING BUT SRVCTL CONFIG INFORMATION IS PRESENT
-----------------------------------------------------------------

If information is missing from crs_stat, you can attempt to use
srvctl remove -f.  You should only use srvctl remove -f commands with the
understanding that your OCR may need to be restored anyway.  If
srvctl remove -f does not work you may need to either restore or re-create
your OCR.  To use srvctl remove -f:

1. Determine which resource exists in srvctl config but is missing from
   crs_stat.

2. Remove the resource with the -f option in srvctl.

   srvctl remove database -d <database-name> -f
   srvctl remove instance -d <database-name> [-i <instance-name>] -f
   srvctl remove service -d <database-name> -s <service-name> [-i <instance-name>] -f
   srvctl remove nodeapps -n <node-name> -f

3. If your original goal was to add a CRS resource, try to use srvctl to add
   the resource.  If your original goal was to remove a CRS resource, verify
   that it is removed from crs_stat and srvctl config.  

METHOD 2 - RESTORE THE OCR FROM A BACKUP
----------------------------------------

If crs_unregister or srvctl remove -f does not fix the OCR problem, you may
need to restore the OCR from a backup.  Oracle automatically backs up the OCR
every 4 hours and keeps the last 3 of these backups, along with a backup that
is one day old and a backup that is one week old.  The steps for restoring the
OCR are:

1. Find out when the inconsistency in the OCR occurred.

2. Find an OCR backup from a time prior to when the inconsistency occurred.  
   To do this cd to $ORA_CRS_HOME/cdata/<cluster name> or run
   $ORA_CRS_HOME/bin/ocrconfig -showbackup.  Example:

   # pwd
   /t02/app/oracle/product/crs/cdata/crs_opcbsol
   # ls -ltr
   total 46560
   -rw-r-----   1 root     root     3960832 Apr 12 19:53 week.ocr
   -rw-r-----   1 root     root     3960832 Apr 13 03:53 day.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 03:54 backup02.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 03:54 day_.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 07:54 backup01.ocr
   -rw-r-----   1 root     root     3960832 Apr 14 11:54 backup00.ocr

3. If you have a backup of the OCR from prior to the time of the inconsistency,
   reboot the nodes in single user mode or runlevel 1.  If you are unable to
   reboot into single user mode for some reason, you can disable CRS with:

   Sun or Linux:

        /etc/init.d/init.crs disable
        /etc/init.d/init.crs stop

   HP-UX or HP Tru64:

        /sbin/init.d/init.crs disable
        /sbin/init.d/init.crs stop

   IBM AIX:

        /etc/init.crs disable
        /etc/init.crs stop

4. After all nodes are rebooted in single user mode and/or you have verified that
   CRS is not running (ps -ef | grep crs), restore the OCR with ocrconfig.  
   Example:

   cd $ORA_CRS_HOME/bin
   ./ocrconfig -restore /t02/app/oracle/product/crs/cdata/crs_opcbsol/week.ocr

5. Re-enable CRS if it was disabled.  Example:

   Sun or Linux:

        /etc/init.d/init.crs enable

   HP-UX or HP Tru64:

        /sbin/init.d/init.crs enable

   IBM AIX:

        /etc/init.crs enable

6. Reboot the nodes.
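
Parts of the procedure above lend themselves to small helper functions.  The
following is a minimal sketch, not an Oracle-supplied script.  The function
names are hypothetical, the daemon names matched by crs_daemons_running
(crsd.bin, ocssd.bin, evmd.bin) are an assumption about a 10g installation,
and the -nt file test requires a shell such as bash, ksh, or dash.

```shell
#!/bin/sh
# newest_backup_before <backup_dir> <cutoff>
# Prints the most recently modified *.ocr file that is not newer than the
# cutoff.  The cutoff uses the touch -t format: [[CC]YY]MMDDhhmm.
newest_backup_before() {
    dir=$1
    ref=$(mktemp) || return 1
    touch -t "$2" "$ref" || { rm -f "$ref"; return 1; }
    newest=
    for f in "$dir"/*.ocr; do
        [ -f "$f" ] || continue
        # Skip backups taken after the cutoff.
        [ "$f" -nt "$ref" ] && continue
        # Among the remaining files, keep the most recent one.
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    rm -f "$ref"
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# init_crs_path [os_name]
# Prints the init.crs location from step 3 for the given platform
# (defaults to the output of uname -s).
init_crs_path() {
    case ${1:-$(uname -s)} in
        SunOS|Linux) echo /etc/init.d/init.crs ;;
        HP-UX|OSF1)  echo /sbin/init.d/init.crs ;;
        AIX)         echo /etc/init.crs ;;
        *)           return 1 ;;
    esac
}

# crs_daemons_running: reads "ps -ef" output on stdin and succeeds if a CRS
# daemon is listed.  The daemon names are an assumption; adjust the pattern
# to what ps shows on your system.
crs_daemons_running() {
    grep -E '(crsd|ocssd|evmd)\.bin' >/dev/null
}
```

For example, "newest_backup_before $ORA_CRS_HOME/cdata/<cluster name> 04140900"
prints the most recent backup taken no later than April 14 09:00 of the current
year, and "ps -ef | crs_daemons_running" succeeds while a CRS daemon is still
listed, in which case the restore in step 4 should not be attempted.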

METHOD 3 - RE-INSTALL CRS
-------------------------

If all else fails, re-install CRS.  Only do this after consulting with Oracle
Support Services and when there is no other reasonable way to fix the
inconsistency.

1. Use Note 239998.1 to completely remove the CRS installation.  

2. Re-install CRS

3. Run the CRS root.sh as prompted at the end of the CRS install.

4. Run the root.sh in the database $ORACLE_HOME to re-run VIPCA.  This will
re-create the VIP, GSD, and ONS resources.

5. Use NETCA to re-add any listeners.

6. Add databases and instances with SRVCTL.  The syntax is in Note 259301.1.

RELATED DOCUMENTS
-----------------

Note 259301.1 - CRS and 10g Real Application Clusters
Note 178683.1 - Tracing GSD, SRVCTL, GSDCTL, and SRVCONFIG
Note 239998.1 - 10g RAC How to Clean Up After a Failed CRS Install