Suppose the OCR disk and the votedisk in our RAC environment are both completely destroyed and neither has a backup. How do we recover the RAC environment?
The simplest approach is to reinitialize the OCR disk and the votedisk, then re-register all of the cluster's resources in them.
1. Stop the Clusterware stack on all nodes
[root@rac3 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
[root@rac4 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
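With more than two nodes, typing the stop command on each host gets tedious. The following is a minimal sketch of the same step as a loop, not part of the original procedure. It assumes root ssh access to every node, and the default CRS_HOME is only a guess based on the shell prompts in the transcripts. The helper names run_on_node and stop_crs_everywhere are hypothetical.

```shell
#!/bin/sh
# Sketch: stop the Clusterware stack on every node over ssh.
# Assumptions (not from the original post): root ssh access to each node,
# and CRS_HOME below, which is guessed from the prompts in the transcripts.

NODES="${NODES:-rac3 rac4}"
CRS_HOME="${CRS_HOME:-/opt/ora10g/product/10.2.0/crs_1}"

# Print the command (DRY_RUN=1, the default) or run it on one node.
run_on_node() {
    node=$1
    shift
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ssh root@$node $*"
    else
        ssh "root@$node" "$@"
    fi
}

# Stop CRS on each node in turn; give up at the first failure.
stop_crs_everywhere() {
    for node in $NODES; do
        run_on_node "$node" "$CRS_HOME/bin/crsctl stop crs" || return 1
    done
}
```

Calling stop_crs_everywhere as-is only prints the ssh commands for review. Setting DRY_RUN=0 before the call runs them.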
2. To be safe, back up the OCR and votedisk first
In case the experiment fails, we first back up the OCR disk and the votedisk.
Of course, a production RAC environment would never be left without OCR and votedisk backups; what we simulate here is an extreme case.
[root@rac3 bin]# ./crsctl query css votedisk
 0.     0    /dev/raw/raw2
located 1 votedisk(s).
[root@rac3 bin]# dd if=/dev/raw/raw2 of=/home/oracle/votedisk.bak
208864+0 records in
208864+0 records out
[root@rac3 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     104344
         Used space (kbytes)      :       4340
         Available space (kbytes) :     100004
         ID                       : 1887132889
         Device/File Name         : /dev/raw/raw1
                                    Device/File integrity check succeeded
                                    Device/File not configured
         Cluster registry integrity check succeeded
[root@rac3 bin]# ./ocrconfig -export /home/oracle/ocr.bak
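The dd backup above writes to a fixed file name, so a second run would silently overwrite the first copy. A minimal sketch of a more careful variant follows; backup_device is a hypothetical helper name, and the timestamped file naming is an assumption of this sketch rather than anything the original post does.

```shell
#!/bin/sh
# Sketch: copy a device (or file) with dd into a timestamped backup file.
# backup_device is a hypothetical helper; the dd call mirrors the transcript.

backup_device() {
    src=$1
    dst_dir=$2
    stamp=$(date +%Y%m%d%H%M%S)
    dst="$dst_dir/$(basename "$src").$stamp.bak"

    # Refuse to overwrite an existing backup.
    if [ -e "$dst" ]; then
        return 1
    fi

    # dd reports its record counts on stderr; silence them here.
    dd if="$src" of="$dst" 2>/dev/null || return 1

    # -s is true only when the file exists and is not empty.
    [ -s "$dst" ] || return 1

    # Print the backup path so the caller can record it.
    echo "$dst"
}
```

For the cluster above, the call would be backup_device /dev/raw/raw2 /home/oracle, followed by the same ocrconfig -export command as in the transcript.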
3. Corrupt the OCR and votedisk
[root@rac3 bin]# dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=100
100+0 records in
100+0 records out
[root@rac3 bin]# dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=130
dd: writing `/dev/raw/raw1': No space left on device
102+0 records in
101+0 records out
[root@rac3 bin]# dd if=/dev/zero of=/dev/raw/raw2 bs=1M count=130
dd: writing `/dev/raw/raw2': No space left on device
102+0 records in
101+0 records out
The OCR and votedisk are now corrupted, so our RAC certainly cannot start.
Next we rebuild them and re-register the cluster information in the OCR and votedisk.
4. Run $CRS_HOME/install/rootdelete.sh on every node
[root@rac3 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac4 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
5. Run $CRS_HOME/install/rootdeinstall.sh on any one node
This script only needs to run on one node.
[root@rac3 install]# ./rootdeinstall.sh
Removing contents from OCR mirror device
2560+0 records in
2560+0 records out
Removing contents from OCR device
2560+0 records in
2560+0 records out
6. On the same node as in step 5, run $CRS_HOME/root.sh
[root@rac3 crs_1]# ./root.sh
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node:
node 1: rac3 rac3-priv rac3
node 2: rac4 rac4-priv rac4
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rac3
CSS is inactive on these nodes.
        rac4
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
7. Run $CRS_HOME/root.sh on the remaining nodes, and pay attention to the output on the last node
[root@rac4 crs_1]# ./root.sh
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Checking to see if Oracle CRS stack is already configured
Current Oracle Cluster Registry mirror location '/dev/raw/raw7' in '/etc/oracle/ocr.loc' and '' does not match
Update either '/etc/oracle/ocr.loc' to use '' or variable CRS_OCR_LOCATIONS in rootconfig.sh with '/dev/raw/raw7' then rerun rootconfig.sh
The script reports an error: the OCR mirror location does not match the current configuration. The mirror location (/dev/raw/raw7) is left over from an earlier experiment in which we relocated the OCR.
We had already commented out the ocrmirrorconfig_loc parameter in /etc/oracle/ocr.loc, so how can the script still see it?
[root@rac4 oracle]# cat ocr.loc
#Device/file /dev/raw/raw1 getting replaced by device /dev/raw/raw8
ocrconfig_loc=/dev/raw/raw1
#ocrmirrorconfig_loc=/dev/raw/raw7
local_only=false
We delete the "#ocrmirrorconfig_loc=/dev/raw/raw7" line from /etc/oracle/ocr.loc and then rerun $CRS_HOME/root.sh. After the edit the file looks like this:
[root@rac4 oracle]# cat ocr.loc
#Device/file /dev/raw/raw1 getting replaced by device /dev/raw/raw8
ocrconfig_loc=/dev/raw/raw1
local_only=false
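The edit itself can be made by hand in any editor. As a minimal sketch, the same change can be scripted with sed while keeping a copy of the previous file; strip_commented_mirror is a hypothetical helper name, and the .bak suffix is an assumption of this sketch.

```shell
#!/bin/sh
# Sketch: delete any commented-out ocrmirrorconfig_loc line from an ocr.loc
# file, keeping the previous version next to it as <file>.bak.
# strip_commented_mirror is a hypothetical helper name.

strip_commented_mirror() {
    file=$1

    # Keep the old contents so the change can be undone.
    cp "$file" "$file.bak" || return 1

    # Drop lines made of '#', optional blanks, then ocrmirrorconfig_loc=...
    # Writing back into "$file" keeps its owner and permissions.
    sed '/^#[[:space:]]*ocrmirrorconfig_loc=/d' "$file.bak" > "$file"
}
```

On the failing node this would be run as root: strip_commented_mirror /etc/oracle/ocr.loc. Active (uncommented) ocrmirrorconfig_loc lines are left alone.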
With "#ocrmirrorconfig_loc=/dev/raw/raw7" removed, we rerun $CRS_HOME/root.sh and the problem is resolved:
[root@rac4 crs_1]# ./root.sh
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node:
node 1: rac3 rac3-priv rac3
node 2: rac4 rac4-priv rac4
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rac3
        rac4
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
The given interface(s), "eth0" is not public. Public interfaces should be used to configure virtual IPs.
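There is one problem: because of the message "eth0" is not public, the silent vipca run did not succeed, so we need to run vipca manually on this node.
8. Rerun vipca
Running vipca reports the error Error 0(Native: listNetInterfaces:[3]), as shown in the figure:
This happens because the RAC public and private networks need to be configured again:
Then rerun vipca manually as the root user: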
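The commands used for the network step are not shown in the text. A common way to set the public and private networks in 10g Clusterware is the oifcfg utility, so the following is a hedged sketch under that assumption, not necessarily what the author ran. It only prints the commands for review. Apart from eth0, every value in the usage example (the second interface name and both subnets) is a placeholder, not taken from this cluster, and print_oifcfg_plan is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print oifcfg commands that mark one interface as public and one
# as the cluster interconnect, then list the result. Nothing is executed.
# All four arguments are supplied by the reader; none come from the post.

print_oifcfg_plan() {
    pub_if=$1
    pub_subnet=$2
    priv_if=$3
    priv_subnet=$4

    echo "oifcfg setif -global $pub_if/$pub_subnet:public"
    echo "oifcfg setif -global $priv_if/$priv_subnet:cluster_interconnect"
    # getif shows what Clusterware now has on record.
    echo "oifcfg getif"
}
```

For example, print_oifcfg_plan eth0 192.168.1.0 eth1 10.10.10.0 prints three commands to run from $CRS_HOME/bin once the placeholder subnets have been replaced with the real ones.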
9. Verify that ONS, GSD, and VIP are registered with the cluster
[oracle@rac4 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
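The crs_stat -t table gets longer with every step that follows, and scanning it by eye for a stray OFFLINE is error-prone. A minimal sketch of a filter that prints only the resources whose State column is not ONLINE; offline_resources is a hypothetical helper name, and the sketch assumes the two-line header shown in the transcripts.

```shell
#!/bin/sh
# Sketch: read `crs_stat -t` output on stdin and print the name of every
# resource whose State (4th column) is not ONLINE.
# offline_resources is a hypothetical helper name.

offline_resources() {
    # NR > 2 skips the header row and the dashed separator line.
    awk 'NR > 2 && NF >= 4 && $4 != "ONLINE" { print $1 }'
}
```

Typical use is crs_stat -t | offline_resources. Empty output means every listed resource is ONLINE.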
10. Reconfigure the listeners with netca
Configure the listeners with netca; the command automatically registers them with Clusterware.
Run netca manually as the oracle user:
Confirm that the listeners we just configured are registered with Clusterware:
[oracle@rac4 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
At this point the listener, ons, gsd, and vip resources are all registered in the OCR. What remains is to register ASM and the database in the OCR, and then the experiment is complete.
11. Register ASM in the OCR
[oracle@rac3 ~]$ srvctl add asm -n rac3 -i +ASM1 -o /opt/ora10g/product/10.2.0/db_1
[oracle@rac3 ~]$ srvctl add asm -n rac4 -i +ASM2 -o /opt/ora10g/product/10.2.0/db_1
[oracle@rac3 ~]$ srvctl config asm -n rac3
+ASM1 /opt/ora10g/product/10.2.0/db_1
[oracle@rac3 ~]$ srvctl config asm -n rac4
+ASM2 /opt/ora10g/product/10.2.0/db_1
12. Start ASM and verify
[oracle@rac3 ~]$ srvctl start asm -n rac3
[oracle@rac3 ~]$ srvctl start asm -n rac4
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
The output above shows that ASM started successfully. While it starts, it is worth watching the ASM alert log so that any error is noticed promptly:
[oracle@rac3 bdump]$ pwd
/opt/ora10g/admin/+ASM/bdump
[oracle@rac3 bdump]$ tail -f alert_+ASM1.log
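Following the log with tail -f works while you are watching. For a check after the fact, a minimal sketch that scans the end of an alert log for ORA- messages may help; recent_ora_errors is a hypothetical helper name and the default of 200 lines is an arbitrary choice of this sketch.

```shell
#!/bin/sh
# Sketch: print ORA- messages found in the last N lines of an alert log.
# recent_ora_errors is a hypothetical helper; N defaults to 200.
# The exit status is grep's: 0 if something matched, 1 if nothing did.

recent_ora_errors() {
    log=$1
    lines=${2:-200}
    tail -n "$lines" "$log" | grep 'ORA-'
}
```

For the ASM instance above, the call would be recent_ora_errors /opt/ora10g/admin/+ASM/bdump/alert_+ASM1.log.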
13. Register the database in the OCR
[oracle@rac3 ~]$ srvctl add database -d racdb -o $ORACLE_HOME
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
ora.racdb.db   application    OFFLINE   OFFLINE
14. Register the instances in the OCR
[oracle@rac3 ~]$ srvctl add instance -d racdb -n rac3 -i racdb1
[oracle@rac3 ~]$ srvctl add instance -d racdb -n rac4 -i racdb2
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
ora.racdb.db   application    OFFLINE   OFFLINE
ora....b1.inst application    OFFLINE   OFFLINE
ora....b2.inst application    OFFLINE   OFFLINE
15. Set the dependency between each database instance and its ASM instance
[oracle@rac3 ~]$ srvctl modify instance -d racdb -i racdb1 -s +ASM1
[oracle@rac3 ~]$ srvctl modify instance -d racdb -i racdb2 -s +ASM2
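Steps 11 to 15 repeat the same srvctl commands once per node. A minimal sketch that generates them for any list of nodes follows; it only prints the commands and does not run them. It assumes that the N-th node hosts +ASMN and the N-th database instance, as rac3 and rac4 do above. The helper name print_registration_plan is hypothetical, and the ASM start from step 12 is left out.

```shell
#!/bin/sh
# Sketch: print the srvctl registration commands for ASM, the database,
# its instances, and the instance-to-ASM dependencies.
# Assumption: instance number N lives on the N-th node in the argument list.

print_registration_plan() {
    db=$1
    oracle_home=$2
    shift 2

    # One ASM instance per node: +ASM1, +ASM2, ...
    n=0
    for node in "$@"; do
        n=$((n + 1))
        echo "srvctl add asm -n $node -i +ASM$n -o $oracle_home"
    done

    echo "srvctl add database -d $db -o $oracle_home"

    # One database instance per node, each depending on the local ASM.
    n=0
    for node in "$@"; do
        n=$((n + 1))
        echo "srvctl add instance -d $db -n $node -i ${db}${n}"
        echo "srvctl modify instance -d $db -i ${db}${n} -s +ASM$n"
    done
}
```

For this cluster the call would be print_registration_plan racdb /opt/ora10g/product/10.2.0/db_1 rac3 rac4. Reviewing the printed lines before pasting them into the oracle user's shell keeps a typo from registering a resource under the wrong name.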
16. Start the database and verify
[oracle@rac3 ~]$ srvctl start database -d racdb
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
ora.racdb.db   application    ONLINE    ONLINE    rac3
ora....b1.inst application    ONLINE    ONLINE    rac3
ora....b2.inst application    ONLINE    ONLINE    rac4
We can now see that the database we registered with CRS has started normally.
[oracle@rac3 ~]$ sqlplus '/as sysdba'
SQL*Plus: Release 10.2.0.1.0 - Production on Thu Jan 29 12:47:02 2015
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> col name for a50
SQL> select * from v$dbfile;
     FILE# NAME
---------- --------------------------------------------------
         4 +DATA/racdb/datafile/users.259.845203503
         3 +DATA/racdb/datafile/sysaux.257.845203501
         2 +DATA/racdb/datafile/undotbs1.258.845203501
         1 +DATA/racdb/datafile/system.256.845203499
         5 +DATA/racdb/datafile/undotbs2.264.845203661
         6 +DATA/racdb/datafile/rlst.268.852657465
6 rows selected.
SQL>
17. Manually add the service to the OCR
[oracle@rac3 ~]$ srvctl add service -d racdb -s racdbservice -r racdb1 -a racdb2 -P BASIC
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
ora.racdb.db   application    ONLINE    ONLINE    rac3
ora....b1.inst application    ONLINE    ONLINE    rac3
ora....b2.inst application    ONLINE    ONLINE    rac4
ora....vice.cs application    OFFLINE   OFFLINE
ora....db1.srv application    OFFLINE   OFFLINE
[oracle@rac3 ~]$ srvctl start service -d racdb
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
ora....SM2.asm application    ONLINE    ONLINE    rac4
ora....C4.lsnr application    ONLINE    ONLINE    rac4
ora.rac4.gsd   application    ONLINE    ONLINE    rac4
ora.rac4.ons   application    ONLINE    ONLINE    rac4
ora.rac4.vip   application    ONLINE    ONLINE    rac4
ora.racdb.db   application    ONLINE    ONLINE    rac3
ora....b1.inst application    ONLINE    ONLINE    rac3
ora....b2.inst application    ONLINE    ONLINE    rac4
ora....vice.cs application    ONLINE    ONLINE    rac3
ora....db1.srv application    ONLINE    ONLINE    rac3
At this point we have successfully reinitialized the OCR disk and the votedisk without using any backup.