OCR Updates:
Three utilities are used to perform OCR updates:
1. SRVCTL (recommended) – remote administration utility
2. DBCA (till 10.2)
3. OEM
SRVCTL – Service Control:
- It is the most widely used utility in a RAC environment.
- It is used to administer and control the OCR file.
Registration sequence of services into OCR:
1. Node applications (done automatically in 11.2)
2. ASM instances (done automatically in 11.2)
3. Databases
4. Database instances
5. Database services
Note: To unregister, follow the reverse order.
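The ordering above can be sketched as a dry run for steps 3 to 5, which are the ones still done by hand in 11.2. This is only an illustration: the database name, Oracle home path, node names, and service name are hypothetical placeholders, and the functions print the srvctl commands rather than executing them.

```shell
#!/bin/sh
# Dry-run sketch: prints srvctl commands in order, runs nothing.
# DB, ORACLE_HOME_DIR, node1/node2 and app_svc are hypothetical placeholders.
DB=orcl
ORACLE_HOME_DIR=/u01/app/oracle/product/11.2.0/dbhome_1

registration_plan() {
    # Step 3: the database is registered first
    echo "srvctl add database -d $DB -o $ORACLE_HOME_DIR"
    # Step 4: then its instances, one per node
    echo "srvctl add instance -d $DB -i ${DB}1 -n node1"
    echo "srvctl add instance -d $DB -i ${DB}2 -n node2"
    # Step 5: finally the database services
    echo "srvctl add service -d $DB -s app_svc -r ${DB}1,${DB}2"
}

unregistration_plan() {
    # Reverse order: services, then instances, then the database
    echo "srvctl remove service -d $DB -s app_svc"
    echo "srvctl remove instance -d $DB -i ${DB}2"
    echo "srvctl remove instance -d $DB -i ${DB}1"
    echo "srvctl remove database -d $DB"
}

registration_plan
unregistration_plan
```

Printing the plan first makes it easy to review the order before running the commands on a real cluster.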
OLR – Oracle Local Registry
- Both the OLR and the GPNP profile are needed by the lower (HAS) stack, whereas the OCR and VD are needed by the upper (CRS) stack.
- If the OLR or GPNP profile gets corrupted, only the corresponding node goes down, whereas if the OCR or VD gets corrupted, the complete cluster goes down.
- Every daemon on a node communicates with its peer (the same daemon) on the other nodes.
- Oracle automatically performs an OLR backup when the root.sh script is executed during grid infrastructure installation and stores it in the location "$GRID_HOME/cdata//backup_date_time.olr".
- The default location of the OLR file is "$GRID_HOME/cdata/.olr".
OLR Backup: Using root user (run from $G_H/bin)
# ./ocrconfig -local -manualbackup
# ./ocrconfig -local -backuploc
# ./ocrcheck -local
Restoring OLR:
- Bring the init level down to either init 1 or init 2
- Stop the cluster on the specific node
- Restore the OLR from the backup location "# ./ocrconfig -local -restore "
- Start the cluster
- Change the init level to either 3 or 5 (init 3 for CLI and init 5 for GUI mode)
OCR – Oracle Cluster Registry (or Repository)
It is a critical, shared Clusterware file that contains the complete cluster information, such as the cluster node names and their corresponding IPs, CSS parameters, OCR autobackup information, and registered resources like nodeapps, ASM instances with their corresponding node names, databases, database instances, and database services.
The CRSD daemon is responsible for updating the OCR file whenever utilities like srvctl, dbca, oem, netca, etc. make configuration changes.
The CRSD daemon also automatically brings online all the cluster resources registered in the OCR file.
To know the OCR location:
# ./ocrcheck                       // displays the OCR location
# cat /etc/oracle/ocr.loc          // on Linux & IBM AIX
# cat /var/opt/oracle/ocr.loc      // on Solaris & HP-UX
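Since the ocr.loc path differs by platform, a small helper can simply try both paths and print the configured location from whichever file exists. This is a minimal sketch: the function name is hypothetical, and it assumes the file holds an `ocrconfig_loc=` entry.

```shell
#!/bin/sh
# print_ocr_location FILE...
# Prints the ocrconfig_loc value from the first FILE that exists.
# Assumption: ocr.loc contains a line of the form ocrconfig_loc=<location>.
print_ocr_location() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            # Strip the key and print only the configured location
            sed -n 's/^ocrconfig_loc=//p' "$f"
            return 0
        fi
    done
    echo "no ocr.loc found" >&2
    return 1
}

# Try both paths named in the text; on a machine without
# Clusterware this only reports that no file was found.
print_ocr_location /etc/oracle/ocr.loc /var/opt/oracle/ocr.loc || true
```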
OCR Backup methods: 3 ways to perform a backup
1. Automatic
2. Physical
3. Logical
1. Automatic:
Oracle automatically performs an OCR backup at a regular interval of 4 hours, counted from the CRS start time, and stores it on the master node.
Identifying the master node: search the CRSD log for either of the messages below
# vi $G_H/log//crsd/crsd.log
I AM THE NEW OCR MASTER
OR
THE NEW OCR MASTER NODE IS
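Instead of scanning the log in an editor, the two messages can be searched for with grep. This is a minimal sketch: the function name is hypothetical, and the exact layout of the surrounding log line is not assumed, only the two message strings quoted above.

```shell
#!/bin/sh
# last_ocr_master_line LOGFILE
# Prints the most recent log line that announces an OCR master, if any.
last_ocr_master_line() {
    grep -E 'I AM THE NEW OCR MASTER|THE NEW OCR MASTER NODE IS' "$1" | tail -n 1
}
```

Call it with the path of the crsd.log file on the node being checked. The last matching line is the most recent master change recorded in that log.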
Backup location: $G_H/cdata/
backup00.ocr (latest)
backup01.ocr
backup02.ocr
day.ocr
week.ocr
Oracle retains the latest three 4-hourly backups, plus the latest daily backup and the latest weekly backup, and purges all the remaining backups.
Note: It is not possible to change the automatic backup interval.
Manual Backup:
# ./ocrconfig -manualbackup
(this creates a backup in the default location $G_H/cdata//backup_date_time.ocr)
# ./ocrconfig -backuploc
(shared storage is the recommended backup location)
Restoring OCR:
- Stop the complete cluster on all the nodes "# ./crsctl stop crs"
- Identify the latest backup (backup00.ocr)
- Restore the backup "# ./ocrconfig -restore "
- Start the cluster on all the nodes
- Check the integrity of the restored OCR "# ./cluvfy comp ocr -n all -verbose"
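The restore steps above can be laid out as a dry run that prints each command next to the node it would run on. This is only a sketch of the sequence as described here: the function name, backup path, and node names are hypothetical, and nothing is executed.

```shell
#!/bin/sh
# ocr_restore_plan BACKUP_FILE NODE...
# Prints the restore steps in order; it does not execute anything.
ocr_restore_plan() {
    backup_file=$1
    shift
    # Stop the cluster on every node
    for node in "$@"; do
        echo "[$node] crsctl stop crs"
    done
    # Restore once, from the first node listed
    echo "[$1] ocrconfig -restore $backup_file"
    # Start the cluster on every node
    for node in "$@"; do
        echo "[$node] crsctl start crs"
    done
    # Verify OCR integrity across all nodes
    echo "[$1] cluvfy comp ocr -n all -verbose"
}

# Hypothetical backup path and node names.
ocr_restore_plan /backups/backup00.ocr node1 node2
```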
2. Physical backup: Oracle supports an image (sector-level) backup of the OCR using the dd utility (if the OCR is on raw devices) or cp (if the OCR is on a general file system).
# cp
# dd if= of=                // if: input file, of: output file
Restoring:
# cp
# dd if= of=                // if: input file, of: output file
3. Logical backup:
# ./ocrconfig -export
# ./ocrconfig -import
Note: Oracle recommends taking a backup of the OCR file whenever the cluster configuration is modified (e.g., adding or deleting a node).
OCR Multiplexing: To avoid losing the OCR, and with it the complete cluster, through a single point of failure (SPF), Oracle supports OCR multiplexing. From 10.2 onwards it supports a maximum of 2 locations (1 primary and 1 mirror copy); from 11.2 onwards it supports a maximum of 5 locations (1 primary and the remaining as mirror copies).
Note: From 11.2 onwards, Oracle supports storing the OCR in ASM disk groups, which provide mirroring depending on the redundancy level.
GPNP – Grid Plug and Play Profile:
- It contains basic cluster information such as the voting disk location, the ASM spfile location, and all the IP addresses with their subnet masks.
- It is a node-specific file.
- It is an XML-formatted file.
Backup loc: $G_H/gpnp//profile/peer/profile.xml
Actual loc: $G_H/gpnp/profile/peer/profile.xml
Voting Disk (VD):
- It is another critical, shared file, which contains the node membership information of all the nodes within the cluster.
- The CSSD daemon is responsible for sending heartbeat messages to the other nodes every 1 second and writing the responses into the VD.
VD Backup:
- Oracle supports only the physical method for backing up the VD.
- From 11.2 onwards, Oracle does not recommend backing up the VD, because a VD backup is automatically maintained in the OCR file.
Restoring VD:
1. Stop the CRS on all the nodes
2. Restore the VD "# ./crsctl -restore vdisk "
3. Start the CRS on all the nodes
4. Check the integrity of the restored VD "# ./cluvfy comp vdisk -n all -verbose"
VD Multiplexing: To avoid losing the VD, and with it the complete cluster, through an SPF of the VD, Oracle supports multiplexing of the VD from 10.2 onwards in a maximum of 31 locations, but from 11.2 it supports a maximum of 15 locations.
Node Eviction:
It is the process of automatically rebooting a cluster node after a private network or VD access failure, in order to avoid data corruption.
If node1 and node2 can communicate with each other but not with node3 through the private network, a split-brain syndrome can occur: 2 sub-clusters form and each tries to master the same resource, thereby causing data corruption. To avoid this split-brain syndrome, the master node evicts the corresponding node based on the node membership information in the VD.
CSS Parameters:
1. Misscount: default 30 sec. It specifies the maximum private network latency to wait before the master node triggers the node eviction process.
2. Disk timeout: default 200 sec. It specifies the maximum VD access latency; once it has elapsed, the master node triggers the node eviction process.
3. Reboot time: default 3 sec. The affected node waits until the reboot time has elapsed before actually rebooting (this allows some 3rd-party applications to shut down properly).
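To make the two eviction thresholds concrete, the sketch below checks how long a node has been silent on the private network and on the voting disk against the default values quoted above. It is a deliberately simplified model, not how CSSD is implemented, and the function and variable names are hypothetical. On a live cluster the configured values can be read with "crsctl get css misscount", "crsctl get css disktimeout" and "crsctl get css reboottime".

```shell
#!/bin/sh
# Defaults quoted in the text, in seconds.
MISSCOUNT=30
DISKTIMEOUT=200

# eviction_reason NET_SILENCE_SECS DISK_SILENCE_SECS
# Simplified model: a node becomes an eviction candidate once either
# silence period exceeds its threshold. Prints the reason, or "none".
eviction_reason() {
    if [ "$1" -gt "$MISSCOUNT" ]; then
        echo "network heartbeat missed"
    elif [ "$2" -gt "$DISKTIMEOUT" ]; then
        echo "voting disk timeout"
    else
        echo "none"
    fi
}
```

For example, `eviction_reason 31 0` reports a missed network heartbeat, while `eviction_reason 10 50` reports no reason for eviction.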