Saturday, December 15, 2012

LGWR Generates Trace file "Warning: Log Write Time xxxms, Size xxxxkb"

Found trace file omr_lgwr_841.trc with the warnings below after patching the database to 10.2.0.4


/u00/app/oracle/admin/omr/bdump/omr_lgwr_841.trc
Oracle Database 10g Release 10.2.0.4.0 - Production
ORACLE_HOME = /u00/app/oracle/product/10.2.0
System name:    SunOS
Node name:      etravel2k
Release:        5.10
Version:        Generic_147441-01
Machine:        i86pc
Instance name: omr
Redo thread mounted by this instance: 1
Oracle process number: 8
Unix process pid: 841, image: oracle@etravel2k (LGWR)

*** 2012-11-23 09:52:50.704
*** SERVICE NAME:() 2012-11-23 09:52:50.703
*** SESSION ID:(275.1) 2012-11-23 09:52:50.703
tkcrrsarc: (WARN) Failed to find ARCH for message (message:0x1)
tkcrrpa: (WARN) Failed initial attempt to send ARCH message (message:0x1)
Maximum redo generation record size = 156160 bytes
Maximum redo generation change vector size = 150676 bytes
tkcrrsarc: (WARN) Failed to find ARCH for message (message:0x10)
tkcrrpa: (WARN) Failed initial attempt to send ARCH message (message:0x10)
*** 2012-11-23 09:52:58.396
Warning: log write time 500ms, size 682KB
*** 2012-11-23 09:52:58.993
Warning: log write time 590ms, size 887KB
*** 2012-11-23 09:52:59.655
Warning: log write time 670ms, size 952KB
*** 2012-11-23 09:53:00.285
Warning: log write time 620ms, size 1087KB
*** 2012-11-23 09:53:00.918
Warning: log write time 640ms, size 960KB
*** 2012-11-23 09:53:01.594
Warning: log write time 670ms, size 1087KB
*** 2012-11-23 09:53:02.196
Warning: log write time 610ms, size 960KB
*** 2012-11-23 09:53:02.908
Warning: log write time 710ms, size 992KB
*** 2012-11-23 09:53:03.562
Warning: log write time 650ms, size 1055KB

Looking this error up on Metalink (note 601316.1): this problem can occur on any platform after upgrading to 10.2.0.4.

The warning message is generated only when a log write takes more than 500 ms.

In any case, the warning tells us to check whether the disks are slow and whether there is an OS-level I/O problem.

If everything checks out OK, these messages can safely be ignored.
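Before ignoring them, it is worth confirming the redo write latency from the database side. A minimal sketch, assuming access to v$system_event (NULLIF guards against events with zero waits):

SQL> SELECT event, total_waits,
            ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 1) AS avg_ms
     FROM v$system_event
     WHERE event IN ('log file parallel write', 'log file sync');

If the average 'log file parallel write' time is consistently high, the problem is at the disk/OS layer rather than in the database.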

Monday, December 3, 2012

Enabling Flashback Database in Oracle:



SQL> select flashback_on from v$database;

FLASHBACK_ON
------------------
NO



SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;



SQL> ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET=4320;   -- 3 days (in minutes)
SQL> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE=10737418240;   -- 10G
SQL> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST='/u01/app/oracle/flash_recovery_area';

SQL> show parameter DB_RECOVERY_FILE_DEST

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      /u01/app/oracle/flash_recovery_area
db_recovery_file_dest_size           big integer 10G


SQL> ALTER DATABASE FLASHBACK ON;
SQL> ALTER DATABASE OPEN;

SQL> SELECT flashback_on FROM v$database;

FLASHBACK_ON
-----------------
YES

To disable the flash recovery area, issue:
SQL> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST='' SCOPE=BOTH SID='*';
System altered


To check the estimated flashback size:
SQL> SELECT ROUND(ESTIMATED_FLASHBACK_SIZE/1024/1024/1024) FROM V$FLASHBACK_DATABASE_LOG;

To create a guaranteed restore point:
SQL> create restore point RESTORE_POINT guarantee flashback database;
SQL> select NAME, SCN, TIME from V$RESTORE_POINT;


To flash back the database to a specific time (replace the first argument with the actual target date and time):
SQL> FLASHBACK DATABASE TO TIMESTAMP TO_TIMESTAMP('dd/mm/yyyy hh24:mi:ss', 'DD/MM/YYYY HH24:MI:SS');
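Note that FLASHBACK DATABASE requires the database to be in MOUNT mode, and it must be opened with RESETLOGS afterwards. For example, using the guaranteed restore point created above:

SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
SQL> FLASHBACK DATABASE TO RESTORE POINT RESTORE_POINT;
SQL> ALTER DATABASE OPEN RESETLOGS;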

To drop a restore point:
SQL> select NAME, SCN, TIME from V$RESTORE_POINT;
SQL> drop restore point RESTORE_POINT;

Wednesday, November 28, 2012

RMAN-06059: expected archived log not found, lost of archived log compromises recoverability



RMAN attempted to back up an archived log file, but couldn't find it.

This can happen for a variety of reasons: the file has been manually moved or deleted, the archive log destination has recently been changed, the file has been compressed, etc.

Starting backup
current log archived
released channel: t1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 14:24:55
RMAN-06059: expected archived log not found, lost of archived log
compromises recoverability
ORA-19625: error identifying file /u06/omr/oradata/arch/omr_1_22171_686671135.arc
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3

Your options are either to restore the missing file(s) or to perform a crosscheck. To perform a crosscheck, run the following command from within RMAN:

RMAN> change archivelog all crosscheck;

Crosschecked 178 objects


RMAN> backup archivelog all;

input archive log thread=1 sequence=22177 recid=21267 stamp=800555057
input archive log thread=1 sequence=22178 recid=21268 stamp=800556249
channel ORA_DISK_1: starting piece 1 at 28-NOV-12
channel ORA_DISK_1: finished piece 1 at 28-NOV-12
piece handle=/u01/app/oracle/flash_recovery_area/OMR/backupset/2012_11_28/o1_mf_annnn_TAG20121128T164411_8cdxwxk3_.bkp tag=TAG20121128T164411 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:46
Finished backup at 28-NOV-12

Starting Control File and SPFILE Autobackup at 28-NOV-12
piece handle=/zone/catdb/backup/auto_cntrl_c-1333713947-20121128-05 comment=NONE
Finished Control File and SPFILE Autobackup at 28-NOV-12


It is advisable to perform a full backup of the database at this point.

When an archive log crosscheck is performed, RMAN checks each archived log in turn to make sure that it exists on disk (or tape). Those that are missing are marked as expired. If you have missing logs, this won't bring them back; it will, however, allow you to get past this error and back up the database.
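Once the missing logs are marked expired, their catalog entries can be removed so future backups do not complain again (standard RMAN syntax; only do this if you are sure the logs are not needed for recovery):

RMAN> delete expired archivelog all;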

Saturday, November 24, 2012

Unable to start DB Control oracle 10.2.0.4 & 10.2.0.5


After 31-Dec-2010, all 10.2.0.4 and 10.2.0.5 dbconsoles will face problems during secure/unsecure operations and during dbconsole configuration.


In Enterprise Manager Database Control with Oracle Database 10.2.0.4 and 10.2.0.5, the root certificate used to secure communications via the Secure Socket Layer (SSL) protocol will expire on 31-Dec-2010 00:00:00. The certificate expiration will cause errors if you attempt to configure Database Control on or after 31-Dec-2010.


If you plan to configure Database Control with either of these Oracle Database releases, Oracle strongly recommends that you apply Patch 8350262 to your Oracle Home installations before you configure Database Control.


How to apply the patch?
Download patch 8350262 and unzip it to /u01/patch/10204/8350262.

Apply the patch:

[oracle@emvml 8350262]$ $ORACLE_HOME/OPatch/opatch apply
Invoking OPatch 10.2.0.4.9

Oracle Interim Patch Installer version 10.2.0.4.9
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

...

At the end you should get the message "OPatch succeeded."

After applying the patch, some DB Console related processes might not get killed. To kill those processes, go to the patch directory and execute killDBConsole.


After killing the processes, resecure SSL by issuing the command below:




[oracle@emvml 8350262]$ $ORACLE_HOME/bin/emctl secure dbconsole -reset


Enter the following details:

Step 1: Enter the Enterprise Manager root password: SYSMAN PASSWORD
Step 2: Y
Step 3: Y


Note: If it is a RAC database, use -reset on the first node; on the second node, resecure SSL by issuing the same command without -reset.


Example: $ORACLE_HOME/bin/emctl secure dbconsole


Once the resecure command is successful, start the DB Console and the EM.
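A minimal sketch of the start and verify commands (assuming a standard $ORACLE_HOME layout):

[oracle@emvml ~]$ $ORACLE_HOME/bin/emctl start dbconsole
[oracle@emvml ~]$ $ORACLE_HOME/bin/emctl status dbconsole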
Hope this document helps !!!


Note: This document is taken from http://ankrose.blogspot.com/2011_11_11_archive.html

Sunday, October 7, 2012

CLUSTER ADMINISTRATION




OCR Updation: Three utilities to perform OCR updates
1. SRVCTL (recommended) – remote administration utility
2. DBCA (till 10.2)
3. OEM

SRVCTL – Service Control:
- It is the most widely used utility in a RAC environment
- It is used to perform administration & control of the OCR file

Registry sequence of services into OCR:
1. Node applications          (automatic in 11.2)
2. ASM instances              (automatic in 11.2)
3. Databases
4. Database instances
5. Database services

Note: To unregister, follow the reverse order. A SRVCTL registration sketch follows below.
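As a sketch, registering a database, its instances, and a service with SRVCTL (10.2 syntax; the database name, Oracle home, instance and node names here are hypothetical):

$ srvctl add database -d omr -o /u01/app/oracle/product/10.2.0/db_1
$ srvctl add instance -d omr -i omr1 -n node1
$ srvctl add instance -d omr -i omr2 -n node2
$ srvctl add service -d omr -s omr_svc -r omr1,omr2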





OLR – Oracle Local Registry

- Both the OLR & the GPnP profile are needed by the lower/HAS stack, whereas the OCR & VD are needed by the upper/CRS stack.
- If the OLR or GPnP profile gets corrupted, the corresponding node goes down, whereas if the OCR or VD gets corrupted, the complete cluster goes down.
- Every daemon of a node communicates with the peer (same) daemon on the other nodes.
- Oracle automatically performs an OLR backup at the time of execution of the root.sh script during Grid Infrastructure installation & stores it in "$GRID_HOME/cdata/<hostname>/backup_<date>_<time>.olr".
- The default location of the OLR file is "$GRID_HOME/cdata/<hostname>.olr".

OLR Backup: as the root user, from $GRID_HOME/bin

# ./ocrconfig -local -manualbackup
# ./ocrconfig -local -backuploc <backup location>
# ./ocrcheck -local

Restoring OLR:
- Bring the run level to either init 1 or init 2
- Stop the cluster on the specific node
- Restore the OLR from the backup location: # ./ocrconfig -local -restore <backup file>
- Start the cluster
- Change the run level to either 3 or 5 (init 3 for CLI mode, init 5 for GUI mode)



OCR – Oracle Cluster Registry (or Repository)

It is a critical & shared Clusterware file that contains the complete cluster information, such as the cluster node names, their corresponding IPs, CSS parameters, OCR autobackup information & registered resources like nodeapps, ASM instances with their corresponding node names, databases, database instances & database services.

The CRSD daemon is responsible for updating the OCR file whenever utilities like srvctl, dbca, oem, netca, etc. make configuration changes.

The CRSD daemon also automatically brings online all the cluster resources registered in the OCR file.

To know the OCR location:
# ./ocrcheck                                  // shows the OCR location
# cat /etc/oracle/ocr.loc                     // on Linux & HP-UX
# cat /var/opt/oracle/ocr.loc                 // on Solaris & IBM AIX

OCR Backup methods: 3 ways to perform a backup
1. Automatic
2. Physical
3. Logical

1. Automatic:
Oracle automatically performs an OCR backup at a regular interval of 4 hours from the CRS start time and stores it on the master node.

Identifying the master node:
# vi $G_H/log/<hostname>/crsd/crsd.log

I AM THE NEW OCR MASTER
OR
THE NEW OCR MASTER NODE IS <node number>

Backup location:
$G_H/cdata/<cluster name>/
                backup00.ocr (latest)
                backup01.ocr
                backup02.ocr
                day.ocr
                week.ocr

Oracle retains the latest three 4-hour backups, plus the latest daily backup and the latest weekly backup, purging all the remaining backups.

Note: It is not possible to change the automatic backup interval.

Manual Backup:

# ./ocrconfig -manualbackup
                (creates a backup in the default location $G_H/cdata/<cluster name>/backup_<date>_<time>.ocr)
# ./ocrconfig -backuploc <backup location>    (a shared storage location is recommended)

Restoring OCR:
- Stop the complete cluster on all the nodes: # ./crsctl stop crs
- Identify the latest backup (backup00.ocr)
- Restore the backup: # ./ocrconfig -restore <backup file>
- Start the cluster on all the nodes
- Check the integrity of the restored OCR: # ./cluvfy comp ocr -n all -verbose
2. Physical backup: Oracle supports an image (sector-level) backup of the OCR using the dd utility (if the OCR is on raw devices) or cp (if the OCR is on a general file system).

# cp <OCR file> <backup file>
# dd if=<OCR device> of=<backup file>         // if: input file, of: output file

Restoring:
# cp <backup file> <OCR file>
# dd if=<backup file> of=<OCR device>

3. Logical backup:
# ./ocrconfig -export <export file>
# ./ocrconfig -import <export file>
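A minimal sketch of a logical backup/restore cycle (the export file path here is hypothetical; the cluster must be down for an import):

# ./ocrconfig -export /u01/backup/ocr_20121007.dmp
# ./crsctl stop crs                    (on all nodes, before the import)
# ./ocrconfig -import /u01/backup/ocr_20121007.dmp
# ./crsctl start crs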

Note: Oracle recommends taking a backup of the OCR file whenever the cluster configuration is modified (e.g., adding or deleting a node).

OCR Multiplexing: To avoid losing the OCR & having the complete cluster go down due to a single point of failure (SPF) of the OCR, Oracle supports OCR multiplexing from 10.2 onwards in a maximum of 2 locations (1 primary, the other a mirror copy); from 11.2 onwards it supports a maximum of 5 locations (1 primary and the remaining as mirror copies).

Note: From 11.2 onwards, Oracle supports storing the OCR in ASM disk groups, which provides mirroring depending on the redundancy level.

GPnP – Grid Plug and Play Profile:

- It contains basic cluster information like the location of the voting disk, the ASM spfile location, and all the IP addresses with their subnet masks
- This is a node-specific file
- It is an XML-formatted file

Backup loc: $G_H/gpnp/<hostname>/profiles/peer/profile.xml
Actual loc: $G_H/gpnp/profiles/peer/profile.xml

Voting Disk (VD):

- It is another critical & shared Clusterware file, which contains the node membership information of all the nodes within the cluster
- The CSSD daemon is responsible for sending heartbeat messages to the other nodes every 1 sec and writing the responses into the VD

VD Backup:
- Oracle supports only the physical method (dd) to take a backup of the VD.
- From 11.2 onwards, Oracle does not recommend taking a backup of the VD because the VD data is automatically maintained in the OCR file

Restoring VD:
1. Stop the CRS on all the nodes
2. Restore the VD from its physical backup: # dd if=<backup file> of=<VD device>
3. Start the CRS on all the nodes
4. Check the integrity of the restored VD: # ./cluvfy comp vdisk -n all -verbose

VD Multiplexing: To avoid losing the VD & having the complete cluster go down due to an SPF of the VD, Oracle supports multiplexing of the VD from 10.2 onwards in a maximum of 31 locations; from 11.2 it supports a maximum of 15 locations.

Node Eviction:






It is the process of automatically rebooting a cluster node due to private network or VD access failure, in order to avoid data corruption.
If node1 & node2 can communicate with each other but not with node3 through the private network, a split-brain syndrome can occur: two sub-clusters form and may try to master the same resource, thereby causing data corruption. To avoid this split-brain syndrome, the master node evicts the affected node based on the node membership information in the VD.

CSS Parameters (current values can be checked with crsctl, as sketched below):
1. misscount (default 30 sec): the maximum private network latency to wait before the master node triggers the node eviction process.
2. disktimeout (default 200 sec): the maximum VD access latency; if it elapses, the master node triggers the node eviction process.
3. reboottime (default 3 sec): the affected node waits until the reboot time has elapsed before the actual node reboot (this is to let 3rd-party applications go down properly).
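A quick way to verify the current values (run as root from $GRID_HOME/bin):

# ./crsctl get css misscount
# ./crsctl get css disktimeout
# ./crsctl get css reboottime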

ASM – Automatic Storage Management


File Systems:



Disadvantages of Raw Devices:
1. A raw device supports storage of only 1 file. Hence archived redo log files & flashback logs, which are generated numerously, are not suitable members for raw devices.
2. General O/S commands like cp, ls, mv, du, etc. will not work on raw devices.
3. Only dd (disk dump) can be used to format, back up, and restore raw devices.
4. Raw devices do not support collection of I/O statistics.
5. They cannot be resized online.
6. In a Linux environment, out of 15 partitions we can use only 14 for the creation of raw devices; in Solaris we can use only 6 out of 7 partitions per disk.


To overcome all these disadvantages we use an LVM (Logical Volume Manager):
1. It is a logical storage area created from a collection of multiple disk partitions, onto which we can create any type of file system.
2. It supports storage of multiple files in a single volume.
3. Online resizing is possible.
4. It supports collection of I/O statistics.
5. It improves I/O performance & availability with the help of software-level RAID techniques.

Types of LVMs & Vendors:

LVM                              Vendor
1. VERITAS Volume Manager        Symantec
2. Tivoli Volume Manager         IBM
3. Sun Volume Manager (SVM)      Oracle Sun
4. ASM (from Oracle 10g)         Oracle


ASM:
- It is a type of LVM, supported from Oracle 10g, with a special type of instance (INSTANCE_TYPE=ASM) & a small SGA footprint of 100-128 MB.
- It supports the creation of logical volumes known as disk groups, internally using both striping and mirroring.
- It does not have any control file to mount, so its last startup stage is NOMOUNT; instead, it mounts the disk groups (a disk group creation sketch follows below).
- A disk group is a logical storage area created from a collection of multiple disk partitions.
- ASM supports storage of multiple database-related files like control files, redo logs, datafiles, archived logs, flashback logs, RMAN backup pieces, spfile, etc., but it does not support storage of static files like pfile, listener.ora, tnsnames.ora, sqlnet.ora, etc.
- From 11.2 onwards, by using ADVM (ASM Dynamic Volume Manager) & ACFS (ASM Cluster File System) we can store static files as well.

Note: Sometimes the ASM instance may contain a large pool as well.
- 1 ASM instance supports the creation of multiple disk groups and provides services to multiple clients.
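As a sketch, a disk group is created from the ASM instance with SQL (the disk group name, failgroup names, and disk paths here are hypothetical):

SQL> CREATE DISKGROUP data NORMAL REDUNDANCY
       FAILGROUP fg1 DISK '/dev/raw/raw1'
       FAILGROUP fg2 DISK '/dev/raw/raw2';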

ASM Clients: These are general DB instances that depend on the ASM instance in order to access the disk groups.




ASM Instance Background Processes:

RBAL – Rebalance Master: It is responsible for managing and coordinating the disk group activities, and for generating the plans for the even distribution of ASM extents (for better load balancing) whenever a disk is added or removed.

ARBn – ASM Rebalancer: It is a slave process of the RBAL background process and is responsible for the actual rebalancing of ASM extents across the disks.

ASMB – ASM Background: It is responsible for the successful establishment of the communication channel between the ASM instance & its ASM clients.

GMON – Group Monitor: It is responsible for coordinating the disk group activities whenever a disk group goes offline or is dropped.

KATE – Konductor of ASM Temporary Errands: It is responsible for bringing offlined disks back online.

ASM Client Background Processes:

RBAL – Rebalance Master: On a client instance, it is responsible for the successful opening and closing of the disk groups whenever read or write operations occur.

PZ9X: It is responsible for gathering dynamic view information globally across all the instances of the database.

ASM-related dynamic views:
In a RAC environment, all the dynamic views start with gv$; in non-RAC, with v$.
1. gv$asm_disk
2. gv$asm_diskgroup
3. gv$asm_disk_iostat
4. gv$asm_client
5. gv$asm_template (19 views in total)
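For example, a quick disk group health check against the standard view columns:

SQL> SELECT group_number, name, state, type, total_mb, free_mb
     FROM gv$asm_diskgroup;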


  
ASM in RAC Environment: