Archive for April, 2009

MAA challenge lab, part #3

Wednesday, April 15th, 2009

1) Finally got some time to clean up Grid Control (dropping ora2 and ora3). Secured all agents (on VMs: gc, prac1, prac2). I've also cleaned up XEN dom0 (from quadvm); these VMs are not needed anymore. db3.lab (RAC on prac1, prac2) is in GC. Installed a 32-bit agent on srac1 (single-node standby).
2) Testing redo apply on the single-node RAC standby for differences in Standby Redo Log processing (verification performed by opening the standby in read-only mode).
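A minimal sketch of that verification cycle, assuming a physical standby managed by hand (group/sequence numbers are whatever v$standby_log reports in your environment):

```sql
-- Check Standby Redo Log usage while redo is shipped:
SELECT group#, thread#, sequence#, status FROM v$standby_log;

-- Pause managed recovery and open read-only to verify applied data:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE OPEN READ ONLY;

-- ...run verification queries, then resume redo apply:
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```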
3) LNS (ASYNC=buffers_number in LOG_ARCHIVE_DEST_2 parameter) performance fun.
Prepared srac2 for a future RAC extension (to two nodes: srac1, srac2). Also installed a GC agent on srac2.
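For the LNS/ASYNC experiments, this is the kind of setting involved, a sketch using the 10g-style syntax where ASYNC takes a network buffer size in redo blocks (the service name and buffer value here are placeholders, not my actual config):

```sql
-- ASYNC=<buffers_number>: size of the LNS network buffer, in 512-byte blocks.
ALTER SYSTEM SET log_archive_dest_2 =
  'SERVICE=srac LGWR ASYNC=20480
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=srac'
  SCOPE=BOTH SID='*';
```

Varying the buffer size and watching redo shipping lag is where the "performance fun" comes from.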
4) prac3: cloned it from prac2 and added it into the Clusterware prac_cluster. Deployed a GC agent on this node (prac1 and prac2 already have agents; in the future I'll try to upgrade them via GC). Later manually created the +ASM3 and db33 instances (redo, undo, srvctl, etc.). It means that I now have a 3-node primary RAC :)
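The manual third-instance steps boil down to something like this sketch (thread number, group numbers, sizes, and the db3/db33 names are from my setup; adjust to taste):

```sql
-- New redo thread and undo tablespace for the third instance:
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 5 SIZE 50M, GROUP 6 SIZE 50M;
CREATE UNDO TABLESPACE undotbs3 DATAFILE SIZE 200M;

-- Instance-specific parameters for db33:
ALTER SYSTEM SET instance_number = 3   SID='db33' SCOPE=SPFILE;
ALTER SYSTEM SET undo_tablespace = 'UNDOTBS3' SID='db33' SCOPE=SPFILE;

-- Activate the new thread:
ALTER DATABASE ENABLE PUBLIC THREAD 3;
```

Afterwards the instance gets registered in CRS with something like `srvctl add instance -d db3 -i db33 -n prac3`.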
5) srac2: the plan is to add it to srac_cluster and make it a 2-node standby RAC. +ASM2 was running, but more work is needed (mainly registrations in CRS/OCR).
6) The Flash Recovery Area on the standby ASM diskgroup +DATA1 was exhausted (thus MRP0 died), so I performed a full RMAN backup with archivelogs to QUADVM dom0's NFS and afterwards deleted the archivelogs to reclaim some space. On the SRAC standby I changed the archivelog deletion policy (in RMAN) and then restarted MRP0.
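An RMAN sketch of that cleanup, assuming the standby should only keep archivelogs until they have been applied (the disk channel and backup destination are examples):

```sql
RMAN> CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY;

-- Back everything up to the NFS mount, then reclaim FRA space:
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;
RMAN> DELETE ARCHIVELOG ALL BACKED UP 1 TIMES TO DEVICE TYPE DISK;
```

With that policy in place the FRA can also age out applied logs on its own when space pressure builds up.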
Unfortunately I've lost my RAID5 array on synapse (the dom0 hosting srac_cluster: srac1, srac2; it's an old HP LH 6000R server): 2 drives failed, so my standby RAC is doomed until I rebuild synapse on new SCSI drives (to be ordered) :(
UPDATE: I've verified the backups of my srac1 and srac2 VMs, but the backups of ASM diskgroup +DATA1 failed. My OCR and voting disks are lost too. It will be real fun & a challenge to recover this standby RAC environment (it will also be much like restoring a non-Data-Guarded RAC environment after a site crash). I believe I won't have to rebuild my standby from the primary, because I backed this standby up earlier. The OCR can hopefully be restored from the Clusterware auto-backup location.
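The OCR restore path I'm counting on looks roughly like this (the backup path below is just the typical default location under CRS_HOME, not verified on my box):

```shell
# List the automatic OCR backups kept by Clusterware:
ocrconfig -showbackup

# With Clusterware stopped on all nodes, restore as root:
ocrconfig -restore /u01/app/oracle/product/crs/cdata/crs/backup00.ocr

# Verify OCR integrity afterwards:
ocrcheck
```

The voting disk is a separate problem; it would have to be re-added with `crsctl` once CSS is back.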

Finally, the two-node RAC {prac1,prac2} is protected by a Data Guard single-node standby RAC {srac1}.

Set up XEN shared storage and Oracle software storage on synapse for srac1, configured as /u01 on srac1.
Installed Clusterware, ASM +DATA1, and the (RAC) database on srac1 (x86_32).

Oracle RAC main components… (quick self memo)

Wednesday, April 15th, 2009

I always forget how to expand those acronyms and which one does what, especially CSS versus CRS :)

CSS – Cluster Synchronization Services – the component responsible for controlling which nodes are members of the cluster. When a node joins or leaves the cluster, CSS notifies the other nodes of the change in configuration. If this process fails, the node is restarted (to protect the cluster from split-brain). Under Linux, CSS is implemented by the ocssd daemon, which runs as the oracle user.

CRS – Cluster Ready Services – manages resources (databases, instances, services, listeners, VIPs, application processes, GSD, ONS). Configuration is stored in the OCR. When the status of a resource changes, CRS emits an event. CRS will try to restart a resource up to 5 times before giving up. Under Linux, CRS is implemented as the crsd process, which runs with root privileges. In the event of failure, this process restarts automatically.

EVM – Event Manager – implemented as the evmd daemon (which runs as the oracle user). Oracle Clusterware also communicates with ONS, a publish & subscribe service that propagates FAN events to clients.
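A quick way to map the acronyms to running daemons, using the 10g command-line tools (another self memo):

```shell
# Overall stack health (covers CSS, CRS, and EVM in one shot):
crsctl check crs

# Or each component individually:
crsctl check cssd
crsctl check crsd
crsctl check evmd

# Status of the resources CRS manages (10g-style tabular output):
crs_stat -t
```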