Archive for the ‘Laboratory’ Category

A very simple demonstration of how to limit latency impact (improve consumer experience) by switching to out-of-band/background validation connection checking for DB connections

Thursday, March 31st, 2016

This is a very simple demonstration of how to limit latency impact (improve consumer experience) by switching to out-of-band/background validation connection checking for DB connections:

Test setup: laptop <-- 0ms latency --> VM { WildFly/JBoss } <-- 6ms latency --> VM { Oracle DB }; latency was simulated using Linux tc netem (6 ms, but one way only; see the sketch after this setup).
Benchmark generator settings: ApacheBench 2.4 launched with concurrency = 1, timeout = 5, 1000 requests to a simple JSP page; the test was repeated multiple times to warm up caches, etc.
Middle tier: JBoss/WildFly Core 2.0.10.Final, connection pool of 100, validation connection checker explicitly set to “SELECT 1 FROM DUAL”
Application: a simple JSP page performing 1 DB call (1x INSERT)
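
For reference, roughly how the one-way delay and the load were generated; a minimal sketch only, assuming eth0 on the VM where the delay is injected and a placeholder URL for the test JSP:

# delay outgoing packets on this interface only (i.e. one way)
tc qdisc add dev eth0 root netem delay 6ms
tc qdisc show dev eth0                       # verify the qdisc is active
# benchmark run from the laptop (URL is a placeholder)
ab -n 1000 -c 1 -s 5 http://wildfly-vm:8080/test/insert.jsp
# remove the delay afterwards
tc qdisc del dev eth0 root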

With 6ms RTT and a JDBC connection pool using the default foreground connection check (actually 4 packets; 4*6 =~ 24ms):

Requests per second:    55.97 [#/sec] (mean)
Time per request:       17.867 [ms] (mean)
Time per request:       17.867 [ms] (mean, across all concurrent requests)
Transfer rate:          16.07 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    13   18   9.8     16     218
Waiting:       13   17   9.8     16     218
Total:         13   18   9.8     17     218

Percentage of the requests served within a certain time (ms)
  50%     17
  66%     17
  75%     18
  80%     18
  90%     19
  95%     21
  98%     38
  99%     71
 100%    218 (longest request)

With 6ms RTT and a JDBC connection pool using background connection checking (just 2 packets to the DB; 2*6 =~ 12ms):

Requests per second:    100.74 [#/sec] (mean)
Time per request:       9.927 [ms] (mean)
Time per request:       9.927 [ms] (mean, across all concurrent requests)
Transfer rate:          28.92 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:     7   10   2.8      9      41
Waiting:        7    9   2.8      9      41
Total:          7   10   2.9      9      42

Percentage of the requests served within a certain time (ms)
  50%      9
  66%     10
  75%     10
  80%     10
  90%     11
  95%     13
  98%     18
  99%     26
 100%     42 (longest request)

This demonstrates that the average consumer experience can be improved from ~18ms to ~10ms just by setting two options that move the connection checking “out-of-band” with respect to business transaction processing. Of course, the more often the application calls .getConnection(), even transparently as in the JSP case, the bigger the impact of those settings.

p.s. I’ve used

<check-valid-connection-sql>select 1 from dual</check-valid-connection-sql>

in my experiment; however, you should really be using

<valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.oracle.OracleValidConnectionChecker"/> 

as it calls the native Oracle JDBC isValid()/ping methods that perform the check without going through the SQL layer (smaller packets).
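
For completeness, a minimal sketch of the datasource <validation> block with background checking enabled (the interval value and the surrounding datasource/JNDI names are placeholders, not my exact standalone.xml):

<datasource jndi-name="java:jboss/datasources/OracleDS" pool-name="OracleDS">
  ...
  <validation>
    <!-- the two options that move validation off the .getConnection() path -->
    <background-validation>true</background-validation>
    <background-validation-millis>60000</background-validation-millis>
    <!-- native Oracle checker, preferred over check-valid-connection-sql -->
    <valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.oracle.OracleValidConnectionChecker"/>
  </validation>
</datasource>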

Long remote Oracle/SQL*Net connection times due to IPv6 being enabled on RHEL 5.x

Sunday, January 9th, 2011

On one of my RAC clusters in the lab I noticed that it sometimes took up to 5-15 seconds to connect to the Oracle database, instead of at most 1 s. It was happening on some old VMs from back in 2008. Root-cause analysis showed that each RAC node was doing a DNS search for its own name, which is something I wouldn't normally expect. Paying attention to the details showed that the Oracle RDBMS was performing a DNS lookup of its own server name, but not for the A record; it was asking for the AAAA (IPv6) record. Yes, libc (the DNS resolver) was asking REMOTE DNS SERVERS for its own server name, in IPv6 form (AAAA), because it couldn't find the required info via /etc/hosts. This happens even with NETWORKING_IPV6=OFF in /etc/sysconfig/network.

The solution was pretty easy after establishing the root cause (a quick verification sketch follows this list). Just ensure that:

  • all RAC node names are in /etc/hosts
  • all RAC node names in every form the resolver will look up (e.g. the fully qualified name.domain form) are in /etc/hosts
  • resolv.conf provides at least 2-3 DNS servers that are reachable within 50 ms at most (check each one with dig(1)); resolv.conf can also use the rotate option to round-robin across them [my environment was affected by this]
  • IPv6 is disabled via NETWORKING_IPV6 in /etc/sysconfig/network
  • the ipv6 kernel module is aliased to “off” to completely disable IPv6 kernel functionality, which in turn also disables any libc IPv6 action [my environment was affected by this; just add "alias net-pf-10 off" to modprobe.conf]
  • lsmod does NOT display ipv6 (the kernel module is not loaded)
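
A quick way to check each point; a sketch only, with example hostname and nameserver values:

grep rac1 /etc/hosts                            # node name (and name.domain) must resolve locally
dig @192.168.1.10 rac1.lab AAAA +time=1         # repeat for every nameserver in resolv.conf; each must answer fast
grep NETWORKING_IPV6 /etc/sysconfig/network     # expect NETWORKING_IPV6=no
grep net-pf-10 /etc/modprobe.conf               # expect: alias net-pf-10 off
lsmod | grep ipv6                               # should print nothing once the module is gone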

Of course, as you can see, I wouldn't have been affected by this one if the DNS name servers hadn't been misconfigured in the first place (my lab in 2008 looked different than it does now).

Index partition corruption detected via RMAN validate (11.1) and resolution

Sunday, January 9th, 2011

SELF-MEMO: How to fix corruption in index partitions (I was bringing my Extended RAC 11.1.0.6 configuration back to life):

RMAN> validate database check logical;

Starting validate at 07-JAN-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=120 instance=erac2 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00001 name=+DATA1/erac/datafile/system.259.654366597
input datafile file number=00004 name=+DATA1/erac/datafile/soeindex.dbf
input datafile file number=00003 name=+DATA1/erac/datafile/soedata.dbf
input datafile file number=00008 name=+DATA1/erac/datafile/undotbs5.270.655999987
input datafile file number=00002 name=+DATA1/erac/datafile/sysaux.260.654366615
input datafile file number=00007 name=+DATA1/erac/datafile/undotbs4.269.655999975
input datafile file number=00005 name=+DATA1/erac/datafile/users.264.654366663
channel ORA_DISK_1: validation complete, elapsed time: 00:05:46
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1    OK     0              20084        459520          13210477
  File Name: +DATA1/erac/datafile/system.259.654366597
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              431075
  Index      0              6321
  Other      0              2040

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    FAILED 0              16058        45584           13210492
  File Name: +DATA1/erac/datafile/sysaux.260.654366615
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              10669
  Index      2              7436
  Other      0              11421

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
3    OK     0              39255        126976          12914293
  File Name: +DATA1/erac/datafile/soedata.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              85332
  Index      0              12
  Other      0              2377

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
4    OK     0              13378        276649          12928336
  File Name: +DATA1/erac/datafile/soeindex.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              0
  Index      0              258967
  Other      0              4304

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
5    OK     0              44           640             13209647
  File Name: +DATA1/erac/datafile/users.264.654366663
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              532
  Index      0              1
  Other      0              63

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
7    OK     0              10238        25600           13210492
  File Name: +DATA1/erac/datafile/undotbs4.269.655999975
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              0
  Index      0              0
  Other      0              15362

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
8    OK     0              120          89600           13210482
  File Name: +DATA1/erac/datafile/undotbs5.270.655999987
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              0
  Index      0              0
  Other      0              89480

validate found one or more corrupt blocks
See trace file /u01/app/oracle/diag/rdbms/erac/erac2/trace/erac2_ora_12052.trc for details
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
including current control file for validation
including current SPFILE in backup set
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Control File and SPFILE
===============================
File Type    Status Blocks Failing Blocks Examined
------------ ------ -------------- ---------------
SPFILE       OK     0              2
Control File OK     0              1146
Finished validate at 07-JAN-11

RMAN>

Querying for exact info about affected segments:

SQL> SELECT * FROM v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
         2       7260          1           13210364 CORRUPT
         2       7308          1           13210260 CORRUPT

SQL>

(screenshot: corruption details on erac2)
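
The FILE#/BLOCK# pairs above can be mapped to their owning segments with a dictionary query along these lines (a sketch using the values reported by v$database_block_corruption; run it again with 7308 for the second corrupt block). In this case it should point at the WRH$ (AWR) index partitions rebuilt below:

SELECT owner, segment_type, segment_name, partition_name
  FROM dba_extents
 WHERE file_id = 2
   AND 7260 BETWEEN block_id AND block_id + blocks - 1;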

Fixing this case is simple; I just executed the DDL statements below to rebuild the affected index partitions:

alter index WRH$_SERVICE_STAT_PK rebuild partition WRH$_SERVIC_427808814_0;
alter index WRH$_SERVICE_STAT_PK rebuild partition WRH$_SERVICE_STAT_MXDB_MXSN;
alter index WRH$_PARAMETER_PK rebuild partition WRH$_PARAME_427808814_0;
alter index WRH$_PARAMETER_PK rebuild partition WRH$_PARAMETER_MXDB_MXSN;

Then I ran the RMAN validation again to ensure everything is OK:

(..)
channel ORA_DISK_1: validation complete, elapsed time: 00:05:36
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1    OK     0              20084        459520          13258305
  File Name: +DATA1/erac/datafile/system.259.654366597
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              431075
  Index      0              6321
  Other      0              2040

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     0              15982        45584           13258305
  File Name: +DATA1/erac/datafile/sysaux.260.654366615
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              10669
  Index      0              7505
  Other      0              11428

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
3    OK     0              39255        126976          12914293
  File Name: +DATA1/erac/datafile/soedata.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              85332
  Index      0              12
  Other      0              2377

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
4    OK     0              13378        276649          12928336
  File Name: +DATA1/erac/datafile/soeindex.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              0
  Index      0              258967
  Other      0              4304

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
5    OK     0              44           640             13209647
  File Name: +DATA1/erac/datafile/users.264.654366663
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              532
  Index      0              1
  Other      0              63

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
7    OK     0              10238        25600           13258305
  File Name: +DATA1/erac/datafile/undotbs4.269.655999975
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              0
  Index      0              0
  Other      0              15362

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
8    OK     0              120          89600           13258101
  File Name: +DATA1/erac/datafile/undotbs5.270.655999987
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              0
  Index      0              0
  Other      0              89480

channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
including current control file for validation
including current SPFILE in backup set
channel ORA_DISK_1: validation complete, elapsed time: 00:00:00
List of Control File and SPFILE
===============================
File Type    Status Blocks Failing Blocks Examined
------------ ------ -------------- ---------------
SPFILE       OK     0              2
Control File OK     0              1146
Finished validate at 07-JAN-11

RMAN>

Oracle 11g ADRCI memo for autopurging

Sunday, January 9th, 2011

SELF-MEMO: How to configure ADRCI in 11.1 or 11.2 to purge logs and change the retention policy for removing unneeded log files (a scriptable one-liner follows the transcript).

[oracle@rac1 ~]$ du -sm /u01/app/oracle/diag/
766     /u01/app/oracle/diag/
[oracle@rac1 ~]$ adrci

ADRCI: Release 11.1.0.6.0 - Beta on Fri Jan 7 12:28:45 2011

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

ADR base = "/u01/app/oracle"
adrci> set control (SHORTP_POLICY = 360)
DIA-48448: This command does not support multiple ADR homes

adrci> show homes
ADR Homes:
diag/tnslsnr/rac1/listener_rac1
diag/rdbms/erac/erac1
diag/asm/+asm/+ASM1
adrci> set home diag/tnslsnr/rac1/listener_rac1
adrci> set control (SHORTP_POLICY = 360)
adrci> set control (LONGP_POLICY = 360)
adrci> purge
adrci>
adrci> set home diag/rdbms/erac/erac1
adrci> set control (SHORTP_POLICY = 360)
adrci> set control (LONGP_POLICY = 360)
adrci> purge
adrci>
adrci> set home diag/asm/+asm/+ASM1
adrci> set control (SHORTP_POLICY = 360)
adrci> set control (LONGP_POLICY = 360)
adrci> purge
adrci>
adrci> CTRL+D
[oracle@rac1 ~]$
[oracle@rac1 ~]$
[oracle@rac1 ~]$ du -sm /u01/app/oracle/diag/
185     /u01/app/oracle/diag/
[oracle@rac1 ~]$
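
The same thing can be done non-interactively, one home at a time, via adrci's exec option; a sketch for one of the homes above (both policies are expressed in hours, so 360 is roughly 15 days):

adrci exec="set home diag/rdbms/erac/erac1; set control (SHORTP_POLICY = 360); set control (LONGP_POLICY = 360); purge"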

MAA challenge lab, part #3

Wednesday, April 15th, 2009

11/04/2009:
1) Finally got some time for cleaning up Grid Control (dropping ora2 and ora3). Secured all agents (on VMs: gc, prac1, prac2). I’ve also cleaned up XEN dom0 (from quadvm). These VMs are not needed anymore. db3.lab (RAC on prac1, prac2) is in GC. Installed 10.2.0.5 32-bit agent on srac1 (single node standby).
2) Tested redo application on the single-node RAC standby for differences in standby redo log processing (verification performed using read-only mode).
3) LNS (ASYNC=buffers_number in LOG_ARCHIVE_DEST_2 parameter) performance fun.
Prepared srac2 for future RAC extension (to two nodes: srac1, srac2). Also installed GC agent on srac2 (10.2.0.5).
4) prac3: cloning and adding it to the Clusterware prac_cluster (addNode.sh from prac2). Deploying the GC 10.2.0.5 agent on this node (prac1 and prac2 are both on 10.2.0.4; in the future I'll try to upgrade them via GC). Later, manually creating the +ASM3 and db33 instances (redo, undo, srvctl, etc.). This means I now have a 3-node primary RAC :)
5) srac2: the plan is to add it to srac_cluster and make it a 2-node standby RAC. +ASM2 was running, but more work is needed (mainly registrations in CRS/OCR).
6) The Flash Recovery Area on the standby ASM diskgroup +DATA1 was exhausted (thus MRP0 died), so I performed a full RMAN backup with archivelogs to QUADVM dom0's NFS and afterwards deleted the archivelogs to reclaim some space. On the SRAC standby I changed the archivelog deletion policy (in RMAN) and then restarted MRP0.
Unfortunately I've lost my RAID5 array on synapse (the dom0 hosting srac_cluster: srac1, srac2; it's an old HP LH 6000R server): 2 drives have failed, so my standby RAC is doomed until I rebuild synapse on new SCSI drives (to be ordered) :(
UPDATE: I've verified the backups of my srac1 and srac2 VMs, but the backups of the ASM diskgroup +DATA1 failed. My OCR and voting disks are also lost. It will be real fun & a challenge to recover this standby RAC environment (much like restoring a non-Data-Guarded RAC environment after a site crash). I believe I won't have to rebuild the standby from the primary, because I backed this standby up earlier. The OCR can hopefully be restored from the Clusterware auto-backup location.

15/03/2009:
Finally, the two-node RAC {prac1, prac2} is being protected by a Data Guard single-node standby RAC {srac1}.

21/02/2009:
Set up XEN shared storage and Oracle software storage on synapse for srac1, configured as /u01 on srac1.
Clusterware, ASM +DATA1, and database (RAC) installation on srac1 (x86_32).

MAA challenge lab, part #2

Tuesday, February 3rd, 2009

25/01/2009: Cloning prac2 and adding it to the Clusterware cluster (with prac1). Debugging the Clusterware bug I blogged about earlier.

26/01/2009: Reading about migration of single-instance ASM to a fully clustered ASM/RAC. Experiments with NOLOGGING and RMAN recovery on the xeno workstation (db TEST).

27/01/2009: I've managed to migrate db3.lab1 {nodes prac1 and prac2} to a fully working RAC with ASM storage (ASM migration done using DBCA; RAC migration performed using rconfig). Deployed the GC agent on prac2.

29/01/2009: Experiments with DBMS_REPAIR, NOLOGGING and RMAN recovery on xeno workstation (db TEST).

Brief description of work on my MAA challenge lab… #1

Saturday, January 24th, 2009

The picture below shows what I'm planning to deploy in my home lab in order to prepare better for the OCP certification. It can be summarized as a full Maximum Availability Architecture implementation… Grid Control is being used to increase productivity, but I don't want to integrate Oracle VM into the stack, just systems and databases:

MAA challenge lab

17/01/2009: Installation and fight with Grid Control on the gc VM. Preparing a Linux VM template named 01_prac1 from which the other machines are going to be cloned (as simple as a recursive “cp” in dom0).

18/01/2009: Installation & fight with Grid Control after I dropped the "emrep" database (the main GC repository database). This happened while I was playing with the database "misc1" cloned from "emrep". I didn't read the message while running "DROP DATABASE .." from RMAN and I sent both to /dev/null, because the DBID was the same for the original and the "misc1" clone. The primary reason was that I wanted misc1 cloned from emrep, but it failed :). Did I say that I also deleted the backups? ;) After a new, successful fresh installation of GC, I took a full backup (from XEN dom0) of the 01_gc VM for future purposes. I'm starting to regret that I haven't used LVM in dom0 for snapshot/fast-backup purposes…

20/01/2009: Setting up the 09_ora1 virtual machine from the VM template 01_prac1. Installing a single 11g database named "db1.lab1" with a dedicated tablespace & user for sysbench version 0.4.8 (0.4.10 doesn't work with Oracle).

23/01/2009: Cloning ora2 from ora1. Changing hostnames and IPs (trick: the same Ethernet MACs, but on different XEN bridges; changes performed from the console :)). Uninstalling the 10g Agent, removing the database on ora2. Setting up a DNS server on quadvm (in XEN dom0) for the whole lab. Installing the GC agent on ora2 – something is wrong… the GC console doesn't pick up the new targets (primarily I'm looking for the "Host" component), although the agent itself is discovered by GC… so starting from the beginning (rm -rf 08_ora2 :o ) and so on…
Finally got ora3 up and running instead of ora2. Then it turned out that Laurent Schneider had blogged about a MetaLink note describing the agent removal procedure. So finally I've got ora2 back in GC (together with gc, ora1 and ora3).

The next step was setting up host prac1 for a single-instance non-RAC ASM database "db3.lab1". First, Clusterware was installed. I wanted the 32-bit version, because my future standby RAC hardware is only 32-bit capable, but it turned out I would have to install 32-bit userland RPMs, so I decided to go with a 64-bit primary RAC and a 32-bit standby RAC in the long term… The Grid Control agent was also installed on prac1.

24/01/2009: Raised RAM for 01_prac1 from 1.2 GB to 1.6 GB (due to frequent swapping; I want the 11g memory_target for that fresh db3.lab1 database to be left at 450 MB). Successful migration of ASM storage from /dev/hde to /dev/hdi (simulating a storage migration; thanks to ASM it is straightforward, and I'm going to blog about it in my next post; a rough sketch below).
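
In short, the add/drop approach looks roughly like this; a sketch only, assuming the diskgroup is named DATA1 and the old disk's ASM name is DATA1_0000 (the real details will be in the follow-up post):

ALTER DISKGROUP DATA1 ADD DISK '/dev/hdi';   -- exact candidate path depends on your asm_diskstring / partitioning
ALTER DISKGROUP DATA1 DROP DISK DATA1_0000;
-- ASM rebalances in the background; progress is visible here
SELECT operation, state, est_minutes FROM v$asm_operation;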