Archive for May, 2009

Storage migration for ASM database deployed on Oracle Enterprise Linux in Oracle VM without downtime.

Thursday, May 21st, 2009

Suppose you wanted to migrate your database from storage array SAN1 to SAN2 without downtime. With Oracle databases using ASM this is possible. It was performed on configuration better described here. One note: the LUN can be visible through dom0 or directly by domU (by passing PCI hardware handling into our domU VM) — this posts explains only first scenario, as this is more common scenario  Brief steps include:

  1. Prepare new LUNs on storage array (not described here)
  2. Attach new LUNs to the Oracle VM (not described here, simulated here by using simple zero-padded file created using dd utility; I expect this was performed earlier[scsi bus rescanning and so on] or file created in /OVS).
  3. Modifing VM’s XEN config file.
  4. Online attaching block device to VM.
  5. Preparing new storage device from inside of the target VM.
  6. Discovering new LUN in ASM.
  7. Actual rebalance process…
  8. Verification.

Step 3: Modify vm.cfg file for additional storage.

This is straighforward, just add one line. Do NOT restart the VM. There is no need.

[root@quadovm 01_prac1]# cat vm.cfg
bootloader = '/usr/bin/pygrub'
disk = ['file:/OVS/running_pool/01_prac1/system.img,hda,w',
'file:/OVS/running_pool/01_prac1/oracle_software.img,hdd,w',
'file:/OVS/running_pool/prac.storage.raw,hde,w!',
'file:/OVS/running_pool/prac.ocr,hdf,w!',
'file:/OVS/running_pool/prac.voting,hdg,w!'
<strong>'file:/OVS/running_pool/prac.storage2.raw,hdi,w!',</strong>
]
memory = 1638
name = '01_prac1'
[..]
[root@quadovm 01_prac1]#

Step 4: Attach block device to the running VM.

[root@quadovm 01_prac1]# xm block-attach 01_prac1 file:///OVS/running_pool/prac.storage2.raw /dev/hdi 'w!'
[root@quadovm 01_prac1]#

Step 5: Prepare prac1 VM for new device.

New added storage should be automatically detected, this can be verified by checking dmesg output:

[root@prac1 ~]# dmesg|grep hdi
hdi: unknown partition table
[root@prac1 ~]# ls -al /dev/hde /dev/hdi
brw-rw---- 1 root dba  33, 0 Jan 24 13:00 /dev/hde
brw-r----- 1 root disk 56, 0 Jan 24 12:59 /dev/hdi
[root@prac1 ~]#
[root@prac1 ~]# fdisk -l /dev/hd[ei] 2> /dev/null | grep GB
Disk /dev/hde: 15.7 GB, 15728640000 bytes
Disk /dev/hdi: 16.1 GB, 16106127360 bytes
[root@prac1 ~]#

As we can see new LUN is bigger (it should be bigger or equal, but I haven’t checked what happens if you add a smaller one). Now we have to assign correct permissions so that ASM/Database can use new /dev/hdi device without problems (this doesn’t include modifing udev rules in /etc/udev/, and it is required to make new devices come with right permissions after reboot — do your own home work:) ) :

[root@prac1 ~]# chgrp dba /dev/hdi
[root@prac1 ~]# chmod g+w /dev/hdi
[root@prac1 ~]#

Step 6: Preparing ASM for new disk.

Verification of current diskgroups and changing diskgroup DATA1 ASM_POWER_BALANCE to zero.

SQL> col name format a20
SQL> SELECT name, type, total_mb, free_mb, required_mirror_free_mb req_mirr_free, usable_file_mb FROM V$ASM_DISKGROUP;

NAME                 TYPE     TOTAL_MB    FREE_MB REQ_MIRR_FREE USABLE_FILE_MB
-------------------- ------ ---------- ---------- ------------- --------------
DATA1                EXTERN      15000      14143             0          14143

SQL> ALTER DISKGROUP DATA1 REBALANCE POWER 0 WAIT;

Diskgroup altered.

SQL>

Next we have to force ASM to discover new devices by modifing asm_diskstring parameter (I’m using IFILE for ASM, so I’ve to manually edit pfile for ASM. If I don’t, ASM won’t remember new settings after restarting).

SQL> show parameter string

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      /dev/hde

SQL>
SQL> alter system set asm_diskstring='/dev/hde', '/dev/hdi' scope=memory;

System altered.

SQL> show parameter string

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      /dev/hde, /dev/hdi

SQL>
[oracle@prac1 11.1.0]$ vi /u01/app/oracle/product/11.1.0/db_1/dbs/init+ASM1.ora
#Modify asm_diskstring here too
asm_diskstring='/dev/hde','/dev/hdi'

Step 7: The main part: ASM rebalance

SQL> ALTER DISKGROUP DATA1 ADD DISK '/dev/hdi';

Diskgroup altered.

SQL> SELECT GROUP_NUMBER, OPERATION, STATE FROM V$ASM_OPERATION;

GROUP_NUMBER OPERA STATE
------------ ----- --------------------
1 REBAL RUN

SQL> select name,path,state,failgroup from v$asm_disk;

NAME                 PATH            STATE                FAILGROUP
-------------------- --------------- -------------------- ----------
DATA1_0000     /dev/hde       NORMAL               DATA1_0000
DATA1_0001     /dev/hdi        NORMAL               DATA1_0001

SQL> ALTER DISKGROUP DATA1 DROP DISK DATA1_0000;

Diskgroup altered.

SQL> SELECT GROUP_NUMBER, OPERATION, STATE, EST_MINUTES FROM V$ASM_OPERATION;

GROUP_NUMBER  OPERA   STATE    EST_MINUTES
---------------------  ---------  ----------- -----------
1             REBAL    RUN        32 

SQL>

Typical snapshot of iostat right now (10 sec averages):

Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
hde               0.00     0.00 340.80  1.10    14.20     0.00    85.08     1.43    4.18   0.35  12.04
hdi               0.00     0.00  0.40 357.40     0.01    14.28    81.77    11.52   32.22   2.40  86.04

From other normal SQL session:

SQL> insert into t2(id) values ('2');

1 row created.

SQL> commit;

Commit complete.

SQL>

Back to the ASM instance:

SQL> ALTER DISKGROUP DATA1 REBALANCE POWER 11;

Step 8: Verification.

We’ll just execute some big heavy-intensive SQL statement to generate some IO (thnx to Tanel for blogging this query):

SQL> create table t4 as select rownum r from
(select rownum r from dual connect by rownum <= 1000) a,
(select rownum r from dual connect by rownum <= 1000) b,
(select rownum r from dual connect by rownum <= 1000) c
where rownum <= 100000000;

From iostat we can monitor that only hdi is used, which assures us that database is really using hdi.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
33.33    0.00   26.47    5.88    0.00   34.31

Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
hde               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
hdi               0.00     0.00 107.84 351.96     1.60    12.35    62.14     4.96   10.30   0.91  41.96

Graphing Oracle’s Shared Server configuration (V$QUEUE)

Thursday, May 21st, 2009

Tuning Oracle’s Shared Server(MTS) is not so straightforward task, but I’ve developed simple perl script named plotvqueue.pl that can draw over time the following info:
1) absolute COMMON waits – how many waits there were in the COMMON (VC) queue in interval time (1min)
2) no# of COMMON queued – number of items in the COMMON queue (serviced by shared servers)
3) absolute DISPATCHERS waits – how many waits there were for all the DISPATCHERs queues in interval time (1min)
4) no# of DISPATCHERS queued – number of items in the DISPATCHERs queues.

This can be used as a hint for raising dispatchers and/or shared_servers settings.

for detailed discussion be sure to read Metalink Doc.ID 1005259.6 (“Shared Server (MTS) Diagnosis”), especially get familiar with the following fragment about architecture:

sharedserverarchitecture

“The stages that Shared Server goes through are quite simple. After the client sends the connection request to the Listener, it will either redirect or hand off (called warm hand-off) the connection to the Dispatcher (the Dispatcher does not necessarily need to be on the same host as the Listener). Once the client has connected to a Dispatcher it stays connected to that Dispatcher. Before the client completes the database log in, the Dispatcher associates a Virtual Circuit (VC) for that database session. There exists exactly one row in the VC view (V$CIRCUIT) for each client connection. This view also shows the current status of the client’s VC. Once the VC has been associated with the database session, the client will complete the database logon by passing the username and password to the Dispatcher. This request, as part of the VC for that new session, will be placed in the Common Queue where the first available Shared Server will complete the logon. Once each phase of the logon has completed, the Shared Server will pass the VC back to the Dispatcher, which then passes the response back to the Client (this actually takes several round trips to the client, in just the same manner as if it was a Dedicated connection).

Once the logon has completed, the client starts a normal conversation with the database. When the client makes a request to the database, it is the Dispatcher that picks up this request from the operating system’s network protocol. The Dispatcher then determines which client session the request came from (remember that a Dispatcher can be configured for Connection Pooling and Multiplexing: see the Net Administration Guide for more information on those configurations), tags that sessions’ VC that there is a new message (there is also a pointer to that session buffer in the VC) and places the VC in the Common Queue. The CQ is serviced by the Shared Servers on a first-in-first-out basis. The next available Shared Server pulls the VC from the CQ and processes the request. Part of the VC structure is the identity of the Dispatcher that created it (and which client is connected to it). When the Shared Server is finished processing the request, it writes the output to the session buffer, changes the VC’s ownership back to the Dispatcher that created it, places the VC into that Dispatcher’s queue, and posts the Dispatcher it has something in its queue. The Dispatcher then checks its queue and sends what is in the session buffer through the operating system network protocol back to the Client.”

The process of drawing can be divided in 4 stages:
1) installation of this script and it’s requirements (DBD, DBI, instantclient, etc.)
2) creation of sampling tables (plotvqueue_*) — see script comments
3) real collecting the data, running PL/SQL statement, e.g. from screen(1) to sample 24h run of the database with 1 min interval
4) running this script to analyse data
Optional steps (to catch-up performance differences after changes):
5) online changing database parameters (max_shared_servers, etc.)
6) same as step #3
7) same as step #4

The following screenshots comes from real production system before and after tuning (raising max_shared_servers, shared_servers and dispatchers).
commonqueue_may12_13

dispqueue_may12_13

SHA1, SHA256, SHA512 in Oracle for free without using DBMS_CRYPTO

Thursday, May 21st, 2009

SHA1, SHA256, SHA512 in Oracle for free without using DBMS_CRYPTO! (yay! without Enterprise Edition!*) powered by GNU CRYPTO project

For detailed list of algorithms please consider this link. (much more than DBMS_CRYPTO in 11g, which requires you to buy Enterprise Edition).

[oracle@xeno src]$ ls -l
total 764
-rw-rw-r-- 1 vnull vnull    458 Mar  1 05:53 CommonHash.java
-rw-r--r-- 1 vnull vnull 598036 Mar  1 04:47 gnu-crypto.jar
-rw-r--r-- 1 vnull vnull  96430 Mar  1 04:47 javax-crypto.jar
-rw-r--r-- 1 vnull vnull  16969 Mar  1 04:47 javax-security.jar
-rw-rw-r-- 1 vnull vnull    214 Mar  1 05:27 Main.java
-rw-rw-r-- 1 vnull vnull    145 Mar  1 05:27 SHA1.java
-rw-rw-r-- 1 vnull vnull    152 Mar  1 05:18 SHA256.java
-rw-rw-r-- 1 vnull vnull    152 Mar  1 05:18 SHA512.java
[oracle@xeno src]$
[oracle@xeno src]$ loadjava -u vnull/*** -v -resolve *.java *.jar
arguments: '-u' 'vnull/***' '-v' '-resolve' 'CommonHash.java' 'Main.java' 'SHA1.java' 'SHA256.java' 'SHA512.java' 'gnu-crypto.jar' 'javax-crypto.jar' 'javax-security.jar'
[..]
Classes Loaded: 560
Resources Loaded: 3
Sources Loaded: 0
Published Interfaces: 0
Classes generated: 0
Classes skipped: 1
Synonyms Created: 0
Errors: 0
[oracle@xeno src]$

Now as SYSDBA:

SQL> GRANT CONNECT,RESOURCE,EXECUTE ON DBMS_CRYPTO, CREATE PROCEDURE TO vnull;
SQL> conn vnull/***
Connected.
SQL> CREATE OR REPLACE FUNCTION gnuhash_sha256 (string IN VARCHAR2) RETURN VARCHAR2 AS  LANGUAGE JAVA NAME 'SHA256.calcHash(java.lang.String) return java.lang.String';
  2  /

Function created.

SQL> CREATE OR REPLACE FUNCTION gnuhash_sha512 (string IN VARCHAR2) RETURN VARCHAR2 AS LANGUAGE JAVA NAME 'SHA512.calcHash(java.lang.String) return java.lang.String';
  2  /

Function created.

SQL> CREATE OR REPLACE FUNCTION gnuhash_sha1 (string IN VARCHAR2) RETURN VARCHAR2 AS LANGUAGE JAVA NAME 'SHA1.calcHash(java.lang.String) return java.lang.String';
  2  /

Function created.

SQL> select gnuhash_sha1('1234') from dual;

GNUHASH_SHA1('1234')
--------------------------------------------------------------------------------
7110eda4d09e062aa5e4a390b0a572ac0d2c0220

SQL> select gnuhash_sha256('1234') from dual;

GNUHASH_SHA256('1234')
--------------------------------------------------------------------------------
03ac674216f3e15c761ee1a5e255f067953623c8b388b4459e13f978d7c846f4

SQL> select gnuhash_sha512('1234') from dual;

GNUHASH_SHA512('1234')
--------------------------------------------------------------------------------
d404559f602eab6fd602ac7680dacbfaadd13630335e951f097af3900e9de176b6db28512f2e000b9d04fba5133e8b1c6e8df59db3a8ab9d60be4b97cc9e81db

SQL>

Verify results using OpenSSL :

[vnull@xeno ~]$ echo -n "1234" | openssl dgst -sha1
7110eda4d09e062aa5e4a390b0a572ac0d2c0220
[vnull@xeno ~]$ echo -n "1234" | openssl dgst -sha256
03ac674216f3e15c761ee1a5e255f067953623c8b388b4459e13f978d7c846f4
[vnull@xeno ~]$ echo -n "1234" | openssl dgst -sha512
d404559f602eab6fd602ac7680dacbfaadd13630335e951f097af3900e9de176b6db28512f2e000b9d04fba5133e8b1c6e8df59db3a8ab9d60be4b97cc9e81db
[vnull@xeno ~]$

A little bonus, performance verification: DBMS_CRYPTO from 11.1.0.6 versus GNU.CRYPTO.HASH Java library running in JVM in Oracle (oracle_sha1 vs gnuhash_sha1, Oracle does not support SHA-2 standard yet, only SHA1=160 bits).

SQL> CREATE OR REPLACE FUNCTION oracle_sha1 (string IN VARCHAR2) RETURN VARCHAR2 AS BEGIN
  2  RETURN sys.dbms_crypto.hash(UTL_I18N.STRING_TO_RAW ('1234','AL32UTF8'),
  3  sys.dbms_crypto.hash_sh1);
  4  END;
  5  /

Function created.

SQL> select oracle_sha1('1234') from dual;

ORACLE_SHA1('1234')
--------------------------------------------------------------------------------
7110EDA4D09E062AA5E4A390B0A572AC0D2C0220

SQL>

From this quick & dirty test you can see there is only 4% performance difference between native DBMS_CRYPTO and GNU_HASH…

benchmark

* = UPDATE(12/11/2013): please verify with your Oracle Sales representative. Depending on who you ask and how you ask you may get different answer. DBMS_CRYPTO doesn’t seem to be licensed as ASO anymore, same for SSL encryption for RAC databases even for SE, so in those scenarios if you dont want to have encrypted data on the write you may have less complicated alternatives, YMMV.

Fincore – how to monitor VM cache (A.K.A. what’s inside)

Thursday, May 21st, 2009

Back in November 2008, I’ve asked question how to monitor VM cache performance (cache hit ratio) under Linux on Kevin Closson’s blog. Now I’ve just found utility that allows to show pretty nicely what *is* in VM cache. The utility is called “fincore”, it is written in Python and has some strange Python/perl dependecies, but it works:

[oracle@xeno test]$ fincore -justsummarize *.dbf
page size: 4096 bytes
0 pages, 0.0  bytes in core for 10 files; 0.00 pages, 0.0  bytes per file.
[oracle@xeno test]$ echo $ORACLE_SID
oceperf
[oracle@xeno test]$ sqlplus "/ as sysdba"

SQL*Plus: Release 11.1.0.6.0 - Production on Tue Mar 17 10:35:01 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup;
ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1301112 bytes
Variable Size             348128648 bytes
Database Buffers          180355072 bytes
Redo Buffers                5877760 bytes
Database mounted.
Database opened.
SQL> show parameter filesystem

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
filesystemio_options                 string      none
SQL>


So let’s verify again (11g on Linux uses O_DIRECT IO by default).

[oracle@xeno test]$ fincore -justsummarize *.dbf
page size: 4096 bytes
0 pages, 0.0  bytes in core for 10 files; 0.00 pages, 0.0  bytes per file.
[oracle@xeno test]$

Use those DBF files with VM cache (trick):

[oracle@xeno test]$ cat *.dbf > /dev/null
[oracle@xeno test]$ fincore -justsummarize *.dbf
page size: 4096 bytes
154788 pages, 604.6 Mbytes in core for 10 files; 15478.80 pages, 60.5 Mbytes per file.
[oracle@xeno test]$

But not all is cached:

[oracle@xeno test]$ du -sh *.dbf
5.1M    ble.dbf
431M    sysaux01.dbf
551M    system01.dbf
131M    temp01.dbf
3.2M    temp02.dbf
226M    undotbs01.dbf
33M     undotbs2.dbf
33M     undotbs3.dbf
21M     uniform2_01.dbf
348M    users01.dbf
[oracle@xeno test]$
[oracle@xeno test]$ fincore -justsummarize undotbs3.dbf
page size: 4096 bytes
3029 pages, 11.8 Mbytes in core for 1 file; 3029.00 pages, 11.8 Mbytes per file.
[oracle@xeno test]$ ls -alh undotbs3.dbf
-rw-r----- 1 oracle oinstall 33M Feb 27 14:34 undotbs3.dbf
[oracle@xeno test]$

And we have total ~917 MB cached in VM:

[oracle@xeno test]$ grep ^Cached /proc/meminfo
Cached:         939208 kB
[oracle@xeno test]$

The fincore uses mincore(2) syscall, and it appeared in the old days of 2.3.99pre1, so it should work on most of your old boxes, provided that somebody would get rid of that python dependency ;)