Part 5: Oracle SecureFile LOB is better than BasicFile, isn’t it?

In this part I’ll try to discuss one of the biggest performance problems with BasicFile LOBs – "enq: HW - contention" (a wait event that belongs to the "Configuration" class). It is often observed on systems with concurrent INSERTs into BasicFile LOBs with data bigger than ~3960 bytes (or with LOBs defined with DISABLE STORAGE IN ROW); it is essentially a common name for everything related to space allocation (allocating/formatting the next extent for the LOBSEGMENT itself, or looking for free space and reclaiming it). More research typically ends up in reading the following engineering notes on Metalink:

  • LOB HWM CONTENTION :Using AWR Reports to Identify the Problem; Confirm and Verify the Fix (Doc ID 837883.1)
  • Bug 9801919 “enq: HW – contention” against segments that add an extent frequently during high concurrency
  • ‘enq HW – contention’ For Busy LOB Segment (Doc ID 740075.1)
  • BUG:4867884 LOB HW LOCK CONTENTION WITH SPACE RECLAIMATION
  • Bug 6376915 HW enqueue contention for ASSM LOB segments
  • Bug 7319120 HW contention for BasicFile LOBs – special workaround

Especially interesting is the last one, Bug 7319120, where the workaround is stated as “The preferred option for such contention on basicfile LOBs is to migrate the LOBS to Securefiles in RDBMS 11g or later. Securefile LOBs have been designed for better concurrency“, but as I’ve presented in earlier posts, it is not as beautiful as it sounds (huge problems with wasting space with SecureFiles on early 11.2.0.2.x releases and, surprisingly, the same on at least <= 11.2.0.4.2 as of this writing). Most of the remaining bugs are already fixed starting with 11.2.0.3.0, but, as stated by Oracle, SecureFiles are "the way forward".

To the best of my knowledge, typical solutions to "enq: HW - contention" usually include one of the following (see the sketch after the list):

  1. properly setting the LOB cache vs nocache (default) option and tuning I/O – the HW enqueue for LOBs seems to be affected by storage operations, as will be shown later
  2. a proper (hash) partitioning scheme on the affected table/LOB to spread the contention
  3. altering the LOB chunk size (increasing the default 8k to something like 16k/32k often gives slightly better results – but this is very workload specific)
  4. setting event 44951
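
Below is a minimal sketch of options 1-3; the table/column names (t_queue, t_queue_part, payload) are made up for illustration, and option 4 is covered in its own section further down:

-- 1. Switch the LOB from the NOCACHE default to CACHE:
ALTER TABLE t_queue MODIFY LOB (payload) (CACHE);

-- 2 + 3. Recreate the table with hash partitioning (one LOB segment, and so one
--        HW enqueue, per partition) and a larger, non-default chunk size:
CREATE TABLE t_queue_part (
  id      NUMBER,
  payload BLOB
)
LOB (payload) STORE AS BASICFILE (NOCACHE CHUNK 16384)
PARTITION BY HASH (id) PARTITIONS 8;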

Test env specification
Armed with that knowledge I’ve decided to really test that interesting event for my specific artificial workload. By this I mean version 11.2.0.3.8, a BasicFile LOB, NOCACHE, PCTVERSION 0, LOB sizes between 6000 and 10000 bytes, 7 concurrent INSERT and DELETE jobs, preceded by prefilling the LOBs and then DELETEing them, to simulate the non-ideal, non-clean conditions of a table (ab)used as a queue. The DB was in NOARCHIVELOG mode and flashback was disabled. The only difference from the previous tests was the storage being a RAID1 of just 2x SAS 15k RPM drives (shared between data and redo) instead of a ramdisk (I needed that I/O latency for enq: HW to become visible in the Top5).
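
A sketch of the kind of table used in the test (names are my assumptions, not the exact DDL from the test harness):

CREATE TABLE lob_queue (
  id      NUMBER,
  payload BLOB
)
LOB (payload) STORE AS BASICFILE (
  NOCACHE
  PCTVERSION 0
);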

HW contention root cause
The even more interesting fact is that INSERTs into the LOB (with the related LOB extent allocation) are masked by enq: HW - contention, which (when you look via V$WAIT_CHAINS) is actually waiting on “db file sequential read”. The chains below were pulled from V$WAIT_CHAINS:
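
A minimal sketch of such a query (the exact column list and aliases I used are not reproduced here, so treat this as an approximation):

SELECT chain_signature,
       sid || ',' || sess_serial# AS who,
       wait_event_text,
       blocker_sid || ',' || blocker_sess_serial# AS blocked_by,
       p1, p1_text, p2, p2_text, p3, p3_text
FROM   v$wait_chains;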

CHAIN_SIGNATURE                                    WHO             WAIT_EVENT_TEXT              BLOCKED_BY              P1 P1_TEXT                 P2 P2_TEXT                 P3 P3_TEXT
-------------------------------------------------- --------------- ---------------------------- --------------- ---------- --------------- ---------- --------------- ---------- ---------------
'db file sequential read'<='enq: HW - contention'  129,571         enq: HW - contention         160,25          1213661190 name|mode                5 table space #     12583050 block
'db file sequential read'<='enq: HW - contention'  160,25          db file sequential read      ,                        3 file#                 4514 block#                   1 blocks
'db file sequential read'<='enq: HW - contention'  6,7             enq: HW - contention         160,25          1213661190 name|mode                5 table space #     12583050 block
'db file sequential read'<='enq: HW - contention'  33,9            enq: HW - contention         160,25          1213661190 name|mode                5 table space #     12583050 block
'db file sequential read'<='enq: HW - contention'  65,9            enq: HW - contention         160,25          1213661190 name|mode                5 table space #     12583050 block
'db file sequential read'<='enq: HW - contention'  190,13          enq: HW - contention         160,25          1213661190 name|mode                5 table space #     12583050 block
'db file sequential read'<='enq: HW - contention'  223,9           enq: HW - contention         160,25          1213661190 name|mode                5 table space #     12583050 block
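
As a side note, the file#/block# reported by the blocker’s “db file sequential read” (file 3, block 4514 above) can be mapped back to its segment with a query along these lines (a sketch; on a busy system DBA_EXTENTS can be slow to query):

SELECT owner, segment_name, segment_type
FROM   dba_extents
WHERE  file_id = 3
AND    4514 BETWEEN block_id AND block_id + blocks - 1;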

In the above case 6 sessions were waiting on enq: HW - contention (looking for free space for their INSERTs) for the 7th session (SID=160, SERIAL#=25), which was actually searching for/reclaiming the space – and that 7th session was doing a single block (8kB) physical DB I/O read. The Top5 section looked similar to this:

ELAPSED
---------------------------------------------------------------------------
+000000000 00:24:54.463960000 

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: HW - contention                 62,717         918     15   32.2 Configurat
direct path write                    93,156         277      3    9.7 User I/O
db file sequential read              64,995         199      3    7.0 User I/O
DB CPU                                               91           3.2
buffer busy waits                    26,578           1      0     .0 Concurrenc

and the statistics most useful for us (look at the physical direct writes for LOBs):

Statistic                                     Total     per Second     per Trans
-------------------------------- ------------------ -------------- -------------
[..]
lob reads                                    63,000          188.8           1.0
lob writes                                  315,000          943.7           5.0
lob writes unaligned                        314,990          943.7           5.0
physical read IO requests                    64,851          194.3           1.0
physical read bytes                     531,267,584    1,591,684.2       8,428.1
physical read total IO requests              65,656          196.7           1.0
physical read total bytes               544,384,000    1,630,981.2       8,636.2
physical read total multi block                   0            0.0           0.0
physical reads                               64,852          194.3           1.0
physical reads cache                            558            1.7           0.0
physical reads cache prefetch                     1            0.0           0.0
physical reads direct                        64,294          192.6           1.0
physical reads direct temporary                   0            0.0           0.0
physical write IO requests                  105,628          316.5           1.7
physical write bytes                  1,036,320,768    3,104,829.8      16,440.4
physical write total IO requests            170,303          510.2           2.7
physical write total bytes            1,339,469,824    4,013,068.1      21,249.6
physical write total multi block                 43            0.1           0.0
physical writes                             126,504          379.0           2.0
physical writes direct                      125,324          375.5           2.0
physical writes direct (lob)                125,316          375.5           2.0
physical writes from cache                    1,180            3.5           0.0
physical writes non checkpoint              126,504          379.0           2.0
[..]

Below are the data after changing the LOB to CACHE, which resulted in a drastic reduction of enq: HW - contention (as the dependent I/O is faster because it is usually cached), and V$WAIT_CHAINS shows many wait chains like the following (notice the lack of “db file sequential read”):

  • 'log file parallel write'<='log buffer space'<='buffer busy waits'
  • 'log file parallel write'<='log buffer space'
  • 'enq: HW - contention'<='enq: HW - contention'

while the Top5 now shows a different picture (the main bottleneck has moved to the redo subsystem):

ELAPSED
---------------------------------------------------------------------------
+000000000 00:06:29.795479000 

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
log buffer space                      1,542         116     75   15.6 Configurat
enq: HW - contention                 58,609         110      2   14.8 Configurat
buffer busy waits                   125,436          99      1   13.3 Concurrenc
DB CPU                                               91          12.3
log file switch (checkpoint in            6          10   1620    1.3 Configurat

and the corresponding statistics:

Statistic                                     Total     per Second     per Trans
-------------------------------- ------------------ -------------- -------------
[..]
lob reads                                    63,000          347.3           1.0
lob writes                                  315,009        1,736.4           5.0
physical read IO requests                       629            3.5           0.0
physical read bytes                       5,619,712       30,976.6          89.1
physical read total IO requests               1,629            9.0           0.0
physical read total bytes                21,804,032      120,186.7         345.8
physical read total multi block                   0            0.0           0.0
physical reads                                  686            3.8           0.0
physical reads cache                            661            3.6           0.0
physical reads cache prefetch                    57            0.3           0.0
physical reads direct                            25            0.1           0.0
physical reads direct temporary                   0            0.0           0.0
physical write IO requests                   16,462           90.7           0.3
physical write bytes                    234,405,888    1,292,076.2       3,718.0
physical write total IO requests             19,116          105.4           0.3
physical write total bytes            1,399,786,496    7,715,808.2      22,202.6
physical write total multi block              2,341           12.9           0.0
physical writes                              28,614          157.7           0.5
physical writes direct                           25            0.1           0.0
physical writes direct (lob)                      0            0.0           0.0
physical writes from cache                   28,589          157.6           0.5
physical writes non checkpoint               16,153           89.0           0.3
[..]

So in this case changing to CACHE reduced the run time of the 7 jobs from almost 25 minutes to about 6 minutes. You might also notice a rather huge decrease in direct physical random writes, which is almost always an excellent result for most disk based systems (those blocks will almost certainly be flushed out by DBWR later using much more optimized/sequential writes).

The mysterious event 44951
The mentioned event is described as follows: “causes ASSM LOB space to batch up reclaim instead of just reclaiming the requested/required number of LOB chunks. To enable this fix, set event 44951 to the maximum number of chunks that you would like to have reclaimed per attempt. The maximum allowed is 1024. Anything larger becomes 1024. However, if the requested amount of space is larger than the event’s value, the reclaim will be for the requested amount of space.”
It is interesting that several years ago people complained when they migrated LOBs from MSSM to ASSM, and in a different source we find: “With Manual Segment Space Management, this value cannot be altered and is fixed at 128.“ Somewhere else it is written: “The fix for this problem introduces a *mandatory event* which must be set in order for the ASSM architectural change leading to more efficient allocation of LOB chunks during high load and transaction contention when Extent allocation is needed. The event is *not* set even after you apply the patch.”
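
A sketch of how such an event can be set (the exact method used between my test runs is not reproduced here; EVENT is a static parameter, so an instance restart is needed for it to take effect):

ALTER SYSTEM SET EVENT='44951 TRACE NAME CONTEXT FOREVER, LEVEL 3' SCOPE=SPFILE;
-- restart the instance, re-run the workload, then repeat with LEVEL 16, 32, 512 ...
-- and to clear it again afterwards:
ALTER SYSTEM RESET EVENT SCOPE=SPFILE SID='*';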

So here are the results with a NOCACHE LOB:

event 44951 not set:

ELAPSED
---------------------------------------------------------------------------
+000000000 00:21:57.406259000

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: HW - contention                 62,591         787     13   39.3 Configurat
direct path write                    90,263         252      3   12.6 User I/O
db file sequential read              64,616         172      3    8.6 User I/O
DB CPU                                              111           5.6
buffer busy waits                    24,388           1      0     .0 Concurrenc

event 44951 set to 3:

ELAPSED
---------------------------------------------------------------------------
+000000000 00:18:33.058539000  

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: HW - contention                 41,254         585     14   47.3 Configurat
direct path write                    86,039         285      3   23.0 User I/O
db file sequential read              43,801         144      3   11.6 User I/O
DB CPU                                               88           7.1
buffer busy waits                    39,976           1      0     .1 Concurrenc

event 44951 set to 16:

ELAPSED
---------------------------------------------------------------------------
+000000000 00:22:31.832235000   

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
direct path write                   107,809         911      8   34.9 User I/O
db file sequential read              12,804         185     14    7.1 User I/O
enq: HW - contention                  3,376         177     52    6.8 Configurat
DB CPU                                               81           3.1
log file switch (private stran            4           1    323     .0 Configurat

event 44951 set to 32:

ELAPSED
---------------------------------------------------------------------------
+000000000 00:49:27.268629000   

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
direct path write                   111,440       1,435     13   24.5 User I/O
enq: HW - contention                  2,597       1,027    395   17.5 Configurat
db file sequential read              20,339         429     21    7.3 User I/O
DB CPU                                               79           1.4
log file switch (private stran            5           2    474     .0 Configurat

event 44951 set to 512:

ELAPSED
---------------------------------------------------------------------------
+000000000 00:38:45.176198000   

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
direct path write                   116,908       1,833     16   40.2 User I/O
db file sequential read              17,262         354     20    7.8 User I/O
DB CPU                                               78           1.7
enq: HW - contention                     23          64   2802    1.4 Configurat
log file switch (private stran            6           3    473     .1 Configurat

So apparently the most interesting result (for LOB reclamation) in my case is the one with the event set to 3 (chunks). I could test more, but time for this research was the most limiting constraint. When the tablespace and LOB are created with defaults the chunk size is 8kB, so in this case a level of 3 means 8kB*3=24kB of I/O per reclamation attempt. Apparently, the bigger the event level, the more I/O work for the potential reclamation is done by the single session requiring more space; all of this is hidden under enq: HW - contention on the LOB segment, but for the *other sessions*. The CACHE LOB scenario combined with a properly tuned event level would probably result in an even better processing time.

SUMMARY
My conclusion is that putting enq: HW - contention into the single “Configuration” event class visible in Grid/Cloud Control is a little bit unfortunate, because it may also be related to I/O storage performance, and it can even be affected by the current db cache contents (which may or may not contain the LOBSEGMENT blocks required for the reclamation). The cache contents are of course a function of time and workload, and the physical I/O issued by the DB itself might additionally be served from a SAN cache.
