
Deadlock between DB gathering stats job and global index deferred cleanup (new feature in 12c)

Monday, April 10th, 2017

I’m not going to explain again what Richard explained very nicely here:

but the feature in question is not completely bug-free, as evidenced below:

ORA-12012: error on auto execute of job "SYS"."PMO_DEFERRED_GIDX_MAINT_JOB"
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "SYS.DBMS_PART", line 131
ORA-06512: at line 1

Yes, a deadlock between two jobs: the stats gathering job and the global index cleanup job (sic!):

  1. current sql: alter index "CENSORED" modify partition "PART_1200_1400" coalesce cleanup
  2. current sql: call dbms_stats.gather_database_stats_job_proc ( )

We’ve been unable to reproduce it on a testing environment. However, it is worth noting that we run with a parallel degree for stats gathering (which might be an important factor) and that the table in question also uses incremental stats gathering (but I don’t think this is a factor). The deadlock happened regularly after one of our cleanup jobs finished with many "alter table X drop partition update global indexes" statements, so the full sequence is as follows:

1. sometime earlier – many "alter table X drop partition update global indexes" (the orphaned index entries this leaves behind can be checked as sketched below)
2. later, a concurrent run of these two jobs:
2a. PMO_DEFERRED_GIDX_MAINT_JOB
2b. GATHER_DATABASE_STATS
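
As a side note on step 1, here is a hedged sketch (using the 12c ORPHANED_ENTRIES column of DBA_INDEXES; nothing here is specific to our schema) of how to check which global indexes still carry orphaned entries and are therefore on the deferred cleanup job's to-do list:

-- hedged sketch: global indexes left with orphaned entries by
-- "drop partition ... update global indexes" (12c deferred maintenance)
select owner, index_name, orphaned_entries, status
  from dba_indexes
 where orphaned_entries = 'YES';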

The workaround seems to be to reschedule one of the jobs from step #2.
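
A minimal sketch of that workaround, assuming the stock scheduler job name from the error above; the calendar string is only an example, pick anything that does not overlap your maintenance window:

-- hedged sketch: move the deferred global index cleanup job away from the
-- window that runs GATHER_DATABASE_STATS (example repeat_interval, not a recommendation)
begin
  dbms_scheduler.set_attribute(
    name      => 'SYS.PMO_DEFERRED_GIDX_MAINT_JOB',
    attribute => 'repeat_interval',
    value     => 'FREQ=DAILY;BYHOUR=9;BYMINUTE=0;BYSECOND=0');
end;
/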

The DB stats gathering job failed because it was waiting on a row cache lock:

is waiting for 'row cache lock' with wait info:
{
p1: 'cache id'=0x8
p2: 'mode'=0x0
p3: 'request'=0x5
time in wait: 6 min 19 sec
timeout after: never
wait id: 24771
blocking: 1 session
current sql: call dbms_stats.gather_database_stats_job_proc ( )
wait history:
* time between current wait and wait #1: 0.000956 sec
1. event: 'latch: call allocation'
time waited: 0.000144 sec
wait id: 24770 p1: 'address'=0x6000b350
p2: 'number'=0xc
p3: 'tries'=0x0
* time between wait #1 and #2: 0.000135 sec
2. event: 'enq: PS - contention'
time waited: 0.000113 sec
wait id: 24769 p1: 'name|mode'=0x50530006
p2: 'instance'=0x1
p3: 'slave ID'=0x7
* time between wait #2 and #3: 0.000012 sec
3. event: 'PX Deq: Slave Session Stats'
time waited: 0.000002 sec
wait id: 24768 p1: 'sleeptime/senderid'=0x0
p2: 'passes'=0x0
}

and the index cleanup job was waiting on a library cache lock:

which is waiting for 'library cache lock' with wait info:
{
p1: 'handle address'=0x321d7c6138
p2: 'lock address'=0x2f8fcd88f8
p3: '100*mode+namespace'=0x597c400040002
time in wait: 2.551053 sec
timeout after: 14 min 57 sec
wait id: 354679
blocking: 1 session
current sql: alter index "CENSORED" modify partition "PART_1200_1400" coalesce cleanup
wait history:
* time between current wait and wait #1: 0.000024 sec
1. event: 'db file sequential read'
time waited: 0.000133 sec
wait id: 354678 p1: 'file#'=0xd
p2: 'block#'=0x57d5e0a
p3: 'blocks'=0x1
* time between wait #1 and #2: 0.000016 sec
2. event: 'db file sequential read'
time waited: 0.000142 sec
wait id: 354677 p1: 'file#'=0xd
p2: 'block#'=0x57d5546
p3: 'blocks'=0x1
* time between wait #2 and #3: 0.000016 sec
3. event: 'db file sequential read'
time waited: 0.000129 sec
wait id: 354676 p1: 'file#'=0xd
p2: 'block#'=0x57d48e9
p3: 'blocks'=0x1
}
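
For completeness, a hedged sketch of a plain V$SESSION query (nothing job-specific) that can be used to confirm the two sessions are indeed blocking each other while both jobs are running:

-- hedged sketch: who waits on what, and who blocks whom
select sid, serial#, event, blocking_session, seconds_in_wait, sql_id
  from v$session
 where event in ('row cache lock', 'library cache lock');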

Measuring I/O waste on incremental statistics gathering on partitioned tables on 11gR2 (STALE_PERCENT does not work with partitions)

Wednesday, July 15th, 2015

The inspiration for this article is this post: http://progeeking.com/2015/03/09/optimizer-statistics-stale-percentage-and-partitioning/ Please be sure to read it first. The main difference here is that I look at exactly what reads DBMS_STATS performs in the INCREMENTAL=TRUE scenario.

Apparently it is a well-known fact that STALE_PERCENT does not work with (sub)partitions on 11gR2; however, one might be interested in knowing exactly how much additional I/O DBMS_STATS can generate due to this “bug”/missing feature. I’ve used the SQL script below to measure this on a fairly recent 11.2.0.4.x release:

-- prepare start
drop table stats_test purge;
exec dbms_stats.set_global_prefs('trace',to_char(0));

create table stats_test (
  id number primary key,
  merchant_id number not null,
  accounting_date date not null,
  somedata varchar2(128)
)
partition by range (merchant_id) interval (1)
subpartition by hash (accounting_date) SUBPARTITIONS 8
(partition p0 values less than (1))
;

insert into stats_test select rownum, MOD(rownum, 4), sysdate-(rownum/86400), 'blablabla'  from
  (select rownum r from dual connect by rownum <= 1000) a,
  (select rownum r from dual connect by rownum <= 1000) b,
  (select rownum r from dual connect by rownum <= 1000) c
where rownum <= 100000;

create index stats_test_idx1 on stats_test (somedata) local;

exec dbms_stats.set_table_prefs('JWARTAK', 'STATS_TEST', 'INCREMENTAL', 'TRUE');
-- just to be sure; it seems that ALL doesn't work with INCREMENTAL=TRUE,
-- so with the AUTO setting you end up with no subpartition stats
-- you can tweak between AUTO and ALL here
exec dbms_stats.set_table_prefs('JWARTAK', 'STATS_TEST', 'GRANULARITY', 'ALL');
-- just to be sure
exec dbms_stats.set_table_prefs('JWARTAK', 'STATS_TEST', 'ESTIMATE_PERCENT', DBMS_STATS.AUTO_SAMPLE_SIZE); 

-- 10% by default but just to be sure
select DBMS_STATS.GET_PREFS ('STALE_PERCENT','JWARTAK','STATS_TEST') stale_percent from dual;

-- just in case
exec dbms_stats.gather_table_stats('JWARTAK', 'STATS_TEST', cascade=>true);
exec dbms_stats.gather_table_stats('JWARTAK', 'STATS_TEST', cascade=>true);
-- prepare end

-- experiment start, see STALE_STATS
select PARTITION_NAME,SUBPARTITION_NAME,SUBPARTITION_POSITION, STALE_STATS, to_char(LAST_ANALYZED, 'HH24:MI DD/MM/YYYY') last_analyzed from user_tab_statistics where table_name='STATS_TEST' order by PARTITION_NAME, SUBPARTITION_POSITION;

select count(*) from stats_test;
select count(*) from stats_test where MERCHANT_ID=1;
-- just add 1 row (so 25000 -> 25001 rows in the partition; the 1 row also adds one row to a subpartition - so now 1 out of 8 subpartitions has 3126 rows)
-- 3125 rows -> 3126 rows in a subpartition is enough to trigger a whole-partition dbms_stats scan
insert into stats_test (id, merchant_id, accounting_date, somedata) values (110012, 1, sysdate, 'blah');
commit;

-- just in case
exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

-- at this point all STALE_STATS = null (subpartitions) or =NO (undetected change)
select PARTITION_NAME,SUBPARTITION_NAME,SUBPARTITION_POSITION, STALE_STATS, to_char(LAST_ANALYZED, 'HH24:MI DD/MM/YYYY') last_analyzed from user_tab_statistics where table_name='STATS_TEST' order by PARTITION_NAME, SUBPARTITION_POSITION;

-- 1st run, you will get some dirty results
exec dbms_stats.gather_table_stats('JWARTAK', 'STATS_TEST');

-- another run to see the difference,
exec dbms_stats.gather_table_stats('JWARTAK', 'STATS_TEST');

-- enable debug mode
set serveroutput on
--exec dbms_stats.set_global_prefs('trace',to_char(1+4+8+16+64+2048+4096+8192));
exec dbms_stats.set_global_prefs('trace',to_char(4+8+16+64+2048+4096+8192));

-- repeat the insert (now 25001 -> 25002 rows in the partition)
insert into stats_test (id, merchant_id, accounting_date, somedata) values (110016, 1, sysdate, 'blah');
commit;
-- you can comment this out, at least on 11.2.0.4.x there is no difference
exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

-- 2nd run, >>> REAL RESULT <<<
exec dbms_stats.gather_table_stats('JWARTAK', 'STATS_TEST');

-- EXPERIMENT END

As you can see from the above, there are three main variables here:
- DBMS_STATS granularity – you can set your own ALL, AUTO or any other value to perform the measurements
- trace level – based on http://www.pythian.com/blog/options-for-tracing-oracle-dbms_stats/ – the most interesting choice is where you want the output to go (diagnostic directory or the DBMS_OUTPUT buffer)
- FLUSH_DATABASE_MONITORING_INFO – you may want to flush it or not (a quick check of what it records is sketched below)
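
As a side check (a hedged sketch, not part of the script above), after FLUSH_DATABASE_MONITORING_INFO you can inspect what table monitoring has actually recorded for the test table:

-- hedged sketch: the change tracking that drives the staleness decisions;
-- USER_TAB_MODIFICATIONS is only populated once the monitoring info is flushed
select table_name, partition_name, subpartition_name,
       inserts, updates, deletes, timestamp
  from user_tab_modifications
 where table_name = 'STATS_TEST'
 order by partition_name, subpartition_name;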

The result (at least on 11.2.0.4.x) is that after adding just a single row to a single (sub)partition, even with INCREMENTAL=TRUE, you’ll get:

a) with GRANULARITY=AUTO (meaning table and partition stats but no dedicated subpartition stats; in most cases this will be “GLOBAL AND PARTITION”):

somewhere:~$ grep -i -e 'Started index ' -e stale -e 'need to be updated' -e 'gather index ' -e query /tmp/run    | grep -v Ending
DBMS_STATS: gather stats on partition SYS_P4061815: stats are stale;
DBMS_STATS: Starting query at 15-JUL-15 01.40.21.747485000 PM +02:00
DBMS_STATS: Starting query at 15-JUL-15 01.40.21.790532000 PM +02:00
DBMS_STATS: Started index JWARTAK.SYS_C005852548 at 15-JUL-15 01.40.21.899589000 PM +02:00 granularity: GLOBAL AND PARTITION gIdxGran:
DBMS_STATS: Starting query at 15-JUL-15 01.40.21.902620000 PM +02:00
DBMS_STATS: Started index JWARTAK.STATS_TEST_IDX1 at 15-JUL-15 01.40.21.974521000 PM +02:00 granularity: GLOBAL AND PARTITION gIdxGran:
DBMS_STATS: gather index stats STATS_TEST_IDX1onSYS_P4061834:
DBMS_STATS: Starting query at 15-JUL-15 01.40.21.978291000 PM +02:00
DBMS_STATS: Starting query at 15-JUL-15 01.40.22.000544000 PM +02:00
somewhere:~$

So this translates to:

  • 2x sampled FULL “PARTITION” SCANs against the single partition where the new row has been added
  • 1x sampled FULL INDEX SCAN on the PK index
  • 2x sampled FULL INDEX SCANs against the changed LOCAL index partition where the new row has been added
  • certainly 1 new row out of ~25000 isn’t a 10% change on the partition (that would require ~2,500 changed rows)

b) with GRANULARITY=ALL:

somewhere:~$ grep -i -e 'Started index ' -e stale -e 'need to be updated' -e 'gather index ' -e query /tmp/run_gALL    | grep -v Ending
DBMS_STATS: gather stats on subpartition SYS_SUBP4061878:stats are stale
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.166410000 PM +02:00
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.180291000 PM +02:00
DBMS_STATS: gather stats on partition SYS_P4061886: stats are stale;
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.356143000 PM +02:00
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.405171000 PM +02:00
DBMS_STATS: Started index JWARTAK.SYS_C005852574 at 15-JUL-15 01.46.57.522214000 PM +02:00 granularity: ALL gIdxGran:
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.524947000 PM +02:00
DBMS_STATS: Started index JWARTAK.STATS_TEST_IDX1 at 15-JUL-15 01.46.57.593190000 PM +02:00 granularity: ALL gIdxGran:
DBMS_STATS: gather index statsSTATS_TEST_IDX1 on SYS_SUBP4061878:stats are stale
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.596544000 PM +02:00
DBMS_STATS: gather index stats STATS_TEST_IDX1onSYS_P4061905:stats are stale
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.601476000 PM +02:00
DBMS_STATS: Starting query at 15-JUL-15 01.46.57.622425000 PM +02:00
somewhere:~$
  • 2x sampled FULL “SUBPARTITION” SCANs against the single subpartition where the new row has been added
  • 2x sampled FULL “PARTITION” SCANs against the single partition where the new row has been added
  • 1x sampled FULL INDEX SCAN on the PK index
  • 1x sampled FULL INDEX SCAN against the changed LOCAL index subpartition where the new row has been added
  • 2x sampled FULL INDEX SCANs against the changed LOCAL index partition where the new row has been added
  • certainly 1 new row out of ~25000 isn’t a 10% change on the partition or subpartition here

SUMMARY: With a single row added, deleted, or merely updated in some random subpartition, on 11gR2 we can cause an avalanche of I/O reads due to DBMS_STATS scanning unnecessary segments. When you start to think about table sizes above 1-2 TB combined with a weak I/O subsystem and rather “big” indexes (e.g. 100-200 GB), you can quickly run out of the MAINTENANCE_WINDOW schedule due to this overhead (in the extreme it could be several hundred GB scanned by DBMS_STATS because 1 byte changed ;) ).

Some other interesting results:

  1. as demonstrated above, STALE_PERCENT does not work against (sub)partitions in 11gR2
  2. since 12cR1 you have access to the INCREMENTAL + INCREMENTAL_STALENESS (new feature) combo, where you can set INCREMENTAL_STALENESS to USE_STALE_PERCENT and trigger stats collection for (sub)partitions only after a certain threshold of changed rows is reached (see the sketch after this list)
  3. apparently, and contrary to the documentation, INCREMENTAL=TRUE works with GRANULARITY=ALL on the 11.2.0.4.x series (at least in this test)
  4. the STALE_STATS column from {user|all|dba}_{tab|ind}_statistics is useless for partitioned tables (flushing does not help anyway); the DBMS_STATS GATHER ... options=>'LIST STALE' approach is much better for this purpose (also sketched after this list)
  5. there is an interesting granularity option, APPROX_GLOBAL AND PARTITION, but I’m not sure whether it is a 10.2/11.1 historical artifact as I did not have enough time to test it in detail
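
A hedged sketch of points 2 and 4 (the first statement is 12cR1+ syntax and does not apply to the 11.2.0.4.x test above; both reuse the STATS_TEST table purely as an illustration):

-- point 2 (12cR1+ only): with incremental stats in place, re-gather a
-- (sub)partition only once the STALE_PERCENT threshold is actually crossed
exec dbms_stats.set_table_prefs('JWARTAK', 'STATS_TEST', 'INCREMENTAL_STALENESS', 'USE_STALE_PERCENT');

-- point 4: ask DBMS_STATS itself which objects it considers stale,
-- instead of relying on the STALE_STATS column
set serveroutput on
declare
  l_stale dbms_stats.objecttab;
begin
  dbms_stats.gather_schema_stats(ownname => 'JWARTAK',
                                 options => 'LIST STALE',
                                 objlist => l_stale);
  for i in 1 .. l_stale.count loop
    dbms_output.put_line(l_stale(i).objname || ' ' ||
                         nvl(l_stale(i).partname, '-') || ' ' ||
                         nvl(l_stale(i).subpartname, '-'));
  end loop;
end;
/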