Deadlock between DB gathering stats job and global index deferred cleanup (new feature in 12c)

I’m not going to explain again what Richard explained very nicely here:

but, as evidenced below, the feature is not completely bug-free:

ORA-12012: error on auto execute of job "SYS"."PMO_DEFERRED_GIDX_MAINT_JOB"
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "SYS.DBMS_PART", line 131
ORA-06512: at line 1

Yes, a deadlock between the stats gathering job (sic!) and the global index cleanup job:

  1. current sql: alter index "CENSORED" modify partition "PART_1200_1400" coalesce cleanup
  2. current sql: call dbms_stats.gather_database_stats_job_proc ( )

We’ve been unable to reproduce it in a test environment. It is worth noting that we run with a parallel degree for stats gathering (which might be an important factor), and the table in question also uses incremental stats gathering (though I don’t think that is a factor). The deadlock happened regularly after one of our cleanup jobs finished with many “alter table X drop partition update global indexes” statements, so the full sequence is as follows:

1. sometime earlier – many “alter table X drop partition update global indexes”
2. later, a concurrent run of these two jobs:
2a. PMO_DEFERRED_GIDX_MAINT_JOB
2b. GATHER_DATABASE_STATS
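
The orphaned-entry backlog left behind by step #1 can be inspected, and cleaned up manually, ahead of the scheduled job. A sketch (the `ORPHANED_ENTRIES` column and the `DBMS_PART.CLEANUP_GIDX` procedure are from the 12c dictionary; verify against your version):

```sql
-- Global indexes still carrying orphaned entries after the partition drops
SELECT owner, index_name
FROM   dba_indexes
WHERE  orphaned_entries = 'YES';

-- Run the deferred cleanup by hand instead of waiting for
-- the scheduled PMO_DEFERRED_GIDX_MAINT_JOB
EXEC DBMS_PART.CLEANUP_GIDX;
```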

The workaround seems to be to reschedule one of the jobs from step #2.
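For example, the cleanup job can be moved out of the maintenance window with `DBMS_SCHEDULER` (the `repeat_interval` value below is purely illustrative; pick a slot that does not overlap your auto stats window):

```sql
BEGIN
  -- Shift the deferred global index maintenance job away from the
  -- window in which the auto stats job runs (example interval only)
  DBMS_SCHEDULER.SET_ATTRIBUTE(
    name      => 'SYS.PMO_DEFERRED_GIDX_MAINT_JOB',
    attribute => 'repeat_interval',
    value     => 'FREQ=DAILY;BYHOUR=9;BYMINUTE=0;BYSECOND=0');
END;
/
```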

The DB stats gathering job failed because it was waiting on a ‘row cache lock’:

is waiting for 'row cache lock' with wait info:
{
p1: 'cache id'=0x8
p2: 'mode'=0x0
p3: 'request'=0x5
time in wait: 6 min 19 sec
timeout after: never
wait id: 24771
blocking: 1 session
current sql: call dbms_stats.gather_database_stats_job_proc ( )
wait history:
* time between current wait and wait #1: 0.000956 sec
1. event: 'latch: call allocation'
time waited: 0.000144 sec
wait id: 24770 p1: 'address'=0x6000b350
p2: 'number'=0xc
p3: 'tries'=0x0
* time between wait #1 and #2: 0.000135 sec
2. event: 'enq: PS - contention'
time waited: 0.000113 sec
wait id: 24769 p1: 'name|mode'=0x50530006
p2: 'instance'=0x1
p3: 'slave ID'=0x7
* time between wait #2 and #3: 0.000012 sec
3. event: 'PX Deq: Slave Session Stats'
time waited: 0.000002 sec
wait id: 24768 p1: 'sleeptime/senderid'=0x0
p2: 'passes'=0x0
}

while the index cleanup session was waiting on a ‘library cache lock’:

which is waiting for 'library cache lock' with wait info:
{
p1: 'handle address'=0x321d7c6138
p2: 'lock address'=0x2f8fcd88f8
p3: '100*mode+namespace'=0x597c400040002
time in wait: 2.551053 sec
timeout after: 14 min 57 sec
wait id: 354679
blocking: 1 session
current sql: alter index "CENSORED" modify partition "PART_1200_1400" coalesce cleanup
wait history:
* time between current wait and wait #1: 0.000024 sec
1. event: 'db file sequential read'
time waited: 0.000133 sec
wait id: 354678 p1: 'file#'=0xd
p2: 'block#'=0x57d5e0a
p3: 'blocks'=0x1
* time between wait #1 and #2: 0.000016 sec
2. event: 'db file sequential read'
time waited: 0.000142 sec
wait id: 354677 p1: 'file#'=0xd
p2: 'block#'=0x57d5546
p3: 'blocks'=0x1
* time between wait #2 and #3: 0.000016 sec
3. event: 'db file sequential read'
time waited: 0.000129 sec
wait id: 354676 p1: 'file#'=0xd
p2: 'block#'=0x57d48e9
p3: 'blocks'=0x1
}
