Oracle 12 and latches, part 2

In my previous post, I looked at non shared latches and how the latching is done by Oracle. This post is a description on how the latching works for shared latches.

The information is quite internal, if you landed on this page it might be a good idea to start with my first post on this topic: first post.

A famous example for shared latches is the ‘cache buffers chains’ latch.

For the sake of the test I quite randomly scanned a test table, and had a little gdb script to look at the function call ksl_get_shared_latch:

break ksl_get_shared_latch
  commands
    silent
    printf "ksl_get_shared_latch laddr:%x, willing:%d, where:%d, why:%d, mode:%d\n", $rdi, $rsi, $rdx, $rcx, $r8
    c
  end

After getting the results, I looked up the latch addresses (laddr) in V$LATCH.ADDR, and searched for a cache buffers chains latch.

Next, I did setup the same construction as the previous investigation into non shared latches, I started two sqlplus / as sysdba sessions. In the first session, I simply took a shared latch in exclusive mode (16 as the fourth argument).

SQL> oradebug setmypid
Statement processed.
SQL> oradebug call ksl_get_shared_latch 0x94af8768 1 0 2303 16
Function returned 1

This takes latch 0x94af8768 in mode 16, exclusive mode.

I essentially did the same as with the previous investigation, I ran perf record, on the session trying to get the latch, and looked at the functions which were used during the spinning for the shared latch with perf report. If the spinning is not clear enough visible, set “_spin_count” to an higher value.

After the investigation of perf report, I worked out the relevant functions. Here is the gdb script with the relevant functions that I discovered:

break ksl_get_shared_latch
  commands
    silent
    printf "ksl_get_shared_latch laddr:%x, willing:%d, where:%d, why:%d, mode:%d\n", $rdi, $rsi, $rdx, $rcx, $r8
    c
  end
break kslgess
  commands
    silent
    printf "kslgess %x, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break kslskgs
  commands
    silent
    printf "kslskgs %x, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break *0xc291a9
  commands
    silent
    printf " kslskgs loop: %d\n", $r15d
    c
  end
break kslwlmod
  commands
    silent
    printf "kslwlmod %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break skgpwwait
  commands
    silent
    printf "skgpwwait %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break sskgpwwait
  commands
    silent
    printf "sskgpwwait %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break semop
  commands
    silent
    printf "semop %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end

If you attach to the second process and source the breakpoint script, this is what you will see:

ksl_get_shared_latch laddr:94af8768, willing:1, where:0, why:2303, mode:16
kslgess 94af8768, 16, 1, 0
kslskgs 94af8768, 1, 464092136, 464092728
 kslskgs loop: 2000
 kslskgs loop: 1999
...
 kslskgs loop: 2
 kslskgs loop: 1
kslwlmod 464092480, -1780205088, -1800435864, 1
kslskgs 94af8768, 1, 464092136, 464092728
 kslskgs loop: 1
skgpwwait 464091920, 1628424256, -1780204168, 0
sskgpwwait 464091920, 1628424256, -1780204168, 0
semop 1409027, 464091720, 1, -1

1-A shared latch is gotten using the distinct ksl_get_shared_latch function call. In the ksl_get_shared_latch function an attempt to get the latch is done, and returns if succeeded.
2-Next kslgess (kernel service latch get spin shared (latch)). It does not appear a latch get attempt is done in this function.
3-kernel service latch spin kernel get shared (this is a guess). Here the latch is tried up to _spin_count-1 times.
4-8 this is the kslskgs function spinning over a latch get.
9-kernel service latch wait list modification. Here the process is (almost?) done spinning, and registers itself in the post wait interface for getting post when the latch is freed.
10/11-here a last time the latch is tried.
12-system kernel generic post wait wait: the function to setup the waiting for getting posted.
13-system system kernel generic post wait wait: probably the port specific code for the post wait setup.
14-semop: semaphore operation: the session starts sleeping on a semaphore waiting to get posted. As has been indicated previously, this makes the process stay clear of running unless it is posted.

This also means that instead of the (very) CPU heaving spinning on the latches, which was done in the old days, this way is very light on the CPU, and seems to be very efficient. Or more efficient than the old days.

A word of caution: it seems the shared latch get in exclusive mode can not be freed; the session that has gotten the latch will hang when calling kslfre, and the session that was waiting for the latch did get:

ORA-03113: end-of-file on communication channel
ORA-24323: value not allowed

This seems to be expected behaviour, as indicated by Stefan Koehler and Andrey Nikolaev, which is not fixed in Oracle version 12.1.0.2.

3 comments

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.