Oracle 12 and latches, part 2
In my previous post, I looked at non-shared latches and how Oracle does the latching. This post describes how latching works for shared latches.
The information is quite internal; if you landed on this page, it might be a good idea to start with my first post on this topic: first post.
A famous example for shared latches is the ‘cache buffers chains’ latch.
For the sake of the test I quite randomly scanned a test table, and had a little gdb script to look at the function call ksl_get_shared_latch:
break ksl_get_shared_latch
  commands
    silent
    printf "ksl_get_shared_latch laddr:%x, willing:%d, where:%d, why:%d, mode:%d\n", $rdi, $rsi, $rdx, $rcx, $r8
    c
  end
After getting the results, I looked up the latch addresses (laddr) in V$LATCH.ADDR, and searched for a cache buffers chains latch.
Next, I set up the same construction as in the previous investigation into non-shared latches: I started two sqlplus / as sysdba sessions. In the first session, I simply took a shared latch in exclusive mode (16 as the fifth argument).
SQL> oradebug setmypid
Statement processed.
SQL> oradebug call ksl_get_shared_latch 0x94af8768 1 0 2303 16
Function returned 1
This takes latch 0x94af8768 in mode 16, exclusive mode.
I essentially did the same as in the previous investigation: I ran perf record on the session trying to get the latch, and used perf report to look at the functions that were used while spinning for the shared latch. If the spinning is not clearly visible, set “_spin_count” to a higher value.
After the investigation of perf report, I worked out the relevant functions. Here is the gdb script with the relevant functions that I discovered:
break ksl_get_shared_latch
  commands
    silent
    printf "ksl_get_shared_latch laddr:%x, willing:%d, where:%d, why:%d, mode:%d\n", $rdi, $rsi, $rdx, $rcx, $r8
    c
  end
break kslgess
  commands
    silent
    printf "kslgess %x, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break kslskgs
  commands
    silent
    printf "kslskgs %x, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break *0xc291a9
  commands
    silent
    printf " kslskgs loop: %d\n", $r15d
    c
  end
break kslwlmod
  commands
    silent
    printf "kslwlmod %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break skgpwwait
  commands
    silent
    printf "skgpwwait %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break sskgpwwait
  commands
    silent
    printf "sskgpwwait %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
break semop
  commands
    silent
    printf "semop %d, %d, %d, %d\n", $rdi, $rsi, $rdx, $rcx
    c
  end
If you attach to the second process and source the breakpoint script, this is what you will see:
ksl_get_shared_latch laddr:94af8768, willing:1, where:0, why:2303, mode:16
kslgess 94af8768, 16, 1, 0
kslskgs 94af8768, 1, 464092136, 464092728
 kslskgs loop: 2000
 kslskgs loop: 1999
...
 kslskgs loop: 2
 kslskgs loop: 1
kslwlmod 464092480, -1780205088, -1800435864, 1
kslskgs 94af8768, 1, 464092136, 464092728
 kslskgs loop: 1
skgpwwait 464091920, 1628424256, -1780204168, 0
sskgpwwait 464091920, 1628424256, -1780204168, 0
semop 1409027, 464091720, 1, -1
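Some of the values in this output are negative because the gdb script prints them with %d. A small Python sketch, assuming gdb shows the low 32 bits of the register as a signed integer, turns such a value back into the hexadecimal form used for laddr. The helper names are my own, for illustration only.

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer (as printed with %d) as unsigned."""
    return value & 0xFFFFFFFF


def as_hex32(value: int) -> str:
    """Format the value the way the %x columns of the gdb output show it."""
    return f"0x{as_unsigned32(value):08x}"


if __name__ == "__main__":
    # Third argument of kslwlmod in the output above.
    print(as_hex32(-1800435864))  # prints 0x94af8768
```

Decoded this way, the third argument of kslwlmod is 0x94af8768, the same value as the latch address, which suggests the latch address is passed to kslwlmod as well.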
1-A shared latch is taken using the distinct ksl_get_shared_latch function. In this function an attempt to get the latch is made, and the function returns if that succeeds.
2-Next is kslgess (kernel service latch get spin shared (latch)). It does not appear that a latch get attempt is done in this function.
3-kernel service latch spin kernel get shared (this is a guess). Here the latch is tried up to _spin_count-1 times.
4-8 this is the kslskgs function spinning over a latch get.
9-kernel service latch wait list modification. Here the process is (almost?) done spinning, and registers itself in the post wait interface to get posted when the latch is freed.
10/11-here the latch is tried one last time.
12-system kernel generic post wait wait: the function to set up the waiting for getting posted.
13-system system kernel generic post wait wait: probably the port specific code for the post wait setup.
14-semop: semaphore operation: the session starts sleeping on a semaphore, waiting to get posted. As has been indicated previously, this keeps the process from running until it is posted.
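To check the number of spins per kslskgs call without counting lines by hand, the gdb output can be summarised with a few lines of Python. This is a minimal sketch that assumes the output format of the breakpoint script above; the sample trace is shortened to three spin iterations, and the function name is hypothetical.

```python
from typing import List, Tuple

# Shortened sample: the real trace counts down from 2000, this one from 3.
SAMPLE_TRACE = """\
ksl_get_shared_latch laddr:94af8768, willing:1, where:0, why:2303, mode:16
kslgess 94af8768, 16, 1, 0
kslskgs 94af8768, 1, 464092136, 464092728
 kslskgs loop: 3
 kslskgs loop: 2
 kslskgs loop: 1
kslwlmod 464092480, -1780205088, -1800435864, 1
kslskgs 94af8768, 1, 464092136, 464092728
 kslskgs loop: 1
skgpwwait 464091920, 1628424256, -1780204168, 0
sskgpwwait 464091920, 1628424256, -1780204168, 0
semop 1409027, 464091720, 1, -1
"""


def summarize(trace: str) -> List[Tuple[str, int]]:
    """Return (function name, spin iterations) for every traced call, in order."""
    calls: List[List] = []
    for raw in trace.splitlines():
        line = raw.strip()
        if not line or line == "...":
            continue
        if line.startswith("kslskgs loop:"):
            # A loop line belongs to the most recent kslskgs call.
            if calls and calls[-1][0] == "kslskgs":
                calls[-1][1] += 1
            continue
        calls.append([line.split()[0], 0])
    return [(name, spins) for name, spins in calls]


if __name__ == "__main__":
    for name, spins in summarize(SAMPLE_TRACE):
        print(f"{name:22s} spins={spins}")
```

On a full trace this shows one long spin phase in the first kslskgs call, a single attempt in the second, and semop as the last call.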
This also means that instead of the (very) CPU-heavy spinning on the latches, which was done in the old days, this way is very light on the CPU, and seems to be very efficient. Or at least more efficient than in the old days.
A word of caution: it seems a shared latch taken in exclusive mode cannot be freed; the session that has gotten the latch will hang when calling kslfre, and the session that was waiting for the latch got:
ORA-03113: end-of-file on communication channel
ORA-24323: value not allowed
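The spin, register, retry, sleep sequence can be modelled in a few lines of Python. This is a toy model under my own assumptions, not Oracle's implementation: it only covers exclusive gets, the class and method names are made up, and a threading.Semaphore stands in for the semaphore the session sleeps on in semop.

```python
import threading
import time


class SpinThenWaitLatch:
    """Toy exclusive latch: spin a bounded number of times, then sleep until posted."""

    def __init__(self, spin_count: int = 2000):
        self.spin_count = spin_count
        self._latch = threading.Lock()         # the latch itself
        self._post = threading.Semaphore(0)    # what a waiter sleeps on
        self._waiters = 0                      # registered waiters
        self._guard = threading.Lock()         # protects _waiters

    def _try_get(self) -> bool:
        return self._latch.acquire(blocking=False)

    def get(self) -> int:
        """Take the latch; return how many times the caller had to sleep."""
        sleeps = 0
        while True:
            # Spin phase (the 'kslskgs loop' lines in the trace).
            for _ in range(self.spin_count):
                if self._try_get():
                    return sleeps
            # Register as a waiter (compare kslwlmod).
            with self._guard:
                self._waiters += 1
            # One last attempt after registering, so a free is never missed.
            got_it = self._try_get()
            if not got_it:
                self._post.acquire()           # sleep until posted (compare semop)
                sleeps += 1
            with self._guard:
                self._waiters -= 1
            if got_it:
                return sleeps

    def free(self) -> None:
        self._latch.release()
        with self._guard:
            if self._waiters > 0:
                self._post.release()           # post one waiter


if __name__ == "__main__":
    latch = SpinThenWaitLatch(spin_count=100)
    latch.get()                                # first session holds the latch
    waiter = threading.Thread(target=lambda: print("waiter slept", latch.get(), "time(s)"))
    waiter.start()
    time.sleep(0.2)                            # waiter spins out and goes to sleep
    latch.free()                               # posts the waiter
    waiter.join()
```

The waiter registers before its last attempt on purpose: if that attempt fails, the holder's free must come later and will see the registration, so the post cannot be lost. A stale post only causes an extra pass through the spin loop.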
This seems to be expected behaviour, as indicated by Stefan Koehler and Andrey Nikolaev, which is not fixed in Oracle version 12.1.0.2.