A look into into Oracle redo, part 4: the log writer null write

This is the fourth blogpost on a series of blogposts about how the Oracle database handles redo. The previous blogpost talked about the work cycle of the log writer: https://fritshoogland.wordpress.com/2018/02/12/a-look-into-oracle-redo-part-3-the-log-writer-work-cycle-overview/. This posts is looking into the execution of the kcrfw_redo_write_driver function executed in the ksbcti.

Now that we are familiar with how the logwriter works in general, we need to take a closer look to the kcrfw_redo_write_driver function. First let me be clear: the kcrfw_redo_write_driver function inside ksbcti is called every time the logwriter process times out on its semaphore, which is every 3 seconds, so this means it is called too when nothing is written in the public redo strands. In fact, when the log writer times out on semtimedop (which means it is not (yet) semctl’ed or semop’ed), it doesn’t know if there are entries in the public redo strands at that point in time.

There is a lot of things that go on in the kcrfw_redo_write_driver function. The following tracing are small snippets of execution from the function, not the every detail in this function.

The first thing that is of interest is the logwriter obtaining a LWN (log write number) number and SCN.

 | | > kcsnew3(0x600113b8, 0x7ffc3dd10d28, ...)
 | | | > kcsnew8(0x600113b8, 0x7ffc3dd10b70, ...)
 | | | | > kslgetl(0x60049800, 0x1, ...)
 | | | | < kslgetl+0x00000000012f returns: 0x1
 | | | | > kslfre(0x60049800, 0x1, ...)
 | | | | < kslfre+0x0000000001e2 returns: 0
 | | | < kcsnew8+0x000000000117 returns: 0

Please mind at this time it’s unknown if there is anything to write in the public redo strands.

So the kcsnew3 function is executed, with 0x600113b8 as first argument. That is the address of kcsgscn_, the instance global SCN. The kcsnew3 function calls the kcsnew8 function, which calls kslgetl for latch address 0x60049800. That address belongs to the ‘lgwr LWN SCN’ latch in my instance. In the function call overview there is nothing more to see. If see look at the pina memory trace, we see a lot more; this is what is executed in between kslgetl and kslfre:

kcsnew8+107:0x0000000060016274(fixed sga|var:kcrfsg_+76 ):W:4:0x1/1()
kcsnew8+118:0x0000000060016270(fixed sga|var:kcrfsg_+72 ):R:4:0xc5/197()
kcsnew8+118:0x0000000060016270(fixed sga|var:kcrfsg_+72 ):W:4:0xc6/198()
kcsnew8+125:0x00000000600113b8(fixed sga|var:kcsgscn_+0 ):R:8:0x762b7e/7744382()
kcsnew8+131:0x00007fffa1071ae0():W:4:0/0()
kcsnew8+140:0x00007fffa1071b12():W:1:0/0()
kcsnew8+149:0x00000000600113b8(fixed sga|var:kcsgscn_+0 ):R:8:0x762b7e/7744382()
kcsnew8+149:0x00000000600113b8(fixed sga|var:kcsgscn_+0 ):W:8:0x762b7f/7744383()
kcsnew8+165:0x00007fffa1071ae0():R:4:0/0()
kcsnew8+184:0x00007f520ba8be08():R:8:0x600113b8/1610683320(fixed sga|var:kcsgscn_+0 )
kcsnew8+193:0x00007f520ba8ae90():R:8:0x7a823620/2055353888(shared pool|permanent memor,duration 1,cls perm+769 )
kcsnew8+212:0x000000007a823e40(shared pool|permanent memor,duration 1,cls perm+7695936 ):R:8:0x719beed8/190604(shared pool|ksu:stats_freel,duration 1,cls freeabl+24 )
kcsnew8+219:0x00000000719bf790(shared pool|ksu:stats_freel,duration 1,cls freeabl+2256 ):R:8:0xc6/198()
kcsnew8+219:0x00000000719bf790(shared pool|ksu:stats_freel,duration 1,cls freeabl+2256 ):W:8:0xc7/199()
kcsnew8+223:0x00007fffa1071bf0():W:8:0x762b7f/7744383()
kcsnew8+232:0x0000000060016260(fixed sga|var:kcrfsg_+56 shared pool|x_kcrfws.lwn_scn+0 ):W:8:0x762b7f/7744383()
kcsnew8+238:0x0000000060016274(fixed sga|var:kcrfsg_+76 ):W:4:0/0()
kcsnew8+246:0x0000000060016288(fixed sga|var:kcrfsg_+96 shared pool|pointer:lgwr lwn scn latch+0 ):R:8:0x60049800/1610913792(fixed sga|(parent)latch:lgwr LWN SCN+0 fixed sga|var:lwn_scn_lat_+0 )
kcsnew8+250:0x00007fffa1071ac8():W:8:0x10e3ed3f/283372863()

What is visible here, is that at kcrfsg_ offset 76 ‘1’ is written. This is quite probably a flag indicating the LWN related entries are being modified (this is a guess, of course the LWN SCN latch essentially performs the same function). Especially since this is set to ‘0’ at the end if the function (kcsnew8+238). At kcrfsg_ offset 72 a number is read and increased by 1. This is most likely the LWN number. If you take a look at the number of gets for the ‘lgwr LWN’ you will see the number of gets is very close (albeit not equal) to the LWN number.

Then the LWN SCN is fetched from kcsgscn_; it first is read, and increased by one to 7744383 and written back in kcsgscn_ (get and advance). The SCN from kcsgscn_ then is written in the kcrfsg_ struct at offset 56 a little further as well, after which the earlier set ‘1’ is reset to ‘0’ at offset 76. This SCN is visible in x$kcrfws as LWN SCN.

The next thing to look at is the function kcrfw_gather_lwn a little further:

 | | > kcrfw_gather_lwn(0x7ffc3dd10d68, 0x77eced90, ...)
 | | | > kslgetl(0x76fe0c10, 0x1, ...)
 | | | < kslgetl+0x00000000012f returns: 0x1
 | | | > kcrfw_gather_strand(0x7ffc3dd10d68, 0, ...)
 | | | < kcrfw_gather_strand+0x0000000000c2 returns: 0
 | | | > kslfre(0x76fe0c10, 0x118b2, ...)
 | | | < kslfre+0x0000000001e2 returns: 0
 | | | > kslgetl(0x76fe0cb0, 0x1, ...)
 | | | < kslgetl+0x00000000012f returns: 0x1
 | | | > kcrfw_gather_strand(0x7ffc3dd10d68, 0x1, ...)
 | | | < kcrfw_gather_strand+0x0000000000c2 returns: 0
 | | | > kslfre(0x76fe0cb0, 0x19, ...)
 | | | < kslfre+0x0000000001e2 returns: 0
 | | < kcrfw_gather_lwn+0x00000000065c returns: 0xffffffff

In the kcrfw_gather_lwn function first latch address x76fe0c10 is taken in immediate mode, which is the redo allocation child #1, and then the function kcrfw_gather_strand is called, which returns 0, after which the latch is freed. This pattern repeats itself with a different latch address, 0x76fe0cb0, which is redo allocation child #2. This gives a hint the two public strands are active. By looking at the memory accesses trace again, a fixed sga variable is used to find out how many strands are active:

7038:kcrfw_gather_lwn+76:0x00000000600169b8(Fixed Size(pgsz:2048k)|+92600 fixed sga|var:kcrf_max_strands_+0 ):R:4:0x2/2()

A little further information is obtained from kcrfsg_ about the redo allocation latch and the KCRFA structure address:

7052:kcrfw_gather_lwn+223:0x0000000060016570(fixed sga|var:kcrfsg_+840 ):R:8:0x76fe0c10/1996360720(shared pool|(child)latch#1:redo allocation+0 shared pool|permanent memor,duration 1,cls perm+16120848 )
7053:kcrfw_gather_lwn+230:0x0000000060016228(fixed sga|var:kcrfsg_+0 shared pool|pointer:shared pool redo struct+0 ):R:8:0x76fdf388/1996354440(shared pool|KCRFA+0 shared pool|permanent memor,duration 1,cls perm+16114568 )

Then the redo allocation latch child #1 is obtained via kslgetl, after which kcrfw_gather_lwn calls kcrfw_gather_strand:

kcrfw_gather_strand+2:0x00007fffa1071b00():W:8:0x7fffa1071c00/140735894985728()
kcrfw_gather_strand+15:0x0000000060016228(fixed sga|var:kcrfsg_+0 shared pool|pointer:shared pool redo struct+0 ):R:8:0x76fdf388/1996354440(shared pool|KCRFA+0 shared pool|permanent memor,duration 1,cls perm+16114568 )
kcrfw_gather_strand+22:0x00007fffa1071af8():W:8:0x76fdf388/1996354440(shared pool|KCRFA+0 shared pool|permanent memor,duration 1,cls perm+16114568 )
kcrfw_gather_strand+29:0x00007fffa1071af0():W:8:0x76fdf2b0/1996354224(shared pool|permanent memor,duration 1,cls perm+16114352 )
kcrfw_gather_strand+36:0x0000000076fdf458(shared pool|KCRFA+208 shared pool|LAST_BUF_WRITTEN+0 shared pool|permanent memor,duration 1,cls perm+16114776 ):R:4:0x65e/1630()
kcrfw_gather_strand+53:0x0000000076fdf480(shared pool|KCRFA+248 shared pool|TOTAL_BUFS_KCRFA+0 shared pool|permanent memor,duration 1,cls perm+16114816 ):R:4:0x20000/131072()
kcrfw_gather_strand+70:0x00007fffa1071de8():R:8:0x762b7f/7744383()
kcrfw_gather_strand+79:0x0000000076fdf390(shared pool|KCRFA+8 shared pool|permanent memor,duration 1,cls perm+16114576 ):R:8:0x762b7c/7744380()
kcrfw_gather_strand+86:0x0000000076fdf3a8(shared pool|KCRFA+32 shared pool|NEXT_BUF_NUM_KCRFA_CLN+0 shared pool|permanent memor,duration 1,cls perm+16114600 ):R:4:0x65e/1630()
kcrfw_gather_strand+121:0x0000000076fdf45c(shared pool|KCRFA+212 shared pool|LAST_BUF_GATHERED_KCRFA+0 shared pool|permanent memor,duration 1,cls perm+16114780 ):R:4:0x65e/1630()
kcrfw_gather_strand+169:0x00007fffa1071bf8():W:4:0x65e/1630()
kcrfw_gather_strand+176:0x00007fffa1071bfc():W:4:0/0()
kcrfw_gather_strand+182:0x00007fffa1071af0():R:8:0x76fdf2b0/1996354224(shared pool|permanent memor,duration 1,cls perm+16114352 )
kcrfw_gather_strand+186:0x00007fffa1071af8():R:8:0x76fdf388/1996354440(shared pool|KCRFA+0 shared pool|permanent memor,duration 1,cls perm+16114568 )
kcrfw_gather_strand+193:0x00007fffa1071b00():R:8:0x7fffa1071c00/140735894985728()
kcrfw_gather_strand+194:0x00007fffa1071b08():R:8:0x31f0f00/52367104()

In kcrfw_gather_strand, we see the sequence we have seen before: at offset 0 of kcrfsg_, the KCRFA struct address is looked up, after which the function investigates addresses in the KCRFA struct for the first public redo strand only. I’ve annotated these in the pinatrace snippet above. At offset 36 in the function, the last buffer written to the online redologfile is obtained from offset 208 in KCRFA, which is public strand buffer 1630. A little further, at offset 86 and 121 in the function kcrfw_gather_strand, the most recently used public strand buffer that is written to is obtained at KCRFA offset 32 (NEXT_BUF_NUM_KCRFA_CLN), and the highest buffer number that has been investigated (gathered) by the logwriter process at KCRFA offset 212 (LAST_BUF_GATHERED_KCRFA). In all these cases, the public redo strand buffer number is 1630. This means no redo has been written into the first public redo strand.
After reading the KCRFA information for the first public redo strand, the latch for this struct is freed.

A little further, the latch is obtained that protects the second public strand, and kcrfw_gather_strand is executed again to read the KCRFA information for this strand:

kcrfw_gather_strand+2:0x00007fffa1071b00():W:8:0x7fffa1071c00/140735894985728()
kcrfw_gather_strand+15:0x0000000060016228(fixed sga|var:kcrfsg_+0 shared pool|pointer:shared pool redo struct+0 ):R:8:0x76fdf388/1996354440(shared pool|KCRFA+0 shared pool|permanent memor,duration 1,cls perm+16114568 )
kcrfw_gather_strand+22:0x00007fffa1071af8():W:8:0x76fdf4b0/1996354736(shared pool|KCRFA+296 shared pool|permanent memor,duration 1,cls perm+16114864 )
kcrfw_gather_strand+29:0x00007fffa1071af0():W:8:0x76fdf1f0/1996354032(shared pool|permanent memor,duration 1,cls perm+16114160 )
kcrfw_gather_strand+36:0x0000000076fdf580(shared pool|KCRFA+504 shared pool|permanent memor,duration 1,cls perm+16115072 ):R:4:0x1d/29()
kcrfw_gather_strand+53:0x0000000076fdf5a8(shared pool|KCRFA+544 shared pool|permanent memor,duration 1,cls perm+16115112 ):R:4:0x20000/131072()
kcrfw_gather_strand+70:0x00007fffa1071de8():R:8:0x762b7f/7744383()
kcrfw_gather_strand+79:0x0000000076fdf4b8(shared pool|KCRFA+304 shared pool|permanent memor,duration 1,cls perm+16114872 ):R:8:0x762ae7/7744231()
kcrfw_gather_strand+86:0x0000000076fdf4d0(shared pool|KCRFA+328 shared pool|permanent memor,duration 1,cls perm+16114896 ):R:4:0x1d/29()
kcrfw_gather_strand+121:0x0000000076fdf584(shared pool|KCRFA+508 shared pool|permanent memor,duration 1,cls perm+16115076 ):R:4:0x1d/29()
kcrfw_gather_strand+169:0x00007fffa1071bf8():W:4:0x1d/29()
kcrfw_gather_strand+176:0x00007fffa1071bfc():W:4:0/0()
kcrfw_gather_strand+182:0x00007fffa1071af0():R:8:0x76fdf1f0/1996354032(shared pool|permanent memor,duration 1,cls perm+16114160 )
kcrfw_gather_strand+186:0x00007fffa1071af8():R:8:0x76fdf4b0/1996354736(shared pool|KCRFA+296 shared pool|permanent memor,duration 1,cls perm+16114864 )
kcrfw_gather_strand+193:0x00007fffa1071b00():R:8:0x7fffa1071c00/140735894985728()
kcrfw_gather_strand+194:0x00007fffa1071b08():R:8:0x31f0f00/52367104()

This is exactly the same function, just reading the same information for a different strand, which means at a different offset in KCRFRA. If you subtract the size of the the information for a single strand which we calculated to be 296 from the KCRFA offsets (see blogpost number 2), you get the same numbers as above (504-296=208=LAST_BUF_WRITTEN). Once you understand what is read, it should be fairly easy to see nothing has been written into the second strand too. So at this point, the logwriter knows there is nothing to write.

What does the logwriter do when there is nothing to write? It executes a ‘null write’:

 | | > kcrfw_do_null_write(0, 0, ...)
 | | | > kcrfw_slave_phase_batchdo(0, 0, ...)
 | | | | > kcrfw_slave_phase_enter(0, 0x40, ...)
 | | | | < kcrfw_slave_phase_enter+0x000000000449 returns: 0
 | | | <> kcrfw_slave_phase_exit(0, 0x40, ...)
 | | | < kcrfw_slave_phase_exit+0x00000000035a returns: 0
 | | | > kcrfw_post(0, 0, ...)
 | | | | > kcrfw_slave_single_getactivegroup(0, 0, ...)
 | | | | < kcrfw_slave_single_getactivegroup+0x000000000047 returns: 0x769a5620
 | | | | > kspGetInstType(0x1, 0x1, ...)
 | | | | | > vsnffe_internal(0x19, 0x1, ...)
 | | | | | | > vsnfprd(0x19, 0x1, ...)
 | | | | | | < vsnfprd+0x00000000000f returns: 0x8
 | | | | | | > kfIsASMOn(0x19, 0x1, ...)
 | | | | | | <> kfOsmInstanceSafe(0x19, 0x1, ...)
 | | | | | | < kfOsmInstanceSafe+0x000000000031 returns: 0
 | | | | | < vsnffe_internal+0x0000000000a7 returns: 0
 | | | | | > kspges(0x115, 0x1, ...)
 | | | | | < kspges+0x00000000010f returns: 0
 | | | | < kspGetInstType+0x0000000000b1 returns: 0x1
 | | | | > kcrfw_slave_phase_enter(0x1, 0x40, ...)
 | | | | < kcrfw_slave_phase_enter+0x00000000006f returns: 0x40
 | | | | > kcscu8(0x60016290, 0x7ffc3dd10a98, ...)
 | | | | < kcscu8+0x000000000047 returns: 0x1
 | | | | > kcsaj8(0x60016290, 0x7ffc3dd10a38, ...)
 | | | | < kcsaj8+0x0000000000dc returns: 0x1
 | | | | > kcrfw_slave_phase_exit(0x1, 0x40, ...)
 | | | | < kcrfw_slave_phase_exit+0x00000000008e returns: 0
 | | | | > kslpsemf(0x97, 0, ...)
 | | | | | > ksl_postm_init(0x7ffc3dd08730, 0x7ffc3dd10750, ...)
 | | | | | < ksl_postm_init+0x00000000002b returns: 0
 | | | | < kslpsemf+0x0000000006b5 returns: 0x1f
 | | | | > kcrfw_slave_barrier_nonmasterwait(0x769a5628, 0x4, ...)
 | | | | < kcrfw_slave_barrier_nonmasterwait+0x000000000035 returns: 0x600161a0
 | | | < kcrfw_post+0x000000000c1c returns: 0xd3
 | | < kcrfw_do_null_write+0x0000000000b2 returns: 0xd3

This means there is housekeeping to do, despite no public redo strand buffers needing writing. The most crucial things it does that I can detect, is updating the on disk SCN in the kcrfsg_ struct with the already set LWN SCN and posting any outstanding processes that are sleeping on a semaphore via post/wait. These are done in the kcrfw_post function, which probably means (kernel cache redo file write) ‘post write’, however there was nothing to write (null write), since there was nothing in the public redo strands. If you look to the above function trace, you see inside kcrfw_do_null_write, kcrfw_post is called.

Inside kcrfw_post, kcscu8 (kernel cache service get current SCN) is called with 0x60016290 as first argument, which is the memory address of offset 104 in the kcrfsg_ struct in the fixed SGA, which is externalised as ON_DISK_SCN in X$KCRFWS, which is done to read the current value of the on disk SCN:

kcscu8+2:0x00007fffa10718a0():W:8:0x7fffa1071bb0/140735894985648()
kcscu8+13:0x00007fffa1071898():W:8:0x77eced90/2012016016(Variable Size(pgsz:2048k)|+384626064 shared pool|permanent memor,duration 1,cls perm+2451496 )
kcscu8+20:0x00007fffa1071890():W:8:0x7f520ba98f40/139990359707456()
kcscu8+27:0x00000000600162a0(Fixed Size(pgsz:2048k)|+90784 fixed sga|var:kcrfsg_+120 ):R:4:0/0()
kcscu8+31:0x00000000600162a4(Fixed Size(pgsz:2048k)|+90788 fixed sga|var:kcrfsg_+124 ):R:4:0/0()
kcscu8+39:0x0000000060016290(Fixed Size(pgsz:2048k)|+90768 fixed sga|var:kcrfsg_+104 shared pool|x_kcrfws.on_disk_scn+0 ):R:8:0x762b7e/7744382()
kcscu8+43:0x00000000600162a0(Fixed Size(pgsz:2048k)|+90784 fixed sga|var:kcrfsg_+120 ):R:4:0/0()
kcscu8+56:0x00007fffa1071b18():W:8:0x762b7e/7744382()
kcscu8+59:0x00007fffa1071898():R:8:0x77eced90/2012016016(Variable Size(pgsz:2048k)|+384626064 shared pool|permanent memor,duration 1,cls perm+2451496 )
kcscu8+63:0x00007fffa1071890():R:8:0x7f520ba98f40/139990359707456()
kcscu8+70:0x00007fffa10718a0():R:8:0x7fffa1071bb0/140735894985648()
kcscu8+71:0x00007fffa10718a8():R:8:0x3202150/52437328()

The next function that is called is kcsaj8 (kernel cache service adjust SCN), with 0x60016290 as first argument, indicating it is going to change the on disk SCN:

kcsaj8+105:0x0000000060016290(Fixed Size(pgsz:2048k)|+90768 fixed sga|var:kcrfsg_+104 shared pool|x_kcrfws.on_disk_scn+0 ):R:8:0x762b7e/7744382()
kcsaj8+105:0x0000000060016290(Fixed Size(pgsz:2048k)|+90768 fixed sga|var:kcrfsg_+104 shared pool|x_kcrfws.on_disk_scn+0 ):W:8:0x762b7f/7744383()
(these are only the relevant lines from the memory trace for the function kcsaj8)

What this means is that the LWN SCN that was obtained at te beginning of the write cycle using the kcsnew3 function, now is written into the on disk SCN too, despite anything actually being written to disk. The obvious question at this point is ‘if the on disk SCN is increased without anything actually written, would there be a SCN value that actually tracks the SCN of truly written (“on disk”) redo? To spoil the next blogpost about actual redo writing, this is administered in a SCN value called the ‘real redo SCN’.

After kcrfw_post returns, the execution returns from kcrfw_do_null_write and kcrfw_redo_write_driver, ending the ‘null write’.

	Data Virtualizer… on Getting to know Oracle wait ev…
	Frits Hoogland on Oracle 12.2 wait event ‘…
	Andrew Markiewicz on Oracle 12.2 wait event ‘…
	YugabyteDB Off Campu… on My reasons for moving to Yugab…
	Harold Zubieta on Control file inconsistencies d…

—Frits Hoogland Weblog

IT Technology; Yugabyte, Postgres, Oracle, linux, TCP/IP and other stuff I find interesting

A look into into Oracle redo, part 4: the log writer null write

Leave a comment Cancel reply

Share this:

Leave a comment Cancel reply