Recently I was investigating the inner working of Oracle. One of the things that is fundamental to the Oracle database is the SCN (system change number). SCNs are used to synchronise changes in the database. There is one source for SCNs in every instance (kcbgscn; the global or current SCN in the fixed SGA), and there are multiple tasks for which Oracle keeps track of synchronisation using SCNs. A few of these tasks for which Oracle stores and uses SCNs to keep track of progress are on disk SCN and lwn SCN.
This blogpost is about some oddities I found when using gdb (the GNU debugger) to watch memory locations of a running Oracle database. This should not be done on a production instance, and is purely for research purposes. Only use the methods mentioned in this article if you are absolutely sure what you are doing, and/or if you are using an Oracle instance that can be crashed and can be restored.
The situation
I was researching the ‘on disk scn’ in a 11.2.0.2 database. I used the following (simplified) gdb macro:
break kcrf_commit_force
commands
print $rip
c
end
awatch *0x60022659
commands
c
end
This macro breaks on the function kcrf_commit_force essentially to understand the different phases of a commit, and uses the ‘awatch’ function (access watch; which means this watchpoint fires at both reading and writing) to look at memory location 0x60022659. That memory location is the on disk SCN for this instance.
Upon executing a commit after an insert, I saw the following output:
Breakpoint 1, 0x0000000008fb3a72 in kcrf_commit_force ()
$1 = (void (*)()) 0x8fb3a72 <kcrf_commit_force+4>
Hardware access (read/write) watchpoint 2: *0x60022659
Old value = 898
New value = 899
0x0000000008fb5ad1 in kcscur3 ()
Breakpoint 1, 0x0000000008fb3a72 in kcrf_commit_force ()
$2 = (void (*)()) 0x8fb3a72 <kcrf_commit_force+4>
Hardware access (read/write) watchpoint 2: *0x60022659
Value = 899
0x0000000008fb5ad1 in kcscur3 ()
Hardware access (read/write) watchpoint 2: *0x60022659
Value = 899
0x0000000008fb5ad1 in kcscur3 ()
Hardware access (read/write) watchpoint 2: *0x60022659
Value = 899
0x0000000008fb5ad1 in kcscur3 ()
What this shows, is the following:
– line 1 and 9: the kcrf_commit_force function is called.
– line 11, 15 and 19: the memory location *0x60022659 is read, resulting in the value 899.
– line 3: the memory location *0x60022659 is written, it contained 898, and the new value is 899.
This greatly puzzled me. As far as I understand the way Oracle works, a server process that issues ‘commit’ is supposed to READ the on disk SCN, it does not write the on disk SCN, because the only process that changes the on disk SCN is the logwriter. However, quite clearly gdb tells me the server process is writing it. Further consideration is that IF the server process is writing it, where does this information come from?
After some further investigation, I came to the conclusion that the only way to prove to myself that what I am seeing is correct, is to create a program myself mimicking the behaviour witnessed above to truly understand what is going on, because I don’t believe what I am seeing is correct.
I created two little c programs. One that starts up, creates a shared memory segment, writes a letter into the shared memory segment and then waits for a keystroke, and puts the keystroke into the same location, and one that reads the same shared memory segment, then waits for enter to pressed, and reads the shared memory segment again, and outputs if the letter read from the shared memory is different. I called the first one ‘server’, and the second one ‘client’. This way, I can attach gdb to the client executable, which very much like an Oracle server process, reads a shared memory location which I can change at will.
The code can be found on GitHub: shared memory test source. Please mind this is simple proof of concept code that is not exhaustive in error checking and other sanity checks in any way. The source files can be compiled and linked in the following way:
$ cc server.c -o server
$ cc client.c -o client
The server executable can be run in the following way:
$ ./server
Type a letter. q quits.
This creates the shared memory segment, and puts the ‘a’ character in the variable c (line 42, c = ‘a’). As long as the c variable is not ‘q’, it writes the c variable into shared memory segment using the pointer s (*s = c) , and then waits for user input using the scanf function.
Once the shared memory segment is created and written into, the client program can be started:
$ ./client
client started. ^C to quit.
waiting for enter
The first time enter is pressed it will output ‘keystroke: a’ because the p variable was not initialised, and will be assigned the ‘a’ character (p = *s), the second time enter is pressed, ‘keystroke:’ is not printed, because the p variable and the memory location the pointer *s points to both contain ‘a’. What is vital, is that the client program NEVER writes, only reads the shared memory location, in three occasions: in line 50:
if( p != *s) {
In line 51:
printf("keystroke: %c\n", *s);
And in line 52:
p = *s;
Okay. So at this point we got a program that mimics what I think an Oracle server process is doing too on commit time (reading shared memory), which is the program ‘client’, and we got a program which creates and can manipulate the shared memory location the client is reading from, which is ‘server’. Startup the server program (./server), and startup the client using gdb:
$ gdb client
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-90.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/oracle/src/shm_test/client...(no debugging symbols found)...done.
(gdb)
This is different than attaching to a process using gdb -p PID, but attaching to a process using ‘-p’ is the only way that I am aware of to use gdb on Oracle database processes. Now that we started gdb with the client program, we must first run it in order to see the program’s runtime environment:
(gdb) run
Starting program: /home/oracle/src/shm_test/client
client started. ^C to quit.
waiting for enter
Now invoke gdb (using control-C) and look at the memory mappings to see where the shared memory segment is:
(gdb) info proc mappings
process 3069
cmdline = '/home/oracle/src/shm_test/client'
cwd = '/home/oracle/src/shm_test'
exe = '/home/oracle/src/shm_test/client'
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0 /home/oracle/src/shm_test/client
0x600000 0x601000 0x1000 0 /home/oracle/src/shm_test/client
0x7ffff7a48000 0x7ffff7bd2000 0x18a000 0 /lib64/libc-2.12.so
0x7ffff7bd2000 0x7ffff7dd2000 0x200000 0x18a000 /lib64/libc-2.12.so
0x7ffff7dd2000 0x7ffff7dd6000 0x4000 0x18a000 /lib64/libc-2.12.so
0x7ffff7dd6000 0x7ffff7dd8000 0x2000 0x18e000 /lib64/libc-2.12.so
0x7ffff7dd8000 0x7ffff7ddc000 0x4000 0
0x7ffff7ddc000 0x7ffff7dfc000 0x20000 0 /lib64/ld-2.12.so
0x7ffff7ff0000 0x7ffff7ff3000 0x3000 0
0x7ffff7ff6000 0x7ffff7ff8000 0x2000 0
0x7ffff7ff8000 0x7ffff7ff9000 0x1000 0 /SYSV0000029a (deleted)
0x7ffff7ff9000 0x7ffff7ffa000 0x1000 0
0x7ffff7ffa000 0x7ffff7ffb000 0x1000 0 [vdso]
0x7ffff7ffb000 0x7ffff7ffd000 0x2000 0x1f000 /lib64/ld-2.12.so
0x7ffff7ffd000 0x7ffff7ffe000 0x1000 0x21000 /lib64/ld-2.12.so
0x7ffff7ffe000 0x7ffff7fff000 0x1000 0
0x7ffffffea000 0x7ffffffff000 0x15000 0 [stack]
0xffffffffff600000 0xffffffffff601000 0x1000 0 [vsyscall]
Line 19 contains the shared memory segment, at memory address 0x7ffff7ff8000. To do the same as watching the on disk SCN location in Oracle, put an access watchpoint at 0x7ffff7ff8000:
(gdb) awatch *0x7ffff7ff8000
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>c
>end
(gdb) c
Continuing.
Now press enter, and we see the shared memory location being accessed three times:
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
Value = 97
0x0000000000400712 in main ()
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
Value = 97
0x000000000040071e in main ()
keystroke: a
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
Value = 97
0x000000000040073c in main ()
waiting for enter
If you take the source code, you’ll see that these are the following lines of code at rows 50-52:
if( p != *s ) {
printf("keystroke: %c\n", *s);
p = *s;
Every line reads the shared memory location via the pointer *s. It’s also clear it’s a read because the value in the memory location (97) is displayed once.
Now go over to the server side, and press ‘b’ and enter. Now the shared memory segment contains ‘b’. Next, go back to the client program, and press enter:
waiting for enter
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
Old value = 97
New value = 98
0x0000000000400712 in main ()
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
Value = 98
0x000000000040071e in main ()
keystroke: b
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
Value = 98
0x000000000040073c in main ()
waiting for enter
Hey! This shows what looks like a write with the first watchpoint, because the memory location is touched, and it reports the value at the location being changed. However, because we know the source code, we know we only read the shared memory location with the client program, not write to it! So what this shows in this case is that gdb knew what the previous value was of the memory location was, and detected that it was changed.
To see how a true write looks like, you can startup the server executable with gdb too, and set an ‘awatch’ watchpoint on the shared memory location exactly the same way as shown with the client program above. This is how that looks like:
(gdb) c
Continuing.
b
Hardware access (read/write) watchpoint 1: *0x7ffff7ff8000
Old value = 97
New value = 98
0x00000000004006e9 in main ()
So conclusion here is that you can’t see the difference between a write to a memory location or a change of value when reading when using an ‘awatch’ watchpoint.
Now that we understand how ‘awatch’ works, would there be a way to make the distinction between reading and writing? Outside of ‘awatch’ (access watchpoint), there’s the ‘watch’ command, which is a watchpoint on writing to a memory location and ‘rwatch’, which is a watchpoint on reading a memory location. This is how the output of ‘rwatch’ looks like in the client program:
$ gdb client
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-90.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/oracle/src/shm_test/client...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/oracle/src/shm_test/client
client started. ^C to quit.
waiting for enter
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7b23670 in __read_nocancel () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.x86_64
(gdb) info proc map
process 3517
cmdline = '/home/oracle/src/shm_test/client'
cwd = '/home/oracle/src/shm_test'
exe = '/home/oracle/src/shm_test/client'
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0 /home/oracle/src/shm_test/client
0x600000 0x601000 0x1000 0 /home/oracle/src/shm_test/client
0x7ffff7a48000 0x7ffff7bd2000 0x18a000 0 /lib64/libc-2.12.so
0x7ffff7bd2000 0x7ffff7dd2000 0x200000 0x18a000 /lib64/libc-2.12.so
0x7ffff7dd2000 0x7ffff7dd6000 0x4000 0x18a000 /lib64/libc-2.12.so
0x7ffff7dd6000 0x7ffff7dd8000 0x2000 0x18e000 /lib64/libc-2.12.so
0x7ffff7dd8000 0x7ffff7ddc000 0x4000 0
0x7ffff7ddc000 0x7ffff7dfc000 0x20000 0 /lib64/ld-2.12.so
0x7ffff7ff0000 0x7ffff7ff3000 0x3000 0
0x7ffff7ff6000 0x7ffff7ff8000 0x2000 0
0x7ffff7ff8000 0x7ffff7ff9000 0x1000 0 /SYSV0000029a (deleted)
0x7ffff7ff9000 0x7ffff7ffa000 0x1000 0
0x7ffff7ffa000 0x7ffff7ffb000 0x1000 0 [vdso]
0x7ffff7ffb000 0x7ffff7ffd000 0x2000 0x1f000 /lib64/ld-2.12.so
0x7ffff7ffd000 0x7ffff7ffe000 0x1000 0x21000 /lib64/ld-2.12.so
0x7ffff7ffe000 0x7ffff7fff000 0x1000 0
0x7ffffffea000 0x7ffffffff000 0x15000 0 [stack]
0xffffffffff600000 0xffffffffff601000 0x1000 0 [vsyscall]
(gdb) rwatch *0x7ffff7ff8000
Hardware read watchpoint 1: *0x7ffff7ff8000
(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>c
>end
(gdb) c
Continuing.
Hardware read watchpoint 1: *0x7ffff7ff8000
Value = 97
0x0000000000400712 in main ()
Hardware read watchpoint 1: *0x7ffff7ff8000
Value = 97
0x000000000040071e in main ()
keystroke: a
Hardware read watchpoint 1: *0x7ffff7ff8000
Value = 97
0x000000000040073c in main ()
waiting for enter
Hardware read watchpoint 1: *0x7ffff7ff8000
Value = 98
0x000000000040071e in main ()
keystroke: b
Hardware read watchpoint 1: *0x7ffff7ff8000
Value = 98
0x000000000040073c in main ()
waiting for enter
At line 1 gdb is started with the ‘client’ program. Once inside gdb, the executable is run (line 12). Once it’s running we interrupt in order to get to gdb (line 16). In gdb, the info proc map command is used (line 20) to look at the memory mappings. From the memory mappings, we see address 0x7ffff7ff8000 is the shared memory segment. A read watch is set and a c (continue) command is added to the read watchpoint (line 46 and 48). Then the execution is continued with the c command (line 53).
Now we are running in the executable again. After continuing, I pressed enter to go through the loop in the client executable, and the watchpoint is triggered three times (lines 56, 60 and 68), as we did see earlier, for source code lines 50, 51 and 52.
Now I changed the letter in the shared memory segment from ‘a’ to ‘b’ with the server executable, and pressed enter at line 60. The interesting things that happens is that the read watchpoint is triggered TWO times. This should be three times (!!).
There is a way to make the now invisible read visible again, using the following command in gdb:
(gdb) watch *0x7ffff7ff8000
Hardware watchpoint 2: *0x7ffff7ff8000
(gdb) commands
Type commands for breakpoint(s) 2, one per line.
End with a line saying just "end".
>c
>end
(gdb) c
Continuing.
Yes, that’s a WRITE watchpoint. After adding a write watchpoint, this is how the output looks like when I change the letter from ‘b’ to ‘c’:
Hardware watchpoint 2: *0x7ffff7ff8000
Old value = 98
New value = 99
0x0000000000400712 in main ()
Hardware read watchpoint 1: *0x7ffff7ff8000
Value = 99
0x000000000040071e in main ()
keystroke: c
Hardware read watchpoint 1: *0x7ffff7ff8000
Value = 99
0x000000000040073c in main ()
waiting for enter
There we have the tree reads visible again, however, one is ‘disguised’ as a write. To conclude: it seems gdb is not able to make a distinction between reads and writes for watchpoints if the watchpoint memory location is changed outside of the executable active in the debugger. Gdb seems to determine memory access a write if the value found is different from the last it was read.
Bonus information, not applicable to Oracle:
When you have the source code of an application, you can make debugging even better by adding in debug information for gdb in the executable. This is done by adding the ‘-g’ flag to the cc command:
$ cc -g client.c -o client
With the debug information available, it’s easier to debug:
$ gdb client
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-90.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/oracle/src/shm_test/client...done.
(gdb) tbreak 1
Temporary breakpoint 1 at 0x40067c: file client.c, line 1.
(gdb) run
Starting program: /home/oracle/src/shm_test/client
Temporary breakpoint 1, main () at client.c:22
22 key = 666;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.x86_64
(gdb) info local
shmid = 0
key = 0
shm = 0x400590 "1\355I\211\321^H\211\342H\203\344\360PTI\307\300P\a@"
s = 0x7fffffffe550 "\001"
p = 0 '\000'
(gdb) awatch s
Hardware access (read/write) watchpoint 2: s
(gdb) c
Continuing.
Hardware access (read/write) watchpoint 2: s
Old value = 0x7fffffffe550 "\001"
New value = 0x7ffff7ff8000 "a"
main () at client.c:45
45 printf("client started. ^C to quit.\n");
(gdb) c
Continuing.
client started. ^C to quit.
waiting for enter
Hardware access (read/write) watchpoint 2: s
Value = 0x7ffff7ff8000 "a"
0x000000000040070f in main () at client.c:50
50 if( p != *s ) {
(gdb) c
Continuing.
Hardware access (read/write) watchpoint 2: s
Value = 0x7ffff7ff8000 "a"
0x000000000040071b in main () at client.c:51
51 printf("keystroke: %c\n", *s);
(gdb) c
Continuing.
keystroke: a
Hardware access (read/write) watchpoint 2: s
Value = 0x7ffff7ff8000 "a"
0x0000000000400739 in main () at client.c:52
52 p = *s;
(gdb) c
Continuing.
waiting for enter
Above you see starting up the executable with gdb at line 1. After gdb is started, I first set a temporal breakpoint on line 1 (of the source code, shown on line 12 above), to have gdb ‘enter’ the executable and stop when I run it. If I wouldn’t do that, the program would ‘wait’ in the getchar() function, which is a function in libc, and the libc library would be my scope, line 19 tells that I have not loaded debug symbols for glibc. After setting the breakpoint, I run the executable, which immediately breaks on the temporal breakpoint (line 14).
Now that the executable is active, I can list the variables used using the info local command (line 20). Also, I can simply set a watchpoint on the variable s (line 26), instead of having to search the memory address myself. When the watchpoint is triggered, it now shows the line number in the source file, and the actual source code on that line (lines 34, 35, 44, 45, etc).
Like this:
Like Loading...