How to obtain semaphore information in gdb when the symbols are missing
This post came out of trying to understand how the Oracle executable works, specifically the logwriter. When a process posts the logwriter, it does so using semop(). If the logwriter is in post/wait mode and is not using the ‘scalable logwriter mode’ (which means it is not using additional worker processes), it signals that process back, also using semop().
To be more specific, what I investigated is not Oracle specific: it is about the usage of semaphores on Linux by an executable for which you do not have the source code and which is not compiled with debugging symbols.
I attached to the process using gdb, and put a break on semop:
$ gdb -p 1000
...
(gdb) break semop
Breakpoint 1 at 0x7fb92b0410c0: file ../sysdeps/unix/syscall-template.S, line 81.
(gdb) c
Continuing.
A word here: you probably will not see “file ../sysdeps/unix/syscall-template.S, line 81.”. This is because I installed the following debuginfo packages:
kernel-uek-debuginfo-4.14.35-1902.9.2.el7uek.x86_64
nss-softokn-debuginfo-3.44.0-5.0.1.el7.x86_64
kernel-uek-debuginfo-common-4.14.35-1902.9.2.el7uek.x86_64
glibc-debuginfo-common-2.17-292.0.1.el7.x86_64
libaio-debuginfo-0.3.109-13.el7.x86_64
numactl-debuginfo-2.0.12-3.el7_7.1.x86_64
glibc-debuginfo-2.17-292.0.1.el7.x86_64
When using Oracle Linux (version 7), this is actually really easy: add the debuginfo repository by creating the file /etc/yum.repos.d/debug.repo with the following contents:
[ol7_debuginfo]
name=Oracle Linux 7 debuginfo
baseurl=http://oss.oracle.com/ol7/debuginfo
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1
You can now even use the ‘debuginfo-install’ executable that gdb tells you to use. A word of warning too: this repository is not very closely maintained by Oracle (sadly, I blogged about this in the past), so things might be missing. For example, the debuginfo package for libgcc on my system cannot be found by yum. Another issue I encountered was that I couldn’t just say debuginfo-install kernel-uek to install the debuginfo package for my kernel, because that installs the debuginfo package for the latest kernel. So I had to point it specifically to my exact kernel version. Another word of warning about the kernel debuginfo package, which is very bulky: the repo (at least for me) is limited to a very low bandwidth, so downloading the file (+200MB) took a long time.
I installed these packages hoping that all system types and variables, like the ones for semaphores, would be available so I could look into them. This turned out not to be the case:
Continuing.
Breakpoint 1, semop () at ../sysdeps/unix/syscall-template.S:81
81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb)
Gdb broke execution because it encountered semop. Now let’s investigate. Because of the pseudo system call handler, we can only indirectly investigate the semop call. But how to know what to investigate? That’s where the manpages come in:
$ man semop
SEMOP(2)                 Linux Programmer's Manual                 SEMOP(2)
NAME
       semop, semtimedop - System V semaphore operations
SYNOPSIS
       #include <sys/types.h>
       #include <sys/ipc.h>
       #include <sys/sem.h>
       int semop(int semid, struct sembuf *sops, unsigned nsops);
       int semtimedop(int semid, struct sembuf *sops, unsigned nsops,
                      struct timespec *timeout);
   Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
       semtimedop(): _GNU_SOURCE
So semop takes 3 arguments: the semid as an integer, a pointer to an array of struct sembuf elements that hold the actual operations to be executed, and the number of operations in that array.
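We can still investigate this by knowing how arguments are passed to a function. On x86-64 Linux, the first integer and pointer arguments are passed in CPU registers, so at the breakpoint on function entry: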
– The first argument is in the CPU register $rdi
– The second argument is in the CPU register $rsi
– The third argument is in the CPU register $rdx
Well, let’s look at our session:
(gdb) p $rdi
$1 = 229376
(gdb) p $rsi
$2 = 140721769476432
(gdb) p $rdx
$3 = 1
So the simple information is available directly: the semid is 229376, and there is 1 operation.
Let’s look at semid 229376 (warning: you have to have access to the semaphore array to be able to see it):
$ ipcs -si 229376
Semaphore Array semid=229376
uid=54321 gid=54321 cuid=54321 cgid=54321
mode=0600, access_perms=0600
nsems = 250
otime = Sun Jan 19 16:36:30 2020
ctime = Sun Jan 19 15:52:22 2020
semnum     value      ncount     zcount     pid
0          1          0          0          3524
1          9065       0          0          3524
2          13900      0          0          3524
3          32766      0          0          3524
4          0          0          0          0
5          0          0          0          0
6          0          1          0          9340
7          0          1          0          9347
8          0          1          0          9356
9          0          0          0          0
10         0          1          0          10146
11         0          1          0          10163
12         0          1          0          30940
13         0          1          0          10189
14         0          1          0          10189
15         0          1          0          0
...and so on...
Okay, so in order to understand what that semop call does, we need to look into the struct.
But this is what gdb says:
(gdb) p $rsi
$4 = 140721769476432
Wait a minute, didn’t the man page say: struct sembuf *sops? That asterisk (‘*’) means it’s a pointer. Let’s try that:
(gdb) p * $rsi
$5 = 65574
Well…not sure what that means…
(gdb) ptype *$rsi
type = int
Ah…it thinks it’s an integer, and displays that… That’s not very helpful.
You can cast a value, which means telling gdb to treat it as a certain type (not the magician kind of casting), so let’s try that:
(gdb) p (struct sembuf *) $rsi
No struct type named sembuf.
Mhhh, despite installing all these debuginfo packages, it turns out the struct definition is not available.
But I really want to know the semaphore information!
I found this gdb feature:
(gdb) help add-symbol-file
Load symbols from FILE, assuming FILE has been dynamically loaded.
Usage: add-symbol-file FILE ADDR [-s <SECT> <SECT_ADDR> -s <SECT> <SECT_ADDR> ...]
ADDR is the starting address of the file's text.
The optional arguments are section-name section-address pairs and
should be specified if the data and bss segments are not contiguous
with the text. SECT is a section name to be loaded at SECT_ADDR.
So, I can add symbols from a file, provided that file is dynamically loadable. What if I create a mini file with the definition of sembuf? Would that work??
First create a very small C program that only defines a sembuf variable:
$ cat semh.c
#include <sys/sem.h>
struct sembuf mysembuf;
That’s two lines, that really is small, isn’t it?
Then compile it but do not link it; we only need the object file:
$ gcc -c -g semh.c -o semh.o
(the ‘-c’ switch makes gcc compile only, without linking; the ‘-g’ switch adds the debugging information that gdb needs)
Now we have an object file semh.o. Let’s try to “side-load” that:
(gdb) add-symbol-file semh.o 0
add symbol table from file "semh.o" at
        .text_addr = 0x0
(y or n) y
Reading symbols from /home/oracle/pin-3.11-97998-g7ecce2dac-gcc-linux/semh.o...done.
(gdb)
(you have to say ‘y’ for it to load the symbol table at address 0x0)
Now let’s try casting again:
(gdb) print (struct sembuf *) $rsi
$3 = (struct sembuf *) 0x7ffc5714f150
And now we can ask gdb to print the cast variable:
(gdb) p *$3
$5 = {sem_num = 38, sem_op = 1, sem_flg = 0}
And that’s because it now knows what the struct looks like:
(gdb) ptype *$3
type = struct sembuf {
    unsigned short sem_num;
    short sem_op;
    short sem_flg;
}
Now this information can be used to find the process the semop call is executed for:
$ ipcs -si 229376 | grep ^38
38 0 1 0 32324
So process 32324.