Installing systemtap on OEL5, update 5
Systemtap is a scripting language for analyzing Linux systems. It needs debug information to know what is going on. Systemtap is considered the “answer” to Sun/Oracle’s DTrace. The two differ, most notably because DTrace doesn’t need additional software (debug information) for either kernel or userspace.
Let’s see how Systemtap can be installed on Oracle Enterprise Linux (OEL) 5! According to the documentation, we need the following packages to use Systemtap to profile the Linux kernel:
- systemtap, systemtap-runtime
- kernel-devel
- kernel-debuginfo, kernel-debuginfo-common
Systemtap and systemtap-runtime are packages which are part of OEL, and so is kernel-devel, but kernel-debuginfo and kernel-debuginfo-common are not. So, does this mean you cannot use Systemtap on OEL? No! Luckily, Red Hat provides the kernel-debuginfo and kernel-debuginfo-common packages.
(Note: these RPMs work, but the best method is to use the Oracle-provided debuginfo packages at: http://oss.oracle.com/el5/debuginfo/)
Install systemtap, systemtap-runtime and kernel-devel using Yum:
# yum install kernel-devel systemtap systemtap-runtime
That was easy. Now install the debuginfo packages from a Red Hat mirror:
# rpm -Uvh ftp://ftp.pbone.net/mirror/ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/x86_64/Debuginfo/kernel-debuginfo-common-2.6.18-194.el5.x86_64.rpm
# rpm -Uvh ftp://ftp.pbone.net/mirror/ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/x86_64/Debuginfo/kernel-debuginfo-2.6.18-194.el5.x86_64.rpm
Please mind that the debuginfo packages must STRICTLY match your running kernel (same version, release and architecture)!
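A quick way to check this is to compare the running kernel release with the installed debuginfo version. The following is only a minimal sketch: `debuginfo_matches` is a helper name I made up, and it assumes that `uname -r` prints a plain VERSION-RELEASE string (as a stock, non-Xen EL5 kernel does). The values are hard-coded from this article so the sketch runs anywhere; the comments show where they would come from on a real system.

```shell
#!/bin/sh
# Sketch: does any installed kernel-debuginfo version equal the kernel release?
# debuginfo_matches is a hypothetical helper, not part of rpm or systemtap.
#   $1  kernel release, as printed by `uname -r`
#   $2  newline-separated VERSION-RELEASE strings of installed debuginfo packages
debuginfo_matches() {
    [ -n "$1" ] && printf '%s\n' "$2" | grep -Fxq -- "$1"
}

# On a real system the two inputs would come from:
#   running=$(uname -r)
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-debuginfo)
# Here the values used in this article are hard-coded for illustration.
running="2.6.18-194.el5"
installed="2.6.18-194.el5"

if debuginfo_matches "$running" "$installed"; then
    echo "debuginfo matches kernel $running"
else
    echo "debuginfo does NOT match kernel $running"
fi
```

Note that this only compares version strings. It says nothing about whether the debuginfo was built from the very same kernel binary.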
Because Oracle doesn’t provide debug information with the Oracle database executable, we can’t do much in “userspace”.
Still, nice things can be done with systemtap! Here is a script that counts the number of times a process gets onto the CPU, the time it has spent running, and the total elapsed time:
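#! /usr/bin/env stap
#
# Measure on-CPU time of the process given with -x <pid>.
global total_time, process_time, timekeeper, runcounter
probe begin {
    printf("Begin time measuring of process: %d\n", target())
    total_time = gettimeofday_us()
}
# The target process is scheduled onto a CPU: remember when, count the run.
probe scheduler.cpu_on {
    if (pid() == target()) {
        timekeeper = gettimeofday_us()
        runcounter++
    }
}
# The target process leaves the CPU: add the elapsed slice.
# Skip a slice whose start was not seen (process already running at startup).
probe scheduler.cpu_off {
    if (pid() == target() && timekeeper) {
        process_time += gettimeofday_us() - timekeeper
        timekeeper = 0
    }
}
# On CTRL-C: print the totals, all times in microseconds.
probe end {
    total_time = gettimeofday_us() - total_time
    printf("Total time : %010d\n", total_time)
    printf("Process time: %010d\n", process_time)
    printf("Got on CPU %d times\n", runcounter)
}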
Let’s see how that works:
First get the process ID of my server process:
SQL> select distinct spid from v$session a, v$process b, v$mystat c
2 where a.paddr=b.addr
3 and a.sid = (select distinct sid from v$mystat );
SPID
----
5313
SQL>
Next, run the above systemtap script against that process ID:
# stap runtime.stp -x 5313
Begin time measuring of process: 5313
Okay, systemtap is activated, now issue a statement in the Oracle session:
SQL> select * from dual;
Now go to the systemtap session, and press CTRL-C:
Total time : 0004245420
Process time: 0000000911
Got on CPU 4 times
Okay, so my server process needed 911 us (microseconds) of CPU time to return the result of the ‘select * from dual’ query, and had to go on the CPU 4 times. The systemtap script ran for 4245420 us (about 4.2 seconds).
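To put those three numbers in perspective, a bit of arithmetic helps. This is just a sketch: `cpu_summary` is a helper name I made up, and it takes the total time, the process time and the run count as arguments, in that order.

```shell
#!/bin/sh
# Sketch: derive two ratios from the numbers printed by the systemtap script.
# cpu_summary is a hypothetical helper name.
#   $1  total elapsed time (us)
#   $2  on-CPU time of the process (us)
#   $3  number of times the process got on the CPU
cpu_summary() {
    awk -v total="$1" -v cpu="$2" -v runs="$3" 'BEGIN {
        if (total <= 0 || runs <= 0) { print "invalid input"; exit 1 }
        printf "on-CPU share: %.4f%%\n", 100 * cpu / total
        printf "average slice: %.2f us\n", cpu / runs
    }'
}

cpu_summary 4245420 911 4
# prints:
#   on-CPU share: 0.0215%
#   average slice: 227.75 us
```

Keep in mind the total time includes me switching terminals and pressing CTRL-C, so the share mostly shows how idle the server process was while being watched.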
Happy systemtapping!
I just got notified Oracle does provide debuginfo packages, available here: http://oss.oracle.com/el5/debuginfo/
For completeness: installing Red Hat’s kernel-debuginfo packages for a differently patched kernel built by someone else is not going to work, for several different reasons.
While the Oracle database, being 100% proprietary goodness, doesn’t ship with debugging data, Oracle should consider inserting sys/sdt.h probes. The newest versions of this from systemtap will function without debug data.
I am currently using Red Hat’s kernel-debuginfo packages on an OEL system. It works, probably because the kernel version of the release I am using (5u5) is the same as Red Hat’s, although it *could* differ. So point taken: you absolutely should use the debuginfo packages built for exactly the same kernel, thus the ones on http://oss.oracle.com/el5/debuginfo/. Oracle should make an extra yum channel for that!
It would be very handy if systemtap would be able to see the functions in userspace. print_ubacktrace() consistently is empty. Probably it’s something which is worked on at this moment.
It goes beyond the rpm-level versioning. Systemtap checks debuginfo applicability with the binary build-id hash code of the raw executable vs. the stripped debuginfo, and some other ways. Unless you’re using a binary copy of a Red Hat kernel build, or something fishy is going on, they should not match.
Could you send me a URL to the kernel AND kernel-debuginfo rpms you’ve found to work together?
The kernel I am using is here: http://public-yum.oracle.com/repo/EnterpriseLinux/EL5/5/base/x86_64/kernel-2.6.18-194.el5.x86_64.rpm
The debuginfo packages are listed above in the article.
OK, I think I see why these happen to work. On RHEL5, the compiler toolchain does not use build-id hashes, and by default systemtap skips another verification step (debuglink CRCs). Oracle seems to use a similar enough compiler version & source code for that particular rebuild to be close enough. It’s not something to rely on in general.
“It’d be very handy if systemtap would be able to see the functions in userspace. print_ubacktrace() consistently is empty. Probably it’s something which is worked on at this moment.”
User-space backtracing relies on unwinding data that systemtap can extract from specified user-space binaries at translate time. See the “-d” option. Try git systemtap if able; it has some significant improvements in this area. A release-1.3 is due soon.
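As a rough illustration of the “-d” option, here is a sketch that only prints a stap command line and does not run it. `ubacktrace_cmd` is a made-up helper name, /path/to/oracle is a placeholder for the real binary (typically found under $ORACLE_HOME/bin), and probing syscall.read is just one assumed place where a user-space backtrace could be taken. Whether it yields a useful backtrace depends on the systemtap version and on the unwind data in the binary.

```shell
#!/bin/sh
# Sketch only: prints (does not run) a stap command line that loads unwind
# data for one user-space binary via -d. ubacktrace_cmd is a hypothetical name.
#   $1  path to the binary to unwind through (placeholder: /path/to/oracle)
#   $2  process ID to watch
ubacktrace_cmd() {
    script='probe syscall.read { if (pid() == target()) print_ubacktrace() }'
    printf "stap -d %s -x %s -e '%s'\n" "$1" "$2" "$script"
}

ubacktrace_cmd /path/to/oracle 5313
```

The printed command can be pasted into a root shell once the placeholder path has been replaced.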
Could you elaborate on that? If I use gdb, I can see the functions which are called by the Oracle server processes (in userspace, thus functions inside the oracle binary). Of course that is something different, but this means the information is available.
Doesn’t systemtap need to inspect the process stack to find the addresses of the functions? Or am I way off here?
The difference is that systemtap requires a-priori identification of the interesting processes / shared libraries, because it insists on doing backtraces and other things basically non-invasively. A gdb backtrace can take seconds, with lots of information being paged in both in the target process and within gdb. In systemtap, the key subset of the unwind information is uploaded into the kernel ahead of time.
Hi,
I was testing systemtap on CentOS and OEL. On CentOS I was able to do basic user process backtraces, but it requires a utrace patch to be compiled into the kernel. When I downloaded the kernel sources of OEL, that patch was not included. Probably this is why user backtraces are not working on OEL.
regards,
Marcin
Marcin, considering that OEL and CentOS v5 attempt to be identical clones of RHEL, I’m sure they both have the same utrace patches already applied.