Installing systemtap on OEL5, update 5

Systemtap is a scripting language for analyzing Linux systems. It needs debug information to know what is going on. Systemtap is considered the “answer” to Sun/Oracle’s DTrace. The two differ, most notably in that DTrace doesn’t need additional software (debug information) for either kernel or userspace.

Let’s see how Systemtap can be installed on Oracle Enterprise Linux (OEL) 5! According to the documentation, we need the following packages to use Systemtap to profile the Linux kernel:

  • systemtap, systemtap-runtime
  • kernel-devel
  • kernel-debuginfo, kernel-debuginfo-common

Systemtap and systemtap-runtime are packages which are part of OEL, and so is kernel-devel, but kernel-debuginfo and kernel-debuginfo-common are not. So, does this mean you cannot use Systemtap on OEL? No! Luckily, Red Hat provides the kernel-debuginfo and kernel-debuginfo-common packages:
(note: these RPMs work, but the best method is to use the Oracle-provided debuginfo packages at:

Install systemtap, systemtap-runtime and kernel-devel using Yum:

# yum install kernel-devel systemtap systemtap-runtime

That’s easy. Now install the debuginfo packages from a Red Hat mirror:

# rpm -Uvh
# rpm -Uvh

Please mind that the debuginfo packages must STRICTLY match your kernel!
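
A quick way to check the match is to compare the running kernel release with the version of the installed debuginfo package. The sketch below is only an illustration: the function name debuginfo_matches is made up for this post, and it assumes that on an EL5 system `uname -r` prints the same VERSION-RELEASE string that rpm reports for the kernel-debuginfo package (kernel variants such as xen or PAE use differently named packages and are not handled here).

```shell
# Hypothetical helper: succeeds only when both version strings are identical.
#   $1 = running kernel release, e.g. the output of: uname -r
#   $2 = debuginfo version, e.g. the output of:
#        rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-debuginfo
debuginfo_matches() {
    [ "$1" = "$2" ]
}

# Typical use on the machine itself (left as comments, because it needs rpm
# and an installed kernel-debuginfo package):
#   running=$(uname -r)
#   dbg=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-debuginfo)
#   if debuginfo_matches "$running" "$dbg"; then
#       echo "debuginfo matches running kernel $running"
#   else
#       echo "MISMATCH: kernel $running, debuginfo $dbg"
#   fi
```

A matching version string is a necessary check, not proof that the debuginfo was built from the very same kernel binary.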

Because Oracle doesn’t provide debug information with the Oracle database executable, we can’t do much in “userspace”.

Still, nice things can be done with systemtap! Here is a script that counts the number of times a process is put on the CPU, the time it has spent running, and the total elapsed time:

#! /usr/bin/env stap
global total_time, process_time, timekeeper, runcounter

# Record the start time of the measurement.
probe begin {
  printf("Begin time measuring of process: %d\n", target())
  total_time = gettimeofday_us()
}

# The target process is put on a CPU: note the time and count the run.
probe scheduler.cpu_on {
  if (pid() == target()) {
    timekeeper = gettimeofday_us()
    runcounter++
  }
}

# The target process leaves the CPU: add the elapsed slice to its total.
probe scheduler.cpu_off {
  if (pid() == target()) {
    process_time += gettimeofday_us() - timekeeper
  }
}

# On exit (CTRL-C), print the totals.
probe end {
  total_time = gettimeofday_us() - total_time
  printf("Total time : %010d\n", total_time)
  printf("Process time: %010d\n", process_time)
  printf("Got on CPU %d times\n", runcounter)
}

Let’s see how that works:

First get the process ID of my server process:

SQL> select distinct spid from v$session a, v$process b, v$mystat c
2 where a.paddr=b.addr
3 and a.sid = (select distinct sid from v$mystat );



Next, run the above systemtap script for the process ID found above:

# stap runtime.stp -x 5313
Begin time measuring of process: 5313

Okay, systemtap is activated, now issue a statement in the Oracle session:

SQL> select * from dual;

Now go to the systemtap session, and press CTRL-C:

Total time : 0004245420
Process time: 0000000911
Got on CPU 4 times

Okay, so my server process was able to return the result of the ‘select * from dual’ query using 911 us (microseconds) on the CPU, and needed to go on the CPU 4 times. The systemtap script ran for 4245420 us (4.2 seconds).
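
To put those numbers in perspective, here is a small sketch of the arithmetic. The variable names are my own, and the values are simply copied from the output above; awk is used only because plain sh has no floating-point arithmetic.

```shell
# Values copied from the stap output above (microseconds and a count).
total_us=4245420
process_us=911
runs=4

# Average time per run on the CPU, and CPU time as a share of elapsed time.
summary=$(awk -v t="$total_us" -v p="$process_us" -v r="$runs" 'BEGIN {
    printf "avg %.2f us per run, %.4f%% of wall time", p / r, 100 * p / t
}')
echo "$summary"
```

So each run on the CPU lasted roughly 228 us on average, and the process was on the CPU for only about 0.02% of the measured time; the rest was spent waiting, mostly for me to type the query and press CTRL-C.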

Happy systemtapping!

    • Frank Ch. Eigler said:

      For completeness: installing Red Hat’s kernel-debuginfo packages for a differently patched kernel built by someone else is not going to work, for several different reasons.

      While the Oracle database, being 100% proprietary goodness, doesn’t ship with debugging data, Oracle should consider inserting sys/sdt.h probes. The newest versions of this from systemtap will function without debug data.

      • I am currently using Red Hat’s kernel-debuginfo packages on an OEL system. It works, probably because the kernel version of the release I am using (5u5) happens to be the same as Red Hat’s, though it *could* differ. So point taken: you absolutely should use the debuginfo packages built for exactly the same kernel, so Oracle should make an extra yum channel for that!

        It would be very handy if systemtap would be able to see the functions in userspace. print_ubacktrace() consistently is empty. Probably it’s something which is worked on at this moment.

  1. Frank Ch. Eigler said:

    It goes beyond the rpm-level versioning. Systemtap checks debuginfo applicability with the binary build-id hash code of the raw executable vs. the stripped debuginfo, and some other ways. Unless you’re using a binary copy of a Red Hat kernel build, or something fishy is going on, they should not match.

    Could you send me a URL to the kernel AND kernel-debuginfo rpms you’ve found to work together?

      • Frank Ch. Eigler said:

        OK, I think I see why these happen to work. On RHEL5, the compiler toolchain does not use build-id hashes, and by default systemtap skips another verification step (debuglink CRCs). Oracle seems to use a similar enough compiler version & source code for that particular rebuild to be close enough. It’s not something to rely on in general.

  2. Frank Ch. Eigler said:

    “It’d be very handy if systemtap would be able to see the functions in userspace. print_ubacktrace() consistently is empty. Probably it’s something which is worked on at this moment.”

    User-space backtracing relies on unwinding data that systemtap can extract from specified user-space binaries at translate time. See the “-d” option. Try git systemtap if able; it has some significant improvements in this area. A release-1.3 is due soon.

    • Could you elaborate on that? If I use gdb, I can see the functions which are called by the Oracle server processes (in userspace, thus functions inside the oracle binary). Of course that is something different, but this means the information is available.

      Doesn’t systemtap need to investigate the process stack to find the addresses of the functions? Or am I way off here?

      • Frank Ch. Eigler said:

        Marcin, considering that OEL and CentOS v5 attempt to be identical clones of RHEL, I’m sure they both have the same utrace patches already applied.

      • Hi,

        I was testing systemtap on CentOS and OEL. On CentOS I was able to do a basic user process backtrace, but it requires a utrace patch to be compiled into the kernel. When I downloaded the kernel sources of OEL, that patch was not included. Probably this is why user backtrace is not working on OEL.


      • Frank Ch. Eigler said:

        The difference is that systemtap requires a-priori identification of the interesting processes / shared libraries, because it insists on doing backtraces and other things basically non-invasively. A gdb backtrace can take seconds of time, and lots of information being paged in both in the target process and within gdb. In systemtap, the key subset of the unwind information is uploaded into the kernel ahead of time.
