Systemtap is a scripting language and tool for analyzing running Linux systems. To know what is going on inside the kernel or a program, Systemtap needs debug information. Systemtap is often considered the Linux “answer” to Sun/Oracle’s DTrace. The two differ, most notably in that DTrace does not need additional software (debug information) for either kernel or userspace.

Let’s see how Systemtap can be installed on Oracle Enterprise Linux (OEL) 5! According to the documentation, we need the following packages to be able to use Systemtap to profile the Linux kernel:

  • systemtap, systemtap-runtime
  • kernel-devel
  • kernel-debuginfo, kernel-debuginfo-common

The systemtap, systemtap-runtime and kernel-devel packages are part of OEL, but kernel-debuginfo and kernel-debuginfo-common are not. So, does this mean you cannot use Systemtap on OEL? No! Luckily, Red Hat provides the kernel-debuginfo and kernel-debuginfo-common packages.
(Note: these RPMs work, but the better method is to use the Oracle-provided debuginfo packages at http://oss.oracle.com/el5/debuginfo/)

Install systemtap, systemtap-runtime and kernel-devel using Yum:

# yum install kernel-devel systemtap systemtap-runtime

That’s easy. Now install the debuginfo packages from a Red Hat mirror:

# rpm -Uvh ftp://ftp.pbone.net/mirror/ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/x86_64/Debuginfo/kernel-debuginfo-common-2.6.18-194.el5.x86_64.rpm
# rpm -Uvh ftp://ftp.pbone.net/mirror/ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/x86_64/Debuginfo/kernel-debuginfo-2.6.18-194.el5.x86_64.rpm

Please note: the debuginfo packages must match the version of your running kernel (as shown by uname -r) EXACTLY!
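To make the version-matching requirement concrete, here is a minimal sketch that prints the package names that would have to be installed for a given kernel release. The function name matching_kernel_packages is hypothetical (it is not part of Systemtap or rpm), and the sketch assumes the EL5 convention where uname -r prints the release without the architecture.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemtap): print the package names
# whose version must equal the given kernel release.
matching_kernel_packages() {
    release="$1"   # e.g. 2.6.18-194.el5, as printed by `uname -r` on EL5
    arch="$2"      # e.g. x86_64, as printed by `uname -m`
    for pkg in kernel-devel kernel-debuginfo-common kernel-debuginfo; do
        printf '%s-%s.%s\n' "$pkg" "$release" "$arch"
    done
}

# Names expected for the machine this runs on.
matching_kernel_packages "$(uname -r)" "$(uname -m)"
```

Each printed name can be passed to rpm -q to see whether that exact package is installed.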

Because Oracle doesn’t provide debug information with the Oracle database executable, we can’t do much in “userspace”.

Still, nice things can be done with Systemtap! Here is a script that counts the number of times a process gets on a CPU, the time it has spent running there, and the total elapsed time:

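#! /usr/bin/env stap
#
# Measure how often, and for how long, the target process (stap -x PID)
# runs on a CPU. All times are in microseconds.
global total_time, process_time, timekeeper, runcounter

probe begin {
  printf("Begin time measuring of process: %d\n", target())
  # Start timestamp; converted to elapsed time in the end probe.
  total_time = gettimeofday_us()
}

# The target process is scheduled onto a CPU.
probe scheduler.cpu_on {
  if (pid() == target()) {
    timekeeper = gettimeofday_us()
    runcounter++
  }
}

# The target process leaves the CPU.
probe scheduler.cpu_off {
  # Skip the event if no cpu_on was seen first (timekeeper still 0),
  # e.g. when the process was already running as the script started.
  if (pid() == target() && timekeeper != 0) {
    process_time += gettimeofday_us() - timekeeper
    timekeeper = 0
  }
}

probe end {
  total_time = gettimeofday_us() - total_time
  printf("Total time : %010d\n", total_time)
  printf("Process time: %010d\n", process_time)
  printf("Got on CPU %d times\n", runcounter)
}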

Let’s see how that works.

First, get the process ID of my server process:

SQL> select distinct spid from v$session a, v$process b, v$mystat c
2 where a.paddr=b.addr
3 and a.sid = (select distinct sid from v$mystat );

SPID
----
5313

SQL>

Next, run the systemtap script above (saved as runtime.stp) for that process ID:

# stap runtime.stp -x 5313
Begin time measuring of process: 5313

Okay, systemtap is activated, now issue a statement in the Oracle session:

SQL> select * from dual;

Now go to the systemtap session, and press CTRL-C:

Total time : 0004245420
Process time: 0000000911
Got on CPU 4 times

Okay, so my server process spent 911 us (microseconds) on the CPU to return the result of the ‘select * from dual’ query, and needed to go on the CPU 4 times. The systemtap script ran for 4245420 us (4.2 seconds).
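To put these three numbers in perspective, a little arithmetic helps. The sketch below is only an illustration: summarize_run is a hypothetical helper, it uses integer shell arithmetic (so results are truncated), and the values must be passed without the leading zeros that the script prints, because the shell would read those as octal.

```shell
#!/bin/sh
# Hypothetical helper: derive a few figures from the script's three numbers.
summarize_run() {
    total_us="$1"     # "Total time" in microseconds, no leading zeros
    process_us="$2"   # "Process time" in microseconds, no leading zeros
    runs="$3"         # the N from "Got on CPU N times"
    echo "Off CPU: $(( $total_us - $process_us )) us"
    # Avoid dividing by zero when the process never got on a CPU.
    if [ "$runs" -gt 0 ]; then
        echo "Average per run: $(( $process_us / $runs )) us"
    fi
    # Share of the elapsed time spent on a CPU, in parts per million.
    echo "On-CPU share: $(( $process_us * 1000000 / $total_us )) ppm"
}

summarize_run 4245420 911 4
# Off CPU: 4244509 us
# Average per run: 227 us
# On-CPU share: 214 ppm
```

In other words, almost all of the 4.2 seconds was spent waiting (mostly for me to type the query and press CTRL-C), not running.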

Happy systemtapping!
