Oracle has done a great job with the wait interface. It has given us the opportunity to profile the time spend in Oracle processes, by keeping track of CPU time and waits (which is time spend not running on CPU). With every new version Oracle has enhanced the wait interface, by making the waits more detailed. Tuning typically means trying to get rid of waits as much as possible.
But what if your execution is optimised to the point that there are (almost) no waits left? Before you think this is theoretical: this is possible, especially with Oracle adaptive direct path reads (which are non Oracle cached IOs), visible by the wait “direct path read”. Of course I am talking about the omission of waits, which happen with adaptive direct path reads if your system is able to provide the request results fast enough. There isn’t a wait because if the IO request result is returned fast enough, the process doesn’t have to wait. Whilst this sounds very obvious, the “traditional” Oracle IO requests (visible with the waits “db file sequential read” and “db file scattered read”) do always generate a wait, no matter how fast the IO requests where.
Here is a trace excerpt from a fill table scan where the IO was fast enough not to generate only a few waits:
PARSE #140145843472584:c=0,e=28,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3321871023,tim=1385308947947766 EXEC #140145843472584:c=0,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3321871023,tim=1385308947947823 WAIT #140145843472584: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=75579 tim=1385308947947871 WAIT #140145843472584: nam='asynch descriptor resize' ela= 1 outstanding #aio=0 current aio limit=1562 new aio limit=1592 obj#=75579 tim=1385308947947969 WAIT #140145843472584: nam='direct path read' ela= 428 file number=5 first dba=28418 block cnt=126 obj#=75579 tim=1385308947989097 FETCH #140145843472584:c=161976,e=174323,p=20941,cr=20944,cu=0,mis=0,r=1,dep=0,og=1,plh=3321871023,tim=1385308948122218 WAIT #140145843472584: nam='SQL*Net message from client' ela= 249 driver id=1413697536 #bytes=1 p3=0 obj#=75579 tim=1385308948122600 FETCH #140145843472584:c=0,e=2,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=3321871023,tim=1385308948122689 WAIT #140145843472584: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=75579 tim=1385308948122709 WAIT #140145843472584: nam='SQL*Net message from client' ela= 210 driver id=1413697536 #bytes=1 p3=0 obj#=75579 tim=1385308948122938 CLOSE #140145843472584:c=0,e=15,dep=0,type=3,tim=1385308948555460
The most interesting part of the raw trace file is between the EXEC line and the first FETCH line. There is first a ‘SQL*Net message to client’ wait, then a ‘asynch descriptor resize’ wait, and then a single ‘direct path read’ wait. This is a single wait line for doing IO, while the fetch line shows that 20941 blocks are read by doing physical IO. The fetch line shows that most of the elapsed time (e) is spend on running on cpu (c). This means that details about how those 20941 blocks where read are (except for the single ‘direct path read’ wait) not available.
But what if you want to understand more about what the process is doing here? Except for a few wait lines, all the processing details that waits give are gone. It’s more or less only the PARSE/EXEC/FETCH lines, where the first fetch line contains more than 99% of all the time.
The answer to that on linux is perf. Perf is a profiler that is embedded in the linux kernel (since 2.6.32). I’ve written more about perf, use the search field on this blog find articles on how to setup and use perf. Now let’s see what is happening in this situation: what is Oracle doing to execute the above mentioned SQL (select count(*) from t2)?
I’ve ran perf on the session above with ‘perf record -g -e cpu-clock -p PID’, and the result (with ‘perf report’) is shown below:
67.58% oracle [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore | --- _raw_spin_unlock_irqrestore | |--99.19%-- mptspi_qcmd | scsi_dispatch_cmd | scsi_request_fn | __blk_run_queue | queue_unplugged | blk_flush_plug_list | blk_finish_plug | generic_file_read_iter | generic_file_aio_read | aio_rw_vect_retry | aio_run_iocb | io_submit_one | do_io_submit | sys_io_submit | system_call_fastpath | io_submit | --0.81%-- __wake_up fsnotify_add_notify_event inotify_handle_event send_to_group fsnotify vfs_write sys_write system_call_fastpath __write_nocancel 4.40% oracle oracle [.] sxorchk
What is shown here, is that 68% of the time the process ran on CPU, it was spending it’s time in kernel mode ([k]), on a function called _raw_spin_unlock_irqrestore. This function was called in two different ways, but in 99% of the time it came from mptspi_qcmd. This is the device specific kernel driver. What is even more remarkable, is that when we follow the backtrace up (by reading down), that the process was in fact issuing IO’s (the io_submit system call)!
This means that instead of spending time on waiting for IOs to finish, this system is spending time on spinning on a spin lock (alike what is latch in Oracle) for issuing commands to a SCSI device.
The next function in which the Oracle process spend time, is an Oracle function (visible by [.], which means user land function), called sxorchk. This function is a xor check (governed by the db_block_checking parameter).
As a summary: does this means the Oracle wait interface is useless? Of course not. But if the wait interface simply does not provide enough information, like when 99% of the time is only visible as CPU time, you need to step to another layer and investigate there. Perf opens up the CPU time, and is able to tell you how the CPU time is composed.