Using R and Oracle tracefiles
If you are not familiar with R, this is the description from the R site: R is a language and environment for statistical computing and graphics. I encountered R at the Erasmus university, where I am working on a project with DNA data which is put in an Oracle database (see: http://huvariome.erasmusmc.nl/ (It’s down at the moment)).
R is about using statistics, and helps greatly because of the ability to make graphics of statistical data, which you can use to correlate figures. An example is shown here by Greg Rahn (Structured data) and here an example of Stefan Moeding (blog, the name is a guess, I’ve composed it from the author’s name of the articles on the blog, and I’ve guessed the name of the site (www.moeding.net) is the surname).
I probably will blog about using database contents and R in the future (most probably active session history (ash) data). This blog post is about using R and Oracle database sqltrace with waits (also known as 10046 level 8 traces or 10046/8 traces).
Getting hands-on
If you want to experiment with this, the first you need is R itself. My advice would be to get and use Rstudio, which is a free and open source R integrated development environment (IDE).
Next you need a real Oracle tracefile. If you just want to experiment and don’t have a database at hand with which you can generate one, I have a sample trace file: v11203_ora_19295.trc. I tried to read the tracefile into R (R is very good at using CSV data), and process it to make it usable, but that was very hard. In order to make the tracefile in a CSV-like format, I use a small sed (stream editor; available on Mac OS-X, Linux, well…on most modern platforms except for Windows) script called waits.sed. This script outputs all WAIT lines in the tracefile in a way it can be easily used as CSV file.
This is the content of the waits.sed script:
# format WAIT rows from an oracle 10046/sql trace to
# 1:WAIT 2:cursor number 3:wait name 4:elapsed time 5:p1 6:p2 7:p3 8:object number 9:time
s/^\(WAIT\)\ #\([0-9]*\):\ nam='\(.*\)'\ ela=\ *\([0-9]*\)\ [0-9a-z\ #|]*=\([0-9]*\)\ [0-9a-z\ #|]*=\([0-9]*\)\ [0-9a-z\ #|]*=\([0-9]*\)\ obj#=\([0-9\-]*\)\ tim=\([0-9]*\)$/\1|\2|\3|\4|\5|\6|\7|\8|\9/
In order to make a file with only waits, use the following command:
sed -f waits.sed v11203_ora_19295.trc | grep ^WAIT > v11203_ora_19295.trc.waits
This creates a new file called v11203_ora_19295.trc.waits with only the waits in a CSV format.
Now we are at the point where we get to use R! Open RStudio, and open this R scipt: tracewaits.R. Now choose ‘Edit’,’Run code’,’Run all’. If the v11203_ora_19295.trc.waits file is in the correct directory, the ‘plots’ tab on the right lower side will display the following picture:
Alright. Not much to see yet! This tells us we got two cursors with wait lines in this tracefile. To choose the waits of the one cursor of interest in this tracefile, uncomment (remove the hash) in the following line in the R script:
#trc <- subset(trc, cursor == "47388340751480")
Now run the script again. Now the graph show something far more interesting:
What do we see here? The Y axis shows the value in the ‘p3’ column of the waits. The most waits are ‘direct path read’ (light blue), which means ‘p3’ is the number of blocks read per IO call. The X axis numbers are the seconds of the timestamp in the ‘tim’ column.
We see the number of blocks read per IO call is low for the first few IO calls, then builds up to 126, and later becomes 128 (the maximum in this case). What we see here is automatically allocated extents (of which the first few allocated extents are very small, then bigger, and so on)
Now let’s add the time waited for the waits to the graph, and see if there’s a correlation between blocks read per wait and time waited per wait:
Uncomment the following line:
#p <- p + geom_line(aes(tim, ela,colour="elapsed time")) + stat_smooth(aes(y=ela))
And run the script again. This adds the time waited (ela column) to the graph as a line, and a trendline (stat_smooth):
The “ela” or “elapsed time” line shows why the bar’s in the graph do not have a constant distance between each other: the IO time heavily fluctuates (please mind this is because of my overloaded laptop :)). Also, when looking at the beginning, I can see the IO times (on my laptop at least) for the small IO’s is lower than the bigger IO’s:
(add the following line to the script to zoom in to the beginning:)
trc <- subset(trc, p2 <= 642)
Which gives this graph:
This clearly shows the bigger (126 blocks) IO’s have a higher response time.
I hope this encourages people to use R and to extend the waits.sed script (it’s a very simple script now, and ONLY handles waits, not parse, execute, fetch, etc.), and extend the R script. Using ‘facet_grid’ you can put multiple graphs of tracefiles/waits which are exactly aligned which each other, so you can see any potential correlation between the two. Also you can overlay multiple graphs, like shown in this example: a graph showing the number of blocks per IO and a graph show the elapsed time per IO.
Have fun using and experimenting with R!
Awesome Frits. Thanks for the step by step explanation!
– Kyle Hailey
Pingback: Log Buffer #255, A Carnival of the Vanities for DBAs | The Pythian Blog
Pingback: DB Optimizer » ASH Visualizations: R, ggplot2, Gephi, Jit, HighCharts, Excel ,SVG
Thanks Frits. I’ve always thought R/R Studio was just for our statisticians. No excuses for not looking at R Studio now.