Oracle 12.1 big table caching IO code path
Recently I was triggered about the ‘automatic big table caching’ feature introduced in Oracle version 12.1.0.2 with Roger Macnicol’s blogpost about Oracle database IO and caching or not caching (https://blogs.oracle.com/smartscan-deep-dive/when-bloggers-get-it-wrong-part-1 https://blogs.oracle.com/smartscan-deep-dive/when-bloggers-get-it-wrong-part-2). If you want to read something about the feature in general, search for the feature name, you’ll find several blogposts about it.
If you are not familiar with automatic big table caching, it’s a feature which allows you to dedicate a percentage of the buffer cache for scanning big tables. Big tables are tables that are not small :-), which technically means tables bigger (a higher number of blocks) than the “_small_table_threshold” parameter, which defaults to 2% of the buffer cache or 20 blocks, whichever is bigger. The choice to store blocks in the big table cache area is dependent on the segment “temperature”, which essentially is a kind of touch count. There is one other caveat of using this feature, which I currently find weird: once a database is operating in clustered mode (RAC), serial scans can not use the automatic big table caching anymore, the use of it then exclusively is for caching parallel query workers blocks scanned.
The purpose of this blogpost is to investigate the IO implementation of using the big table caching feature. The available general read code paths that I am aware of in Oracle are:
– Single block reads, visible via the wait event ‘db file sequential read’, which always go to the buffer cache.
– Multiple single block reads, visible via the wait ‘db file parallel read’, which always go to the buffer cache.
– Multi block reads, visible via the wait event ‘db file scattered read’, which always go to the buffer cache.
– Multi block reads, visible via the wait event ‘direct path read’, which go to the process’ PGA.
– Multi block reads, visible via the wait event ‘direct path read’, which go to the in-memory area.
For full scans of a segment, essentially the choices are to go buffered, or go direct. The question that I asked myself is: what code path is used when the automatic big table caching feature is used?
The reason for asking myself this is because the difference between a cached multi block read and a direct path read is significant. The significance of this decision is whether multiple IO requests can be submitted for multiple extents leveraging asynchronous IO, which is possible in the direct path read code path, or only the IO requests can be submitted for a single extent, which is what is true for the db file scattered read code path. To summarise the essence: can I only execute a single IO request (scattered read) or multiple IO requests (direct path read).
The answer is very simple to find if you enable the big table caching feature, and then read a table that is considered big with an empty big table cache. A table that is read into the big table area, shows the ‘db file scattered read’ wait event. What is interesting when a table is scanned that does not fit in the big table area: when a big table is not read for reading into the big table area, the wait event switches to ‘direct path read’. Or actually I should say: the general rules are applied for choosing between buffered or direct path reads, potentially being a direct path read.
I want to emphasise again on the importance of the IO code path decision. If you have an Oracle database server that uses modern high performance hardware for its disk subsystem, not choosing the correct multi block read codepath can severely influence throughput.
For example, if your IO subsystem has an average IO time of 1ms, the maximal theoretical throughput for buffered IO is 1/0.001*1MB=1GBPS (this will be way lower in reality). Direct path reads start off with 2 IO requests, which obviously means that that’s double the amount of buffered IO, but this can, depending on an Oracle internal resource calculation, grow up to 32 IO requests at the same time.