Extra huge database IOs, part 2

This is part 2 of a series of blog posts about huge Oracle database IOs.
If you landed on this blog post and did not read part 1, please read part 1 first.

In part 1 I showed how database IOs of a full table scan could be bigger than 1MB by increasing the db_file_multiblock_read_count parameter to a number beyond 1MB expressed in Oracle blocks. These bigger IOs only happen with direct path reads, not with buffered multiblock reads.

But how much bigger can these IOs be? In part 1 I showed Oracle IOs of 1020 blocks. Is that the limit? To investigate this, I created a much bigger table (table T2 in part 1 had a maximum extent size of 1024 blocks, which meant that 1020 blocks was the biggest IO possible from that table).

These are the first and the last few extents of the new table, with their size in blocks:

 EXTENT_ID     BLOCKS
---------- ----------
         0          8
       263       8192
       264       8192
       265       8192

266 rows selected.

So with my new, big table I got extents up to 8192 blocks. Let’s do the same test on this table as I did in the previous post, which means setting db_file_multiblock_read_count to 2000 with SQL trace at level 8 with throttled IOPS to get waits!

The IOs against the “small” extents of 8 blocks and 128 blocks are done exactly the same way as in the previous post. The IOs done against the extents with a size of 1024 blocks are the same too, simply because reading the 1020 blocks in one IO is as efficient as it could be! But let’s take a look at the point where the extents become 8192 blocks:

WAIT #140281084141152: nam='direct path read' ela= 7996378 file number=5 first dba=174468 block cnt=1020 obj#=76227 tim=1373104660882677
WAIT #140281084141152: nam='direct path read' ela= 7995820 file number=5 first dba=175489 block cnt=1023 obj#=76227 tim=1373104668882345
WAIT #140281084141152: nam='direct path read' ela= 7996472 file number=5 first dba=176520 block cnt=632 obj#=76227 tim=1373104676882677
WAIT #140281084141152: nam='direct path read' ela= 7998049 file number=5 first dba=177152 block cnt=1024 obj#=76227 tim=1373104684883512
WAIT #140281084141152: nam='direct path read' ela= 7995472 file number=5 first dba=178176 block cnt=1024 obj#=76227 tim=1373104692882932
WAIT #140281084141152: nam='direct path read' ela= 7993677 file number=5 first dba=179200 block cnt=1024 obj#=76227 tim=1373104700880106
WAIT #140281084141152: nam='direct path read' ela= 7996969 file number=5 first dba=180224 block cnt=1024 obj#=76227 tim=1373104708880891
WAIT #140281084141152: nam='direct path read' ela= 5998630 file number=5 first dba=181248 block cnt=1024 obj#=76227 tim=1373104714882889
WAIT #140281084141152: nam='direct path read' ela= 9996459 file number=5 first dba=182272 block cnt=1024 obj#=76227 tim=1373104724882545

I’ve included one IO of 1020 blocks, after which Oracle issues an IO request of 1023 blocks, then 632 blocks, and then it issues a number of requests for 1024 blocks.
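If you want to pull the dba and block count out of a trace file yourself, something like the following could work. This is only a sketch: the regular expression is my own and is written against the WAIT lines exactly as they appear in this post, the function name parse_direct_path_reads is made up for the example, and I assume ela is reported in microseconds.

```python
import re

# Pattern written against the 'direct path read' WAIT lines shown in this post.
WAIT_RE = re.compile(
    r"nam='direct path read' ela= *(?P<ela>\d+) "
    r"file number=(?P<file>\d+) first dba=(?P<dba>\d+) block cnt=(?P<cnt>\d+)"
)


def parse_direct_path_reads(lines):
    """Return (first_dba, block_cnt, ela) for every matching trace line."""
    reads = []
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            reads.append((int(match["dba"]), int(match["cnt"]), int(match["ela"])))
    return reads


# First three WAIT lines from the trace excerpt above.
sample = [
    "WAIT #140281084141152: nam='direct path read' ela= 7996378 file number=5 "
    "first dba=174468 block cnt=1020 obj#=76227 tim=1373104660882677",
    "WAIT #140281084141152: nam='direct path read' ela= 7995820 file number=5 "
    "first dba=175489 block cnt=1023 obj#=76227 tim=1373104668882345",
    "WAIT #140281084141152: nam='direct path read' ela= 7996472 file number=5 "
    "first dba=176520 block cnt=632 obj#=76227 tim=1373104676882677",
]

for dba, cnt, ela in parse_direct_path_reads(sample):
    # ela is assumed to be in microseconds, so divide to get seconds.
    print(f"dba={dba} block cnt={cnt} ela={ela / 1e6:.1f}s")
```

The long elapsed times per wait are a result of the throttled IOPS in my test, not of the IO size.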

It seems that the maximum number of blocks in a multiblock IO request done by Oracle “unbuffered” or “direct path” is 1024 blocks, which is 8MB! Please note that I have tested this ONLY with 64-bit Oracle on Linux x64 with ASM, with a database and segment blocksize of 8kB.
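To make the sizes concrete, here is the conversion between blocks and megabytes as a small sketch. The helper names are made up for this example, and the 8192 byte block size is the one from my test database; with another block size the numbers change.

```python
BLOCK_SIZE = 8192  # bytes; the 8kB block size used in this test


def blocks_to_mb(blocks, block_size=BLOCK_SIZE):
    """Size in MB (1MB = 1024 * 1024 bytes) of an IO of the given block count."""
    return blocks * block_size / (1024 * 1024)


def mb_to_blocks(mb, block_size=BLOCK_SIZE):
    """Number of whole blocks that fit in the given number of MB."""
    return mb * 1024 * 1024 // block_size


print(mb_to_blocks(1))     # blocks needed to reach 1MB
print(blocks_to_mb(1020))  # the biggest IO seen in part 1
print(blocks_to_mb(1024))  # the biggest IO seen in this post
```

So db_file_multiblock_read_count has to be set above 128 to get beyond 1MB with 8kB blocks, and the 1024 block requests are exactly 8MB.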

There’s another peculiarity I want to introduce. Let’s take a look at the dba and block cnt only, together with block_id and blocks (number of blocks in the extent, taken from dba_extents):

                             BLOCK_ID     BLOCKS  EXTENT_ID
dba=174468 block cnt=1020      174464       1024        197
dba=175489 block cnt=1023      175488       1024        198
dba=176520 block cnt=632       176512       8192        199
dba=177152 block cnt=1024
dba=178176 block cnt=1024
dba=179200 block cnt=1024
dba=180224 block cnt=1024
dba=181248 block cnt=1024
dba=182272 block cnt=1024
dba=183296 block cnt=1024

There are two anomalies here: the first one is at dba 175489. It seems that extent id 198 has only one BMB (bitmap block), so Oracle can read 1023 blocks of the extent starting at dba 175489, instead of the 1020 blocks in the other extents with a size of 1024 blocks.

The second anomaly is much more interesting: extent id 199 starts at block id 176512. This is an extent consisting of 8192 blocks. The first IO starts at block id 176520. This means there are 176520-176512=8 blocks not read, which are probably BMBs. That is in line with my expectation. What is odd is that the first IO issued for data blocks in this extent, which is 8192 blocks in length, is “only” 632 blocks!
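The arithmetic behind both anomalies can be checked with a few lines of code. This is just a sketch with the numbers copied from the table above; the function name first_read_summary is made up for this example.

```python
# (block_id, blocks) per extent_id, copied from the table above.
extents = {197: (174464, 1024), 198: (175488, 1024), 199: (176512, 8192)}

# (first dba, block cnt) of the first read seen in each extent.
first_reads = {197: (174468, 1020), 198: (175489, 1023), 199: (176520, 632)}


def first_read_summary(extents, first_reads):
    """Per extent: blocks skipped before the first read, and blocks left after it."""
    summary = {}
    for extent_id, (block_id, blocks) in extents.items():
        first_dba, block_cnt = first_reads[extent_id]
        skipped = first_dba - block_id          # probably BMBs
        left = blocks - skipped - block_cnt     # still to be read after the first IO
        summary[extent_id] = (skipped, left)
    return summary


for extent_id, (skipped, left) in first_read_summary(extents, first_reads).items():
    print(f"extent {extent_id}: skipped={skipped} left after first read={left}")
```

For the two extents of 1024 blocks the first read covers everything after the skipped blocks. For extent 199 the first read leaves most of the extent untouched.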

After the oddly sized IO request at the beginning of the extent (632), the other IO requests are the full 1024 blocks, which probably is the maximum size of direct path Oracle IO requests.

I wondered about the oddly sized request at the beginning. I looked at the dbas of the requests, and noticed that the dbas of the 1024-block requests looked vaguely familiar. Then I spotted the familiarity and a possible reason for the “small” request at the beginning of the extent:

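SYS@v11203 AS SYSDBA> select mod(176520,1024) from dual;

MOD(176520,1024)
----------------
             392

SYS@v11203 AS SYSDBA> select mod(177152,1024) from dual;

MOD(177152,1024)
----------------
               0

SYS@v11203 AS SYSDBA> select mod(178176,1024) from dual;

MOD(178176,1024)
----------------
               0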
ONLY when the direct path read requests can potentially read “the full” 1024 blocks in one IO, Oracle first issues an IO to align the later requests to a block id that is a multiple of 1024, after which it issues IO requests of 1024 blocks.
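The rule I think I am seeing can be written down as a small model. Please mind this is my guess based on the trace output, not Oracle's actual algorithm: the function name plan_reads and the way the tail of the extent is handled are assumptions of mine, and the trace excerpt above does not show the last read of the extent.

```python
MAX_IO = 1024  # blocks; the biggest direct path read request seen in this test


def plan_reads(first_block, extent_end, max_io=MAX_IO):
    """Guess the read sizes for the blocks in [first_block, extent_end).

    Assumed rule: if everything fits in one request, read it in one go.
    Otherwise read up to the next multiple of max_io first, then issue
    full max_io requests, then read whatever is left.
    """
    remaining = extent_end - first_block
    if remaining <= 0:
        return []
    if remaining <= max_io:
        return [remaining]

    sizes = []
    block = first_block
    to_boundary = -block % max_io  # blocks up to the next multiple of max_io
    if to_boundary:
        sizes.append(to_boundary)
        block += to_boundary
    while extent_end - block >= max_io:
        sizes.append(max_io)
        block += max_io
    if block < extent_end:
        sizes.append(extent_end - block)  # tail; not visible in the trace excerpt
    return sizes


print(plan_reads(174468, 174464 + 1024))  # extent 197
print(plan_reads(175489, 175488 + 1024))  # extent 198
print(plan_reads(176520, 176512 + 8192))  # extent 199
```

For the three extents in the table this model gives the same request sizes as the trace: one read for each of the 1024 block extents, and 632 blocks followed by full 1024 block reads for extent 199. The final, smaller read it predicts for extent 199 is something I have not verified here.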

I don’t know why the alignment takes place. I can imagine this alignment can line up with ASM, thus avoiding a single Oracle IO request touching more than one ASM allocation unit. But I am open to other people’s ideas and suggestions about potential reasons!

Stay tuned for further updates about extra huge database IOs!

  1. Yury Pudovchenko said:

    Very interesting research, Frits !
    Thank you,

  2. Hi Frits

    First of all thank you very much for sharing the result of your tests!

    > It seems that the maximal amount of blocks of multiblock IO requests done by Oracle “unbuffered” or “direct path” is
    > 1024 blocks, which is 8MB! Please mind I have tested this ONLY with Oracle version 64 bits on Linux X64 on
    > ASM, with a database and segment blocksize of 8kB.

    If you recall the AWR report I showed you, here is the key information it contains:
    – Linux x86 64-bit
    – with ASM
    – db_block_size = 8192
    – Tablespace IO Stats, Av Blks/Rd = 2606
    – The statement used for the tests was executed in parallel

    So, it seems that more than 1024 is possible.


    • Thank you for reading and responding Chris!

      That’s interesting information. I’ll incorporate it in my investigation.

  3. Chris, I’ve done some testing, here’s what I think is happening:

    With limited specific testing outside of my lab, my guess is still that Oracle issues IO requests up to 1024 blocks during direct path reads.
    Oracle starts off with 2 IO requests in parallel with direct path reads, and can increase the number of concurrent direct path requests (based on wallclock time spent between submit and reap, CPU time and IO time), up to 32 requests in parallel per process.

    After the submission of 2 or more IO requests, potentially for 1024 adjacent blocks each, Oracle peeks in the completion queue 4 times for all outstanding IO requests (or 2 times, depending on the Oracle version). If all outstanding IO requests are ready, Oracle reaps them. This means it can reap multiple IO requests of 1024 blocks. If it can’t, the process will wait for a single request to be ready (and only then produce a ‘direct path read’ wait).

    So my assumption (because it explains it all) is that Av Blks/Rd is a measurement of reaped Oracle blocks during a single cycle of io_submit(), completion queue probes, potentially waiting for an IO request and processing the results (after which it starts over with io_submit()).
