Oracle memory troubleshooting using analysis on heapdumps

This blogpost is about analysing Oracle heap dumps. It is an extension to earlier work, Tanel Poder’s heap dump analyzer. So hat tip to Tanel, he’s done the hard work, I merely increased the presentation options. The heap analyser script that I wrote analyses Oracle heapdumps from the trace file that the dump was written to by the Oracle database. Because the heap dump representation is the same between PGA and SGA memory, it can work on both. The reason for this is that memory management is done by the same memory manager, and is commonly called ‘kgh’ (kernel generic heap) managed memory.

Please mind that for PGA analysis, not all memory is managed by the kgh memory manager. For example memory used for networking (sqlnet) is allocated totally outside of the kgh memory manager.

Let’s take the output of a PGA heap dump at level 29 (PGA, UGA, CGA, top call heaps, call heaps, session heap; executed via ‘alter session set events ‘immediate trace name heapdump level 29”):

$ ./heap_analyze.awk /u01/app/oracle/diag/rdbms/tt/tt/trace/tt_ora_6515.trc
Heap dump analyzer v0.1 by Frits Hoogland
heap size totals (all sizes in bytes)
==============================================================================================
heap name                       total size
----------------------------------------------------------------------------------------------
top call heap                       393096
callheap                             36200
pga heap                           1016816
top uga heap                        262048
session heap                        261744
------------------------------------------
total                              1969904

top 5 allocation by total size per alloc reason per heap
==============================================================================================
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
top call heap                                  5           354456
top call heap    callheap                     11            37616
top call heap    perm                          2             1024
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
callheap                                      10            35920
callheap         ctxtcds: kksQui               1              160
callheap         perm                          1              120
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
pga heap         perm                         44           290256
pga heap         Fixed UGA heap                1           226232
pga heap         diag pga                     21           214048
pga heap                                       7            62616
pga heap         kfkio bucket                  1            40984
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
top uga heap     session heap                  4           261928
top uga heap     perm                          1              120
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
session heap     perm                          3            63608
session heap                                   8            63528
session heap     koh-kghu sessi                5            28288
session heap     kxsFrame4kPage                6            24912
session heap     kxsFrame8kPage                3            24744

top 5 allocations by total size per alloc reason, type, chunk size per heap
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
top call heap                                  1       free           131048           131048
top call heap                                  2       free            65512           131024
top call heap                                  1       free            63496            63496
top call heap                                  1       free            28888            28888
top call heap    callheap                      6   freeable             4224            25344
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
callheap                                       6       free             4184            25104
callheap                                       2       free             4192             8384
callheap                                       1       free             1784             1784
callheap                                       1       free              648              648
callheap         ctxtcds: kksQui               1   freeable              160              160
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
pga heap         Fixed UGA heap                1   recreate           226232           226232
pga heap         perm                          1       perm            71720            71720
pga heap         diag pga                      1   freeable            55048            55048
pga heap         diag pga                      2   freeable            27456            54912
pga heap                                       1       free            47296            47296
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
session heap     perm                          1       perm            43584            43584
session heap                                   1       free            36544            36544
session heap     kxsFrame4kPage                6   freeable             4152            24912
session heap     kxsFrame8kPage                3   freeable             8248            24744
session heap                                   1       free            20352            20352
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
top uga heap     session heap                  3   freeable            65512           196536
top uga heap     session heap                  1   recreate            65392            65392
top uga heap     perm                          1       perm              120              120

top 5 allocations by total size per heap, alloc reason, type, chunk size
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
pga heap         Fixed UGA heap                1   recreate           226232           226232
top uga heap     session heap                  3   freeable            65512           196536
top call heap                                  1       free           131048           131048
top call heap                                  2       free            65512           131024
pga heap         perm                          1       perm            71720            71720

I figured that the first thing you want to see, are the heaps that are in the dump, and their sizes. That’s what is visible in rows 2-12.
I summed the heap sizes, which might make sense, or it might not.
In this case, with a heap dump that includes PGA, UGA and CGA plus session and call heaps, it means there are heaps in the dump that are part of another heap that is in the dump. So the total size here is bogus. This means that you need to have an understanding of what is actually dumped.

The next section, top 5 allocations by total size per alloc reason per dump, shows a per-heap summary by the allocation reason in the dump. If there’s no alloc reason, it’s free memory. Because this is a heap dump of a session that has done nothing (I just started sqlplus and ran the dump of its own PGA memory), you see that a lot of memory chunks are free memory. If you look closely to the allocation reasons, you can see that the ‘top call heap’ has a memory section that is called ‘callheap’ which is slightly larger than the ‘callheap’ section in the heap totals, and the ‘top uga heap’ section has a memory section called ‘session heap’ that is slightly larger than the ‘session heap’ section in the heap totals. In this case, it means that you can actually see the subheads in the parent heap allocation totals. The subheap size must be slightly larger in the parent heap because of headers in the memory allocations which are needed memory management. Please mind that this is based on my knowledge of how process memory is created, the only way to be absolutely sure is that a heap is part of another heap is to look at the memory addresses. This output only shows the heap names, not the addresses. The purpose of this section is to have an understanding of where memory is allocated to in a heap.

The following section, top 5 allocations by total size per alloc reason, type, chunk size per heap shows a per-heap summary by reason, type and chunksize, so you can investigate if specific types and sizes of chunks are causing issues or weird behaviour.

The last section is the same as the previous section, but doesn’t do it per heap. This is identical to what Tanel’s heapdump_analyzer shows.

You can find the heap_analyze.awk script here: https://gitlab.com/FritsHoogland/oracle_memory_analyze/blob/master/heap_analyze.awk

This is the output of a dump of the SGA of my small test database (oradebug dump heapdump 2):

$ ./heap_analyze.awk /u01/app/oracle/diag/rdbms/tt/tt/trace/tt_ora_9461.trc
Heap dump analyzer v0.1 by Frits Hoogland
heap size totals (all sizes in bytes)
==============================================================================================
heap name                       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)                    285210904
sga heap(1,3)                     83885560
------------------------------------------
total                            369096464

top 5 allocation by total size per alloc reason per heap
==============================================================================================
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)    perm                         22        188068904
sga heap(1,0)                                141         17975320
sga heap(1,0)    SO private sga               17         14268008
sga heap(1,0)    KQR PO                     8271          7568912
sga heap(1,0)    KGLHD                      9381          7023960
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
sga heap(1,3)                                 51         14120288
sga heap(1,3)    SQLA^6d9b8a7e               337          1380352
sga heap(1,3)    SQLA^31cc505b               167           684032
sga heap(1,3)    SQLA^aab93e92               162           663552
sga heap(1,3)    PLDIA^191e0a8d              155           634880

top 5 allocations by total size per alloc reason, type, chunk size per heap
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)    perm                          1       perm         15937496         15937496
sga heap(1,0)    perm                          1       perm         15931312         15931312
sga heap(1,0)    perm                          1       perm         15811464         15811464
sga heap(1,0)    perm                          1       perm         15741952         15741952
sga heap(1,0)    perm                          1       perm         15723584         15723584
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
sga heap(1,3)                                  1       free          9883208          9883208
sga heap(1,3)                                  4     R-free           839480          3357920
sga heap(1,3)    SQLA^6d9b8a7e               336  freeableU             4096          1376256
sga heap(1,3)                                  1     R-free           839360           839360
sga heap(1,3)    SQLA^31cc505b               166  freeableU             4096           679936

top 5 allocations by total size per heap, alloc reason, type, chunk size
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)    perm                          1       perm         15937496         15937496
sga heap(1,0)    perm                          1       perm         15931312         15931312
sga heap(1,0)    perm                          1       perm         15811464         15811464
sga heap(1,0)    perm                          1       perm         15741952         15741952
sga heap(1,0)    perm                          1       perm         15723584         15723584

It’s interesting to see only subpool 1 sub-sub pool 1 and 3 are used. Subpool 1,1 contains a lot of permanent allocations and a lot of allocations that might have a more permanent nature, like KQR (dictionary cache) allocations, Subpool 1,3 seems to have allocations that are deemed more transient in nature, like SQLA (sql area) allocations. This might be wildly different in databases that are actually heavily used, this is an idle lab database.

Please mind it’s important to understand that dumping the shared pool requires obtaining the respective latches, so doing that on a live production system might lead to (severe) issues. Only do this if you know what you are doing. For the PGA there can only be one using process by definition, but be careful there too, if you request a PGA dump you are interacting with memory that is deemed private by the process that is using that.

If you require more rows to be shown than the 5 that are shown, set the ‘group_sum_nr’ variable to the amount you need on row 4 of the script.

1 comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: