Archive

Tag Archives: dump

In the article oracle memory troubleshooting using analysis on heap dumps I introduced heap_analyze.awk.

The reason the tool exists is because I am using it myself. Therefore, I ran into additional things that I wanted the tool to do. I added some stuff, which is that significant, that I decided to make another blogpost to introduce the new features.

1. Percentages
In order to get an idea of the relative size of the summarised topic, I added a percentage. For example:

top 5 allocation by total size per alloc reason per heap
==================================================================================================
heap             alloc_reason            #chunks       total_size   %
--------------------------------------------------------------------------------------------------
sga heap(1,0)    perm                         19        182947456  72
sga heap(1,0)                                 43         14488960   5
sga heap(1,0)    SO private sga               18         14284168   5
sga heap(1,0)    KGLHD                      5787          4318400   1
sga heap(1,0)    KSFD SGA I/O b                1          4190416   1

2. Enhanced perm (permanent) memory descriptions
It seems that for PGA heap dumps, sometimes there is a description for memory area’s that are perm (permanent memory, memory allocated for the lifetime of the process). This is how that’s visible in the dump:

PERMANENT CHUNKS:
  Chunk     7fcbcad6c020 sz=    20632    perm      "perm           "  alo=20632
            7fcbcad6c020 sz=    20632    cpmlst    "callback hsn   "

I must say I don’t know what ‘cpmlst’ means, so if anyone knows or has a good guess, please let me know. However, the two addresses and sizes are an exact match, so I now change the alloc_reason for the cpmlst text.
This is helpful because there is a quite some memory allocated as perm. Sadly, this is not done for SGA dumps.

3. Excel mode
If you want to store and compare different dumps, one way of doing that is pasting the output in Microsoft excel. Once you set the ‘Text To Columns’ option to space as a separator, it will put the information in its own cells. But there are a few problems with that, one of them is that the heap names and alloc_reasons can have spaces in them, so that the placement of the figures can vary. I created excel mode for that.
If you set excel mode (set the variable excel_mode to 1 on line 5):

#!/bin/awk -f
BEGIN {
  printf "Heap dump analyzer v0.2 by Frits Hoogland\n";
  group_sum_nr=5;
  excel_mode=1;
}

In this mode, the horizontal lines (with ‘-‘ and ‘=’) are omitted in the output, and all spaces are changed to underscores, so a table stays consistent when pasted in excel:

heap             alloc_reason            #chunks       total_size   %
sga_heap(1,0)    perm___________              19        182947456  72
sga_heap(1,0)    _______________              43         14488960   5
sga_heap(1,0)    SO_private_sga_              18         14284168   5
sga_heap(1,0)    KGLHD__________            5787          4318400   1
sga_heap(1,0)    KSFD_SGA_I/O_b_               1          4190416   1

4. New table which shows memory by type
Another way of looking at memory in a heap is by grouping it by type. This allows you to very quickly see if a certain type of chunk is dominating a heap:

heap                        type    #chunks         min_size         max_size       total_size   %
--------------------------------------------------------------------------------------------------
top call heap               free          7            16408            65512           390064  99
top call heap           recreate          2              992              992             1984   0
top call heap               perm          2              120              904             1024   0

This is an overview of the top call heap of a session that is not active, and therefore most of it should be empty, which is true for this dump.

Once again, get the awk script here: https://gitlab.com/FritsHoogland/oracle_memory_analyze/blob/master/heap_analyze.awk

Every DBA working with the Oracle database must have seen memory dumps in tracefiles. It is present in ORA-600 (internal error) ORA-7445 (operating system error), system state dumps, process state dumps and a lot of other dumps.

This is how it looks likes:

Dump of memory from 0x00007F06BF9A9E00 to 0x00007F06BF9ADE00
7F06BF9A9E00 0000C215 0000001F 00000CC1 0401FFFF  [................]
7F06BF9A9E10 000032F3 00010003 00000002 442B0000  [.2............+D]
7F06BF9A9E20 2F415441 31323156 4F2F3230 4E494C4E  [ATA/V12102/ONLIN]
7F06BF9A9E30 474F4C45 6F72672F 315F7075 3735322E  [ELOG/group_1.257]
7F06BF9A9E40 3336382E 36313435 00003338 00000000  [.863541683......]
7F06BF9A9E50 00000000 00000000 00000000 00000000  [................]

The first column is the memory location in hexadecimal.
The second to fifth columns represent the actual memory values in hexadecimal.
The sixth column shows an ASCII representation of the memory contents. If a position does not represent an ASCII character, a dot (“.”) is printed.

Actually, the values in the second to fifth column are grouped in four columns. This is how the values in a column look like:
{hex val}{hex val}{hex val}{hex val}, for example: 00010203 means: 0, 1, 2, 3.

In the ASCII representation (sixth column) the spaces after every four values are not put in.

However, look at the following line:

7F06BF9A9E10 000032F3 00010003 00000002 442B0000  [.2............+D]

And focus on the last four characters:
“..+D” (two non-printables, plus, D)
Now look at the corresponding memory contents from the dump:
“442B0000” This is: “44 2B 00 00”, which should correspond to “. . + D”.
There is something the matter here: the plus and the D seem to be represented by “00”. That’s not correct.

Let’s see what “442B0000” actually represents in ASCI:

$ echo -e "\x44\x2B\x00\x00"
D+

Ah! That looks backwards! Let’s take a full line and see what that gives:
(This is the line with memory address 0x7F06BF9A9E20)

$ echo -e "\x2F\x41\x54\x41 \x31\x32\x31\x56 \x4F\x2F\x32\x30 \x4E\x49\x4C\x4E"
/ATA 121V O/20 NILN

So if you want to look at the actual memory contents, you need to start with the column on the left side, read the values from right to left, then go the next column, etc.

Endianness
Actual, I asked my friend Philippe Fierens for a trace file from a SPARC (big endian) platform, to see if the endianness of the platform was causing this. I test my stuff on Linux, which is little endian.

Here’s a little snippet:

Dump of memory from 0xFFFFFFFF7D977E00 to 0xFFFFFFFF7D97BE00
FFFFFFFF7D977E00 15C20000 00000001 00000000 00000104  [................]
FFFFFFFF7D977E10 F4250000 00000000 0B200400 E2EB8A3D  [.%....... .....=]
FFFFFFFF7D977E20 44475445 53540000 32F6D98B 00000590  [DGTEST..2.......]
FFFFFFFF7D977E30 00004000 00000001 00000000 00000000  [..@.............]
FFFFFFFF7D977E40 00000000 00000000 00000000 00000000  [................]

Let’s test the line from address 0xFFFFFFFF7D977E20:

[oracle@bigmachine [v12102] trace]$ echo -e "\x44\x47\x54\x45 \x53\x54\x00\x00 \x32\xF6\xD9\x8B \x00\x00\x05\x90"
DGTE ST 2� �

So, the endianness determines how the raw memory contents should be read.

%d bloggers like this: