Hey dude, where’s my memory?

This blog post is about finding the actual amount of memory a process is taking. To do so, it dives into the memory mechanisms of Linux. The examples in this article are taken from an Oracle Linux version 6.6 server, with kernel 2.6.39-400.243.1 (UEK2). This is written with Oracle database processes in mind, but the examples actually use a process running ‘cat’, which means the contents of this post are absolutely not limited to Oracle database processes.

Let’s start off with a simple example. Let’s look at our own memory map. In order to do so, I use the ‘cat’ executable and the ‘maps’ entry in the proc pseudo-filesystem. This is how that is done, including the result:

$ cat /proc/self/maps
00400000-0040b000 r-xp 00000000 fc:00 2605084                            /bin/cat
0060a000-0060b000 rw-p 0000a000 fc:00 2605084                            /bin/cat
0060b000-0060c000 rw-p 00000000 00:00 0
0139d000-013be000 rw-p 00000000 00:00 0                                  [heap]
7f444468d000-7f444a51e000 r--p 00000000 fc:00 821535                     /usr/lib/locale/locale-archive
7f444a51e000-7f444a6a8000 r-xp 00000000 fc:00 3801096                    /lib64/libc-2.12.so
7f444a6a8000-7f444a8a8000 ---p 0018a000 fc:00 3801096                    /lib64/libc-2.12.so
7f444a8a8000-7f444a8ac000 r--p 0018a000 fc:00 3801096                    /lib64/libc-2.12.so
7f444a8ac000-7f444a8ad000 rw-p 0018e000 fc:00 3801096                    /lib64/libc-2.12.so
7f444a8ad000-7f444a8b2000 rw-p 00000000 00:00 0
7f444a8b2000-7f444a8d2000 r-xp 00000000 fc:00 3801089                    /lib64/ld-2.12.so
7f444aacd000-7f444aad1000 rw-p 00000000 00:00 0
7f444aad1000-7f444aad2000 r--p 0001f000 fc:00 3801089                    /lib64/ld-2.12.so
7f444aad2000-7f444aad3000 rw-p 00020000 fc:00 3801089                    /lib64/ld-2.12.so
7f444aad3000-7f444aad4000 rw-p 00000000 00:00 0
7fff51980000-7fff519a1000 rw-p 00000000 00:00 0                          [stack]
7fff519ff000-7fff51a00000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Most people know that in order to execute ‘cat’, the shell forks and executes the cat command in a new process. This means /proc/self/maps shows the memory map of that cat process, not of the shell.

What we see is the executable (/bin/cat), the heap ([heap]), a memory-mapped locale file (/usr/lib/locale/locale-archive), two dynamic libraries (/lib64/ld-2.12.so and /lib64/libc-2.12.so), the stack ([stack]), two entries called [vdso] (virtual dynamic shared object) and [vsyscall] (virtual syscall), and anonymous memory allocations (the entries with ‘00000000 00:00 0’ and without a marker to indicate their function).
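To make the columns of a maps line a bit more tangible, here is a minimal Python sketch that splits one line into its fields and calculates the size of the entry. The names MapEntry and parse_maps_line are made up for this illustration, they are not part of any library.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    start: int    # first address of the entry
    end: int      # first address past the entry
    perms: str    # for example "r-xp"
    offset: int   # offset into the backing file
    dev: str      # major:minor of the device holding the file
    inode: int    # 0 for anonymous memory
    path: str     # file name, [heap], [stack], ... or "" when anonymous

    @property
    def size_kb(self) -> int:
        return (self.end - self.start) // 1024


def parse_maps_line(line: str) -> MapEntry:
    # The first five fields are always present, the path is optional.
    fields = line.split(None, 5)
    addr_range, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr_range.split("-"))
    return MapEntry(start, end, perms, int(offset, 16), dev, int(inode), path)
```

Feeding it the first line of the output above gives an entry of 44 kB for /bin/cat. On a Linux system you could loop over open("/proc/self/maps") and call parse_maps_line() for every line, which then describes the Python process itself.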

In order to understand the libraries, we first need to know something about the executable itself, using the command ‘file’:

$ file /bin/cat
/bin/cat: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped

There actually is a great deal of information to be seen from this one line:
– This is an ELF format executable. Another executable format is PE, which is derived from COFF and is used on the Windows platform.
– The executable is 64 bits (x86-64).
– The most important part for this article: the executable is dynamically linked, and uses shared libraries.
– The last word ‘stripped’ deserves some explanation too: the executable is stripped, which means a lot of symbolic information (function names for example) is removed from the executable. One reason for doing so is to make the executable smaller. If you look at the oracle executable, you will see it’s not stripped, except for the Oracle XE executable, which is stripped.
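Most of what ‘file’ reports here comes from the first few bytes of the executable. As an illustration, this is a small Python sketch that decodes the first 20 bytes of an ELF header. The function name describe_elf_header is my own, and the lookup tables only contain a handful of the values the ELF specification defines.

```python
import struct

ELF_CLASS = {1: "32-bit", 2: "64-bit"}           # e_ident[EI_CLASS]
ELF_DATA = {1: "LSB", 2: "MSB"}                  # e_ident[EI_DATA]
ELF_TYPE = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core file"}
ELF_MACHINE = {3: "Intel 80386", 62: "x86-64"}   # only two of many e_machine values


def describe_elf_header(header: bytes) -> str:
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    ei_class, ei_data = header[4], header[5]
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return "ELF {} {} {}, {}".format(
        ELF_CLASS.get(ei_class, "unknown class"),
        ELF_DATA.get(ei_data, "unknown byte order"),
        ELF_TYPE.get(e_type, "unknown type"),
        ELF_MACHINE.get(e_machine, "machine %d" % e_machine),
    )
```

You could try it with describe_elf_header(open("/bin/cat", "rb").read(20)). Whether an executable is stripped or dynamically linked can not be read from these bytes, for that ‘file’ has to look further into the ELF file.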

We have now established this is a dynamically linked executable. The next step is to see the libraries it is using. This is done with the ‘ldd’ (list dynamic dependencies) executable:

$ ldd /bin/cat
	linux-vdso.so.1 =>  (0x00007fffa9388000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f6ffa3dd000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f6ffa772000)

Please mind these are the dependencies for executing /bin/cat, which does not necessarily mean you see all of them as libraries in the address space of the process executing the /bin/cat executable. Two of them are special. The first is linux-vdso.so.1, the virtual dynamic shared object (shown as [vdso] in maps), which essentially means from the Oracle perspective that some system calls can be executed fully in userspace, most notably for Oracle engineers: gettimeofday(). Fellow Oaky James Morle wrote a nice article explaining this.
The other one is /lib64/ld-linux-x86-64.so.2, which is the dynamic loader needed for executing the executable. The last one is libc.so.6, which “truly” is a library that is dynamically loaded for use during execution. The two discussed earlier are necessary for instantiating the execution, not so much during the execution.

If we now look back at the maps output, we see the libc library (libc-2.12.so, the file libc.so.6 is a symbolic link to) and the ld-2.12.so library. ld-2.12.so is the dynamic loader library (the file ld-linux-x86-64.so.2 is a symbolic link to), which provides the dynamic loading function at runtime.

Okay, we have now gained some understanding of executables and libraries. Now let’s look in a bit more detail at the executable in maps:

00400000-0040b000 r-xp 00000000 fc:00 2605084                            /bin/cat
0060a000-0060b000 rw-p 0000a000 fc:00 2605084                            /bin/cat

Why is the executable mentioned twice?
The answer is the way an ELF object is built up. In order to look deeper into this ELF binary, we can use the ‘readelf’ executable:

$ readelf -l /bin/cat

Elf file type is EXEC (Executable file)
Entry point 0x401850
There are 8 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001c0 0x00000000000001c0  R E    8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000a204 0x000000000000a204  R E    200000
  LOAD           0x000000000000a208 0x000000000060a208 0x000000000060a208
                 0x0000000000000648 0x0000000000001000  RW     200000
  DYNAMIC        0x000000000000a3e8 0x000000000060a3e8 0x000000000060a3e8
                 0x0000000000000190 0x0000000000000190  RW     8
  NOTE           0x000000000000021c 0x000000000040021c 0x000000000040021c
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x0000000000009414 0x0000000000409414 0x0000000000409414
                 0x000000000000027c 0x000000000000027c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07

When ‘readelf’ is invoked with the -l option, it displays the information that is contained in the program headers (the segment headers) of the executable. This is a lot of detail, and I won’t go too deep into it.

The biggest distinction between the two entries of the executable in maps is that the first entry is read-only and executable (r-xp), and the second entry is read-write (rw-p). If we now look at the readelf -l output, we see the same flags in the Flags column, now indicated by R, W and E for Read, Write and Execute. If we look in the ‘Section to Segment mapping’ part, we see segment 02 containing a lot of sections, and if we look up at the program headers and count from zero, we see header 02 is of type “LOAD” and is flagged as ‘R E’. This memory area is often referred to as the code segment.
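The relation between the readelf flags and the permissions in maps can be written down in a few lines of Python. The bit values are the ones the ELF specification defines for p_flags; the function name segment_perms is just something I made up for this sketch.

```python
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4   # p_flags bits: execute, write, read


def segment_perms(p_flags: int, private: bool = True) -> str:
    """Translate ELF program header flags into the notation used in maps."""
    return (
        ("r" if p_flags & PF_R else "-")
        + ("w" if p_flags & PF_W else "-")
        + ("x" if p_flags & PF_X else "-")
        # The last character in maps is not an ELF flag: it tells whether
        # the mapping is private (p, copy on write) or shared (s).
        + ("p" if private else "s")
    )
```

With this, segment_perms(PF_R | PF_X) gives ‘r-xp’ for the ‘R E’ header, and segment_perms(PF_R | PF_W) gives ‘rw-p’ for the ‘RW’ header.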

Some of the types of content in segment 02 are:
– Dynamic linking information (.interp, .gnu.hash, .dynsym, .dynstr)
– C runtime code (.init, .fini)
– Relocation information (.rela.dyn, .rela.plt)
– String constants (.rodata)
– Machine instructions (.text)
– Procedure Linkage Table (.plt)

The next segment that deserves attention is segment 03. If we look in the program headers and count from zero, we see this is a “LOAD” segment too, but flagged as ‘RW’. This memory area is often referred to as the data segment.

Some of the types of content in segment 03 are:
– Dynamic linking info (.dynamic)
– Relocated pointer values to external symbols (.got)
– Global offset table entries used by the procedure linkage table (.got.plt)
– C runtime data (.ctors, .dtors)
– Initialised and uninitialised program data (.data, .bss)

As you can see with these two, segment 02 contains information which can actually be read-only (the executable itself is never changed at runtime), which is why it’s mapped read-only. Segment 03 contains information which, by the nature of dynamic linking, really can’t be read-only, because we don’t know what the library looks like that the executable needs to load via the dynamic loader (for example, at which addresses its functions end up); the only thing that is required from the library is that it contains the functions the executable needs from that library.
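The addresses in maps can be recalculated from the two LOAD program headers, because memory is mapped in whole pages. Below is a sketch of that arithmetic in Python, assuming a page size of 4 kB. The function load_segment_mappings is my own simplification, not how the kernel literally implements it.

```python
PAGE_SIZE = 0x1000  # assumption: 4 kB pages, as on this x86-64 system


def page_down(addr: int) -> int:
    return addr & ~(PAGE_SIZE - 1)


def page_up(addr: int) -> int:
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def load_segment_mappings(offset: int, vaddr: int, filesz: int, memsz: int):
    """Return (file_backed, anonymous) for one LOAD program header.

    file_backed is (start, end, file offset); anonymous is (start, end),
    or None when MemSiz does not reach past the file backed pages.
    """
    start = page_down(vaddr)
    file_end = page_up(vaddr + filesz)
    mem_end = page_up(vaddr + memsz)
    file_backed = (start, file_end, page_down(offset))
    anonymous = (file_end, mem_end) if mem_end > file_end else None
    return file_backed, anonymous
```

For the ‘R E’ header (offset 0x0, VirtAddr 0x400000, FileSiz and MemSiz 0xa204) this gives 0x400000-0x40b000, which is the first /bin/cat line in maps. For the ‘RW’ header (offset 0xa208, VirtAddr 0x60a208, FileSiz 0x648, MemSiz 0x1000) it gives 0x60a000-0x60b000 with file offset 0xa000, the second /bin/cat line, plus 0x60b000-0x60c000 for the part where MemSiz exceeds FileSiz. That last range matches the anonymous entry directly after /bin/cat in maps, which as far as I can tell is the rest of .bss.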

At this point I think it should be clear why there are two entries for a single executable. But how about the libraries?

Actually, simplified, a library is an executable with only functions, and without a program to run any of these functions. Libraries can also use other shared libraries for their functions. This means you can use the above-mentioned executables (file, ldd, readelf) to examine libraries, and get quite similar results. The programmer can choose what the memory segments look like and how these are divided, which is the reason the dynamic loader library (ld-2.12.so) has 3 entries, and the libc library (libc-2.12.so) has 4. That said, 3 or 4 entries seem to be extremely common for libraries, as do 2 for executables.

Okay, now that we have gained some understanding of the memory segments of a process, let’s continue with actual memory usage. When a process is forked, Linux tries to do the least possible and save as much memory as possible. To do so, the new process gets its own virtual memory address space, which is a duplicate of that of the process which called fork, including its memory. However, what really happens is that the page tables of the new process point to the same physical pages as those of the forking process. Only when a page is written to does the kernel give the writing process its own private copy of that page, leaving the original page untouched. This technique is often referred to as COW (Copy On Write). The ‘pointer trick’ is made possible by the kernel and the use of virtual memory: every process can use the same (virtual) memory addresses, which are kept separate from other processes by translating them to different physical locations, or to the same locations for copy on write. Of course the read-only entries we saw in maps will never change, and as a result will be in memory in only a single place, potentially referenced/used by a lot of processes. But even the writable entries are only physically allocated as soon as there truly is a write action.
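The ‘p’ at the end of the permissions in maps (r-xp, rw-p) stands for such a private, copy on write mapping. A small way to see the effect without forking is to map a file privately and write to the mapping. This is a Python sketch under the assumption that mmap.ACCESS_COPY results in a private mapping, which is what the Python documentation describes; the function name private_mapping_demo is made up.

```python
import mmap
import os
import tempfile


def private_mapping_demo():
    """Write to a private mapping of a file; return (memory view, file view)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        # ACCESS_COPY: writes change our copy of the page, never the file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as mapping:
            mapping[:8] = b"modified"
            seen_in_memory = bytes(mapping[:8])
        os.lseek(fd, 0, os.SEEK_SET)
        seen_in_file = os.read(fd, 8)
        return seen_in_memory, seen_in_file
    finally:
        os.close(fd)
        os.unlink(path)
```

The process sees ‘modified’ through the mapping, while the file still contains ‘original’: the write made the kernel copy the page for this process only. The same mechanism keeps the writable part of /bin/cat or libc private to every process that maps it.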

The ‘just in time’ principle of memory allocation is also used for anonymous memory allocations via mmap(), which is how most of the PGA allocations of Oracle database processes are done. When a process allocates memory via mmap(), it results in an anonymous memory entry, which is visible in maps. However, it isn’t physically allocated yet; only when the memory truly is getting used is it physically allocated.
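This can be made visible with a small experiment: allocate anonymous memory with mmap(), and look at the resident set size of the process before and after actually touching the memory. The Python sketch below reads the second field of /proc/self/statm for that (resident pages), so it only does something on Linux. The function names and the 32 MB size are arbitrary choices of mine, and the exact numbers will differ per system.

```python
import mmap
import os

STATM = "/proc/self/statm"


def resident_kb():
    """Resident set size of this process in kB, or None without Linux /proc."""
    if not os.path.exists(STATM):
        return None
    with open(STATM) as statm:
        resident_pages = int(statm.read().split()[1])
    return resident_pages * mmap.PAGESIZE // 1024


def lazy_allocation_demo(size=32 * 1024 * 1024):
    """Return resident kB (before, after mmap, after touching), or None."""
    before = resident_kb()
    if before is None:
        return None
    # Anonymous private mapping: visible in maps without a file name.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    after_map = resident_kb()
    # Writing one byte per page forces the kernel to back each page.
    for position in range(0, size, mmap.PAGESIZE):
        region[position] = 1
    after_touch = resident_kb()
    region.close()
    return before, after_map, after_touch
```

What I would expect to see is that the resident size hardly changes after the mmap() call, and only grows by roughly the size of the mapping once the pages have been written to.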

I hope you start to see at this point that you can see the memory areas a process is using, but you cannot tell how much of that is really, uniquely allocated for that process. Also, and maybe even more importantly, just adding up the memory figures of Oracle database processes will do (a lot of) double counting. In a future blog post or posts I will dive into how bits and pieces of this can actually be seen from, for example, the /proc/PID/smaps pseudo-file, and how this knowledge can be applied to sizing Oracle databases.
