This post is about memory management on the operating system level of an Oracle database. The first question that might pop in your head is: isn’t this a solved problem? The answer is: yes, if you use Oracle’s AMM (Automatic Memory Management) feature, which let’s you set a limit for the Oracle datababase’s two main memory area’s: SGA and PGA. But in my opinion any serious, real life, usage of an Oracle database on Linux will be (severely) constrained in performance because of the lack of huge pages with AMM, and I personally witnessed very strange behaviour and process deaths with the AMM feature and high demand for memory.
This means that I strongly advise customers to use Oracle’s ASMM (Automatic Shared Memory Management) feature. In the newer versions of 11.2 I found this to be working very well. Earlier versions like 10.2 could suffer from an ever growing shared pool (which also means an ever shrinking buffer cache), especially when bind variables weren’t used. This still could happen, but it seems the SGA memory management feature in 11.2 handles this well in most cases. The ASMM feature means a fixed memory area is allocated for the SGA. SGA allocation has always been fixed outside of the AMM feature, as far as I know.
When ASMM doesn’t work, meaning the memory areas are getting sized wrong and performance is influenced by that, the last option is to size the memory area’s yourself. However, since version 184.108.40.206 Oracle will resize when the memory manager thinks it’s feasible. See Kurt van Meerbeek’s article about that.
That leaves the PGA (Process Global Area) as a memory area on itself. Most databases are using the automatic PGA memory management, which is enabled once the PGA_AGGREGATE_TARGET parameter is set to a non zero value. A common misunderstanding is this setting is actually limiting the overall PGA usage of an instance. The truth is automatic PGA memory management will make attempts to adhere to the PGA_AGGREGATE_TARGET value. These are the actual words in the official Oracle documentation: ‘attempts to adhere’!
This means sort memory, hash memory and bitmap memory will be actively limited in size per process by automatic PGA memory management, any attempt to allocate more than automatic PGA memory management allows will result in moving some contents of these memory areas to the assigned temporary tablespace of the database user, to make room for new data.
However, there are more memory area’s allocatable per process, which are never swapped to disk, thus always will stay in memory, and these could not be limited in an officially supported way prior to Oracle version 12. Two structures which are allocated in PGA and never swapped to disk are PL/SQL collections and PL/SQL tables. Creating and filling these requires the usage of PL/SQL (hence their names); the reason for mentioning this is that if your database is not used by PL/SQL but only SQL, you almost certainly will not run into the problem I describe below.
You might be thinking: wait a minute! Does this mean a developer can just create such a structure, and allocate whatever he/she likes, with all the consequences that it can have, like the operating system starting to swap, and can do that for every single process? Yes, this is what this means. This is why Oracle introduced a parameter called PGA_AGGREGATE_LIMIT with Oracle 12, to effectively limit the overall PGA heap size.
In case you wonder what this means, or even doubting my words, I have written a little program to demonstrate this behaviour.
This is the source code to create my test table T2:
exec dbms_random.seed('abracadabra'); create table t2 as with generator as ( select rownum id from dual connect by rownum <= 1000 ) select rownum id, trunc((rownum-1)/50) clustered, mod(rownum,20000) scattered, trunc(dbms_random.value(0,20000)) randomized, trunc(sysdate) + dbms_random.value(-180, 180) random_date, dbms_random.string('l',6) random_string, lpad(rownum,10,0) vc_small, rpad('x',100,'x') vc_padding from generator g1, generator g2 where rownum <= 1000000 ; exec dbms_stats.gather_table_stats(null,'T2');
This is a very smart way to generate a table. I actually borrowed this from Jonathan Lewis.
Next up, I created a small anonymous PL/SQL block to take the contents from the T2 table, and store them in a collection until I hit the limit in the variable ‘grow_until’.
declare type sourcetab is table of t2%ROWTYPE; c_tmp sourcetab; c_def sourcetab := sourcetab(); v_b_p number := 0; v_c_p number := 0; v_b_u number := 0; v_c_u number := 0; grow_until number := 700000000; p_a_t number; begin select value into v_b_p from v$mystat m, v$statname n where m.statistic#=n.statistic# and name = 'session pga memory max'; select value into v_b_u from v$mystat m, v$statname n where m.statistic#=n.statistic# and name = 'session uga memory max'; select value into p_a_t from v$parameter where name = 'pga_aggregate_target'; select * bulk collect into c_tmp from t2; while v_c_p < grow_until loop for c in c_tmp.first .. c_tmp.last loop c_def.extend(1); c_def(c_def.last) := c_tmp(c); select value into v_c_p from v$mystat m, v$statname n where m.statistic#=n.statistic# and name = 'session pga memory max'; select value into v_c_u from v$mystat m, v$statname n where m.statistic#=n.statistic# and name = 'session uga memory max'; if v_c_p >= grow_until then exit; end if; end loop; end loop; dbms_output.put_line('vbp : '||v_b_p); dbms_output.put_line('vcp : '||v_c_p); dbms_output.put_line('vbu : '||v_b_u); dbms_output.put_line('vcu : '||v_c_u); dbms_output.put_line('pat : '||p_a_t); end; /
Please mind the session needs to have create table, create session granted, enough quota in the default tablespace and select on v_$mystat, v_$parameter and v_$statname granted.
This is run on an Oracle 220.127.116.11 database:
TS@v12102 > @pga_filler vbp : 3535368 vcp : 700051976 vbu : 1103192 vcu : 4755704 pat : 524288000 PL/SQL procedure successfully completed.
The begin sizes of the UGA (vbu) and PGA (vbp) are 1’103’192 and 3’535’368. The PGA_AGGREGATE_TARGET size is set to 524’288’000 (500MB). I did set the grow_until variable to 700’000’000 (roughly 700MB), which is more than PGA_AGGREGATE_TARGET. After running this, it’s easy to spot the values of vcu (UGA allocation) and vcp (PGA allocation). vcu grew to 4’755’704 during the run, however vcp grew to 700’051’976, a little more than 700MB! This shows that the collection is stored in the PGA, and that the collection grew beyond the value set with PGA_AGGREGATE_TARGET.
This behaviour is consistent in versions 18.104.22.168, 22.214.171.124, 126.96.36.199, 188.8.131.52 and 184.108.40.206.
Let me emphasise once again that the above proof of concept code managed to allocate more memory than was set for the overall PGA usage of the entire instance. This can have an enormous, devastating impact on a consolidated database setup (meaning having multiple instances running on a single machine). Typically, once memory consumption of all the processes exceeds physically available memory, the operating system tries to use the swap device, to which it will swap memory pages in and out depending on memory usage of active (=on CPU) processes. Mild swapping shows as severely slowed-down processing (because a number of memory pages for processing need to be read from the swap device and placed in memory, from which the former contents need to be written to the swap device), heavy swapping shows as the machine coming down to a standstill.
Please mind that a diagnosis on the state of memory usage (alias swapping), just by looking at the amount of used swap (as can be seen in the ‘top’ output, or ‘swapon -s’) could be misleading. It’s also important to look at actual swapping in and out, as can be seen with ‘vmstat 1′ (si/so columns) or swap -W. I’ve found several systems which had been running for some time (approximately longer than a month) that had swap usage, sometimes up to 40%, while no ‘active swapping’, so memory pages being transfered to and from the swap device, was happening.
Luckily, starting with Oracle 12 you can actually limit overall PGA usage using the parameter PGA_AGGREGATE_LIMIT. The default value is the greater of (list from Oracle documentation):
b) 200% of PGA_AGGREGATE_TARGET parameter (or lower if 200% > (90% of physical memory – total SGA size) but not below 100%)
c) 3MB * PROCESSES parameter
The parameter can not set below it’s default value, except when set in a pfile or spfile.
Let’s set the PGA_AGGREGATE_LIMIT to 600MB and see what happens when we start doing a large allocation again:
SQL> alter system set pga_aggregate_limit=600m scope=spfile; System altered. SQL> startup force;
Okay, let’s run the pga_filler.sql script again, and try to allocate 900MB. This means the “grow_until” variable must be set to 900000000.
PLEASE MIND this is done as a regular user, the SYS user and background processes other than job queue processes are not subject to the limiting.
TS@v12102 > @pga_filler declare * ERROR at line 1: ORA-01423: error encountered while checking for extra rows in exact fetch ORA-00039: error during periodic action ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT ORA-06512: at line 21
Great! Exactly like we expect, right?
Well…yes, but let’s look at the alert.log
Sat Dec 13 15:08:57 2014 Errors in file /u01/app/oracle/diag/rdbms/v12102/v12102/trace/v12102_ora_4147.trc (incident=46599): ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT Incident details in: /u01/app/oracle/diag/rdbms/v12102/v12102/incident/incdir_46599/v12102_ora_4147_i46599.trc Sat Dec 13 15:09:07 2014 Dumping diagnostic data in directory=[cdmp_20141213150907], requested by (instance=1, osid=4147), summary=[incident=46599]. Sat Dec 13 15:09:09 2014 Sweep [inc]: completed Sweep [inc2]: completed
Okay, essentially, this tells us nothing interesting, except for the tracefile. Let’s look in/u01/app/oracle/diag/rdbms/v12102/v12102/trace/v12102_ora_4147.trc, being the tracefile as indicated in the above alert.log snippet:
*** 2014-12-13 15:08:57.351 Process may have gone over pga_aggregate_limit Just allocated 65536 bytes Dumping short stack in preparation for potential ORA-4036 ----- Abridged Call Stack Trace ----- ksedsts()+244<-ksm_pga_limit_short_stack()+1016<-ksm_check_over_limit()+469<-ksmarfg()+574<-kghgex()+1376<-kghfnd()+361<-kghalo()+4422<-kghgex()+414<-kghfnd()+361<-kghalo()+4422<-kghgex()+414<-kghfnd()+361<-kghalo()+4422<-kghgex()+414<-kghalf()+1003<-klmalf()+103 <-kllcqas()+194<-kcblasm()+108<-kxhfNewBuffer()+607<-qerhjSplitBuild()+632 ----- End of Abridged Call Stack Trace ----- ======================================= PRIVATE MEMORY SUMMARY FOR THIS PROCESS --------------------------------------- ****************************************************** PRIVATE HEAP SUMMARY DUMP 781 MB total: 781 MB commented, 646 KB permanent 208 KB free (0 KB in empty extents), 779 MB, 2 heaps: "koh-kghu call " 57 KB free held ------------------------------------------------------ Summary of subheaps at depth 1 779 MB total: 778 MB commented, 110 KB permanent 63 KB free (0 KB in empty extents), 667 MB, 42786 chunks: "pmuccst: adt/record " 83 MB, 5333 chunks: "pl/sql vc2 "
Actually, this is the end of the tracefile. It seems that the pga limit dump (the text in between “Process may have gone over pga_aggregate_limit” to the private memory summary heap dumps) occurs several times before an actual ORA-4036 is triggered. In my private test instance, where I am obviously the only user process doing something, I get a pga limit dump approximately 20 times before the ORA-4036 is actually triggered:
sending 4036 interrupt Incident 46599 created, dump file: /u01/app/oracle/diag/rdbms/v12102/v12102/incident/incdir_46599/v12102_ora_4147_i46599.trc ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT
Did you actually spot the oddity here?
Remember the PGA_AGGREGATE_LIMIT was set to 600M. Now look at the process’ PGA/Private heap summary dump above: it says 781M. Please mind the 781M is the PGA heap of a SINGLE process! When looking at the total PGA allocated for the entire instance, it’s even more:
SYS@v12102 AS SYSDBA> select value/power(1024,2) "MB" from v$pgastat where name = 'maximum PGA allocated'; MB ---------- 1041.16699
So…despite PGA_AGGREGATE_LIMIT set to 600M, according to the v$pgastat view, there’s 1041MB allocated for PGA. Please mind I haven’t looked into how accurate v$pgastat is, but I tend to believe this.
I’ve seen PGA_AGGREGATE_TARGET being used as a calculation value for actual PGA usage of an instance. This is simply wrong. The actual amount of PGA memory allocated by the instance is highly depended on what is done, and can be less than PGA_AGGREGATE_TARGET, or more. Automatic PGA can control three per process memory area’s: the sort, hash and bitmap memory area’s. These are sized based on the setting of PGA_AGGREGATE_TARGET and the actual PGA memory usage instance wide. If more memory is needed for sort, hash or bitmap memory than is made available by the memory manager, excess memory needed is allocated in the temporary tablespace. Any other PGA memory allocation is always done, regardless of the setting of PGA_AGGREGATE_TARGET.
Starting with Oracle 12, it seems the actual PGA allocation now can actually be limited with the new parameter PGA_AGGREGATE_LIMIT. However, during some simple testing it shows that actually more memory is allocated than set with PGA_AGGREGATE_LIMIT as limit. I haven’t tested it in more situations, this post is meant to grow awareness that the actual limit as set by PGA_AGGREGATE_LIMIT might not be that hard as you would expect.
Please mind, PGA_AGGREGATE_LIMIT seems to truly limit PGA usage instance wide, not limit the PGA heap per process, as event 10251 (PGA usage limiting way for Oracle 11.2) does. However, once again: PGA_AGGREGATE_LIMIT seems to try to be smart and actually does not limit at the exact size set, but beyond that.
The next post will introduce a way to limit PGA usage in Oracle 11.2. Stay tuned!