Archive

Tag Archives: security

Actually, this is a follow up post from my performance deep dive into tablespace encryption. After having investigated how tablespace encryption works, this blogpost is looking at the other encryption option, column encryption. A conclusion that can be shared upfront is that despite they basically perform the same function, the implementation and performance consequences are quite different.

Column encryption gives you the ability to choose to encrypt per individual column, that’s kind of obvious. However, having to choose which columns to encrypt is what I see as the biggest downside of this encryption option. In most cases, especially with boxed applications, it is quite hard to try to figure out which columns you exactly want to encrypt in order to protect your sensitive data. Which columns do exactly contain your primary sensitive data, and which columns do contain secondary sensitive data (data derived from sensitive data). Do you, when you have to apply encryption, know what EXACTLY is defined as sensitive data, and what isn’t? I bet there isn’t a clear technical description.

A logical reaction then would be ‘couldn’t I then just encrypt all columns’? Well, that is what tablespace encryption is for, isn’t it? To summarise this: I do think the correct use of column encryption in reality is hard to implement and this very limited in usefulness, in most cases tablespace encryption should be used.

Okay…for this test I created a table with two columns, of which one is encrypted:

SQL> create table column_encryption (id number, a varchar2(10) encrypt);
SQL> insert into column_encryption values (1, 'AAAAAAAAAA');
SQL> commit;

The same table, but without encryption:

SQL> create table no_column_encryption (id number, a varchar2(10) );
SQL> insert into no_column_encryption values (1, 'AAAAAAAAAA');
SQL> commit;

And the same table with a lot of rows:

SQL> create table column_encryption_large (id number, a varchar2(10) encrypt);
SQL> begin
 	     for counter in 1..32000000 loop
 		     insert into column_encryption_large values ( counter, dbms_random.string('l',10) );
 	     end loop;
 end;
/

Let’s follow the path of the previous TDE post, and profile the execution of a SQL on the big table to see the impact of column encryption. The first test is a ‘select count(*) from column_encryption_large’ in one session, and ‘perf record -g -p PID’ in another. If you need more explanation on how to run it, please look at the previous blogpost. This is the output of ‘perf report –sort comm –max-stack 2’:

# perf report --sort comm --max-stack 2
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 1K of event 'cycles'
# Event count (approx.): 1418165467
#
# Children      Self  Command
# ........  ........  ...............
#
   100.00%   100.00%  oracle_6919_aob
            |--29.21%-- kdstf00000010000100kmP
            |--12.58%-- kdbulk
            |--3.32%-- gup_pte_range
            |--2.58%-- kdst_fetch0
            |--2.54%-- kcbgtcr
            |--2.25%-- __blk_bios_map_sg
            |--2.21%-- kcbhvbo
            |--2.18%-- unlock_page
            |--1.98%-- ktrgcm
            |--1.93%-- do_direct_IO
            |--1.86%-- kcbldrget
            |--1.52%-- kcoapl

This shows IO related functions, both Oracle and operating system level; kdstf is kernel data scan table full for example, gup_pte_range, do_direct_IO, unlock_page and __blk_bios_map_sg are Linux kernel functions. Most notably there are no encryption related functions, which is a big difference with tablespace encryption!
This is actually very logical if you understand the differences between column encryption and tablespace encryption. First let’s look at a block dump from a data block from segment in an encrypted tablespace:

Block dump from cache:
Dump of buffer cache at level 4 for pdb=0 tsn=5 rdba=907
Block dump from disk:
Encrypted block <5, 907> content will not be dumped. Dumping header only.
buffer tsn: 5 rdba: 0x0000038b (1024/907)
scn: 0x0.4e9af4 seq: 0x01 flg: 0x16 tail: 0x9af40601
frmt: 0x02 chkval: 0xf23a type: 0x06=trans data

Yes…you read that right: the block is encrypted, so it will not be dumped. Luckily, you can set the undocumented parameter “_sga_clear_dump” to true to make Oracle dump the block:

SQL> alter session set "_sga_clear_dump"=true;
SQL> alter system dump datafile 5 block 907;

This will make Oracle dump the block. The dump will show the decrypted version of the tablespace level encrypted block:

Block header dump:  0x0000038b
 Object id on Block? Y
 seg/obj: 0x17bc3  csc: 0x00.4e9aed  itc: 2  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x388 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0007.01d.000001d0  0x00000987.0390.27  --U-    1  fsc 0x0000.004e9af4
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x0000038b
data_block_dump,data header at 0x7f140f335374
===============
tsiz: 0x1f98
hsiz: 0x14
pbl: 0x7f140f335374
     76543210
flag=--------
ntab=1
nrow=1
frre=-1
fsbo=0x14
fseo=0x1f8a
avsp=0x1f76
tosp=0x1f76
0xe:pti[0]      nrow=1  offs=0
0x12:pri[0]     offs=0x1f8a
block_row_dump:
tab 0, row 0, @0x1f8a
tl: 14 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [10]  41 41 41 41 41 41 41 41 41 41
end_of_block_dump

For the count(*), there is no need to read the data, the only thing needed is to read the row directory to fetch the number of rows (row 19). However, to do that, the block must be decrypted.

Now look at a block dump of a column encrypted data block:

Block header dump:  0x0000032b
 Object id on Block? Y
 seg/obj: 0x1821d  csc: 0x00.676d7e  itc: 2  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x328 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.007.000078a9  0x00000117.2246.07  --U-    1  fsc 0x0000.00676d7f
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x0000032b
data_block_dump,data header at 0x7f140f333264
===============
tsiz: 0x1f98
hsiz: 0x14
pbl: 0x7f140f333264
     76543210
flag=--------
ntab=1
nrow=1
frre=-1
fsbo=0x14
fseo=0x1f5d
avsp=0x1f49
tosp=0x1f49
0xe:pti[0]      nrow=1  offs=0
0x12:pri[0]     offs=0x1f5d
block_row_dump:
tab 0, row 0, @0x1f5d
tl: 59 fb: --H-FL-- lb: 0x1  cc: 2
col  0: [ 2]  c1 02
col  1: [52]
 fd e0 87 66 55 f7 e6 43 de be 31 f6 71 4f 7f 4e f1 75 fb 88 98 9d 13 ed 8e
 cb 69 02 bc 29 51 bd 21 ea 22 04 6b 70 e9 ec 01 9d d6 e4 5a 84 01 1d 90 b0
 e9 01
end_of_block_dump

The block and the row directory can be read normally without any need for decryption. The only thing encrypted is the column (“a”). That perfectly explains the absence of any functions that indicate decryption, because there isn’t any decryption taking place!

Now let’s rewrite the SQL to touch the data, and thus involve decryption: ‘select avg(length(a)) from column_encryption_large’. This way the row needs to be decrypted and read. This is how the output of a perf recording looks like:

# perf report --sort comm --max-stack 2
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 65K of event 'cycles'
# Event count (approx.): 229042607170
#
# Children      Self  Command
# ........  ........  ...............
#
   100.00%   100.00%  oracle_6919_aob
            |--24.73%-- ztchsh1h
            |--14.91%-- ztchsh1n
            |--6.10%-- y8_ExpandRijndaelKey
            |--5.90%-- ownGetReg
            |--5.50%-- __intel_ssse3_rep_memcpy
            |--4.99%-- ztchsh1f
            |--4.28%-- ztcxi
            |--2.60%-- ipp_is_GenuineIntel
            |--1.88%-- _intel_fast_memcpy
            |--1.74%-- _intel_fast_memcpy.P
            |--1.52%-- kspgip
            |--1.16%-- kgce_init

The functions starting with ‘ztc’ are probably related to security (“zecurity”), and also probably related to decryption. The function name “y8_ExpandRijndaelKey” is clearly related to encryption. When you look up the function address of “ownGetReg”, it’s close to the “y8_ExpandRijndaelKey” function. The last group of functions are memcpy related functions, that seems consistent with decrypting: moving data.

On the performance side, it’s clear that the majority of the time is spend in the functions ztchsh1h and ztchsh1n. In order to understand more about these functions, let’s expand the stack:

# perf report --sort comm
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 65K of event 'cycles'
# Event count (approx.): 229035032972
#
# Children      Self  Command
# ........  ........  ...............
#
   100.00%   100.00%  oracle_6919_aob
            |
            |--25.01%-- ztchsh1h
            |          |
            |          |--99.63%-- ztchsh1n
            |          |          |
            |          |          |--50.85%-- ztchsh1f
            |          |          |          ztchf
            |          |          |          ztcxf
            |          |          |          ztcx
            |          |          |          kztsmohmwl
            |          |          |          kztsmhmwl
            |          |          |          kzekmetc
            |          |          |          kzecsqen
            |          |          |          kzecctex
            |          |          |          evaopn2
            |          |          |          evaopn2
            |          |          |          qesaAggNonDistSS
            |          |          |          kdstf00001010000000km
            |          |          |          kdsttgr
            |          |          |          qertbFetch
            |          |          |          qergsFetch
            |          |          |          opifch2
            |          |          |          kpoal8
------------------------------------------------------
            |--14.90%-- ztchsh1n
            |          |
            |          |--85.25%-- ztchsh1f
            |          |          ztchf
            |          |          ztcxf
            |          |          ztcx
            |          |          kztsmohmwl
            |          |          kztsmhmwl
            |          |          kzekmetc
            |          |          kzecsqen
            |          |          kzecctex
            |          |          evaopn2
            |          |          evaopn2
            |          |          qesaAggNonDistSS
            |          |          kdstf00001010000000km
            |          |          kdsttgr
            |          |          qertbFetch
            |          |          qergsFetch
            |          |          opifch2
            |          |          kpoal8

I fetched the stack of the two functions in which the most time was spend. The most important thing to see is that the encryption now takes place as part of processing the fetched data (qesaAggNonDistSS probably has something to do with aggregating data, evaopn2 probably is a function to evaluate operands) rather than performing the (logical) IO; mind the absence of the kcbgtcr function.

The reason for doing the decryption during operand evaluation rather than during doing the IO is because the data is stored encrypted in the block, and thus also in the buffer cache. So during IO time, there is no need to decrypt anything. This also has another rather important consequence: for every row that has an encrypted column that is processed, decryption needs to take place. There does not seem to be any caching of the decrypted value for column encryption, which is logical from a security point of view, but has a severe performance consequence.

When doing a pin tools debugtrace on the above SQL for the processing of a single row (the table ‘column_encryption’, rather than ‘column_encryption_large’), applying the sed filters, and then grepping for a selective set of functions, this is how the processing looks like:

 | | | | > qergsFetch(0x294512030, 0x7f871c9fa2f0, ...)
 | | | | | > qeaeAvg(0x7f8717ce9968, 0xe, ...)
 | | | | | < qeaeAvg+0x000000000063 returns: 0  | | | | | > qertbFetch(0x294512178, 0x7f871ca08a68, ...)
 | | | | | | | | | | > kcbgtcr(0x7ffe2f9b3ae0, 0, ...)
 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | > pread64@plt(0x100, 0x1f428c000, ...)
 | | | | | | | | | | < kcbgtcr+0x000000003221 returns: 0x1f428c014  | | | | | | | | | | | | | | > kcbgtcr(0x7ffe2f9b35d0, 0, ...)
 | | | | | | | | | | | | | | < kcbgtcr+0x0000000009a1 returns: 0x1f428c014  | | | | | | > kdsttgr(0x7f871c9f9918, 0, ...)
 | | | | | | | > kdstf00001010000000km(0x7f871c9f9918, 0, ...)
 | | | | | | | | > kdst_fetch(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | > kdst_fetch0(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | | | > kcbgtcr(0x7f871c9f9930, 0, ...)
 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | > pread64@plt(0x100, 0x2b1115000, ...)
 | | | | | | | | | | | < kcbgtcr+0x000000003221 returns: 0x1e4aa6014
 | | | | | | | | | < kdst_fetch0+0x0000000004d0 returns: 0x1e4aa6076
 | | | | | | | | < kdst_fetch+0x000000000048 returns: 0x1e4aa6076  | | | | | | | | > qesaAggNonDistSS(0x7ffe2f9b45d0, 0x7fff, ...)
 | | | | | | | | | > evaopn2(0x294511ef0, 0x294512030, ...)
 | | | | | | | | | | > evaopn2(0x294511e68, 0x294512030, ...)
 | | | | | | | | | | | | | | | | | | | > ztchsh1n(0x7ffe2f9b1ef8, 0x11c4e8d0, ...)
 | | | | | | | | | | | | | | | | | | | > ztchsh1f(0x7ffe2f9b1ef8, 0x7ffe2f9b3100, ...)
 --> 168 times in total of ztchsh1n or ztchsh1f
 | | | | | | | | | | < evaopn2+0x0000000002dc returns: 0x7f871c9fa2c0  | | | | | | | | | | > evalen(0x294511ef0, 0x7f871c9fa2c0, ...)
 | | | | | | | | | | < evalen+0x000000000147 returns: 0x2
 | | | | | | | | | < evaopn2+0x0000000002dc returns: 0x7f871c9fa2d0  | | | | | | | | | > qeaeAvg(0x7f8717ce9968, 0xb, ...)
 | | | | | | | | | < qeaeAvg+0x000000000063 returns: 0x7f8717ce99c9
 | | | | | | | | < qesaAggNonDistSS+0x000000000193 returns: 0x7fff  | | | | | | | | > kdst_fetch(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | > kdst_fetch0(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | | | > kcbgtcr(0x7f871c9f9930, 0, ...)
 | | | | | | | | | | | < kcbgtcr+0x0000000009a1 returns: 0x1dec30014
 | | | | | | | | | < kdst_fetch0+0x0000000004d0 returns: 0x1dec30072
 | | | | | | | | < kdst_fetch+0x000000000048 returns: 0x1dec30072  | | | | | | | | > kdst_fetch(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | > kdst_fetch0(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | | | > kcbgtcr(0x7f871c9f9930, 0, ...)
 | | | | | | | | | | | < kcbgtcr+0x0000000009a1 returns: 0x1deca4014
 | | | | | | | | | < kdst_fetch0+0x0000000004d0 returns: 0x1deca4072
 | | | | | | | | < kdst_fetch+0x000000000048 returns: 0x1deca4072  | | | | | | | | > kdst_fetch(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | > kdst_fetch0(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | | | > kcbgtcr(0x7f871c9f9930, 0, ...)
 | | | | | | | | | | | < kcbgtcr+0x0000000009a1 returns: 0x1e4be0014
 | | | | | | | | | < kdst_fetch0+0x0000000004d0 returns: 0x1e4be0072
 | | | | | | | | < kdst_fetch+0x000000000048 returns: 0x1e4be0072  | | | | | | | | > kdst_fetch(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | > kdst_fetch0(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | | | > kcbgtcr(0x7f871c9f9930, 0, ...)
 | | | | | | | | | | | < kcbgtcr+0x0000000009a1 returns: 0x1dedb2014
 | | | | | | | | | < kdst_fetch0+0x0000000004d0 returns: 0x1dedb2072
 | | | | | | | | < kdst_fetch+0x000000000048 returns: 0x1dedb2072  | | | | | | | | > kdst_fetch(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | > kdst_fetch0(0x1, 0x7f871c9f9918, ...)
 | | | | | | | | | < kdst_fetch0+0x0000000011c9 returns: 0
 | | | | | | | | < kdst_fetch+0x000000000048 returns: 0
 | | | | | | | < kdstf00001010000000km+0x00000000035d returns: 0x7fff
 | | | | | | < kdsttgr+0x00000000085f returns: 0x7fff
 | | | | | < qertbFetch+0x000000001301 returns: 0x7fff  | | | | | > qeaeAvg(0x7f8717ce9968, 0x294511f78, ...)
 | | | | | < qeaeAvg+0x000000000063 returns: 0x2  | | | | | | > evaopn2(0x294511f78, 0, ...)
 | | | | | | < evaopn2+0x0000000002dc returns: 0x7f871c9fa2e0
 | | | | < qergsFetch+0x000000000f25 returns: 0

This is how the explain plan of the ‘select avg(length(a)) from column_encryption’ SQL:

----------------------------------------------------------------------------------------
| Id  | Operation	   | Name	       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |		       |       |       |     3 (100)|	       |
|   1 |  SORT AGGREGATE    |		       |     1 |    53 |	    |	       |
|   2 |   TABLE ACCESS FULL| COLUMN_ENCRYPTION |     1 |    53 |     3	 (0)| 00:00:01 |
----------------------------------------------------------------------------------------

If you look back to the grepped debugtrace, and the execution plan:
Line 1: the sort aggregate rowsource (qergsFetch).
Line 4: the table acces full (qertbFetch).
Line 5: here a logical read (kcbgtcr) is issued, and because the block didn’t exist in the cache, was physically read (line 6: pread64). This is the segment header, the “real” scan of data blocks has not been started yet.
Line 11: this is the ultra fast full table scan (kdstf00001010000000km). My guess is this function is a full table scan function with certain decisions hard coded, instead of choices made on runtime, so the in-CPU execution prediction runs into less branch mispredictions.
Line 12: this is the part of the full table scan for fetching (loading) the data (kdst_fetch). What is special here is that a multiblock read is done, the kcbgtcr function triggers a single physical read for multiple logical blocks, which are later fetched per block (kdst_fetch and kcbgtcr functions starting from line 32, 38, etc).
Line 19: this function executes row based functions and aggregates the results per block/fetch (qesaAggNonDistSS).
Line 20: as part of fetching the row and executing functions, the row value is evaluated first (evaopn2).
Line 21/22: here the column is decrypted (made visible by the ztchsh1n/ztchsh1f functions, not necessarily the decryption functions theirselves).
Line 26/29: here probably the length (evalen) and average (qeaeAvg) row functions are executed.
Line 32: the next block is processed, but no rows are found, and thus no need to execute rowsource (qe*) functions afterwards.

So, what do we know at this point regarding column encryption?
– Columns that are encrypted are stored encrypted in the block in the buffer cache.
– Which means they have to be decrypted every time the column values are read, which is different from tablespace encryption, for which a block is encrypted, and is decrypted whenever a block is read into the buffer cache.
– Functions related to column encryption specifically (different functions than seen with tablespace encryption) take roughly 40% of the time in my case.

Can the time spend on column decryption be optimised?
There are multiple ways you can change the way Oracle applies column encryption. There are four encryption types: 3DES168, AES128, AES192 and AES256. The default is AES192.
Here are query timings of doing a select avg(length(a)) from TABLE on my “large” table with 32 million rows:

3DES168 4:53
AES256 1:09
AES192 1:06
AES128 1:03

A way to optimise column encryption is to specify not to use an extra verification by specifying ‘nomac’ at the encryption definition of the column. This saves space (by default, extra space is used for the message abstract that is used by the verification for every column). These are the timings with the ‘nomac’ addition to disable encryption verification:

3DES168 3:59
AES256 0:26
AES192 0:23
AES128 0:22

This shows a significant reduction of time. However, if no encryption at all is applied to the column, the query timing is 0:03.

Internals background information
The functions ztchsh1n/ztchsh1 are related to verification (again, if you read the tablespace encryption blogpost, where the most time consuming functions were verification too). Once ‘nomac’ is specified with the encryption definition of the column, the ztchsh* function vanish, and the top time consuming functions are y8_ExpandRijndaelKey and ownGetReg, which clearly are directly related to decryption. The performance gain of ‘nomac’ is lesser with 3DES168 encryption.

Conclusion
I think tablespace encryption is the encryption method of choice for a normal implementation. In most cases it will be too much work to exactly figure out which columns to encrypt. If you still consider column encryption, you also should be aware that the column value is stored encrypted in the block and (as a consequence) in the cache. Every use of the encrypted column involves encryption or decryption, for which the overhead is significant, even with ‘nomac’ specified to disable (additional) verification.

Recently, I was trying to setup TDE. Doing that I found out the Oracle provided documentation isn’t overly clear, and there is a way to do it in pre-Oracle 12, which is done using ‘alter system’ commands, and a new-ish way to do it in Oracle 12, using ‘administer key management’ commands. I am using version 12.1.0.2.170117, so decided to use the ‘administer key management’ commands. This blogpost is about an exception which I see is encountered in the Januari 2017 (170117) PSU of the Oracle database, which is NOT happening in Oracle 12.2 (no PSU’s for Oracle 12.2 at the time of writing) and Oracle 12.1.0.2 April 2016 and October 2016 PSU’s.

In order to test the wallet functionality for TDE, I used the following commands:

SQL> select status, wrl_parameter from v$encryption_wallet;

STATUS
------------------------------
WRL_PARAMETER
--------------------------------------------------------------------------------
NOT_AVAILABLE
/u01/app/oracle/admin/test/wallet

SQL> !mkdir /u01/app/oracle/admin/test/wallet

SQL> administer key management create keystore '/u01/app/oracle/admin/test/wallet' identified by "this_is_the_keystore_password";

keystore altered.

SQL> administer key management set keystore open identified by "this_is_the_keystore_password";

keystore altered.

SQL> administer key management set key identified by "this_is_the_keystore_password" with backup;
administer key management set key identified by "this_is_the_keystore_password" with backup
*
ERROR at line 1:
ORA-28374: typed master key not found in wallet

SQL> select status, wrl_parameter from v$encryption_wallet;

STATUS
------------------------------
WRL_PARAMETER
--------------------------------------------------------------------------------
CLOSED
/u01/app/oracle/admin/test/wallet

SQL> administer key management set keystore open identified by "this_is_the_keystore_password";

keystore altered.

SQL> select status, wrl_parameter from v$encryption_wallet;

STATUS
------------------------------
WRL_PARAMETER
--------------------------------------------------------------------------------
OPEN
/u01/app/oracle/admin/test/wallet

Notes:
Line 1-10: The DB_UNIQUE_NAME of the instance is ‘test’, and therefore the default wallet location is /u01/app/oracle/admin/test/wallet (ORACLE_BASE/admin/DB_UNIQUE_NAME/wallet). The wallet directory doesn’t exist by default, so I created it (line 10).
Line 12: Here the keystore/wallet is created with a password.
Line 16: After the wallet is created without auto-login, the wallet must be opened using the ‘set keystore open’ command.
Line 20: After the wallet has been created, it does not contain a master key. This is done using the ‘set key’ command. However, this throws an ORA-28374 error.
Line 26: After an error involving the wallet has occurred, the wallet closes.
Line 35: The wallet can simply be opened using the earlier used ‘set keystore open’ command.
Line 39: This is where the surprise is: after opening, the master key “magically” appeared (visible by the status ‘OPEN’, without a master key this would be ‘OPEN_NO_MASTER_KEY’).

I yet have to start creating encrypted table spaces. There might be more surprises, I can’t tell at this moment because I didn’t try it. However, once I discovered this oddity, I talked to my colleague Matt who gave me his own runbook for enabling TDE, which turned out to be the exact same list of commands as I compiled, however he did not encounter the ORA-28374 which I did. I tested the same sequence of commands on 12.2.0.1, 12.1.0.2.161018 (October 2016) and 12.1.0.2.160419 (April 2016) and there the ORA-28374 was not raised during execution of the ‘set key’ command.

Update!
Reading through My Oracle Support note Master Note For Transparent Data Encryption ( TDE ) (Doc ID 1228046.1), I found the following text:

All the versions after 12.1.0.2

=====================

As of 12.1.0.2 If the key associated with the SYSTEM, SYSAUX or UNDO tablespaces is not present in the wallet you cannot associate a new master key with the database (i.e. you cannot activate that master key for the database) unless you set a hidden parameter :

SQL> administer key management use key ‘AUQukK/ZR0/iv26nuN9vIqcAAAAAAAAAAAAAAAAAAAAAAAAAAAAA’ identified by “welcome1” with backup;
administer key management use key ‘AUQukK/ZR0/iv26nuN9vIqcAAAAAAAAAAAAAAAAAAAAAAAAAAAAA’ identified by “welcome1” with backup
*
ERROR at line 1:
ORA-28374: typed master key not found in wallet

alter system set “_db_discard_lost_masterkey”=true;

SQL> administer key management use key ‘AUQukK/ZR0/iv26nuN9vIqcAAAAAAAAAAAAAAAAAAAAAAAAAAAAA’ identified by “welcome1” with backup;

The heading and first line read weird, the heading indicates the paragraph is about ‘all the versions after 12.1.0.2’ (which to me means 12.2), and the first line in the paragraph says ‘as of 12.1.0.2’, which very clearly says this is about version 12.1.0.2 and higher. However, a little further it shows the exact error (ORA-28374) I encountered, and explains that if a current key is used in the data dictionary (mind data dictionary, not wallet), you must set “_db_discard_lost_masterkey” to true before you can create and use another master key for a wallet if you start over (wipe or move the wallet directory).

This makes sense to me now! I tried dropping and creating new wallets in my current 170117 PSU instance, and only tried creating an encryption wallet in a brand new freshly created instance. So if I would have EXACTLY done the same in the instances with the other PSU’s, which is repeatedly create and drop a wallet for TDE, I would have encountered the same ORA-28374 error. Well…I see this as a safety mechanism, be it not a very obvious one, not exuberant documented, and probably causing more grief than it would save if you run into the need the change the master key.

Recently I was asked to analyse the security impact of the snmp daemon on a recent Exadata. This system was running Exadata image version 12.1.2.1.3. This blog article gives you an overview of a lot of the things that surround snmp and security.

First of all what packages are installed doing something with snmp? A list can be obtained the following way:

# rpm -qa | grep snmp
net-snmp-utils-5.5-54.0.1.el6_7.1.x86_64
net-snmp-libs-5.5-54.0.1.el6_7.1.x86_64
net-snmp-5.5-54.0.1.el6_7.1.x86_64
sas_snmp-14.02-0103.x86_64

Essentially the usual net-snmp packages and a package called ‘sas_snmp’.

A next important thing is how the firewall is configured. However, the default setting of the firewall on the compute nodes with exadata is the firewall turned off:

# iptables -L -v
Chain INPUT (policy ACCEPT 437M packets, 216G bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 343M packets, 748G bytes)
 pkts bytes target     prot opt in     out     source               destination

So if there is something running that listens to a network port that can benefit ‘attackers’, there is no firewall to stop them.

Next obvious question is what snmp processes are actually running:

# ps -ef |grep snmp
root       7088      1  0 Aug16 ?        00:51:32 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
root      33443      1  0 03:14 ?        00:00:49 /usr/sbin/lsi_mrdsnmpagent -c /etc/snmp/snmpd.conf
root      33454  33443  0 03:14 ?        00:00:00 /usr/sbin/lsi_mrdsnmpagent -c /etc/snmp/snmpd.conf

The snmpd process is the net-snmp snmp daemon. However, there are two additional processes running with the name ‘snmp’ in them, one is owned by init, and a processes that this process has spawned. The name ‘lsi_mrdsnmpagent’ probably means LSI MegaRaid SNMP agent. That gives a fair hint this processes is doing something snmp related specifically for the LSI MegaRaid adapter, which is the disk controller.

Are there any open ports related to snmp processes?

# netstat -anp | grep snmp
tcp        0      0 127.0.0.1:199               0.0.0.0:*                   LISTEN      7088/snmpd
udp        0      0 0.0.0.0:161                 0.0.0.0:*                               7088/snmpd
udp        0      0 0.0.0.0:22917               0.0.0.0:*                               7088/snmpd

1. tcp port 199
This is support for the SMUX protocol (RFC 1227) to communicate with SMUX-based subagents. This is a deprecated feature in favour of AgentX. It is considered a bug (https://bugzilla.redhat.com/show_bug.cgi?id=110931) the daemon is still using this port. However the port is opened on localhost (127.0.0.1) and as such not reachable from outside of the machine, which means it is not a direct security problem.

2. udp port 161
This is the default snmpd port. This port is open to the outside world on the compute node, which can be seen from the address 0.0.0.0 in the above ‘source’ column. The port being open can be verified using another machine and the ‘nmap’ tool:

$ sudo nmap -Pn -sU -p 161 311.1.1.1
Password:

Starting Nmap 6.47 ( http://nmap.org ) at 2016-10-26 15:00 CEST
Nmap scan report for 311.1.1.1
Host is up (0.087s latency).
PORT    STATE SERVICE
161/udp open  snmp

The status ‘open’ shows this udp port does respond to requests.

3. udp port 22917 (in this case; this port number is random)
This is a random port that gets set for trapsink directive set in the /etc/snmp/snmpd.conf. A trap sink is the destination for snmp traps that get triggered. Although this udp port is in use, it does not respond to network traffic:

$ sudo nmap -Pn -sU -p 22917 311.1.1.1
Password:

Starting Nmap 6.47 ( http://nmap.org ) at 2016-10-26 15:22 CEST
Nmap scan report for 311.1.1.1
Host is up.
PORT      STATE         SERVICE
22917/udp open|filtered unknown

The status ‘open|filtered’ does mean the udp port does not respond to requests.

Now let’s look how the actual configuration file of the snmp daemon looks like on exadata. The configuration file is /etc/snmp/snmpd.conf:

snmp daemon configuration file:
trapcommunity public
trapsink 127.0.0.1 public
rocommunity public 127.0.0.1
rwcommunity public 127.0.0.1

access  RWGroup         ""      any       noauth    exact all all all
com2sec snmpclient      127.0.0.1               public
group   RWGroup                 v1                              snmpclient

pass .1.3.6.1.4.1.4413.4.1 /usr/bin/ucd5820stat
pass .1.3.6.1.4.1.3582 /usr/sbin/lsi_mrdsnmpmain

syscontact Root <root@localhost> (configure /etc/snmp/snmp.local.conf)
syslocation Unknown (edit /etc/snmp/snmpd.conf)

view    all             included      .1                80

The snmpd.conf file shows:
– trapsink destination (127.0.0.1, localhost) and community string (public).
– the ro and rw communities are set to ‘public 127.0.0.1’
In general it is advised to change the community strings to something unique to avoid being easy guessable. However, in this case there’s also a network description following the ro and rw community, which is: 127.0.0.1. This means snmp access is restricted to localhost.
This can be verified by running snmpwalk from another machine:

$ snmpwalk -v 2c -c public 311.1.1.1
Timeout: No Response from 311.1.1.1

This means there is no way to communicate to the snmp daemon from outside of the machine. We can see from the snmp daemon configuration file that access to the snmp deamon is limited to localhost.

It seems the LSI megaraid snmp agent works together with snmpd:

root      33443      1  0 03:14 ?        00:01:01 /usr/sbin/lsi_mrdsnmpagent -c /etc/snmp/snmpd.conf
root      33454  33443  0 03:14 ?        00:00:00 /usr/sbin/lsi_mrdsnmpagent -c /etc/snmp/snmpd.conf

Obviously it reads the snmpd.conf (-c /etc/snmp/snmpd.conf as seen above), but it has got a configuration file of its own. This configuration file of ‘lsi_mrdsnmpagent’ process can be seen when looking at the files inside the sas_snmp rpm package (rpm -ql sas_snmp), but the main evidence it is using the file can be derived from looking at the open file descriptors of the lsi_mrdsnmpagent process:

# ls -ls /proc/$(pgrep -f lsi_mrdsnmpagent | head -1)/fd
total 0
0 lr-x------. 1 root root 64 Oct 26 03:14 0 -> /dev/null
0 lr-x------. 1 root root 64 Oct 26 03:14 1 -> /etc/lsi_mrdsnmp/sas/sas_TrapDestination.conf
0 l-wx------. 1 root root 64 Oct 26 03:14 2 -> /var/log/cellos/cron_daily_cellos.stderr (deleted)
0 lrwx------. 1 root root 64 Oct 26 03:14 3 -> socket:[2923149143]
0 l-wx------. 1 root root 64 Oct 26 03:14 4 -> /var/log/cellos/cellos.log (deleted)
0 l-wx------. 1 root root 64 Oct 26 03:14 5 -> /var/log/cellos/cellos.trc (deleted)
0 lr-x------. 1 root root 64 Oct 26 03:14 6 -> /etc/snmp/snmpd.conf
0 lr-x------. 1 root root 64 Oct 26 03:14 7 -> /etc/redhat-release
0 lr-x------. 1 root root 64 Oct 26 03:14 8 -> /dev/megaraid_sas_ioctl_node
0 lr-x------. 1 root root 64 Oct 26 03:14 9 -> pipe:[2919419375]

Line 4 shows ‘/etc/lsi_mrdsnmp/sas/sas_TrapDestination.conf’! Let’s look inside that configuration file:

# cat /etc/lsi_mrdsnmp/sas/sas_TrapDestination.conf
#################################################
# Agent Service needs the IP addresses to sent trap
# The trap destination may be specified in this file or
# using snmpd.conf file. Following indicators can be set
# on "TrapDestInd" to instruct the agent to pick the IPs
# as the destination.
# 1 - IPs only from snmpd.conf
# 2 - IPs from this file only
# 3 - IPs from both the files
#################################################
TrapDestInd 3
#############Trap Destination IP##################
# Add port no after IP address with no space after
# colon to send the SNMP trap message to custom port.
# Community is to be mentioned after IP. If no community
# is mentioned, default SNMP community 'public' shall be
# used. 'trapcommunity' token is also used in snmpd.conf.
# Alternatively, you can also use trapsink command
# in snmpd.conf to send the SNMP trap message to
# custom port, else default SNMP trap port '162' shall
# be used.
127.0.0.1	public
# 145.147.201.88:1234	public
# 145.146.180.20:3061	testComm
127.0.0.1:8162 public

It is a configuration file that works alongside the snmpd.conf configuration. What is important to see, is ‘TrapDestInd’, which is set at ‘3’, which means that traps are send to trap destinations set in the snmpd.conf file AND set in the sas_TrapDestionation.conf file. Two traps are defined in the file, 127.0.0.1 with community string public, which means it sends a trap to udp port 161 (at which the snmpd process is listening, as we saw earlier in the open ports list), but the most interesting thing here is there’s also a trap send to 127.0.0.1 at port 8162. That is a port number I do not know from the top of my head!

However, it’s simple to find out. The first thing to check is to see what process is running at port 8162:

# netstat -anp | grep 8162
udp        0      0 :::8162                     :::*                                    15233/java

That’s a java process! Let’s grep the process number to see if the full command line gives more clues what this java process is:

# ps -ef | grep 15233
dbmsvc    15233  15136  0 Aug16 ?        05:32:25 /usr/java/jdk1.7.0_80/bin/java -client -Xms256m -Xmx512m -XX:CompileThreshold=8000 -XX:PermSize=128m -XX:MaxPermSize=256m -Dweblogic.Name=msServer -Djava.security.policy=/opt/oracle/dbserver_12.1.2.1.3.151021/dbms/deploy/wls/wlserver_10.3/server/lib/weblogic.policy -XX:-UseLargePages -XX:ParallelGCThreads=8 -Dweblogic.ListenPort=7878 -Djava.security.egd=file:/dev/./urandom -Xverify:none -da -Dplatform.home=/opt/oracle/dbserver_12.1.2.1.3.151021/dbms/deploy/wls/wlserver_10.3 -Dwls.home=/opt/oracle/dbserver_12.1.2.1.3.151021/dbms/deploy/wls/wlserver_10.3/server -Dweblogic.home=/opt/oracle/dbserver_12.1.2.1.3.151021/dbms/deploy/wls/wlserver_10.3/server -Dweblogic.management.discover=true -Dwlw.iterativeDev= -Dwlw.testConsole= -Dwlw.logErrorsToConsole= -Dweblogic.ext.dirs=/opt/oracle/dbserver_12.1.2.1.3.151021/dbms/deploy/wls/patch_wls1036/profiles/default/sysext_manifest_classpath weblogic.Server

That’s java running weblogic, with the name ‘msServer’. That is something that is part of the daemons that serve dbmcli (alike the daemons that service cellcli on the cells)!

This actually makes sense. The daemons that manage the database server fetch hardware status information and hardware failures from the BMC using the IPMI device (/dev/ipmi0). However the LSI MegaRaid adapter can not provide its status in that way. So in order for the management daemons to keep track of events on the LSI MegaRaid adapter (hardware issues), a daemon that works together with the snmp daemon is setup, which sends snmp traps if something occurs. The management daemon has setup a port that listens for these traps.

I do not know if the community strings is processed by the management deamon processing the trap. However, the port number on which the daemon is listening for traps is defined in ‘/opt/oracle/dbserver/dbms/deploy/config/cellinit.ora’ with the directive BMC_SNMP_PORT.

Conclusion
As far as I can see, the reason the snmp deamon is running is to be able to run the LSI MegaRaid SNMP agent process, so it can send traps to the compute node’s management daemons. Since most Exadata compute nodes do not have the firewall enabled, udp port 161 is exposed. The settings of the snmp daemon itself limits access to localhost.

In the first post of the hardening serie, I described how to scan a website with Nikto. Another scanner, which not only scans HTTP servers, but does FTP, SMB/CIFS, telnet, ssh, Oracle listener and operating systems too is Nessus. Let me say right away: Nessus was started as an open source effort, but has been made closed source some time ago. A fork of the open source version is called OpenVAS.

Nessus is free for personal and non-commercial use. This means you can download it and use it to look how it works. If you want to scan your corporate webserver or websites, or use it as a tool for doing consultancy, you must purchase a ‘ProfessionalFeed’.

Nessus is a very advanced scanner, with a repository of checks (called plugins), which is kept up-to-date by the company which provides nessus, called ‘Tenable Network Security’. The updates can be downloaded manually (via a script), but are updated automatically every 24 hours by default.

These are the plugin families (from the nessus website):

  • CGI abuses – This plugin family checks for anything that is ‘CGI’ related, unless it is XSS (and only a XSS vulnerability), in which case it falls into the “CGI abuses : XSS” family. These checks use a combination of detection techniques, including checking version of the application and testing for the actual vulnerability. The attacks include software detection, information disclosure, XSS, SQLi, LFI, RFI, overflows and more.
  • CGI abuses : XSS – Specific CGI checks for reflective and persistent XSS vulnerabilities in common web applications.
  • Database – Typically a web server will run a database that is used by various web applications.
  • FTP – Web pages need to be updated, and FTP is a popular protocol used to allow your web developers to send files to the server.
  • Gain a Shell Remotely – If you can obtain a shell on the remote web server, testing the application is somewhat moot.
  • Gain root remotely – Same thing as above, if you gain root, resolve this problem before the application is tested.
  • General – Contains the operating system fingerprinting plugins, including ones that will identify the OS over HTTP. Identifying the underlying operating system is very important for web application testing, as it will determine the syntax of commands sent via injection (command and SQL) attacks.
  • Remote file access- Includes checks for specific web server/application vulnerabilities that lead to remote file disclosure.
  • Service detection – Contains checks for several different services, including detecting Apache running HTTPS, HTTP CONNECT proxy settings and other services that may host web applications.
  • Web servers – Plugins in this family detect approximately 300 specific vulnerabilities in popular web servers, such as Apache, IIS and generic vulnerabilities associated with the HTTP protocol itself.

Also, operating system updates are checked if Nessus is able to log on (either by credentials specified or if it found a username/password combination from an internal list: for example username ‘oracle’ with password ‘oracle’ is tested).

With all these tests it’s still very important to note that nothing beats proper design and validation of applications. Nessus (and most vulnerability checking software I know) checks for known issues. Customizations and self build applications are probably not thoroughly checked for vulnerabilities. This means Nessus could easily lead to a false sense of safety.

The Oracle HTTP Server (OHS) is a version of the Apache HTTP daemon, modified by Oracle to work with it’s Application Server suite. The Application Server suite is called ‘Fusion Middleware’ with version 11. The OHS acts/can act as a frontend to application servers like OC4J and Weblogic. In my opinion an application server should always have an http server in front of it, to act as:
– Logger
– SSL offloader
– Firewall
In my opinion it would also be very good to let the OHS be a frontend for APEX.

The OHS is only a decent firewall if you made sure it can only do what you have intended it to do. Sadly, this isn’t the case by default. Whilst the most recent (11.1) version is quite safe (stripped of functionality), it only is because it’s better than its predecessors, which where quite ‘open’.

Hardening is essentially done the same way as firewalling is: disable everything (disable all functionality) and enable only the needed functionality as limited as possible.

A decent way of checking enabled functionality, is using Nikto 2. Nikto is a web server scanner.

Here is an example:

$ ./nikto.pl -host example.com
- Nikto v2.1.1
---------------------------------------------------------------------------
+ Target IP: xxx.xxx.xxx.xxx
+ Target Hostname: example.com
+ Target Port: 80
+ Start Time: 2010-05-26 10:33:35
---------------------------------------------------------------------------
+ Server: Oracle-Application-Server-10g/10.1.2.0.2 Oracle-HTTP-Server OracleAS-Web-Cache-10g/10.1.2.0.2 (G;max-age=0+0;age=0;ecid=201564454590,0)
+ Uncommon header 'tcn' found, with contents: choice
+ ETag header found on server, inode: 3425418, size: 20042, mtime: 0x43e7685d;493cf3fe
+ Number of sections in the version string differ from those in the database, the server reports: oracle-application-server-10g/10.1.2.0.2oracle-http-serveroracleas-web-cache-10g/10.1.2.0.2(g;max-age=0+0;age=0;ecid=201564454590,0) while the database has: 10.1.3.1.0. This may cause false positives.
+ Oracle-Application-Server-10g/10.1.2.0.2Oracle-HTTP-ServerOracleAS-Web-Cache-10g/10.1.2.0.2(G;max-age=0+0;age=0;ecid=201564454590,0) appears to be outdated (current is at least 10.1.3.1.0)
+ Allowed HTTP Methods: GET, HEAD, OPTIONS, TRACE, POST, PUT, DELETE, CONNECT, PATCH, PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK
+ OSVDB-397: HTTP method ('Allow' Header): 'PUT' method could allow clients to save files on the web server.
+ OSVDB-5646: HTTP method ('Allow' Header): 'DELETE' may allow clients to remove files on the web server.
+ HTTP method ('Allow' Header): 'CONNECT' may allow server to proxy client requests.
+ HTTP method ('Allow' Header): 'PROPFIND' may indicate DAV/WebDAV is installed. This may be used to get directory listings if indexing is allow but a default page exists.
+ HTTP method ('Allow' Header): 'PROPPATCH' indicates WebDAV is installed.
+ OSVDB-5647: HTTP method ('Allow' Header): 'MOVE' may allow clients to change file locations on the web server.
+ OSVDB-877: HTTP TRACE method is active, suggesting the host is vulnerable to XST
+ OSVDB-27487: Apache is vulnerable to XSS via the Expect header
+ OSVDB-700: /fcgi-bin/echo?foo=alert('Vulnerable'): Fast-CGI has two default CGI programs (echo.exe/echo2.exe) vulnerable to Cross Site Scripting (XSS). http://www.cert.org/advisories/CA-2000-02.html.
+ OSVDB-3954: /fcgi-bin/echo2?foo=alert('Vulnerable'): Fast-CGI has two default CGI programs (echo.exe/echo2.exe) vulnerable to Cross Site Scripting (XSS). http://www.cert.org/advisories/CA-2000-02.html.
+ OSVDB-561: /server-status: This reveals Apache information. Comment out appropriate line in httpd.conf or restrict access to allowed hosts.
+ OSVDB-3233: /index.html.de: Apache default foreign language file found. All default files should be removed from the web server as they may give an attacker additional system information.
+ OSVDB-3233: /index.html.en: Apache default foreign language file found. All default files should be removed from the web server as they may give an attacker additional system information.
+ OSVDB-3233: /index.html.es: Apache default foreign language file found. All default files should be removed from the web server as they may give an attacker additional system information.
+ OSVDB-3233: /index.html.fr: Apache default foreign language file found. All default files should be removed from the web server as they may give an attacker additional system information.
+ OSVDB-3233: /index.html.it: Apache default foreign language file found. All default files should be removed from the web server as they may give an attacker additional system information.
+ OSVDB-3092: /fcgi-bin/echo: The FastCGI echo program may reveal system info or lead to other attacks.
+ OSVDB-3092: /fcgi-bin/echo2: The FastCGI echo2 program may reveal system info or lead to other attacks.
+ OSVDB-3233: /j2ee/: j2ee directory found--possibly an Oracle app server directory.
+ OSVDB-3233: /icons/README: Apache default file found.
+ 3823 items checked: 25 item(s) reported on remote host
+ End Time: 2010-05-26 10:44:18 (643 seconds)
---------------------------------------------------------------------------
+ 1 host(s) tested

There are all kinds of things to see here (some highlights):
-The server describes into detail what software is used. Not a problem on itself, but it gives very much information which can perfectly be used to attack the server, or which can be used to determine if the server is vulnerable:

+ Server: Oracle-Application-Server-10g/10.1.2.0.2 Oracle-HTTP-Server OracleAS-Web-Cache-10g/10.1.2.0.2

-Nikto tells there are newer versions of the OHS:

+ Oracle-Application-Server-10g/10.1.2.0.2Oracle-HTTP-ServerOracleAS-Web-Cache-10g/10.1.2.0.2(G;max-age=0+0;age=0;ecid=201564454590,0) appears to be outdated (current is at least 10.1.3.1.0)

-Nikto has determined this host is vulnerable to Cross Site Scripting (XSS):

+ OSVDB-27487: Apache is vulnerable to XSS via the Expect header

-Nikto found some files which gives away even more detailed information:

+ OSVDB-3092: /fcgi-bin/echo: The FastCGI echo program may reveal system info or lead to other attacks.

This is very much and detailed information about the configuration of this server.

%d bloggers like this: