Archive

Oracle

I posted a message on twitter saying that DBMS_PIPE is an excellent mechanism to make a session run and halt in PLSQL. One response I gotten was asking for an example of that. That is what this post is about.

DBMS_PIPE is an implementation of a pipe inside the Oracle database. A pipe is a mechanism that is not limited to the Oracle database, in fact I assume the implementation is inspired by an operating system level pipe, which can be created using the ‘mknod /path/pipename p’ on unix and unix-like systems. A quick search shows windows has got them too, but not really, or does it? Another implementation is the pipe to redirect output from one command to the next using ‘|’. The essence of a pipe that input to and output from the pipe are separated from each other, and that information is flowing from the inputting process of the pipe to the outputting process.

For the specific purpose this blogpost is about, DBMS_PIPE, I am not using the information that is sent through the pipe, I am mainly using the fact that a receiving session waits for information to be sent, and therefore halts execution.

When the Oracle database executes in PL/SQL mode, there is no simple way to make it pause and request for user input, like you would do with the ‘pause’ or ‘accept’ commands in sqlplus in SQL mode. PL/SQL runs and provides its output when the session quits PL/SQL mode, in other words: there is no interaction with the terminal that executes the PL/SQL procedure or anonymous block. So if you want to make it stop and continue, you must use something else, for which I use DBMS_PIPE. Of course there are other ways, I would encourage people to come up with other mechanisms as a comment.

The use the technique of using DBMS_PIPE, you need two sessions, one which runs the PL/SQL block that you want to pause and continue (using a database pipe), and one to make it continue by sending a message to that pipe. Another thing that you might notice is the lack of explicit setup of the pipe; I just declare waiting for the pipe, and it’ll get created implicitly.

This is how it’s done:

1. Simple example anonymous PL/SQL block:

set serverout on;
declare
 v_result number;
 v_time1 date;
 v_time2 date;
begin
  select sysdate into v_time1 from dual;
  v_result := dbms_pipe.receive_message ( 'test' );
  select sysdate into v_time2 from dual;
  dbms_output.put_line('you waited: '||(24*60*60)*(v_time2-v_time1)||' seconds');
end;
/

If you execute this, it will provide no output, and in fact be waiting for the pipe that I named ‘test’.

In order to make anonymous PL/SQL block continue, I need another session to send a message to the pipe:
2. Send empty message to the pipe:

var v_result number;
exec v_result := dbms_pipe.send_message ( 'test' );

Once you executed that, the PL/SQL block will continue, and then finish and therefore display the output:


you waited: 3 seconds

PL/SQL procedure successfully completed.

There are a couple of other caveats you need to be aware of:
– a pipe is visible to the instance, in the case of a cluster database it’s NOT visible on the other instances in the cluster.
– there can be multiple sessions waiting on the same pipe, but a single message is consumed/received by a single session, based on time.

All Oracle database professionals know the current versions of the Oracle database (12.2, 18, 19, 20 at the moment of writing), and we also know the pace Oracle corporation keeps is so high that a lot of companies are having a hard time keeping up with the current versions. A prominent one is Oracle corporation itself for their E-Business suite software, where Oracle extended the support for the database for version 12.1.0.2 and 11.2.0.4 for E-Business suite licenses only. But this blog isn’t about bitching about the pace of Oracle support and versions getting desupported.

What I do regularly encounter is that for all kinds of reasons a database version is not updated. Most of the time the versions that are encountered are 12.1.0.2 (the long term supported version of the 12.1 version of the database), 11.2.0.4 (the long term supported version of the 11.2 version of the database), and more and more seldom 11.2.0.3. If things truly have been left without updating you might encounter 11.2.0.2, and god forbid if you still have 11.2.0.1, that version had a lot if issues.

Now what if you encounter even older versions? Probably younger Oracle database consultants might never even have seen older versions than 11.2.0.1. But what if you need to work with a truly old version? Or are just interested in such an old version to see how it looks like, and what the state of the database was at that version?

For that, I created an automatic installation script to install either:
– Release 10.2 versions: 10.2.0.1, 10.2.0.2, 10.2.0.3, 10.2.0.4, 10.2.0.5.
– Release 9.2 versions: 9.2.0.4, 9.2.0.5*, 9.2.0.6*, 9.2.0.7, 9.2.0.8.
(*=the patch install is fully scripted, but linking oracle throws an error)

Yes, this is extremely old, and if you must work with it, there has been neglect and somebody not paying attention at a quite massive scale. There are also licensing implications that do not work in your favour there.

There is a huge caveat too: the installation media for these Oracle database versions is not available for download anywhere as far as I know, and some of the patches are restricted downloads on My Oracle Support too. Since it’s Oracle proprietary software, the only way to obtain it is via Oracle.

Outside of these, if you must or want to use these ancient versions, and you got the required files, you can use:
https://gitlab.com/FritsHoogland/ol48_oracle92 for installing Oracle 9.2 on Oracle Linux 4.8 or
https://gitlab.com/FritsHoogland/ol511_oracle102 for installing Oracle 10.2 on oracle Linux 5.11

Clone the repository, put the required files in the files directory, edit the Vagrantfile to your liking and then build the database server by typing ‘vagrant up’.

In case you’re wondering how the operating system images are build, this is done using ‘packer’, I have a repository where you can see how these are build too: https://gitlab.com/FritsHoogland/packer-oracle-linux-museum

In case you’re wondering: there are even older versions; the first public version of the Oracle database on linux, as far as I know, is Oracle 8.0.5. However, the linux version to use with versions like 8.1.7, RHEL/Centos 2.1, is so old that it doesn’t play nicely with VirtualBox and packer, so in al reasonability, Oracle 9.2/oracle linux 4.8 is the earliest version that can be used without severe issues.

This blogpost takes a look at the technical differences between Oracle database 19 PSU 6 (january 2020) and 7 (april 2020). This gives technical specialists an idea of the differences, and gives them the ability to assess if the PSU impacts anything.

Functions


code symbol names unique in version 19.6 versus 19.7 (top 5)

keb                                                keb                                                kernel event MMON/MMNL infrastructure                                                     4
qes                                                qes                                                query execute services                                                                    5
qos                                                qos                                                query optimizer statistics                                                                6
kpd                                                (kp)d                                              kernel programmatic interface ??                                                         10
kub                                                (ku)b                                              kernel utility ??                                                                        11

code symbol names unique in version 19.7 versus 19.6 (top 5)

ksf                                                ksf                                                kernel service  functions                                                                 9
kpd                                                (kp)d                                              kernel programmatic interface ??                                                         11
kqr                                                kqr                                                kernel query dictionary/row cache                                                        12
kzr                                                kzr                                                kernel security virtual private database                                                 20
kub                                                (ku)b                                              kernel utility ??                                                                        34

kub: most of the functions starting with kub that have been removed are starting with kubsprq: kernel utility big data sql parquet. Most functions starting with kub are actually kubscrf, so big data sql related probably, but I have no idea currently what ‘crf’ means. Suggestions are welcome. Also 4 kubsprq functions are added.
kpd: the kpd function that are gone are actually kpdb, which likely means kernel programmatic interface pluggable database. In fact, there are two ‘layers’ which are unknown to me: kpdboc and kpdbocws. The kpd functions that have been added seem to have letters added to indicate extra layers besides kpdb, but only two of them are kpdboc, the rest are quite random.
qos: the qos functions removed are actually qosd, query optimizer statistics sql plan directive. There have been no functions returned starting with ‘qosd’.
qes: most of the qes functions are actually starting with qesdpi, which according to my current insights means query execute services adaptive features internationalisation (timezones). Of course this is speculation.
keb: the keb (kernel event MMON/MMNL infrastructure) functions seem to have something to do with ‘auto tasks’. There are some keb functions returned too, but not in the top 5 of returned function groupings.
kzr: the kzr functions are actually kzra, kernel security virtual private database privileges. I don’t know what the work done here is, but it seems there has been work done.
kqr: these are quite normal dictionary cache functions, with some letters added. This looks like work done to make enhancements to the dictionary cache processing.
ksf: most of the functions with ksf are actually ksfs, kernel service OFS filesystem, and a few ksfd, kernel service functions disk IO.

Parameters

parameters unique in version 19.6 versus 19.7

NAME
--------------------------------------------------
_eighth_spare_parameter
_fifteenth_spare_parameter
_hang_bool_spare1
_ninth_spare_parameter
_one-hundred-and-fifty-ninth_spare_parameter
_one-hundred-and-sixtieth_spare_parameter
_seventh_spare_parameter
_twelfth_spare_parameter

parameters unique in version 19.7 versus 19.6

NAME
--------------------------------------------------
_asm_max_kfsg_free_heap_perc
_asm_max_kfsg_free_heap_size
_bug30186319_lm_hb_lgio_timeout
_bug30352623_asm_enable_vfreloc_on_rm
_hang_mark_some_idle_waits_terminal
_kswsas_pdb_auto_close_timeout
_link_ts_force_online
_pdb_datapatch_violation_restricted

A number of spare parameters have been used to create undocumented parameters. We see the two ASM related parameters, just like with version 18 and 12.2. Also two “bug parameters”.

Waitevents

waitevents unique in version 19.6 versus 19.7

NAME
----------------------------------------------------------------------------------------------------
first spare wait event

waitevents unique in version 19.7 versus 19.6

NAME
----------------------------------------------------------------------------------------------------
CRS get service attributes

It seems that with version 19.7, a spare parameter was changed for an actual wait event. It’s unknown to me what the wait event times currently.

DBA/CDB tables columns

dba tables columns unique to 19.6 versus 19.7

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
DBA_APP_VERSIONS										     APP_ROOT_CLONE_NAME
DBA_WORKLOAD_CAPTURES										     ENCRYPTION_VERIFIER

cdb tables column unique to 19.7 versus 19.6

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
CDB_WORKLOAD_CAPTURES										     ENCRYPTION_VERIFIER

One column in two different DBA tables have been removed from version 19.7, and one of them came back as a CDB table column. The weird thing here is not the switch of a column going from a DBA table to its CDB counterpart, that is something that I saw between 18.9 and 18.10, but that with Oracle 19.7 this is only one column of the two gone missing from DBA tables came back in a CDB table.

This blogpost takes a look at the technical differences between Oracle database 18 PSU 9 (january 2020) and 10 (april 2020). This gives technical specialists an idea of the differences, and gives them the ability to assess if the PSU impacts anything.

Functions

code symbol names unique in version 18.9 versus 18.10.0.0.200414

NAME                                                         RESOLVE                                                      ANNOTATION
------------------------------------------------------------ ------------------------------------------------------------ --------------------------------------------------------------------------------
kdpFreeAggVecs                                               (kdp)FreeAggVecs                                             kernel data archive compression: pcode ??
kdpLoopVisitInstr                                            (kdp)LoopVisitInstr                                          kernel data archive compression: pcode ??
kdp_generate_pcode_proj_agg_rowset                           (kdp)_generate_pcode_proj_agg_rowset                         kernel data archive compression: pcode ??
kdp_op_is_char                                               (kdp)_op_is_char                                             kernel data archive compression: pcode ??
kdp_pcode_calc_constant_area                                 (kdp)_pcode_calc_constant_area                               kernel data archive compression: pcode ??
kdp_pcode_setup_cbks                                         (kdp)_pcode_setup_cbks                                       kernel data archive compression: pcode ??
kebmtrlog_add                                                (keb)mtrlog_add                                              kernel event MMON/MMNL infrastructure ??
kgskltyp                                                     (kgsk)ltyp                                                   kernel generic service resource manager ??
kjznhcdn                                                     (kjzn)hcdn                                                   kernel lock management DIAG node layer ??
kjznhcds                                                     (kjzn)hcds                                                   kernel lock management DIAG node layer ??
kkqoreAndDriver                                              (kkqore)AndDriver                                            kernel compile query  or-expansion ??
krvxdsr                                                      (krvx)dsr                                                    kernel redo recovery extract ??
kseinpre                                                     (kse)inpre                                                   kernel service  error ??
kseipre                                                      (kse)ipre                                                    kernel service  error ??
ktspnsc_repair_endtask                                       (ktsp)nsc_repair_endtask                                     kernel transaction segment management segment pagetable ??
kubsCRCheckDirLink                                           (kubsCR)CheckDirLink                                         kernel utility big data sql consistent read (?) ??
skudmisyml                                                   (sk)udmisyml                                                 operating system dependent kernel ??

code symbol names unique in version 18.10.0.0.200414 versus 18.9

NAME                                                         RESOLVE                                                      ANNOTATION
------------------------------------------------------------ ------------------------------------------------------------ --------------------------------------------------------------------------------
kcoclnup                                                     (kco)clnup                                                   kernel cache operation ??
kfGcSubHeap                                                  (kf)GcSubHeap                                                kernel automatic storage management ??
kfSgaFrHeapParamCb                                           (kf)SgaFrHeapParamCb                                         kernel automatic storage management ??
kfSgaFrHeapPercParamCb                                       (kf)SgaFrHeapPercParamCb                                     kernel automatic storage management ??
kffFdChkSize                                                 (kff)FdChkSize                                               kernel automatic storage management KFF ??
kffmapFree                                                   (kffm)apFree                                                 kernel automatic storage management KFF map header ??
kfgGetOfflineEnqueue                                         (kfg)GetOfflineEnqueue                                       kernel automatic storage management diskgroups ??
kfgReleaseOfflineEnqueue                                     (kfg)ReleaseOfflineEnqueue                                   kernel automatic storage management diskgroups ??
kfgpnReleaseOfflineEnqueue                                   (kfg)pnReleaseOfflineEnqueue                                 kernel automatic storage management diskgroups ??
kjac_allow_tg                                                (kjac)_allow_tg                                              kernel lock management application continuity  ??
kjctccti                                                     (kjc)tccti                                                   kernel lock management communication ??
kkdlpGetObjn                                                 (kkdl)pGetObjn                                               kernel compile dictionary lookup ??
kkqtutlSelItemMatches                                        (kkqt)utlSelItemMatches                                      kernel compile query  table ??
kokaugi_iobs                                                 (koka)ugi_iobs                                               kernel objects kernel side access ??
kpoxcAppContPDBNotify                                        (kpo)xcAppContPDBNotify                                      kernel programmatic interface oracle ??
krvxgtf                                                      (krvx)gtf                                                    kernel redo recovery extract ??
krvxrsr_SkipRedo                                             (krvx)rsr_SkipRedo                                           kernel redo recovery extract ??
ksdxcmdoutmemclnt_has_output                                 (ksd)xcmdoutmemclnt_has_output                               kernel service  debug internal errors ??
ksfqsrfn                                                     (ksfq)srfn                                                   kernel service  functions sequential file io interface ??
ktsmgPublishNameSvc                                          (ktsmg)PublishNameSvc                                        kernel transaction segment management merge ??
ktsmgSearchNameSvc                                           (ktsmg)SearchNameSvc                                         kernel transaction segment management merge ??
ktsmg_mql_chore_pdb                                          (ktsmg)_mql_chore_pdb                                        kernel transaction segment management merge ??
ktsmguuf_pdb                                                 (ktsmg)uuf_pdb                                               kernel transaction segment management merge ??
kzdmchkerpna                                                 (kzd)mchkerpna                                               kernel security dictionary access support ??
kzdmqcterpna                                                 (kzd)mqcterpna                                               kernel security dictionary access support ??
kzthsm_get_pdb_ctx                                           (kz)thsm_get_pdb_ctx                                         kernel security ??
kzthsminit_load_pkcs_lib                                     (kz)thsminit_load_pkcs_lib                                   kernel security ??
kzthsminit_session                                           (kz)thsminit_session                                         kernel security ??
kzthsminit_session_for_united_mode_pdb                       (kz)thsminit_session_for_united_mode_pdb                     kernel security ??
qcsoPreNameres_ProcOpt                                       (qcso)PreNameres_ProcOpt                                     query compile semantic analysis (parser) OBJECTS ??
qerxlCanAllocMut                                             (qerxl)CanAllocMut                                           query execute rowsource vector translate ??
qkspmTravInit                                                (qkspm)TravInit                                              query kernel sql plan management ??
slrac_end_lock                                               (slrac)_end_lock                                             operating system (dependent) library verify (memory) read

This is is what I would expect from a version that is not long term supported. Some functions removed, notably a group of functions starting with kdp (kernel data archive compression, alias HCC), and some functions added, some kf (kernel ASM). Essentially nothing to exciting.

Parameters

parameters unique in version 18.9 versus 18.10.0.0.200414

NAME
--------------------------------------------------
_fourteenth_spare_parameter
_ninth_spare_parameter
_one-hundred-and-fifty-second_spare_parameter
_thirteenth_spare_parameter
_twelfth_spare_parameter

parameters unique in version 18.10.0.0.200414 versus 18.9

NAME
--------------------------------------------------
_asm_max_kfsg_free_heap_perc
_asm_max_kfsg_free_heap_size
_bug30352623_asm_enable_vfreloc_on_rm
_cvw_examine_tables_in_from_list_subqry
_second_spare_parameter

It’s clear here some spare parameters have been taken to add undocumented parameters. One of them is a ‘bugswitch’, two of them are regarding ASM, which are also added for april 2020 PSU for 12.2.0.1.

DBA/CDB tables
It seems the bug regarding DBA_REGISTRY_BACKPORTS that was found the previous period has not been implemented everywhere. This was confirmed a bug then, so I assume this view gone missing here with the new version is a bug.

dba tables unique to 18.9 versus 18.10.0.0.200414

NAME
----------------------------------------------------------------------------------------------------
DBA_REGISTRY_BACKPORTS

cdb tables unique to 18.9 versus 18.10.0.0.200414

NAME
----------------------------------------------------------------------------------------------------
CDB_REGISTRY_BACKPORTS

DBA/CDB table columns
This is something I have not seen before. This looks like one column in two DBA views has been removed, and then added to their CDB counterparts.

dba tables columns unique to 18.9 versus 18.10.0.0.200414

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
DBA_WORKLOAD_CAPTURES										     ENCRYPTION_VERIFIER
DBA_WORKLOAD_REPLAY_FILTER_SET									     DEFAULT_ACTION

cdb tables column unique to 18.10.0.0.200414 versus 18.9

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
CDB_WORKLOAD_CAPTURES										     ENCRYPTION_VERIFIER
CDB_WORKLOAD_REPLAY_FILTER_SET									     DEFAULT_ACTION

This blogpost takes a look at the technical differences between Oracle database 12.1.0.2 PSU 200114 (january 2020) and 200414 (april 2020). This gives technical specialists an idea of the differences, and gives them the ability to assess if the PSU impacts anything.

Parameters
The list of parameters removed (first) and parameters added (second) is remarkable long.
It’s striking that a lot of solutions for bugs made configurable (_bug[0-9]*_.*) have been removed, and probably returned back as ‘spare parameters’.
Also some in memory (_inmemory.*) parameters have been removed.
Also some documented parameters; exafusion_enabled and optimizer_adaptive_plans and optimizer_adaptive_statistics, plus some standby parameters I wasn’t aware of existing.

The new parameters are all undocumented (starting with an underscore). Most of them are essentially the undocumented named parameters returned to the pool of spare parameters. Some that are named seem to be related to ‘data protection’: (_sga_data_protection_disabled, _sga_dp_bytes_per_alloc, _sga_dp_precise_errors), I am not sure what the actual data protection feature is here. Some are in-memory related, shared IPC (ksxp_skgxp) and some others. There’s one “bugswitch” added again: _bug18411216_gcr_min_free_memory_hard_limit.

parameters unique in version 12.1.0.2.200114 versus 12.1.0.2.200414

NAME
--------------------------------------------------
_asm_cancel_alert_time
_bug12963364_spacebg_sync_segblocks
_bug19146597_df_hist_offl_override
_bug20480623_posting_ips
_bug20534663_ksxp_save_sockets_pat
_bug21214994_lm_max_lms_block_time
_bug21214994_lm_sndq_flush_int
_bug21893830_influx_scn_waits
_bug21918948_incremental_purge_size
_bug21977186_lm_hb_cfio_timeout
_bug24587802_asm_auto_online_interval
_bug24717183_lmhb_procdump_cputime_limit
_bug24717183_lmhb_procdump_runtime_limit
_bug24966594_gcs_disable_imc_preallocation
_bug28572834_xt_preproc_timeout
_bug29825525_bct_public_dba_buffer_dynresize_delay
_cache_fusion_pipelined_updates
_cell_offload_hybrid_processing
_dskm_single_instance
_force_cloud_hcc_on
_gcr_min_free_memory_hard_limit
_inmemory_journal_options
_inmemory_journal_scan
_inmemory_private_journal_maxexts
_inmemory_retention_time
_ksxp_save_sockets
_ksxp_xrc_cm_port
_memory_adi_bytes_per_alloc
_memory_adi_enabled
_memory_adi_precise_errors
_min_sys_percentage
_one-hundred-and-forty-eighth_spare_parameter
_optimizer_ads_for_pq
_optimizer_use_feedback_for_join
exafusion_enabled
optimizer_adaptive_plans
optimizer_adaptive_statistics
standby_pdb_source_file_dblink
standby_pdb_source_file_directory

parameters unique in version 12.1.0.2.200414 versus 12.1.0.2.200114

NAME
--------------------------------------------------
_bug18411216_gcr_min_free_memory_hard_limit
_eighteenth_spare_parameter
_eighty-seventh_spare_parameter
_eighty-sixth_spare_parameter
_eighty-third_spare_parameter
_external_keystore_credential_location
_fourteenth_spare_parameter
_gcs_disable_imc_preallocation
_inmemory_cu_timeout
_inmemory_jscan
_ksxp_skgxp_spare_param4
_ksxp_skgxp_spare_param5
_min_sys_percetange
_one-hundred-and-fiftieth_spare_parameter
_one-hundred-and-fifty-fifth_spare_parameter
_one-hundred-and-fifty-first_spare_parameter
_one-hundred-and-fifty-fourth_spare_parameter
_one-hundred-and-fifty-second_spare_parameter
_one-hundred-and-fifty-sixth_spare_parameter
_one-hundred-and-fifty-third_spare_parameter
_one-hundred-and-forty-ninth_spare_parameter
_one-hundred-and-forty-seventh_spare_parameter
_one-hundred-and-forty-third_spare_parameter
_one-hundred-and-thirtieth_spare_parameter
_ping_wait_for_log_force
_seventeenth_spare_parameter
_sga_data_protection_disabled
_sga_dp_bytes_per_alloc
_sga_dp_precise_errors
_sixteenth_spare_parameter
_thirteenth_spare_parameter
_twelfth_spare_parameter
_twentieth_spare_parameter
_twenty-first_spare_parameter
_twenty-fourth_spare_parameter
_twenty-second_spare_parameter
_twenty-seventh_spare_parameter
_twenty-sixth_spare_parameter
_twenty-third_spare_parameter

V$/GV$ fixed view column removal
A column named GUID has been removed from a number of BACKUP related views:

v$ tables columns unique to 12.1.0.2.200114 versus 12.1.0.2.200414

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
V_$BACKUP_DATAFILE										     GUID
V_$BACKUP_PIECE 										     GUID
V_$BACKUP_SET											     GUID
V_$BACKUP_SPFILE										     GUID
V_$DATAFILE_COPY										     GUID
V_$PROXY_DATAFILE										     GUID

gv$ tables columns unique to 12.1.0.2.200114 versus 12.1.0.2.200414

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
GV_$BACKUP_DATAFILE										     GUID
GV_$BACKUP_PIECE										     GUID
GV_$BACKUP_SET											     GUID
GV_$BACKUP_SPFILE										     GUID
GV_$DATAFILE_COPY										     GUID
GV_$PROXY_DATAFILE										     GUID

Statistics
Also quite remarkable are 14 statistics that have been made redundant with the april PSU, 6 in-memory related statistics, 3 cell/exadata related and 5 GCS (glocal cache services/RAC) related.

sysstat statistics unique to 12.1.0.2.200114 versus 12.1.0.2.200414

NAME
----------------------------------------------------------------------------------------------------
IM repopulate (doublebuffering) CUs
IM repopulate (doublebuffering) CUs requested
IM repopulate (incremental) CUs
IM repopulate blocks now valid (ATC)
IM repopulate optimized
IM scan CUs doublebuffering
cell XT granule IO bytes saved by HDFS tbs extent map scan
cell XT granule IO bytes saved by storage index
cell physical IO bytes eligible for smart IOs
gcs affinity lock failures
gcs affinity lock grants
gcs anti-locks created
gcs read-mostly lock failures
gcs read-mostly lock grants

sysstat statistics unique to 12.1.0.2.200414 versus 12.1.0.2.200114

NAME
----------------------------------------------------------------------------------------------------
IM zzzz spare1
IM zzzz spare2
IM zzzz spare3
IM zzzz spare4
IM zzzz spare5
IM zzzz spare6
cell statistics spare1
cell statistics spare2
cell statistics spare3
spare statistic 1
spare statistic 2
spare statistic 3
spare statistic 4
spare statistic 5

DBA/CDB tables
There are also 3 DBA/CDB tables that have been removed. These seem to be dealing with big data, the name HIVE suggests hive metadata.

dba tables unique to 12.1.0.2.200114 versus 12.1.0.2.200414

NAME
----------------------------------------------------------------------------------------------------
DBA_HIVE_COLUMNS
DBA_HIVE_DATABASES
DBA_HIVE_TABLES

cdb tables unique to 12.1.0.2.200114 versus 12.1.0.2.200414

NAME
----------------------------------------------------------------------------------------------------
CDB_HIVE_COLUMNS
CDB_HIVE_DATABASES
CDB_HIVE_TABLES

DBA/CDB table columns
There is one column: TBS_INHERITED_FROM that is added to DBA_ILMOBJECTS.

dba tables columns unique to 12.1.0.2.200414 versus 12.1.0.2.200114

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
DBA_ILMOBJECTS											     TBS_INHERITED_FROM

And it is removed from CDB_ILMNOBJECTS

cdb tables column unique to 12.1.0.2.200114 versus 12.1.0.2.200414

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
CDB_ILMOBJECTS											     TBS_INHERITED_FROM

Functions
I grouped the removed and added functions by the first 3 letters, otherwise the list would be too long to be meaningful.
I think a fair conclusion is there still has been improvements made to 12.1.0.2 between the january and april PSU.
The main area’s where the removed and added functions are visible for are:
Removed:
– jzn: JSON; jzn0Dom/jznBson/jznDom/jznEng/jznOson/jznParser/jznoct, unsure what these sub-categories are.
– kdz: archive compression (HCC); kdzcol (column?)
– ktm: kernel transaction in memory
– kub: kernel utility ‘b’; this is actually ‘kubs’: kernel utility big data sql
– kdm: kernel data in memory; this is actually kdmo: kernel data in-memory data layer optimizer
Added:
– jzn: JSON; jznEng/jznt/jzntok (tokeniser?)
– kts: kernel transaction segment management; ktsima (in memory?)/ktsl (lob)
– skg: system kernel generic; this is actually skgzib: infiniband code
– kdm: kernel data in memory; this is actually kdmo: kernel data in-memory data layer optimizer
– kfd: kernel automatic storage management

code symbol names unique in version 12.1.0.2.200114 versus 12.1.0.2.200414

NAME                                               RESOLVE                                            ANNOTATION                                                                         COUNT(*)
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------------------------------------- ----------
ctc                                                ctc                                                create table or cluster                                                                   1
def                                                def                                                ??                                                                                        1
dgl                                                dgl                                                ??                                                                                        1
hor                                                hor                                                ??                                                                                        1
kcl                                                kcl                                                kernel cache lock manager/buffer cache                                                    1
kcv                                                kcv                                                kernel cache recovery                                                                     1
kdb                                                kdb                                                kernel data block structures and manipulation                                             1
kdt                                                kdt                                                kernel data table insert                                                                  1
kfD                                                (kf)D                                              kernel automatic storage management ??                                                    1
kfc                                                kfc                                                kernel automatic storage management cache                                                 1
kff                                                kff                                                kernel automatic storage management KFF                                                   1
kfm                                                kfm                                                kernel automatic storage management node monitor interface implementation                 1
kgx                                                kgx                                                kernel generic mutex/communication                                                        1
kjx                                                kjx                                                kernel lock management cross instance                                                     1
kkb                                                kkb                                                kernel compile table                                                                      1
kkd                                                (kk)d                                              kernel compile ??                                                                         1
kkn                                                kkn                                                kernel compile analyze command                                                            1
kks                                                kks                                                kernel compile shared objects (cursor)                                                    1
kkt                                                kkt                                                kernel compile triggers                                                                   1
kmg                                                (km)g                                              kernel multi threaded/mman ??                                                             1
kmm                                                kmm                                                kernel multi threaded/mman monitor                                                        1
kpx                                                (kp)x                                              kernel programmatic interface ??                                                          1
kqr                                                kqr                                                kernel query dictionary/row cache                                                         1
krd                                                krd                                                kernel redo datafiles (media and standby)                                                 1
krf                                                krf                                                kernel redo flashback                                                                     1
ksd                                                ksd                                                kernel service  debug internal errors                                                     1
ksr                                                ksr                                                kernel service  reliable messages                                                         1
kte                                                kte                                                kernel transaction extent management                                                      1
ktg                                                (kt)g                                              kernel transaction ??                                                                     1
ktr                                                ktr                                                kernel transaction read consistency                                                       1
ktu                                                ktu                                                kernel transaction undo                                                                   1
kza                                                kza                                                kernel security audit                                                                     1
kzp                                                kzp                                                kernel security privileges                                                                1
msq                                                msq                                                generating sql statements                                                                 1
ozi                                                ozi                                                ??                                                                                        1
pmu                                                pmu                                                ??                                                                                        1
qcs                                                qcs                                                query compile semantic analysis (parser)                                                  1
qct                                                qct                                                query compile type check operations                                                       1
qkn                                                qkn                                                query kernel allocate                                                                     1
qmu                                                (qm)u                                              query XDB ??                                                                              1
qxi                                                (qx)i                                              query execution ??                                                                        1
rfr                                                rfr                                                ??                                                                                        1
seq                                                seq                                                sequence numbers                                                                          1
slp                                                (sl)p                                              operating system (dependent) library ??                                                   1
tbs                                                tbs                                                ??                                                                                        1
upd                                                upd                                                update                                                                                    1
xsE                                                (xs)E                                              olap analytic workspace ??                                                                1
fpl                                                fpl                                                ??                                                                                        2
jsk                                                (js)k                                              job scheduing ??                                                                          2
kbc                                                kbc                                                kernel backup backup appliance containers                                                 2
kcc                                                kcc                                                kernel cache controlfile                                                                  2
kfr                                                kfr                                                kernel automatic storage management recovery                                              2
kgg                                                kgg                                                kernel generic generic routines                                                           2
kjf                                                (kj)f                                              kernel lock management ??                                                                 2
kjm                                                kjm                                                kernel lock management RAC multiple LMS                                                   2
kjz                                                kjz                                                kernel lock management DIAG                                                               2
knl                                                knl                                                kernel replication streams replication                                                    2
kql                                                kql                                                kernel query library cache                                                                2
ksp                                                ksp                                                kernel service  parameter                                                                 2
kws                                                kws                                                kernel streams advanced queuing                                                           2
kxs                                                kxs                                                kernel execution shared cursor                                                            2
pfr                                                pfr                                                PLSQL Code Execution runtime                                                              2
pld                                                (pl)d                                              PLSQL infrastructure ??                                                                   2
psd                                                psd                                                PLSQL infrastructure programmatic interfaces to/from                                      2
qcu                                                qcu                                                query compile utility                                                                     2
qka                                                qka                                                query kernel allocation                                                                   2
qxx                                                (qx)x                                              query execution ??                                                                        2
sks                                                (sk)s                                              operating system dependent kernel ??                                                      2
ssk                                                ssk                                                operating system dependent system kernel                                                  2
kcr                                                kcr                                                kernel cache recovery                                                                     3
kfn                                                kfn                                                kernel automatic storage management networking subsystem                                  3
kfx                                                (kf)x                                              kernel automatic storage management ??                                                    3
kgn                                                (kg)n                                              kernel generic ??                                                                         3
kkf                                                (kk)f                                              kernel compile ??                                                                         3
kkk                                                (kk)k                                              kernel compile ??                                                                         3
kkx                                                kkx                                                kernel compile execute                                                                    3
kti                                                kti                                                kernel transaction in memory undo                                                         3
ktt                                                ktt                                                kernel transaction tablespace                                                             3
ora                                                ora                                                ??                                                                                        3
phn                                                phn                                                PLSQL semantics                                                                           3
sku                                                (sk)u                                              operating system dependent kernel ??                                                      3
__i                                                __i                                                ??                                                                                        4
isi                                                isi                                                ??                                                                                        4
kdl                                                kdl                                                kernel data lob                                                                           4
kgh                                                kgh                                                kernel generic heap manager                                                               4
kpu                                                kpu                                                kernel programmatic interface user                                                        4
ksw                                                ksw                                                kernel service  workgroup                                                                 4
eva                                                eva                                                evaluate                                                                                  5
kfi                                                (kf)i                                              kernel automatic storage management ??                                                    5
kkz                                                kkz                                                kernel compile security                                                                   5
klx                                                (kl)x                                              kernel loader ??                                                                          5
kxf                                                kxf                                                kernel execution parallel query                                                           5
prm                                                prm                                                parameter handling with sql layer                                                         5
qcp                                                qcp                                                query compile parse                                                                       5
ins                                                ins                                                insert                                                                                    6
kkp                                                kkp                                                kernel compile compilation                                                                6
krs                                                krs                                                kernel redo standby/dataguard (?)                                                         6
krv                                                krv                                                kernel redo recovery                                                                      6
mde                                                (md)e                                              multidimension (spatial) ??                                                               6
prs                                                prs                                                parse                                                                                     6
qos                                                qos                                                query optimizer statistics                                                                6
dbg                                                dbg                                                debug and diagnostics                                                                     7
hos                                                hos                                                ??                                                                                        7
kjc                                                kjc                                                kernel lock management communication                                                      7
ktf                                                ktf                                                kernel transaction flashback support                                                      7
kud                                                (ku)d                                              kernel utility ??                                                                         7
kdp                                                kdp                                                kernel data archive compression: pcode                                                    8
kcb                                                kcb                                                kernel cache buffers                                                                      9
kfk                                                kfk                                                kernel automatic storage management KFK                                                   9
kgl                                                kgl                                                kernel generic library cache management                                                   9
kgz                                                (kg)z                                              kernel generic ??                                                                         9
kke                                                kke                                                kernel compile cost engine                                                                9
ksm                                                ksm                                                kernel service  memory                                                                    9
kwq                                                kwq                                                kernel advanced queuing                                                                  10
kdi                                                kdi                                                kernel data index                                                                        11
ksf                                                ksf                                                kernel service  functions                                                                11
kds                                                kds                                                kernel data seek/scan                                                                    12
qjs                                                (q)js                                              query ??                                                                                 12
ksz                                                ksz                                                kernel service  oracle storage server (OSS) server layer                                 13
kgf                                                kgf                                                kernel generic ASM                                                                       14
kfg                                                kfg                                                kernel automatic storage management diskgroups                                           15
ksx                                                ksx                                                kernel service  cross instance                                                           15
kcf                                                kcf                                                kernel cache file management                                                             16
kko                                                kko                                                kernel compile optimizer                                                                 16
kkq                                                kkq                                                kernel compile query                                                                     16
kna                                                kna                                                kernel replication apply                                                                 16
kpd                                                (kp)d                                              kernel programmatic interface ??                                                         16
kxd                                                kxd                                                kernel execution exadata modules                                                         16
skg                                                skg                                                operating system dependent kernel generic                                                17
kbr                                                (kb)r                                              kernel backup ??                                                                         18
sag                                                sag                                                ??                                                                                       18
krb                                                krb                                                kernel redo backup/restore                                                               20
kew                                                kew                                                kernel event AWR                                                                         21
qer                                                qer                                                query execute rowsource                                                                  22
qmx                                                qmx                                                query XDB XML Objects                                                                    22
kol                                                (ko)l                                              kernel objects ??                                                                        24
kok                                                kok                                                kernel objects kernel side                                                               25
kup                                                kup                                                kernel utility datapump                                                                  26
ksi                                                ksi                                                kernel service  instance locking                                                         27
kts                                                kts                                                kernel transaction segment management                                                    31
qes                                                qes                                                query execute services                                                                   36
kfd                                                kfd                                                kernel automatic storage management disk                                                 43
qks                                                qks                                                query kernel sql                                                                         43
kdm                                                kdm                                                kernel data in-memory data layer                                                         48
kub                                                (ku)b                                              kernel utility ??                                                                        53
ktm                                                ktm                                                kernel transaction transaction monitor (smon)                                            70
kdz                                                kdz                                                kernel data archive compression                                                         134
jzn                                                jzn                                                json                                                                                    244

code symbol names unique in version 12.1.0.2.200414 versus 12.1.0.2.200114

NAME                                               RESOLVE                                            ANNOTATION                                                                         COUNT(*)
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------------------------------------- ----------
Ext                                                Ext                                                ??                                                                                        1
R_S                                                R_S                                                ??                                                                                        1
ctc                                                ctc                                                create table or cluster                                                                   1
dim                                                (di)m                                              PLSQL diana ??                                                                            1
jsv                                                (js)v                                              job scheduing ??                                                                          1
kcb                                                kcb                                                kernel cache buffers                                                                      1
kds                                                kds                                                kernel data seek/scan                                                                     1
kes                                                kes                                                kernel event SQL manageability                                                            1
kfc                                                kfc                                                kernel automatic storage management cache                                                 1
kff                                                kff                                                kernel automatic storage management KFF                                                   1
kfg                                                kfg                                                kernel automatic storage management diskgroups                                            1
kfk                                                kfk                                                kernel automatic storage management KFK                                                   1
kfr                                                kfr                                                kernel automatic storage management recovery                                              1
kfz                                                kfz                                                kernel automatic storage management zecurity subsystem                                    1
kgs                                                kgs                                                kernel generic service                                                                    1
kke                                                kke                                                kernel compile cost engine                                                                1
kko                                                kko                                                kernel compile optimizer                                                                  1
kks                                                kks                                                kernel compile shared objects (cursor)                                                    1
kpd                                                (kp)d                                              kernel programmatic interface ??                                                          1
ksl                                                ksl                                                kernel service  latching and post-wait                                                    1
ksm                                                ksm                                                kernel service  memory                                                                    1
ksx                                                ksx                                                kernel service  cross instance                                                            1
kub                                                (ku)b                                              kernel utility ??                                                                         1
kud                                                (ku)d                                              kernel utility ??                                                                         1
kxf                                                kxf                                                kernel execution parallel query                                                           1
kze                                                (kz)e                                              kernel security ??                                                                        1
kzr                                                kzr                                                kernel security virtual private database                                                  1
kzv                                                (kz)v                                              kernel security ??                                                                        1
nzh                                                (nz)h                                              network security ??                                                                       1
nzr                                                (nz)r                                              network security ??                                                                       1
pfr                                                pfr                                                PLSQL Code Execution runtime                                                              1
qcs                                                qcs                                                query compile semantic analysis (parser)                                                  1
qjs                                                (q)js                                              query ??                                                                                  1
qka                                                qka                                                query kernel allocation                                                                   1
qke                                                qke                                                query kernel expressions                                                                  1
qkn                                                qkn                                                query kernel allocate                                                                     1
qos                                                qos                                                query optimizer statistics                                                                1
qxi                                                (qx)i                                              query execution ??                                                                        1
ztc                                                (zt)c                                              security encryption ??                                                                    1
kcr                                                kcr                                                kernel cache recovery                                                                     2
kgc                                                (kg)c                                              kernel generic ??                                                                         2
kgf                                                kgf                                                kernel generic ASM                                                                        2
kgl                                                kgl                                                kernel generic library cache management                                                   2
kjz                                                kjz                                                kernel lock management DIAG                                                               2
kkq                                                kkq                                                kernel compile query                                                                      2
kng                                                (kn)g                                              kernel replication ??                                                                     2
krb                                                krb                                                kernel redo backup/restore                                                                2
qks                                                qks                                                query kernel sql                                                                          2
kbr                                                (kb)r                                              kernel backup ??                                                                          3
kfn                                                kfn                                                kernel automatic storage management networking subsystem                                  3
dbg                                                dbg                                                debug and diagnostics                                                                     4
kdz                                                kdz                                                kernel data archive compression                                                           4
kti                                                kti                                                kernel transaction in memory undo                                                         4
qcp                                                qcp                                                query compile parse                                                                       4
qer                                                qer                                                query execute rowsource                                                                   4
nzd                                                (nz)d                                              network security ??                                                                       5
krv                                                krv                                                kernel redo recovery                                                                      6
kna                                                kna                                                kernel replication apply                                                                  7
qmx                                                qmx                                                query XDB XML Objects                                                                     7
nzb                                                (nz)b                                              network security ??                                                                       8
nzo                                                (nz)o                                              network security ??                                                                       8
ktm                                                ktm                                                kernel transaction transaction monitor (smon)                                             9
kok                                                kok                                                kernel objects kernel side                                                               10
krs                                                krs                                                kernel redo standby/dataguard (?)                                                        13
kfd                                                kfd                                                kernel automatic storage management disk                                                 15
kdm                                                kdm                                                kernel data in-memory data layer                                                         20
skg                                                skg                                                operating system dependent kernel generic                                                21
kts                                                kts                                                kernel transaction segment management                                                    23
jzn                                                jzn                                                json                                                                                     56

This blogpost takes a look at the technical differences between Oracle database 11.2.0.4 PSU 200114 (january 2020) and 200414 (april 2020). This gives technical specialists an idea of the differences, and gives them the ability to assess if the PSU impacts anything.

Functions

code symbol names unique in version 11.2.0.4.200114 versus 11.2.0.4.200414

NAME                                                         RESOLVE                                                      ANNOTATION
------------------------------------------------------------ ------------------------------------------------------------ --------------------------------------------------------------------------------
kgiLRUResize                                                 (kgi)LRUResize                                               kernel generic instantiation manager ??
ktspbrkshklck                                                (ktsp)brkshklck                                              kernel transaction segment management segment pagetable ??
ktspgfbs                                                     (ktsp)gfbs                                                   kernel transaction segment management segment pagetable ??

code symbol names unique in version 11.2.0.4.200414 versus 11.2.0.4.200114

NAME                                                         RESOLVE                                                      ANNOTATION
------------------------------------------------------------ ------------------------------------------------------------ --------------------------------------------------------------------------------
kglPinClearNiv                                               (kgl)PinClearNiv                                             kernel generic library cache management ??
kgldtCheckParent                                             (kgldt)CheckParent                                           kernel generic library cache management dependency table ??
kjctr_updatetkt_in                                           (kjc)tr_updatetkt_in                                         kernel lock management communication ??
knld_EventFillInfo                                           (knl)d_EventFillInfo                                         kernel replication streams replication ??
kqltemptablod                                                (kql)temptablod                                              kernel query library cache ??
ktspgsp_main2                                                (ktspgsp_main)2                                              kernel transaction segment management segment pagetable get space for inserts ??

this is what you would expect from a version that is essentially out of support, this to me looks like a low number of functions removed and added.

Parameters

parameters unique in version 11.2.0.4.200114 versus 11.2.0.4.200414

NAME
--------------------------------------------------
_thirty-eighth_spare_parameter

parameters unique in version 11.2.0.4.200414 versus 11.2.0.4.200114

NAME
--------------------------------------------------
_disable_actualization_for_grant

It seems one spare parameter was traded for an underscore parameter. I have no idea what the purpose of this parameter is.

No other changes in the data dictionary have been found.

parameter values changed isdefault between 11.2.0.4.200114 versus 11.2.0.4.200414

parameter values unique to 11.2.0.4.200114 versus 11.2.0.4.200414

parameter values unique to 11.2.0.4.200414 versus 11.2.0.4.200114

waitevents unique in version 11.2.0.4.200114 versus 11.2.0.4.200414

waitevents unique in version 11.2.0.4.200414 versus 11.2.0.4.200114

waitevents changed parameter description between 11.2.0.4.200114 versus 11.2.0.4.200414

x$ tables unique to 11.2.0.4.200114 versus 11.2.0.4.200414

x$ tables unique to 11.2.0.4.200414 versus 11.2.0.4.200114

x$ tables columns unique to 11.2.0.4.200114 versus 11.2.0.4.200414

x$ tables columns unique to 11.2.0.4.200414 versus 11.2.0.4.200114

v$ tables unique to 11.2.0.4.200114 versus 11.2.0.4.200414

v$ tables unique to 11.2.0.4.200414 versus 11.2.0.4.200114

v$ tables columns unique to 11.2.0.4.200114 versus 11.2.0.4.200414

v$ tables columns unique to 11.2.0.4.200414 versus 11.2.0.4.200114

gv$ tables unique to 11.2.0.4.200114 versus 11.2.0.4.200414

gv$ tables unique to 11.2.0.4.200414 versus 11.2.0.4.200114

gv$ tables columns unique to 11.2.0.4.200114 versus 11.2.0.4.200414

gv$ tables columns unique to 11.2.0.4.200414 versus 11.2.0.4.200114

sysstat statistics unique to 11.2.0.4.200114 versus 11.2.0.4.200414

sysstat statistics unique to 11.2.0.4.200414 versus 11.2.0.4.200114

sys_time_model statistics unique to 11.2.0.4.200114 versus 11.2.0.4.200414

sys_time_model statistics unique to 11.2.0.4.200414 versus 11.2.0.4.200114

dba tables unique to 11.2.0.4.200114 versus 11.2.0.4.200414

dba tables unique to 11.2.0.4.200414 versus 11.2.0.4.200114

dba tables columns unique to 11.2.0.4.200414 versus 11.2.0.4.200114

dba tables columns unique to 11.2.0.4.200114 versus 11.2.0.4.200414

cdb tables unique to 11.2.0.4.200114 versus 11.2.0.4.200414

cdb tables unique to 11.2.0.4.200414 versus 11.2.0.4.200114

cdb tables column unique to 11.2.0.4.200114 versus 11.2.0.4.200414

cdb tables column unique to 11.2.0.4.200414 versus 11.2.0.4.200114

This post is about library cache SQL cursors, and how these are managed by the database instance.

Whenever an Oracle database parses a SQL, it follows the following sequence of events (events which are of importance to the topic of cursor children):
1. The SQL text is hashed to generate a hash value.
2. The hash value is used to inspect the session cursor cache of the session to find a cached child.
If a cached entry is found, the child is inspected for compatibility, and then executed.
3. The hash value is used to calculate the library cache hash bucket.
4. The hash bucket is inspected, and the pointer is followed to find the parent handle.
5. In the parent handle a pointer is followed to the parent’s heap 0.
6. In the parent heap 0 a pointer to the hash table with the cursor children followed.
7. The hash table is read from the most recent child down to the oldest child.
If a compatible child is found, that child is executed.
8. If no compatible child is found, a new child is created.

I guess this is reasonably well known.

However, there are some nuances that are interesting to understand.

The first one is that with pluggable databases, a child has the container-id in which the child is generated as a property. During the inspection of the child list, the container-id of the child must match the container-id of the executing session, otherwise the child is skipped. This prevents from getting blocked from sessions across containers. Please mind the library cache hash table and the parent handle are shared between containers, although waiting for these should be negligible. See Oracle multi-tenant and library cache isolation.

Children are inspected from most recently created to the oldest. This is logical, more recent children have the highest probability to have been optimised with the most current state of the data.

Whenever there is no compatible child found, a new child must be created.

Incomplete list insertion.
The first thing that is done to create a new child, is an entry is allocated into the child list for the new child. This is done using the function kkshinins(), and a cursor trace will show this as:

kkshinins insert child into incomplete list bi=0x7ea05290 cld=21 flg=25

This is noteworthy, because to do so, the hash table (child list) mutex must be taken in exclusive mode. If a session wants to inspect the child list, it will briefly take the hash table mutex in shared mode. If the session creating the new child does find such a session, it will wait for ‘cursor: mutex X’. The other way around: if a session wants to inspect the child list, but find it taken by a session in exclusive mode, it will wait for ‘cursor: mutex S’.

I am not sure how to read ‘incomplete list’, and whether this means that the current child list is containing a child not yet complete, or hinting at a separate list of incomplete cursors. What I do know, is that the function kkshinins() inserts the new at this point not yet created (“incomplete”) child as the very last entry of the child list.

This means that at this point, if another session needs to find a compatible child, it will scan the child list and if no children are found compatible, it will find the incomplete child as the last one. At this point once the session finds the incomplete child, it has to wait for it, and will wait for ‘kksfbc child completion’.

Child load.
The session that is creating a new child will continue to create the new child, for which the next step is to allocate the child’s heap 0. This is done in the function kksLoadChild(). After the child’s heap 0 is allocated, it will pin the child in exclusive mode. Sessions that were waiting for ‘kksfbc child completion’ now will wait for ‘cursor: pin S wait on X’. Any other sessions that scanned the child list and did not find any compatible child, will run into this (last) child and wait for ‘cursor: pin S wait on X’ too.

There is a good reason the sessions wait for the child creation. If a session would not wait for the child creation, it would not have found any suitable child, because the incomplete child is the last one on the list, and therefore also would start creating a child. The reason this doesn’t happen is that in such case, there is a fair chance that both sessions would create identical children.

At this point, after the child cursor is pinned in exclusive mode, the session will continue to create the child and perform all the steps it needs to do during parsing. In fact, child generation is the actual thing that is considered ‘hard parse’. This includes syntax checking, semantic checking, execution path selection, optimization, creation of the executable version of the cursor (in heap 6) etc.

Incomplete list removal.
When the child creation is finished, the just created child is taken off the incomplete list (function kkshindel()), and inserted into child list as first entry (function kkshhcins()). This requires exclusive access to the child list once again, so when there is high concurrency, this might yield ‘cursor: mutex X’ for the session creating the child, and/or ‘cursor: mutex S’ for the sessions willing to scan the child list. A cursor trace shows:

kkshindel remove child from incomplete list bi=0x7ea05290 cld=21 flg=30
kkshhcins insert child into hash table bi=0x7ea05290 cld=21 flg=38

I am not sure if it’s a change in the child’s status (perhaps the ‘flg’ field shown in the trace?) being taken off the incomplete list, but what does happen is that the child after being taken off the incomplete list is inserted as first entry on the list of children. The next step is that the exclusive pin of this child is changed to shared. Any sessions that were waiting for child creation in ‘cursor: pin S wait on X’ now will continue and can inspect the child to check compatibility, and if found compatible execute the newly created child. If the child it was waiting for was not compatible, it has run out of children to check and the session will start creating a new child.

This post is the result of a question that I got after presenting a session about Oracle database mutexes organised by ITOUG, as a response to the conference cancellations because of COVID-19. Thank Gianni Ceresa for asking me!

The library cache provides shared cursors and execution plans. Because they are shared, sessions can take advantage of the work of previous sessions of creating these. However, by having these shared, access needs to be regulated not to have sessions overwrite each other’s work. This is done by mutexes.

The question I got was (this is a paraphrased from my memory): ‘when using pluggable databases, could a session in one pluggable database influence performance of a session in another pluggable database’?

The answer I gave was that I didn’t test a multi-tenant scenario, but because the library cache is shared between the pluggable databases, it should be possible for a session in one pluggable database to block another session in another pluggable database.

So let’s test it!

I used an Oracle version 20.2 database, which automatically gives you a pluggable database. In fact, even when you don’t specify you want a multi-tenant database, it will create one. This is as expected and documented.

I created an instance called o202, and two pluggable databases, PDB1 and PDB2. Yes, I am a creative genius.

I logged in to PDB1 and executed a ‘select 8 from dual’.

SQL> show con_id

CON_ID
------------------------------
3

SQL> select 8 from dual;

         8
----------
         8

Now using another session in the root container, I dumped the library cache at level 16 to see how that looks like for the ‘select 8 from dual’ cursor:

SQL> show con_id

CON_ID
------------------------------
1

SQL> alter session set events 'immediate trace name library_cache level 16';

Session altered.

(don’t do this on a live environment!)
These are snippets of certain parts of the dump:

Bucket: #=100089 Mutex=0x71f7f8d8(627065225216, 6, 0, 6)
  LibraryHandle:  Address=0x67bcca90 Hash=372d86f9 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
    ObjectName:  Name=select 8 from dual
      FullHashValue=f18bf11763dd069341c61ce6372d86f9 Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=1 ContainerUid=1 Identifier=925730553 OwnerIdn=9
...
    Concurrency:  DependencyMutex=0x67bccb40(0, 1, 0, 0) Mutex=0x67bccbe0(146, 22, 0, 6)
...
        Child:  childNum='0'
          LibraryHandle:  Address=0x6e6b3fd8 Hash=0 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
            Name:  Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=3
...
            Concurrency:  DependencyMutex=0x6e6b4088(0, 0, 0, 0) Mutex=0x67bccbe0(146, 22, 0, 6)

This is bucket 100089, with its mutex. The counter for this mutex is the second argument, which is 6.
The library cache handle on the next line is the parent handle. The sql text and the hash value sit with the parent handle.
I am not sure how to interpret the “ContainerId” here, it says 1 (indicating the root container), while this SQL is only executed in container 3, which is shown above.
After the first ‘…’, the two mutexes in the parent handle can be seen: the dependency mutex, and the parent handle mutex.
After the second ‘…’, child number 0 is visible. Here the container is specified from which this child was actually executed.
After the third ‘…’, the concurrency information for the child is shown. The important bit is the dependency mutex is independent/unique, whilst the regular mutex is the same/shared from the parent.

Now what would happen if I execute the exact same SQL (select 8 from dual) in another container; PDB2 alias container id 4?

SQL> show con_id

CON_ID
------------------------------
4

SQL> select 8 from dual;

         8
----------
         8

And look at bucket 100089 again (using the dump of the library cache at level 16):

Bucket: #=100089 Mutex=0x73f81e58(8589934592, 4, 0, 6)
  LibraryHandle:  Address=0x656e2638 Hash=372d86f9 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
    ObjectName:  Name=select 8 from dual
      FullHashValue=f18bf11763dd069341c61ce6372d86f9 Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=1 ContainerUid=1 Identifier=925730553 OwnerIdn=9
...
    Concurrency:  DependencyMutex=0x656e26e8(0, 2, 0, 0) Mutex=0x656e2788(2, 31, 0, 6)
...
        Child:  childNum='0'
          LibraryHandle:  Address=0x656e0ed8 Hash=0 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
            Name:  Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=3
...
            Concurrency:  DependencyMutex=0x656e0f88(0, 0, 0, 0) Mutex=0x656e2788(2, 31, 0, 6)
...
        Child:  childNum='1'
          LibraryHandle:  Address=0x655fba58 Hash=0 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
            Name:  Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=4
...
            Concurrency:  DependencyMutex=0x655fbb08(0, 0, 0, 0) Mutex=0x656e2788(2, 31, 0, 6)

Executing an identical SQL statement (which is a cursor object in the library cache) in another PDB means the same library cache hash bucket is used, and container id is one of the sharing criteria for a child. Because the statement in both cases has a different container id, the cursors can not be shared. This is very logical if you think about it: both containers essentially are logically isolated databases, and therefore there is no way to tell if the tables, access rights, data, really anything is the same, so the SQL is parsed again for the other container.

This has some implications. Because of identical SQL, it hashes to the same value, which mean both containers need access to the same bucket to find the pointer to the parent handle. Both containers also need access to the same parent and the same heap 0 to obtain the child list to find their child. Both the bucket and the handle have a single mutex that serializes access.

Access to the parent heap 0 is done shared, and remains shared if a compatible child can be found.

However, as soon as a new child needs to be created for the same SQL, it will create the new child handle pinned in exclusive mode and insert it into the hash table/child list in the parent’s heap 0. The new child is pinned in exclusive mode, because it still needs to allocate the child heaps and create the child information. This new child is the newest entry, and therefore is the first one any session would find if it scans for a compatible child.

At this point multi-tenancy makes a difference:
– If a session in the same container starts parsing and follows the procedure of hashing the SQL text, obtaining the bucket, finding the parent handle and then scan the child list for a compatible child, it will find the child in exclusive mode for this container, and waits for the child creation to finish, waiting on the event: ‘cursor: pin S wait on X’.
– If a session in ANOTHER container starts parsing and follows the procedure of hashing the SQL text, obtaining the bucket, finding the parent handle and then scan the child list for a compatible child, it will find the child in exclusive mode, but because it has a different container id, it will skip the child, and continue to scan the child list for compatible children. The scanning means pinning each child that potentially can be compatible, which means has the same container id, in shared mode, and by doing that either finds a compatible child, or if it doesn’t, create one itself.

So what does that mean; the conclusion:
The library cache hash table, the parent cursor and the child list/hash table are shared between all containers. These require unique (library cache hash table/handle) or shared (child list/hash table) access for each database process for usage, but the actual time it is held in these modes is so short that it is negligible.
The main reason for waiting for a cursor/library cache entry is a child pinned in exclusive mode during creation (‘cursor: pin S wait on X’).
When an identical statement is executed in another container, it will skip a child that is created or in the process of being created in another container, even when it’s pinned in exclusive mode.

This means that the multi-tenant option in my opinion does not significantly increase the risk of waiting because of library cache concurrency, specifically because it can skip child cursors pinned exclusively in another container.

Update april 4: I tested having a child from one container being pinned in X mode and running the same SQL in another container in 12.1.0.2.0 (base release of 12.1.0.2) and it works exactly the same as described above. So whilst the library cache hash table and parent handles are shared, the cursor children are specific to a container and do not lock each other out.

This blogpost is about the Oracle database row or dictionary cache. This is a separate cache that caches database metadata, like database object properties or user properties.

There is surprising little in-depth technical detail about the row cache. To some degree I understand: issues with the row cache are rare.

I noticed a column in V$ROWCACHE called ‘FASTGETS’. Whatever FASTGETS means, in my database it is being used:

SQL> select cache#, type, parameter, gets, fastgets, getmisses from v$rowcache;
    CACHE# TYPE        PARAMETER                              GETS   FASTGETS  GETMISSES
---------- ----------- -------------------------------- ---------- ---------- ----------
         3 PARENT      dc_rollback_segments                   1000          9         12
         1 PARENT      dc_free_extents                           0          0          0
         4 PARENT      dc_used_extents                           0          0          0
         2 PARENT      dc_segments                            4374       7308       2496
         0 PARENT      dc_tablespaces                         1232      14466        178
         5 PARENT      dc_tablespace_quotas                      0          0          0
         6 PARENT      dc_files                                  5          0          5
        10 PARENT      dc_users                                953      13473        196
         8 PARENT      dc_objects                            32545      13682       3769
        17 PARENT      dc_global_oids                          723          0        165
        12 PARENT      dc_constraints                            0          0          0
        13 PARENT      dc_sequences                              9          0          9
        16 PARENT      dc_histogram_defs                     60851          0      19613

If you look at the description of FASTGETS in the documentation (currently the 20c version), it says: ‘Reserved for internal use’. That’s not very helpful.

Luckily, there is a trace event to make Oracle print additional information regarding the row cache: event 10222.

Please mind using events is an unsupported action unless you have explicit blessing from Oracle support to use it. Even then, it should only be set if you are 100% sure you need it, and you should unset it as soon as you can. Also document the exact reason for using it, and evaluate set events during patching and version upgrades.

Using the event, I found that a normal dictionary lookup looks like this:

kqrpre: start hash=6d2c659c mode=S keyIndex=1 dur=CALL opt=FALSE
kqrpre: found cached object po=0x7e5da720 flg=2
kqrmpin : kqrpre: found po Pin po 0x7e5da720 cid=10 flg=2 hash=70b9dd3f
time=3663934334
kqrpre: pinned po=0x7e5da720 flg=2 pso=0x793472c8
kqrpre: done po=0x7e5da720 cid=10 flg=2 eq=0x7e616a20 pso=0x793472c8 dur=CALL
kqrpre: keyIndex=1 hash=70b9dd3f 6d2c659c 0
kqrpre: time=3663934346
kqrpre: returnVal: TRUE
.
kqrprl: eq=0x7e616a20 fpInUse=FALSE
ksedsts()+418<-kqrprl()+1208<-kkdlGetBaseUser()+248<-kzulgt1()+220<-kksLockUserSchema()+125<-kksLoadChild()+1983<-kxsGetRuntimeLock()+2049<-kksfbc()+17007<-kkspsc0()+2222<-kksParseCursor()+123<-opiosq0()+2266<-kpoopr
x()+420<-kpoal8()+838<-opiodr()+1244<-ttcpip()+1239
<-opitsk()+1943<-opiino()+957<-opiodr()+1244<-opidrv()+1091<-sou2o()+191<-opimai_real()+455<-ssthrdmain()+423<-main()+275<-__libc_start_main()+245.

This is a bit cryptic, but this is a row cache parent read (kqrpre) and release (kqrprl), for cache id (cid) 10 (dc_users, see above v$rowcache output), for which the needed entry was found (found cached object). It even contains a short stack format dump of the current location of invocation.

Now see the following lookup:

kqrpre: start hash=70b9dd3f mode=S keyIndex=0 dur=CALL opt=TRUE
kqrpre: optimistic lookup hash=70b9dd3f
kqrpre: KQR_READ_PD_VER: versionAddr=0x628f2a20 version=10
kqrpre: optimistic lookup: success
kqrpre: done po=0x7f651cbb1bf8 cid=10 flg=0 eq=(nil) pso=0x79360288 dur=CALL
kqrpre: keyIndex=0 hash=70b9dd3f 6d2c659c 0
kqrpre: time=3663936115
kqrpre: returnVal: TRUE
.
kqrprl: eq=0x7f651cbb1bf8 fpInUse=TRUE
ksedsts()+418<-kqrprl()+1208<-kkdlGetUserId()+587<-kqlhdlod()+641<-kqlCallback()+120<-kqllod()+667<-kglLoadOnLock()+1107<-kgllkal()+861<-kglLock()+1429<-kglget()+302<-kglgob()+328<-qcdlgbo()+609<-qcdlgob()+982<-qcsfgob()+278<-qcsprfro()+553<-qcsprfro_tree()+380
<-qcsprfro_tree()+150<-qcspafq()+246<-qcspqbDescendents()+281<-qcspqb()+272<-kkmdrv()+192<-opiSem()+1962<-opiDeferredSem()+447<-opitca()+436<-kksFullTypeCheck()+86<-rpiswu2()+583<-kksLoadChild()+7969<-kxsGetRuntimeLock()+2049<-kksfbc()+17007<-kkspsc0()+2222
<-kksParseCursor()+123<-opiosq0()+2266<-kpooprx()+420<-kpoal8()+838<-opiodr()+1244<-ttcpip()+1239<-opitsk()+1943<-opiino()+957<-opiodr()+1244<-opidrv()+1091<-sou2o()+191<-opimai_real()+455<-ssthrdmain()+423<-main()+275<-__libc_start_main()+245.

This is lookup for cid 10 (dc_users) too, but now “optimistic”, as can be seen on line 2. The first line shows “opt=TRUE”, which gives the indication that optimistic reading is probably not done dynamically, but an explicit setting. Could this “optimistic” read be what is shown as FASTGET in v$rowcache?

Let’s look at the actual C functions that are called for such an optimistic row cache read:

> kqrpre1(0xa, 0x7ffc48561380, ...)
| > kqrpre2(0xa, 0x7ffc48561380, ...)
| | > kqrhsh(0x62f471c8, 0x7ffc48561380, ...)
| | | > kgghash(0x7ffc48561380, 0x82, ...)
| | | < kgghash+0x0000000000d0 returns: 0xc34886c9
| | < kqrhsh+0x00000000007d returns: 0xc34886c9
| | > KQR_READ_PD_VER(0x62f471c8, 0x628c7670, ...)
| | < KQR_READ_PD_VER+0x000000000044 returns: 0x4a
| | > kqrFastPoRead(0x62f471c8, 0x7ffc48561380, ...)
| | | > _intel_fast_memcmp(0x7ffc48561380, 0x78fc6834, ...)
| | | < _intel_fast_memcmp+0x000000000053 returns: 0
| | | > kglInstanceIdn(0x7f1bc9f099a0, 0x78fc6800, ...)
| | | <> kglHandleIdn(0x7f1bc9f099a0, 0x7c736fb0, ...)
| | | < kglHandleIdn+0x000000000011 returns: 0x1
| | | > kglPdbInstanceIdn(0x7f1bc9f099a0, 0x7c736fb0, ...)
| | | < kglPdbInstanceIdn+0x000000000024 returns: 0x1
| | < kqrFastPoRead+0x0000000001ac returns: 0x1

Here we see kqrpre1 call kqrpre2, which calls kqrhsh to calculate a hash value to lookup the row cache entry. Then we see KQR_READ_PD_VER being called, which is unique to the optimistic lookup, and then kqrFastPoRead. That seems to be a reasonable definite indication this is doing the FASTGET.

The logical question to ask then is what is the actual difference between a fastget and a (regular) get?

Looking through the function calls of a fastget versus a regular get the thing that strikes me is the absence of concurrency checking and protection gets in the fastget alias optimistic call. A regular row cache get performs:
– obtains the row cache hash bucket mutex
– obtains the row cache parent object mutex
– pins the (row cache entry) memory
– frees the row cache hash bucket mutex
– locks the row cache entry (kqrget)
– frees the row cache parent object mutex

– the row cache data is used (this is the actual purpose of the row cache get)

– obtains the row cache parent object mutex
– unpins the (row cache entry) memory
– frees the row cache parent object mutex
– frees the lock (there is no explicit unlock call, I assume it’s done via the state object)

A fastget does simply not do a lot of the concurrency protection measures, it checks the version of the entry before and after getting the entry, and apparently that’s enough to be sure about the consistency of the row cache entry. It does lock the entry, and releases it after usage. Especially since the number of row cache lookups can easily get very high, this probably saves a lot of CPU cycles.

Looking in the bigger picture, the fastget seems to be done at times when the database executable has fetched the same entry in a regular way earlier, so it’s safe to assume it will still be there, and it’s safe to do it without all the concurrency control around it.

Is there anything you can do about it? Not that I know, this is just a blog to explain something about which there is no documentation.

In case you’re wondering when this appeared in Oracle, the answer is 11.2.0.4. The FASTGETS column appeared in that function, as well as the C function kqrFastPoRead.

There’s also the undocumented parameter “_kqr_optimistic_reads”, which is TRUE in my (Oracle 18) instance.

This post is about one of the fundamentally important properties of a database: how IO is done. The test case I studied is doing a simple full table scan of a single large table. In both Oracle and postgres the table doesn’t have any indexes or constraints, which is not a realistic example, but this doesn’t change the principal topic of the study: doing a table scan.

I used a publicly available dataset from the US bureau of transportation statistics called FAF4.5.1_database.zip
The zipped file is 347MB, unzipped size 1.7GB.

In both cases Oracle Linux 7.7 (64 bit) is used, running in VirtualBox, with the storage being a USB3 SSD. Number of CPUs is 4, memory size is 6G. Filesystem type: xfs.
The Oracle version used is Oracle 19.5, the Postgresql version used is 12.1.
For Postgresql, the postgresql.conf file is not changed, except for max_parallel_workers_per_gather which is set to 0 to make postgres use a single process.
For Oracle, the parameters that I think are important: filesystemio_options=’setall’. Oracle is used filesystem based (so no ASM).

This is the table definition for Oracle:

create table faf451 (
  fr_origin varchar2(3),
  dms_orig varchar2(3),
  dms_dest varchar2(3),
  fr_dest varchar2(3),
  fr_inmode varchar2(1),
  dms_mode varchar2(1),
  fr_outmode varchar2(1),
  sctg2 varchar2(2),
  trade_type varchar2(1),
  tons number,
  value number,
  tmiles number,
  curval number,
  wgt_dist number,
  year varchar2(4)
);

This is the table definition for Postgresql:

create table faf451 (
  fr_origin varchar(3),
  dms_orig varchar(3),
  dms_dest varchar(3),
  fr_dest varchar(3),
  fr_inmode varchar(1),
  dms_mode varchar(1),
  fr_outmode varchar(1),
  sctg2 varchar(2),
  trade_type varchar(1),
  tons double precision,
  value double precision,
  tmiles double precision,
  curval double precision,
  wgt_dist double precision,
  year varchar(4)
);

In order for the data to be easy loadable into postgres using copy from, I had to remove ‘””‘ (double double quotes) for the empty numeric fields. In oracle I could say “optionally enclosed by ‘”‘”. For Oracle I used an external table definition to load the data.

Now, before doing any benchmarks, I have an idea where this is going. Oracle is using direct IO (DIO) so linux page cache management and “double buffering” are avoided. Also, oracle will be doing asynchronous IO (AIO), which means submitting is separated from waiting for the notification that the submitted IOs are ready, and on top of that oracle will submit multiple IO requests at the same time. And again on top of that, oracle does multi-block IO, which means that instead of requesting each 8K database block individually, it will group adjacent blocks and request for these in one go, up to a size of combined blocks of 1MB, which means it can requests up to 128 8K blocks in one IO. Postgres will request every block synchronous, so 1 8K block at a time, and waiting for each request to finish. That makes me have a strong idea where this is going.

It should be noted that postgres explicitly is depending on the operating system page cache for buffering as a design principle. Because of DIO, blocks that are read by oracle are not cached in the operating system page cache.

I executed my benchmark in the following way:
– A run for every size is executed 5 times.
– At the start of every run for a certain size (so before every “batch” of 5 runs), the page cache is flushed: (echo 3 > /proc/sys/vm/drop_caches).
– Before each individual run, the database cache is flushed (systemctl restart postgresql-12 for postgres, alter system flush buffer_cache for oracle).

I started off with 2G from the dataset, and then simply performed a ‘copy from’ again to load the same dataset into the table in postgres. Oracle required a bit more of work. Oracle was able to save the same data in way less blocks; the size became 1.18G. In order to have both postgres and oracle scan the same amount of data, I calculated roughly how much rows I needed to add to the table to make it 2G, and copied that table to save it as a 2G table, so I could insert that table to increase the size of the test table by 2G. This way in both oracle and postgres I could test with a 2G table and add 2G at a time until I reached 20G.

These are the results. As you can see in the legenda: oracle is orange, postgres is blue.
postgres oracle scan results(click graph to load full picture)

What we see, is that postgres is a bit slower with the first run of 5 for the smaller dataset sizes, which becomes less visible with larger datasets.
Also, postgres is way faster if the dataset fits into the page cache and it has been read into it. This is logical because postgres explicitly uses the page cache as a secondary cache, and the test is the only activity on this server, so it hasn’t been flushed by other activity.

What was totally shocking to me, is postgres is performing alike oracle and both roughly are able to perform at the maximum IO speed of my disk: 300MB/s, especially when the dataset is bigger, alias beyond the page cache size.

It wasn’t shocking that oracle could reach the total bandwidth of the disk: oracle uses all the techniques to optimise IO for bandwidth. But how can postgres do the same, NOT deploying these techniques, reading 8K at a time??

The first thing to check is whether postgres is doing something else than I suspected. This can simply be checked with strace:

poll_wait(3, [{EPOLLIN, {u32=18818136, u64=18818136}}], 1, -1) = 1
recvfrom(11, "Q\0\0\0!select count(*) from faf451"..., 8192, 0, NULL, NULL) = 34
lseek(20, 0, SEEK_END)                  = 335740928
lseek(20, 0, SEEK_END)                  = 335740928
kill(1518, SIGUSR1)                     = 0
pread64(5, "\f\0\0\0\310ILc\0\0\0\0D\1\210\1\0 \4 \0\0\0\0\230\237\312\0000\237\312\0"..., 8192, 846061568) = 8192
pread64(5, "\f\0\0\0HcLc\0\0\0\0D\1\210\1\0 \4 \0\0\0\0\230\237\312\0000\237\312\0"..., 8192, 846069760) = 8192
pread64(5, "\f\0\0\0\260|Lc\0\0\0\0D\1\210\1\0 \4 \0\0\0\0\230\237\312\0000\237\312\0"..., 8192, 846077952) = 8192
pread64(5, "\f\0\0\0000\226Lc\0\0\0\0D\1\210\1\0 \4 \0\0\0\0\230\237\312\0000\237\312\0"..., 8192, 846086144) = 8192
…etc…

The above strace output shows only 4 rows of pread64() calls, but this goes on. So no “secret” optimisation there.

Luckily, my VM has a new enough version of Linux for it to be able to use eBPF, so I can use biosnoop. Biosnoop is a tool to look at IO on one of the lower layers of the linux kernel, the block device interface (hence ‘bio’). This is the biosnoop output:

# /usr/share/bcc/tools/biosnoop
TIME(s)        COMM           PID    DISK    T  SECTOR    BYTES   LAT(ms)
0.000000000    postmaster     4143   sdb     R  66727776  708608     5.51
0.006419000    postmaster     4143   sdb     R  66731720  77824     11.06
0.006497000    postmaster     4143   sdb     R  66734432  786432    11.03
0.011550000    postmaster     4143   sdb     R  66731872  1310720   16.17
0.013470000    postmaster     4143   sdb     R  66729160  1310720   18.86
0.016439000    postmaster     4143   sdb     R  66735968  1310720   14.61
0.019220000    postmaster     4143   sdb     R  66738528  786432    15.20

Wow…so here it’s doing IOs of up to 1MB! So somewhere between postgres itself and the block device, the IOs magically grew to sizes up to 1MB…that’s weird. The only thing that sits between postgres and the block device is the linux kernel, which includes page cache management.

To get an insight into that, I ran ‘perf record -g -p PID’ during the scan, and then perf report to look at the recorded perf data. This is what is I found:

Samples: 21K of event 'cpu-clock', Event count (approx.): 5277000000
  Children      Self  Command     Shared Object       Symbol                                                                  ◆
-   41.84%     3.63%  postmaster  libpthread-2.17.so  [.] __pread_nocancel                                                    ▒
   - 38.20% __pread_nocancel                                                                                                  ▒
      - 38.08% entry_SYSCALL_64_after_hwframe                                                                                 ▒
         - 37.95% do_syscall_64                                                                                               ▒
            - 35.87% sys_pread64                                                                                              ▒
               - 35.51% vfs_read                                                                                              ▒
                  - 35.07% __vfs_read                                                                                         ▒
                     - 34.97% xfs_file_read_iter                                                                              ▒
                        - 34.69% __dta_xfs_file_buffered_aio_read_3293                                                        ▒
                           - 34.32% generic_file_read_iter                                                                    ▒
                              - 21.10% page_cache_async_readahead                                                             ▒
                                 - 21.04% ondemand_readahead                                                                  ▒
                                    - 20.99% __do_page_cache_readahead                                                        ▒
                                       + 14.14% __dta_xfs_vm_readpages_3179                                                   ▒
                                       + 5.07% __page_cache_alloc                                                             ▒
                                       + 0.97% radix_tree_lookup                                                              ▒
                                       + 0.54% blk_finish_plug                                                                ▒

If you look at rows 13-15 you see that the kernel is performing readahead. This is an automatic function in the linux kernel which looks if the requests are sequential of nature, and when that’s true performs readahead, so that the scan is made faster.