All about headroom and mandatory patching before June 2019
This post was triggered upon rereading a blogpost by Mike Dietrich called databases need patched minimum april 2019. Mike’s blogpost makes it clear this is about databases that are connected using database links, and that:
– Newer databases do not need additional patching for this issue (11.2.0.4, 12.1.0.2, 12.2 and newer).
– Recent PSU patches contain a fix for certain older versions (11.1.0.7, 11.2.0.3 and 12.1.0.1).
– This means versions 11.2.0.2 and earlier 11.2 versions, 11.1.0.6 and earlier and anything at version 10 or earlier can not be fixed and thus are affected.
But what is the actual issue?
The first link in the article: Recommended patches and actions for Oracle databases versions 12.1.0.1, 11.2.0.3 and earlier – before June 2019 (Doc ID 2361478.1) provides essentially the same information as Mike’s blogpost, however it additionaly mentions that the interoperability of database clients with database servers is not affected.
Mike’s article mentions the following:
The alert refers to an SCN issue which came up a while ago. The system change number (SCN) is a logical, internal timestamp used by the Oracle Database. SCNs order events that occur within the database, which is necessary to satisfy the ACID properties of a transaction. The database uses SCNs to query and track changes.
So I guess it has something to do with SCNs. Most of the links are about SCNs. The MOS article that is most descriptive is: System Change Number (SCN), Headroom, Security and Patch Information (Doc ID 1376995.1).
This article describes a lot of details:
– SCNs are necessary for the database to keep changes organised. I got an article that explains that SCNs are not unique to a transaction, but are “just” granular enough to keep changes organised.
– SCNs are an ever increasing number. SCNs are never decreased!
– SCNs have a hard limit, which version specific, and is based on the number of bits for the number:
– – The general limit is 2^48.
– – From version 12.2 on, with compatibility set to 12.2 or higher, the limit is 2^63.
– SCNs have a per second increasing soft limit, called ‘RSL’ or ‘reasonable SCN limit’, which is version specific:
– – The general soft limit is 16384 (16k) SCNs per second calculated by the number of second from Januari 1st, 1988 times 16384.
– – From version 12.2 on, with compatibility set to 12.2 or higher, the limit is 98304 (96k) SCNs per second calculated by the number of seconds from Januari 1st, 2008 times 98304.
– The RSL can not be exceeded, if a database session tries to go beyond the soft limit, an ORA 600 [2252] is raised and the transaction is rolled back. This will not corrupt data (but obviously the data in the transaction is not applied).
– The difference between the current SCN and the RSL SCN is known as ‘SCN headroom’.
– There have been bugs that can increase SCNs at a higher rate, decreasing the SCN headroom or even reaching the soft limit, but these have all been solved in the Januari 2012 CPU/PSU/patch bundles.
– When databases communicate which each other via a database link, the SCNs of both databases are synchronised by picking the highest of the two.
But it doesn’t really explain why patches must be applied before June 2019. However, another article is more concrete on that: Recommended patching and actions for Oracle database versions 12.1.0.1, 11.2.0.3 and earlier – before June 2019 (Doc ID 2335265.1). The first interesting thing mentioned is:
3. What is the change introduced by the patches listed above?
These patches increase the database’s current maximum SCN (system change number) limit.
At any point in time, the Oracle Database calculates a “not to exceed” limit for the number of SCNs a database can have used, based on the number of seconds elapsed since 1988.
So, this means the patched database have a change (increase) in the RSL.
These recommended patches enable the databases to allow for a higher current maximum SCN limit. The rate at which this limit is calculated can be referred to as the “SCN rate” and these patches help allow higher SCN rates to enable databases to support many times higher transaction rates than earlier releases.
And this means the RSL is increased from the 16k per second since Januari 1988 to the 96k per second since Januari 2008.
Please note that the patches only increase the max limit but the current SCN is not impacted. So, if all your databases don’t have any major change in transaction rate, the current SCN would still remain below the current maximum SCN limit and database links between newer (or patched) and unpatched databases would continue to work. The patches provide the safety measure to ensure that you don’t have any issue with dblinks independent of any possible future change in your transaction rate.
With the patches applied, this change in current maximum SCN limit will happen automatically starting 23rd June 2019.
That is important information! So with the patch applied to some databases and not applied to some other databases and patched and non-patched databases have database links between them, everything should remain working. This is true for any database at this moment, because the change of the limit will happen on the 23rd of June 2019, which at the time of writing is in the future.
Once the change of limit has happened at the 23rd of June 2019, database links between older, non-patched versions of the database and newer or patched versions of the database could be affected if after the 23rd of June 2019 the SCN rate is increased in a newer or patched database and it runs beyond the headroom available in databases with the 16k rate, because the earlier mentioned SCN synchronisation then will fail because it’s beyond the 16k rate database headroom meaning it can not synchronise with the newer dataebase.
So the problem we are talking about here is two databases talking with each other over a database link, which have a different RSL, which could lead to a situation that one database is at an SCN number which is too high for the other older or non-patched database, meaning the communication will fail, which will persist until the older or non-patched databases is able to reach the SCN number of the newer database over time, if that is possible at all.
A thing that is not clear to me at this point: the documentation to me seems to suggest that Oracle version 12.2 with compatibility set to 12.2 or higher versions already allows the higher number of SCNs per second. (the above point: – – From version 12.2 on, with compatibility set to 12.2 or higher, the limit is 98304 (96k) SCNs per second calculated by the number of seconds from Januari 1st, 2008) If that is true, the issue that is warned for could potentially already happen!
Luckily, there is are ways to investigate this:
The reported “newer” versions and the versions that are patched for the rate change have an SGA variable “kcmscnc_” that lists the current SCN rate of the database. There are 3 values that I have seen for “kcmscnc_”:
1: This is the rate of 16k per second since Januari 1st 1988.
2: ?
3: This is the rate of 96k per second since Januari 1st 2008.
SQL> select ksmfsnam, ksmfsadr, ksmfssiz from x$ksmfsv where ksmfsnam like 'kcmscnc_'; KSMFSNAM KSMFSADR KSMFSSIZ ---------------------------------------------------------------- ---------------- ---------- kcmscnc_ 000000006001579C 4 SQL> oradebug peek 0x6001579C 4 [06001579C, 0600157A0) = 00000001
So, this databases is capable of switching (because kcmscnc_ exists), and is currently running with the 16k per second threshold.
In fact, I tested this on 11.2.0.4, 12.1.0.2 and 18.3.0.0, all these version report currently (currently is before June 2019) “1” or “scheme 1”. So the above mentioned rate of 96k for 12.2 and above does currently not apply for the soft limit or SCN headroom calculation for any database, including 12.2 and 18.3.
There is a less “hardcore” way to obtain this information, via the DBMS_SCN. This package equally to the “kcmscnc_” variable only exists if the database is of a version or patch version high enough to switch:
declare v_rsl number; v_headroom_in_scn number; v_headroom_in_sec number; v_cur_scn_compat number; v_max_scn_compat number; begin dbms_scn.getcurrentscnparams(v_rsl, v_headroom_in_scn, v_headroom_in_sec, v_cur_scn_compat, v_max_scn_compat); dbms_output.put_line('reasonable scn limit (soft limit): '||to_char(v_rsl,'999,999,999,999,999,999')); dbms_output.put_line('headroom in scn : '||to_char(v_headroom_in_scn,'999,999,999,999,999,999')); dbms_output.put_line('headroom in sec : '||v_headroom_in_sec); dbms_output.put_line('current scn compatibility scheme : '||v_cur_scn_compat); dbms_output.put_line('max scn compatibility scheme : '||v_max_scn_compat); end; / SQL> / reasonable scn limit (soft limit): 16,104,861,483,008 headroom in scn : 16,104,860,802,471 headroom in sec : 982962695 current scn compatibility scheme : 1 max scn compatibility scheme : 3 PL/SQL procedure successfully completed.
This is executed in a version 18.3.0.0 database. So the conclusion here is that currently all versions up to 18.3.0.0 are still compatible, because they all use the same SCN limit per second, which is referred to as ‘scheme 1’. However, on June 23, 2019 newer versions of the database will switch to a new scheme, which is referred to as ‘scheme 3’, which allows an more headroom.
First of all, I hope a lot of databases out there have enough headroom in the first place, and a modest rate of SCNs it is using per second, which means it doesn’t advance into the RSL. In such a case, when you got older versions that can not switch to scheme 3 and newer versions that will, I can see no reason to worry at all.
The second case is when your database is close to running out of headroom currently in scheme 1, and the SCN rate in the database is also close the limit, so you truly should worry when your database switches to scheme 3, it might actually run over the 16k per second limit, and when it does that long enough to run over RSL of scheme 1, communication over a database link between the current scheme 3 database with a scheme 1 database will be disrupted until the scheme 3 database SCN is low enough again to fit the scheme 1 RSL. If the SCN rate persists in the scheme 3 database, communication will be impossible between the scheme 1 and a scheme 3 database.
So, at this point it should be clear that the most important statistic for determining issues between scheme 1 and after June 2019 scheme 3 databases is the current headroom in your databases. For all of the databases involved that will have a database link between a scheme 1 and a scheme 3 database, you should investigate their SCN number and SCN rate. If some of these databases have run into the soft limit ora-600, ora 600 [2252], these are potential candidates for running over the soft limit when they switch to scheme 3.
However, the most important statistic is to see how close the current SCN is to the scheme 1 soft limit. This can be done using the following SQL (this SQL does not need a newer version of the database, and is tested with version 11.2.0.2 and higher):
col "RSL scheme 1" format 9,999,999,999,999,999 col "current value" format 9,999,999,999,999,999 select dbms_flashback.get_system_change_number "current value", ((((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) + ((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) + (((to_number(to_char(sysdate,'DD'))-1))*24*60*60) + (to_number(to_char(sysdate,'HH24'))*60*60) + (to_number(to_char(sysdate,'MI'))*60) + (to_number(to_char(sysdate,'SS')))) * (16*1024)) "RSL scheme 1", round(dbms_flashback.get_system_change_number/((((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) + ((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) + (((to_number(to_char(sysdate,'DD'))-1))*24*60*60) + (to_number(to_char(sysdate,'HH24'))*60*60) + (to_number(to_char(sysdate,'MI'))*60) + (to_number(to_char(sysdate,'SS')))) * (16*1024))*100,5) "% to RSL scheme 1" from dual;
If a database is close the RSL (roughly speaking beyond 90%-95%), the next thing to do is measure if the database keeps on using SCNs and keeps on being close to the RSL. If that is true, an additional increase in SCN usage could in the current situation using scheme 1 lead to an ora-600 [2252], but if that database has switched to scheme 3 after June 2019, there will not be anything keeping that database from going beyond an SCN number that will exceed the RSL of scheme 1, which will then cause issues if that database has a database link with a scheme 1 database.
Is there anything you can do if you suspect or know a database will go over the scheme 1 RSL? Purely for this issue, the obvious solution would be to make sure you are on a version that will switch to scheme 3 on June 2019, so at least after June 2019 it will not run into ora-600 [2252].
However, if such a scheme 3 database needs to connect to an older scheme 1 database, you have two choices:
1. Potentially run over the the scheme 1 limit and disrupt the database link communication.
2. Stop a newer database from switching to scheme 3, potentially disrupt changes in the current database, but it guarantees the database link will always work.
To look into the switch to scheme 3, which oracle calls ‘auto rollover’, the following SQL can be used:
declare v_autorollover_date date; v_target_compat number; v_is_enabled boolean; begin dbms_scn.getscnautorolloverparams(v_autorollover_date, v_target_compat, v_is_enabled); dbms_output.put_line('auto rollover date : '||to_char(v_autorollover_date,'YYYY-MM-DD')); dbms_output.put_line('target scheme : '||v_target_compat); dbms_output.put_line('rollover enabled (1=yes): '||sys.diutil.bool_to_int(v_is_enabled)); end; / SQL> / auto rollover date : 2019-06-23 target scheme : 3 rollover enabled (1=yes): 1 PL/SQL procedure successfully completed.
If you want to prevent a database from rolling over to scheme 3, the procedure dbms_scn.disableautorollover can be used:
exec dbms_scn.disableautorollover;
Obviously, the procedure enableautorollover does the opposite. Please mind to contact Oracle support before doing this with your production database, this is an undocumented procedure at this time.
Also mind that if you create a new database after June 23, 2019, with a new or patched version that can switch to scheme 3, it will probably be running scheme 3 by default. If you want to be absolutely sure it will not exceed the scheme 1 limit, you can revert it to scheme 1 manually using the alter database set scn compatibility N command in mount mode:
SQL> startup mount; ORACLE instance started. Total System Global Area 1048574496 bytes Fixed Size 8665632 bytes Variable Size 281018368 bytes Database Buffers 616562688 bytes Redo Buffers 142327808 bytes Database mounted. SQL> alter database set scn compatibility 1; Database altered. SQL> alter database open; Database altered.
For this too I would strongly advise to contact Oracle support first. The purpose of this blogpost is to define the problem, show all the technical details that have to do with it, and show all the tools that are part of it. There is in no way anything in this article to tell you what you should do, it just shows everything that surrounds the switch to scheme 3 in June 2019.
Another view that might be beneficial is x$kcmscn. This view seems to be created to help looking if a scheme 3 database can connect to a scheme 1 database:
col cur_max_scn format 999,999,999,999,999 col pre_11_2_0_2_cur_max_scn format 999,999,999,999,999 select * from x$ksmscn; ADDR INDX INST_ID CON_ID CUR_MAX_RATE CUR_SCN ---------------- ---------- ---------- ---------- ------------ ---------- CUR_MAX_SCN DIFF_IN_DAYS DIFF_IN_HOURS PRE_11_2_0_2_CUR_MAX_SCN -------------------- ------------ ------------- ------------------------ PRE_11_2_0_2_DIFF_IN_DAYS ------------------------- 00007F773DEDAE10 0 1 0 0 800918 16,108,830,064,640 11379 273112 16,108,830,064,640 186444783
Because currently (before June 2019) every database by default will be in scheme 1, the cur_max_scn and pre_11_2_0_2_cur_max_scn are identical. I even believe the column naming is wrong, the first version that can switch if it is patched to a high enough PSU version is 11.2.0.3, I do believe the column name is suggesting scheme 1 databases are databases of a version lower than 11.2.0.2, not including 11.2.0.2.
Conclusion.
I think there’s been a lot of fuzz for something that in most cases is not an issue. This article is supposed to give you all the knowledge and the tools to determine how it looks like in your situation.
This might be an issue if you happen to have one or more databases that are high on SCN numbering, and continues to take a lot of SCN numbers, and will be converted to a scheme 3 database on June 29, 2019 and is suspected to increase on taking SCN numbers for whatever reason AND it has a database link to a scheme 1 database that remains scheme 1. That’s a lot of ifs.
On the other hand you only need one database to be high in SCN numbering which continues to take a lot of SCNs keeping it close to the soft limit, which will propagate its SCN to other databases if it is linked, or the required properties of the problem spread out over multiple linked databases.
Again, I do not advise anything in this article, the purpose here is to provide all the details that surround it so you can make the best decision for yourself.
Pingback: Databases need to be patched to a minimum before April 2019
Would be helpful to link to the original blog entry
Hi John, what do you mean? Mike’s post? I linked it?
I did mean Mikes post and I couldn’t see any highlighting to show a link. Apologies Frits if I missed see it. John
No problem. I’ve heard the complaint that links are hard to see before.
Pingback: You MUST patch 12.1.0.1 and 11.2.0.3 and older before June 2019
As you pointed out, SCNs can only increase.
As of 23 June, if a db link between a scheme 1 and a scheme 3 database, wants to increase the SCN to a number greater than 2^48, the transaction will fail, as you also rightfully pointed out.
But SCNs are also increased with every ‘regular’ (non-dblink) transaction (including internal db transactions).
Let’s assume the scheme 3 database negotiates an SCN just below 2^48. The transaction will be successful. But isn’t there a danger that the scheme 1 database will simply stop working soon afterwards, because it needs to increase its SCN to a value beyond 2^48 during one of the following transactions?
I.e., db links failing to work may be the least of your worries.
If a database does absolutely nothing, it will still increase it’s global SCN every 3 seconds because of the 3 second log writer timeout. However, in almost any case that I have seen so far this is not an issue, both for scheme 1 and scheme 3 databases.
A database, even a busy one, will normally be way below RSL, for both scheme 1 and 3.
I think I can very bluntly say that if a database is close to its RSL, there must have been issues in the past which artificially increased the SCN value significantly.
The solution from oracle is to change the end value for an SCN and to change the scheme, which has become scheme 3.
If a database is changed to scheme 3, and happily runs within its (scheme 3) RSL, but the SCN value is high enough to be close to the sceme 1 RSL, then a database link could fail because the scheme1 databases rejects the scheme3 SCN.
Please mind there is another mechanism in play, which is that a database can stop processing if it deems the SCN value to ramp up too quickly. This too is to prevent a database from running out of SCN numbers.
Hello Frits,
I followed the same reasoning as you, but it seems that we have been misinformed by Oracle about the current SCN limit. The Doc ID 1376995.1 suggests we are still at 16K/sec. The document has a low ID, this means is an old document. It has in my opinion been improperly revised recently.
When you read this blog, https://orainternals.wordpress.com/2012/01/19/scn-what-why-and-how/, written in 2012, then you will notice that systems (11.2.0.2 and more, as we found out) have a parameter _max_reasonable_scn_rate, which is already on 32K. I tested this out on an SE 11.2.0.2. Changing the system time to the past, shows that it can go over 16k/sec, except when I set the parameter back to 16k, then it fails at startup with ORA-600.
This means, our limit is twice as high as you thought, and there is much less risk of hitting the limit. You probably are at 45 – 49 % of the limit and not 90 – 99 %. I suppose the document was created after the hotbackup bug, and recently the notion 32K was replaced by 96K, leaving the rest untouched.
I don’t know why they are switching from 32K to 96K, but I suppose it has to do with a possible DOS (Denial Of Service) hack, if you succeed connecting to a database with a 99.999 % SCN during peak time. They probably learned from their previous switch that it should be done in all systems simultaneously and not only in a new release. Enlarging the rate will probably not be the only change, since no rate can protect you against this. It also makes no sense having a user defined rate, since it should be the same on all interconnected databases.
Regards,
Dirk Vanhaute
Thank you for your response Dirk!
Yes, Oracle made several movements in this area, and most of them are not documented, nor brought to our attention in any other way. The only thing that has been brought to our attention is the date June 2019, which I think is supposed to be Oracle’s big change to solve SCN issues “for once and all”.
Based on rumour, so don’t trust me on this, Oracle has had a couple of lawsuits of companies having issues with SCNs and their rate leading to unavailability of these databases, which made Oracle make changes to make sure that that can’t happen again. One of the things you might notice is that you can’t use the parameters or events (I don’t exactly remember which), which could make you bump up the SCN in case of SCN issues.
That’s indeed what Oracle Support told me today. My 11.2.0.2 still has the parameters because the CPUJan2012 was not installed, but they advise us not to change it and install the patches instead, with fixed 32K/Sec, I presume.
My Oracle Enterprise database 11.1.0.7.24 (PSU 24) has not such a parameter as ‘kcmscnc_’.
The other requests with “dbms_scn” shows “target scheme:3” and “rollover enabled (1=yes):1”
Is this situation safe?
11.1 does not have a scheme 3, nor has the kcmscnc_ variable, which is why it says so.
all databases that have kcmscnc_ will have target set to 3 and rollover enabled.
this means these databases can have an higher SCN value and the older databases can’t, POTENTIALLY leading to issues.
however, as I indicated to several of my clients, this ONLY is an issue of your database has database links between version 1 and 3 databases AND a really high SCN value and rate of SCN usage in a version 3 database, which runs over the version 1 limit in the version 3 database.
That is exceptional. Please do mind that because of the SCN synchronisation, you only need 1 with a high SCN value, and it’ll be replicated through all linked databases, and huge SCN usage in a version 3 database.