How an ASM diskgroup is found by the database.

The short answer is simple: this is done by the ocssd (the Oracle Cluster Synchronization Services daemon).

But how does that work in more depth?

The ocssd is needed for ASM. We all know that, because the Database Configuration Assistant (dbca) tells us so when we create an ASM instance. The ocssd is started from inittab; the ocssd installation procedure ($ORACLE_HOME/bin/localconfig add) adds an entry to /etc/inittab:
h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1

This entry tells us the following information:
-/etc/init.d/init.cssd is run in runlevels 3 and 5 (35)
-if init.cssd dies, init will start it up again (respawn)
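
To make the four fields of that entry explicit, here is a minimal Python sketch that splits a sysvinit inittab line into id, runlevels, action and process. The names InittabEntry and parse_inittab_line are my own and purely illustrative; they are not part of any Oracle or init tooling.

```python
from typing import NamedTuple


class InittabEntry(NamedTuple):
    ident: str      # unique id of the entry, e.g. "h1"
    runlevels: str  # runlevels the entry applies to, e.g. "35"
    action: str     # what init does with it, e.g. "respawn"
    process: str    # the command line init executes


def parse_inittab_line(line: str) -> InittabEntry:
    """Split one inittab line of the form id:runlevels:action:process."""
    # Split at most 3 times: the process field may itself contain colons.
    ident, runlevels, action, process = line.strip().split(":", 3)
    return InittabEntry(ident, runlevels, action, process)


entry = parse_inittab_line(
    "h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1"
)
print("runlevels:", list(entry.runlevels))
print("action:", entry.action)
print("process:", entry.process)
```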

The location of the init.cssd script is a little confusing: the directory /etc/init.d suggests it is a regular service script managed through chkconfig and the start and stop scripts. That is not the case; it is started directly by init via inittab.

Once the ocssd has started, it opens a number of file descriptors:
-some message files:
-- $ORACLE_HOME/srvm/mesg/procus.msb
-- $ORACLE_HOME/css/mesg/clssus.msb
-- $ORACLE_HOME/has/mesg/clsdus.msb
-some logging:
-- $ORACLE_HOME/log/hostname/cssd/cssdOUT.log (3 times)
-- $ORACLE_HOME/log/hostname/cssd/ocssd.log
-- $ORACLE_HOME/log/hostname/alertpmoracle.log
-the cluster registry disk, which can be a file on the local (non-cluster) filesystem because the ocssd is the only process/daemon which uses it:
-- $ORACLE_HOME/cdata/localhost/local.ocr
-some sockets on the local filesystem:
-- /var/tmp/.oracle/shostnameDBG_CSSD
-- /var/tmp/.oracle/sOracle_CSS_LclLstnr_localhost_0
-- /var/tmp/.oracle/sOCSSD_LL_hostname_
-- /var/tmp/.oracle/sOCSSD_LL_hostname_localhost
-and some network sockets:
-- UDP [::1]:1027
-- UDP [::1]:1028
-- UDP [::1]:1029
-- UDP [::1]:1030
-- TCP
-- TCP

(this information was found by looking in /proc/pid/fd and by using 'lsof')
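
For readers who want to repeat the /proc part of this on their own system, a small Python sketch follows. It assumes a Linux /proc filesystem and enough privileges to read the fd directory of the target process (for the ocssd that usually means running as its owner or as root). The function name list_open_fds is my own; the example runs against its own PID only, so you would pass the PID of the ocssd yourself.

```python
import os


def list_open_fds(pid: int) -> dict:
    """Return {fd number: target} for a process, read from /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            # Every entry is a symlink to a file, "socket:[inode]", "pipe:[inode]", ...
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


# Usage: inspect this Python process itself (only where /proc is available).
if os.path.isdir(f"/proc/{os.getpid()}/fd"):
    for fd, target in sorted(list_open_fds(os.getpid()).items()):
        print(fd, target)
```

Sockets only show up as socket:[inode] here, which is why lsof is still needed to see the socket paths and ports.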

When the ASM instance is started, it announces itself to the ocssd using the gmon background process:

First phase:
(this sequence of communications and actions is done for every process which contacts the ocssd;
the 'authentication sequence')
-Communication is initiated using the socket /var/tmp/.oracle/sOCSSD_LL_hostname_
This means a new private socket is created for the communication between ocssd and the ASM instance.
-A directory and a file are created by the ocssd: $ORACLE_HOME/css/auth/directory
The directory name is an 8-digit hexadecimal number, and so is the filename.
-The ocssd communicates the path+filename through the socket
-Next the ocssd gets two messages through the socket
The second one triggers the ocssd to read the file with the hexadecimal name.
-The ocssd reads the file which consists of 4 bytes
-The ocssd reads /etc/passwd
-The ocssd sends a message and receives one
-The directory with the 8 digit hexadecimal name is removed
-The ocssd reads a message containing the PID of the process which starts the ASM instance.

-The ocssd writes a message with its $ORACLE_HOME
-The ocssd reads a message
-The ocssd cleans up the private socket.
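
The file-based part of this sequence can be imitated with a few lines of Python. This is only a sketch of the general idea (a daemon hands out a randomly named file and later checks 4 bytes in it), not Oracle's actual protocol: I did not see who writes the 4 bytes or what they contain, so the client writing a token is an assumption, and the /etc/passwd lookup and the socket messages are not modeled at all. All function names are hypothetical.

```python
import os
import secrets
import shutil
import tempfile


def create_challenge(auth_root: str) -> str:
    """Daemon side: create <auth_root>/<8 hex digits>/<8 hex digits>."""
    dir_path = os.path.join(auth_root, secrets.token_hex(4))  # 4 bytes = 8 hex digits
    os.mkdir(dir_path)
    file_path = os.path.join(dir_path, secrets.token_hex(4))
    open(file_path, "wb").close()  # create the (still empty) file
    return file_path


def answer_challenge(file_path: str, token: bytes) -> None:
    """Client side (ASSUMPTION): write a 4-byte token into the file."""
    with open(file_path, "wb") as f:
        f.write(token)


def verify_challenge(file_path: str, expected: bytes) -> bool:
    """Daemon side: read the 4 bytes, then remove the directory again."""
    try:
        with open(file_path, "rb") as f:
            data = f.read()
    finally:
        shutil.rmtree(os.path.dirname(file_path))
    return len(data) == 4 and data == expected


# Usage, with a temporary directory standing in for $ORACLE_HOME/css/auth.
auth_root = tempfile.mkdtemp()
challenge_path = create_challenge(auth_root)
token = secrets.token_bytes(4)
answer_challenge(challenge_path, token)
authenticated = verify_challenge(challenge_path, token)
print("authenticated:", authenticated)
shutil.rmtree(auth_root)
```

The point of such a scheme would be that only a process on the same machine, with access to the same filesystem, can touch the file the daemon named through the socket.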

Second phase:
-The authentication sequence is done
-The ocssd reads a message with the diskgroup in it
This is the ASM instance announcing its diskgroup (one in my case).
-The ocssd writes a message with the diskgroup in it
-The socket is kept open

Third phase:
-The authentication sequence is done for the gmon process of the ASM instance
-The gmon process and the ocssd communicate the following things:
--instance name (+ASM)
--diskgroup name(s); with and without 'DG_' in front of the diskgroup name

At this point the ASM instance is running.

The database which uses ASM does roughly the same:
(please note that the 'authentication sequence' is done for every process, for every communication with the ocssd)
-The startup process asks ocssd about diskgroups, $ORACLE_HOME and instance name
This is done several times using several processes
-The first background process which asks ocssd about ASM is the ASMB process
-Then a process called 'oracle_ASM instance name_asmb_database name' gets information
This process is not listed in the database instance in v$session (?). This process also looks at the registry disk.
It creates tracefiles in the trace directory of the 'normal' database.
-Then the checkpoint process
-Then the RBAL process.
This is the first time the keyword 'LOCAL' appears, presumably to declare the local failgroup (?)
-Then a process which is gone after startup of the database
-Then dbw0
-Then lgwr
-Then the server process of the sqlplus session which started the database
-Then mmon, smon, dbrm, fbda, q001
-Then 5 processes which do not exist anymore when the database opens
