Using git for revision control for Oracle DBAs

This post is aimed at people working with code, scripts and/or any other means of textual files. I try to give my point of view on revision control and git as revision control system in particular.

The first thing you should ask yourself is: why using revision control in the first place? I think that’s a good and fair question. A lot of people I talk to see revision control as something that’s for developers in projects with multiple people working on the same code to build history and provide a single point of truth. I think revision control in that situation indeed is needed (in fact, I think it is a necessity).

Let’s now look at the situation of a DBA. Most people I work with built up their own bundle of scripts to circumvent constantly redoing stuff they’ve worked out in the past, and/or scripts they gotten or borrowed from other people (in fact, if you do not keep a copy of the excellent script bundles of Tanel Poder and Kerry Osborne you’re either brilliant enough to do this all from the top of your head, or stupid enough not to use them or know them, with a fair chance of falling in the second category).

Probably most (if not: all) people change scripts, create entirely new ones, experiment, etc. How many times where you searching for this specific version/modification you made *somewhere*, but forgot where it was? In my own situation, when doing research, experiments and investigations, I use a lot of virtual machines (with/without ASM, different operating system versions, different Oracle versions; there are huge differences between 11.2.0.1/2/3/4, and Oracle version 12), and need my scripts and what to conventiently move changes and use newly created scripts among these machines. I don’t think I even have to go in the area of seeing the history of changes to a certain script or the repository as a whole: this is something you will use when using revision control, or miss sorely if your scripts are not in revision control.

Once you or your team is convinced you need revision control the immediate second thing which always pops up (in the situations I worked in, I am aware there are other revision control systems) is subversion or git? If you look at the title of this blog you know where I end up, but it’s good to give this some thought. Subversion is a decent revision control system, with which I’ve worked with great pleasure in the past. The history I’ve read on subversion is that it was made to be an open source version of the CVS revision control system and overcome some of the problems/limitations of it. Subversion is a revision control system that works in a client/server way: there is the central repository, and clients check out the source from that. This is no problem when client and server are on the same machine or in the local network, and even less a problem if there’s only one user.

But what if there are multiple persons working with it? And these people are located at vast distances from each other? And some of the people do not have internet access all the time? The problems that arise from that are a lot of potential problems with concurrent versions, performance can be very bad, because you need to connect to the central repository, and you need a connection to the repository in the first place to commit your change. From what I’ve read from the git revision control system, these were some of the problems Linus Torvalds wanted to overcome with git.

A git repository always works local on the machine you are actually working. You can link remote git repositories and push your changes to one or more repositories, or the other way around pull changes from a remote repository to your repository. I guess most people immediately understand why this easily overcomes a lot of problems that where the result of having a single point as the repository. You always work in your own repository.

I’ve found this (git versus subversion) yet another topic people easily get into a ‘religious discussion’, which means the discussion isn’t about actual properties and pros and con’s of both revision control systems, but on personal preference and sometimes ego. Another thing which is closely related to this is there is a learning curve if you need to start with using git.

Installation
I work on Mac OSX. In order to get git, just install the Xcode package from the App Store (free). On Oracle Linux, yum install the git package. I guess on RedHat this works the same. On Windows, download the installer from http://msysgit.github.com/ and run it. (please mind the examples in this blog are on OSX, and applicable to linux)

Initial configuration
Once you’ve installed git, the next thing to do is set your credentials:

$ git config --global user.name "John Doe"
$ git config --global user.email johndoe@example.com

I’ve set some aliases (which I’ve found in articles on git), to shorten typing a bit:

$ git config --global alias.co=checkout
$ git config --global alias.ci=commit
$ git config --global alias.st=status
$ git config --global alias.hist=log --pretty=format:"%h %ad | %s%d [%an]" --graph --date=short

Okay, at this point you are set, but no repository has been created yet.

Create a repository
The next obvious step is to create a repository:

$ git init

This creates a .git directory in the current working directory, which means you are now in the root directory of the newly created repository. Please mind you can initialise a repository in a root directory of a current project. It will still be an empty repository.

One of the other ways to start using a git repository which might have caught your eyes a few times, is clone ($ git clone git://github.com/username/reponame.git); which creates a local revision controlled copy of a remote git repository.

Let’s say I got a directory which contains a file ‘a’, and a subdirectory ‘tt’ which contains ‘b’ and ‘c':

$ find .
.
./a
./tt
./tt/b
./tt/c

Now start a git repository in the root:

$ git init
Initialized empty Git repository in /Users/fritshoogland/t/.git/

Now let’s look at how our repository looks like:

$ git status
On branch master

Initial commit

Untracked files:
  (use "git add ..." to include in what will be committed)

	a
	tt/

nothing added to commit but untracked files present (use "git add" to track)

Here we see git showing us that there are no files in it, but that it sees our test files, which are currently untracked. Let’s add all the files:

$ git add *

If we look at the status of the repository, we see the files are added, but not yet committed:

$ git status
On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   a
	new file:   tt/b
	new file:   tt/c

And commit the additions:

$ git commit -m 'initial commit.'
[master (root-commit) 735ed81] initial commit.
 3 files changed, 3 insertions(+)
 create mode 100644 a
 create mode 100644 tt/b
 create mode 100644 tt/c

If you commit something, git wants you to add a remark with this commit. This can be done with the git command by adding ‘-m’ and a comment within quotation marks, or by omitting this, which makes git fire up an editor, in which you can type a comment.

Changes
At this point we have our files committed to the repository, and the files and repository are completely in sync:

$ git status
On branch master
nothing to commit, working directory clean

Now let’s change something. My file ‘a’ has got one line in it, with a single ‘a’. I add a second line to the file ‘a’ reading ‘second line':

$ cat a
a
second line

Now let’s ask git for the status:

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   a

no changes added to commit (use "git add" and/or "git commit -a")

Here we see that git knows about the modification in file a. The next thing we see is the line ‘Changes not staged for commit’. This is something which is different from subversion. Instead of changes which are committed, git follows a two stage approach: a changed file needs to be staged for commit first, after which it can be committed. This allows you to group changed files for a commit, instead of committing all.

Now let’s stage the change as indicated by the help provided with ‘git status’, using ‘git add’. Please mind the ‘add’ command is used both for making files version controlled and adding them to the stage list. I issued a subsequent git status to show the new status:

$ git add a
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   a

We can look at the difference between the version in git and changes made by using diff –cached:

$ git diff --cached
diff --git a/a b/a
index 7898192..c4fbe32 100644
--- a/a
+++ b/a
@@ -1 +1,2 @@
 a
+second line

Now let’s commit the change, and look at the status:

$ git commit -m 'added line to a.'
[master 17ec0e8] added line to a.
 1 file changed, 1 insertion(+)
$ git status
On branch master
nothing to commit, working directory clean

Now we build a tiny bit of history, we can look at it using git log. This command can be used to look at the changes of the entire repository, or at the history of a specific file. I use the alias ‘hist’ created above:

$ git hist a
* c8ec93c 2014-05-25 | added line to a. (HEAD, master) [FritsHoogland]
* 739f11a 2014-05-25 | initial commit. [FritsHoogland]

One common thing you want to do is to see what changes are made to a file. To see what the changes are between the current version in the repository (which is called ‘HEAD’, which means ‘last commit in current branch’), use git diff, and the change hash (look at the above example of git hist):

$ git diff 739f11a a
diff --git a/a b/a
index 7898192..c4fbe32 100644
--- a/a
+++ b/a
@@ -1 +1,2 @@
 a
+second line

Of course you can also checkout that specific version (that’s the reason for having version control!):

$ git checkout 739f11a a

Now the file a is reverted to the version without the line ‘second line’ in it. Because we changed the file a now again, git considers it changed:

$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   a

So after checking out this version you can choose to commit this as new version, or revert it back to it’s original version using get reset HEAD (as described with the status command).

Remote repositories
However, everything described here until now are changes and version control done locally. How about collaboration with a team, like subversion has with the subversion server? Well, this is where git is inherently different. Git works in a peer-to-peer fashion, instead of a client-server way like subversion.

You can link your repository to a remote git repository, and push changes made in your local repository to a remote repository, or pull changes made in a remote repository to your own local repository. Especially if you work with a team, this is how you can centralise the source code in a very organised way (you can add a web interface for example).

These is how I create a remote repository on my synology NAS:

$ ssh Username@nas.local                                             # log on to remote server
$ mkdir burn.git                                                     # create directory for repo
$ cd $_
$ git init --bare                                                    # create empty git repository
$ git update-server-info                                             # not sure if this is needed
$ git remote add origin Username@nas.local:burn.git                  # add remote
$ git push -u origin master                                          # push local master to origin 

One of the things I use to keep track of current development version and the public client version of scripts is tags. This command tags the last commit with “prod 1.0″:

$ git tag "prod 1.0" HEAD

In order to use tags, you need to see which tags exist. This is very simple with the ‘git tag’ command. To look at the tags of a remote git repository, use:

$ git ls-remote --tags Username@nas.local:gdb_macros.git

If you want to get the whole repository for production usage, without the versioning, use the ‘git archive’ command (remove –remote and argument for archiving from a local repository). This version creates a gzipped tarball:

$ git archive master --remote Username@nas.local:gdb_marcos.git --format=tar | gzip > gdb_macros.tgz

You can also archive a tag, instead of the latest version. This version creates the files in the current directory:

$ git archive "prod 1.0" --remote Username@nas.local:gdb_macros.git --format=tar | tar xf -

I hope this blog encouraged you to put your files in a (git) repository.

About these ads
9 comments
  1. +1

    I never understood why database code (including scripts) is not handled like other code and stored in a decent revision control system. And git is a very decent revision control system.

    By the way: my only problem with Tanel’s scripts is that I can’t remember the script names – and just as with posix commands you don’t know what happens when you hit the wrong key by accident.

    • yes, the missing familiarness may be the key.

      The index in the blog post is helpful and I would even say that it is a good idea to check all the scripts under http://blog.tanelpoder.com/files/scripts/ and determine their purpose – but I am not able to memorize that for example a script named acc.sql is meant to set the parameter session_cached_cursors to a given value: short names are better to type but a little bit more verbosity would sometimes help. On the other side: of course it is his library and he may chose the names he can memorize…

    • Thank you Steve. Although yum install git did work for me on oracle Linux 6.

  2. Dax said:

    Hi Frits,
    In my company we have DB release manually and we also don’t have version control. Can you pl. suggest any tool/process for DB version control and release process ?

    • Hi Dax,

      There are several routes you can go here. What is best totally depends on the actual situation, and also on the willingness to use additional tools.

      The first point I want to make is: if there is no version control at all, there is something wrong (or, more friendly put: a lot to enhance). Even if the development team is one guy, you want to have a convenient way to see how the source code evolved. This means just put the source code on the developer workstations in version control. (of course this should be done in a way so that people are actually start using it, so they need to get a chance to learn it. the whole aim of my blog is to do just that: try to make people more aware of the possibilities!)

      You can take a look at SQL Developer, which got support for git.

      Probably the first thing is to make sure all code in the database is backed by the code to generate it on a filesystem, which you can put inside a version control system once it’s on a filesystem.

      There’s also the redgat version control for in the database (http://www.red-gate.com/products/oracle-development/source-control-for-oracle/), but I haven’t looked at it yet.

  3. Bjorn Naessens said:

    Nice article :) I’m in the process of putting all my scripts in this revision system ( used to use SVN but got a new macbook so decided to switch to git )

    Found some errors in the commands you are using for the alias creation.

    This does not work :
    $ git config –global alias.co=checkout
    $ git config –global alias.ci=commit
    $ git config –global alias.st=status
    $ git config –global alias.hist=log –pretty=format:”%h %ad | %s%d [%an]” –graph –date=short

    But this does :
    $ git config –global alias.co checkout
    $ git config –global alias.ci commit
    $ git config –global alias.st status
    $ git config –global alias.hist ‘log –pretty=format:”%h %ad | %s%d [%an]” –graph –date=short’

    You can also just add the following to .gitconfig

    [alias]
    co = checkout
    ci = commit
    st = status
    hist = log –pretty=format:\”%h %ad | %s%d [%an]\” –graph –date=short

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,189 other followers

%d bloggers like this: