OCFS version 1 odd behavior

Last week I was involved in troubleshooting and getting a 2-node RAC database to work for a customer (Linux, RHEL3, Oracle).

Among other things, I found corruption on an OCFS filesystem. In order to try to fix that, I wanted to run fsck. But before that I wanted a backup (always preserve the way back!). The client’s backup suite is Tivoli Storage Manager.

It appeared Tivoli Storage Manager (TSM) was only able to do the backup at a speed of about 5 megabytes per second. This immediately reminded me of the O_DIRECT behaviour of OCFS, but it had become night, the client wanted to work the next day and not much time was left; meaning: tough luck, that is what we have.

In order to see if the filesystem was corrupt from the database’s perspective, I decided to run db verify (dbv). During the db verify (on the other node of the RAC cluster), I decided to tar a database file and see what my speed was. That appeared to be around 50 megabytes per second. That’s odd… that’s ten times faster than the TSM backup. A little later, I saw the performance of my tar session decrease to the same 5 megabytes per second as TSM. A little later, I saw the db verify finish…

That made me think: could db verify influence the IO done by another session on another file? It seems unlikely, but because db verify is non-intrusive, I decided to run it again. The outcome struck me: I got the “old” 50 megabytes per second performance back. ???

A little later I decided to find out which database files had already been backed up by TSM and run a db verify on that node too; a tenfold IO performance increase makes us wait much less! The same happened there: TSM backup performance increased to around 50 megabytes per second!!

The next day I thought about how this could be. Because of the lack of time, I didn’t fetch the code (OCFS is open source) and read through it. These are my findings:

– OCFS is optimised to deliver high speed just for the Oracle database, using O_DIRECT calls
– OCFS was never meant to be a “normal” filesystem
– The documentation clearly states that only O_DIRECT-enabled tools should be used for file maintenance. There are even O_DIRECT versions of tar, cp, md5sum, dd, etc. on the website
– The ability to do IO using normal/non-O_DIRECT tools seems to be more of a “workaround” than normal behaviour
– If some O_DIRECT tool opens a file on the OCFS filesystem, it locks THAT file, but seems to give other non-O_DIRECT processes better optimised IO

1 comment
  1. naqi said:

    This reminds me of an issue at a customer site, where they were doing a backup of their 1.2 TB RAC database using RMAN and NetBackup. No matter what combination was used, be it filesperset or maxopenfiles, we were only ever able to receive a backup throughput of 50 MB/s. I had a look at the effective_bytes_per_second column from v$backup_async_io, and input-wise we were only getting 50 MB/s, which meant that was how much we would (at maximum) write at. Your mention of 50 MB/s just reminded me of that. We were never able to push it above that, even with a closed (no users connected) database backup. Is 50 MB/s some kind of limit?
