Last week I was involved in troubleshooting a 2-node RAC database and getting it to work for a customer (Linux, RHEL 3, Oracle 10.2.0.2).
Among other things, I found corruption on an OCFS filesystem. To try to fix that, I wanted to run fsck. But first I wanted a backup (always preserve the way back!). The client’s backup suite is Tivoli Storage Manager.
It appeared Tivoli Storage Manager (TSM) was only able to do the backup at about 5 megabytes per second. This immediately reminded me of the O_DIRECT behavior of OCFS, but it had become night, the client wanted to work the next day and not much time was left; meaning: tough luck, that is what we have.
In order to see if the filesystem was corrupt from the database’s perspective, I decided to run db verify (dbv). During the db verify, I decided to tar a database file on the other node of the RAC cluster and see what my speed was. That appeared to be around 50 megabytes per second. That’s odd… that’s ten times faster than the TSM backup. A little later, I saw the performance of my tar session decrease to the same 5 megabytes per second as TSM. A little later still, I saw the db verify finish…
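For reference, a dbv run looks like the sketch below. The datafile path and block size are hypothetical (use your own datafile and its DB_BLOCK_SIZE); the command check simply lets the snippet run harmlessly on a machine without the Oracle software installed.

```shell
# dbv ships with the Oracle server software; FILE and BLOCKSIZE are
# its standard parameters. The datafile path here is hypothetical.
if command -v dbv >/dev/null 2>&1; then
  dbv FILE=/u02/oradata/PROD/users01.dbf BLOCKSIZE=8192
else
  echo "dbv not on PATH"
fi
```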
That made me think: could db verify influence the IO done by another session on another file? It seems unlikely, but because db verify is non-intrusive, I decided to run it again. The outcome struck me: I got the “old” 50 megabytes per second back. ???
A little later I decided to find out which database files had already been backed up by TSM and run a db verify on that node too; a tenfold IO performance increase makes us wait much less! The same happened there: TSM backup performance increased to around 50 megabytes per second!!
The next day I thought about how this could be. Because of the lack of time, I didn’t fetch the code (OCFS is open source) and read through it. These are my findings:
– OCFS is optimised to deliver high speed just for the Oracle database, using O_DIRECT calls
– OCFS is never meant to be a “normal” filesystem
– The documentation clearly states that only O_DIRECT enabled tools should be used to do file maintenance. There are even O_DIRECT versions of tar, cp, md5sum, dd, etc. on the website
– The ability to do IO using normal/non-O_DIRECT tools seems to be more of a “workaround” than normal behavior
– If an O_DIRECT tool opens a file on the OCFS filesystem, it locks THAT file, but it also seems to give other, non-O_DIRECT processes better optimised IO
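The buffered-versus-direct distinction above can be seen with plain GNU dd, whose iflag=direct switch asks for an O_DIRECT open. This is just a sketch, not one of the patched OCFS tools; the scratch file name is made up, and the direct read is expected to fail on filesystems without O_DIRECT support (tmpfs, for instance):

```shell
# create an 8 MB scratch file (name is hypothetical)
dd if=/dev/zero of=scratch.dat bs=1M count=8 2>/dev/null

# normal read through the page cache: what stock tar/cp do
dd if=scratch.dat of=/dev/null bs=1M 2>/dev/null && echo "buffered read ok"

# O_DIRECT read, bypassing the page cache: what the patched O_DIRECT
# tools (and the database itself) do. The transfer size must meet the
# filesystem's alignment rules, and not every filesystem supports it.
if dd if=scratch.dat of=/dev/null bs=1M iflag=direct 2>/dev/null; then
  echo "direct read ok"
else
  echo "O_DIRECT not supported on this filesystem"
fi

rm -f scratch.dat
```

On a large enough file, timing the two dd reads shows the cache effect directly: the buffered read can be served from memory, the direct read always goes to disk.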