Monday, November 27, 2006

GPFS Revisited

Well, I am still having issues with GPFS. It turns out mmbackup won't work with a filesystem of this size either, and a chat with IBM support was not encouraging. Here is what one of our System Admins found out:

The problem was eventually resolved by IBM's GPFS developers. It turns out they never thought their filesystem would be used in this configuration (i.e. 100,000,000+ inodes on a 200 GB filesystem). While the filesystem was down, we tried multiple times to copy the data off to a different disk. Due to the sheer number of files on the filesystem, every attempt failed. For instance, I found the following commands would have taken weeks to complete:

# cd $src; find . -depth -print | cpio -pamd $dest
# cd $src; tar cf - . | (cd $dest; tar xf -)


Even with the snapshot, I don't think TSM is going to be able to solve this one. This will probably need to be done at the EMC level, where a bit-level copy can be made.
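For the curious, the bit-level copy amounts to reading each NSD's underlying LUN straight onto a spare device of at least the same size, so none of the 100 million inodes ever has to be walked individually. The following is only a rough sketch, assuming a single NSD and hypothetical PowerPath device names (/dev/emcpower0 as the source, /dev/emcpower1 as the spare); the filesystem has to be unmounted on every node before you start:

# dd if=/dev/emcpower0 of=/dev/emcpower1 bs=1024k
# cmp /dev/emcpower0 /dev/emcpower1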

So GPFS is not all it was thought to be. Pass it along, and make sure you avoid GPFS for applications that will produce very large numbers of files.

2 comments:

  1. If there are 100 million files in a filespace, would any human being be able to recall the name of any one specific file?

    So often, filespaces of this size are caused by bad practices, or bad coding.

    I know that the BR Admin doesn't get involved until the end, but that doesn't preclude a redesign the next time around.

  2. It's quite possible (though not supported) to use the TSM image backup feature to back up the NSDs, assuming you can dismount the filesystem for a while (a rough sketch of this follows after the comments).

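For anyone tempted by that last suggestion, here is roughly what an image backup of an NSD with the TSM backup-archive client might look like. It is only a sketch, not a supported configuration: the PowerPath device name is hypothetical, and the GPFS filesystem would have to stay unmounted on all nodes while the backup (or a later restore) runs:

# dsmc backup image /dev/emcpower0
# dsmc restore image /dev/emcpower0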