Tuesday, November 22, 2005

Setting Up A Secondary TSM Instance

Someone requested a post on how to set up a secondary instance of TSM on a UNIX server, so here is the skinny on how it's done:

First create a directory where the config files for the new server will be stored.

mkdir /usr/tivoli/tsm/serverb
mkdir /usr/tivoli/tsm/serverb/bin

Then copy dsmserv.opt into the new bin directory and modify the needed settings in it, such as devconfig and volhist, so those files are saved in the new directory. Next, create the DB and Log volumes that this instance will use. Once those are created, you need to export the following environment variables:

export DSMSERV_CONFIG=/usr/tivoli/tsm/serverb/bin/dsmserv.opt
export DSMSERV_DIR=/usr/tivoli/tsm/serverb/bin
export DSMSERV_ACCOUNTING_DIR=/usr/tivoli/tsm/serverb/bin
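
The steps so far can be rolled into one small helper. This is only a minimal sketch under a few assumptions: the function name prepare_instance is made up for illustration, the file names devconfig.out and volhist.out are placeholders, and it only repoints DEVCONFIG and VOLUMEHISTORY lines that already exist in the copied options file. Any other setting you need to change still has to be edited by hand.

```sh
#!/bin/sh
# prepare_instance SRC_OPT INSTANCE_DIR
# Hypothetical helper: creates INSTANCE_DIR/bin, copies SRC_OPT into it, and
# repoints any DEVCONFIG / VOLUMEHISTORY lines at files in the new directory.
prepare_instance() {
    src_opt=$1
    inst_bin=$2/bin

    mkdir -p "$inst_bin" || return 1

    # Rewrite only the two options mentioned in the post; copy all other
    # lines (including "*" comment lines) unchanged.
    # devconfig.out and volhist.out are placeholder file names.
    awk -v dir="$inst_bin" '
        toupper($1) ~ /^DEVCONF/ { print $1, dir "/devconfig.out"; next }
        toupper($1) ~ /^VOLUMEH/ { print $1, dir "/volhist.out";   next }
        { print }
    ' "$src_opt" > "$inst_bin/dsmserv.opt" || return 1

    # Print the exports so the caller can eval them or paste them into a script.
    printf 'export DSMSERV_CONFIG=%s/dsmserv.opt\n' "$inst_bin"
    printf 'export DSMSERV_DIR=%s\n' "$inst_bin"
    printf 'export DSMSERV_ACCOUNTING_DIR=%s\n' "$inst_bin"
}
```

Assuming the original options file sits in /usr/tivoli/tsm/server/bin, it could be called as eval "$(prepare_instance /usr/tivoli/tsm/server/bin/dsmserv.opt /usr/tivoli/tsm/serverb)" to create the directories and set the variables in one go.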

Now you can run the dsmserv format command to initialize the DB and Log volumes; it will create dsmserv.dsk in the serverb directory. Make sure you also run the dsmserv runfile commands to load the scripts and web images (even on TSM 5.3). The final step is to create the startup script so that TSM initializes correctly. Here is our script:


exec > /tmp/libserv.out 2>&1  #optional -> sends output to out file

ulimit -d unlimited

export DSMSERV_CONFIG=/usr/tivoli/tsm/serverb/bin/dsmserv.opt
export DSMSERV_DIR=/usr/tivoli/tsm/serverb/bin
export DSMSERV_ACCOUNTING_DIR=/usr/tivoli/tsm/serverb/bin

print "$(date '+%D %T') Starting Tivoli Storage Manager Server"
cd /usr/tivoli/tsm/serverb/bin
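
The listing above stops at the cd, without a line that actually launches the server. Here is a minimal sketch of how such a script might finish. Two things in it are assumptions that the post does not state: that the dsmserv binary stays in the default /usr/tivoli/tsm/server/bin directory, and that the server is started with the quiet argument the way the stock rc.adsmserv does it. The function name start_instance is made up for illustration.

```sh
#!/bin/sh
# start_instance INSTANCE_BIN [DSMSERV_BINARY]
# Hypothetical helper: export the instance environment, cd into the
# instance directory, then launch the server in the foreground.
start_instance() {
    inst_bin=$1
    # Assumption: the server binary stays in the default install directory.
    dsmserv_bin=${2:-/usr/tivoli/tsm/server/bin/dsmserv}

    export DSMSERV_CONFIG="$inst_bin/dsmserv.opt"
    export DSMSERV_DIR="$inst_bin"
    export DSMSERV_ACCOUNTING_DIR="$inst_bin"

    [ -x "$dsmserv_bin" ] || { echo "cannot execute $dsmserv_bin" >&2; return 1; }
    cd "$inst_bin" || return 1

    echo "$(date '+%D %T') Starting Tivoli Storage Manager Server"
    # Assumption: "quiet" suppresses the console prompt, as in the stock script.
    "$dsmserv_bin" quiet
}
```

Running the server from inside the instance directory matters because that is where it looks for dsmserv.dsk.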


We use the following command to start TSM so we don’t have to deal with nohup:

echo "/usr/tivoli/tsm/serverb/bin/rc.adsmserv" | at now

This uses the at command to run the script immediately. You can then edit the inittab and add a line to start this instance on boot, or put a script in the run-level startup directories; your choice. You should now be ready to run the second instance.

One note to those thinking of doing this and sharing a library: Tivoli recommends that you create a TSM instance to be just a library manager, with no clients and no real work other than handling the library and tape mounts. I agree with this, and it has made library issues a lot easier to manage. Not knowing how large the DB could get, I gave it 2GB; it is currently 3.8% utilized after more than a year and a half in place. Swapping a library manager from one system to another is not as hard as it would seem, so consider it. If anyone wants docs on how to do the switch, let me know and I’ll post them.

Monday, November 21, 2005

UNIX Permission Issues

A while back I ran into an issue with how TSM handles permissions on UNIX files, and I wanted to get some feedback from you readers on how you would handle it. A user was somehow given root and chown’ed the /home directory recursively. It was made worse by the fact that he did it on a Friday and didn’t alert anyone until the following Monday, and by the time it got to us another day had passed. The customer of course wanted us to restore the directory and file permissions, but the kicker was that TSM does not back a UNIX file up again when its permissions change. It just updates the database to reflect the new permissions (I got that directly from support and was floored; I had no idea it handled UNIX that way).

So here was our dilemma: if the file was the only version in backup, I would have no way of resetting its permissions. Is the gravity of the situation hitting home? Because TSM doesn’t back up the file again or track the earlier permissions, I could not successfully restore to a point in time. Sure, I might get a good portion of the files fixed, but there would still be a large portion whose permissions we could not correct. The customer wasn’t happy, and our only out was that the customer should not have been doing chowns as root.

I thought I once saw someone post an undocumented option for the options file that makes TSM back up a file if it changes in any way, permissions included, but I can’t find it. I thought it was on the new ADSM.org, but I am unable to locate it. Anyone know the option or have an idea on how to approach this? I brought it up with some Tivoli people who asked me what I thought should be added or changed in TSM, but so far I haven’t seen any change in their processing.

Monday, November 14, 2005

Poll Results

I closed the poll covering things we would like to see in future releases of TSM. I expected the removal of the ISC interface to win hands down, but the votes were actually spread pretty evenly among 5 of the selections. Although "Get rid of ISC interface" and "Return of the old web interface" were the two highest vote getters, I was surprised how evenly spread the voting was. I personally voted for conversion to a DB2 database, but I have been whining about that for years... ask the Tivoli folks, I am sure they are familiar with my whining (I'm hoping for a cheese basket this Christmas from the developers... oh, and please no stinky cheese; the wife is pregnant and foul odors make her sick).


Many people have been complaining about the ISC+TSM AC, and I have been one of them. The concerns have been plentiful: “We have to learn a new interface”, “There’s no DRM functions”, and “It requires a server just to support it!” All of these are valid protests, and the one that bothers me most is that in a DR situation you would have to rebuild the ISC system/instance along with the TSM server to have web accessibility. This adds time to an already urgent situation.

So what good does the ISC+TSM AC provide? For starters, a single interface for accessing all your servers, a single login, more functions when it comes to hardware management, and, in the event of a disaster, it forces you to learn the command line. I know the last item might frustrate a lot of people, but the truth is you need to know the command line to be a proficient TSM administrator. I love the web interface and I recommend TSMManager, but in a DR situation you have to be able to handle the command line if you want to get back up and running. Granted, you have no choice other than the command line until the system is back up and running, but afterwards you’ll need to do typical admin work, and being able to do it from the command line will increase your rebuild speed and help you meet SLA time frames. I don’t think Tivoli had this in mind when they went to the ISC+TSM AC, but in my opinion too many people rely on the web and don’t learn the commands needed to be truly proficient. I find it amazing how many don’t even know how to use the HELP command. So I could complain about the change in interface, but change happens, and although we don’t always like it (I don’t care for this one) we have to be able to change and adapt with it if we want to last in this work force.

Sunday, November 13, 2005

King Of All Backups! (AKA LAN-Free to Disk)

About a year ago we were tasked to set up a large multi-clustered Exchange server and provide the best possible backup and restore performance. After much debate and research we decided on LAN-Free to disk. The system was a 7-node Windows 2003 cluster connected to a SAN disk array (I can’t remember if it was EMC or Dell). The first 5 nodes were Exchange servers, the 6th was the failover node, and the 7th was turned into a TSM server. The TSM server instance had five 500GB secondary disks assigned to it for the backup of the five Exchange servers. These five disks would be mapped one to each Exchange server, allowing the backup to occur across the SAN to the disks owned by the TSM server.

To utilize the LAN-Free to disk capability we had to install Tivoli’s SANergy product. SANergy is no longer a separate product but is now part of a TDP/Agent type install package for TSM. We actually installed and configured SANergy first, which was easier than it seemed in the directions, then mapped the drives. When configured with SANergy, the mapped drives become accessible across the SAN as long as the clients are on the same disk SAN fabric. So we now had mapped SAN-accessible drives and could back up the Exchange servers to disk using the FILE device class. The FILE device class is used because TSM does not support LAN-Free backups to diskpools at this time. It works like a virtual tape, and it was configured to migrate the data a few hours before the next backup would occur, or when the storage pool reached a specific usage threshold. This allowed almost a 24-hour window for a restore and, along with the new internal restore capabilities of Exchange 2003, provided a high-performance backup/restore solution.

We tested the backups against a 360GB DB and backed it up in 90 minutes. People were impressed, but they wanted to see how it performed on restore. We then restored the same amount, 360GB, in 91 minutes. WOW! It was amazing to see those numbers (68MB/s). We even tested it with the failover node by mapping all 5 SANergy-defined drives to the failover node and still saw the same numbers. We had everything ready to go when the account decided they wanted to go in another direction. Weeks spent configuring and implementing the solution, all for naught! At least I have the experience and know it works. So if anyone is looking to do LAN-Free to disk: it works, it’s fast, it takes a lot of admin work, and it will be a good solution for anyone looking for a high-performance backup/restore environment.
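
The 68MB/s figure follows directly from the sizes and times quoted above. Here is a quick check, assuming 1GB is counted as 1024MB (the function name throughput_mbs is just for illustration):

```sh
#!/bin/sh
# throughput_mbs SIZE_GB MINUTES
# Prints average throughput in MB/s, treating 1 GB as 1024 MB.
throughput_mbs() {
    awk -v gb="$1" -v min="$2" \
        'BEGIN { printf "%.1f\n", (gb * 1024) / (min * 60) }'
}

throughput_mbs 360 90   # backup:  prints 68.3
throughput_mbs 360 91   # restore: prints 67.5
```

Both runs land at roughly 68MB/s, so backup and restore speeds were essentially identical.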

Wednesday, November 02, 2005

Submit A Question Or Topic

If anyone would like to submit a topic they would like to see covered, or has a question about TSM or tape/SAN issues or strategy, please e-mail me at chadsmal@us.ibm.com. With 5.3.2 out soon there will be some things to cover. I am also thinking of posting an article on my LAN-Free to disk trial, which was a great success. If anyone is interested in LAN-Free to disk, let me know and I'll post my experience. If you would like to submit a post, I am open to having guest contributors. Even though the name says TSMExpert, I do not profess to know it all. Your contributions and questions help others out there.