One of our TSM servers started to log large numbers of "ANR0530W - internal server error detected" messages. On further investigation we found that these were related to ANR0162W messages, which indicate DB deadlock or timeout problems. These errors were causing our DB TDP backups to fail, so I eventually called support. I provided our db2diag.log file and a dump of our actlog for the last 24 hours, and they found that the DB2 reorg process was locking records and tables and creating the deadlocks. The problem is compounded because with TSM versions 6.2.3 or lower the DB2 reorg process cannot be scheduled, so it can kick off during backups or other processes and create these deadlocks. To resolve the issue I was told to upgrade our TSM server to 6.2.4 or higher. With TSM 6.2.4 and higher you can use the REORGBEGINTIME and REORGDURATION parameters to confine the reorg to a window. You can see the details of the APAR here.
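To pick a window that stays clear of the backups, it helps to check each schedule's start time against the proposed window. The sketch below is only an illustration in Python: it assumes the window is a start time plus a duration in whole hours (which is what the two option names suggest), and the schedule names and times are made up.

```python
from datetime import time


def in_reorg_window(t: time, begin: time, duration_hours: int) -> bool:
    """Return True if clock time t falls inside the reorg window."""
    minutes = t.hour * 60 + t.minute
    start = begin.hour * 60 + begin.minute
    # Minutes elapsed since the window opened, wrapping past midnight.
    offset = (minutes - start) % (24 * 60)
    return offset < duration_hours * 60


# Hypothetical values: reorg allowed from 06:00 for 4 hours.
begin, duration = time(6, 0), 4
schedules = [("nightly TDP backup", time(22, 0)), ("morning archive", time(7, 30))]
for name, start in schedules:
    status = "overlaps reorg window" if in_reorg_window(start, begin, duration) else "clear"
    print(f"{name}: {status}")
```

This only checks start times. A long-running backup that begins before the window and runs into it would need its expected end time checked as well.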
Friday, May 25, 2012
Thursday, May 10, 2012
Client Lockdown
A directive came down for 75+ Windows servers to be "locked down" when it came to accessing their backup data. The TSM client will allow anyone to open the client GUI and restore any files they have permission to access. So I considered the various ways to keep anyone from accessing TSM and restoring data: we could set permissions on the GUI and command line so they would not execute unless the user was in the admin group, we could delete the command line and GUI executables, or we could simply set the SESSIONINIT option on the server. After weighing the options, SESSIONINIT was the easiest and most direct way to keep anyone but a TSM admin from restoring data. Once SESSIONINIT is set, the TSM client GUI, command line, and web GUI are not allowed to initiate a session. All restores have to be executed through a schedule from the TSM server. Of course you can temporarily turn SESSIONINIT off, but only a TSM admin can do so, which makes it easier to track who has accessed the data.
(Note: SESSIONINIT does not support the CAD)
The problem was how to update the options file on 75+ Windows servers and then restart the TSM Scheduler. You can change any MANAGEDSERVICES options to WEBCLIENT using a client option set, but SESSIONINIT is another matter. As it turns out, if you set SESSIONINIT on the TSM server you have to put the HLAddress and LLAddress in the node definition on the server, and the client dsm.opt must contain TCPCLIENTADDRESS and TCPCLIENTPORT. What we didn't know was that we also had to put SESSIONINIT SERVERONLY in the dsm.opt. If you set SESSIONINIT on the TSM server but not on the client, and the scheduler service was defined with /validate:yes, you will get "password" errors and the scheduler will crash. The reason is that the server no longer allows client-initiated sessions, but the scheduler tries to validate its password when it starts. Since it can't validate the password, it fails much as it does when the password has not been set while using PASSWORDACCESS GENERATE.
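To keep the two sides in sync, here is a rough Python sketch that builds the server-side command and the matching dsm.opt lines for a node. The UPDATE NODE keywords are written as I understand them, so check them against the command reference for your server level. The node name, address, and port 1501 are placeholders, not values from our environment.

```python
def update_node_command(node: str, address: str, port: int = 1501) -> str:
    """Build the admin command that makes a node server-initiated only.

    Keyword spellings are assumed from the UPDATE NODE syntax; verify before use.
    """
    return (
        f"UPDATE NODE {node} SESSIONINITIATION=SERVERONLY "
        f"HLADDRESS={address} LLADDRESS={port}"
    )


def client_option_lines(address: str, port: int = 1501) -> list[str]:
    """Return the dsm.opt lines that must match the node definition."""
    return [
        "SESSIONINIT SERVERONLY",
        f"TCPCLIENTADDRESS {address}",
        f"TCPCLIENTPORT {port}",
    ]


# Hypothetical node and address, for illustration only.
print(update_node_command("WINSERVPRD20", "10.0.0.20"))
for line in client_option_lines("10.0.0.20"):
    print(line)
```

Generating both from the same address and port avoids the mismatch where the server tries to contact the client on a port the scheduler is not listening on.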
We had the TCP Client settings, but adding SESSIONINIT to all 75+ servers would have been a chore...unless you know how to use the Windows command prompt and dsmcutil. Here's how I added the SESSIONINIT option to all 75 servers.
Example of how to remotely add a line to the dsm.opt
c:\> echo SESSIONINIT SERVERONLY >> \\WINSERVPRD20\C$\progra~1\tivoli\tsm\baclient\dsm.opt
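The same append can be scripted so it is safe to re-run across the whole server list. This is a sketch in Python rather than the batch approach I actually used. It assumes every server exposes dsm.opt at the same UNC path as the example above, and the function names are my own.

```python
from pathlib import Path

OPTION_LINE = "SESSIONINIT SERVERONLY"


def opt_path(server: str) -> Path:
    """UNC path to dsm.opt, following the pattern in the echo example."""
    return Path(rf"\\{server}\C$\progra~1\tivoli\tsm\baclient\dsm.opt")


def ensure_option(opt_file: Path, line: str = OPTION_LINE) -> bool:
    """Append line unless the option keyword is already set.

    Returns True if the file was changed, False if it was left alone.
    """
    keyword = line.split()[0].upper()
    text = opt_file.read_text()
    for existing in text.splitlines():
        parts = existing.split()
        if parts and parts[0].upper() == keyword:
            return False
    # Start on a fresh line if the file does not end with a newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with opt_file.open("a") as handle:
        handle.write(f"{separator}{line}\n")
    return True


def update_servers(servers: list[str]) -> dict[str, bool]:
    """Apply ensure_option to each server and report which files changed."""
    return {name: ensure_option(opt_path(name)) for name in servers}
```

Calling update_servers(["WINSERVPRD20"]) from an account with access to the C$ share would touch that one server. The scheduler service still has to be restarted afterward for the new option to take effect.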
Posted by Chad Small at 5/10/2012
Labels: dsm.opt, dsmcutil, SESSIONINIT, TSM, TSM Client
Monday, April 30, 2012
DBMEMPERCENT...Where'd That Come From?
I was having performance issues with a couple of TSM 6.2 servers and could not find anything that pointed to the cause. I'm not one to call support unless I'm totally stumped and cannot find help through the web, but this time I finally relented and made the call. Backups were failing repeatedly, and when we researched it we found internal server errors along with DB table errors. IBM support asked for some DB2 log files and within 30 or so minutes had identified the problem.
TSM has a server option I had never used or heard of, and it had somehow been set in a way that adversely affected all backups. Somehow the option DBMEMPERCENT was set in the dsmserv.opt file. This option tells TSM what percentage of the server's overall memory it can allocate to the DB2 database manager. The default is AUTO and would have been fine, but somehow DBMEMPERCENT was set to 10 in the dsmserv.opt, which means that out of 16GB of RAM I was only using 1.6GB?!?!? How'd that happen? I didn't set it, none of my coworkers remember setting it, so where did it come from? IBM support stated the default was AUTO, so the option had been set manually. Since I had never used this option and it's 6.x specific, I never would have looked for it. Good thing I called support.
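A quick way to catch this on other servers is to scan dsmserv.opt for the option and work out what it implies. The following Python sketch assumes the option appears as a plain "DBMEMPERCENT value" line; the sample file contents and the 16GB figure are just the numbers from this incident.

```python
from typing import Optional


def db_memory_limit_gb(opt_text: str, total_ram_gb: float) -> Optional[float]:
    """Return the memory cap in GB implied by DBMEMPERCENT.

    Returns None when the option is absent or set to AUTO.
    """
    for raw in opt_text.splitlines():
        parts = raw.split()
        if len(parts) >= 2 and parts[0].upper() == "DBMEMPERCENT":
            value = parts[1].upper()
            if value == "AUTO":
                return None
            return total_ram_gb * int(value) / 100
    return None


# Sample option file text; the COMMMETHOD line is filler for illustration.
sample = "COMMMETHOD TCPIP\nDBMEMPERCENT 10\n"
limit = db_memory_limit_gb(sample, 16)
if limit is None:
    print("DBMEMPERCENT is AUTO or not set")
else:
    print(f"DBMEMPERCENT caps database memory at {limit} GB")
```

Running something like this against each server's dsmserv.opt would have flagged the stray setting long before the backups started failing.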
Posted by Chad Small at 4/30/2012
Labels: DB2, errors, Options, Performance, TSM, TSM 6.1, TSM 6.2
Friday, April 27, 2012
db2adutl Error
I recently had an issue with a client and storage agent upgrade that left the db2adutl utility unable to return any data. Here are the errors:
I pretty much knew what caused the error; the problem was how to fix it. The cause was an upgrade of the TSM client on the DB2 server that (after further investigation) could not support the more current TSM Storage Agent. An OS patch would have to be applied, but that could not be done without an outage. Our only option was to roll the client back to a supported TSM client / storage agent level. The problem was that while we were trying to find a better solution than rolling back the client, the DB2 database had run a backup. Once the client API was rolled back it could not "interpret" the newer API's backup, which caused the db2adutl errors.
Support suggested renaming the node or the file space (the file space is better, since you don't have to stop and start DB2 to reset the password as you would with a new node name). I didn't want to do either. The backups taken since the rollback were good, but db2adutl couldn't return the list of backups as long as the objects written with the newer API were still present. Luckily I have been dealing with Oracle admins long enough to have a solid grasp of manually deleting objects on the TSM server. Just as I do when Oracle DBAs neglect their RMAN duties, I pulled out my trusty delete object command and removed the backup objects from the period when the new API had been in use. Once that was done, db2adutl was immediately able to see its backups and return a list of what was available.
Posted by Chad Small at 4/27/2012
Thursday, April 26, 2012
TSM Power Admin
I was just made aware of TSM Power Admin by a fellow adsm.org contributor and must say I like some of the features available. I hope to be able to test it soon, but just the ability to run commands against all the servers from the command line (without setting up a server group) is a nice touch. If I do test Power Admin I'll post a review like I did for TSMManager years ago. (Wow it's been that long?!)
Posted by Chad Small at 4/26/2012
Labels: Administration, reporting and monitoring, TSM Admin, TSM monitoring
Monday, April 23, 2012
TSM 6.1 & 6.2 DB2 Issue
I had a TSM server crash multiple times over the course of a week, and after working with Tivoli support and sending them the core files, it was determined that the following error was the cause. Interesting, in that I had never thought about the connections from TSM to the DB2 DB. To summarize, the current connection from TSM to DB2 is not TCP-based but IPC, and AIX has a limit of 1024 IPC connections to DB2. Beyond that limit the application in question (TSM in this case) can crash. The following link has directions on how to convert the TSM-to-DB2 connections to TCP to eliminate this issue.
Posted by Chad Small at 4/23/2012