Friday, May 25, 2012

TSM 6.2.3 and Lower DB Reorg Issue

One of our TSM servers started to log large numbers of "ANR0530W - internal server error detected" messages. On further investigation we found that these were related to ANR0162W messages, which indicate DB deadlock or timeout problems. These errors were causing our DB TDP backups to fail, so I eventually called support. I provided our db2diag.log file and a dump of our actlog for the last 24 hours, and they found that the DB2 reorg process was locking records and tables and creating the deadlocks. The problem is compounded because with TSM 6.2.3 and lower the DB2 reorg process cannot be scheduled, so it can kick off during backups or other processes and create these deadlocks. To resolve the issue I was told to upgrade our TSM server to 6.2.4 or higher. With TSM 6.2.4 and higher you can use the REORGBEGINTIME and REORGDURATION parameters to confine the reorg to a window. You can see the details of the APAR here.
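
As a rough illustration, on 6.2.4 or higher the reorg window might be set in the server options file (dsmserv.opt) along these lines. The start time and duration below are made-up values, not the ones support recommended; pick a window that avoids your own backup and maintenance schedules.

```
* Illustrative values only: start reorgs no earlier than 13:00
* and allow them to run for at most 4 hours.
REORGBEGINTIME 13:00
REORGDURATION 4
```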

Thursday, May 10, 2012

Client Lockdown

A directive came down for 75+ Windows servers to be "locked down" when it came to accessing their backup data. The TSM client will allow anyone to open the client GUI and restore any files they have permission to access. So I considered the various ways to keep people from accessing TSM and restoring data: we could set permissions on the GUI and command line so they will not execute unless the user is in the admin group, we could delete the command line and GUI executables, or we could simply set the SESSIONINIT option on the server. After weighing the options, SESSIONINIT was the easiest and most direct way to keep anyone but a TSM admin from restoring data. Once SESSIONINIT is set, the TSM client GUI, command line, and web GUI are not allowed to initiate a session. All restores have to be executed through a schedule from the TSM server. Of course you can temporarily turn SESSIONINIT off, but only a TSM admin can do so, which makes it easier to track who has accessed the data.


(Note: SESSIONINIT does not support the CAD)


The problem was how to update the options file on 75+ Windows servers and then restart the TSM Scheduler. You can change any MANAGEDSERVICES options to WEBCLIENT using a client option set, but SESSIONINIT is another problem. As it turns out, if you set SESSIONINIT on the TSM server you have to put the HLAddress and LLAddress in the node definition on the server, and the client dsm.opt must have TCPCLIENTADDRESS and TCPCLIENTPORT. What we didn't know was that we also had to put SESSIONINIT SERVERONLY in the dsm.opt. If you set SESSIONINIT on the TSM server but not on the client, and the scheduler was defined with /validate:yes, then you will get "password" errors and the scheduler will crash. The reason is that TSM does not allow client-initiated sessions, but the scheduler tries to validate its password when it starts. Since the scheduler can't validate the password, it fails, much as it does when the password has not been set while using PASSWORDACCESS GENERATE.
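
To make the pieces concrete, here is a rough sketch of the two sides for one node. The IP address and port are placeholders I made up for illustration (1501 is the usual default for TCPCLIENTPORT); substitute your own values.

On the TSM server, from an admin session:

```
UPDATE NODE WINSERVPRD20 SESSIONINIT=SERVERONLY HLADDRESS=192.0.2.20 LLADDRESS=1501
```

In the client's dsm.opt:

```
* Placeholder address and port; they should match HLADDRESS/LLADDRESS above.
TCPCLIENTADDRESS 192.0.2.20
TCPCLIENTPORT    1501
SESSIONINIT      SERVERONLY
```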


We had the TCP Client settings, but adding SESSIONINIT to all 75+ servers would have been a chore...unless you know how to use the Windows command prompt and dsmcutil. Here's how I added the SESSIONINIT option to all 75 servers.

Example of how to remotely add a line to the dsm.opt


C:\> echo SESSIONINIT SERVERONLY >> \\WINSERVPRD20\C$\progra~1\tivoli\tsm\baclient\dsm.opt

I put the command for all 75 into a batch file and ran it looking for errors (not all our servers had TSM installed in the default location). Then I used the dsmcutil command to stop and start the TSM scheduler remotely.
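
If you would rather not type 75 lines by hand, a small script can generate the batch file. The sketch below is in Python and is not what I actually ran; the second server name and the non-default install path are hypothetical examples, and the overrides dictionary is just one way to handle servers where TSM is not in the default location.

```python
# Sketch: build the "echo ... >> dsm.opt" lines for a list of servers.
# WINSERVPRD21 and its D$ path are hypothetical examples.

DEFAULT_DIR = r"C$\progra~1\tivoli\tsm\baclient"
OPTION_LINE = "SESSIONINIT SERVERONLY"


def echo_command(server, install_dir=DEFAULT_DIR):
    """Return one cmd.exe line that appends OPTION_LINE to a remote dsm.opt."""
    opt_path = "\\\\" + server + "\\" + install_dir + "\\dsm.opt"
    # No space before >> so echo does not write a trailing blank;
    # the path is quoted in case an override contains spaces.
    return "echo " + OPTION_LINE + '>> "' + opt_path + '"'


def build_batch_lines(servers, overrides=None):
    """Return the batch file contents as a list of lines, one per server."""
    overrides = overrides or {}
    return [echo_command(s, overrides.get(s, DEFAULT_DIR)) for s in servers]


if __name__ == "__main__":
    servers = ["WINSERVPRD20", "WINSERVPRD21"]
    overrides = {"WINSERVPRD21": r"D$\tsm\baclient"}
    for line in build_batch_lines(servers, overrides):
        print(line)
```

Redirect the output to a .cmd file and run that. One caveat: if a server's dsm.opt does not end with a newline, the appended option will land on the end of the last existing line, so it is worth spot-checking a few files afterwards.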

Example of stopping and starting the TSM Scheduler service

C:\> C:\progra~1\tivoli\tsm\baclient\dsmcutil stop /name:"TSM Client Acceptor" /machine:WINSERVPRD20
C:\> C:\progra~1\tivoli\tsm\baclient\dsmcutil start /name:"TSM Client Acceptor" /machine:WINSERVPRD20

I needed to restart the scheduler on all 75 servers, so once again I created a Windows command line batch file with the stop and start commands above listed for each server, and it restarted all 75 quickly and easily from my own desktop.
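
The restart batch file can be generated the same way. Again, this is only a Python sketch and not the file I used; the service name is copied from the example above and will differ if your services were installed under another name.

```python
# Sketch: build dsmcutil stop/start pairs for a list of servers.
# SERVICE_NAME is taken from the example in this post; adjust it to match
# the service name actually installed on your machines.

DSMCUTIL = r"C:\progra~1\tivoli\tsm\baclient\dsmcutil"
SERVICE_NAME = "TSM Client Acceptor"


def restart_commands(server, service=SERVICE_NAME):
    """Return the stop command followed by the start command for one server."""
    return [
        '{} {} /name:"{}" /machine:{}'.format(DSMCUTIL, action, service, server)
        for action in ("stop", "start")
    ]


def build_restart_lines(servers):
    """Return every stop/start pair in server order, ready to write to a .cmd file."""
    lines = []
    for server in servers:
        lines.extend(restart_commands(server))
    return lines


if __name__ == "__main__":
    for line in build_restart_lines(["WINSERVPRD20"]):
        print(line)
```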