TSM Topics Feed

Tuesday, June 21, 2016

Restoring TSM Without A Volhist

Someone in the comments to an old post just asked for directions/instructions on restoring TSM without a volume history or devconfig file. Well, I got some bad news and some not so bad but not fun news. We will start with the not so bad news. If you don't have a devconfig, don't panic! You can recreate the devconfig. That's fairly simple, just a pain. TSM has to have a devconfig file to initialize its devices so if the devconfig is not present you'll have to create one. Typically you do this when you rebuild a TSM instance. For example at a DR site you install TSM on the DR server, define the dsmserv.opt, and then you define base devices on the new install. Once that has been done you can bring down TSM and attempt a restore using the newly defined device(s). 

Now for the bad news. Without the volhist, if you don't know what volume(s) were used for DBBAckup your kind of screwed. The old DSMSERV DISPLAY DBBACKUPVOLUME command has been removed/deleted and IBM now says the following

DSMSERV DISPLAY DBBACKUPVOLUME - Information about volumes used for database backup is available from the volume history file. The volume history file is now required to restore the database. 

You can find a list of TSM Server deleted commands, utilities, and options at the following link.

Monday, April 11, 2016

DR Test - Things learned

I just did a DR test from one data center to another involving TSM and our Data Domain (DD) which we have configured for NFS and VTL usage. Things to know...

  1. We backup the TSM DB to the DD NFS file system
  2. The TSM server was not brought up on its own LPAR in the DR site, but shared with an alternate TSM instance.
  3. The DR site could not facilitate LAN-Free like the primary site.

So we built the secondary instance on the LPAR currently running a TSM server that services the customer's development environment. Then I disabled the replication pair and we mounted it to the LPAR so we could restore the TSM DB. This is where our main problem rose it's head. The NFS file system from the DD was mounting under the primary TSM instances ID, So while we wrestled for this for an hour or so, I realized after Googling the issue and reading the DD notes from people that the problem was the configuration. I would have been fine disabling the replication pair and mounting it to the TSM LPAR if it had been the default user ID, but the primary instance was the owner and we could not change permissions due to what is allowed by the default ID and settings from the DD. So I had to unmount the DD NFS file system to delete the pair on the DD then remount it with the full read/write permissions. I was then able to mount it under an alternate ID. Once we overcame this we were able to start the TSM DB restore which is where our second issue arose.
We were restoring the TSM DB and the active logs were not being restored to the active log directory. The first time I used dsmserv restore db and it ran fine until all the DB records were restored and I received the following error:

ANR2970E Database rollforward terminated - DB2 sqlcode -1004 sqlerrmc TSMDB1

The restore process restored the logs to the instances home directory eventually filling the filesystem to 100% and erroring out. I thought the logs were recovery log related so I then added the RECOVERYLOGDir option to the restore command and got the same results. This wasted an hour to achieve the same results, so after some more Google searches and talking to IBM support I decided to add the ACTIVELOGDIR option to the restore. I didn't add it due to the IBM support tech suggesting it (he didn't) I just realized recovery log was not filled with any logs and the only other logs they could be are Active Log files. I added the ACTIVELOGDirectory option to the restore command and DB restored worked without any errors. The question is why didn't TSM use the ACTIVELOGDirectory option stated in the dsmserv.opt? The RECOVERYLOGDir option was used but the log for recovery were never more than maybe 1GB, but the active log was over 53GB and the db2diag.0.log registered the error that no recovery log directory was listed so the default would be used. What the hell??? It is listed in the dsmserv.opt...

ACTIVELOGDirectory          /drtsmserver/tsm30log
ARCHLOGDirectory            /drtsmserver/tsm30arch

So I post this so you can learn from my mistakes. The final restore DB command was

dsmserv restore db on=db.list recoverydir=/drtsmserver/tsm30fail activelogdir=/drtsmserver/tsm30log

Tuesday, April 05, 2016

Find WWN's in AIX

So I use the following script to find WWN's in AIX. Does anyone have a better script they'd like to share?



for FCSX in `lscfg | grep fcs | awk '{ print $2 }' | sort`
echo ${FCSX}${CSV}`lscfg -vl ${FCSX} | grep "Network Address" | sed -e "s/^.*\.//"`${CSV}`lscfg -l ${FCSX} | awk '{ print $2 }'` 

Tuesday, January 05, 2016

Oracle RMAN Catalogue Cleanup - Revisited

Over a Decade ago I wrote an article on Object cleanup due to the issues with RMAN and our DBA's not keeping it sync'd with TSM. I then revisited it in 2007 and since then have not had to use it much. With version 6 of TSM I have not worried about my DB size, or old data not expiring since TSM seems to handle it somewhat better than the older versions did. I recently had to use the command and realized that the delete object command as it was used in TSM 5 is not 100% correct for TSM 6. So when trying to use the old command

TSM v5

delete object 0 [Object ID Number]
The 0 is telling TSM how many dependent objects to delete (at least I believe that's what it does, I can't fully remember). But with TSM 6 you don't need to provide the dependent object count in the command.

TSM v6 and higher

delete object [Object ID Number]

This command works (although it can sometimes take awhile to complete the command). So using the correct select you can produce a list and delete the objects from TSM. Reclamation with return the space but my understanding is that this is an individual expiration process and remember it is not a command support will help you with. USE AT YOUR OWN RISK!

Here is the select:

select 'delete object', object_id from backups where node_name=[TDP NODENAME] and
backup_date < '2007-06-01 00:00:00'

Monday, September 28, 2015

TSM (Spectrum Protect) Symposium 2015, Dresden (in pictures)

We made it:

About 340 people!!!

 Norbert Pott (IBM Germany), Tommy Hueber (Rocket Software, The Netherlands, http://www.tsmblog.org/), me

Matt Anglin (IBM USA)

Who always knows everything and tells the truth!

Zsolt Fekete (SCSS Kft., Hungary), me and Paul Oh (Sentia Solutions Inc., Canada)

Wednesday, September 23, 2015

Value Replacement in Select Statements

When working with the TSM DB there are times you might want to change/replace a returned value for easier reporting.  I am finding that with TSM now using DB2 more SQL functions are available to use when trying to gather information from the DB using select statements. There are a couple functions available to facilitate this which I will discuss below. The first one to cover is the CASE function (available in TSM pre-DB2) which also has the benefit of allowing flow logic to your SELECT statement. I first came across this when I found a Q EVENT macro someone had created on a TSM server I inherited.

def script event-check desc="Events - Exceptions"
upd script event-check "/* ---------------------------------------------*/"
upd script event-check "/* Script Name: event-check                          */"
upd script event-check "/* ---------------------------------------------*/"
upd script event-check '  select -'
upd script event-check '  schedule_name, -'
upd script event-check '   cast(SUBSTR(CHAR(actual_start),12,8) as char(8)) AS START, - '
upd script event-check '   node_name, -'
upd script event-check '   cast(status as char(10)) as "STATUS", -'
upd script event-check '    case -'
upd script event-check "      when result=0  then ' 0-Succ' -"
upd script event-check "      when result=4  then ' 4-SkFi' -"
upd script event-check "      when result=8  then ' 8-Warn' -"
upd script event-check "      when result=12 then '12-Errs' -"
upd script event-check "      else cast(result as char(7)) -"
upd script event-check '    end -'
upd script event-check '      as "RESULT" -'
upd script event-check '  from events -'
upd script event-check '  where scheduled_start<=(current_timestamp - 24 hours) -'
upd script event-check '    and result<>0 and node_name is not NULL'

As you can see in the macro we replace the basic result with some extra information to help identify what happened with the backup. The CASE function also comes in handy when changing the values for BYTES to the appropriate result of K, MB, GB, TB etc.

CASE -  
    WHEN bytes>1099511627776 THEN CAST(DEC(bytes)/1024/1024/1024/1024 AS DEC(5,1))||' TB' -
    WHEN bytes>1073741824 THEN CAST(DEC(bytes)/1024/1024/1024 AS DEC(5,1))||' GB' -

    WHEN bytes>1048576 THEN CAST(DEC(bytes)/1024/1024 AS DEC(5,1))||' MB' -
    WHEN bytes>1024 THEN CAST(DEC(bytes)/1024 AS DEC(5,1))||' KB' -
  ELSE -

    CAST(bytes AS DEC(5,0))||' B' -
  END AS bytes, -                                                          

 There is also another option for changing results when the field has a NULL value. When created a script that might encounter NULL values rather than return nothing you can use the NVL and NVL2 operators. For example say you want to create a report that shows all nodes and specifies whether they have ever connected to TSM. With TSM this can be determined not by the LASTACC_TIME but by the PLATFORM_NAME value. If PLATFORM_NAME is NULL then the node has never connected to TSM to set the value. So we can create a script that looks at the PLATFORM_NAME value and returns a value of Has Not Connected.

select varchar(node_name,40) as "Node Name", nvl(platform_name, 'Has Not Connected') AS "Platform Name" from nodes order by platform_name desc

Node Name                                     Platform Name    
-----------------------------------------     ------------------
HUCDSON.ADCC.W.ZZ.OU812.NET.SQL               Has Not Connected
DERELICT.ADCC.W.ZZ.OU812.NET.SQL              Has Not Connected
NOSTROMO.ADCC.W.ZZ.OU812.NET.MAIL             Has Not Connected
SULACO.ADCC.AIX.ZZ.OU812.NET.SQL              Has Not Connected
WIN-SEC-JMP-VH.ADCC.W.ZZ.OU812.NET            Has Not Connected
HICKS.ADCC.AIX.ZZ.OU812.NET.SQL               Has Not Connected
VASQUEZ.DLX.W.CS.AS19229.NET                  WinNT            
BISHOP.ULX.W.CS.AS19229.NET                   WinNT            
GORMAN.LV426.W.CS.AS19229.NET                 WinNT            
SPUNKMEYER.LV426.W.CS.AS19229.NET             WinNT 


The results work but what if you want to substitute results for PLATFORM_NAME when the value is not NULL?  You can use the NVL2 operator for this and return a list of the nodes showing whether they are ACTIVE or INACTIVE.

select varchar(node_name,40) as NODE_NAME, nvl2(platform_name, 'Active Node', 'Inactive') as STATUS from nodes order by Platform_name desc

Node Name                                     Platform Name  
-----------------------------------------     ----------------
HUCDSON.ADCC.W.ZZ.OU812.NET.SQL               Inactive       
DERELICT.ADCC.W.ZZ.OU812.NET.SQL              Inactive       
NOSTROMO.ADCC.W.ZZ.OU812.NET.MAIL             Inactive       
SULACO.ADCC.AIX.ZZ.OU812.NET.SQL              Inactive       
WIN-SEC-JMP-VH.ADCC.W.ZZ.OU812.NET            Inactive       
HICKS.ADCC.AIX.ZZ.OU812.NET.SQL               Inactive       
VASQUEZ.DLX.W.CS.AS19229.NET                  Active         
BISHOP.ULX.W.CS.AS19229.NET                   Active         
GORMAN.LV426.W.CS.AS19229.NET                 Active         
SPUNKMEYER.LV426.W.CS.AS19229.NET             Active

So ALL values that were not NULL were reported as Active. These functions can be very helpful when you need a specific report type. If you have other examples or operators you find useful feel free to leave a note in the comment section and I can add them to this post with your reference.