TSM Topics Feed

Thursday, April 20, 2017

Spectrum Protect 8.1 New DB Functions Supported

I upgraded some of my 6.3.4 servers to TSM 8.1 recently and have found the newer DB2 version (v11.1) offers some added functions for SQL queries. I have been trying to build a report to track an Exchange DAG servers backups. The problem is that the easiest way to track the backups of DB's is to view the last completed backup for each of nodes file spaces. That's the easy part. The hard part is that the file space name is huge.

TEST-AP-DAG\Microsoft Exchange Writer\{76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}\TEST-MB-05\0e41e645-9acf-4b80-bc85-d606e04fe4d8

TEST-AP-DAG\Microsoft Exchange Writer\{76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}\Mailbox Database 1223544393\2bc06db2-1966-4fd7-9545-f667102b0b7d

So how to extract the DB name, in this case TEST-MB-05 and Mailbox Database 1397248650 from the name when the length of the DB name changes? After some investigation and trial and error I found that the LOCATE_IN_STRING function works in v8.1. Whether it works in 7.x (DB2 v10.5) I can't say since I don't have a 7.x server to run it against. If any of you out there try this and it works let me know. So here's the script. I think you can read it and see what I am doing. If not let me know in the comments and I'll explain...

tsm: TSMSERV>q script DAG-EXCH-RPT f=raw

select -
 varchar(substring(filespace_name, LOCATE_IN_STRING(filespace_name, '\', 1, 3), -
 LOCATE_IN_STRING(filespace_name, '\', 1, 4) -  LOCATE_IN_STRING(filespace_name, '\', 1, 3)),30) as Exch_DB, -
 date(backup_end) AS LAST_GOOD_BACKUP -
 where -
  node_name='TEST-AP-DAG' -
 order by Exch_DB, backup_end asc

Here is sample output:

NODE              EXCH_DB                          LAST_GOOD_BACKUP
-------------     ------------------------------- -----------------
TEST-AP-DAG       \TEST-MB-01                            2017-04-18
TEST-AP-DAG       \TEST-MB-02                            2017-04-18
TEST-AP-DAG       \TEST-MB-03                            2017-04-18
TEST-AP-DAG       \TEST-MB-04                            2017-04-18
TEST-AP-DAG       \TEST-MB-05                            2017-04-18
TEST-AP-DAG       \TEST-MB-TEMP                          2017-04-18
TEST-AP-DAG       \MISC-MB-TEMP                          2017-04-18
TEST-AP-DAG       \Mailbox Database 1128394465           2017-04-18
TEST-AP-DAG       \Mailbox Database 1397248650           2017-04-18

If another function would work better feel free to correct the script and leave it for everyone in the comments. Now to go and find more functions I can use to manipulate my data!

Wednesday, February 22, 2017

Learn From My Mistakes

We recently did a DR test and ran into a serious issue with the recovery of the TSM instance. The DR test was to restore the TSM DB from data center A (DC-A) to data center B. The DC-A instance had been created the previous year so we only needed to remove the DB (DSMSERV REMOVEDB TSMDB1) and restore the DB. I followed the steps and restored the DB and after restoring the DB 99% the restore failed. The restore failure error pointed us to the size available being too small for the required. We were off by a couple GB in our DB directory sizes so we had to add another directory to the list. I tried to rerun the format of the space and quickly received and error that the DB was present. I then ran the DSMSERV REMOVEDB TSMDB1 command again then tried to rerun the format and received the same results. We tried multiple commands

db2 uncatalog db TSMDB1

db2idrop serverA

/opt/tivoli/tsm/db2/bin/db2greg -delinstrec service=db2,version=,instancename=serverA,instancepath=/drtsmserver/serverA/sqllib

This last command allows you to manually drop the DB from the global registry. When we tried this a couple times the instance would be removed from the global registry and when we would try to rerun the restore we would get the error that the DB was present and the instance would reappear in the global registry. (Notice the instance path....that is going to come play into the final solution.)

So at this point we went round and round with various commands to drop/remove/delete the instance from DB2 and the commands would fail saying that the instance was not present and then an attempt to restore the DB would fail saying the DB was present. We went round and round and round until I finally restored the DB under a new ID and instance directory which was successful.....BUUUUUUTTTTTT.....there in lies another issue. Once the DB restore was successfully restored it failed to allow us to bring up the TSM instance because the new ID was not the original owner of the DB instance. Immediately I realized my mistake as this had happened to me years ago and I totally forgot this DB2 requirement. You see DB2 does not allow an ID other than those that have been granted permission to bring the DB up. I get it, it's a security thing, but the TSM instances almost all Admins create only has one active ID allowed ownership of the DB2 instance. You can add as many as you like but it's not something IBM or Tivoli openly suggest or recommend. So if the ID you typically use wont allow the restore or for some reason can't be used and you want to use a new ID for the DB your S.O.L. If it's after the fact, IBM cannot provide a way for you to assign an additional owner of the DB2 TSM instance. IBM's recommendation was go back to DC-A and grant the new ID rights to the instance, run a TSM DB backup, copy/replicate the instance to DC-B, then restore the DB and it would work under the new ID.

By this point in the DR process we were so behind in the recovery of the TSM server we scrapped our portion of the DR test (we were not a significant portion of test, which was to test Recovery Point capabilities and TSM was a fall back). I continued to work with IBM on the restore under the original ID and the looping issue and after a couple attempts, the IBM level 2 DB2 support person identified that the creation of the TSM instance had created DB files in


and a simple rename of that directory should the the DB present/not present issue. We had repeatedly cleared out the DB and Log file systems but had forgotten about the DB folder and files created in the instance directory. TSM had created that subdirectory with DB related files. We had renamed the /drtsmserver/serverA/sqllib directory more than once, but we had forgotten about the other instance directory as IBM states in this technote.  When we renamed the second serverA directory and tried the restore and it ran to completion successfully. HEAD SLAP! This was one of the times where even with another set of eyes we were not seeing the bigger picture. IBM level one support was probably looking at the wrong issue and level two was stumped because we moved away from the initial issue we called about and were now asking them how bring up the TSM DB that we restored under the alternate ID.

OK at this point I was exhausted and frustrated with myself. So what did I come away with in this situation....

  1. Triple verify your DB file space size is adequate. (DOH!)
  2. From the instance home directory copy the critical files to an alternate location.  
    • This is in case the restore fails and clears them or updates them incorrectly (trust me this happened more than once)
  3. If the restore fails or a rerun of the restore is required rename both <inst home>/<inst name>/<inst name> and <inst home>/<inst name>/sqllib
  4. IMHO I would also create another ID or two on the production server and grant them access to the TSM DB instance just in case you had an issue with the primary ID. 
    • This is not required but cant hurt.

Thursday, August 25, 2016

Why Are You Not Using Google & YouTube?

I had an OS Admin contact me through LinkedIn and G-Mail asking for help on trying to find his archived data. He didn't have much experience with TSM and was looking for information on how to find the long term backups (i.e. Archives). I asked him if he even tried to search Google? Google and YouTube are great resources for all your needs. For example if you want  to learn more about the new Operations Center you can see a plethora of videos by searching YouTube. You can also use google to find all sorts of related documents and pages when it concerns APARs and errors. If you haven't done your due diligence you make yourself look dumb. Shoot, you can even search Google from my website and get my posts and outside relevant web pages!

So here is a list of YouTube videos you can reference:

Backup and Archive

Server Administration

Setup TSM Deduplication

Tivoli Data Protection Agents

TDP for Virtual Environments

Monday, July 11, 2016

TSM Explorer

I've been notified by the developer of TSMExplorer that a more current free edition is available for anyone looking for a GUI based management tool for TSM. Below is a brief note from the developer.

"TSMExplorer GUI is  free application for TSM server management. The solution is a comfortable tool to control and manage from a single sign-on. This version is free for works with  version TSM 5.x 6.1 6.2”

Tuesday, June 21, 2016

Restoring TSM Without A Volhist

Someone in the comments to an old post just asked for directions/instructions on restoring TSM without a volume history or devconfig file. Well, I got some bad news and some not so bad but not fun news. We will start with the not so bad news. If you don't have a devconfig, don't panic! You can recreate the devconfig. That's fairly simple, just a pain. TSM has to have a devconfig file to initialize its devices so if the devconfig is not present you'll have to create one. Typically you do this when you rebuild a TSM instance. For example at a DR site you install TSM on the DR server, define the dsmserv.opt, and then you define base devices on the new install. Once that has been done you can bring down TSM and attempt a restore using the newly defined device(s). 

Now for the bad news. Without the volhist, if you don't know what volume(s) were used for DBBAckup your kind of screwed. The old DSMSERV DISPLAY DBBACKUPVOLUME command has been removed/deleted and IBM now says the following

DSMSERV DISPLAY DBBACKUPVOLUME - Information about volumes used for database backup is available from the volume history file. The volume history file is now required to restore the database. 

You can find a list of TSM Server deleted commands, utilities, and options at the following link.

Monday, April 11, 2016

DR Test - Things learned

I just did a DR test from one data center to another involving TSM and our Data Domain (DD) which we have configured for NFS and VTL usage. Things to know...

  1. We backup the TSM DB to the DD NFS file system
  2. The TSM server was not brought up on its own LPAR in the DR site, but shared with an alternate TSM instance.
  3. The DR site could not facilitate LAN-Free like the primary site.

So we built the secondary instance on the LPAR currently running a TSM server that services the customer's development environment. Then I disabled the replication pair and we mounted it to the LPAR so we could restore the TSM DB. This is where our main problem rose it's head. The NFS file system from the DD was mounting under the primary TSM instances ID, So while we wrestled for this for an hour or so, I realized after Googling the issue and reading the DD notes from people that the problem was the configuration. I would have been fine disabling the replication pair and mounting it to the TSM LPAR if it had been the default user ID, but the primary instance was the owner and we could not change permissions due to what is allowed by the default ID and settings from the DD. So I had to unmount the DD NFS file system to delete the pair on the DD then remount it with the full read/write permissions. I was then able to mount it under an alternate ID. Once we overcame this we were able to start the TSM DB restore which is where our second issue arose.
We were restoring the TSM DB and the active logs were not being restored to the active log directory. The first time I used dsmserv restore db and it ran fine until all the DB records were restored and I received the following error:

ANR2970E Database rollforward terminated - DB2 sqlcode -1004 sqlerrmc TSMDB1

The restore process restored the logs to the instances home directory eventually filling the filesystem to 100% and erroring out. I thought the logs were recovery log related so I then added the RECOVERYLOGDir option to the restore command and got the same results. This wasted an hour to achieve the same results, so after some more Google searches and talking to IBM support I decided to add the ACTIVELOGDIR option to the restore. I didn't add it due to the IBM support tech suggesting it (he didn't) I just realized recovery log was not filled with any logs and the only other logs they could be are Active Log files. I added the ACTIVELOGDirectory option to the restore command and DB restored worked without any errors. The question is why didn't TSM use the ACTIVELOGDirectory option stated in the dsmserv.opt? The RECOVERYLOGDir option was used but the log for recovery were never more than maybe 1GB, but the active log was over 53GB and the db2diag.0.log registered the error that no recovery log directory was listed so the default would be used. What the hell??? It is listed in the dsmserv.opt...

ACTIVELOGDirectory          /drtsmserver/tsm30log
ARCHLOGDirectory            /drtsmserver/tsm30arch

So I post this so you can learn from my mistakes. The final restore DB command was

dsmserv restore db on=db.list recoverydir=/drtsmserver/tsm30fail activelogdir=/drtsmserver/tsm30log