TSM Topics Feed

Sunday, January 29, 2006

Great TSM Information

I found this great symposium presentation online that covers the future trends and directions of TSM, including some of the topics I have been trying to get more information on. Here are some of the topics covered:

  • Point-in-time backup sets

  • TOC generation for backup sets to allow file level restore

  • Differential backup sets

  • Application (TDP) backup sets

  • File level restores from Image Backups

  • Active Only Copy-Pools

  • Collocation of active data

  • Benefits of TSM conversion to DB2

My only complaint: why do I never get any information from Tivoli? I work for IBM! Oh well, at least the info is out there for those of you who, like me, don't get invited to Oxford University symposiums. You can view the presentation or see the program with links to the various presentations given. I would like to thank the people responsible for posting this symposium online, as it gave me a great deal to look forward to with TSM.

TSM Express

I am awaiting the release of TSM Express and wondering why it took so long to make an SMB version of TSM. This segment is huge, and a version that plays into the common full-plus-incremental model has been needed for some time. I have numerous sites where I could use Express, and it will allow me to standardize on TSM as the backup tool. For those not aware, TSM Express is a new TSM version due out in the 2nd quarter of 2006 that is very similar to Arcserve and BackupExec. It is a Windows-only product that uses a log and DB, although the DB has a 20GB limit. It follows the full-plus-incremental process most tools in this segment use, and is supposed to be the new small business release. I would call it a medium-size business solution, but with a 20GB DB size limit that would be hard to support, especially in a Windows environment. I found a presentation through Google that discusses TSM Express along with TSM HSM. It's worth a look.

Wednesday, January 11, 2006

TSM Library Sharing

TSM is truly an enterprise backup tool, and one of its key selling points is how well it allows you to consolidate and centralize the backup environment. One feature I have used extensively is the library sharing option. This feature allows multiple TSM servers to use the same library without conflict. It can be easier than partitioning the library, allows for dynamic allocation of tape drives, and allows for a single shared scratch pool. The key to using a shared library is creating a separate TSM instance as a dedicated library manager (whether on the same server as another TSM instance or on its own hardware). The library manager instance will not perform any backups and will not be a big instance; from my experience it has very little impact on the physical server.

When creating our library manager instance I was unsure how large to make the DB, so I made it 2GB, and it is currently 4.6% utilized. The library it manages is a 9-frame, 42-drive 3584. The library supports 3 TSM backup instances (2 on one server and 1 on a separate physical server) with a total of 951 clients. We put the library manager instance on the server with 2 instances, thereby creating a third TSM instance. By creating a third instance dedicated to managing the library you get increased uptime: when you need to do maintenance or restart the instance, the library manager is back up very quickly. This will not be the case if you make a production backup TSM instance the library manager, because a production TSM server needs to run expiration and usually has a larger log, making its restart take that much longer.
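For reference, the setup looks roughly like this. This is only a sketch: the library, drive, device class, and server names (3584LIB, DRIVE01, LTOCLASS, TSMLM) are made up, and your device special files will differ. You define the library as shared on the manager, then point each library client at it:

```
/* On the library manager instance (TSMLM) */
define library 3584LIB libtype=scsi shared=yes
define path TSMLM 3584LIB srctype=server desttype=library device=/dev/smc0
define drive 3584LIB DRIVE01
define path TSMLM DRIVE01 srctype=server desttype=drive library=3584LIB device=/dev/rmt0

/* On each library client instance, once server-to-server
   communication is set up */
define server TSMLM hladdress=tsmlm.example.com lladdress=1500 serverpassword=secret
define library 3584LIB libtype=shared primarylibmanager=TSMLM
define devclass LTOCLASS devtype=lto library=3584LIB
```

Each library client also needs drive paths defined with its own device names, since /dev entries differ per host; check the Administrator's Reference for the exact syntax on your level of TSM.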

The other nice feature of library sharing is the ability to dynamically allocate drives to specific TSM servers. I have seen some administrators take twenty drives and path ten to one TSM instance and ten to another, but this defeats the purpose. If you set up the library manager and TSM backup instances correctly, every instance connected to the library manager can see all the drives. This, coupled with the device class mount limit setting, allows you to change drive usage on the fly. Say you have two TSM servers, TSM-A and TSM-B, using a twenty-drive 3584 managed by a library manager instance called TSM-C. Each TSM instance sees all twenty drives, but you have their device class mount limits set to ten, so each instance can only use ten of the twenty. Now let's say on a specific day TSM-A needs more drives and TSM-B is only using three. You could update the device class on each to allow TSM-A fifteen drives and TSM-B five.
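With that setup, shifting drives between the two instances is just a device class update on each server (the device class name LTOCLASS here is hypothetical):

```
/* On TSM-A: raise the ceiling from ten drives to fifteen */
update devclass LTOCLASS mountlimit=15

/* On TSM-B: drop it from ten to five */
update devclass LTOCLASS mountlimit=5
```

The mount limits just cap concurrent mounts per instance; the library manager still arbitrates which physical drives get used, so no re-pathing is needed.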

Recently we purchased some newer hardware to replace our 9-frame LTO-1 library, and we are looking at using two small AIX servers with HACMP as the library controller, since the new library will need to be available at all times and will be supporting at least six TSM servers. Unfortunately I have had to partition this new library due to application demands. The problem we are experiencing is that the logical partitioning requires us to use the Virtual I/O option in the 3584. This option allows the library to automatically insert volumes when they are placed in the I/O door. That would not be a problem if the library did not then require us to assign the volumes to a specific logical library. This is done through the library and not through TSM, which adds another process to our tape management. I would have preferred not to partition the library and to let a TSM library manager instance handle allocation of tape drives, but alas I am not able to at this time (still awaiting the HACMP instances).

Friday, January 06, 2006

Archiving - The Great Debate!

I knew when I decided to cover archiving that many of you would have comments on it, so I am expecting your feedback and a good discussion on the matter. The problems I see with archiving are many, but most companies fall within one of three categories:

Category 1 - They don’t know what they need archived

  • They don’t know where all the data needing archiving is

  • They don’t know how long they should keep their data

  • They don’t know where to begin to discover data that meets archive requirements

  • They don’t want to spend the money on a true archiving solution

Category 2 - They somewhat know what they need archived

  • They know some of the data needing archiving

  • They don’t have all data identified

  • They have an idea of how to identify data needing long-term retention

  • They are willing to investigate purchasing a true archiving solution

Category 3 - They know exactly what they need archived

  • They know all data needing archiving

  • They know retention times for all data

  • They constantly review systems and apps to identify data that meets archiving requirements

  • They are willing to pay for a true archiving solution

The major problem with archiving is that most companies fall within the first two categories, and getting them to category 3 is sometimes impossible. It's amazing to see huge companies scratch their heads and get that perplexed look when you discuss archiving with TSM. Most people come from the old school of taking a weekend or month-end full and keeping it forever. They think this protects them in case they need any data in the future, so the customer follows this pattern and doesn't see the problems inherent in that scenario. The problems are numerous, but the most glaring one is that 99.99% of the data on those weekend or month-end tapes will never be needed, and you are now paying a huge amount of money for tapes and offsite storage. Now introduce TSM into the mix, and the customer or management, accustomed to the old process, wonder why they now need to identify their data for archiving.

Let's be honest: most companies become overwhelmed when asked to do discovery on the specific locations and data types that should be archived, so to make it easier and less work they try to make TSM conform to the old process. Unfortunately, as TSM admins we tend to either not argue the case or be overruled when we do dissent. So you end up doing backupsets (if you are actually archiving whole machines, please see a psychologist immediately) and relying on the customer to keep his restores to a minimum. The problem is that backupsets sound good: they give management that false sense of security, which gets them off your back, and they are independent of any particular TSM server. The truth is that they stink! Backupsets are the worst archiving process you could use. Sure, Tivoli has supposedly updated them to make them more functional in 5.3, but you still end up using too many resources, wasting tape, and paying more for offsite storage due to the increase in tape usage. We won't even talk about the restore times, or what happens when the syntax is wrong. So backupsets are the wrong solution for anything but DR needs or portability.

TSM is an adequate archiving tool. It does a good job for small to moderate archiving, but when the customer needs very descriptive metadata stored with the archives to make retrieval easier, you need an enterprise tool like IBM Content Manager, Xerox DocuShare, or one of the many others out there. The problems always seem to come down to cost. What do you do when the customer or management can't part with the money to truly protect themselves? That is where you need to work with them to explicitly identify the data they need archived, make sure retention requirements are met, use management classes and include statements to match data with retention times, document the owners of the data for future reference, and review the documentation and contact information at least once per year. I had a situation where data was being archived and, a few years down the road, someone asked for data; the person who had been managing the archive had left the company, and no one knew what process was in place or what data was being archived. They didn't know who all the owners of the data were, and the previous manager had not done any transition or handover to other personnel.
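As a sketch of the management class and include statement approach (the domain, policy set, class, storage pool, and path names here are all invented): define an archive copy group with the required retention on the server, then bind the data to that class in the client's include-exclude list:

```
/* Server side: 7-year (2555-day) archive retention, defined in the
   STANDARD domain's STANDARD policy set */
define mgmtclass STANDARD STANDARD FIN7YR
define copygroup STANDARD STANDARD FIN7YR type=archive destination=ARCHIVEPOOL retver=2555
validate policyset STANDARD STANDARD
activate policyset STANDARD STANDARD

/* Client side (include-exclude list): bind finance reports to it */
include.archive /finance/reports/.../* FIN7YR
```

The point is that each class of data gets an explicit, documented retention, instead of everything riding along on whatever the default management class happens to keep.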

You need to do constant review and audit of archiving processes and standards. Too many times requirements, laws, and applications change and you find yourself without the data required. Archiving tends to be like Ron Popeil's rotisserie: "Set it and forget it!" This is the breaking point. As a TSM admin, even I have fallen into the trap of forgetting about archive jobs and assuming they are working, so I had to change how we handled archive jobs and retention. Typically I recommend reviewing requirements and processes at least twice a year, if not quarterly. This will hopefully allow you to identify any issues with new data brought online, changes in requirements, and application changes. Schedules need to be reviewed, shell scripts need to be checked, and archive data should periodically be audited to make sure the jobs are performing correctly. DO NOT RELY ON YOUR SCHEDULE EVENT RECORDS! THEY DON'T GIVE YOU A COMPLETE PICTURE! What if the customer or management decides to change where the data is stored? What do you do when the customer or management wants data archived from a directory weekly but does not want the data to be deleted? What if the customer wants data kept online (in the library) and also sends a copy offsite? These are the issues you will have to deal with as you work with archives. If you were expecting solutions and answers, I only have suggestions. There is no single way to do archiving, so you have to find the process that best fits your needs. The key is helping your company or customer understand what is best for them, even if they don't initially like what they hear. When it comes to data, the customer is not always right. Of course you can't make the company or customer do exactly what you'd like, but you'll have to do your best to help them understand how much they stand to lose if they don't follow the right procedures.
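One way to spot-check beyond the event records is to query the server's ARCHIVES table directly and compare object counts and last-archive dates against what you expect to see for each node (a sketch; filter it down to the nodes and filespaces you care about):

```
select node_name, filespace_name, count(*) as objects, -
       max(archive_date) as last_archived -
  from archives -
 group by node_name, filespace_name
```

A node whose last_archived date is months old, or whose object count never grows, is a strong hint that the archive job is not doing what everyone assumes it is, even if the schedule shows "Completed."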

Tuesday, January 03, 2006

Happy New Year

Happy New Year!

To all the readers of TSMExpert here’s to a great 2006 with no major system crashes, RAID failures, or major disasters!  May this year be the most boring IT year you ever have!