Category 1 - They don’t know what they need archived
- They don’t know where all the data needing archiving is
- They don’t know how long they should keep their data
- They don’t know where to begin to discover data that meets archive requirements
- They don’t want to spend the money on a true archiving solution
Category 2 - They know some of what needs archiving
- They know some of the data needing archiving
- They don’t have all data identified
- They have an idea on how to identify data needing long term retention
- They are willing to investigate purchasing a true archiving solution
Category 3 - They know exactly what needs archiving
- They know all data needing archiving
- They know retention times for all data
- They constantly review systems and apps to identify data that meets archiving requirements
- They are willing to pay for a true archiving solution
The major problem with archiving is how many companies fall within the first two categories; getting them to category 3 is sometimes impossible. It’s amazing to watch huge companies scratch their heads and get that perplexed look when you discuss archiving with TSM. Most people come from the old school of taking a weekend or month-end full backup and keeping it forever. They think this protects them in case they ever need the data, so the customer follows that pattern and doesn’t see the problems inherent in the scenario. The problems are numerous, but the most glaring is that 99.99% of the data on those weekend or month-end tapes will never be needed, and you are now paying a huge amount of money for tapes and offsite storage. Now introduce TSM into the mix, and the customer or management, accustomed to the old process, wonders why they now need to identify their data for archiving.
Let’s be honest: most companies become overwhelmed when asked to do discovery on the specific locations and data types that should be archived, so to make it easier and less work they try to make TSM conform to the old process. Unfortunately, as TSM admins we tend either not to argue the case or to be overruled when we do dissent. So you end up doing backupsets (if you are actually archiving whole machines, please see a psychologist immediately) and relying on the customer to keep restores to a minimum. The problem is that backupsets sound good: they give management a false sense of security, which gets them off your back, and they are independent of any particular TSM server. The truth is that they stink! Backupsets are the worst archiving process you could use. Sure, Tivoli has supposedly made them more functional in 5.3, but you still end up using too many resources, wasting tape, and paying more for offsite storage due to the increased tape usage. We won’t even talk about the restore times, or what happens when the syntax is wrong. Backupsets are the wrong solution for anything but DR or portability needs.
TSM is an adequate archiving tool. It does a good job for small to moderate archiving, but when the customer needs rich, descriptive metadata stored with the archives to make retrieval easier, you need an enterprise tool like IBM Content Manager, Xerox DocuShare, or one of the many others out there. The problem always seems to come down to cost. What do you do when the customer or management can’t part with the money to truly protect themselves? That is where you need to work with them to explicitly identify the data they need archived, verify that retention requirements are met, use management classes and include statements to match data with retention times, document the owners of the data for future reference, and review the documents and contact information at least once per year. I had a situation where data was being archived and, a few years down the road, someone asked for data after the person who had been managing the archive had left the company. No one knew what process was in place or what data was being archived; they didn’t know who the owners of the data were, and the previous manager had not done any transition or handover to other personnel.
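To make the "management classes and include statements" point concrete, here is a minimal sketch of binding archive data to a retention period. The command and option syntax (DEFINE MGMTCLASS, DEFINE COPYGROUP with RETVER, INCLUDE.ARCHIVE) is standard TSM; the domain, policy set, class name, storage pool, and path are hypothetical examples, so substitute your own:

```
/* Server side (dsmadmc macro): a 7-year (2555-day) archive retention class. */
/* STANDARD domain/policy set, FIN7YR class, and ARCHIVEPOOL are assumptions. */
DEFine MGmtclass STANDARD STANDARD FIN7YR
DEFine COpygroup STANDARD STANDARD FIN7YR Type=Archive DESTination=ARCHIVEPOOL RETVer=2555
VALidate POlicyset STANDARD STANDARD
ACTivate POlicyset STANDARD STANDARD

* Client side (dsm.sys / include-exclude list): bind the finance
* reports tree to that class whenever it is archived.
include.archive /finance/reports/.../* FIN7YR
```

Documenting a statement like this next to the data owner's contact information is exactly the kind of record that would have saved the situation described above when the archive's manager left.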
You need to constantly review and audit archiving processes and standards. Too often requirements, laws, and applications change, and you find yourself without the data you are required to have. Archiving tends to be like Ron Popeil’s rotisserie: “Set it and forget it!” That is the breaking point. Even as a TSM admin I have fallen into the trap of forgetting about archive jobs and assuming they are working, so I had to change how we handled archive jobs and retention. I typically recommend reviewing requirements and processes at least twice a year, if not quarterly. This should allow you to identify issues with new data brought online, changes in requirements, and application changes. Schedules need to be reviewed, shell scripts need to be checked, and archive data should periodically be audited to make sure the jobs are performing correctly. DO NOT RELY ON YOUR SCHEDULE EVENT RECORDS! THEY DON’T GIVE YOU A COMPLETE PICTURE! What if the customer or management decides to change the location where the data is stored? What do you do when the customer or management wants data archived from a directory weekly but does not want the data deleted? What if the customer wants data kept online (in the library) and also sends a copy offsite? These are the issues you will have to deal with as you work with archives. If you were expecting solutions and answers, I only have suggestions. There is no one way to do archiving, so you have to find the process that best fits your needs. The key is helping your company or customer understand what is best for them, even if they don’t initially like what they hear. When it comes to data, the customer is not always right. Of course you can’t make the company or customer do exactly what you’d like, but you’ll have to do your best to help them understand how much they stand to lose if they don’t follow the right procedures.