Friday, May 22, 2009

Calculating Active Data

I was recently asked to calculate the amount of active data in TSM storage for file system backups, not TDP's and had some interesting results. If you search TSM active data in Google the first result will be this IBM support doc that explains how to calculate active data to help size an Active Data Storage Pool. IBM recommends that you use the EXPORT NODE command with PREVIEW=YES to determine the amount of active data. This, in theory, should work well but for TSM to process this request it has to analyze the backups table and who knows what else to gather the data. I have 10 instances I needed to gather the information from, they all vary in TSM DB size, and the amount of managed data stored. My smallest DB is a new instance that is 25GB and my largest is 155GB and size did not matter when it came to how fast the information was calculated. The TSM instance with the largest DB completed the taks in over two days (YES TWO DAYS!). Two TSM instances were still running the EXPORT NODE query after THREE DAYS and they had moderate to large sized DB's.

So what caused this problem? It all comes down to the number of files TSM has to inspect. The two instances that never completed the query have large numbers of Windows nodes and have the most registered nodes overall. These two instances seemed to be crawling through the process and where they should have calculated into the ten to twenty TB, as the next largest instance did after just over two days, the problem two were still in the 7 to 6 TB range and increasing slowly. My only explanation (and this is a guess) is that due to the fact that Windows servers tend to have hundred of thousands if not millions of files which TSM gets bogged down trying to inspect them all. I didn't notice a performance impact but IBM claims that it is a resource intensive task and should be run during non-peak hours.  How can you do that when it runs 24hrs or more?

Finally after three days, and no end in sight, I canceled the processes and now have to figure out some other way to calculate the amount of active data they have stored. I could calculate (i.e. guesstimate) by summing the amount of space used per file space.

Example:

select cast(sum(capacity*(pct_util/100)) AS decimal (18,2)) As Total_Used_Space from filespaces where node_name in (select node_name from nodes where domain_name like '%STD%') 

The problem is that this will not be accurate and will probably cause me to oversize any Active Data Pool I create. Now that's not a horrible thing (more space is always better than to little) but the whole process seems to time consuming for something that on the surface should be fairly easy to calculate.  This where I hope the new DB2 database can help, but until someone has it up and running and can try this process we can't say if there is a reasonable solution to find the total active data in TSM within larger instances.


Addendum

Are wondering why I used the '%STD%' filter in the select statement above? The team here at Infocrossing separates the file system backups and application (TDP) backups with different domain names using a standardized naming process. This is great because it also allowed us to run the EXPORT NODE command using wild cards for the specific domains to include in the export query.

EXPORT NODE * DOMAINS=*STD* FILEDATA=ALLACTIVE PREVIEW=YES

I highly recommend you follow a similar process when creating domains and even schedules to make it easier to process groups of nodes. So you could create a WIN-STD-DOM domain or like we do for our TDP's WIN-APP-DOM. These are examples but they can make life easier. 

3 comments:

  1. Hi Chad,

    You stated; "but IBM claims that it is a resource intensive task and should be run during non-peak hours. How can you do that when it runs 24hrs or more?"

    You can do this by means of the restartable export functionality. Cancel the export before the peak hours and resume it afterwards (use q export to see the status of current exports).

    But of course, 3 days for a preview=yes is no fun.

    Best regards,
    Tommy Hueber
    http://www.tsmblog.org

    ReplyDelete
  2. Good question Tommy. I am using 5.4 which I doesn't have the restartable export option. So it won't help me but it could help those using 5.5. I am not sure if it would work with a PREVIEW but it's worth a shot.

    ReplyDelete
  3. Chad, Trying to come up with TSM 5.5 query to determine the amount of active and inactive data is stored for specific directory/subdirectory of a specific node. For example: /siebel/log for node siebelAA. Can you help with that one?

    Thanks.

    ReplyDelete