Thursday, March 12, 2009

Active Data Pool - What's The Point?

To go along with the ADP story I have moved this older post up for easy reference.

With the release of TSM 5.4, Tivoli added the ability to create an active-data storage pool to allow for faster restore times. We were looking into using them at work when I stumbled upon this interesting factoid in the description of active-data pool limitations:

  1. The server will not attempt to retrieve client files from an active-data pool during a point-in-time restore. Point-in-time restores require both active and inactive file versions. Active-data pools contain only active file versions. For optimal efficiency during point-in-time restores and to avoid switching between active-data pools and primary or copy storage pools, the server retrieves both active and inactive versions from the same storage pool and volumes.
So my question is: why don't they allow the client to restore all the active data first and then restore the inactive data? Or why didn't they implement a multi-session restore process when they added active data pools to the product, thereby speeding up the restore process? With the number of P-I-T restores I do, this issue makes the whole active data pool useless, not just for me but, I'm sure, for many of you.
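
To make the quoted limitation concrete, here is a minimal sketch of the rule as the documentation describes it. It is a toy model, not TSM's actual implementation; the `Pool` class, the `pools_for_restore` function, and the pool names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Pool:
    """Hypothetical stand-in for a storage pool (not a TSM API)."""
    name: str
    holds_inactive: bool  # False for an active-data pool


def pools_for_restore(point_in_time: bool, pools: Sequence[Pool]) -> List[Pool]:
    """Return the pools a restore may read from, keeping the given order."""
    if point_in_time:
        # A PIT restore needs inactive versions as well, so pools that hold
        # only active versions are skipped entirely.
        return [p for p in pools if p.holds_inactive]
    # A restore of the latest (active) versions may use any pool,
    # including the active-data pool listed first.
    return list(pools)


# Assumed example layout: a fast active-data pool plus a tape primary pool.
pools = [
    Pool("ADP_DISK", holds_inactive=False),
    Pool("TAPEPOOL", holds_inactive=True),
]

print([p.name for p in pools_for_restore(False, pools)])
print([p.name for p in pools_for_restore(True, pools)])
```

In this model the active-data pool only ever helps the non-PIT case, which is exactly the complaint above.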


  1. Hi Chad,

    IMHO it shouldn't be an issue that you can't use PIT restore when the data is stored across different stgpools (i.e., PIT cannot be used when an ADP is used).
    You'll want to use an ADP only to collocate active file versions in a single stgpool. PIT is not needed for that, as you don't need 'historical' data seen from that perspective.

    ADPs on disk can be used for fast client restore and will reduce required resources. You don't need to hold active and inactive data in the same stgpool (or underlying tiered storage). You can keep active data in an ADP (fast/expensive disk) and have the inactive data on lower-cost disk/tape. ADPs on disk will give you the most advantages.

    In addition, you can create a list of nodes that will benefit from the advantages of an ADP, namely filesystems with small files. Create a list of nodes that require a fast restore of, typically, the last backup version (the active one). This way, you can leave the large databases, etc. out of the ADP.

    Another argument is that DRM doesn't manage ADPs. That is correct. You can use your ADP for local disasters (disk failure, a client's system down, etc.). Use simultaneous writes/copypool tapes to provide offsite DR.

    I think there are some advantages, and some people in certain environments can benefit from an active data pool.

    Best regards,
    Tommy Hueber

  2. The point of active data pools is recovery, not restores. If you are trying to get a crashed system back, then it is definitely faster.

    AD pools are great for replicating your active DR data to remote sites, or creating sets of DR media. Yes, there is the argument for just replicating all your backup data at that point, especially if you run dedupe, but what if your remote site has only the barest of storage resources? Then you have to be selective about what stays.

    PIT is closer to a restore than a recovery, since you are going back in time. That holds even if it is a true recovery, i.e. restoring a system to a point before corruption, since you can't quite call it a DR.

    A potentially novel use of AD is for the bane of the TSM admin's existence: long-term retention. I had a chance to design a system where the AD pool was created once a week on sequential disk storage. A TSM server dedicated to retention would then come in and back up the AD pool and the DB backup for long-term (10-year) offsite storage. Since the second TSM server only had to deal with a few large files, DB size over the long term was no longer an issue. The primary TSM server could then continue to focus on restore and recovery.

    Of course, with 6.1, IBM is biasing the AD pool even further towards DR, which is interesting.

  3. The reason that tape storage is a favorable medium for this type of business is that it is inexpensive to store massive amounts of data on tape media before having to purchase new hardware.