Friday, March 28, 2014

Poor Performance

Currently I work in an environment where we have a specific TSM instance for a large SAP DB (99TB currently). We just upgraded the drives in the tape library (yes we use tape! I know...I know....) from MagStar 3592 TS1130 (E06) drives to TS1140 (E07) drives. The upgrade was pushed in hopes of a jump in write/backup performance, but I was skeptical. TSM adds so much overhead you cannot use the RAW tape read/write numbers from any manufacturer. Typically IBM is somewhat reasonable with their numbers, but in this case I have seen NO performance increase what-so-ever.  Here is a query of the processes for storage pool backup.

UPDATE (04/04/2014):  Let me give you some more specs, we have the 99TB DB split between 4 TSM Storage Agents each having 4 8Gb HBA's. Each storage agent runs 4 sessions (allocates 4 drives) for their backup process. So all 4 storage agents account for 16 simultaneous sessions and it still takes over 24 hours to perform the 99TB backup. The backups are averaging around 70-78MB/sec. Is this a TSM overhead issue or do I have a tuning issue with the TDP and TSM? I'm getting less than 50% of the throughput I should see.

Here's the command that is run to execute the DB backup:

ksh -c export DB2NODE=7 ; db2 "backup db DB8   LOAD /usr/tivoli/tsm/tdp_r3/db264/libtdpdb264.a OPEN 4 SESSIONS OPTIONS /db2/DB8/dbs/tsm_config/vendor.env.7 WITH 14 BUFFERS BUFFER 1024 PARALLELISM 8 WITHOUT PROMPTING" ; echo BACKUP_RC=$?

PROCESS_NUM: 2667
    PROCESS: Backup Storage Pool
 START_TIME: 03-27 23:21:54
   DURATION: 00 23:20:13
      BYTES: 6.0TB
 AVG_THRPUT: 75.87 MB/s

PROCESS_NUM: 2668
    PROCESS: Backup Storage Pool
 START_TIME: 03-27 23:21:55
   DURATION: 00 23:20:12
      BYTES: 6.2TB
 AVG_THRPUT: 78.48 MB/s

PROCESS_NUM: 2669
    PROCESS: Backup Storage Pool
 START_TIME: 03-27 23:21:55
   DURATION: 00 23:20:12
      BYTES: 6.2TB
 AVG_THRPUT: 77.99 MB/s

PROCESS_NUM: 2670
    PROCESS: Backup Storage Pool
 START_TIME: 03-27 23:21:55
   DURATION: 00 23:20:12
      BYTES: 6.4TB
 AVG_THRPUT: 80.13 MB/s

I average anywhere from 75 to 80 MB/sec.  Here is the Magstar performance chart. I am using JB media, not JC so I do take a little hit in performance for that.










So with JB media I could get as high as 200MB/sec but I am not even 50% of that number.  Is there any specific tuning parameter I should look at that could be hindering the performance? 

FYI - The backup of the 99TB DB runs LAN-Free using 16 tape drives over 26 hrs.