IT Services



Backing up machines which have high file counts



1. Introduction

If your machine has a high file count, TSM backups may fail. Backups may take an excessively long time to complete, or they may fail early on if there is not sufficient memory for TSM to process the data that is to be backed up. It is impossible to give an exact figure, but more than a million files in a single partition may be problematic.

If scheduled backups are failing and you suspect that a high file count is the cause, the first step is to inspect dsmerror.log and, more importantly, dsmsched.log. The latter will tell you exactly what TSM has been doing during each scheduled backup for the last month. To find these files, please see our list of TSM Options Files. You can also see how many files the HFS is keeping for each partition of a TSM node by looking up the number of files under Data held on tape.

If you find that the file count is very high (several million files) in one or more partitions, and that as a result TSM is taking a long time to back up that data, then there are several workarounds which should enable TSM to work more efficiently.
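
To see where the files in a partition are concentrated, a short script can help. The following is a minimal sketch rather than a TSM tool: the function name count_files_by_top_level is made up for illustration, and it simply walks a directory tree and counts the files below each top-level folder. Note that it does not stop at partition boundaries, so other file systems mounted below the starting folder are counted too.

```python
import os
from collections import Counter


def count_files_by_top_level(root):
    """Return a Counter mapping each top-level folder under root to its file count.

    Files that sit directly in root are counted under the key ".".
    Symbolic links to directories are not followed.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        rel = os.path.relpath(dirpath, root)
        # The first path component below root identifies the top-level folder.
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts


# Example use (the path is only an illustration):
#   for name, n in count_files_by_top_level("/data2").most_common(10):
#       print(f"{n:>12,}  {name}")
```

Folders that account for most of the total are the natural candidates for the workarounds described in the following sections.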



2. Exclude unnecessary data from backup

As best practice for use of TSM and the HFS, we recommend that you look into what TSM is backing up and verify that it should indeed be included for backup: both that it is data that you would wish to restore if it were lost, and that it is work-related. For more on this subject, please see the section Limit what you back up on our best practice page.



3. Memory-efficient backup

TSM deals with data one partition at a time, comparing a list at the client end with what is already stored on the HFS, and backing up the difference. In cases where a partition has several million files, this can lead to poor performance, since the lists that TSM is comparing are very long.

The following instructions show how to change the way in which TSM processes your data. One of the two options given in this section will often fix problems related to failing backups if the issue is caused by a high file count.



3.1. Changing the memory-efficient backup option using the graphical user interface

To change the memory-efficient backup option in the graphical user interface, do as follows:
  1. Run TSM: in Windows, run TSM from [Start] > [All Programs] > [Tivoli Storage Manager] > [Backup-Archive GUI]; on a Mac, run TSM from [Applications] > [TSM Tools for Administrators] > [Tivoli Storage Manager]; in Linux, run dsmj as root. The following window will appear.

    [Figure: /hfs/help/images/backup01.gif]

  2. Click on Edit, then Client Preferences. The Preferences Editor window will appear, on the General Preferences section:

    [Figure: /hfs/help/images/wingeneralprefs.png]

  3. From the list of tabs on the left, click on Performance Tuning. The options for Performance Tuning Preferences will appear:

    [Figure: /hfs/help/images/winperformancetuning.png]

  4. Under Memory Usage Algorithm to be used during backup, the default Use memory-resident method should already be selected. To have TSM deal with your data one folder at a time rather than a whole partition at a time, change this to Use memory saving method.

  5. If you have already tried Use memory saving method and still find that TSM is running slowly, then you can instead select the third option, Use disk cache method. Please note, however, that this method creates a database on your hard disk, and you will need to have sufficient free space for this database. IBM advise that if your machine is running Windows, you may need 5GB spare for each one million files that you have; if you have a Mac, then 800MB is needed for each one million files; and on Linux 200MB is needed for each one million files.

  6. Now click OK in the Performance Tuning Preferences window. This will take you back to the main TSM window.

  7. If you wish, you can now try running a manual backup to see if TSM performance is improved. Note that TSM will run slower if you are working on your computer whilst you are backing up. For how to run a manual backup, please see our instructions for doing so on Windows, Mac, Linux or Solaris.

  8. Lastly, if you use the automatic scheduled backups, you must now restart the TSM scheduler after making changes to your TSM configuration. If you do not do this then the change that you have made will not be honoured on the scheduled backups. Please see our instructions for restarting the scheduler for Windows, Mac, Linux and Solaris. Alternatively, restarting your machine will have the same effect as restarting the TSM scheduler.
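
Before choosing the disk cache method, it may be worth estimating how much free space the database could need. The following sketch simply applies the per-million-file figures quoted in step 5; the function and table names are illustrative, and treating 5GB as 5000MB is an assumption made for round numbers.

```python
# Space per one million files, in megabytes, from the figures quoted in step 5.
# Treating 5GB as 5000MB is an assumption made here for simplicity.
CACHE_MB_PER_MILLION_FILES = {"windows": 5000, "mac": 800, "linux": 200}


def estimate_disk_cache_mb(file_count, platform):
    """Estimate the free disk space (in MB) that the disk cache database may need."""
    per_million = CACHE_MB_PER_MILLION_FILES[platform.lower()]
    return file_count / 1_000_000 * per_million


# Example: 3 million files on Linux
#   estimate_disk_cache_mb(3_000_000, "linux")
```

The result is only a rough guide; the actual size of the database depends on your data.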



3.2. Changing the memory-efficient backup option by editing the TSM configuration file

To change the memory-efficient backup option by editing the TSM configuration file, do as follows:

  1. Locate the TSM configuration file dsm.opt. The location of this file is platform-specific and can be looked up in our list of TSM Options Files.

  2. In Windows Vista, 7, 2008 and 2008R2, permissions to edit the contents of C:\Program Files are limited. We therefore recommend that you move dsm.opt to the desktop before you open it for editing: browse to C:\Program Files\tivoli\tsm\baclient and drag dsm.opt to your desktop. (An alternative is to run your text editor as administrator before using it to open dsm.opt.)

  3. Now add to the end of dsm.opt:

    memoryefficientbackup yes

  4. If you have already tried memoryefficientbackup yes and still find that TSM is running slowly, then you can instead try the following:

    memoryefficientbackup diskcachemethod

    Please note, however, that this method creates a database on your hard disk, and you will need to have sufficient free space for this database. IBM advise that if your machine is running Windows, you may need 5GB spare for each one million files that you have; if you have a Mac, then 800MB is needed for each one million files; and on Linux 200MB is needed for each one million files.

  5. Save the TSM configuration file. If you moved it from C:\Program Files to your desktop in order to edit it, now move it back.

  6. If you wish, you can now try running a manual backup to see if TSM performance is improved. Note that TSM will run slower if you are working on your computer whilst you are backing up. For how to run a manual backup, please see our instructions for doing so on Windows, Mac, Linux or Solaris.

  7. Lastly, if you use the automatic scheduled backups, you must now restart the TSM scheduler after making changes to your TSM configuration. If you do not do this then the change that you have made will not be honoured on the scheduled backups. Please see our instructions for restarting the scheduler for Windows, Mac, Linux and Solaris. Alternatively, restarting your machine will have the same effect as restarting the TSM scheduler.
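
If you need to make this change on several machines, the edit to dsm.opt can be scripted. The following is a minimal sketch, not a supported tool: set_option is a hypothetical helper that works on the text of an options file, replacing an existing line for the option if there is one and appending the option otherwise. It matches only the full option name, so an abbreviated form already in the file would not be recognised.

```python
def set_option(text, name, value):
    """Return options-file text in which the option `name` is set to `value`.

    An existing line whose first word equals `name` (ignoring case) is
    replaced in place; if there is none, the option is appended at the end.
    """
    new_line = f"{name} {value}"
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        words = line.split()
        if words and words[0].lower() == name.lower():
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Example use on a copy of the file (the file name is only an illustration):
#   with open("dsm.opt") as f:
#       updated = set_option(f.read(), "memoryefficientbackup", "yes")
#   with open("dsm.opt", "w") as f:
#       f.write(updated)
```

Working on the text rather than the file itself makes it easy to check the result before writing it back, and the same helper can later switch the value to diskcachemethod.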



4. Journal-based backup

Journal-based backup involves the creation of a local change journal, which TSM then uses to track files that need to be backed up. It therefore uses fewer resources than the usual backup methods, since TSM does not have to compare a list of files on the client with the TSM server. Occasional incremental backups are recommended, however, in order to ensure that your journal is up-to-date. Journal-based backup is available for Windows and, on Linux, for Red Hat 5 and 6 and SUSE 10 and 11. If you wish to use this method of backup, please see the section of the IBM manual for the TSM 6.4 client entitled Configure your system for journal-based backup.



5. Zip up small files and exclude the originals from backup

If you have millions of small files which rarely change, you can improve TSM performance by only backing them up in zipped-up form. If a large number of files is zipped into a single file, then TSM only has one object to deal with, rather than the large number of files which it previously needed to count.

Once you have compressed your data, you may wish to delete the original files, and only keep them in their new format. However, if you still need to keep the uncompressed form of your data on disk, you can exclude the original files from backup, using our instructions on how to exclude files, folders and drives from backup.
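
Any zip tool can be used for this. As one possible approach, the sketch below uses Python's standard zipfile module to pack every file under a folder into a single archive; the function name zip_directory and the example paths are illustrative only. Write the archive to a location outside the folder being zipped, otherwise the archive would try to include itself.

```python
import os
import zipfile


def zip_directory(src_dir, zip_path):
    """Pack every file under src_dir into one compressed archive at zip_path.

    Paths inside the archive are stored relative to src_dir.
    Returns the number of files added.
    """
    added = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                if not os.path.isfile(full):
                    continue  # skip broken symbolic links and similar entries
                zf.write(full, arcname=os.path.relpath(full, src_dir))
                added += 1
    return added


# Example use (both paths are only illustrations):
#   zip_directory("/data2/results-2012", "/data2/archives/results-2012.zip")
```

Empty folders are not stored by this sketch, so check the archive before deleting or excluding the originals.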



6. Virtual mount points

Virtual mount points present nominated directories to TSM as if they were separate partitions, effectively creating sub-partitions within any existing partitions that you may have. Virtual mount points do not change your data or its arrangement in any way - your current partitions and disk arrangements are not altered. They cause TSM, however, to view your data differently.

There is an advantage in having such pseudo-partitions because, as mentioned in 3. Memory-efficient backup above, TSM deals with data one partition at a time: so, splitting data into smaller groups can greatly improve TSM performance. However, if you implement new virtual mount points after having already backed up your data, any data that is under a newly-created virtual mount point will need to be resent to the HFS, since TSM will regard it as being in a new partition.

For example, you might have a series of directories in your root partition called /data1, /data2 and /data3, where /data2 contains 3 million files and is causing TSM to struggle to complete backups in good time. You can split off /data2 from the rest of the root partition by making it into a virtual mount point.

As an option in TSM, virtual mount points only exist for Linux and Unix; however, a workaround to create virtual mount points exists for Windows machines also. If you need to create virtual mount points in Windows, please contact the HFS Team. If you are using Linux or Unix, you can set them up as follows.



6.1. Setting up virtual mount points using the graphical user interface

To set up virtual mount points in the graphical user interface, do as follows:
  1. Run dsmj as root.

  2. Click on Edit, then Client Preferences. The Preferences Editor window will appear, on the General Preferences section.

  3. From the list of tabs on the left, click on Backup. The options for Backup Preferences will appear.

  4. Under Virtual Mount Point, for each folder that you would like TSM to back up as if it were a separate file system, click on Add, locate the relevant folder, and click on OK. Repeat this for each folder as required.

  5. Under Domain for Backup, check whether or not Backup all local file systems is ticked. If it is, proceed to the next step. If it is not, then you need each virtual mount point to be ticked in this section. To do this:
    • Click OK in the Backup Preferences window. This will take you back to the main TSM window.

    • Click on Edit, then Client Preferences. The Preferences Editor window will appear, on the General Preferences section.

    • From the list of tabs on the left, click on Backup. The options for Backup Preferences will appear.

    • The virtual mount points which you created should now be listed under Domain for Backup. Ensure that each one is ticked.

  6. Now click OK in the Backup Preferences window. This will take you back to the main TSM window.

  7. Now verify that your new virtual mount points are set up correctly. The easiest way to do this is in the TSM graphical user interface: start a manual backup with your new settings and check that the folders nominated as virtual mount points appear in TSM as if they were separate partitions. For how to run a manual backup, please see our instructions for backing up selected files and directories in Linux or Solaris.

  8. Lastly, if you use the automatic scheduled backups, you must now restart the TSM scheduler after making changes to your TSM configuration. If you do not do this then the change that you have made will not be honoured on the scheduled backups. Please see our instructions for restarting the scheduler on Linux or Solaris. Alternatively, restarting your machine will have the same effect as restarting the TSM scheduler.



6.2. Setting up virtual mount points by editing the TSM configuration file

To set up virtual mount points by editing the TSM configuration file, do as follows:

  1. Locate the TSM configuration file dsm.sys. The location can be looked up in our list of TSM Options Files.

  2. Add a line for each folder that is to become a mount point, of the form:

    virtualmountpoint /data

    Note that the added line(s) must not end in a forward slash.

  3. Save the TSM configuration file.

  4. Now open dsm.opt for editing. This is located in the same directory as dsm.sys. Check whether or not the setting for DOMAIN is ALL-LOCAL. If it is, proceed to the next step. If it is not, then you need each virtual mount point to be listed as a separate domain. To do this, you can either have one long DOMAIN line with space-separated values, or multiple lines. So, for example, to set your backup domain to back up the root partition, /media and /data, you can put either

    DOMAIN / /media /data

    or

    DOMAIN /
    DOMAIN /media
    DOMAIN /data

  5. Now verify that your new virtual mount points are set up correctly. The easiest way to do this is in the TSM graphical user interface: start a manual backup with your new settings and check that the folders nominated as virtual mount points appear in TSM as if they were separate partitions. For how to run a manual backup, please see our instructions for backing up selected files and directories in Linux or Solaris.

  6. Lastly, if you use the automatic scheduled backups, you must now restart the TSM scheduler after making changes to your TSM configuration. If you do not do this then the change that you have made will not be honoured on the scheduled backups. Please see our instructions for restarting the scheduler on Linux or Solaris. Alternatively, restarting your machine will have the same effect as restarting the TSM scheduler.
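
The two rules in the steps above (no trailing forward slash, and each virtual mount point listed under DOMAIN unless DOMAIN is ALL-LOCAL) can be checked mechanically before you restart the scheduler. The following is a rough sketch under simplifying assumptions: check_virtual_mount_points is a hypothetical helper, it splits lines on whitespace and so does not handle paths containing spaces, and if dsm.opt has no DOMAIN line at all it makes no domain check.

```python
def check_virtual_mount_points(dsm_sys_text, dsm_opt_text):
    """Return a list of problems found in the text of dsm.sys and dsm.opt."""
    problems = []

    # Collect the folder named on each virtualmountpoint line of dsm.sys.
    mount_points = []
    for line in dsm_sys_text.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0].lower() == "virtualmountpoint":
            mount_points.append(words[1])

    # Collect every value from the DOMAIN line(s) of dsm.opt.
    domains = []
    for line in dsm_opt_text.splitlines():
        words = line.split()
        if words and words[0].lower() == "domain":
            domains.extend(words[1:])

    for mp in mount_points:
        if mp.endswith("/"):
            problems.append(f"{mp}: must not end in a forward slash")

    # With ALL-LOCAL nothing more is needed; otherwise each mount point
    # has to appear among the listed domains.
    all_local = any(d.lower() == "all-local" for d in domains)
    if domains and not all_local:
        for mp in mount_points:
            if mp not in domains:
                problems.append(f"{mp}: not listed in DOMAIN")
    return problems
```

An empty list means that neither rule is broken; it does not replace the manual verification described in step 5.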



7. Repartition your file system

If none of the above helps, then it may be necessary for you to repartition your file system. This is likely to be the case if you have a very large quantity of data (several terabytes or more) that is presented to TSM as a single partition. As is the case in 6. Virtual mount points, this would require a resend of your data to the HFS, so please contact the HFS Team if you think that this will be necessary.