This policy has been approved by the OUCS Management Committee and supersedes the previous draft policy dated March 13th 1997.
The Hierarchical File Server (HFS) provides methods of storing data from a wide variety of sources, and for many different purposes. The HFS is a valuable University resource, with a finite, though large, capacity. The HFS Archive Policy is intended to ensure that the HFS Archive facility is used to best effect.
One of the goals in acquiring the Hierarchical File Server, in 1997, was to provide economical long-term storage of digital materials. At the beginning of 2013 the HFS Archive held over 80 terabytes of data, mainly from research and digital imaging projects. Over the past ten years the use of computers and the rate at which digital data is produced have changed substantially. Most, if not all, research projects have significant digital outputs, ranging from large-scale data from remote sensors to the digital materials collected in the making of a book. The HFS Archive Policy should, for example, now reflect the view that writing data to CD/DVD-ROM is unsustainable as a long-term data preservation strategy.
The HFS provides both backup and archiving services. The essential difference between archiving and backup is that the existence of an archive file is independent of the existence of the file from which it was copied; a backup copy is dependent on the existence of the file in the local filestore, and if the file disappears from the local filestore, its copies will, after a delay, disappear from the backup filestore.
There are other differences that make some data suitable or unsuitable for the HFS Archive Service; a simple table of what constitutes suitable candidate data for archiving is listed for quick reference.
2. Data Storage and Curation
Archiving is the process of transferring information of value into a distinct repository to ensure its long-term safe-keeping (and therefore also its accessibility). All archives, whether physical or digital, have policies which dictate the selection, retention and deletion of materials. Discrimination in the receipt and retention of material is always required since the cost of retention is never zero (and indeed is likely to increase over the longer term as various preservation actions are initiated). The University is therefore wary of "just in case" archiving and of attempts to use archives as an extension to day-to-day file storage.
The HFS Archive Policy is intended to encourage uptake of the Archive service for the long-term storage of data considered to be of value to the University as a whole, or likely to be of value to our successors. Therefore, the decision on whether to make use of the HFS Archive should be based on qualitative rather than quantitative judgements (i.e. the value rather than simply the amount of data). Since, in practical terms, it is impossible to state with any firm conviction that a given dataset will remain of value to the University for ever, it is important that the Archive incorporates retention and deletion policies.
The HFS Archive provides long-term file storage. The HFS Team has the expertise to maintain and develop the storage infrastructure, migrating data from one medium to another as required. However, the HFS Team does not provide a data curation service. The documenting and management of the data content is the responsibility of the Data Curator. Every dataset lodged with the HFS Archive must have a Data Curator, the contact details of whom must be kept up to date. The Data Curator is responsible for submitting data to the Archive; for ensuring the data is documented to agreed standards; and for reviewing on a regular basis with the HFS Team the need to retain or to delete the data. The HFS Team expects to work in collaboration with domain experts and OULS to ensure that good practice in data curation is implemented for all datasets submitted to the Archive, and that consequently the HFS Archive service continues to offer value for money in the long-term storage of the University's digital assets.
3. HFS Archive Policy
This Policy should be read in conjunction with the HFS Service Level Description (http://www.oucs.ox.ac.uk/internal/sld/hfs.xml).
Preserving files in the HFS Archive represents a major investment of University resources. As such it should be subject to proper control and its usage should be open to scrutiny by appropriate University bodies.
The HFS Archive Policy comprises the following elements:
The HFS Archive service is available to Senior Members, Staff, and Postgraduates. Applications from Postgraduate students should be sponsored by a Senior Member. A Postgraduate student cannot be named as the Data Curator.
3.3. Project Definition
The HFS Archive contains projects. For the purposes of this policy a Project is defined as a discrete, finite activity from which one or more datasets arise. Projects in this context need not be related to research or externally funded, nor need they involve teams. Projects do have start and end dates, and project data may similarly comprise a bounded collection of digital objects (including related sub-collections).
Each HFS Archive 'project' is allocated a maximum quota of 4 TB. Usage above this quota is subject to cost-recovery charging. The 4 TB limit reflects the total amount of data that may be stored on a single HFS tape (data is replicated on three tapes: online, local secure location, remote secure location).
3.5. Conditions of Use
The following requirements apply to the use of the HFS Archive service:
4. Cost-Recovery Charging
The hardware infrastructure underpinning any reliable Archive service must be of high quality, be available 24/7, be scalable and have a degree of redundancy that avoids service disruption if any single element or path of elements fails. Such systems obviously entail an elevated level of expense. The University bears this expense partly by funding the Archive service as a core service, and this is reflected in there being no point-of-use charge for the archival storage of the first 1 TB of any approved project.
Increasingly, however, this level of 'free' storage is not adequate, with some projects generating many terabytes of data requiring archival. All projects seeking funding should now include a defined element for long-term storage of any data and, as a result, the following charging models have been developed, based on the Full Economic Costs (FEC) model, to reflect the costs of large-scale, long-term archival of data within the HFS.
Charges are incurred at the TB boundary and in advance. A purchase order should be raised for "the archival of N TB of data for X years with the HFS", include the Project Name and be sent to OUCS. The HFS will monitor actual occupancy on a monthly basis and report this to the designated contact email address for the project. Additional purchases can be made for increased storage and applied on a yearly basis.
The charging models cover staff costs (management, systems administration, and user support), as well as the costs of media, hardware and software maintenance licensing and support - all of which, with the exception of the client software licence, may be seen to increase generally in line with the amount of data stored. Media costs represent a significant element of the storage costs, and the media have a reasonable lifetime of five years before the data should be rewritten. Charges are incurred at the TB boundary as the tape media have a capacity of 1 TB. Thus the same number of tape volumes (2) is required to store either 1.1 TB or 2 TB. Three copies of the data are written to separate tape volumes: one is stored online for ready data retrieval, one is stored on-site in a secure firesafe, and one is stored at a secure site outside Oxford. It is hoped that this represents an affordable yet high quality, reliable archive service.
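As an illustration of charging at the TB boundary, the following Python sketch rounds a project's occupancy up to whole terabytes and counts the tape volumes needed. It is a minimal sketch only: the function names are hypothetical and not part of any HFS tool, the free allowance is passed in as a parameter because it is set by policy, and the 1 TB tape capacity and three copies are taken from the paragraph above.

```python
import math

COPIES = 3  # online, on-site firesafe, off-site (from the text above)


def tape_volumes_per_copy(occupancy_tb: float, tape_capacity_tb: float = 1.0) -> int:
    """Number of tape volumes needed for ONE copy of the data.

    tape_capacity_tb defaults to the 1 TB figure quoted in the text;
    treat it as an assumption if the media change.
    """
    if occupancy_tb < 0 or tape_capacity_tb <= 0:
        raise ValueError("occupancy must be >= 0 and capacity must be > 0")
    return math.ceil(occupancy_tb / tape_capacity_tb)


def billable_tb(occupancy_tb: float, free_tb: int) -> int:
    """Whole terabytes to purchase, rounding occupancy up to the next TB.

    free_tb is the uncharged allowance (hypothetical parameter; the
    actual value is whatever the current policy grants).
    """
    if occupancy_tb < 0 or free_tb < 0:
        raise ValueError("occupancy and free allowance must be >= 0")
    return max(0, math.ceil(occupancy_tb) - free_tb)


if __name__ == "__main__":
    for occupancy in (1.1, 2.0):
        per_copy = tape_volumes_per_copy(occupancy)
        print(occupancy, "TB ->", per_copy, "volumes per copy,",
              per_copy * COPIES, "volumes in total")
```

Under these assumptions both 1.1 TB and 2 TB need two volumes per copy, which is why a part-used terabyte is charged as a whole one.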
The charges are reviewed on an annual basis and are documented as a Premium Service in the HFS Service Level Description.
5. Example HFS Archive Projects
The following are typical examples of how the HFS Archive service is used to support the management of research data.