Description (en)
While the Open Archival Information System (OAIS) reference model has become the de facto standard for preservation archives, the design and implementation of a repository or reliable long-term archive still lacks widely adopted technology standards and design best practices. This paper provides guidelines and recommendations for standards implementation and best practices for a viable, cost-effective, and reliable repository and preservation storage architecture. The architecture is based on a combination of open source and commercially supported software and systems.
Although many operating systems exist, the logical choice for an archive storage system is an open source operating system, of which there are two primary choices today: Linux and Solaris. Many varieties of Linux are available, and they are supported by nearly all system manufacturers. The Solaris Operating System is freely downloadable from Sun Microsystems. Many Linux variants, as well as Solaris, are also available with paid support.
The Hierarchical Storage Manager, or HSM, is a key software element of the archive. The HSM contributes to reliability through data integrity checks and automated file migration. It automates making multiple copies of files, auditing files for errors based on checksums, rejecting bad copies of files, and making new copies based on the results of those audits. The HSM can also read in an older file format and write out a new one, thus migrating the format and application information required to ensure the archival integrity of the stored content. Automating these functions improves performance and reduces operating costs.
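The audit cycle described above (verify each copy against a checksum, reject bad copies, and make new copies from a good one) can be illustrated with a minimal Python sketch. This is not how any particular HSM implements it: the function names `sha256_of` and `audit_and_repair`, the choice of SHA-256, and the idea of passing in a known-good digest are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_and_repair(copies: list[Path], expected: str) -> list[Path]:
    """Audit every copy against the expected digest (hypothetical helper).

    Copies that are missing or fail the checksum are rejected and
    overwritten from the first intact copy. Returns the repaired paths.
    """
    good = [c for c in copies if c.exists() and sha256_of(c) == expected]
    if not good:
        raise RuntimeError("no intact copy left to repair from")
    repaired = []
    for copy in copies:
        if copy not in good:
            shutil.copyfile(good[0], copy)  # make a new copy from a good one
            repaired.append(copy)
    return repaired
```

In this sketch the expected digest would be recorded when the file is first archived. A production system would typically also log each audit result and re-verify the new copy after writing it.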
The Sun StorageTek Storage Archive Manager (SAM) software provides the core functionality of the recommended preservation storage architecture. SAM provides policy-based data classification and placement across a range of storage devices, from high-speed disk to low-cost disk and tape. SAM also simplifies data management by providing centralized metadata. SAM is a self-protecting file system with continuous file integrity checks.
The digital content archive provides the content repository (or digital vault) within Sun's award-winning Digital Asset Management Reference Architecture (DAM RA). DAM RA enables digital workflows, and the content archive provides permanent access to digital content files.
With SAM software, files are stored, tracked, and retrieved based on the archival requirements, and they are seamlessly and transparently available to other services. SAM software provides virtually limitless capacity, and its scalability allows for continual growth of the archive with support for all data types. The policy-based SAM software stores and manages data for compliance and non-compliance archives using a tiered storage approach that integrates disk and tape into a seamless storage solution, which simplifies archive storage. It allows you to automate data management policies based on file attributes. You can manage data according to the storage and access requirements of each user on the system and decide how data is grouped, copied, and accessed based on the needs of the application and the users. It also helps you maximize return on investment by storing data on the media type appropriate for the life cycle of the data and by simplifying system administration.
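The idea of a policy that places data on a storage tier according to file attributes can be sketched in a few lines of Python. This is an illustration only and does not reflect SAM's actual configuration syntax or behavior: the `FileInfo` type, the `choose_tier` function, the tier names, and the thresholds (30 days, 365 days, 1 GiB) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """File attributes a policy might inspect (hypothetical)."""
    name: str
    size_bytes: int
    days_since_access: int


def choose_tier(info: FileInfo) -> str:
    """Pick a storage tier from file attributes.

    Thresholds are illustrative assumptions, not vendor defaults.
    """
    if info.days_since_access <= 30:
        return "fast-disk"      # recently used data stays on high-speed disk
    if info.days_since_access <= 365 and info.size_bytes < (1 << 30):
        return "capacity-disk"  # cooler, smaller files go to low-cost disk
    return "tape"               # old or very large files go to tape


if __name__ == "__main__":
    for item in (
        FileInfo("scan_0001.tif", 50_000_000, 3),
        FileInfo("report_2007.pdf", 2_000_000, 200),
        FileInfo("master_video.mov", 8_000_000_000, 200),
    ):
        print(item.name, "->", choose_tier(item))
```

Expressing the policy as a pure function of file attributes keeps it easy to test and to change as the life cycle requirements of the data change.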
Sun Open Storage solutions are systems built on an open architecture using industry-standard hardware and open source software. This open architecture allows the most flexible selection of hardware and software components to best meet storage requirements. In a closed storage environment, all the components of the system must come from a single vendor. Customers are locked into buying disk drives, controllers, and proprietary software features from that vendor at premium prices, and they typically cannot add their own drives or software to improve functionality or reduce the cost of the closed system. Long-term preservation is directly dependent on the long-term viability of the software components. Open source solutions offer the most viable long-term option, with open access and community-based development and support.