Archival of data
Digital permanence addresses the history and development of digital storage techniques, specifically quantifying the expected lifetime of data stored on various digital media and the factors which influence the permanence of digital data. Permanence is often a mix of ensuring that the data itself can be retained on a particular form of media and that the technology needed to read it remains viable. Where possible, factors affecting data retention, including potential technology issues, are detailed alongside expected lifetimes.
Since the inception of automatic computers, a key difference between them and other calculating machines has been their ability to store information. Over the years, various hardware devices have been designed to store ever larger quantities of data. With the development of the Internet, the quantity of information available appears to be growing at an ever-increasing rate, a trend often characterised as an information explosion. As information is increasingly stored on electronic media rather than on traditional media such as hand-written documents, printed books, and photographic images, humanity's social and cultural legacy to future generations will depend increasingly on the permanence of these new media.
However, not all of this information is worth saving; sometimes its value is short-lived. Other data, such as legal contracts, literature, and scientific studies, are expected to last for centuries. This article describes how reliable different types of storage media are at storing data over time and the factors affecting this reliability.
Librarians and archivists responsible for large repositories of information take a deeper view of electronic archives, considering issues such as the following:
- Data format
- Data must be stored in a format which can be meaningfully accessed now and in the future.
- Technology reliance
- If data requires a special program to view it, for example as an image, then software must be available both to interpret the basic data file and to render it appropriately. In some cases, this might also require special hardware.
- Archival strategy
- Data must remain available in the long term.
- At present, a growing problem is the time taken to reproduce an archive, for instance following a hardware or system upgrade. Because the volume of archive data continues to grow, new hardware is regularly required to maintain the archive, and so data must be migrated to a new system on a regular basis. The time taken to migrate the data is starting to approach the interval between system upgrades, such that archive transfer will become a continuous, never-ending process.[1]
- Digital rights management
- Maintaining digital information in an accurate and accessible format over an extended retention period must also address the requirements of the authors' digital rights.
- In many cases, the data may include proprietary information that should not be accessible to all, but only to a defined group of users who understand or have legally agreed to only utilize the information in limited ways so as to protect the proprietary rights of the original authoring team. Maintaining this requirement over decades can be a challenge that requires processes and tools to ensure total compliance.
- Reproducibility
- It must be possible to reproduce digital information as it was originally intended or made available.
- This is especially significant where the original data was produced with technology less capable than what is available today. For example, archivists try to maintain the distinction between listening to a gramophone record played on a gramophone and listening to a digitally cleaned version of the same recording through a modern hi-fi system.
Given that individuals' personal data has been growing at a rapid rate in the 21st century,[2] these archiving issues affecting professional repositories will soon be manifest in small organisations and even the home.
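The migration problem described under "Archival strategy" can be illustrated with simple arithmetic. The sketch below is only a rough model: the archive size, growth rate, copy throughput, and upgrade interval are hypothetical figures chosen for illustration, and it assumes that throughput stays constant while the archive grows, which real systems need not do.

```python
def migration_days(archive_tb: float, throughput_mb_s: float) -> float:
    """Days needed to copy archive_tb terabytes at a sustained throughput_mb_s."""
    total_mb = archive_tb * 1_000_000      # 1 TB = 10^6 MB (decimal units)
    seconds = total_mb / throughput_mb_s
    return seconds / 86_400                # 86,400 seconds per day


def years_until_continuous(archive_tb: float, annual_growth: float,
                           throughput_mb_s: float,
                           upgrade_interval_years: float,
                           max_years: int = 100):
    """Return the first year (0 = today) in which a single migration takes at
    least as long as the interval between upgrades, or None if that does not
    happen within max_years. Throughput is assumed constant (a simplification)."""
    interval_days = upgrade_interval_years * 365
    size = archive_tb
    for year in range(max_years + 1):
        if migration_days(size, throughput_mb_s) >= interval_days:
            return year
        size *= 1 + annual_growth          # archive grows by a fixed fraction each year
    return None


# Hypothetical archive: 1,000 TB, growing 40% per year, copied at 100 MB/s,
# with a hardware upgrade every 5 years.
print(round(migration_days(1000, 100), 1))        # 115.7
print(years_until_continuous(1000, 0.4, 100, 5))  # 9
```

Under these assumed figures a migration that takes about four months today would take longer than the five-year upgrade cycle after nine years of growth, at which point migration never finishes before the next one is due.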