Digital preservation: Rethinking the future of our history
Digital preservation: locking and unlocking the lasting value of digital content
How digital preservation supports digital connections, unlocks potential, and generates lasting value.
Modern civilisation is increasingly defined by the rapid evolution of technology and its equally rapid obsolescence. The cultural heritage we create for future generations now largely exists as ones and zeros, held in myriad formats across countless data storage systems. On top of that, we are generating exponentially more data with each passing year.
When it comes to physical media, a book that is centuries old is fundamentally the same as one we print and publish today and, if preserved properly, is just as accessible. Digital media are a very different matter. Devices and formats have changed almost beyond recognition in the last couple of decades; many are wholly incompatible with one another, and many files can no longer be read because of platform obsolescence. This gap will only widen if we neglect digital preservation and its role in enabling lasting value.
Preserving cultural heritage
World Digital Preservation Day is an opportune time to address the importance of creating a cultural infrastructure and to raise awareness around the subject. The ultimate goal of digital preservation is to make obtaining a digital work of any kind just as easy as retrieving a book from a shelf – no matter how far into the future.
Records and Information Management (RIM) largely serves the needs of an organisation's daily operations, yet it also addresses the lifecycle of information, including strategies for the creation, use, storage and disposition of content, all of which affect digital preservation. Any sector that generates, curates or archives information requiring preservation for an undetermined period needs a plan of action to retain essential content. Without one, valuable information related to entertainment, arts and culture, scientific research, or any other form of intellectual property could be lost for all time.
A constantly evolving challenge
The need for digital preservation has largely arisen because of the relatively short lifespan of digital media and platforms, whether through obsolescence or technical limitations. Digital preservation also encompasses the use of digital technologies to digitise and preserve traditional content, such as print media and analogue cassette tapes, which likewise degrade over time.
For example, widely used hard drives, solid state drives, and flash memory devices often develop data errors after a few years. Most optical discs have a maximum lifespan of 50 years, and that assumes they are kept in very favourable conditions.
Even where the physical lifespan of media is less of a constraint, obsolescence often arrives sooner. Some formats, such as SuperDisk and Zip disks, are virtually unheard of today, and devices that can read them are increasingly difficult to find. The use of more obscure proprietary formats also poses challenges, especially when their creators go out of business or discontinue support. The high-definition optical disc format “war” of 2006 to 2008 is a prime example: it ended with the short-lived HD DVD format vanishing from the market.
The ultimate challenge of digital preservation is the machine dependency of every digital format. Even the ubiquitous PDF file, used for years as a commonly accepted means of archiving content, depends on the platform on which it is stored.
Another rapidly growing challenge is the sheer quantity of data that we generate: currently around 1.7 MB per second for every person on Earth (and the population is fast approaching 8 billion). We have entered the era of big data, where data sets have become so large that using a traditional management approach is simply impossible.
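To give a sense of scale, the per-person figure above can be multiplied out. The short sketch below takes the quoted numbers at face value and assumes decimal units (1 PB = 10^9 MB, 1 ZB = 10^15 MB) and a population rounded to 8 billion; the function names are illustrative only.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
MB_PER_PERSON_PER_SECOND = 1.7       # figure quoted in the article
WORLD_POPULATION = 8_000_000_000     # "approaching 8 billion", rounded (assumption)
SECONDS_PER_DAY = 24 * 60 * 60

MB_PER_PB = 10**9    # decimal units: 1 petabyte = 10^9 megabytes
MB_PER_ZB = 10**15   # decimal units: 1 zettabyte = 10^15 megabytes


def global_rate_pb_per_second() -> float:
    """Implied worldwide data generation rate, in petabytes per second."""
    return MB_PER_PERSON_PER_SECOND * WORLD_POPULATION / MB_PER_PB


def global_volume_zb_per_day() -> float:
    """Implied worldwide data generated in one day, in zettabytes."""
    total_mb = MB_PER_PERSON_PER_SECOND * WORLD_POPULATION * SECONDS_PER_DAY
    return total_mb / MB_PER_ZB


if __name__ == "__main__":
    print(f"{global_rate_pb_per_second():.1f} PB per second")
    print(f"{global_volume_zb_per_day():.2f} ZB per day")
```

On these assumptions the quoted rate works out to roughly 13.6 petabytes per second worldwide, or a little over one zettabyte per day. That is an implication of the figure as stated rather than an independent measurement.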
Which data needs to be preserved?
It is safe to say that many of the records and data created by organisations have no long-term value at all. A records retention schedule is a formal policy that indicates how long information must be kept to satisfy regulatory and operational requirements, some of which may be for very long periods of time. Beyond these core requirements, archivists need to determine what information should be retained for its potential historical or cultural value. Even then, certain types of data may end up being more important in the distant future than they might seem now. For example, in 2011, four centuries after his death, police reports from the early seventeenth century provided insights into the last years of the master painter Caravaggio’s life.
Given the vast amount of digitised media in many archives, determining which of it needs to be preserved is too great a task to tackle manually. Artificial intelligence (AI) and machine learning are proving to be promising ways of automating, or at least partially automating, that process. For example, the UK National Archives is introducing a process that uses AI for the digital selection of government documents for permanent preservation. These solutions can be trained to recognise candidate records and other types of media for permanent preservation by automatically detecting duplicates and other content that has little or no long-term value. Equally, they enable scalable identification of valuable information that is hidden in the data.
With the addition of metadata, archivists can also give preserved data context, making it more readily accessible in the future. For example, something we preserve may mean little to one researcher yet be important to another. Metadata can help categorise the data for more convenient access and greater educational value.
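The simplest part of that automation, spotting exact duplicates, does not need machine learning at all. The sketch below is not the process used by the National Archives or any other institution. It is a minimal illustration that assumes files sit in a local folder and treats two files as duplicates only when their SHA-256 digests match; the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` that have byte-for-byte identical content."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a set of candidates of which only one copy needs appraisal. Near-duplicates, such as the same document saved in two formats, would not be caught this way and are where trained models become useful.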
Preserving lasting value
The overarching goal of digital preservation is to ensure the continued accessibility of digital or digitised content for as long into the future as possible. Such a strategy encompasses a broad range of disciplines, including records and information management, IT backup programmes, and archiving. The integrity of data must be preserved in order to protect it from manipulation later on.
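One common way to check integrity over time is a fixity check: record a checksum for every file when it enters the archive, then recompute and compare later. The sketch below is a minimal illustration under assumed conditions (a local folder, SHA-256, files small enough to read whole); the function names and the manifest layout are hypothetical rather than taken from any particular preservation system.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 digest of a file's contents (reads the whole file at once)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path, relative to `root`, to its digest."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the folder's current state against a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Listed in the manifest but no longer present on disk.
        "missing": sorted(set(manifest) - set(current)),
        # Still present, but the content no longer matches the recorded digest.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Present on disk but never recorded in the manifest.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

A check like this only detects alteration or loss. Recovering the original still depends on keeping independent copies, and the manifest itself has to be stored somewhere it cannot be silently edited.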
Fortunately, technology exists to help us preserve our cultural heritage. The Voyager Golden Records, launched into space in 1977, are expected to last five billion years. For more down-to-earth use cases, however, we must adopt a range of approaches and collaborate closely with industry experts to address the challenges.
Finally, digital preservation is not solely the domain of museums, libraries, and other cultural institutions. It is vital for any organisation or individual who creates value in the form of digital materials.
Sue Trombley is the managing director of thought leadership at Iron Mountain.