Saving Electronic Records from Rot and Decay

An institutional archivist explains how she wrangles floppy disks and hard drives that preserve the history of the Getty

Hands holding six floppy discs.

I have my work cut out for me: the oral-history interviews on these old floppy disks will be transferred and archived.

By Lorain Wang

Oct 10, 2016

Here at the Getty’s Institutional Archives, our job is to preserve the history of the entire Getty.

This includes archiving the significant electronic records staff have produced over the years.

These born-digital files—meaning files that originated in digital form, as opposed to being digitally scanned from papers and photos—document the vast array of work that the Getty conducts in areas such as exhibitions, publications, public programming, art history research, and art and cultural heritage conservation. Both Getty staff and visiting researchers are welcome to consult these records.

In the past, digital files have come to us hidden within paper collections on old floppy disks and CDs. In more recent years, staff have been actively transferring their files to us through hard drives, network drives, email, and drop boxes. The files we deal with encompass your standard text documents and photographs, but also sound and video files, emails, databases, CAD files, digital art installations, and even the Getty’s website and social media presence.

Old hard drives stored in plastic bags organized inside a red case

Hard drives from former Getty staff awaiting processing

How do we handle all this? Well, Institutional Archives’ strategy for preserving born-digital files is very much a work in progress. In fact, you’ll be hard-pressed to find any organization that has digital preservation completely figured out. It’s common knowledge that you should back up your files, but backups alone aren’t sufficient as a long-term preservation strategy. The reality is that there’s no magical one-step solution for preserving electronic files.

When Files Age and Rot

Digital archivists deal with two major challenges: obsolescence and bit rot. Bit rot occurs when the bits of a file change over time due to physical deterioration of the storage media, which can corrupt files or even make them inaccessible. This is common with CDs, but it can also happen with files sitting on hard drives or network drives.

Even with pristine files, there’s still the issue of obsolescence. Remember LaserDiscs? Technology is constantly evolving, making it difficult to ensure that disks and drives (and, more importantly, the files they hold) can still be accessed in the future. Because of these vulnerabilities, when it comes to digital preservation, there’s no such thing as permanent storage. Files need to be periodically moved from one storage medium to the next.

Forensics to the Rescue

That’s why an important step in our workflow is to get content off disks and hard drives. To do this, we’ve borrowed digital forensics tools used by law enforcement agencies. Although our motives may differ, archivists and digital forensic investigators are both concerned with making exact copies of files without altering them.

We maintain two forensic workstations, one of which is a FRED (Forensic Recovery of Evidence Device). FRED has ports to connect various kinds of internal and external hard drives and a built-in write-blocker to prevent file modification. Our second workstation, which we’ve named Fluffy, is a standard laptop with old drives attached by USB to read 5.25” and 3.5” floppies and Zip disks. These two workstations, along with forensic imaging software, allow us to make exact copies of disks and hard drives that we then preserve as our master copies.

Two hands holding an external hard drive

An internal hard drive connected to FRED

We also use forensic software to examine the content of digital collections. The software allows us to safely view files in various formats, including obsolete formats, without accidentally modifying them. It also has keyword and pattern search functions, so we can flag files containing sensitive information like Social Security numbers and credit card numbers.
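For readers curious what a pattern search of this kind might look like, here is a minimal Python sketch. It is not the forensic software described above; the patterns, the function names `passes_luhn` and `flag_sensitive`, and the use of the Luhn check to weed out random digit runs are all assumptions made for illustration, and real tools apply far more elaborate rules.

```python
import re

# Hypothetical patterns for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, each optionally followed by one space or hyphen.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def passes_luhn(digits: str) -> bool:
    """Return True if the digit string satisfies the Luhn check used by payment cards."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def flag_sensitive(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for likely SSNs and card numbers."""
    hits = [("ssn", match.group()) for match in SSN_PATTERN.finditer(text)]
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if passes_luhn(digits):
            hits.append(("card", match.group()))
    return hits
```

Calling `flag_sensitive` on the text of each file would give a list of candidate matches for a person to review; a pattern match alone does not prove that a file contains sensitive data.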

Don’t Drag and Drop—Bag It!

To deal with file corruption, we use software to capture and monitor the checksums of all the files we accession. Checksums are alphanumeric strings that act like a digital fingerprint: each one is, for practical purposes, unique to its file, and if the file changes, its checksum will also change. We use the checksum to verify that we made an exact copy of the original and to make sure the file doesn’t change over time.
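To make the fingerprint idea concrete, here is a minimal Python sketch that computes a checksum and shows that flipping a single bit produces a different one. The choice of SHA-256 and the names `file_checksum` and `demo_single_bit_change` are assumptions for illustration; the article does not say which algorithm or software is used.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo_single_bit_change() -> tuple[str, str]:
    """Checksum a small file, flip one bit in it, and checksum it again."""
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "interview.txt"  # hypothetical file name
        path.write_bytes(b"Oral history interview, tape 1")
        before = file_checksum(path)

        data = bytearray(path.read_bytes())
        data[0] ^= 0b00000001  # simulate bit rot: flip the lowest bit of the first byte
        path.write_bytes(bytes(data))
        after = file_checksum(path)
    return before, after
```

The two strings returned by `demo_single_bit_change()` differ even though only one bit of the file changed, which is what makes a stored checksum useful for detecting corruption later.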

Files are particularly vulnerable to corruption during transfers, so rather than using the usual drag-and-drop or copy-and-paste function, we use Bagger, a tool developed by the Library of Congress to move digital content from one location to another. Bagger calculates the checksums of the original and copied files and verifies that they match. This can take a while when we’re moving really large sets of files, sometimes an entire day.
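The general idea of a verified transfer can be sketched in a few lines of Python. This is not how Bagger itself is implemented; it is only an illustration of the copy-then-compare principle, and the names `sha256_of` and `copy_with_verification`, along with the choice of SHA-256, are assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_verification(source_dir: Path, target_dir: Path) -> dict[str, str]:
    """Copy every file under source_dir and confirm each copy matches its original.

    Returns a manifest mapping relative paths to checksums.
    Raises OSError on the first mismatch.
    """
    manifest: dict[str, str] = {}
    for source in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source.relative_to(source_dir)
        target = target_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)

        expected = sha256_of(source)   # checksum of the original
        shutil.copy2(source, target)   # copy contents and timestamps
        actual = sha256_of(target)     # checksum of the copy, read back from disk
        if actual != expected:
            raise OSError(f"Checksum mismatch after copying {relative}")
        manifest[relative.as_posix()] = expected
    return manifest
```

Every file is read twice (once at the source, once at the destination) in addition to being copied, which helps explain why verified transfers of large collections take so much longer than a plain drag and drop.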

Automating File Monitoring

The final storage destination for our digital files is Rosetta, the Getty Research Institute’s digital preservation system. Rosetta extracts technical metadata from the files and monitors checksums on a regular basis. If a checksum mismatch is identified, the system can restore the file to its previous, uncorrupted version from a backup copy. We can also program the system to provide alerts if a file format has become obsolete, so we can decide whether to convert files to another format.
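Periodic checksum monitoring of this kind can be illustrated with a short Python sketch. It says nothing about how Rosetta works internally; the manifest layout (relative path mapped to a SHA-256 checksum) and the names `sha256_of` and `audit_fixity` are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_fixity(storage_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare stored files against the checksums recorded when they were ingested.

    manifest maps relative paths to expected checksums (a hypothetical layout).
    Returns the paths sorted into "ok", "changed", and "missing".
    """
    report: dict[str, list[str]] = {"ok": [], "changed": [], "missing": []}
    for relative, expected in sorted(manifest.items()):
        path = storage_dir / relative
        if not path.is_file():
            report["missing"].append(relative)
        elif sha256_of(path) != expected:
            report["changed"].append(relative)
        else:
            report["ok"].append(relative)
    return report
```

Run on a schedule, a check like this would list the files under "changed" or "missing" that need to be restored from a backup copy.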
