One of the perks of being an editor is learning new things. Getty Publications publishes over thirty books a year on art history, conservation, architecture, archaeology, and related fields, so every time we take on a book project, we learn about new discoveries, new models of interpretation, and even new words.

Still, as an editor I’m not often asked to learn new technologies. Sure, the usual operating system upgrades and updates to Microsoft Word can have me tearing out my hair, but the editor’s tools have remained remarkably unchanged since the dawn of word processing…that is, until e-books came along.

The Getty was relatively quick to embrace digital publishing, but for our early e-books, the editorial workflow remained much the same; only the end product was different. For our two new online collection catalogues, Ancient Terracottas from South Italy and Sicily and Roman Mosaics, however, we thought differently from the outset. We asked ourselves: How can we capture what’s best about a print book—its longevity, user-friendliness, and portability—alongside what’s best about an e-book, especially the ease of access and interactivity?

Our digital publications team came up with an innovative solution, which presents the content as a responsive website and a paperback and offers free downloads of the book as EPUB, MOBI, and PDF, as well as exports of catalogue data and image sets. Under the hood of all these formats is a static-site generator that unites images and style sheets with text files that contain the entire content of the book.

With that solution, however, came new processes for us editors and even a new vocabulary, one that includes words like markdown, git, pull request, repository, commit, clone, middleman, and a bunch of others that take on new meanings once you’ve dipped a toe into coding waters.

While I left the heavy technical lifting to our digital publications team, I nevertheless had to get cozy with GitHub, a web-based platform for hosting repositories that is widely used for open-source software projects. For those who are interested in how this art-book nerd became (sort of, a little bit) a coding nerd, well, read on.

Editing for Print

For a print book, I first work with the author using Track Changes in Microsoft Word. We spend a good amount of time developing the text before sending it off to copyediting and, finally, to design and production. The designer takes the Word documents and typesets the manuscript using Adobe InDesign (really, transforming the words and images of a book into a beautiful object in and of itself).

After all this I oversee the proofreading of the text, which is to say, poring over tabloid-sized sheets of paper (otherwise known as page proofs) and marking the pages by hand using proofreaders’ marks. The designer enters those corrections into the files, I recheck the proofs, and we repeat if necessary, until the pages are approved and the book can be sent off to the printer. There are a lot of other steps and numerous other people involved in the process, but for me, it’s all Word docs and page proofs (and, OK, the occasional PDF).

Ruth Lane, Getty Publications editor, working on PDFs of Ancient Terracottas

At work marking up page proofs for Ancient Terracottas. Photo: Eric Gardner

Editing for the Web

Ancient Terracottas began the same way as most of our other books, with a bunch of Word documents that had been passed around between the author in Italy, curators at the Getty Villa, the peer reviewer, the translator, the copyeditor, and me. But instead of going off to design for typesetting, those Word docs went off to our developer, Eric Gardner, to be transformed into the Markdown and YAML files that power the catalogue in all its forms.

Markdown is basically code lite. There are bits of code floating around—links, special styles of text like small caps, and images all require some knowledge of HTML—but for the most part, a Markdown file doesn’t look too different from a regular Word document. But unlike a Word document, it can be easily converted to numerous other formats, including HTML. And unlike an HTML file, you can actually read a Markdown file without any specialized knowledge. YAML files are similar—plain text and human-readable—except that in the context of our publishing system, they have the added power of transforming a collection catalogue into something that can function as a searchable, sortable database as well as a linear, readable book.

Screen capture of Microsoft Word file showing a catalogue entry for Ancient Terracottas

Microsoft Word file for a catalogue entry in Ancient Terracottas, with a few late-breaking edits, just before Eric transformed it into a YAML file.

Catalogue 1 from Ancient Terracottas, online and as prepared in a YAML file

A catalogue entry (Cat. 1) from Ancient Terracottas as published on getty.edu (left) and as prepared in a YAML file (right). The data—like catalogue number, accession number, title, group, and location—are all separated out individually, which allows for sorting by date, typology, or location, and facilitates better searching.
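For readers curious about why separating the data matters, here is a minimal Python sketch. The field names and values are invented for illustration and are not taken from the catalogue's actual YAML files, and the entries are written as Python dictionaries so the example runs without a YAML parser. It is a sketch of the general idea, not a description of how the Getty's platform does it.

```python
# Hypothetical catalogue records. Field names and values are illustrative
# placeholders, not data from the real catalogue.
entries = [
    {"cat": 3, "title": "Example figure C", "group": "Group 2", "location": "Site Y"},
    {"cat": 1, "title": "Example figure A", "group": "Group 1", "location": "Site X"},
    {"cat": 2, "title": "Example figure B", "group": "Group 2", "location": "Site X"},
]


def sort_entries(records, field):
    """Return a new list of records ordered by a single field."""
    return sorted(records, key=lambda record: record[field])


def filter_entries(records, field, value):
    """Return only the records whose field equals the given value."""
    return [record for record in records if record[field] == value]


by_number = sort_entries(entries, "cat")
from_site_x = filter_entries(entries, "location", "Site X")

print([entry["cat"] for entry in by_number])
print([entry["title"] for entry in from_site_x])
```

Because each piece of information lives in its own labeled field, the same set of entries can be reordered or narrowed down without anyone touching the text itself.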

After the Word documents were converted to Markdown and YAML files (and loads of other programming things happened), Eric produced a PDF and a beta version of the website for proofreading and user testing. The proofreading process for the PDFs wasn’t too different from that for a print book—we still looked at big tabloid-sized page proofs, alongside the website, and marked up corrections by hand—but instead of handing those proofs off to design to input the corrections, I downloaded GitHub and the text editor Atom and entered them myself, with some handholding from Eric and our digital publications manager, Greg Albers.

Screen capture of a YAML file showing the object description for an entry in Ancient Terracottas

YAML file with the object description for Ancient Terracottas, cat. 1. Notice how readable it is, with only a few bits of code, like the asterisks around klismos, which indicate italics.
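As a rough illustration of what happens to those asterisks, the Python sketch below turns single-asterisk emphasis into HTML italics. The sample sentence is made up, and the function handles only this one piece of Markdown; a real converter, including whatever one the catalogue's static-site generator relies on, covers far more of the syntax.

```python
import re

# Matches text wrapped in single asterisks on one line, e.g. *klismos*.
# This deliberately ignores the rest of Markdown (bold, links, and so on).
ITALIC = re.compile(r"\*([^*\n]+)\*")


def italics_to_html(text):
    """Replace *word* with <em>word</em>."""
    return ITALIC.sub(r"<em>\1</em>", text)


# Invented sample sentence, used only to show the conversion.
sample = "The figure sits on a *klismos*."
print(italics_to_html(sample))
```

The point is that the source stays readable to a human while remaining simple enough for software to convert reliably.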

I won’t say that using these new tools was painless (I definitely developed greater sympathy for the designers who enter the text corrections for our print books), but nor was it as painful as I worried it might be. I struggled with the bits of HTML included in the text. The elegantly small-capped BC, which appears hundreds of times throughout the manuscript, is surrounded by HTML markup, which makes reading the Markdown files much more challenging. But Atom’s powerful find-and-replace function made, say, correcting a misspelled name that appears throughout the book much easier, and as I spent more time in GitHub, I got more comfortable using the tools that make it unique.
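For the technically curious, both of these experiences can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the file names, the misspelled name, and the `small-caps` markup are invented, and this is not how Atom or the Getty's platform works internally. The first helper strips HTML tags so a line is easier to proofread; the second corrects a misspelling across several texts at once and counts the changes.

```python
import re

# Matches an opening or closing HTML tag such as <span class="..."> or </span>.
TAG = re.compile(r"</?[A-Za-z][^>]*>")


def strip_tags(text):
    """Remove HTML tags, leaving only the words a proofreader needs to read."""
    return TAG.sub("", text)


def replace_everywhere(files, wrong, right):
    """Correct a whole-word misspelling in every text.

    `files` maps a file name to its text. Returns the corrected mapping
    and the total number of replacements made.
    """
    pattern = re.compile(r"\b" + re.escape(wrong) + r"\b")
    updated = {}
    total = 0
    for name, text in files.items():
        # A function is used as the replacement so `right` is inserted literally.
        new_text, count = pattern.subn(lambda match: right, text)
        updated[name] = new_text
        total += count
    return updated, total


# Hypothetical markup and file contents, for illustration only.
marked_up = 'made about 300 <span class="small-caps">BC</span>'
print(strip_tags(marked_up))

files = {
    "intro.md": "Essay by A. Exampel.",
    "entry-01.md": "See Exampel 2010; Exampel 2012.",
}
corrected, changes = replace_everywhere(files, "Exampel", "Example")
print(changes, corrected["intro.md"])
```

A text editor's find-and-replace does the second job interactively, which is why one correction can ripple through an entire book in seconds.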

And of course, these tools serve a goal far bigger than “just” the one hundred thousand words and hundreds of images that make up Ancient Terracottas from South Italy and Sicily. Only two months after its publication, we launched Roman Mosaics, and other books using the same platform are under way. The platform allows for the books to be experienced on our website, downloaded as PDFs, e-books, and datasets, or ordered as paperbacks.

To me, more remarkable than the form they take now is that they are built to last. While these publications take advantage of the latest technology, the fact that they are built from plain-text files and as open-source static sites helps to ensure they will never be dependent on soon-to-be-out-of-date platforms and are much more likely to be compatible with the technologies of the future. And because each catalogue is also available in its project repository on GitHub, anyone can see how the publications were made, from text corrections to code.

It’s fitting, then, that the first book built on the Getty’s new online publishing platform is about ancient terracottas, which were created from molds, circulated widely throughout the Mediterranean, and have survived for thousands of years. “Create once, publish everywhere” is something of a mantra in the digital publishing world…but as with many things, the ancients beat us to it.