Meet David Newbury, the Getty’s first-ever software and data architect. He joins Rob Sanderson and Chris Edwards as part of the newly formed enterprise architecture team.

A creative developer and onetime motion-graphics animator, David comes to LA from Pittsburgh via the Caribbean, Romania, Belize, and multiple US states. Outside of software, his favorite things are sailboats, too many books, woodworking tools, and comfort food—he can confirm that tacos are tastier, and pierogis sparser, in LA than on the East Coast. We talked about what he’s up to now, and what cultural heritage tech might achieve in the next 50 years.

Annelisa Stephan: You’re the new software and data architect. What does your job entail?

David Newbury: I came to the Getty to help design systems and processes that let computers help people tell the story of the history of art. I’m doing that by working with other people who are experts in their fields and providing the common ground for them all to work together.

The Getty’s digital architects are the plumbers. If you’re doing something that you’re only going to do once, you don’t need plumbing; you can just use a bucket. It’s more efficient to have the plumbing, but no individual project can support doing the plumbing. People can come to us and say, “this is what we wish existed.”

My job is to understand the systems running on computers across the Getty, in particular the ones that present information to the public, so I can find similarities and help bring together people across the Getty who are solving the same problems. Half my job is wandering around and having people tell me what they’re working on, and the other half is finding those connections and writing down what we need to do to make them all play together.

AS: Share an example.

DN: The Getty Conservation Institute manages the AATA, a database of abstracts of conservation research. The Getty Research Institute, in their Provenance Index databases, has a list of archival inventories containing what exists in individual archives across the country. They’re both great projects with really rich data, but what I see from my 50,000-foot view is that they’re also both lists of documents wrapped around a particular discipline’s needs.

The Getty Conservation Institute’s AATA Online database of conservation literature abstracts (left) and a search interface for archival inventories within the Getty Research Institute’s provenance databases (right)

If we can come up with a consistent way to capture a citation with additional research, any tool we write to present that to the audience will be useful for both projects. I could then work on describing what pieces of software will work across both of these systems.

AS: It sounds like a big part of your work is talking to people.

DN: It’s 80% talking to people and 20% typing. I’ve been telling people my job is software diplomacy.

AS: Software diplomacy—say more about that.

DN: Diplomacy is the art of finding compromises. It’s finding places where everybody is happy to do a little bit more work to make the next project even better. Almost everything I suggest is more work, today. It won’t be down the road. My job, I think, is to help show people where collaboration can make what they think is important, better.

AS: Why is collaboration so important to you as software and data architect? 

DN: I’m coming to cultural heritage as an outsider. I can’t do the work an archivist does, and they can’t do the work a software developer does, so if I’m going to do anything meaningful in this space, I have to do it in partnership with other experts. Every success I’ve had in this field has been in working together with somebody else to do something that neither of us can do on our own.

This works at the personal level, but also at the institutional level. Memory institutions are very good at being closed systems, and we’ve solved the sorts of problems we can solve on our own in each of our sandboxes. We need to start building things that are bigger than the sandbox, and we can only do that through collaboration.

AS: You’re the third enterprise architect at the Getty, along with Rob Sanderson and Chris Edwards. What are the three of you working on?

DN: Our three big pushes are linked data, IIIF, and machine learning for computationally generated metadata. Linked data is taking descriptions of entities, things, people, or events, and making them concrete and structured. IIIF is about doing the same thing for pictures.

Computer vision and machine learning are nice buzzword-y things, but they don’t solve problems. “I’m going to computer-vision your picture” doesn’t mean anything! But if I say, “I’m going to take a picture and run a computer to generate some metadata,” I can make a human’s job easier. You could sit there and transcribe all the text contained in an image, but what a terrible use of a human being.

AS: What are your next steps?

DN: We’re having a lot of conversations that sound like freshman philosophy questions, like, “What is a thing?” Museums have a wonderful language to describe things, and archivists have a wonderful language to describe context, and librarians have a wonderful language to describe holding and access. If we can steal the best bits of every discipline, then we’ll have a bunch of great techniques to describe things. And we’ll know that these are useful, because they’re coming from people who know what they’re talking about.

Here at the Getty, we’re doing some of the R&D work. I think what we should be doing is building little tools that together make up the whole pipeline, but that pipeline is really only useful here at the Getty. If we build a bunch of Legos (little black boxes of established, learned experience that we can try out, and describe), we can see if other people need them.

I’m always thinking about small museums. If we can build things that are useful not just for the Getty, but for a smaller museum as well, then we’ve actually solved a problem for the field.

AS: When you talk to colleagues in the cultural heritage technology field about your new role, what are they most interested in?

Diagrams of how an object and a person are linked to relevant data under a Linked Open Data model. Image via Rob Sanderson, licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)

DN: Linked open data. We now have enough resources being put into it that people are asking us, “Okay, will this actually solve problems? What problems is it solving?” LOD only works if we as a field do it as a whole. No one wants to be that person who jumps off the cliff first and sees how deep the water is, but we have jumped off the cliff, and are currently hurtling down toward the water. In the next year we’ll know whether what we’re doing actually works.

AS: What problem do you think linked open data solves?

DN: As museums and as archives and libraries, we’ve spent 20 or 50 or 100 years learning how to describe objects. We don’t know yet how that object interacts with other objects. Memory institutions are beginning to realize that what makes their objects valuable is not the description of the thing itself, but how it relates to the world as a whole. The Mona Lisa is pretty interesting as a painting, but it’s not nearly as interesting as the Mona Lisa as seen through history, all the people who’ve ever owned it, the stories that are attached to it, their relationships.

The promise of linked data is that it gives us a way to start capturing those relationships. And most of those relationships don’t happen within the boundaries of a particular institution. They happen across the world at large, and so we have to be able to share information across the world to describe them. It’s also way harder to capture relationships than it is to describe things. The Getty, for all of its size and expertise, isn’t big enough or sophisticated enough or fancy enough to describe all those relationships for everything. We have to do it as a field and we have to do it together.

And this is a tiny version of the real problem, which is: how do we get all the people who know interesting things about cultural heritage to talk to each other?

AS: Consistency is one of the priorities of the Getty architects’ team. Why?

DN: Even the Getty, with all its resources, can’t talk about art history broadly alone—but if we can get all the institutions talking in the same way, then we have a much better chance of talking about art history, which is what we’re really interested in.

Fieldwide standardization is what we need. How can we make it easy for people to work together? How can we make it easy to do things the right way?

AS: And who determines the “right” way to do things in art history?

DN: There isn’t an objectively right way to do these things, so what we’re really looking for is consistency—and even more, consistency that we, the Getty, didn’t invent. The more times we can say, “A group of smart people have already solved this problem,” the more efficient we will be.

AS: Help us understand the big picture. What does your work help us do?

DN: I think of the plumbing work I do as enabling storytellers to tell stories. Computers are lousy storytellers, but they can expose connections that aren’t obvious on the surface—this and this object were in the same room at the same time, or these two objects are similar, or these two people’s lives converged—and these are the stories that interest the public.

Data and software people can take lots of data and turn it into something usable. We can give storytellers raw material, but we need humanists to tell the stories and distinguish the important ones from the unimportant ones. Art historians are also storytellers. The ones I’ve known who are good at what they do are always looking for that exceptional event, that story that defines a bigger world.

AS: Five years from now, what would success look like to you?

DN: If I could, say, pick any two artworks in any museum in America and have a computer say what they have in common, that would be an amazing success. Or if I could ask: if I were a tourist in Paris in 1850, what artwork would I have seen?

These are hard questions to answer now. And they’re not hard because the information is hard, but because you have to pull so much information across so many sources to get the answer.

To get there we need consistent data structures across institutions, and to get there, we need examples of the cool things that can be done with this data to make it useful. I can’t build the art history time machine without everyone being on board.

AS: What about 50 years from now—what would success be then?

DN: In 50 years, if we all do our job well, I should be able to pick any work of art in the world and say, “Computer, teach me about this.” It would be able to understand my interests and know enough about how this thing is connected to the rest of the world to explain why this artwork matters to me. I think that’s what museums try to do, but we have to do it in bulk right now.

Knowing how to write down—in a way that a computer understands—everything that’s in the head of an art historian is the sort of project that we’re trying to tackle over the next 50 years.