Woodrow Wilson School professor Stanley N. Katz went to Congress in the late 1980s with a modest proposal: Why not allocate a billion dollars to digitize the contents of the nation’s academic libraries?
“People just laughed at me,” says Katz. “They thought I was crazy.”
They thought he was crazy not because they knew that $1 billion was nowhere near adequate to the task, but because they couldn’t imagine why you’d want to do all that digitizing in the first place. Even Katz hadn’t an inkling of the revolution soon to come. “None of us could foresee large-scale digitization,” says Katz. “As it turned out, it wasn’t Congress [that did it], but Google. It’s happening right now. A variety of different entities are doing it on a massive scale.”
Indeed, the Princeton University library system is one of 29 libraries participating in Google’s ambitious digitization project. Each month, thousands of its books get packed up in boxes and carted off in a Google truck to be scanned and posted on the Google Book Search Web site, books.google.com. Google announced the project in 2004. It is secretive about the number of books it has digitized so far, offering “over one million” as the figure for public consumption, but some experts put the actual tally as high as seven million. That would be nearly halfway to its goal of 15 million. When you recall that the Google effort is one of thousands of such projects being carried out all over the world, and that all of this material is to be made available on the Web, you realize that the way libraries function is going to change, too. Indeed, we are in the midst of the most exciting, transforming time for libraries since ... well, probably since Johannes Gutenberg made enemies of all the scribes of Europe.
“Very quickly, everyone’s going to have a Firestone on their laptops,” says Dan Cohen ’90, who oversees George Mason University’s Center for History and New Media, which explores the ways in which digital technology might transform libraries, scholarship, and, possibly, the way we think.
That represents a huge leap forward in the democratization of information. But you don’t need to be Rod Serling or even possessed of a particularly dystopian turn of mind to see where all this digitization might lead. Karin Trainer, Princeton’s top librarian, says she has been hearing apocalyptic scenarios for some time now: “I’m constantly approached by people who say, ‘Everything is digital now. Why do you need a library?’”
If you recall the libraries of your childhood as magical places, such a comment comes close to sacrilege. Those libraries were warm and safe; you could spend entire afternoons opening books onto worlds you never knew existed, with the only threat being the sharp tongues of zealous librarians. In a recent essay on libraries in the New York Review of Books, Princeton history professor Anthony Grafton conjured up a lovely phrase — “loneliness and freedom” — to describe what bookish children find in libraries.
Grafton himself is a lifelong connoisseur of libraries. Over the last decade he has watched them change — mostly, he thinks, for the better. What he refuses to do is try to predict what they might look like in the future: “What worries me is a very simple question,” muses Grafton. “If you had tried to guess in 1998 where we’d be now, nobody, however well informed, would have been able to guess what the research landscape would look like now. So, to try to build a library or to plan one — it’s impossible. Who knows where we’ll be?”
For Trainer and the University, that is not just a theoretical question. Having completed a successful renovation of Marquand Library in 2003 and opened the eye-popping Lewis Library for the sciences this fall, Princeton is turning its attention now to the renovation of Firestone Library, which is expected to take 10 years. Princeton’s main library turned 60 this year; even its “new” addition is 20 years old, placing all of it on the far side of the digital divide.
For centuries, “modernizing” a library meant adding shelf space and updating the furniture. It still means building the collections and providing inviting workspace. But more and more, it means designing a building that supports scholars who have grown up in the digital age and who increasingly will be creating digital objects and materials in place of traditional papers and books.
“Obviously, one thing you do is to make your space and systems as flexible as possible,” says former Princeton provost and former Harvard University president Neil Rudenstine ’56, who, as a member of the board of the New York Public Library, has been helping to plan a major renovation of its main building and a reorganization of its branches. “You don’t design the interiors in the grand and wonderful manner of the past, out of heavy marble!”
“In some ways, the future is here,” says Dan Linke, the University archivist and curator of public-policy papers at Princeton’s Mudd Library. He means that a variety of academic fields shifted their focus from books to papers some time ago. Humanists may cling to their books, but in number-crunching disciplines like astrophysics, chemistry, and math, journal articles are what scholars produce, and those articles get published in digital form. “The rhetoric of the scientific moment is that they don’t need books or paper journals at all,” says Grafton. “Scientists want the data right now.” Indeed, in 2007 Paul Ginsparg, a professor of physics and computing and information science at Cornell, won a MacArthur “genius” award for, among other things, creating arXiv.org, an online system to publish scientific research results.
But Linke also is referring to the ways in which digitization has transformed the legwork of the historians who visit the Mudd collections. No longer must they phone Linke or write him asking what is available: “With our finding aids up online, people walk in the door already knowing what they want to look at,” he says. “The transparency of it all is what’s really changed. You can know what’s behind these windowless brick walls, and you don’t even have to be here in the building. You can be anywhere in the world.”
In the digital age, libraries are like information octopuses, reaching tentacles out into laptops all over the world. Grafton knows this firsthand. He recently has been studying Isaac Casaubon, a Renaissance classical scholar who, Grafton found, also was a major scholar of Hebrew. Grafton has a partner in this effort, Joanna Weinberg, a scholar of Hebrew and Jewish studies at Oxford. For most of the year, the two are an ocean apart, but they’ve spent the last few summers working together in the British Library, where they have discovered some 35 books they believe belonged to Casaubon. Studying these books in London, Grafton uses his laptop to read the vast collection of Casaubon’s correspondence digitized in Mannheim, Germany, while at the same time Weinberg might be consulting Jewish sources on her computer. Presumably, Grafton could travel back and forth to Mannheim to look at the original letters — in some cases, it may be essential to do so — but it almost certainly would be costly and time-consuming.
There are two different approaches to digitizing materials. One is to take whole shelves of books and scan them, as Google is doing, a process Clifford Wulfman, Firestone’s digital initiatives coordinator, calls “Hoovering,” a reference to the vacuum cleaner. The other way is to pick and choose items or collections based on a variety of considerations: how important they are to scholars; how fragile they are; and how readily one can obtain funding for the project, since digitizing can cost $1 per page. Princeton has five specialized high-resolution scanning devices of its own, one with a special “cradle” designed to permit a book to be scanned without opening it so far that its binding suffers. Among the materials Princeton has chosen for digitization so far are a series of postcards of the campus dating from the turn of the last century, a collection of fragile subway posters, 200 Islamic manuscripts, and, in collaboration with Harvard and the Library of Congress, 150 rare Chinese medical texts.
Digitization involves more than the time-consuming process of scanning each volume, page by page. At least as important is the creation of metadata, the descriptive tags that enable Web browsers to find the scanned items. “If you were to take the contents of a box of John Foster Dulles’ papers [from Mudd’s public-policy collection] and just scan it all, and not attach some information to it so that the Web browser knows something about it, it would be the equivalent of dumping the box upside down on your desk,” says Linke, noting that the Mudd Library has about 35 million documents, enough to keep a small army of technicians busy scanning and creating metadata for some time to come. “If I could wiggle my nose and have a digital image of all of them tomorrow, there still has to be some work to make sense of them.”
Moreover, preserving and maintaining digital copies is still a major challenge. The technology is evolving so rapidly that what we digitize today may not be decipherable by tomorrow’s reading devices. “Formats fall out of favor,” says Wulfman. The Library of Congress maintains a lot of “legacy equipment” so that people can read information encoded using old technologies. In some cases people are actually taking scanned materials and converting them back into paper versions in order to ensure their preservation.
Trainer has a reply for those who ask if we really need libraries: “The answer is [that] everything is not digital,” she says. “Only a small amount of the world’s knowledge is available in digital form, and much of what is available is very expensive and out of the reach of the ordinary citizen.” (In a 2007 Daily Princetonian column, classics professor Joshua Katz noted that to test the availability of online materials, he searched for secondary sources he had used in the writing of two of his own papers. Very few were electronically accessible.) And, as Trainer points out, libraries continue to acquire huge numbers of books. Last year roughly 1 million books were published around the world, and Princeton’s library system acquired about 125,000 of them.
But there’s no doubting the growing importance of digital materials. If anything, professors worry that Web searches give students the illusion of having conducted a thorough search without actually doing so; some estimate that 90 percent of student research commences with a Google search. There’s a lot out there: Princeton now subscribes to thousands of electronic databases, at a cost of about $5 million a year. Indeed, since 2001 a full-time staff member has been employed to keep licensing agreements current and available to Princeton library users, wherever they happen to be. The databases Princeton subscribes to cover a wide range of subjects: One database has 250 operas on video; others contain materials related to Jane Austen, Aristotle, and African-American poetry. You can find statistical data about New York City in a database called Bytes of the Big Apple. There’s JSTOR, a compendium of digitized scholarly journals, and ARTstor, a digital library holding nearly 1 million images (Rudenstine is its CEO).
To provide access to it all, Princeton subscribes to services like Portico, which maintains more than 8.5 million electronic articles and migrates them forward technologically, so that future generations of readers can use them. “If we subscribe electronically to a journal, we don’t physically have it,” says Sandy Brooke, head librarian at Marquand. “We’re really dependent on either the publisher or somebody like Portico to save it. It’s a very different role for libraries, isn’t it?”
One vision for university and research libraries of the future holds that they will be distinguished by their unique materials, their special collections, and archives. There will be a vast body of digitized material that is held in common and easily accessed via laptop, and then there will be treasured, one-of-a-kind materials whose deepest secrets will be penetrated only by close examination. Grafton, in his library essay, describes a scholar systematically sniffing 250-year-old letters in hope of detecting the presence of vinegar, which used to be sprinkled on letters sent from cholera-ravaged areas in attempts to disinfect them. In this way the scholar hoped to chart the contagion’s spread.
Most of us, though, don’t require such intimate familiarity with original documents. As electronic reading devices like the Amazon Kindle — and even more-enticing versions of it to come — become comfortable and convenient, even lending libraries won’t need to stock copies of the latest bestseller. Those, readers will access on super-quick, easy-to-use machines.
Still, Grafton believes that humanists will want to wander the stacks. “Students need to be in the library because that’s how you learn your fields of scholarship,” he says. “I don’t actually think you can learn it on the Web in the way you learn it by walking the stacks and seeing the books and reading some of them and leafing through others. I don’t see any other way to train graduate students.”
That’s how Princeton English professor William Gleason familiarized himself with the history of leisure as a graduate student at UCLA. “It’s really important for libraries to strike a balance between having all those data-generating machines and creating an environment that invites people into the library, to read and to use it in a way that’s not available on your laptop,” says Gleason. “The closer digital technology comes to actually reproducing the experience of wandering the stacks, the better — that’s really what the humanists are looking for. So much of the work we do involves a kind of serendipitous wandering through the library.”
Twice, Gleason’s serendipitous wanderings have been rewarded in amazing ways, once so dramatically that it seems the sort of magical discovery you’d find in a children’s tale. He had taken his class to Firestone to examine first editions of some of the American classics they were reading. When he picked up a copy of Uncle Tom’s Cabin, out fell a letter — and not just any letter, but one from Harriet Beecher Stowe herself! “The library [staff] didn’t even know they had it,” says Gleason.
Cohen, at George Mason, agrees that accidents of this kind are a boon to scholars, but he points out that research done in traditional libraries allows scholars to miss a lot, too, because books with pages are not keyboard-searchable. “The digital environment can indeed obscure things,” he says. “But on the other hand, you’re going to find things you absolutely won’t in the stacks. There’s only one arrangement of the stacks.” Even that may change soon. David Mimno, a graduate student at the University of Massachusetts–Amherst, has come up with something called Virtual Shelves, a program that arranges the card catalog in a way that matches specific research needs. “A historian doing research on the Erie Canal is going to want the stacks to look different from an economist studying the Erie Canal,” says Cohen.
In any case, there will be no way for humanists of the future to get around digitally based research. “Robert Caro could probably read all of LBJ’s papers,” Cohen says. “It might take him 10 years, but it’s a doable project. But if Bill Clinton’s future biographer goes to the National Archives, he’ll find an e-mail server with 40 million e-mails on it. No one can read 40 million e-mails. You’re going to have to rely on digital tools [to mine it].” The George Mason center that Cohen presides over provides some examples of what scholars may look forward to. Among its more intriguing projects is a collection of digital materials related to the Sept. 11 attacks. The 50,000 objects donated to the site by 30,000 people include BlackBerry communications and digital video, art, and photography. It’s a bit like an oral-history project, except that there are so many more pieces of it, all of which can be searched digitally. “We’re trying to get a sense of what that scale can do,” says Cohen. “What can you do when you have that much from that many different people? I think it really does present new research possibilities.”
Provost Christopher Eisgruber ’83 is a leader of the team planning the Firestone renovation. He’s also a legal scholar, and he’s seen how digitization has changed the field. “I can’t remember the last time I looked at a journal article in hard copy,” he says. “Probably one of my own, when the publisher sent me a free copy!”
Eisgruber is among the last generation of lawyers to be trained using the old West Key Number System, which grouped case law into rigid categories and limited the connections lawyers could make between different fields. But electronic searches have changed that dramatically, opening up fruitful new ways to think about the law. “It means you can draw links between cases that couldn’t have been drawn under the old system,” says Eisgruber, who also points out that there are many empirical questions that legal scholars now can ask because they have a reasonable expectation of finding an answer — like patterns in decision-making by judges.
Katz allows that all this digitization will not lessen the need for the hard thinking that inevitably follows the manipulation of data. “A lot of this is the capacity to utilize information, and the information was always there,” Katz says, noting that digitization has made it possible for whole new fields to emerge. “Think of computational linguistics. We have learned incredible things by dumping huge amounts of text into computers and analyzing the text.” He also considers the many ways in which digital technology can allow scholars to compare data sets and apply them to maps — much as atlases used onion-skin overlays to illustrate changing conditions like the migration of tribes, the spread of plagues, or the reach of glaciers in different geological periods. With geographical information system (GIS) databases, detailed re-creations of a landscape and its shifting human, vegetable, and geological presences make this sort of geographic modeling sophisticated and thorough. “You could re-create digitally the east coast of North America from the beginning of recorded time until today, plugging in information about climate, population, geological changes, wildlife migration, human records,” says Katz. “Well, think about that: We could figure things out that we simply couldn’t do before. That’s what I call a real intellectual breakthrough.”
None of this, Katz points out, necessarily is a library function — at least in the way we once conceived of libraries. These things could be done in a technology center or in rooms devoted to technology in different departments. But Katz sees the digital learning centers of the future as the most logical venues for this sort of intellectual activity to take place, offering computer power and collaborative workspaces where scholars from different fields can exchange ideas. “These are toys that really smart people can do very interesting things with,” says Katz, who sees libraries as the logical site for a broader reorganization — really, a reimagination — of the musty old academic departments of yore, like “English” or “History.” The most ambitious work, Katz believes, will leap right over the old, confining categories and create new and richer fields.
“We’ve got to create a building without walls that really is something like a Princeton Google campus where we let the students play with the toys and figure out what they want to do,” says Katz. “That playground needs to have the machinery, but more importantly it needs to have the playmates, who are librarian technologists, and it has to be welcoming to all three [kinds of users] — faculty, undergraduates, and grad students.”
This is just the beginning of a bigger shift toward what Cohen calls the “active library,” one that actually reaches out to scholars with recommendations and alerts them to useful publications. Some universities, he says, are experimenting with notifying faculty members when books they might be interested in come into the stacks. Think of the recommendations provided to customers by Amazon.com: If you like book A, we think you’ll like book B.
For Katz, there is another challenge to designing libraries in a digital age, one that hearkens back to all those warm and fuzzy joys we experienced in the libraries of childhood: “To what extent can we reinvent — do we need to reinvent — the social function of the physical library?” he muses. Despite dire predictions that in a digital era, students would turn their backs on libraries and read only on their laptops, Princeton undergraduates still flock to Firestone — drawn, it seems, by some deep need, perhaps simply for quiet, perhaps to be close to all that magnificent human striving. However quick and comfortable reading devices become in the future, there is some larger way in which libraries still need to embody our respect for the accumulated wisdom of the past.
During his undergraduate years at Yale, Wulfman developed a kind of ritual to help him savor that other, more mystical meaning of libraries. Upon finishing his last exam, he’d go to the Beinecke Rare Book and Manuscript Library where, in a huge glass case, a tall column of rare books is displayed. The rumor on campus was that those books sat upon an elevator; in the event of a disaster they would descend into a vault, where they’d be preserved. Wulfman would sit and contemplate that great tower of books, savoring all that accumulated human aspiration and knowledge.
“How do we preserve that in the virtual world?” wonders Wulfman. “How do we get a sense of the totality of knowledge in a visceral way? That is something that designers of libraries are going to have to think about.”
Merrell Noden ’78 is a frequent PAW contributor.