“Data never speaks for itself,” Dan Bouk *09 writes in Democracy’s Data: The Hidden Stories in the U.S. Census and How to Read Them, which will be published next week. Instead, reading data “requires careful observation; thoughtful, curious questioning; and creative but also cautious interpretation.”
Bouk, an associate professor of history at Colgate University, performs that analysis on the 1940 census in his new book, which is both a case study in how data are designed, collected, and used and a vivid portrait of the society surveyed in that census.
“I think of myself as a cultural historian of data systems,” says Bouk. “I look at numbers and databases the way people look at poems and paintings, and I look at them with poems and paintings.” He believes that people are often reluctant to look at how data and statistics are made. “Particularly because modern society is mostly held together by systems of quantification,” he says, “there’s no excuse for us not spending a lot of time understanding how systems of quantification work.”
Bouk earned his undergraduate degree in computational mathematics from Michigan State University in 2002 and came to Princeton to study history as a graduate student. In his first book, How Our Days Became Numbered: Risk and the Rise of the Statistical Individual, which grew out of his dissertation, he explored the rise of the life insurance industry in the late 19th and early 20th centuries, an early example of the creation and use of big data by American business and one that heavily influenced the design of Social Security in the 1930s.
As Bouk details in Democracy’s Data, life insurance executives had a hand in designing the 1940 census, as did Census Bureau officials, politicians, and business leaders — including the chairman of the board of Sears Roebuck & Co., the Amazon.com of its era. “Statistics depend on politics (and politics on statistics),” he writes.
Two years ago, Americans could fill out a seven-question census survey online. In 1940, census data were collected by 120,000 enumerators, as they were called, who went door-to-door all over the country to ask at least one person in every household 30 questions. They recorded the answers by hand and sent the completed forms to Washington, creating a data set with about 4 billion pieces of information. The U.S. government releases census information 72 years after it was collected, so the 1940 census was the most recent one Bouk could study when he began his research for the book in 2017.
Bouk researched both the census records and the archives related to them for his book. “The data teems with the stories of Americans from all walks of life,” he writes, “the sort of stories that a historian cannot find anywhere else.” He offers portraits of a number of those people in his book. Among others, we meet the Rochester, New York, enumerator who counted the 15-year-old boy who would become Bouk’s grandfather; a small-town Mississippi politician who lobbied a congressman for a job as an enumerator; ordinary citizens who vigorously opposed being counted; and a Japanese-American employee of the Census Bureau whose parents lived with him in a Washington, D.C., boarding house and who was counted as white by his enumerator.
“I hope this book will help people hear data speak in new ways,” Bouk writes toward its conclusion. “I hope readers will develop an admiration for data’s depths, for the way that sweat and blood suffuse a data set.”