Nicole Coleman, Stanford University

What is data? Dictionaries define it as structured information, chiefly numerical and scientific, used for calculation. That definition probably rings true for most of us. But it would be a poor description of the wide range of information at play in research laboratories across disciplines, not to mention the data generated and exchanged in our daily social interactions. We still tend to think of data as if it were controlled, measurable, and verifiable, even if we know it isn’t. It remains common practice in the humanities and social sciences to create data visualizations by feeding numbers through a set of instructions to generate graphics, giving those numbers form as bar charts, lines, bubbles, scatterplots, and network graphs. That approach to data visualization is well-suited to review and assessment because it is based on well-established and theorized statistical models and quantitative methods. But it ignores the way data responds to the instruments used to capture, manipulate, and display it.

Graphite in the hands of an engineer drawing precise technical plans produces a different result than graphite in the hands of an artist capturing a moment in time on paper. The medium is the same. The purpose is different. We would not expect the two people to produce the same type of drawing simply because they are using the same medium, and we would not evaluate their skill based on the same criteria. So why hold historians, literary scholars, and statisticians to the same standards and methods, ignoring the epistemological intent of the data analysis?

At Humanities + Design, a research lab at Stanford University, we look to other domains for examples of how to engage with the complexity and uncertainty in data. From the “Information” exhibition at the Museum of Modern Art in 1970 to digital artists working today, we discovered techniques for interrogating the expressive qualities of data that influenced our design of instruments for humanities research. Historians mining archives and compiling data understand that data is imagined, that there is interpretation involved in the moment of choosing which information to capture and in deciding how to structure it. The structure of the data, in turn, influences the questions we can ask of it. As the questions evolve, the structures may change, too, and the data will flow into the new structure in a way that offers new or different answers. Data is not inert or neutral; it is potential. When activated, it can be shaped and used to persuade. When we act on data, we are influencing the data and generating new data in the process. That is data as medium.


Nicole Coleman is the Digital Research Architect at the Stanford University Libraries as well as the co-founder and Research Director at Humanities + Design, a research lab at Stanford’s Center for Spatial and Textual Analysis.