Distant Horizons: Digital Evidence and Literary Change

by Ted Underwood

Just as a traveler crossing a continent won’t sense the curvature of the earth, one lifetime of reading can’t grasp the largest patterns organizing literary history. This is the guiding premise behind Distant Horizons, which uses the scope of data newly available to us through digital libraries to tackle previously elusive questions about literature. Ted Underwood shows how digital archives and statistical tools, rather than reducing words to numbers (as is often feared), can deepen our understanding of issues that have always been central to humanistic inquiry.  Without denying the usefulness of time-honored approaches like close reading, narratology, or genre studies, Underwood argues that we also need to read the larger arcs of literary change that have remained hidden from us by their sheer scale. Using both close and distant reading to trace the differentiation of genres, transformation of gender roles, and surprising persistence of aesthetic judgment, Underwood shows how digital methods can bring into focus the larger landscape of literary history and add to the beauty and complexity we value in literature.

About the Author

Ted Underwood is professor of information sciences and English at the University of Illinois, Urbana-Champaign. He is also the author, most recently, of Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies.

Read an Excerpt


Do We Understand the Outlines of Literary History?

Literary studies is littered with terms that suggest one critical practice has displaced another: poststructuralism, postmodernism, New Criticism, New Historicism. Even "distant reading" is often understood as a would-be successor to "close reading." Habits of debate are by no means a bad thing: they make conversation in an English department playful and agile. But it is worth teasing out the assumptions implied by these succession stories — which are not dominant, after all, in every corner of the university. Many disciplines tell their own stories instead as a cumulative process of expansion. Bioinformatics has not replaced biochemistry. Although these different approaches to the study of life admittedly compete for resources, no one imagines that the exchanges between them are zero-sum games or that one has displaced the other.

Literary scholars, by contrast, commonly do assume that critical approaches are locked in dialectical struggle. And this assumption is not arbitrary: the premise has been correct for much of our history. Critical debates amount to struggles over a scarce resource — readerly attention. Should the ghosts in The Turn of the Screw be experienced as repressed desires or as a suppressed class identity? We can say both, but dividing classroom time between interpretations is still a zero-sum problem. Even when scholars move from theoretical debate to archival research, our enterprise seems to draw meaning from the pressure of limited attention: recovery projects become notable and important (rather than merely additive) when they argue that new discoveries force us to redefine an established concept like Romanticism.

I'm dwelling on these disciplinary habits in order to highlight the assumption on which they all rest. Literary scholars tend to feel they are arguing about the redistribution of interpretive emphasis within fixed historical outlines. Implicit in this self-understanding is an assumption that the broad divisions of literary debate are already known: that we are unlikely to discover, for instance, a new genre or period in the archives. It seems a safe assumption, because the historical organization of our discipline has been relatively stable, compared to the list of research topics in biology. There are exceptions — especially toward the contemporary end of the timeline, where new terms are always taking shape. But once a list of periods, movements, and genres has consolidated, it tends to endure: if you page through old college catalogs, you find courses on "Elizabethan Drama" and "English Romanticism" offered in unbroken succession from 1900 to the present. Nor is it clear yet that distant reading has altered the logic of this game. Franco Moretti's "Slaughterhouse of Literature," for instance, could be read as a new approach to a familiar topic. Although Moretti gestures at a mass of unread works, literary scholars have been reluctant to believe that this really creates a new object of knowledge, seeing instead a familiar dialectical move that plants a new flag on terrain of recognized significance (say, detective fiction) by claiming to displace a critical approach called close reading.

This model of disciplinary history was well founded in the twentieth century. But I would argue that it is unreliable today. Distant reading may have begun by proposing to displace existing approaches to detective fiction and the nineteenth-century novel, but it is turning out to matter in a less familiar way: by uncovering new objects of knowledge. As literary historians stumble over a growing number of trends that span two or three centuries, we are beginning to realize that we moved too hastily in assuming that our discipline had been mapped, or even loosely sketched. Do we actually understand the broad outlines of literary history? I think we understand them well at a generational scale, and hazily at the level of a single century. But as we back up farther, the limits of our knowledge become evident: large patterns emerge for which we don't yet have names.

Moreover, these historical patterns are not alternatives to the canon: they are broad trends that embrace Jane Austen and Toni Morrison along with the "great unread." Discoveries of this kind don't create yet another approach to The Turn of the Screw; they occupy a different scale of historical description and require a different kind of debate. Literary historians will still have plenty to argue about, since large frames do change our understanding of the things inside them. But the specific martial talents we developed in fights over anthologies and critical editions become a bit beside the point here. Instead of staging normative arguments about the right approach to a limited number of familiar topics, we need to explore ways of coordinating different scales of analysis.

The Horizon of Literary Knowledge

For instance, this chapter about our ignorance of long-term trends might begin by acknowledging how much we already know at a local level. To show off our achievements, we could choose a task that literary scholars have spent some time practicing — say, the explication of short passages in novels. Here are two versions of a situation that has been common in fiction: in both, two young people are developing a romantic attachment, without full awareness of the attachment yet on either side. How much literary history can we extract from the contrast between the passages?

Sophia, with the highest degree of innocence and modesty, had a remarkable degree of sprightliness in her temper. This was so greatly increased whenever she was in company with Tom, that had he not been very young and thoughtless, he must have observed it; or had not Mr. Western's thoughts been generally either in the field, the stable, or the dog-kennel, it might perhaps have created some jealousy in him: but so far was the good gentleman from entertaining any such suspicion, that he gave Tom every opportunity with his daughter that any lover could have wished; and this Tom innocently improved to better advantage, by following only the dictates of his natural gallantry and good-nature, than he might perhaps have done had he had the deepest designs on the young lady.

This passage is not difficult to date — first of all, because many readers will recognize Sophia Western as a character from The History of Tom Jones (1749). But even readers unfamiliar with that book may guess that they're listening to an eighteenth-century voice. Qualities like modesty and good nature are confidently attributed to characters in long sentences articulated by wry counterfactual conditions. The narrator is so casually omniscient, in fact, that he can call his protagonist "very young and thoughtless" in passing. The second passage is almost equally easy to place on a timeline, although very few people alive today are likely to have read this novel.

The last time he had been in a boat with a girl, it had been Aileen who had sat in the bow against a background of sparkling sea and deep blue sky, and in her scarlet cap, the young gull's wing he had given her. Now against a background of green branches and sunflecked water there sat opposite him a gray-eyed young woman of whom at first he had been a little afraid. But facing her half-wearied quiet, and remembering the shadow picture of ten days before, he felt much more at his ease. "And the little breeze? Y're catching it?" he said. He could see that it had caught her, for it was blowing the soft hair around her ears and lifting the ends of her black four-in-hand tie.

"Oh, yes! And it is delicious! It is much nicer even than — than soy beans!"

Like Tom Jones, this story of growing affection has a rural setting, and both passages even develop a brief comic sidelight involving a character distracted from love by agriculture. But readers will have little difficulty discerning that the passages are separated by more than a century.

What gives the second passage away is an elaborate strategy of indirection. Readers are not told that Patrick Joyce is beginning to love Olivia Ladd — only that he notices her "half-wearied quiet" and the motion of the "soft hair around her ears." But we also know that his feelings are divided, because her image is juxtaposed with the image of another woman. The divided feelings seem intense. How else should we interpret the strange vividness of these images? The contrast between women requires four color adjectives in the space of two sentences — deep blue, scarlet, green, gray — followed shortly by a fifth, when Patrick notices Olivia's black tie. This rigorous insistence on expressing emotions indirectly by tracing a character's flickering attentiveness to physical sensations probably tells us that we're somewhere near the early twentieth century, although, as it happens, Frances Allen published The Invaders in 1913, six years before T. S. Eliot popularized the phrase "objective correlative."

Works of fiction from different periods invite readers to play very different interpretive games. We might not even need formal training in literary history to feel the differences and place the passages on a timeline. Where readers do have historical training, it becomes fair to ask harder questions — for instance, Why are these passages so different? Since the author of the first is famous, it will be tempting to answer the question there by invoking common knowledge about Henry Fielding: his narrators are comically intrusive and opinionated even for the eighteenth century. I have chosen the second passage from a little-known author to prevent us from leaning entirely on that kind of personal explanation. Although she published three novels that were well regarded at the time, Frances Allen is discussed today mainly in local histories of Deerfield, Massachusetts. When it is remembered at all, The Invaders is remembered for exploring tensions around immigration: Patrick Joyce is "a little afraid" of Olivia Ladd in part because her family is hostile to the Irish, who (along with the Polish) are the invaders of the title. But little has been said about Allen's aesthetic choices. If we want to set her in a literary frame, we will have to back out at least to the level of the period. Henry James might provide one precedent for her tactic of limiting third-person narration to a particular character's perceptions. Allen doesn't stick to a single character's perspective as consistently as James, but she was one of hundreds of turn-of-the-century writers who discovered that focusing description on physical sensations and objects could, paradoxically, dramatize their characters' subjectivity.

Anyone with a degree in English could write paragraphs like the last three above, which characterize both passages, and situate them in the context of a period. This is the kind of thing we're good at. But if we back out a bit farther and try to draw a single line all the way from Henry Fielding to Frances Allen, we enter a scale of description where literary history becomes speculative at best. To be sure, there are stories in circulation we could lean on. For instance, one "critical tradition," according to Paul Dawson, "favours a historical view of the novel progressing towards an invisible narrator who will simply present events (or experience) without commentary or evaluation." It's a view of history often mixed up with a normative opinion that good writers "show" stories instead of "telling" them. Fielding was certainly a teller, and Allen's tactic of expressing human feeling through colors and wind-ruffled hair qualifies as an extreme form of showing. So these passages fit the alleged historical pattern.

But frankly, I chose both passages for that reason. How confident are we that the rest of literary history between 1749 and 1913 can be organized around a transition from telling to showing? There are critics who have implied a version of that argument. Percy Lubbock and F. R. Leavis are well-known examples. But it would be an understatement to say that their interpretation of history remains controversial. Other scholars have pointed out that Leavis's grand narrative about the rise of impersonal narration forces him to underrate George Eliot, and even Joseph Conrad. It is clear that limited third-person perspective had a good run from Henry James through Virginia Woolf but not at all clear that previous nineteenth-century fiction should be understood as a slow progress in that direction. Omniscient exposition permitted the characteristic strengths of the Victorian novel, and it remains important today in postmodern metafiction and genre fiction. The once-dominant historical narrative about the rise of impersonal "showing" now looks suspiciously like a Whig history constructed by modernists to glorify a particular modernist practice. Many literary historians have followed Wayne Booth's lead in replacing it with a more complex ecology of rhetorical techniques.

In short, literary-historical knowledge is extremely rich within certain chronological limits. We are good at using short passages to characterize authors, movements, and periods. But if we try to stitch those pictures together to map a longer timeline, consensus becomes elusive. Scholars have certainly tried to make century-spanning arguments. But it's hard to stage a focused debate about critical generalizations on this scale, since every reader frames the pattern at issue slightly differently and sees different writers as exemplary. Historians who think highly of modernism, or of Henry James, are more likely to believe in a broad shift from "telling" toward "showing" than historians who think highly of postmodernism or George Eliot. Scholars do agree about some simple long-term trends: the decline of rhymed poetry, for instance. Beyond that, we have influential hypotheses and long-running debates. But in contexts where we need consensus (like a classroom anthology), historical generalizations are typically limited to periods.

Nor is this limitation necessarily perceived as a problem. Since our disciplinary institutions are also built around periods, literary scholars specializing in different centuries are rarely forced to reach agreement about long-term patterns. Disagreements between historical fields often present themselves as normal, even inevitable temperamental differences between people who have chosen to study different things. So we let the argument drop and move on to something else. It's not obvious that these debates could ever be resolved simply by discussing a larger number of books.


And in fact, I don't recommend trying to resolve a dizzying argument like the one between Percy Lubbock and Wayne Booth simply by multiplying examples. It is often more useful to reframe the question as we back up to consider a longer timeline. For instance, what happens if we bracket this sprawling debate about narratorial impersonality and "showing" in order to focus on some apparently trivial descriptive detail?

Take the names of colors. I may not have persuaded you that they signal emotion in the passage I quoted from The Invaders, but there is no denying that colors are oddly prominent in that book and comparatively absent in Fielding. Could that difference reflect, by any chance, a broader shift in the language of fiction?

Color perception may seem a basic constant in human experience, but its role in fiction has changed rather dramatically, as we see in figure 1.1. There is, to be sure, a lot of variation from one volume to another; most early-twentieth-century novelists were not as obsessed with visual experience as Frances Allen. (The fact that she and her sister both lost their hearing and worked as photographers is possibly relevant.) But the rising significance of color is not a personal idiosyncrasy; in fact, references to color are about three times more common, on average, in early-twentieth-century fiction than they had been in the eighteenth century. (Here, I am describing changes in the central trend line in figure 1.1, which reflects a mean value for the volumes plotted; we can have 95% confidence that the real trend falls somewhere in the shaded area at any given point on the timeline.) The meaning of this trend is not immediately legible. But the evidence itself is clear, and robust enough to resist two familiar forms of skepticism.
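The measurement described above can be made concrete with a minimal sketch. It assumes a short, hypothetical color list (the list actually counted is longer), a rate expressed per 1,000 words, and a percentile bootstrap for the 95% interval. The excerpt does not say how the shaded band in figure 1.1 was computed, so the bootstrap is an illustrative stand-in, and the names `COLOR_WORDS`, `color_rate`, and `bootstrap_mean_ci` are invented for this example.

```python
import random
import re
from statistics import mean

# Hypothetical, deliberately short color list; the list counted in the book is longer.
COLOR_WORDS = {"red", "scarlet", "blue", "green", "gray", "grey",
               "black", "white", "yellow", "cerulean"}


def color_rate(text):
    """Return the number of color words per 1,000 words in one volume's text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in COLOR_WORDS)
    return 1000.0 * hits / len(tokens)


def bootstrap_mean_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap of the mean.

    Assumption: the bootstrap is one common way to draw a confidence band;
    it is not necessarily the method behind figure 1.1.
    """
    rng = random.Random(seed)
    resampled_means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return mean(values), resampled_means[lower_index], resampled_means[upper_index]
```

Applied to the volumes published in a given decade, `color_rate` would give one point per volume, and `bootstrap_mean_ci` over those points would give the central value and the band around it for that decade. Counting bare word forms also tallies "blue" when it names a mood, which is the kind of blurry boundary discussed next.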

For instance, it is true that the concept of color has blurry boundaries. The list of words I have counted here is long enough to include colors like "cerulean," but there will always be chromatic experiences it leaves out and emotions that it mistakes for a color when tallying up occurrences of a word like "blue." If we were looking specifically at the history of "blue," the rise and fall of its various senses might be crucial. But the rising frequency of reference to color in fiction is not driven by the expansion of any single term: all the primary colors become more common together, in parallel. Semantic nuances won't explain away this trend.
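The claim that no single term drives the trend suggests a simple check: compute the rate of each color word separately for an early and a late group of texts and confirm that every one of them rises. The sketch below is an illustration under stated assumptions, not the procedure used in the book; the three-word `PRIMARY` tuple and the function names are hypothetical.

```python
import re
from collections import Counter

# Illustrative subset only; not the list counted in the book.
PRIMARY = ("red", "blue", "yellow")


def term_rates(texts, terms=PRIMARY):
    """Return each term's rate per 1,000 words, pooled over a group of texts."""
    counts = Counter()
    total = 0
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        total += len(tokens)
        counts.update(token for token in tokens if token in terms)
    return {
        term: (1000.0 * counts[term] / total if total else 0.0)
        for term in terms
    }


def all_terms_rise(early, late, terms=PRIMARY):
    """Return True if every term is more frequent in the late group than the early one."""
    before = term_rates(early, terms)
    after = term_rates(late, terms)
    return all(after[term] > before[term] for term in terms)
```

If one term's shifting senses were responsible for the overall increase, this check would fail for the other terms. A parallel rise across all of them is what makes a purely semantic explanation implausible.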


Table of Contents

List of Illustrations
Preface: The Curve of the Literary Horizon
1 Do We Understand the Outlines of Literary History?
2 The Life Spans of Genres
3 The Long Arc of Prestige
4 Metamorphoses of Gender
5 The Risks of Distant Reading
Appendix A: Data
Appendix B: Methods

