Student Portfolios : Intro/Advanced Digital Research Methods

I’m pleased to announce the Mellon Scholars’ 2016-17 Team Portfolios! Click on the images below to visit the four team sites: 1. The Modernist Journals Project, 2. 19th Century Sunday School Books, 3. Antislavery Almanacs, and 4. The Founding Fathers Lending Library.


Team portfolios comprised the bulk of our assignments in IDS 180-181; students ranked their topical preferences and were placed in teams during the second week of class (September 2016). I was a little wary about a year-long team project– as were the students– but I couldn’t be prouder of the results. I began by introducing the project in the fall, showing how each of the parts fits into our course goals. Students conducted traditional research on their chosen dataset, resulting in an annotated bibliography and an argumentative essay. They also employed some research techniques that were a bit newer to them: developing an Omeka exhibit, cleaning their dataset with OpenRefine, and analyzing and visualizing their data with Voyant, Tableau, Palladio, Google My Maps, Plotly, and RAWGraphs. They also learned how to build WordPress sites to host their portfolios.

The team projects were successful, in my mind, because of the time that we took to build team charters and discuss best practices in Project Management. I have both Lynne Siemens and Ethan Watrall to thank for this: I was fortunate to take Lynne’s course in Project Management at DHSI, and Ethan’s discussions of DH Project Management in the CHI Fellowship translated perfectly to the classroom. Students co-wrote team charters, assigning team roles and developing communications plans. I also asked students to think about their team values, and I think this step was crucial in developing a positive, collaborative atmosphere. Students exhibited a great deal of trust in one another. We discussed the progress of the team projects regularly– it seemed important that I check in often, since so much of their grade depended on it!– and my class was shocked that the projects were going well. Their previous bad experiences in teams, they told me, were due to group members who would not contribute, or whose work was subpar. These students told me that they could trust each other, and they enjoyed the opportunity to encourage their teammates as they followed their interests.

I was really moved by the relationships that developed amongst the teams. On our last day of class, I asked students to reflect on the year; as they discussed their teamwork, it was clear that they did not simply trust one another to perform— they also trusted one another to learn and to fail. One student said, “Christie* really wanted to learn how to work with Palladio, and so I said to her, ‘you get it, girl!’ Because I knew she would do a good job, and she was really excited about it.” (*not her real name) We paused to discuss this interaction. I asked, “But what if Christie had a really hard time learning Palladio? What if she did a really crap job with network analysis?” Christie– who, in fact, had done a fantastic job with Palladio, said– “I think it would’ve been OK, because I know my teammates wouldn’t blame me. We would’ve worked together and we would’ve fixed it. And even if it wasn’t perfect, that’d be OK.” The process of building a team– and the hard, hard work of maintaining a team– ended up modeling the process of long-term research in a way that I hadn’t initially intended. Because each step of the team-building process was deliberate and thoughtful, students came to see how each step of the research process must also be deliberate and thoughtful in their lives as scholars.

MSA 18: Codex Industries

I’m excited to be presenting my work on the Armed Services Editions at this year’s Modernist Studies Association annual conference (and also excited to escape to Southern California in November!). My friend Alex Christie has organized a panel on Modernist Codex Industries, with four really fantastic papers addressing issues of modernism and media theory, with a dash of DH.

Panel Abstract

While modernist scholarship has documented the formal relationships between literature and film, it has yet to attend to production methods that move between book printing and mass media. While Edisonian innovations in the mechanical reproduction of sound and sight marked the emergence of old media as an industrial product, equal revolutions in the apparatuses of print production, including the introduction of the linotype and the rotary press, transformed the printed word into an industrial commodity. As Matt Huculak has recently noted in the Journal of Modern Periodical Studies, the production of modern periodicals relied upon the mass extraction of paper from Newfoundland. As such, “In effect, on or about 1910, Newfoundland’s natural timber resources began to underwrite the material production of modernity.” Attending to the modernist word’s instantiations as an industrial commodity, this panel unites book history with media studies to uncover modernist literature as media products. Papers may take up broad and ranging intersections between literary and mass mediated modernity, addressing topics such as: mechanical reproduction techniques that cross literature and media (old or new), the geopolitics of cultural production and consumption in the modernist period (including ecocritical approaches), the relationship between subject and object in industrial culture, the emergence and influence of experimental writing techniques (e.g. automatic writing), and the shifting nature of representation across formal and material registers. Additionally, papers may reflexively consider modernist literature’s re-emergence as a mass mediated enterprise through the mechanisms of new media and large-scale digitization.
Considering the confluence of modernist studies with the digital humanities, specifically as it inherits the legacies of industrial modernity, participants will collectively take stock of the ongoing politics of literary reproduction as they play out through technological and disciplinary transformations of the printed word.

Papers include:

Hannah McGregor: “You Owe Very Much to Advertising”: Mass Mediating the Modern Nation in Canadian Magazines

Kathryn Holland: The family in the network: an infrastructure for modernist literary activity

Me: Reading the Armed Services Editions: The Book Industry and the Production of Vernacular Modernism

Alex Christie: Unspooling Roussel’s Spectacle: Mass media and the manuscript

…and our chair, James Gifford!

Hope to see you there!

Launching the Armed Services Editions

I am happy to announce the launch of my CHI Project, The Armed Services Editions: A Computational Analysis. On my page, users can navigate through three “Data Narratives”: simple analyses that I conducted to answer critical questions about these data. The Gender Data Narrative considers the distribution of gendered pronoun usage throughout the corpus, and features a basic foray into LDA topic modeling. The Genre Data Narrative considers the types of books that were sent to servicemen, and how the generic representation of books may have shifted over time. Finally, the Geography Data Narrative explores the geographic imagination of the corpus– both domestic and international– with named entity recognition (NER).
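
For readers curious what a gendered pronoun count looks like in practice, here is a minimal sketch. It is written in Python rather than the R used for the project, and it is not the code behind the Gender Data Narrative: the pronoun lists, the function names, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed, illustrative pronoun lists; the project may have used different ones.
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}


def pronoun_counts(text: str) -> Counter:
    """Count feminine and masculine pronouns in one text, ignoring case."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        if token in FEMININE:
            counts["feminine"] += 1
        elif token in MASCULINE:
            counts["masculine"] += 1
    return counts


def feminine_share(counts: Counter) -> float:
    """Share of gendered pronouns that are feminine (0.0 if none were found)."""
    total = counts["feminine"] + counts["masculine"]
    return counts["feminine"] / total if total else 0.0


# Hypothetical sample sentence, not drawn from the ASE corpus.
sample = "She gave him her book, and he read it to himself."
counts = pronoun_counts(sample)
share = feminine_share(counts)
```

Running the same two functions over every text in a corpus, then comparing the shares by title or by year, would give one rough picture of the distribution described above.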

The first phase of this project is, quite simply, a book history project. To date, the ASE Corpus has not been studied in its entirety. Several scholars have published institutional histories of the Council on Books in Wartime, or discussed the role of specific books, or even discussed the ASEs in relation to a larger sociological project. I am interested in assembling a more thorough, stylistic macro-history of the ASEs that attends to both its sociological import and its formal properties through computational analysis. The data I’ve assembled is descriptive, working toward that end, and is a necessary foundation for the more advanced analysis I will be conducting this summer.

In addition to an analysis of the ASE Corpus, this website is also a record that chronicles the development of my methodological chops. While I had a basic foundation in R (thanks to a fabulous course at HILT), my skills needed (and still need) development. I used two textbooks to improve my skills, testing my dataset throughout. Users familiar with Text Analysis with R for Students of Literature by Matt Jockers and Humanities Data in R by Lauren Tilton and Taylor Arnold will likely be able to trace my data analysis back to the chapter problem sets.

Full disclosure: I feel insecure about this. I would like, eventually, to publish on the ASEs. A record of my fledgling explorations in R and data analysis is… well, nerve-wracking. Yet, as Ethan Watrall has reminded me in a variety of ways, it’s also an important intervention. Over and over again this year, I have been reminded of and impressed by the generosity of my colleagues in DH; I post this basic data analysis in hopes of inviting that same generous conversation.

Only a fraction of the work that was completed on this project is featured on my project website. I should have foreseen this problem and created a time-lapse video of my hours and hours running OCR on hundreds of documents, or adding metadata to my database. Or, better yet, learning how to analyze data in R. For this project, however, I decided to visualize my data using Tableau. Tableau provides far less specificity, for sure, but it also allows for a greater degree of user interactivity. Since my data is, at this stage, largely descriptive, I wanted users to be able to explore with greater flexibility.

It’s been a long year working on this project, and that long year has turned out to be just the beginning. I’m so excited to see how this project continues to develop. Over the summer, I’ll be continuing this project by running these analyses—and much more interesting, advanced analysis (fingers crossed)—on the entirety of my corpus.

The questions motivating this project are increasingly pressing, and continue to motivate me—particularly as a powerful political candidate has remained consistently hostile toward the free exchange of ideas that should define any democratic discourse. Ultimately, this project asks, what (or whose) ideas are acceptable, and what (or whose) ideas aren’t? And what (and who) makes that so? These questions should be asked about 1940, and they should be asked about 2016.

MLA 2017: Digital Holland and Community-Based DH

I’m happy to announce that my paper, “Digital Holland and Community-Based DH,” has been accepted to MLA! The panel, “Local Digital Humanities,” is organized by the MLA Forum on Digital Humanities. I’m excited to be sharing my work at Hope College– and, more importantly, the work of undergraduate Mellon Scholars.

Paper Abstract

Digital humanities has been celebrated for its emphasis on collaborative research. Yet DH’s transformative collaborative ethos is restricted to institutional spaces—within the university, between universities, or in formal academic communities—and at large research universities. This paper is interested in collaboration of a different sort and in different places, considering the role of the local community in the digital research program of the small liberal arts college, arguing for a model of Community-Based DH. Literature in the Scholarship of Teaching and Learning has emphasized Community-Based learning as a hallmark of “high-impact” education, encouraging classrooms to expand beyond their institutional havens (Kuh, 2008). Digital humanities research lends itself, especially, to these pedagogical practices, expanding the reach of collaboration into the local community—work that small liberal arts colleges are especially equipped to undertake. To make this case, I rely on my own community, university, and students. As the Mellon Fellow for Digital Liberal Arts at Hope College (Holland, MI), I oversee a robust partnership between local partners (archives, historical associations, churches) and diligent undergraduate researchers. Our flagship research project, Digital Holland, has been running for four years; students have been partnering with local organizations and independent researchers to make Holland’s cultural heritage freely available and accessible to the public in the form of digital archives and exhibits. In addition, students have also developed a series of “turnkey” projects to make digital research methods accessible to the public, inviting community members to participate in the making of Digital Holland, partnering in the act of knowledge creation alongside student project managers. 
In placing the discourse of Community-Based learning in conversation with DH Pedagogy, I hope to transform both: reframing the local community as more than simply a site, object of study, or recipient of service, but as a partner in research, and by expanding the traditional models of the DH Lab or Center to consider the applications of digital research and scholarship for the public good.

Politics and Form: The Armed Services Editions

As a CHI Fellow, I’m undertaking a large-scale text analysis of the Armed Services Editions, a collection of books sent to US soldiers during WWII to “fight the war of ideas,” to consider issues of politics and literary form. I first stumbled on the Armed Services Editions a few years ago, while researching Ernest Hemingway’s The Sun Also Rises. You may recall Jake’s description of Robert Cohn, early in the novel:

He had been reading W.H. Hudson. That sounds like an innocent occupation, but Cohn had read and reread “The Purple Land.” “The Purple Land” is a very sinister book if read too late in life…For a man to take it at thirty-four as a guide-book to what life holds is about as safe as it would be for a man of the same age to enter Wall Street direct from a French convent, equipped with a set of the more practical Alger books.

I was working on a project on modernist reading networks, and this passage jumped out at me. I looked into The Purple Land and found that it was chosen to be a part of the Armed Services Editions in World War II, 16 years after the publication of The Sun Also Rises. Cursory research into the Armed Services Editions led me to the Council on Books in Wartime, a committee of publishers that assembled during World War II and contracted with the US Military to produce cheap paperback editions for US soldiers abroad. The goal (and slogan) of the Council on Books in Wartime was to use books as “weapons in the war of ideas.” Books had an important role to play in the war effort, the CBW wrote, because “Books can help us recover our past and teach us what a tough-fibered people we can be when we have to. Books can tell us what our enemies are like. Even prizefighters study their opponents carefully.[…]Books can tell us what our allies are like.” All of this was vitally important to such a “total war.”

Yet, the process for selecting these books for such an important task was fairly opaque. According to a booklet commemorating the ASEs found in the Princeton University Mudd Manuscript Library,

“Titles are selected by the following process: Publishers’ lists are combed and copies of books thought desirable are asked for. Each book is then carefully read by a professional editor who makes out a written report. The books and the reports are submitted every two weeks to an Advisory committee consisting of publishers, librarians, booksellers, critics, and authors. Books that meet with the approval of this Advisory Committee are then sent to the Army and Navy, both of which services must agree on a title before it is accepted for publication.”

Presumably, a desirable book would be selected and approved because of its fit within the general aims of the ASEs: to boost morale, to promote democracy, to learn about the enemy. Histories of the ASEs show very little censorship of books (though, presumably, certain books would not have been “thought desirable” and suggested for publication in the first place: James Joyce didn’t make the cut, nor did D.H. Lawrence). A quick scan of the ASE database reveals some books that make sense as “desirable” in the promotion of democracy for the war of ideas (in the hive-mind of the DoD in 1943): Jack London novels, for instance. Others seem out of place, such as Virginia Woolf’s The Waves. Yet, over 120 million copies of 1,322 books were distributed on the front lines and in military hospitals, all of which met the criteria outlined by the CBW: they each helped to “fight the war of ideas.”

I’ll be looking at this corpus for my CHI project, analyzing what it would mean for a text to be made into a weapon for democracy.

Big picture: how might an understanding of the CBW Corpus help us think about textual politics, politics and style, politics and form? To answer this question, I want to consider how “democracy” might be operationalized and measured. In other words, what formal or stylistic measures might make a text “democratic”? I have other plans for this project down the road, including developing a predictive model. But for the purposes of my CHI Project, I’m going to be building this corpus and conducting some preliminary analysis in R. Right now, I’m eyes-deep in Phase One: Building the Corpus.

Fortunately, it is quite easy to find a full list of all of the ASEs. Also fortunately, many of the titles assembled by the CBW were published prior to 1923 and are therefore in the public domain. Even so, it is unlikely that I will be able to assemble a corpus of all 1,300-odd titles. I plan to do the following:

  • Follow the release of the ASEs chronologically, starting with the A series and moving through ZZ.
  • Keep texts that I can find already digitized in the public domain (Hathi Trust, Project Gutenberg, Google Books)
  • Keep a running list of texts that are
    • not digitized but ARE public domain
    • still protected under copyright
  • See what I end up with and make some hard choices about scaling, about digitization, and about copyright and fair use.

Highly scientific and conclusive, I know. I’ll cross the OCR bridge when I get there.
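
The plan above can be sketched as a small triage routine. This is a sketch under stated assumptions, in Python rather than R, and not code from the project: the `Title` record, its field names, the series ordering (single-letter series before double-letter ones), and the placeholder titles are all hypothetical. The only rule taken from the plan itself is the pre-1923 public domain cutoff.

```python
from collections import defaultdict
from dataclasses import dataclass

PUBLIC_DOMAIN_CUTOFF = 1923  # the pre-1923 rule of thumb mentioned above


@dataclass
class Title:
    series: str      # hypothetical series label, e.g. "A" or "ZZ"
    name: str
    year: int        # year of first publication
    digitized: bool  # True if a digitized copy has already been found


def triage(titles):
    """Sort titles into the three lists described in the plan."""
    buckets = defaultdict(list)
    # Assumed ordering: "A".."Z" first, then "AA".."ZZ".
    for t in sorted(titles, key=lambda t: (len(t.series), t.series)):
        public_domain = t.year < PUBLIC_DOMAIN_CUTOFF
        if public_domain and t.digitized:
            buckets["keep"].append(t.name)
        elif public_domain:
            buckets["needs_digitizing"].append(t.name)
        else:
            buckets["in_copyright"].append(t.name)
    return dict(buckets)


# Placeholder records, invented purely to exercise the function.
sample = [
    Title("AA", "Placeholder Novel Three", 1851, False),
    Title("B", "Placeholder Novel Two", 1943, True),
    Title("A", "Placeholder Novel One", 1885, True),
]
result = triage(sample)
```

The last step of the plan, the hard choices about scaling, digitization, and fair use, is deliberately left out: it is a judgment call that the three lists only help to inform.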

There are some texts that I already know I can discard. The ASEs assembled some “made texts,” short story collections by famous authors like Ernest Hemingway (his novels were excluded). There will certainly be more difficult choices to make about inclusion/exclusion. For instance, some texts, such as Moby Dick, were abridged to fit the specific production dimensions of ASEs. In these cases, I’ll have to decide whether to use the full-length version or discard the text entirely.

And I’ll also have to think critically about the sort of metadata I hope to assemble in the process. Author gender might be interesting (if infuriating). I was surprised to find that the most popular ASE was Betty Smith’s A Tree Grows in Brooklyn. Perhaps I expected something with more machismo, or perhaps I’ve just got Jonathan Franzen perpetually in the back of my head bashing women writers (god help me). Regardless, I’d be interested to see how author gender impacted the selection of books.

Given the CBW’s aims of “learning about our allies” and “learning about our enemies,” I would also be interested to track author nationality, or the book’s primary setting. Some of this can be collected as metadata (though I don’t want to put too much weight on authorship), but some of these questions can best be answered through analysis (named entity recognition for place names, for instance, to track primary settings). Through the process of building, I hope to develop some more hypotheses beyond my initial thoughts (to be shared later) that might help guide the analysis phase of the project.

I’ll clean the data and make the corpus (or as much of it as possible) available via GitHub, as I would love for others to join me in this analysis. And I’ll certainly be blogging about the process along the way.