O rocks! Tell it to us in plain images (A THATCamp/Bloomsday Visualization)

Some results from the Bloomsday hack session, where we discussed small digital Joycean projects we might take on that day and continue working on over the following weeks:

We decided to create a dataset that could be used in Gephi to make something both informative and pretty: a log of the social interactions among characters that could be turned into a social network visualization. Chad Rutkowski read through the Wandering Rocks episode and logged a list of character interactions, which I then turned into a dataset and manipulated in Gephi to produce this visualization:

Wandering Rocks visualization

Character nodes are weighted by the number of edges touching them (i.e. by how many interactions with other people a given character has), so unsurprisingly for anyone who’s read Ulysses, Father Conmee appears as one of the most connected characters in the episode.
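As a rough illustration of that weighting, here is a minimal Python sketch that counts the edges touching each character. The interaction rows are invented placeholders, not the list logged from Wandering Rocks, and `degree_counts` is a hypothetical helper name; Gephi can compute the same figure itself when ranking node size by degree.

```python
from collections import Counter

# Placeholder rows for illustration only; NOT the interactions logged from the episode.
interactions = [
    ("Father Conmee", "Character A"),
    ("Father Conmee", "Character B"),
    ("Character A", "Character B"),
    ("Father Conmee", "Character C"),
]


def degree_counts(edges):
    """Count how many edges touch each character (the node's degree)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


degrees = degree_counts(interactions)

# List characters from most to least connected.
for name, count in degrees.most_common():
    print(name, count)
```

A character who appears in more rows gets a higher count, which is what makes that node draw larger in the visualization.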

Our next step will be to answer some questions about types of character interactions and include these answers in our dataset:

  • Do we want to log the direction of an interaction to cover cases where, for example, a conversation is one-sided? (Yes, but this means creating two edges for every dialogue: Bloom > Molly as well as Molly > Bloom.)
  • What counts as an interaction? (Telegrams, letters, overheard shouts in the street?)
  • How do we handle time? (Is Bloom > Molly recorded only once in our Ulysses dataset, or every time they interact? If the latter, how do we decide when an interaction counts as ended?) If we can find a satisfactory and non-insanity-causing way of coding this, we could create a time-lapse visualization of interactions in the novel, perhaps with some sort of cumulative or heatmapping feature.
  • How do we handle different types of interactions? We discussed assigning different numbered weights to interaction edges so that it’s easy to see the degree of interaction taking place (was a character imagining a conversation with Bloom, or actually talking to him?), but it is difficult to decide which types of interaction are deeper than others in order to place them on a spectrum and make visual apprehension easier.
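One possible way to encode the direction and weighting questions above is sketched below in Python. Everything here is an assumption rather than the scheme we settled on: the interaction kinds and their numeric weights are made up, the rows are placeholders, and `directed_rows` and `to_edge_csv` are hypothetical helpers. The `Source`, `Target`, `Type`, and `Weight` column names follow the headers Gephi's spreadsheet importer is generally understood to recognize for edge tables; `Kind` is an extra column added for illustration.

```python
import csv
import io

# Hypothetical weighting scheme: higher numbers mean a "deeper" interaction.
INTERACTION_WEIGHTS = {"imagined": 1, "overheard": 2, "spoken": 3}

FIELDS = ["Source", "Target", "Type", "Weight", "Kind"]


def directed_rows(source, target, kind, mutual=False):
    """Build one directed edge, plus the reverse edge for a two-way dialogue."""
    weight = INTERACTION_WEIGHTS[kind]
    rows = [{"Source": source, "Target": target, "Type": "Directed",
             "Weight": weight, "Kind": kind}]
    if mutual:
        rows.append({"Source": target, "Target": source, "Type": "Directed",
                     "Weight": weight, "Kind": kind})
    return rows


def to_edge_csv(rows):
    """Serialize edge rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder interactions, not taken from the logged dataset.
rows = (directed_rows("Bloom", "Molly", "spoken", mutual=True)
        + directed_rows("Character A", "Bloom", "imagined"))
edge_csv = to_edge_csv(rows)
print(edge_csv)
```

A two-way dialogue becomes two rows and a one-sided interaction stays a single row. The time question could be handled the same way by adding a column that records the order in which each interaction occurs.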

Ben Schmidt also did some neat and quick work with the Circe episode, running a script to gather character names by grabbing the all-caps words in that section and mapping interactions stepwise by linking each character to the name that appears next after it in the text.
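Ben's script isn't reproduced here, but the general idea can be sketched in a few lines of Python. This is a guess at the approach, not his code: the regular expression, the `stepwise_edges` helper, and the sample text (invented lines in the style of a play script, not a quotation from Circe) are all assumptions.

```python
import re
from collections import Counter

# Words of two or more capital letters are treated as character names.
# This is a crude assumption: multi-word names would be split into separate words.
ALL_CAPS = re.compile(r"\b[A-Z]{2,}\b")


def stepwise_edges(text):
    """Link each all-caps name to the next all-caps name that follows it."""
    names = ALL_CAPS.findall(text)
    edges = Counter()
    for current, following in zip(names, names[1:]):
        if current != following:  # skip a name immediately followed by itself
            edges[(current, following)] += 1
    return edges


# Invented sample text for illustration; not a quotation from the novel.
sample = (
    "BLOOM: (Hesitates.) Good evening.\n"
    "ZOE: Are you looking for someone?\n"
    "BLOOM: No.\n"
    "ZOE: Come in.\n"
)

edges = stepwise_edges(sample)
for (source, target), count in edges.items():
    print(source, ">", target, count)
```

Each pair counts how often one name is followed by another, which gives a quick directed edge list without reading the episode by hand.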

We’ll continue working on this project as we have time, so if you’re interested in helping out, send us a tweet! The work involved is pretty easy: identifying a section of the novel you wish to attack, then making a list of the characters who interact and ID’ing the type of interaction according to a scheme we’re using.

Cross-posted from LiteratureGeek.com.

THATCamp thoughts from newbie

I thoroughly enjoyed my first THATCamp. I was a bit nervous, since I am from the LIS rather than the AH community, I do not have mad coding skills, and I assumed I would be on the older side. Well, I would now recommend a THATCamp to anyone. I felt like I was part of a community that had a common interest in learning and sharing. Even the workshops were not “teacher to student”, but peer to peer. There was no pressure at THATCamp to be an expert: all contributions were valued and respected, and you could just observe if you wanted to, but there was no barrier to participation either. At other conferences the formal sessions are usually sparsely attended because the real networking is happening in the hallways and bars, whereas at THATCamp the hallways were deserted during sessions, and the session rooms were buzzing. I participated in conversations about linked data, digital humanities project support, the role of libraries, and comics (about which I knew nothing). I took workshops on WordPress plugins, hybrid mobile design, and ViewShare. I ate far too many of the delectable Panera pastries I have been deliberately avoiding at home. I played a decoding game with the mysterious AgentQueue. I walked away with new friends, some great teaching ideas for my digital libraries class, a bunch of new people to follow on Twitter, a t-shirt, many session Google Docs for further exploration, and a determination to turn my friends on to THATCamp New England and CHNM next year. Thanks to all the hard-working folk who put this together, and to the sponsors.

#THATcamp report part 1: Roy Rosenzweig Forum on Technology and Humanities

x-post Knitting Clio

Hi folks,

I’m back from a busy four days at THATCamp CHNM (aka THATCamp Prime). I’ll start by discussing the fascinating presentation by Pamela Wright, Chief Digital Access Strategist at the National Archives and Records Administration, about the Citizen Archivist Dashboard, online projects created with the recently released 1940 census data, and other exciting digital projects from “our nation’s attic.” I thought Sharon Leon’s choice to use an interview format was excellent and made for a much more dynamic and engaging forum than a straight-up presentation. The Citizen Archivist Dashboard grew out of the Open Government Platform initiated by President Obama. The goal of Citizen Archivist is to make NARA’s documents more accessible while also serving as a forum for engaging the public in the intellectual work that makes accessibility happen. Pam realized that simply opening the archive’s data to the public without any guidelines would be like dumping out a load of raw cake batter: it might be yummy for the most dedicated enthusiasts (e.g. “Lincoln Lady”) but most people would like to have a “cupcake,” i.e. a specific task or subject on which to work (e.g. the Titanic is the featured “cupcake” right now).

So far, Citizen Archivist has been wildly popular: within two weeks of going live, the archive received 1,000 page transcriptions (by contrast, it took Sharon several years to reach the same number of transcribed pages for the Papers of the War Department). The 1940 census received 20 million hits the morning it went live. Pam hoped that one of the hackers at THATCamp or elsewhere would design a “pocket archivist” app that would allow users to upload images while they are doing research at NARA. She also asked for suggestions for other topics and projects to add to the initiative.

Another way that NARA engaged the public was in the redesign of its website. They received four choices from the designer and then let the public vote on which one they liked best. Voters overwhelmingly chose the simplest design (which many at NARA found too minimalist). This is something to keep in mind as my colleagues and I set out to redesign our department website. Perhaps we should survey our students to see what they want from a website?


Post links to session notes here

If there are notes from your session living in something like Google Docs, please post the link to it here in the comments. To ensure its preservation, you can also share it with , add it to the THATCamp Zotero bibliography. Thanks!

Special thanks to Aram Zucker-Scharff for setting up the overarching THATCamp Google Docs folder. I definitely need to make sure that it gets used more.

UPDATE: If you have trouble finding the Google Docs, go to docs.google.com/file/d/0B5gCrWfqDPTcaHZoY3dpek5fME0/edit?pli=1 where you can download (but not edit online) all the session notes from THATCamp CHNM 2012.

Display JPEG 2000 Images

JPEG 2000 holds the promise of lower storage costs for large collections of scanned documents. It can also minimize the bandwidth required to display image details.

One limitation is that most web browsers do not support this format, so a viewer is required. Omeka does a good job of creating thumbnails from JPEG 2000 images, but you would still want to view the image at full resolution.

Any suggestions for a good JPEG 2000 viewer or plugin that can handle multi-page documents?

Show me your data: Institutional Repositories

There has been a move in research toward publishing not just papers with the final results but also the raw data sets, and even the software, so that other researchers can verify the results and make further discoveries. There are even some futuristic claims that the data sets will be viewed as the ultimate results of research and the actual paper will be a secondary product.

I’d like to discuss what people’s experiences have been with institutional repositories. Has the goal been to showcase work, to preserve it for the future, or to play an active role in furthering discovery?

If you have created or maintained an IR, what issues did you face, and how well was the IR accepted by the researchers? Does data set sharing play a role?

Building a DH Culture from the Ground

So my proposal is late-breaking, but here ’tis: I’m currently moving to a new institution where I will help start a new DH center. I’d like to think collaboratively—well, about how that happens. I want to get at this question, however, not by talking about getting grants or picking a pithy acronym for the center’s name. Instead, I’d like to jump off Stephen Ramsay’s recent post, Centers are People, and think about how one begins building the kinds of communities where “a bunch of people…[are] committed to the bold and revolutionary project of talking to one another about their common interests.” I’d especially like to think about how to draw in those people on campus who are interested in DH but don’t yet know it: that history professor with a personal archive she’d love to make public, that librarian crafting the library’s ebook strategy, or that computer science undergrad with an odd side interest in Renaissance poetry. Topics might include:

1.) organizing and effectively promoting DH events to the wider college or university
2.) creating and fostering hacker-friendly spaces on campus
3.) building on-campus partnerships between departments, libraries, &c. &c.
4.) seeding DH incursions into the curriculum

This topic may well tie into hmprescott’s “More Disruptive Pedagogy: Thoughts on Teaching an Un-course” proposal or Kimon Keramidas’s “Of courses, curriculum, networks, and unconferences”.