Our project is the Kennecott Miner Records, a transcription project that focuses on personnel records of people who worked in the Kennecott Copper Mine from 1900 to 1919. We were able to get over 40,000 cards digitized at no cost through a partnership with FamilySearch. Still, we wanted to make the records available online in a structured format that could support both family history research and digital scholarship. We’ve been experimenting with Collections as Data at the Marriott Library recently, but on a smaller scale. For a quick look at what we’ve released so far, you can check out our GitHub repository. We’ve also published an article, “From Digital Library to Open Datasets,” in Information Technology & Libraries, that provides an overview of our early efforts.
The grant matched up well with many of the research goals that we have in Digital Library Services, as well as our primary mission of making digitized cultural heritage materials accessible. The main thing that we needed to get up and running with the project was the money to hire some additional student workers who can focus on transcription. Annie Jensen has worked on the project since fall 2019 and continues to do so, and we’re hoping to hire an additional student to work on the project as well. The goal with the DM funding is to create a small pilot collection that we can use as the basis to seek additional funding opportunities and to develop and refine transcription workflows. Ultimately, we want to get a sense of how many student hours we need to support the full transcription of the collection.
We’ve gained better insights into the time needed for this type of transcription work and created best practices for working with this type of material. We approached transcription by defining three tiers (minimal, extended, and full). We evaluated the efficiency and suitability of each tier to develop a workflow that would capture the most structured data efficiently. After experimenting with these approaches, we settled on an extended transcription workflow that captures all the data points that we think someone would need to create visualizations with the dataset. We have created several visualizations from these data points using Tableau Public, including a map of miners’ nationalities. Currently, we are not transcribing data about occupations and wages, because that information was written in a much less standardized fashion on the cards.
When Anna developed the application, she framed the project in terms of both individual scholarship and larger goals, like developing a regional dataset for Utah-centric digital scholarship and pedagogical activities. Knowing that the funds would help get us started with the project, but not complete it, she developed the project narrative with a limited scope in mind from the start.
In Digital Library Services, we are engaged in traditional descriptive metadata work to develop digital collections while also branching out into new areas such as digital exhibits and collections as data. There are always trade-offs in both time and funding as we pursue the idea of developing enhanced data in our digital collections that can support digital scholarship. We hope that working with collections like the Kennecott Miner Records will put us on an excellent path to developing similar collections that are just as exciting and that have the potential to support digital scholarship in a variety of disciplines.
–Anna Neatrour & Rachel Wittmann