Digitization Day

This is my conference report on Digitization Day at the University of Alberta. As it was written live it will have problems - please email me corrections. The Digitization Day was organized by the Histories and Archives Lab of the Canadian Institute for Research in Computing in the Arts (CIRCA).

See also my list of links: Digitization Day Links

Mary-Jo Romaniuk, Opening Remarks

The Acting Chief Librarian of the University of Alberta opened the Day. The Library is about more than just keeping information. They see a role in curating information, in helping people repurpose it, and in helping researchers use and interpret it. For this reason the Library is entering into more and more partnerships and thinking about things collaboratively.

Libraries used to be built on collections. Now they are also about supporting researchers and curating collections we build. The Library has two roles. There is a preservation and access role. There is a creative role now of curating new collections that repurpose materials in new ways.

The Library has one of the largest unique collections in North America. Given broad access over the net the role of libraries curating unique items becomes more important. There is also a convergence between libraries, archives, and museums. This is also true for research projects.

Their vision is to preserve things digital (and other) for 500 years.

Geoffrey Rockwell, Histories and Archives

I spoke about how we are digitizing humanities computing ephemera and administrative documents. We are trying to contrast them with how computing was being discussed in the public sphere.

Paul Hjartarson, Andrea Hasenbank and Harvey Quamen, Digitizing the Wilfred Watson Papers

Paul talked about the EMiC project that he is the U of A lead for. They are digitizing the Wilfred Watson papers and working with the U of Toronto, which holds the Sheila Watson papers. Wilfred and Sheila were arguably the most important Canadian modernists in the mid-century. Wilfred was a poet and playwright whose work was continuously evolving. Sheila was a novelist and short story writer.

Andrea talked about other projects like the Ostenso/Durkin Collaboration project and the Canadian Manifestos project. She also showed a slide of their workflow. They are trying to recognize the labor in this process. The Watson archives have 85,000 items.

Harvey talked about the technical setup. They are using a database for the metadata, which is then injected into the XML/TEI documents so that the TEI Header is generated, not hand coded.
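To make the idea of a generated TEI Header concrete, here is a minimal sketch in Python. It is not the project's actual code: the `items` table, its column names, and the helper functions are all hypothetical, and the header contains only the bare `fileDesc` skeleton (title statement, publication statement, source description). It only illustrates the general pattern of pulling a metadata row from a database and building the header from it.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements in the default namespace (no "ns0:" prefixes).
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name for a TEI element."""
    return "{%s}%s" % (TEI_NS, tag)


def fetch_record(conn, item_id):
    """Read one metadata row as a dict. Table and columns are hypothetical."""
    conn.row_factory = sqlite3.Row
    row = conn.execute(
        "SELECT title, author, publication_note, source_note "
        "FROM items WHERE id = ?",
        (item_id,),
    ).fetchone()
    return dict(row)


def build_tei_header(record):
    """Build a minimal <teiHeader> element from one metadata record."""
    header = ET.Element(tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))

    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = record["title"]
    ET.SubElement(title_stmt, tei("author")).text = record["author"]

    pub_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub_stmt, tei("p")).text = record["publication_note"]

    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = record["source_note"]
    return header


def make_demo_db():
    """Create an in-memory database holding one made-up item."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, "
        "author TEXT, publication_note TEXT, source_note TEXT)"
    )
    conn.execute(
        "INSERT INTO items VALUES (1, 'Sample item', 'Sample author', "
        "'Unpublished draft encoding.', 'Transcribed from a scanned page.')"
    )
    return conn


if __name__ == "__main__":
    record = fetch_record(make_demo_db(), 1)
    print(ET.tostring(build_tei_header(record), encoding="unicode"))
```

The point of the pattern is that a correction made once in the database shows up in every regenerated header, rather than having to be fixed by hand in each XML file.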

Geoffrey Harder and Leah Vanderjagt, By Spoon and By Shovel: Digitization and Repository Services at the U of A Libraries

Geoffrey started by talking about their repository services layer, ERA (Education and Research Archive). There is an accreditation process to becoming a "Trusted" digital repository. They are trying to support both large-scale and boutique projects. Boutique digitization is smaller scale, highly curated, and higher quality, with rich metadata.

The Library uses vendors to do a lot of large-scale scanning, post-processing and correction. Some of these vendors work on the Library premises. They tend to build their own digital asset management systems on top of Fedora. They use MODS (Metadata Object Description Schema). They can do faceted browsing, Google Maps hooks, and hooks to social media.

He talked about the large-scale mass digitization that they are doing with the Internet Archive (IA), which has a cost-sharing model. It depends on uniform workflows that minimize human intervention, and they can get significant savings per page. The IA is open, unlike Google. The University of Toronto and the U of Alberta have digitized close to 300,000 Canadian titles. U of A is specifically doing Canadiana - Canadian historical materials. U of A is actually hosting a lot of these materials and they show up in the catalogue.

What have we learned? Mass digitization leads to low cost and high impact. There is a long tail - once digitized, things see use that they wouldn't have seen in the archive. We need better policy and copyright rules. The question is how they can add value to the mass-digitized materials.

Leah Vanderjagt talked about ERA. ERA wants to be our trusted digital repository. They want all sorts of our stuff from posters, to draft papers to data. They have a strong team with Web interface developers, programmers, metadata specialists. They will mediate the accessioning or train us to do it. Above all they are willing to work with us so we can build our own interfaces.

Raymond Frogner, The Enormous Condescension of Cartography: Digitizing the Papers of William Pearce

He started by talking about the "moral defense of the archive". There is value in the interconnection of records in an archive and how they relate to the creator. He talked about the opening image to the Pearce archive. This image of Buffalo Bones at Saskatoon, Saskatchewan is symbolic of the settling of the West. Pearce was important to this history as a surveyor and administrator.

We had a discussion about authority standards and services.

Chris Want, AICT Visualization Services

Chris showed a 3D laser scanner which scanned in the background while he talked about the services they offer which include conferencing (Access Grid), visualization, scanning, 3D printing, and evaluating new technologies. He showed some of the archaeological uses of the 3D printer in the humanities.

Their other roles include data conversion, and they do a lot of graphics and animations.

Andriko Lozowy, "Youth Picturing Place" and the Intermedia Research Studio (Sociology)

Andriko started by talking about the CFI funded Studio and the type of work they do in Sociology. He asked what you do with a room of aging equipment. They are shifting to supporting digital photography for research. He showed a project about youth and Fort McMurray where they put the equipment into the hands of high-school aged students who could then document their surroundings. He talked about the interesting ethical issues of showing images by youth. He also talked about the artistic issues of working with the youth to photoshop the images. They started with 2000 images and went down to 20 that they printed. What do they do with the 20 images other than write a paper? He left us with questions about what to do with this collection of images and ethnographic materials.

The Intermedia Studio lends out the equipment they have to Arts researchers.

Gary Kelly, David Buchanan, and Mark Madsen, Streetprint of StreetSuite

Gary started by talking about training graduate students and finding support from them. He has been asking the question "What is it to do research and report about research in the digital environment?" He works by networks recruiting students. Streetprint is a database program for creating web sites of print materials.

Mark talked about the importance of digitizing for mobility. David talked from the perspective of a user who is using Streetprint for a project like Streetprint Bratislava.

Pierre Boulanger, Digitizing The El-Dorado

Pierre talked about digitizing gold objects from Colombia. He worked with a museum (Museo del Oro) that has a number of historic gold objects associated with the pre-Columbian peoples of Colombia. Pierre developed a 3D scanning system that does high-quality scans. They created digital proxies that can be studied and manipulated instead of the original.

They also had a project to present the 3D objects back to an audience in a way that has haptic feedback so people get a feel for the weight and touch of the object. Pierre talked about a "non intellectual relationship" with the object.

Pierre talked about standards for 3D, like MPEG-4, that are industry standards. The key is the clients that can interpret the data. He also talked about how museums are very careful about not letting high quality 3D scans out, for fear of forgers. The museum community is not big into openness.

David Desheneau, Digitization and Repository Infrastructure at folkwaysAlive!

David talked about the folkwaysAlive! project which is a collaboration with the Smithsonian. This collection is based on Folkways Records, which released 2,168 albums of mostly non-commercial materials. The collection was donated to U of Alberta. The folkwaysAlive! folk have some virtual projects, they release new recordings, and they organize events.

He talked about the complexities of metadata for music and then about the Virtual Museum of Canadian Traditional Music. This project led to technical problems given that they have 4 TB of data. They created a cluster of Linux boxes to do transformations.

MuDoc is a multimedia (Music Documentation) project to create a fully federated multimedia database. It had social peer-review and all sorts of stuff, but in the end it is not used.

They now have a project gathering materials for South Asian Music and Culture in Canada. This raised interesting questions about community projects.

Peter Baskerville, Digitization and Quantification: A New Paradigm for the Humanities

Peter feels that we need to take numbers seriously. We can't continue to ignore quantitative data the way we have. The humanities have prejudices against quantitative data that lead to our ignoring them.

Peter went on to talk about three projects: the CCRI project, the Last Best West, and LINC. The Canadian Century Research Infrastructure (CCRI) is digitizing census data and integrating with other national projects. The Last Best West is digitizing 200,000 homestead applications so that this info can be linked to other collections. They hope to add GIS to this as homesteads are obviously connected to place. Linked Infrastructure for National Censuses (LINC) is trying to connect (link) people across censuses so that you can follow people across time.
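As a rough illustration of what linking people across censuses involves, here is a small Python sketch. It is not LINC's method, which was not described in the talk. It is a deliberately simple deterministic rule that I am assuming for illustration: two records are a candidate match if the normalized names and birthplace agree and the implied birth years are within a tolerance. The `CensusRecord` fields and the sample data in the usage note are made up.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CensusRecord:
    """One person in one census. Fields are hypothetical."""
    record_id: str
    census_year: int
    given_name: str
    surname: str
    age: int
    birthplace: str

    @property
    def birth_year(self):
        # Reported ages are approximate, so this is only an estimate.
        return self.census_year - self.age


def normalize(text):
    """Lowercase and keep letters only, so "O'Brien" matches "OBrien"."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def is_candidate_match(a, b, year_tolerance=2):
    """True if names and birthplace agree and birth years are close."""
    return (
        normalize(a.surname) == normalize(b.surname)
        and normalize(a.given_name) == normalize(b.given_name)
        and normalize(a.birthplace) == normalize(b.birthplace)
        and abs(a.birth_year - b.birth_year) <= year_tolerance
    )


def link_censuses(earlier, later, year_tolerance=2):
    """Return (earlier_id, later_id) pairs that match one-to-one.

    Ambiguous cases (one person matching several records, or several
    people matching the same record) are left unlinked.
    """
    tentative = []
    for a in earlier:
        candidates = [
            b for b in later if is_candidate_match(a, b, year_tolerance)
        ]
        if len(candidates) == 1:
            tentative.append((a.record_id, candidates[0].record_id))
    # Drop links where the same later record was claimed more than once.
    claims = Counter(later_id for _, later_id in tentative)
    return [pair for pair in tentative if claims[pair[1]] == 1]
```

For example, a made-up 1901 record for "Mary O'Brien", age 10, born in Ontario, would link to a 1911 record for "mary OBrien", age 21, born in "ontario", since the implied birth years (1891 and 1890) are within the tolerance. A "John Smith" with two equally plausible 1911 candidates would be left unlinked. Real historical linkage has to cope with spelling variation, name changes, and migration, which an exact-match rule like this one does not handle.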

Eleni Stroulia, Digital Selves

Eleni talked about tools that she is developing in the context of the history of the web. Her short history goes:

  1. A web of information - facts
  2. A web of applications - services, e-commerce
  3. The personal web - opinions
  4. The social web - social groups

She talked about research she has done on collaboration and influence. They have developed tools like Annoki to support formal groups like GRAND, CIRCA and CWRC. She is now developing SociQL, a Social Query Language for people interested in exploring datasets to understand the social.

Benjamin Tucker, Digitizing Indigenous Language Materials

Ben talked about the Alberta Phonetics Laboratory and how they are trying to rationalize the variety of oral recordings they have. He talked about a specific project digitizing Tsuut'ina materials which were all on cassettes in a box. This is a collaborative project where the university provided equipment, training, and long-term backup while the community provided language and cultural expertise. In the end both benefit.

Ben then talked about all the other audio digitization projects at the Lab. He closed on storage issues. Now they do DVDs, remote servers, and external drives. During questions Ben talked about how speech recognition isn't there yet.

Frank Tough, The Metis Byte Back, matriX, MAP and the Digitization of Historical Records Relating to the Struggle for Constitutional Rights

Frank and his team have created a digital archive that can be used in land claims and land claims research. They are now building a database for the Metis National Council. Theirs is a social science approach that digitizes documents and adds metadata. This has been an inter

Metis Archive Project (MAP) is the name of his project now. He brings a lot of undergraduates to Ottawa to the national archives. A big issue is maps, which don't get the respect that text gets. His philosophy is to try to do it at the highest quality. He showed a number of photos of the process.

Each of their projects is documented in a Data Entry Rules and Standards Manual that will allow projects to be restarted or understood later.

Keavy Martin, The Ethics of Archiving Indigenous Stories

Keavy talked about a project that is stuck. She talked about The Three Lives of Thrasher, a prison autobiography that is out of print. She is wary of the role of editor and control over material. She thought that digitization would be the solution, but not everyone thinks about digitization and access in the same way that we do in the academy. The family wasn't happy with the book. This got Keavy thinking about Enlightenment principles of access and openness. There have been warnings about this. People talk about the social life of stories. Sometimes digitizing disempowers some people.

This creates the possibility that the texts will be preserved in absentia where we talk about what wasn't. Is failure a form of access?

We had an interesting discussion about how to return control from university researchers to the community and respect them. We need to question our assumptions and unlearn our habits. Even the notion of community breaks down.


At the very end we had a lightning (or lighting) round to raise questions and discuss what issues there are:

  • How do we help new people getting started?
  • How do we train project staff?
  • Pushback and a lack of trust from donors - they either don't want it digitized or don't think it is useful
  • Problems about how to help new projects
  • How to get academic credit for these projects
  • How to find out about what technologies are out there
  • How to find out what other projects are out there

Things to do:

  • How to start a digital project for dummies
  • Who has what technologies that one can sign out
  • Who has what expertise that they are willing to share
  • Help educating the community
  • What techniques are there to deal with stuff - home brew solutions
  • What are the standards to pay attention to
  • How to explain the language of digitization



Page last modified on December 19, 2010, at 04:24 PM