Panel 22: Access and Description: Current Trends (2015)
Moderator: Bronwen K. Maxson, Indiana University-Purdue University Indianapolis (IUPUI)
Rapporteur: Viviane Ferreira de Faria, University of New Mexico
Presenters:
Daniel Schoorl on behalf of Orchid Mazurkiewicz, Hispanic American Periodicals Index (HAPI)
Lost in Translation/Traducción/Tradução: Building a Trilingual HAPI
Wendy Pedersen, University of New Mexico
Discovery through Acquisitions: Colonizing WorldCat with WMS
Timothy Thompson, Princeton University
Descrever é preciso: Adding Item-level Metadata to the Leila Míccolis
Brazilian Alternative Press Collection at the University of Miami Libraries
Presentation 1
Daniel Schoorl presented on behalf of Orchid Mazurkiewicz. The moderator presented the biographies of both Daniel Schoorl and Orchid Mazurkiewicz, who have been working together since 2009.
Daniel introduced the presentation with a short description of HAPI. Last fall, HAPI's new version was launched in English, Spanish, and Portuguese, and the new indexing is inspired by the American Model.
Daniel described the first version of HAPI Online, launched in 1997, and the second version, launched in 2007. Comparing the two, he noted that the 2007 version, which added Spanish and Portuguese interfaces, offered essentially the same subject headings as the 1997 version, plus terms in Spanish and Portuguese that simply redirected to the English headings. New headings were also added and existing headings modified along the way.
Then, Daniel moved on to the new version, which is also trilingual and allows trilingual searching with autocomplete prompts. The database also contains smaller amounts of content in French, German, and Italian.
The presenter stated that the way people use HAPI has changed drastically, and the move to a trilingual version was driven by a desire to make content more accessible to Spanish- and Portuguese-speaking users. The new HAPI provides Spanish and Portuguese translations of the main subject headings. Beyond the trilingual interface, the major shift in this version is the translation of the complete subject thesaurus, which makes trilingual subject searching possible: a search in any language version retrieves all three language versions of a subject heading. A user can search a subject in one language, aided by autocomplete prompts, and will see the corresponding headings in the other languages.
As their work showed, the real shift lay in doing away with English as the dominant language and creating a trilingual subject file, with translations of all subject headings and subdivisions. International standards for developing multilingual thesauri identify basically three types of issues to be addressed: administrative, linguistic, and technological. Creating a multilingual thesaurus involves giving equal treatment to all languages: ideally, each language should have a fully developed thesaurus, structured with all semantic relationships (equivalence, association, and hierarchy), built without reference to the terms or structure of an existing thesaurus. Otherwise the source language becomes the dominant language, with the result that the target languages merely reflect it rather than the target cultures. Because a monolingual thesaurus is always culturally biased, a straight translation can be considered a form of cultural imperialism. It is a management decision, and the choice often made, for obvious economic reasons, is to translate the already existing thesaurus. HAPI's is an English thesaurus with translations of the main terms, but when languages have equal status, every preferred term in one language should be matched by an equivalent preferred term in the others. There are therefore decisions to be made to avoid literal translations that turn source-language terms into meaningless expressions in the target language.
The presenter reinforced the importance of taking the following issues into consideration: equivalence issues (for instance, when the target language does not contain a term that corresponds in meaning to one in the source language) and technological issues (a developer will say that almost anything is technically doable; it is just a question of what you can afford). In light of these considerations, the project aimed to create a data structure that provides the maximum flexibility they could afford.
In 2013, a new editorial platform for HAPI was created: HAPI Central. The system completely transformed the way the data and the editorial process are managed. Daniel showed the record for 'political campaigns' with its Portuguese and Spanish translations. Because indexing is done in only one language, terms have to be identified in each language and applied separately, but at the same time the connections between terms allow multilingual searching across all three languages. For example, a user does not have to be in the Spanish version of the database to search successfully with Spanish subject headings.
According to the presenter, the weakness of the structure is that it offers little flexibility when there is no one-to-one equivalence between terms. The data structure is relational: each indexed article points to the subject heading record associated with it, and the trilingual display is very simple. Daniel showed an example of the same article displayed in the three language versions of HAPI Central.
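To picture that relational idea, the following minimal sketch (using Python's built-in sqlite3 module; the table and column names are hypothetical, not HAPI Central's actual schema) links each indexed article to a single subject heading record carrying the English, Spanish, and Portuguese forms, so that a search in any of the three languages resolves to the same heading and retrieves the same articles:

```python
import sqlite3

# Hypothetical two-table layout: one record per subject heading, with its
# three language forms, and one record per indexed article pointing to it.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE subject_heading (
    id INTEGER PRIMARY KEY,
    term_en TEXT, term_es TEXT, term_pt TEXT
);
CREATE TABLE article (
    id INTEGER PRIMARY KEY,
    title TEXT,
    subject_id INTEGER REFERENCES subject_heading(id)
);
INSERT INTO subject_heading VALUES
    (1, 'Political campaigns', 'Campañas políticas', 'Campanhas políticas');
INSERT INTO article VALUES
    (1, 'Sample article on elections', 1);
""")

def search(term):
    # Match the query against any of the three language forms of the heading,
    # so the user does not have to be in a particular language version.
    return con.execute(
        """SELECT a.title FROM article a
           JOIN subject_heading s ON a.subject_id = s.id
           WHERE ? IN (s.term_en, s.term_es, s.term_pt)""",
        (term,),
    ).fetchall()

print(search("Campanhas políticas"))  # the Portuguese query finds the article
print(search("Political campaigns"))  # and so does the English query
```

The one-to-one link between article and heading is also where the weakness Daniel described shows up: a structure like this has no natural place for headings that lack an exact equivalent across the three languages.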
Daniel described the process for creating the heading translations and illustrated its complexity with the numerous headings for specific indigenous groups. The process involved consulting the Brazilian National Library (for Portuguese), the Mexican National Library (for Spanish), and the Library of Congress (for English), as well as the lcsh-es.org website. The team had to choose among the different options and sometimes came up with headings of their own, drawing on the literature indexed in HAPI and terminology found on the web. The list was reviewed by a translation company that uses native speakers, and HAPI staff then reviewed it again. Overall, the process took five to six months to translate around 3,000 headings, including subdivisions.
Daniel also touched on a couple of issues that posed difficulties during the process. For instance, the standard does not require structure, but the HAPI system does. He used the term 'land reform' (agrarian reform) as an example of how non-preferred direct translations can create duplicates and circular references, and the small-business term 'pequenas e médias empresas' to demonstrate HAPI's comprehensive approach to the translations. Another example was 'biomass energy' (biofuel and biogas), whose Portuguese translation, 'biocombustíveis', was supported in the HAPI update by cross-checking with the Brazilian National Library. In a further example, they decided to use the Spanish term 'comunidad andina' as the preferred heading instead of the old system's original heading 'Indian community'; changing the original English heading proved advantageous, since there are now three language versions of the term and all references associated with it can be found.
The presenter closed with a brief overview of what is ahead for HAPI. With a browse-subjects option, one will be able to search different keywords, see the preferred (used) headings, and be redirected from non-preferred terms (e.g., from 'Healthcare' to 'Health').
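As a rough sketch of how such redirects could behave (a hypothetical data structure; the terms are taken from the examples above, not from HAPI's actual subject file), non-preferred entry terms can simply map to the preferred heading, which in turn carries its Spanish and Portuguese equivalents:

```python
# Hypothetical trilingual subject file: preferred headings with translations,
# plus non-preferred entry terms that redirect ("use instead") to them.
PREFERRED = {
    "Health": {"es": "Salud", "pt": "Saúde"},
    "Land reform": {"es": "Reforma agraria", "pt": "Reforma agrária"},
}
NON_PREFERRED = {
    "Healthcare": "Health",           # browsing "Healthcare" redirects to "Health"
    "Agrarian reform": "Land reform",
}

def resolve(term: str):
    """Return the preferred heading and its translations for an entry term."""
    heading = NON_PREFERRED.get(term, term)
    return (heading, PREFERRED[heading]) if heading in PREFERRED else (None, {})

print(resolve("Healthcare"))  # ('Health', {'es': 'Salud', 'pt': 'Saúde'})
```

A mapping of this kind is also what keeps non-preferred direct translations, such as the 'land reform'/'agrarian reform' case mentioned earlier, from turning into duplicate or circular headings.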
Presentation 2
Wendy Pedersen, University of New Mexico
Discovery through Acquisitions: Colonizing WorldCat with WMS
The moderator introduced Wendy Pedersen and presented her biography.
The presenter introduced the topic by defining WMS, WorldShare Management Services: a web/cloud-based system that no longer requires a local server, update file loads, or overlay record imports. WMS was acquired for a consortium of 17 libraries to replace III Millennium, which was client-based and maintained on servers at UNM.
According to the presenter, the change to WMS has required UNM librarians to internalize certain changes in the vocabulary of acquisitions and cataloging. Wendy used a comparative approach to map the vocabulary of the old system to that of WMS. For instance, the Integrated Library System (ILS) is now a Library Services Platform (LSP). Another change is the shift from having a catalog record to working with metadata instead. Also, in the new system, receiving is cataloging and cataloging is receiving.
As Wendy pointed out, to place an order one searches the backend interface, discovering items by searching WorldCat. The system returns various options and, with some luck, the item will already be available in WorldCat; once found, it can simply be added to the order. The acquisitions ordering staff are trained to pick the best record, which is very much like copy cataloging. The new system lets the ordering staff do several things: add the item to a purchase order, apply a template when necessary, add a fund, change the processing type from zero to monograph, enter the shelving location if it is known, and so on. However, WMS does not record the date on which the book or item was actually received. Because 'receiving' means something else in WMS, Wendy and her team had to find another way to refer to the moment the physical pieces come into the building; they call it 'checking in'.
Wendy stated that, in the catalog, the receiving function actually pulls up the record and gives you a code number; you scan in the barcode, hit enter, the item is in the catalog, and you are done. From there, Wendy walked through the process of changing an item's location in WMS when needed and explained how to verify whether the record being displayed is the correct one. She also showed that the system allows messages to be added as short or longer local public notes. By demonstrating these WMS tools, along with a few possible "hiccups", the presenter showed how easy the system is to navigate.
In the next segment of the presentation, Wendy pointed out that difficulties arise when Latin American books have no record in WorldCat. As her statistics showed, 25% of the works received on approval plans from Latin America are not found in WorldCat at the time of receipt. The lack of such records hinders the generation and payment of approval invoices and the creation of purchase orders, since each title must exist in WorldCat before it can be added to a purchase order. There is no such thing as a temporary or masked record, because WMS is live. The presenter therefore walked through an example of how to make a book discoverable when its record is not available.
The presenter highlighted some results of the UNM catalogers' experience since WMS was implemented about a year ago. In that time, her team has created over 1,100 records for Latin American monographs that were not otherwise 'discoverable', making those works visible to the broader community through UNM's use of WMS for acquisitions. She also mentioned that, with WMS, the UNM Latin American Technical Services team can make better original contributions to WorldCat in the ordering process, from the creation of more substantial K-level records to their subsequent upgrade to full level by UNM's own catalogers after the backlog has aged two to three months.
The presenter also proposed some takeaways from the adoption of WMS. On the one hand, the system may be good for business: it creates efficient workflows for the acquisition of mainstream materials; it streamlines cataloging and item-creation processes; it automatically updates the catalog with the latest bibliographic enhancements; and it forces discoverability for less common library materials. On the other hand, WMS may not be so good for the cataloging profession: its interface is all point-and-click with dropdowns; there is more work in acquiring less mainstream materials; non-catalogers can alter records or delete holdings; and it populates WorldCat with a certain number of junk records, leaving the professional cataloging to "someone else".
Wendy closed her presentation on the UNM’s migration to WMS by suggesting further reading of the following articles:
1) Sever Bordeianu and Laura Kohl, “The Voyage Home: New Mexico Libraries Migrate to WMS, OCLC’s Cloud-Based ILS”, to be published in Technical Services Quarterly (v. 32, no. 3).
2) Claire-Lise Benaud and Sever Bordeianu, “OCLC’s WorldShare Management Services: A Brave New World for Catalogers”, Cataloging & Classification Quarterly, DOI: 10.1080/01639374.2014.1003668, http://dx.doi.org/10.1080/01639374.2014.1003668.
Presentation 3
Timothy Thompson, Princeton University
Descrever é preciso: Adding Item-level Metadata to the Leila Míccolis
Brazilian Alternative Press Collection at the University of Miami Libraries
The moderator presented Timothy’s biography.
Timothy started by noting that the presentation had originally been designed as a “Roda Viva” presentation.
First, the presenter showed an outline of his metadata project, divided into five parts: Background; Timeline; Approach; Metadata Enhancement; and Data Transformation and Analysis.
Timothy began with the project background by introducing Leila Míccolis, a Brazilian poet and activist who was especially productive in the 1970s and 1980s and was involved in the zine scene and underground networks during the dictatorship. The collection spans mostly the 1960s to the early 1990s, though there are some more recent materials as well, and it includes a sample of Brazilian alternative press publications such as 'Lampião da Esquina' and 'Opinião'.
The presenter read the collection's mission statement and description.
Then, the project timeline was briefly presented. In 2006, acquisition, processing, and an inventory of the individual publications took place under the guidance of the University of Miami Libraries. Timothy presented a sample of the PDF inventory, which gave some information about the collection but did not really make the publications accessible to users. In 2010, the University of Miami and other institutions around the Caribbean became involved in a project called the Collaborative Archive from the African Diaspora. In 2013, metadata enhancement for the Leila Míccolis collection was funded by a grant using Collaborative Archive from the African Diaspora funds, and the enhancement focused on the representation of Afro-Brazilian identity within the collection.
When explaining the approach to the project, the presenter highlighted the reigning paradigm in archival practice: more product, less process. Under this paradigm, archivists seek to make their collections quickly available so that researchers have some kind of access, rather than spending a lot of time on higher-level description of each folder and each piece. The focus is to put the collection out there so people have immediate access to it; when time allows, archivists go back to the files and add to the metadata. Building on this paradigm, Timothy described his own experience working with metadata with minimal resources in a sustainable way, and affirmed that the project may serve as a foundation for a model of metadata work with limited funding. He concluded the segment by introducing the concept of archival context and thematic focus, with metadata librarians complementing archival research.
In the following segment of the presentation, Timothy presented the metadata enhancement template used in the project. The template had a streamlined metadata format of 54 elements for things like title, creator, contributor, description, publisher, and dates: the bare-bones, core elements necessary for discoverability. They used the PDF inventory as a basis and split it up into individual records in the template, which a student then filled in by hand. The student recruited for this task was Brazilian, worked ten hours per week, and had taken classes with Professor Butterman for her major in gender studies.
The presenter also described the content of the metadata enhancement. The collection is very large, about 120 boxes, and the enhancement took a thematic approach focused on Afro-Brazilian identity. They looked for individual poems, articles, or special issues that had some relation to, or some representation of, Afro-Brazilian identity. The work was not necessarily systematic; it involved skimming the issues and looking for relevant material, but whenever the student found something relevant, she provided an in-depth description of the issue or title. She would include all the contributors to that issue, the contents related to the thematic focus, geographic information, and other core metadata not available in the inventory, and she would also record each contributor's role in the entry. Timothy then showed an example of the contributors for one publication.
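To make the item-level template concrete, here is a minimal, hypothetical sketch of what one such record might contain (the field names are common core elements in the spirit of the template described above, and the values are placeholders, not actual collection data):

```python
# Hypothetical item-level record: core descriptive fields plus the
# contributors, roles, and thematic access points added by the student.
record = {
    "title": "Sample alternative press issue",            # placeholder title
    "contributors": [
        {"name": "Leila Míccolis", "role": "author"},
        {"name": "Contributor X", "role": "illustrator"},  # placeholder
    ],
    "description": "Special issue with content related to Afro-Brazilian identity.",
    "publisher": "Placeholder publisher",
    "date": "1979",                                        # placeholder date
    "coverage": ["Rio de Janeiro"],                        # geographic information
    "subjects": ["Afro-Brazilian identity"],               # student-supplied vocabulary
}

# Splitting the PDF inventory into one record per item like this is what
# makes individual publications discoverable rather than buried in a list.
for field, value in record.items():
    print(f"{field}: {value}")
```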
Timothy presented a breakdown of the data transformation and analysis by showing the finding aid's container list as a sample of the entries, with controlled vocabulary supplied by the student. He demonstrated how the archives management software works and explained that the student created her own controlled vocabulary, which was helpful given the nature of these publications; this is extremely important, since adequate headings may not exist in the Library of Congress subject list, for example. Timothy also pointed out the importance of social network analysis and of the relationships in the data. For the presenter, these social networks are fascinating and help document several forms of resistance, and he mentioned the network graphs built in Gephi. He referred those interested in more technical information about such network graphs to the article "Modeling Afro-Latin American Artistic Representations in Topic Maps: Cuba's Prominence in Latin American Discourse," Digital Humanities Quarterly 7.1 (2013).
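As a minimal sketch of how such a contributor network could be assembled for Gephi (assuming the networkx library; the contributor names and issues are placeholders, not data from the collection), contributors who appear in the same issue can be linked by co-occurrence edges and exported as a GEXF file that Gephi opens directly:

```python
from itertools import combinations

import networkx as nx

# Placeholder data: which contributors appear together in which issues.
issues = {
    "Issue A": ["Leila Míccolis", "Contributor 1", "Contributor 2"],
    "Issue B": ["Leila Míccolis", "Contributor 3"],
}

G = nx.Graph()
for issue, contributors in issues.items():
    # Link every pair of contributors who share an issue; repeated
    # collaborations increase the edge weight.
    for a, b in combinations(contributors, 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

# Export to GEXF, the XML graph format Gephi reads natively, then take a
# quick look at who is most connected.
nx.write_gexf(G, "contributors.gexf")
print(sorted(G.degree, key=lambda pair: pair[1], reverse=True))
```

In Gephi, a file like this can then be laid out, filtered, and inspected in the data laboratory, much as in the demonstration described below.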
The presenter closed by drawing some conclusions about the project's outcomes and limitations and by giving a quick demo of network analysis in Gephi. Among the outcomes, Timothy highlighted the opportunity to provide enhanced access to individual publications; the rich learning experience the project provided (documented in a blog post); the opportunity to collaborate with faculty (Professor Butterman); the cultivation of the donor relationship (project achievements were shared with Leila Míccolis via a Facebook page, and there has been a lot of follow-up work since the original acquisition); and the opportunity to explore and analyze a new dataset. He also pointed out some limitations of the project, such as the use of XForms rather than oXygen, the fact that the EAD profile (Archon) cannot accommodate the enhanced item-level data, and the fact that the controlled vocabularies have not been reconciled.
In the quick demonstration of network analysis in Gephi, Timothy showed that the software is a powerful analytical tool that connects the information and links it by affinity according to its settings. During the demonstration, Ruby Gutierrez from HAPI asked whether the software shows where the contributors are located. Timothy clarified and showed how the graph works for the Afro-Brazilian identity focus of the Leila Míccolis project: São Paulo, for instance, appears with its own network, and some authors are identified as Míccolis collaborators. The data laboratory allows you to look at the underlying numbers, and there are different views, layouts, and options to save as PDF, among others. The presenter stated that with this tool one can produce a higher-level enhancement of the metadata with many different potential outcomes and uses.
Questions:
● Jessie Christensen from BYU asked Wendy to elaborate on the relationship between acquisitions and cataloging. Wendy said she couldn't offer any solid conclusions, since the boundaries between cataloging and acquisitions are fuzzier than ever. A lot of material now comes in shelf-ready, which has increased the disconnect. The people in acquisitions who do the ordering have had to be trained to recognize what is and is not an acceptable record, and it is not yet clear how well that is going. She noted that material still arrives in cataloging needing attention and, with a few examples, said that everyone is still trying to figure out their roles.
● Bark Burton from Notre Dame asked Wendy about K-level record creation. Wendy said that she starts from scratch, puts the record in the backlog, lets it age for about two months, and then goes back to perform the enhancement. For probably a quarter of the books for which she created K-level records, she had to come back and enhance them, either completely or to finish off what somebody else had started on top of her work.
● Erma (…) from MLA asked Daniel about the timing of the project he presented. Daniel described the timeline and Ruby added to it. Daniel also elaborated on the Portuguese- and Spanish-language records becoming searchable at some point.
● Ruby Gutierrez from HAPI asked Daniel if the English version will pull up the records for the Portuguese language. Daniel answered that they will.
● Timothy asked Daniel about working with translation companies and ongoing translations. Daniel said that, in the past, translation companies were involved in the HAPI online project; but, now, they prefer to recruit in-house by actively adding native Portuguese and native Spanish speakers to the staff.
● Timothy asked Daniel about making the thesaurus and dataset open for download in order to enable collaboration. Daniel said that it is a desirable move for the future. Ruby elaborated on the answer, using the example of the old website, which was open but not sustainable. She said it is possible to do data dumps and that Orchid would be willing to develop it. The database is MySQL.
● Timothy asked Wendy which vendors she works with, whether they are approval vendors, and whether they provide cataloging information for her records. Wendy listed the vendors and what sort of information they provide to aid her creation of records.
● Timothy asked Wendy whether her record becomes the master record. Wendy responded that it does and that people should be upgrading the records she creates.
● Diana Restrepo from a library in Colombia talked about the experience in cataloging in Colombia and how they are tackling buying and cataloging books from all over Latin America. Ruby asked where they are getting their terms from. Diana talked about the process (multi-meetings, policy decision, develop own terms).
● Daniel asked Wendy what kind of training is provided by OCLC. Wendy said that they are very supportive and provided great training. The mechanics of the system differ from the database they had used for 30 years, and both acquisitions and circulation staff received quite a lot of training.
● Brenda Salem from the University of Pittsburgh asked Daniel what was done about the additional descriptors in HAPI, since they involve a lot of indexing in English. He said that they maintained their policy of keeping them in English: even though Portuguese and Spanish headings appear in the record view, the English descriptor keywords still appear, and that consistency was maintained.