Sunday, 23 February 2014
In class this week, Dan McLaughlin, one of the founders of the Pasadena Digital History Collaborative (PDHC), talked to us about assigning subject headings to photographs. Through my HistoryMakers fellowship, I had been taught to use a given subject heading if at least 20 percent of the oral history was about that subject. I knew that would be hard to apply to a photograph, so I was looking forward to Dan’s thoughts on the assignment of subject headings. The advice he shared that has stuck with me is, “If someone was looking for an example of ‘x’ from a certain time period, what would ‘x’ be?” In other words, if there is a faint outline of a bird in the background, birds should not be a subject heading because there is nothing to be learned about birds from that image. Just as The HistoryMakers had to establish some local practices, the PDHC has its own: Dan shared the subject heading handbook that the PDHC uses for consistency when dealing with images that its catalogers come across frequently. One thing that the PDHC encourages that The HistoryMakers did not is the use of a notes field to record any relevant findings that a cataloger comes across while performing subject heading research. I understand that this could spiral out of control for very detail-oriented catalogers, but it does create richer metadata records that enable more contextual linkages over time. Cataloging managers have tough decisions to make when deciding how detailed the records should be throughout a given project. Dan showed us some resources on the Los Angeles and Pasadena Public Library websites that would help us identify the people, buildings, and businesses in the images that we would be working with on our homework assignment. Both of the images that I worked with had some significant historical context that I was happy to include in the notes field of the record. Thanks to this class, I now know who Baron Michele Leone is!
The following Tuesday, Dan came back to our class to give commentary while everyone took turns showing our assigned images on the projector and sharing how we arrived at our particular choice of LCSH terms. The class seemed to drag on as the same individuals chimed in to give me and my classmates additional terms to search on the Library of Congress subject authorities website. Once again, the subjectivity of cataloging photographs ran up against the criteria of the assignment. After four or five terms, I was ready to move on to the next photograph, but some people seemed intent on staring at an image until they had exhausted all of the possibilities. In conclusion, I would love to be a photograph cataloger with a sensible manager who understands when enough is enough... according to me :)
In class this week, we continued our discussion of Dublin Core metadata elements. We spent a significant amount of time looking at how different institutions manage the “rights” field in their metadata records. Some require users to contact the department to determine terms of access, and others use blanket statements about fair use, public domain, and relevant copyright laws. My favorite came from East Carolina University, which had a rights statement related to orphan works that essentially asked end users to let them know if a particular image fell under copyright and should be taken down. I liked it because it did not assume that the cataloger was an authority on the content of the digital object, and it encouraged the general public to participate in identifying the origins of the image. Linda went on to help us differentiate between the “type” and “format” elements in Dublin Core. I find that the distinctions are easier to discern if the cataloger has a good grasp of what the record is describing: the analog item or the digital object. For instance, a metadata record for a physical photograph (analog) would have a dc:type value of “image” and a dc:format value of “8x10”, while a metadata record for a scanned photograph (digital) would have a dc:type value of “image” and a dc:format value of “image/jpeg”. It also helps if I can remember that the DCMI Type Vocabulary is the controlled vocabulary that populates “type”, while MIME types (Internet media types) are the controlled vocabulary that corresponds to “format”. Today’s class also featured discussions on the medium, extent, coverage, description, and subject elements. My experience with archives has enabled me to get familiar with Library of Congress subject headings, but there are so many more vocabularies to learn about; I’m looking forward to utilizing the Art and Architecture Thesaurus (AAT), Thesaurus for Graphic Materials (TGM), Union List of Artist Names (ULAN), and Getty Thesaurus of Geographic Names (TGN).
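To make the type/format distinction concrete, here is a minimal sketch of the two records as Python dictionaries. The record contents, the bracketed title, and the helper names are hypothetical and only for illustration; the MIME pattern is a loose check I am assuming for the example, not a full validator. Note that the DCMI Type Vocabulary spells its terms with a capital letter ("Image").

```python
import re

# The twelve terms of the DCMI Type Vocabulary (controlled values for dc:type).
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# Loose pattern for a MIME type such as "image/jpeg" (an assumption for
# this sketch, not a complete check against the registered media types).
MIME_PATTERN = re.compile(r"^[a-z]+/[a-z0-9.+-]+$")

# Hypothetical record for the physical photograph (analog).
analog_photo = {
    "dc:title": "[Portrait of a man]",
    "dc:type": "Image",
    "dc:format": "8x10",
}

# Hypothetical record for the scan of that photograph (digital).
digital_scan = {
    "dc:title": "[Portrait of a man]",
    "dc:type": "Image",
    "dc:format": "image/jpeg",
}


def has_valid_type(record):
    """Return True if dc:type is a term from the DCMI Type Vocabulary."""
    return record.get("dc:type") in DCMI_TYPES


def describes_digital_object(record):
    """Guess whether the record describes a digital file by checking
    whether dc:format looks like a MIME type."""
    return bool(MIME_PATTERN.match(record.get("dc:format", "")))
```

Both records share the same dc:type, so in this sketch it is only the dc:format value that tells a reader (or a script) whether the record describes the print or the scan.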
Our midterm assignment requires us to select a metadata strategy and identify metadata elements; determine if they will be required, searchable, or hidden, if they should utilize a controlled vocabulary, and what our data entry protocols will be.
Wednesday, 5 February 2014
In class this week, we had another overview of the Dublin Core metadata schema and discussed in greater detail the elements related to intellectual property. Other than its wide application (on account of the flexible field definitions), Dublin Core is important because it serves as the lowest common denominator for mapping between metadata schemas, which is critical for harvesting metadata. Most of the content of this week’s lecture was very familiar from my previous archives jobs and last semester, but several details did help me connect some dots in my understanding. For instance, when a fellow archivist volunteered to enter descriptive metadata about the Black LGBT collection, he asked if we should put the titles of the artifacts in brackets. I said no, because I had not used that convention before, but I learned today that putting “made up titles” in brackets is mandated by AACR2, the cataloging standard for libraries. Linda added that additional brackets in metadata records should be omitted because they can interfere with searching and retrieval. We also discussed qualifiers for the “dc:title” and “dc:date” fields. I also learned that “unknown” is not appropriate to put in a date field; it would be better to leave the field blank. The logic is that the energy required to troubleshoot or analyze the data is wasted on entries that provide no information. Another mistake that catalogers make is to put “circa” in a date-formatted field, which the software will not process; a solution is to make two date fields, one formatted for text, the other for traditional date information. When we talked about the subjectivity involved in identifying an object’s “dc:creator”, “dc:contributor”, and “dc:publisher”, I could see why pinning down a local procedure/standard is critical for the consistency of the data entry.
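The two-date-field idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the field names `date_display` and `date_sortable`, the function name, and the list of "no information" values are all hypothetical, and a real system would follow whatever local data entry protocol the project settles on.

```python
import re

# Matches a standalone four-digit year, e.g. the "1925" in "circa 1925".
YEAR = re.compile(r"\b(\d{4})\b")

# Entries that carry no information; these become blank fields.
# (This particular list is an assumption for the sketch.)
EMPTY_VALUES = {"", "unknown", "n.d.", "undated"}


def split_date(raw):
    """Split a cataloger's date entry into a free-text display field
    and a machine-friendly field holding just the year."""
    text = raw.strip()
    if text.lower() in EMPTY_VALUES:
        # Leave both fields blank rather than storing "unknown".
        return {"date_display": "", "date_sortable": ""}
    match = YEAR.search(text)
    return {
        "date_display": text,
        "date_sortable": match.group(1) if match else "",
    }
```

With this approach, an entry such as "circa 1925" keeps its qualifier in the text field for human readers, while the date-formatted field receives only the year that software can sort and search on.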
We spent the second half of the class working in Photoshop, using a script to convert a batch of 30 TIFF images into JPEGs, then individually rotating, cropping, and adding descriptive metadata to their records.
Today we discussed some general rules that an archivist should consider when deciding which metadata schemas or elements to use on a given project. I made a checklist based on the content of Marie Kennedy’s 2008 article on the Texas Digital Library website, “Nine questions to guide you in choosing a metadata schema”. I liked the way that the article used examples from the University of Southern California medical library’s digital collection to demonstrate the points. Questions like “Who will be using the collection?” or “Do you have the funding to maintain the metadata over time?” seem impossible to answer for a small community archive. I would definitely use this article at the onset of a project to indicate the great deal of human and financial resources that need to go into the endeavor, but in the back of my mind, I would be prepared to have “unknown” as an answer and make other concessions. Other concepts, like NISO standards and consistent data entry protocols, should be in place regardless of the institutional structure.
In class, we also discussed the “One to One Principle” in terms of metadata records. In the past, I had learned about it in terms of FRBR (Functional Requirements for Bibliographic Records), with a memorable, albeit foggy understanding of what could be done in describing Mary Shelley’s Frankenstein as a book, a movie, and an audiobook. Similarly, in metadata records, all related but conceptually different entities should be represented by separate records. In other words, the container matters. A digital object should have a separate record from the analog object, even though the intellectual content is the same. We can use the “dc:source” or “dc:relation” elements to explain what the record in question is referring to. Linda mentioned that this concept is becoming increasingly important because large databases like WorldCat have been full of descriptions of analog objects for a long time; with the influx of digital objects (often with similar intellectual content), catalogers need to be clear about what they are describing within the record. Is this a book at my local library about Anne Frank, or an e-book that I can access online? Certain metadata fields should make this distinction when the title, subject, and date fields are all the same.
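Here is a small, hedged sketch of what the One to One Principle might look like for the book and e-book case. The identifiers, title, date, and format values are all made up for illustration, and the helper function is hypothetical; the only thing taken from Dublin Core itself is that dc:source points to the resource from which the described resource is derived.

```python
# Hypothetical record for the printed book on the library shelf.
print_book = {
    "dc:identifier": "local:book-001",
    "dc:title": "A book about Anne Frank (hypothetical title)",
    "dc:subject": "Frank, Anne, 1929-1945",
    "dc:date": "1995",
    "dc:type": "Text",
    "dc:format": "print",
}

# Separate hypothetical record for the e-book version. The intellectual
# content is the same, so title, subject, and date match; dc:format and
# dc:source make clear that this record describes a different object.
ebook = {
    "dc:identifier": "local:ebook-001",
    "dc:title": "A book about Anne Frank (hypothetical title)",
    "dc:subject": "Frank, Anne, 1929-1945",
    "dc:date": "1995",
    "dc:type": "Text",
    "dc:format": "application/epub+zip",
    "dc:source": "local:book-001",  # derived from the print book
}


def distinguishing_fields(a, b):
    """List the elements whose values differ between two records,
    including elements present in only one of them."""
    return sorted(key for key in set(a) | set(b) if a.get(key) != b.get(key))
```

Calling `distinguishing_fields(print_book, ebook)` on these two records returns only the identifier, format, and source elements, which are exactly the fields a user would rely on to tell the shelf copy from the online copy.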