Library cataloguing breaks our data


Use case: a user wants to view a collection of master's theses by subject.

Solution 1: register departmental names as corporate names in MARC 710.

Catch 1: the departments have since merged into larger departments (for example, the old departments of English, Romance languages and Germanic languages became the department of modern foreign languages). The data that would have identified theses in English is lost when the authority in 710 is updated.
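To make the loss concrete, here is a minimal sketch in Python. It assumes that each record stores the department heading as a plain string in 710 and that an authority update rewrites that string in place. The record layout, thesis titles and department names are hypothetical; this is not how any particular library system stores its data.

```python
# Hypothetical headings: three old departments and the merged one.
OLD_HEADINGS = {
    "Department of English",
    "Department of Romance Languages",
    "Department of Germanic Languages",
}
MERGED_HEADING = "Department of Modern Foreign Languages"

# Hypothetical thesis records; "710" holds the corporate-name heading.
theses = [
    {"title": "Thesis A", "710": "Department of English"},
    {"title": "Thesis B", "710": "Department of Romance Languages"},
    {"title": "Thesis C", "710": "Department of Germanic Languages"},
]


def apply_authority_update(records):
    """Return copies of the records with old headings replaced by the merged one."""
    return [
        {**record, "710": MERGED_HEADING} if record["710"] in OLD_HEADINGS else dict(record)
        for record in records
    ]


def group_by_department(records):
    """Group thesis titles by the heading found in field 710."""
    groups = {}
    for record in records:
        groups.setdefault(record["710"], []).append(record["title"])
    return groups


before = group_by_department(theses)
after = group_by_department(apply_authority_update(theses))

print(before)  # one group per old department
print(after)   # a single group; the old distinction cannot be recovered
```

Once the update has run, nothing in the record says which of the three old departments a thesis came from.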

Catch 2: the metadata does not uniformly differentiate between theses on different aspects of study. For example, performative music is not consistently distinguished from theoretical music, and linguistics of English is lumped together with American cultural studies and literary studies. Thus, unrelated theses are placed together because they come from the same department, even though they do not come from the same study track.

Catch 3: it is difficult to identify theses from the institution because the institution has changed its name, and because it was formed by a merger of institutions that were themselves subject to various name changes.

Solution 2: Restructure the metadata so that the theses belong to a series with a standard title, create a controlled vocabulary that differentiates the theses by topic and study track, and apply solution 1 retroactively.
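As a rough sketch of what such restructured metadata could look like, the Python below gives every record a standard series title and a study track drawn from a controlled vocabulary. The field names, the series title and the vocabulary terms are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical standard series title shared by all theses.
SERIES_TITLE = "Master's theses"

# Hypothetical controlled vocabulary of study tracks.
STUDY_TRACKS = {
    "english-linguistics",
    "american-cultural-studies",
    "music-performance",
    "music-theory",
}


def make_record(title, department, track):
    """Build a thesis record, rejecting tracks outside the vocabulary."""
    if track not in STUDY_TRACKS:
        raise ValueError(f"unknown study track: {track}")
    return {
        "title": title,
        "series": SERIES_TITLE,
        "department": department,
        "track": track,
    }


def group_by_track(records):
    """Group thesis titles by study track, ignoring the department."""
    groups = defaultdict(list)
    for record in records:
        groups[record["track"]].append(record["title"])
    return dict(groups)


records = [
    make_record("Thesis A", "Department of Modern Foreign Languages", "english-linguistics"),
    make_record("Thesis B", "Department of Modern Foreign Languages", "american-cultural-studies"),
    make_record("Thesis C", "Department of Music", "music-theory"),
]

print(group_by_track(records))
```

Because the grouping key is the study track rather than the department heading, a later departmental merger would not change the result.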

Catch 1: Reality*.

Solution 3: Use RDF.

Catches: we’ll work them out.

*Actually, the biggest problem is that the users want to present the data in their own system, which would involve either caching the data gleaned via SRU (not possible for the departmental staff) or screen-scraping the OPAC (not a nice solution).

