Why RDA is doomed to failure

April 23, 2010

[Note: edited for clarity thanks to rsinger's comment, below]

Doomed, I say! DOOOOOOOOOOMMMMMMMED!

My reasoning is simple: RDA will fail because it’s not “better enough.”

Now, those of you who know me might be saying to yourselves, “Waitjustaminute. Bill doesn’t know anything at all about cataloging, or semantic representations, or the relative merits of various encapsulations of bibliographic metadata. I mean, sure, he knows a lot about…err….hmmm…well, in any case, he’s definitely talking out of his ass on this one.”

First off, thanks for having such a long-winded internal monologue about me; it’s good to be thought of.

And, of course, you’re right on all counts. I don’t know what I’m talking about in any of those realms.

And yet I’m still willing to make a strong statement?

Yes. I am. Here’s why.

[Oh, and if you're convinced I'm wrong -- please say so. I'd love to be wrong about this.]

First, an assertion

The purpose of any bibliographic metadata is to facilitate three things:

  • Description/Identification. If you know what you want, does the metadata give you enough information to determine if the described item is what you want? Alternately, if you’re holding an item (or an alternate metadata representation of it), can you find the record that describes it?
  • Machine finding. Can a machine, given a good-enough query, find a work via a search of the metadata?
  • Machine grouping. Given the metadata, can a machine help a person find items “like this one”?

Take issue with one or more of those statements. I don’t care. The point I’m really trying to make is that any standard that doesn’t put unmediated machine reasoning at the forefront of what the metadata needs to support is living in a deep, deep hole.

Computer cycles are pretty cheap, and programmers are pretty smart. We can figure out how to do useful things with virtually any data, but only if we can reliably get at those data.

Getting 75% of the way there

Three-fourths of the problem can be addressed with one simple concept.

A solid equality relationship.

By this I mean that “=” had better damn well mean “equal,” as opposed to “probably the same, but there might be other representations, too.” If I want to say “A = B” (where A and B are authors, or works, or subjects, or anything that can be nailed down) there had better be no false positives and no false negatives. Ever. MARC’s use of “hopefully-unique strings” is ridiculously insufficient in the modern era.

RDA does pretty well with this, with URIs for appropriate concepts, so that’s good.
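To make the difference concrete, here is a minimal sketch, not anything taken from the RDA spec. The `Author` class, the `example.org` URIs, and the headings are all made up for illustration; the only point is that string comparison produces both kinds of error while identifier comparison does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Author:
    uri: str      # hypothetical identifier; example.org is a placeholder
    heading: str  # human-readable string, free to vary between records


def same_by_heading(a: Author, b: Author) -> bool:
    """The 'hopefully-unique string' approach: compare display headings."""
    return a.heading == b.heading


def same_by_uri(a: Author, b: Author) -> bool:
    """The identifier approach: equal if and only if the URIs match."""
    return a.uri == b.uri


# One person, two different headings: a false negative for string matching.
twain_a = Author("http://example.org/person/1", "Twain, Mark, 1835-1910")
twain_b = Author("http://example.org/person/1", "Mark Twain")

# Two different people, one heading: a false positive for string matching.
smith_1 = Author("http://example.org/person/2", "Smith, John")
smith_2 = Author("http://example.org/person/3", "Smith, John")

print(same_by_heading(twain_a, twain_b), same_by_uri(twain_a, twain_b))
print(same_by_heading(smith_1, smith_2), same_by_uri(smith_1, smith_2))
```

Of course, this only pushes the hard part back to whoever assigns the identifiers, but at least the machine no longer has to guess.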

What’s wrong with it?

Well, it’s gonna cost money to access the spec, for starters. That’s just dumb.

But it’s also not flexible/extensible enough. It’s true that I’m not a cataloger. I do have an MS in computer science, though, and there is stuff in the various versions of the RDA spec that leads me to believe that the committee desperately, desperately needed some hardcore geeks on it. Computer science has basically done nothing but develop methods for abstraction and composition for decades, and that isn’t reflected enough here.

Language such as, “If it is determined that a mechanism for providing a direct link between a note and the instance of the element to which it relates is required,…” worries me. If? IF????? That’s not a spec. That’s a guideline. Nail it down, for god’s sake. When is it appropriate or inappropriate? How do you add links to multiple (but not all) instances of the element?

The spec also seems to describe at least half a dozen kinds of titles. One of these is “Abbreviated title.” Do we really want an abbreviated title? No. We want a title with an “abbreviated” modifier, so we can use that same modifier for, say, a corporate name or publisher or anything else. [Note: see rsinger's comment below, indicating this was a piss-poor example on my part.]
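Here is a rough sketch of what I mean by a reusable modifier. Everything in it (`Form`, `Label`, `Entity`, `labels_in`, the sample publisher) is hypothetical and invented for this post; it is not how RDA or any real schema models things. It just shows one “abbreviated” flag being composed with several kinds of entity instead of minting a separate element for each.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Form(Enum):
    """Hypothetical modifier that can be attached to any label."""
    FULL = "full"
    ABBREVIATED = "abbreviated"


@dataclass(frozen=True)
class Label:
    text: str
    form: Form = Form.FULL


@dataclass
class Entity:
    kind: str            # e.g. "title", "publisher", "corporate name"
    labels: List[Label]

    def labels_in(self, form: Form) -> List[str]:
        """Return the text of every label carrying the given modifier."""
        return [label.text for label in self.labels if label.form is form]


# The same modifier serves a title...
journal = Entity("title", [
    Label("Journal of the American Medical Association"),
    Label("JAMA", Form.ABBREVIATED),
])

# ...and a made-up publisher, with no new element type required.
publisher = Entity("publisher", [
    Label("Example University Press"),
    Label("EUP", Form.ABBREVIATED),
])

print(journal.labels_in(Form.ABBREVIATED))
print(publisher.labels_in(Form.ABBREVIATED))
```

Whether an abbreviated title deserves to be its own thing is a cataloging question I’m not qualified to answer; the sketch is only about the composition pattern.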

Well, sure, but it’s still better than the AACR2!

[This section updated to disambiguate my use of 'MARC' when I really meant 'AACR2 as commonly talked about in terms of MARC tags']

Of course it is. It’s just not better enough!

We’re not just talking about writing a spec. We’re talking about replacing every single tool in the library toolchain, from the ILS to editing software to OPACs to scripts that keep it all put together. We’ll be asking programmers to learn new skills and new ways of thinking, vendors to produce functional software for untested data formats, and catalogers to essentially take their whole brain out of their heads and get a new one.

But that, frankly, is the easy part. The entire culture of the library is built around AACR2 concepts and MARC data structures. The thought processes, the nomenclature: everything sometimes feels as if it’s built around three-digit tags. The majority of the (crucial!) specialized vocabulary that librarians, experts, and specialists use to communicate with each other is directly or indirectly tied to MARC.

So, yeah, RDA is a hellofa lot better than AACR2/MARC. But in my view, it’s not better enough to justify all the pain. Switching is incredibly, astoundingly expensive both in terms of cost and in terms of the devaluation of institutional knowledge. We can’t do it every few years. We need to be damn sure we’re getting it right.

Comments

  1. Ross Singer
    April 23, 2010 at 11:13 am

    Hmm, there’s a lot here and while I think some of this would be easier to talk about synchronously, you have to go with the forum you have, not the forum you want.

    First off, let me put it on the record that I don’t disagree with your thesis. I can’t say whether or not RDA will fail (or what that “failure” or “success” means, really) but its timidity in actually modeling the data leaves a lot to be desired.

    Now, on to your arguments… Equality (with regards to information) is always going to be subjective. Witness the agita that owl:sameAs is currently wreaking on the Linked Data universe (esp. the hardcore semantic web set). Machine-based linking is always going to have errors. Homonyms, mistaken assumptions, and human error are just going to have to be accounted for. Without a doubt RDA needs to drop the string-matching qualities of the status quo in MARC/AACR2 in favor of real identifiers. Still, this isn’t going to solve the equality issue 100% because, honestly, a cataloger may not be 100% sure of what s/he is describing.

    Also, abbreviated titles are actual things. Like “JAMA”. I’m not sure of the actual provenance of these titles, but they are distinct from the actual title (and generally considered important and used).

    My last point would be how you compare “RDA” and “MARC” in your last part. Really, you’re comparing RDA with AACR2 (esp. since the powers that be are trying to figure out how RDA will be transmitted via MARC). The major issue is that RDA doesn’t distance itself nearly enough from AACR2 to be entirely worthwhile. Everything is still a literal and there is still a very “record-centric” mindset (even in the RDF schemas). This is most obvious when you see things like “titleOfTheWork” and “projectionOfCartographicContentExpression” instead of, I don’t know, just modeling the damned FRBR entities like they should.

    So, instead, we have a somewhat-major change in cataloging rules that will require a lot of time and energy and still provide no “real” relationships between resources and entities.

  2. Laura
    April 23, 2010 at 12:23 pm

    One minor quibble. RDA is intended to be a replacement for AACR2 (a descriptive standard) rather than for MARC (a transmission standard). Granted, MARC has evolved over the years to do both description and transmission in practice, since there have been rules akin to application profiles governing how to enter data into a MARC record.

  3. Wally Grotophorst
    April 23, 2010 at 12:24 pm

    If you wonder whether this disconnect between computer science and library science (specifically cataloging) is real, stroll down your QA76 range of shelves sometime and marvel at the distribution of shelving locations for something like Oracle how-to books.

  4. George Duimovich
    April 27, 2010 at 10:02 am

    In “Directions in Metadata” Karen Coyle notes that the current vendors have been reporting near ZERO feedback / customer demand for anything related to RDA. True, it’s still early – the spec hasn’t been formally released – but in a slow moving community, any change seems to need a lot of “ramp up” time, for both the library community and its vendors.

    That’s too bad, since there’s a sense of urgency that’s missing in all of this discussion. I think the OSS community is going to shape up to be best positioned to respond to changes, but moving forward with some reasonable consensus from libraries is going to be the challenge. There still remains a gulf between the well-informed IT people and catalogers and the laggards from the catalog card generation who don’t understand how our MARC/AACR2 standards present huge data issues that prevent us from moving forward.

  5. Karen Coyle
    April 27, 2010 at 11:47 am

    If you look at the diagram called “Singapore Framework” on the Dublin Core site [1], it illustrates all of the necessary elements of a functioning, modern metadata scheme. The framework is based on RDF, but it could really be based on any other foundation technology. What we don’t seem to have learned in the library world is that the cataloging rules do not a metadata schema make. The rules are about how you make decisions, but you need to have defined data elements, vocabularies, and, above all, you need to have some sense of what functionality you wish your metadata to support. I feel like we go about it entirely backwards, first creating rules, then trying to fit it all into a data format.

    [1]http://dublincore.org/documents/singapore-framework/

  6. Irvin Flack
    May 5, 2010 at 1:56 am

    Following up on Karen’s and Ross’s comments, I’m reminded of that joke about the guy looking for his lost keys under the street light: not because that’s where he dropped them, but because that’s where he could see. Or, to throw in another metaphor: you visit a surgeon, you get an operation. Cataloguers are experts on the rules, so that’s what RDA at heart still is: a set of rules.

  7. Bruce
    May 21, 2010 at 7:54 pm

    If you wonder whether this disconnect between computer science and library science (specifically cataloging) is real, stroll down your QA76 range of shelves sometime and marvel at the distribution of shelving locations for something like Oracle how-to books.