Category Archives: Uncategorized

Sending MARC(ish) data to Refworks

Refworks has some okish documentation about how to deal with its callback import procedure, but I thought I’d put down how I’m doing it for our VuFind install (mirlyn2-beta.lib.umich.edu) in case other folks are interested.

The basic procedure is:

Send your user to a specific refworks URL along with a callback URL that [...]

MARC-HASH: The saga continues (now with even less structure)

After a medium-sized discussion on #code4lib, we’ve collectively decided that…well, ok, no one really cares all that much, but a few people weighed in.

The new format is: A list of arrays. If it’s got two elements, it’s a control field; if it’s got four, it’s a data field.

SO….it’s like this now.

{   "type" : "marc-hash",   "version" [...]
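To make the length rule concrete, here is a minimal Python sketch that sorts a field list into control and data fields purely by element count. The function name `split_fields` and the sample values are made up for illustration, and the layout of the four-element data field (tag, two indicators, then a list of subfield pairs) is an assumption on my part rather than something spelled out above.

```python
def split_fields(fields):
    """Split a marc-hash style field list into control and data fields.

    The only rule applied is the one stated in the post: a two-element
    array is a control field, a four-element array is a data field.
    """
    control, data = [], []
    for field in fields:
        if len(field) == 2:
            control.append(field)
        elif len(field) == 4:
            data.append(field)
        else:
            raise ValueError(f"unexpected field length {len(field)}: {field!r}")
    return control, data


# Hypothetical sample record; the data-field layout here
# (tag, indicator 1, indicator 2, subfield pairs) is assumed.
fields = [
    ["001", "value of the 001"],
    ["006", "value of the 006"],
    ["245", "1", "0", [["a", "A title"], ["b", "a subtitle"]]],
]

control, data = split_fields(fields)
```

Because everything lives in one list, the original field order is still there if you need to write the record back out.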

MARC-HASH control field, now with less structure

Why do I ever, ever think that MARC might not rely on order? I don’t know.

In any case, control fields will now be just an array of duples:

control: [   ['001', 'value of the 001'],   ['006', 'value of the 006'],   ['006', 'another 006'] ]
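A quick Python sketch of why the pairs matter: turn the same control fields into a hash keyed on tag and the repeated 006 quietly disappears, while the list of duples keeps both values and their order. The helper `values_for` is hypothetical, just there to show a lookup against the list form.

```python
# Control fields as an ordered list of [tag, value] pairs.
control = [
    ["001", "value of the 001"],
    ["006", "value of the 006"],
    ["006", "another 006"],
]

# Keyed on tag, the second 006 overwrites the first one.
as_hash = dict(control)


def values_for(pairs, tag):
    """Return every value for a tag, in the order the fields appeared."""
    return [value for t, value in pairs if t == tag]


all_006 = values_for(control, "006")
```

Here `as_hash` ends up with only two keys, while `all_006` still holds both 006 values in their original order.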

MARC-Hash: a proposed format for JSON/YAML/Whatever-compatible MARC records

In my first shot at MARC-in-JSON, which I appropriately (and prematurely) named MARC-JSON, I made a point of losing round-tripability (to and from MARC) in order to end up with a nice, easy-to-work-with data structure based mostly on hashes. “Who really cares what order the subfields come in?” I asked myself.

Well, of course, it turns [...]

A plea: use Solr to normalize your data

[Only, of course, if you're using Solr. Otherwise, that'd be dumb.]

We’ve been working on Mirlyn2-Beta, our installation of VuFind, for some time now (don’t let the fancy-pants name scare you off), and the further we get into it, the more obvious it is that I want to move as much data normalization into Solr itself [...]

Enough with the freakin’ LC Call Number normalization!

OK. I’m done with it, and this time I mean it.

I’ve updated and improved the LC normalization code, documented the algorithm, and put it all into Google Code. In the next couple of weeks, I’ll be turning it into a Solr text filter so we can do some decent sorting on call-number search results.

Ask, and you shall receive, and it shall be AWESOME!

The good folks at ticTOCs heard the call for open data, and they responded…exactly as I asked them to. Which makes me think I should have asked for a pony, too, but I’m still very, very happy!

Anyone can now download a simple tab-delimited text file describing all the journal table of contents RSS files they’ve [...]

TicTocs: Give us a file! Pretty pretty pretty please!

For those who haven’t heard, ticTOCs is a service that provides web-based access to a database of Journal RSS/Atom Table of Contents feeds. Awesome.

In their blog at News from TicTocs, a post titled I want to be completely honest with you about ticTOCs notes that:

As for the API - yes, we’ve been asked this several times, [...]

Five rules to make your open source more open

[I've noticed that a sure way to get people to look at stuff (as measured by, say, digg) is to include a number. So I did. Five. ]

Over at Bibliographic Wilderness, Jonathan Rochkind has a great followup to an ongoing discussion on the Blacklight list called How to build shared open source in which he [...]

And then I finally shut the hell up

I had a great — great! I tell you — 30 second conversation with Ken Varnum (of RSS4Lib fame) that went something like this (much paraphrasing, obviously):

B: You’re gonna have to fix that interface. The standard header won’t work.
K: Well, no, we’re going to leave it as it is.
B: It’s not gonna work.
K: We’ve decided to [...]