Author Archives: Bill

Setting up your OPAC for Zotero support using unAPI

unAPI is a very simple protocol to let a machine know what other formats a document is available in. Zotero is a bibliographic management tool (like Endnote or Refworks) that operates as a Firefox plugin. And it speaks unAPI.

Let’s get them to play nice with each other!

How’s it all work?

Zotero looks for a well-constructed <link> [...]

Thinking through a simple API for HathiTrust item metadata

EDITS:

Added “recordURL” per Tod’s request Made a record’s title field an array and call it titles, to allow for vernacular entries Changed item’s ingest to lastUpdate to accurately note what the actual date reflects. This gets updated every time either the item or the record to which it’s attached gets changed. Fixed a couple typos, including one where [...]

Adding LibXML and Java STAX support to ruby-marc with pluggable XML parsers

JRuby is my ruby platform of choice, mostly because I think its deployment options in my work environment are simpler (perhaps technically and certainly politically), but also because I have high, high hopes to use lots of super-optimized native java libraries. The CPAN is what keeps me tethered to Perl, and whether or not you [...]

Adding LibXML and Java STAX support to ruby-marc with pluggable XML parsers

JRuby is my ruby platform of choice, mostly because I think its deployment options in my work environment are simpler (perhaps technically and certainly politically), but also because I have high, high hopes to use lots of super-optimized native java libraries. The CPAN is what keeps me tethered to Perl, and whether or not you [...]

An exercise in Solr and DataImportHandler: HathiTrust data

Many of the folks who read this blog (hi, both of you! Mom, say hello to Dad!) are aware, at least tangentially, of the HathiTrust. Currently hosted by us at the University of Michigan, the most public interface to its data is a VuFind installation you can access at catalog.hathitrust.org (or, for you smart-phone types, [...]

Dead-easy (but extreme) AJAX logging in our VuFind install

One of the advantages of having complete control over the OPAC is that I change things pretty easily. The downside of that is that we need to know what to change.

Many of you that work in libraries may have noticed that data are not necessarily the primary tool in decision-making. Or, say, even a part [...]

The sad truths about journal bundle prices

[Notes taken during a talk today, Ted Bergstrom: "Some Economics of Saying Nix To Big Deals and the Terrible Fix". My own thoughts are interspersed throughout; please don't automatically ascribe everything to Dr. Bergstrom.

Check out his stuff at Ted Bergstrom's home page.]

Journals are a weird market — libraries buy as agents of professors, using someone [...]

More Ruby MARC Benchmarks: Adding in MARC-XML

It turns out that UVA’s reluctance to use the raw MARC data on the search results screen is driven more by processing time than parsing time. Even if they were to start with a fully-parsed MARC object, they’re doing enough screwing around with that data that the bottleneck on their end appears to be all [...]

Benchmarking MARC record parsing in Ruby

[Note: since I started writing this, I found out Bess & Co. store MARC-XML. That makes a difference, since XML in Ruby can be really, really slow]

[UPADTE It turns out they don't use MARC-XML. They use MARC-Binary just like the rest of us. Oops. ]

[UP-UPDATE Well, no, they do use MARC-XML. I'm not afraid to [...]

Building a solr text filter for normalizing data

[Kind of part of a continuing series on our VUFind implementation; more of a sidebar, really.]

In my last post I made the case that you should put as much data normalization into Solr as possible. The built-in text filters will get you a long, long way, but sometimes you want to have specialized code, and [...]