  • Dead-easy (but extreme) AJAX logging in our VuFind install

    One of the advantages of having complete control over the OPAC is that I can change things pretty easily. The downside is that we need to know what to change.

    Many of you who work in libraries may have noticed that data are not necessarily the primary tool in decision-making. Or, say, even a part of the process. Or even thought about hard. Or even considered.

    For many decisions I see going on in... <more>

  • The sad truths about journal bundle prices

    [Notes taken during a talk today, Ted Bergstrom: “Some Economics of Saying Nix To Big Deals and the Terrible Fix”. My own thoughts are interspersed throughout; please don’t automatically ascribe everything to Dr. Bergstrom.

    Check out his stuff at Ted Bergstrom’s home page.]

    Journals are a weird market – libraries buy as agents of professors, using someone else’s money, in deals of enormous complexity and uncertain value from companies that basically have a monopoly.

    ... <more>

  • More Ruby MARC Benchmarks: Adding in MARC-XML

    It turns out that UVA’s reluctance to use the raw MARC data on the search results screen is driven more by processing time than parsing time. Even if they were to start with a fully-parsed MARC object, they’re doing enough screwing around with that data that the bottleneck on their end appears to be all the regex and string processing, not the parsing. Their specs for what gets displayed are complex enough that they want... <more>

  • Benchmarking MARC record parsing in Ruby

    [Note: since I started writing this, I found out Bess & Co. store MARC-XML. That makes a difference, since XML in Ruby can be really, really slow]

    [UPDATE It turns out they don’t use MARC-XML. They use MARC-Binary just like the rest of us. Oops.]

    [UP-UPDATE Well, no, they do use MARC-XML. I’m not afraid to constantly change my story. This is why I’m the best investigative reporter in the business]

    The other day... <more>

  • Building a solr text filter for normalizing data

    [Kind of part of a continuing series on our VUFind implementation; more of a sidebar, really.]

    In my last post I made the case that you should put as much data normalization into Solr as possible. The built-in text filters will get you a long, long way, but sometimes you want to have specialized code, and then you need to build your own filter.

    Huge Disclaimer: I’m putting this up not because I’m the... <more>

  • Going with and "forking" VUFind

    Note: This is the second in a series I’m doing about our VUFind installation, Mirlyn. Here I talk about how we got to where we are. Next I’ll start looking at specific technologies, how we solved various problems, and generally more nerd-centered stuff.

    When the University Library decided to go down the path of an open-source, solr-based OPAC, there were (and are, I guess) two big players: VUFind and Blacklight.

    I... <more>

  • Easy Solr types for library data

    [Yet another bit in a series about our Vufind installation]

    While I’m no longer shocked at the terrible state of our data every single day, I’m still shocked pretty often. We figured out pretty quickly that anything we could do to normalize data as it went into the Solr index (and, in fact, as queries were produced) would be a huge win.

    There’s a continuum of attitudes about how much “business logic” belongs in the... <more>

  • Sending unicode email headers in PHP

    I’m probably the last guy on earth to know this, but I’m recording it here just in case. I’m sending record titles in the subject line of emails, and of course they may contain Unicode. The body takes care of itself, but you need to explicitly encode a header like “Subject.”

     $headers['To'] = $to; $headers['From'] = $from; <more>

  • Rolling out UMich's "VUFind": Introduction and New Features

    For the last few months, I've been working on rolling out a ridiculously modified version of Vufind, which we just launched as our primary OPAC, Mirlyn. A slightly different version powers catalog.hathitrust.org, a temporary metadata search on the HathiTrust data until OCLC takes it over at some undetermined date.

    (Yeah, the HathiTrust site is a lot better looking.)

    [Our Aleph-based catalog lives on at mirlyn-classic) -- I'll be interested to see... <more>

  • Sending MARC(ish) data to Refworks

    Refworks has some okish documentation about how to deal with its callback import procedure, but I thought I’d put down how I’m doing it for our vufind install (mirlyn2-beta.lib.umich.edu) in case other folks are interested.

    The basic procedure is:

    • Send your user to a specific refworks URL along with a callback URL that can enumerate the record(s) you want to import in a supported form
    • Your user logs in (if need... <more>