One of the advantages of having complete control over the OPAC is that I change things pretty easily. The downside of that is that we need to know what to change.
Many of you who work in libraries may have noticed that data are not necessarily the primary tool in decision-making. Or, say, even a part of the process. Or even thought about hard. Or even considered.
For many decisions I see going on in... <more>
[Notes taken during a talk today, Ted Bergstrom: “Some Economics of Saying Nix To Big Deals and the Terrible Fix”. My own thoughts are interspersed throughout; please don’t automatically ascribe everything to Dr. Bergstrom.
Check out his stuff at Ted Bergstrom’s home page.]
Journals are a weird market – libraries buy as agents of professors, using someone else’s money, in deals of enormous complexity and uncertain value from companies that basically have a monopoly.... <more>
It turns out that UVA’s reluctance to use the raw MARC data on the search results screen is driven more by processing time than parsing time. Even if they were to start with a fully-parsed MARC object, they’re doing enough screwing around with that data that the bottleneck on their end appears to be all the regex and string processing, not the parsing. Their specs for what gets displayed are complex enough that they want... <more>
[Note: since I started writing this, I found out Bess & Co. store MARC-XML. That makes a difference, since XML in Ruby can be really, really slow]
[UPDATE: It turns out they don’t use MARC-XML. They use MARC-Binary just like the rest of us. Oops.]
[UP-UPDATE Well, no, they do use MARC-XML. I’m not afraid to constantly change my story. This is why I’m the best investigative reporter in the business]
The other day... <more>
[Kind of part of a continuing series on our VUFind implementation; more of a sidebar, really.]
In my last post I made the case that you should put as much data normalization into Solr as possible. The built-in text filters will get you a long, long way, but sometimes you want to have specialized code, and then you need to build your own filter.
Huge Disclaimer: I’m putting this up not because I’m the... <more>
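A custom Solr filter is just a Java class dropped into the analysis chain, and the heart of one is usually a small normalization function. As a rough sketch (not our actual filter code — the class and method names here are mine, for illustration), this is the kind of folding such a filter might do, using nothing but the JDK:

```java
import java.text.Normalizer;

public class Fold {
    // Decompose accented characters (NFD), strip the combining marks,
    // and lowercase -- so "Dvořák" matches a query for "dvorak".
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(fold("Dvořák")); // dvorak
    }
}
```

Wrap logic like that in a `TokenFilter` subclass plus a factory, register the factory in your schema, and Solr will run it on every token.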
Note: This is the second in a series I’m doing about our VUFind installation, Mirlyn. Here I talk about how we got to where we are. Next I’ll start looking at specific technologies, how we solved various problems, and generally more nerd-centered stuff.
[Yet another bit in a series about our VuFind installation]
While I’m no longer shocked at the terrible state of our data every single day, I’m still shocked pretty often. We figured out pretty quickly that anything we could do to normalize data as it went into the Solr index (and, in fact, as queries were produced) would be a huge win.
There’s a continuum of attitudes about how much “business logic” belongs in the... <more>
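A lot of that normalization doesn’t even need custom code — Solr’s stock filter factories get you most of the way. A minimal sketch of an analyzer chain (the field name and filter choices are illustrative, not our actual schema):

```xml
<fieldType name="text_norm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- lowercase and fold diacritics to plain ASCII -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Because a single `<analyzer>` applies at both index and query time by default, the data and the queries get normalized identically for free — which is exactly the win of pushing this work into Solr.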
I’m probably the last guy on earth to know this, but I’m recording it here just in case. I’m sending record titles in the subject line of emails, and of course they may be Unicode. The body takes care of itself, but you need to explicitly encode a header like “Subject.”
$headers['To'] = $to; $headers['From'] = $from;
$headers['Subject'] = mb_encode_mimeheader($subject, 'UTF-8'); // RFC 2047 encoding for non-ASCII
For the last few months, I've been working on rolling out a ridiculously modified version of VuFind, which we just launched as our primary OPAC, Mirlyn, with a slightly different version powering catalog.hathitrust.org, a temporary metadata search on the HathiTrust data until OCLC takes it over at some undetermined date.
(Yeah, the HathiTrust site is a lot better looking.)
RefWorks has some OK-ish documentation about how to deal with its callback import procedure, but I thought I’d put down how I’m doing it for our VuFind install (mirlyn2-beta.lib.umich.edu) in case other folks are interested.
The basic procedure is:
- Send your user to a specific RefWorks URL along with a callback URL that can enumerate the record(s) you want to import in a supported format
- Your user logs in (if need... <more>
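That first step is just a redirect whose query string carries your callback URL. A sketch of building it (the RefWorks endpoint and parameter name below are assumptions from memory — check their docs for the exact vendor/filter parameters they expect):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class RefworksLink {
    // Build the RefWorks express-import URL around a callback that will
    // serve the record(s). Endpoint and "url" parameter are illustrative.
    static String importUrl(String callbackUrl) {
        String encoded = URLEncoder.encode(callbackUrl, StandardCharsets.UTF_8);
        return "https://www.refworks.com/express/expressimport.asp?url=" + encoded;
    }

    public static void main(String[] args) {
        System.out.println(importUrl(
            "https://mirlyn2-beta.lib.umich.edu/Record/12345/Export?style=refworks"));
    }
}
```

The only real subtlety is URL-encoding the callback, since it almost certainly has its own query string.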