April 23, 2010 – 10:20 am
[Note: edited for clarity thanks to rsinger's comment, below]
Doomed, I say! DOOOOOOOOOOMMMMMMMED!
My reasoning is simple: RDA will fail because it’s not “better enough.”
Now, those of you who know me might be saying to yourselves, “Waitjustaminute. Bill doesn’t know anything at all about cataloging, or semantic representations, or the relative merits of various encapsulations of bibliographic [...]
Jonathan Rochkind, in response to a long (and, IMHO, mostly ridiculous) thread on NGC4Lib, has been exploring the boundaries between a data model and its expression/serialization (see here, here, and here) and I thought I’d jump in.
What this post is not
There’s a lot to be said about a good domain model for bibliographic data. I’m [...]
April 13, 2010 – 10:31 am
Library of Congress Subject Headings (LCSH) in particular.
I’ve always been down on LCSH because I don’t understand them. They kinda look like a hierarchy, but they’re not really. Things get modifiers. Geography is inline and …weird.
And, of course, in our faceting catalog when you click on a linked LCSH to do an automatic search, you [...]
March 11, 2010 – 10:48 pm
Lately on the #code4lib IRC channel, several of us have been knocking around different versions (in several programming languages) of programs to read in a ginormous file and do some processing on each line. I noted some speedups related to multi-threading, and someone (maybe rsinger?) said, basically, that to bother with threading for a one-off [...]
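For anyone who hasn't seen this kind of program, here is a minimal sketch of the general shape: one reader feeding lines to a small pool of worker threads through a bounded queue. It is not any of the actual programs from the channel; `process_lines`, its parameters, and the `:done` sentinel are all names made up for illustration.

```ruby
require 'stringio'

# Hypothetical helper (not from the original discussion): run the given
# block on every line of +io+ using a pool of worker threads.
def process_lines(io, workers: 4, queue_size: 1000, &block)
  queue   = SizedQueue.new(queue_size) # bounded, so the reader can't race ahead of the workers
  results = Queue.new                  # thread-safe collection of block results

  pool = Array.new(workers) do
    Thread.new do
      # :done is a sentinel telling this worker to stop
      while (line = queue.pop) != :done
        results << block.call(line.chomp)
      end
    end
  end

  io.each_line { |line| queue << line } # producer: one line at a time, never the whole file
  workers.times { queue << :done }      # one sentinel per worker
  pool.each(&:join)

  # Drain the results; order is not guaranteed to match input order.
  out = []
  out << results.pop until results.empty?
  out
end

# Usage: compute the length of each line of a small in-memory "file".
lengths = process_lines(StringIO.new("a\nbb\nccc\n")) { |l| l.length }
puts lengths.sort.inspect
```

Under MRI the global interpreter lock means the workers only overlap while waiting on I/O, so the payoff there is limited; under JRuby the threads can actually run in parallel.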
[This is in response to a thread on the blacklight mailing list about getting MARC data into Solr.]
What’s the question?
The question came up: “How much time do we spend processing the MARC vs. trying to push it into Solr?” Bob Haschart found that even with a pretty damn complicated processing stage, pushing the data to [...]
I’ve been messing with easier ways of adding parsers to ruby-marc’s MARC::Reader object. The idea is that you can do this:
require 'marc'
require 'my_marc_stuff'
mbreader = MARC::Reader.new('test.mrc') # => Stock marc binary reader
mbreader = MARC::Reader.new('test.mrc', :readertype=>:marcstrict) # => ditto
MARC::Reader.register_parser(My::MARC::Parser, :marcstrict)
mbreader = MARC::Reader.new('test.mrc') # => Uses My::MARC::Parser now
xmlreader [...]
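To make the idea concrete, here is a rough, self-contained sketch of how a parser registry like this could work. To be clear, this is not ruby-marc's code: the `Sketch` module, its `Reader` class, and the two stand-in parser classes are invented for illustration, and the sketch assumes `:marcstrict` is the default reader type.

```ruby
# Hypothetical sketch only; none of these names come from ruby-marc.
module Sketch
  class Reader
    @parsers      = {}          # readertype symbol => parser class
    @default_type = :marcstrict # assumed default, mirroring the example above

    class << self
      attr_reader :parsers, :default_type

      # Register (or replace) the parser class used for a reader type.
      def register_parser(klass, type)
        @parsers[type] = klass
      end
    end

    attr_reader :parser

    def initialize(source, opts = {})
      type  = opts.fetch(:readertype, self.class.default_type)
      klass = self.class.parsers.fetch(type) {
        raise ArgumentError, "no parser registered for #{type.inspect}"
      }
      @parser = klass.new(source) # the reader hands the real work to the parser
    end
  end

  # Stand-ins for a stock parser and a user-supplied replacement.
  StockParser  = Struct.new(:source)
  CustomParser = Struct.new(:source)
end

Sketch::Reader.register_parser(Sketch::StockParser, :marcstrict)
```

Registering a different class under the same type then changes what later readers use:

```ruby
Sketch::Reader.new('test.mrc').parser.class  # Sketch::StockParser
Sketch::Reader.register_parser(Sketch::CustomParser, :marcstrict)
Sketch::Reader.new('test.mrc').parser.class  # Sketch::CustomParser
```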
February 26, 2010 – 12:29 am
For reasons I’m still not entirely clear on (I wasn’t there), the Code4Lib 2010 conference this week inspired renewed interest in a JSON-based format for MARC data.
When I initially looked at MARC-HASH almost a year ago, I was mostly looking for something that wasn’t such a pain in the butt to work with, something that [...]
February 18, 2010 – 10:58 am
NOTE 2: It turns out that I did find a minor bug in the system, but that in general LCCN normalization is working correctly. I just happened to hit a weirdness with a bad LCCN and a little bug in the parser on their end. Which is getting fixed. So…good news all around, and huge [...]
February 16, 2010 – 3:43 pm
[Note: in this post I'm just going to focus on the "get stuff into Solr" part. My normal focus -- MARC data -- will make an appearance in the next post when I talk about using this in addition to / instead of solrmarc.]
Working with Solr
I love me the Solr. I love everything about it except [...]
February 5, 2010 – 3:46 pm
Yea! My first gem ever released!
[YUCK! It was a disaster in a few ways! Don't look at this! It's hideous! There's a new jruby_producer_consumer gem on gemcutter that is slightly different from this in that it works. Ignore the stuff below.]
[In working on a threaded JRuby-based MARC-to-Solr project, I realized that my threading stuff was...ugly. [...]