February 2010 – Robot Librarian

2010-02-26 / Bill Dueber

New interest in MARC-HASH / JSON

EDIT: This is historical. The recommended serialization for MARC in JSON is now Ross Singer’s marc-in-json, which has implementations in the core MARC libraries for Ruby and PHP, and add-ons for Perl and Java. C’mon, Python people!

For reasons I’m still not entirely clear on (I wasn’t there), the Code4Lib 2010 conference this week inspired renewed interest in a JSON-based format for MARC data. When I initially looked at MARC-HASH almost a year ago, I was mostly looking for something that wasn’t such a pain in the butt to work with, something that would marshal into multiple…
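For anyone who hasn’t seen it, here is a rough sketch of the marc-in-json shape as I understand it: a leader string plus an ordered list of single-key field hashes, where control fields map to a string and data fields map to indicators and subfields. The record contents and the `subfield_values` helper are made up for illustration; only the stdlib `json` library is used.

```ruby
require 'json'

# Made-up record in the marc-in-json shape: a leader plus an ordered
# array of one-key hashes, each keyed by the field's tag.
record = {
  'leader' => '01471cjm a2200349 a 4500',
  'fields' => [
    { '001' => '5674874' },                      # control field: plain string
    { '245' => {                                 # data field: indicators + subfields
      'ind1' => '1',
      'ind2' => '0',
      'subfields' => [
        { 'a' => 'An example title /' },
        { 'c' => 'An Example Author.' }
      ]
    } }
  ]
}

# Hypothetical helper: collect every value of subfield `code` in fields
# tagged `tag`, skipping control fields (which have no subfields).
def subfield_values(record, tag, code)
  record['fields']
    .select { |field| field[tag].is_a?(Hash) }
    .flat_map { |field| field[tag]['subfields'] }
    .select { |subfield| subfield.key?(code) }
    .map { |subfield| subfield[code] }
end

# Serialize and parse again to show the structure survives a round trip.
json = JSON.generate(record)
round_tripped = JSON.parse(json)
titles = subfield_values(round_tripped, '245', 'a')
```

Keeping fields and subfields as arrays of one-key hashes, rather than one big hash, is what preserves order and allows repeated tags.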

2010-02-18 / Bill Dueber

OCLC still not (NO! They are!) normalizing their LCCNs

NOTE 2: It turns out that I did find a minor bug in the system, but that in general LCCN normalization is working correctly. I just happened to hit a weirdness with a bad LCCN and a little bug in the parser on their end. Which is getting fixed. So…good news all around, and huge kudos to Xiaoming Liu for his quick response!

NOTE: It strikes me that I haven’t seen a case where bad data results from sending a valid LCCN. The only verified problem is one of false negatives. Send a valid LCCN, you’ll get back either good…
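For context, here is a minimal sketch of what “normalizing” an LCCN means, following the steps the Library of Congress publishes as I understand them: strip blanks, drop a forward slash and everything after it, then remove the hyphen and left-pad the digits after it to six places. This is not OCLC’s parser; `normalize_lccn` is a name I made up, and returning nil for a bad serial is my own choice.

```ruby
# Hypothetical helper following the Library of Congress LCCN
# normalization steps; not taken from any OCLC code.
def normalize_lccn(raw)
  lccn = raw.gsub(/\s+/, '')          # 1. remove all blanks
  lccn = lccn.sub(%r{/.*\z}, '')      # 2. drop the first "/" and everything after it
  return lccn unless lccn.include?('-')

  # 3. remove the hyphen; what follows it must be 1 to 6 digits,
  #    which get left-padded with zeros to exactly 6
  prefix, serial = lccn.split('-', 2)
  return nil unless serial.match?(/\A\d{1,6}\z/)

  prefix + serial.rjust(6, '0')
end

normalized = normalize_lccn('n78-89035')
```

A malformed value such as a non-numeric serial comes back as nil here, so a caller can tell “bad LCCN” apart from “valid LCCN, no match”.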

2010-02-16 / Bill Dueber

Indexing data into Solr via JRuby (with threads!)

[Note: in this post I’m just going to focus on the “get stuff into Solr” part. My normal focus, MARC data, will make an appearance in the next post when I talk about using this in addition to / instead of solrmarc.]

Working with Solr

I love me the Solr. I love everything about it except that the best way to interact with it is via Java. I don’t so much love me the Java. So…taking Erik Hatcher’s lead and advice, as I will do whenever he offers either, I wrote some code to work within JRuby to…

2010-02-05 / Bill Dueber

jruby_producer_consumer dead-simple producer/consumer for JRuby

Yea! My first gem ever released!

[YUCK! It was a disaster in a few ways! Don’t look at this! It’s hideous! There’s a new jruby_producer_consumer gem on gemcutter that is slightly different from this in that it works. Ignore the stuff below.]

[In working on a threaded JRuby-based MARC-to-Solr project, I realized that my threading stuff was…ugly. And I didn’t really understand it. So I dug in today and wrote this.]

I’ve just pushed to Gemcutter my first gem: a JRuby-only producer/consumer class, called jruby_producer_consumer, that works with anything that provides #each. It’s JRuby-only because it uses (a) A…
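To give a feel for the pattern, here is a minimal producer/consumer sketch in plain Ruby, using only the built-in `Thread`, `Queue`, and `SizedQueue`. It is emphatically not the gem’s code or API: `consume_in_threads` and its parameters are names I made up, and it deliberately avoids anything JRuby-specific so it runs anywhere.

```ruby
# Hypothetical helper, NOT the jruby_producer_consumer API.
# Feeds every item from `source` (anything that responds to #each)
# through a bounded queue to a pool of consumer threads.
def consume_in_threads(source, consumers: 3, queue_size: 10, &work)
  queue   = SizedQueue.new(queue_size)  # bounded, so the producer can't run far ahead
  results = Queue.new                   # thread-safe collector for block results
  done    = Object.new                  # unique sentinel meaning "no more items"

  producer = Thread.new do
    source.each { |item| queue << item }
    consumers.times { queue << done }   # one sentinel per consumer
  end

  workers = Array.new(consumers) do
    Thread.new do
      loop do
        item = queue.pop
        break if item.equal?(done)
        results << work.call(item)
      end
    end
  end

  producer.join
  workers.each(&:join)
  Array.new(results.size) { results.pop }  # completion order, not input order
end

squares = consume_in_threads(1..20, consumers: 4) { |n| n * n }
```

Results come back in whatever order the consumers finish, so sort them (or carry an index along with each item) if order matters.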
