<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for Robot Librarian</title>
	<atom:link href="http://robotlibrarian.billdueber.com/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://robotlibrarian.billdueber.com</link>
	<description>Disclaimer: I'm not actually a robot.</description>
	<lastBuildDate>Fri, 12 Mar 2010 04:04:19 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>Comment on Why bother with threading in jruby? Because it&#8217;s easy. by Bill</title>
		<link>http://robotlibrarian.billdueber.com/why-bother-with-threading-in-jruby-because-its-easy/comment-page-1/#comment-433</link>
		<dc:creator>Bill</dc:creator>
		<pubDate>Fri, 12 Mar 2010 04:04:19 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/why-bother-with-threading-in-jruby-because-its-easy/#comment-433</guid>
		<description>&lt;p&gt;The assumption is that the producer is faster than the consumer (otherwise, why bother to have multiple consumers?). A regular Queue (not sized) would grow without bound, at a rate set by the speed difference between production and consumption. We don&#039;t, for example, want 10K lines in memory while we&#039;re waiting for consumers to turn them into MARC objects or whatnot.&lt;/p&gt;

&lt;p&gt;A SizedQueue will block on both enqueue (if it&#039;s full) and dequeue (if there&#039;s nothing in it), so it&#039;s exactly what we need for this kind of thing.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>The assumption is that the producer is faster than the consumer (otherwise, why bother to have multiple consumers?). A regular Queue (not sized) would grow without bound, at a rate set by the speed difference between production and consumption. We don&#8217;t, for example, want 10K lines in memory while we&#8217;re waiting for consumers to turn them into <acronym title="MAchine Readable Cataloging">MARC</acronym> objects or whatnot.</p>

<p>A SizedQueue will block on both enqueue (if it&#8217;s full) and dequeue (if there&#8217;s nothing in it), so it&#8217;s exactly what we need for this kind of thing.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Why bother with threading in jruby? Because it&#8217;s easy. by Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/why-bother-with-threading-in-jruby-because-its-easy/comment-page-1/#comment-432</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Fri, 12 Mar 2010 03:41:11 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/why-bother-with-threading-in-jruby-because-its-easy/#comment-432</guid>
		<description>&lt;p&gt;What&#039;s the purpose of using a SizedQueue instead of an ordinary Queue? What happens if the producer produces so much faster than the consumers consume that the threads*4 size is exhausted? Does the producer just block, waiting for there to be room to enqueue?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>What&#8217;s the purpose of using a SizedQueue instead of an ordinary Queue? What happens if the producer produces so much faster than the consumers consume that the threads*4 size is exhausted? Does the producer just block, waiting for there to be room to enqueue?</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Why bother with threading in jruby? Because it&#8217;s easy. by Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/why-bother-with-threading-in-jruby-because-its-easy/comment-page-1/#comment-431</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Fri, 12 Mar 2010 03:35:12 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/why-bother-with-threading-in-jruby-because-its-easy/#comment-431</guid>
		<description>&lt;p&gt;Nice. You wrote that one? Ruby&#039;s pretty sweet, huh?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Nice. You wrote that one? Ruby&#8217;s pretty sweet, huh?</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Ask, and you shall receive, and it shall be AWESOME! by Santy Chumbe</title>
		<link>http://robotlibrarian.billdueber.com/ask-and-you-shall-receive-and-it-shall-be-awesome/comment-page-1/#comment-420</link>
		<dc:creator>Santy Chumbe</dc:creator>
		<pubDate>Wed, 10 Mar 2010 13:09:22 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=51#comment-420</guid>
		<description>&lt;p&gt;Is the API of journaltocs (www.journaltocs.hw.ac.uk) the Shetland pony you were waiting for?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Is the <acronym title="Application Programming Interface">API</acronym> of journaltocs (www.journaltocs.hw.ac.uk) the Shetland pony you were waiting for?</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on New interest in MARC-HASH / JSON by GregPendlebury</title>
		<link>http://robotlibrarian.billdueber.com/new-interest-in-marc-hash-json/comment-page-1/#comment-401</link>
		<dc:creator>GregPendlebury</dc:creator>
		<pubDate>Sun, 07 Mar 2010 23:10:30 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=204#comment-401</guid>
		<description>&lt;p&gt;Bill, I&#039;d be interested in hearing if anyone has put up a central site for this in terms of schema definition and/or collaboration. This is an area I think could gain some traction here, but going off on our own without a public schema doesn&#039;t seem productive.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Bill, I&#8217;d be interested in hearing if anyone has put up a central site for this in terms of schema definition and/or collaboration. This is an area I think could gain some traction here, but going off on our own without a public schema doesn&#8217;t seem productive.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Pushing MARC to Solr; processing times and threading and such by Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/pushing-marc-to-solr-processing-times-and-threading-and-such/comment-page-1/#comment-378</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Fri, 05 Mar 2010 19:25:38 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=214#comment-378</guid>
		<description>&lt;p&gt;Hey, I should read more carefully before I post, but instead I&#039;ll just multi-post.&lt;/p&gt;

&lt;p&gt;I see the serialization to XML itself is non-trivial too.&lt;/p&gt;

&lt;p&gt;json!&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Hey, I should read more carefully before I post, but instead I&#8217;ll just multi-post.</p>

<p>I see the serialization to <acronym title="Extensible Markup Language">XML</acronym> itself is non-trivial too.</p>

<p>json!</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Pushing MARC to Solr; processing times and threading and such by Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/pushing-marc-to-solr-processing-times-and-threading-and-such/comment-page-1/#comment-377</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Fri, 05 Mar 2010 19:24:02 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=214#comment-377</guid>
		<description>&lt;p&gt;Oh, I see, performance with toXML.&lt;/p&gt;

&lt;p&gt;What I wonder/worry about is whether the added time for toXML isn&#039;t actually the serialization to xml, but simply that pushing a larger stored field to solr is going to slow things down.&lt;/p&gt;

&lt;p&gt;We still need to store our marc either way, of course. The UWisconsin approach of storing marc in an rdbms instead of a solr stored field may or may not speed up indexing, since it&#039;s still gonna take time to store it.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Oh, I see, performance with toXML.</p>

<p>What I wonder/worry about is whether the added time for toXML isn&#8217;t actually the serialization to xml, but simply that pushing a larger stored field to solr is going to slow things down.</p>

<p>We still need to store our marc either way, of course. The UWisconsin approach of storing marc in an rdbms instead of a solr stored field may or may not speed up indexing, since it&#8217;s still gonna take time to store it.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on Pushing MARC to Solr; processing times and threading and such by Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/pushing-marc-to-solr-processing-times-and-threading-and-such/comment-page-1/#comment-376</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Fri, 05 Mar 2010 19:16:25 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=214#comment-376</guid>
		<description>&lt;p&gt;What&#039;s HLB?&lt;/p&gt;

&lt;p&gt;Both ruby-marc and marc4j will generate marc-xml, but do you mean optimizing the speed of it? (Don&#039;t forget marc-json possibilities! heh.)&lt;/p&gt;

&lt;p&gt;Not sure if you&#039;re still happy with marc4j or might prefer ruby-marc, but I realized that one thing missing from the ruby stack (if you didn&#039;t want to use marc4j), as far as I know, is the marc8-to-utf8 conversion stuff, plus heuristic detection of marc records that aren&#039;t really in the encoding they claim to be.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>What&#8217;s HLB?</p>

<p>Both ruby-marc and marc4j will generate marc-xml, but do you mean optimizing the speed of it? (Don&#8217;t forget marc-json possibilities! heh.)</p>

<p>Not sure if you&#8217;re still happy with marc4j or might prefer ruby-marc, but I realized that one thing missing from the ruby stack (if you didn&#8217;t want to use marc4j), as far as I know, is the marc8-to-utf8 conversion stuff, plus heuristic detection of marc records that aren&#8217;t really in the encoding they claim to be.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on New interest in MARC-HASH / JSON by Bill</title>
		<link>http://robotlibrarian.billdueber.com/new-interest-in-marc-hash-json/comment-page-1/#comment-371</link>
		<dc:creator>Bill</dc:creator>
		<pubDate>Thu, 04 Mar 2010 23:26:46 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=204#comment-371</guid>
		<description>&lt;p&gt;Heh. Yeah, the major element in the hash is a big ol&#039; array of fields, composed of arrays of subfields. My original take on MARC-Hash was mostly as a hash, and resulted in my first real, painful schooling in how batshit-insane MARC is.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Heh. Yeah, the major element in the hash is a big ol&#8217; array of fields, composed of arrays of subfields. My original take on <acronym title="MAchine Readable Cataloging">MARC</acronym>-Hash was mostly as a hash, and resulted in my first real, painful schooling in how batshit-insane <acronym title="MAchine Readable Cataloging">MARC</acronym> is.</p>]]></content:encoded>
	</item>
	<item>
		<title>Comment on New interest in MARC-HASH / JSON by Naomi Dushay</title>
		<link>http://robotlibrarian.billdueber.com/new-interest-in-marc-hash-json/comment-page-1/#comment-370</link>
		<dc:creator>Naomi Dushay</dc:creator>
		<pubDate>Thu, 04 Mar 2010 22:57:14 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=204#comment-370</guid>
		<description>&lt;p&gt;By &quot;hash&quot; do you mean &quot;array&quot;? Because order matters in Marc, but Ruby hashes do not guarantee order, right?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>By &#8220;hash&#8221; do you mean &#8220;array&#8221;? Because order matters in Marc, but Ruby hashes do not guarantee order, right?</p>]]></content:encoded>
	</item>
</channel>
</rss>

