<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Thinking through a simple API for HathiTrust item metadata</title>
	<atom:link href="http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/feed/" rel="self" type="application/rss+xml" />
	<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/</link>
	<description>Disclaimer: I'm not actually a robot.</description>
	<lastBuildDate>Thu, 08 Jul 2010 20:37:05 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Beta version of the HathiTrust Volumes API available &#187; Robot Librarian</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-272</link>
		<dc:creator>Beta version of the HathiTrust Volumes API available &#187; Robot Librarian</dc:creator>
		<pubDate>Tue, 15 Dec 2009 20:05:27 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-272</guid>
		<description>&lt;p&gt;[...] put up a beta version of the HathiTrust Volumes API previously discussed on this blog and via [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] put up a beta version of the HathiTrust Volumes <acronym title="Application Programming Interface">API</acronym> previously discussed on this blog and via [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Stephanie Collett</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-248</link>
		<dc:creator>Stephanie Collett</dc:creator>
		<pubDate>Tue, 17 Nov 2009 19:17:42 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-248</guid>
		<description>&lt;p&gt;I prefer #3 to #4 as well. Putting the items in best-match order would make short work for simple clients. However, we also plan to use the API to spot-check the metadata for our submissions. Multiple matches would alert us to metadata issues (on either end) that would fall silent in algorithm #4.&lt;/p&gt;

&lt;p&gt;I&#039;d also like to propose a new feature if it would be simple to implement. I&#039;d like to be able to look up records by Hathi Trust ID.&lt;/p&gt;

&lt;p&gt;I&#039;m building a simple web client for looking up detailed item information. The target audience is internal staff working on the bibliographic issues for our Hathi Trust submissions. I&#039;d like users to be able to query by Hathi ID along with the other identifiers. If they query by Hathi ID, it would be nice to at least show the title, and possibly tie the item to other information by using the returned identifiers in the record like the OCLC number.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>I prefer #3 to #4 as well. Putting the items in best-match order would make short work for simple clients. However, we also plan to use the <acronym title="Application Programming Interface">API</acronym> to spot-check the metadata for our submissions. Multiple matches would alert us to metadata issues (on either end) that would fall silent in algorithm #4.</p>

<p>I&#8217;d also like to propose a new feature if it would be simple to implement. I&#8217;d like to be able to look up records by Hathi Trust ID.</p>

<p>I&#8217;m building a simple web client for looking up detailed item information. The target audience is internal staff working on the bibliographic issues for our Hathi Trust submissions. I&#8217;d like users to be able to query by Hathi ID along with the other identifiers. If they query by Hathi ID, it would be nice to at least show the title, and possibly tie the item to other information by using the returned identifiers in the record like the <acronym title="Online Computer Library Center">OCLC</acronym> number.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-235</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Wed, 04 Nov 2009 22:28:55 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-235</guid>
		<description>&lt;p&gt;Ah, but wait, maybe I am misunderstanding things. The pipe/ampersand confusion made me realize I don&#039;t understand that.&lt;/p&gt;

&lt;p&gt;What&#039;s the difference between asking for:&lt;/p&gt;

&lt;p&gt;?yourID1=oclc:00470409&#124;lccn:68001537&amp;yourID2=oclc:67890987&#124;isbn:987652348X&lt;/p&gt;

&lt;p&gt;Or asking for:&lt;/p&gt;

&lt;p&gt;?yourID1=oclc:00470409&amp;yourID2=lccn:68001537&amp;yourID3=oclc:67890987&amp;yourID4=isbn:987652348X&lt;/p&gt;

&lt;p&gt;What does the pipe grouping do for you?  Maybe this is related to the solution #3 vs #4 thing, cause maybe I can get what I want by constructing my request the right way even if you do #4. But the pipe grouping thing kinda seems like unnecessary complexity to me.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Ah, but wait, maybe I am misunderstanding things. The pipe/ampersand confusion made me realize I don&#8217;t understand that.</p>

<p>What&#8217;s the difference between asking for:</p>

<p>?yourID1=oclc:00470409|lccn:68001537&amp;yourID2=oclc:67890987|isbn:987652348X</p>

<p>Or asking for:</p>

<p>?yourID1=oclc:00470409&amp;yourID2=lccn:68001537&amp;yourID3=oclc:67890987&amp;yourID4=isbn:987652348X</p>

<p>What does the pipe grouping do for you?  Maybe this is related to the solution #3 vs #4 thing, cause maybe I can get what I want by constructing my request the right way even if you do #4. But the pipe grouping thing kinda seems like unnecessary complexity to me.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-234</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Wed, 04 Nov 2009 22:24:59 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-234</guid>
		<description>&lt;p&gt;Definitely prefer #3 to #4.&lt;/p&gt;

&lt;p&gt;A middle ground is that you can rank them internally, and put the one you think is &#039;best&#039; &lt;em&gt;first&lt;/em&gt;. So the client can easily just take the first one and ignore the others. But I as a client am going to sometimes want to see all matches, not have the ones your algorithm considered &#039;not as good&#039; hidden from me.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Definitely prefer #3 to #4.</p>

<p>A middle ground is that you can rank them internally, and put the one you think is &#8216;best&#8217; <em>first</em>. So the client can easily just take the first one and ignore the others. But I as a client am going to sometimes want to see all matches, not have the ones your algorithm considered &#8216;not as good&#8217; hidden from me.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Tod Olson</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-231</link>
		<dc:creator>Tod Olson</dc:creator>
		<pubDate>Wed, 04 Nov 2009 03:08:03 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-231</guid>
		<description>&lt;p&gt;Regarding the &quot;records&quot; data, keep it in. It feels like the sort of data element you don&#039;t miss until it&#039;s not there.  At the very least, that data make it very easy for the JSON consumer to tell whether there were multiple records returned, without having to grovel all of the items and sift out the record IDs.&lt;/p&gt;

&lt;p&gt;Perhaps the &quot;records&quot; data should also provide a &quot;recordURL&quot;. It would be analogous to &quot;itemURL&quot; in that the API would be responsible for formatting the URL to a record.  Then every consumer of this information would not have to know the &quot;how to link to a record&quot; convention, just as they do not need to know how to construct the handle URL for an item.&lt;/p&gt;

&lt;p&gt;One bit of clarification: in the text under &quot;Making the request,&quot; I read that to mean that these two URLs return identical information:
http://catalog.hathitrust.org/api/volumes/oclc/00470409.json
http://catalog.hathitrust.org/api/volumes/oclc/470409.json&lt;/p&gt;

&lt;p&gt;In the multivolume request example, did you mean &quot;…yourID2=oclc:67890987&#124;isbn:987652348X&quot;, with a &quot;&#124;&quot; rather than a &quot;&amp;&quot;?&lt;/p&gt;

&lt;p&gt;On the final question about when the records don&#039;t match, I&#039;m leaning toward #4, the best matches.  I&#039;m a little concerned about cases where some important number changes, like when OCLC records merge.  (or an ISBN is corrected or whatever.)  So if I send OCLC, LCCN, and ISBN and there&#039;s no matching OCLC number, would the service then fall back to LCCN?  Or would a miss on the OCLC mean the whole request fails?  In any case, experience with the new API will tell us whether the matching needs to be tweaked.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Regarding the &#8220;records&#8221; data, keep it in. It feels like the sort of data element you don&#8217;t miss until it&#8217;s not there.  At the very least, that data make it very easy for the JSON consumer to tell whether there were multiple records returned, without having to grovel all of the items and sift out the record IDs.</p>

<p>Perhaps the &#8220;records&#8221; data should also provide a &#8220;recordURL&#8221;. It would be analogous to &#8220;itemURL&#8221; in that the <acronym title="Application Programming Interface">API</acronym> would be responsible for formatting the <acronym title="Uniform Resource Locator">URL</acronym> to a record.  Then every consumer of this information would not have to know the &#8220;how to link to a record&#8221; convention, just as they do not need to know how to construct the handle <acronym title="Uniform Resource Locator">URL</acronym> for an item.</p>

<p>One bit of clarification: in the text under &#8220;Making the request,&#8221; I read that to mean that these two URLs return identical information:
<a href="http://catalog.hathitrust.org/api/volumes/oclc/00470409.json" rel="nofollow">http://catalog.hathitrust.org/api/volumes/oclc/00470409.json</a>
<a href="http://catalog.hathitrust.org/api/volumes/oclc/470409.json" rel="nofollow">http://catalog.hathitrust.org/api/volumes/oclc/470409.json</a></p>

<p>In the multivolume request example, did you mean &#8220;…yourID2=oclc:67890987|isbn:987652348X&#8221;, with a &#8220;|&#8221; rather than a &#8220;&amp;&#8221;?</p>

<p>On the final question about when the records don&#8217;t match, I&#8217;m leaning toward #4, the best matches.  I&#8217;m a little concerned about cases where some important number changes, like when <acronym title="Online Computer Library Center">OCLC</acronym> records merge.  (or an <acronym title="International Standard Book Number">ISBN</acronym> is corrected or whatever.)  So if I send <acronym title="Online Computer Library Center">OCLC</acronym>, LCCN, and <acronym title="International Standard Book Number">ISBN</acronym> and there&#8217;s no matching <acronym title="Online Computer Library Center">OCLC</acronym> number, would the service then fall back to LCCN?  Or would a miss on the <acronym title="Online Computer Library Center">OCLC</acronym> mean the whole request fails?  In any case, experience with the new <acronym title="Application Programming Interface">API</acronym> will tell us whether the matching needs to be tweaked.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Bill</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-230</link>
		<dc:creator>Bill</dc:creator>
		<pubDate>Tue, 03 Nov 2009 19:12:55 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-230</guid>
		<description>&lt;p&gt;Jonathan -- http://hdl.handle.net/2027/mdp.39015079651611 goes to the page-turner for that particular bound volume. http://catalog.hathitrust.org/Record/003384758 shows the metadata for the &lt;em&gt;record&lt;/em&gt; onto which that item hangs. A single serial record will have many items.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Jonathan &#8212; <a href="http://hdl.handle.net/2027/mdp.39015079651611" rel="nofollow">http://hdl.handle.net/2027/mdp.39015079651611</a> goes to the page-turner for that particular bound volume. <a href="http://catalog.hathitrust.org/Record/003384758" rel="nofollow">http://catalog.hathitrust.org/Record/003384758</a> shows the metadata for the <em>record</em> onto which that item hangs. A single serial record will have many items.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-229</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Tue, 03 Nov 2009 18:53:55 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-229</guid>
		<description>&lt;p&gt;Oh, I mean #3, not #2.  Null should not prevent a mis-match.&lt;/p&gt;

&lt;p&gt;But if I&#039;m sending LCCN=a&amp;ISBN=b, probably because I think those both refer to the same &#039;thing&#039;, and you have multiple records that refer to that &#039;thing&#039; -- I want to see them all. I don&#039;t want you to just pick an arbitrary one and hide all the rest.&lt;/p&gt;

&lt;p&gt;I mean, if I just sent an ISBN, and you had two records with that ISBN (quite possible), you&#039;d give me both of them, right, not just pick one you think is &#039;best&#039;?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Oh, I mean #3, not #2.  Null should not prevent a mis-match.</p>

<p>But if I&#8217;m sending LCCN=a&amp;<acronym title="International Standard Book Number">ISBN</acronym>=b, probably because I think those both refer to the same &#8216;thing&#8217;, and you have multiple records that refer to that &#8216;thing&#8217; &#8212; I want to see them all. I don&#8217;t want you to just pick an arbitrary one and hide all the rest.</p>

<p>I mean, if I just sent an <acronym title="International Standard Book Number">ISBN</acronym>, and you had two records with that <acronym title="International Standard Book Number">ISBN</acronym> (quite possible), you&#8217;d give me both of them, right, not just pick one you think is &#8216;best&#8217;?</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Rochkind</title>
		<link>http://robotlibrarian.billdueber.com/thinking-through-a-simple-api-for-hathitrust-item-metadata/comment-page-1/#comment-228</link>
		<dc:creator>Jonathan Rochkind</dc:creator>
		<pubDate>Tue, 03 Nov 2009 18:51:23 +0000</pubDate>
		<guid isPermaLink="false">http://robotlibrarian.billdueber.com/?p=154#comment-228</guid>
		<description>&lt;p&gt;For your last question, I would pick #2.&lt;/p&gt;

&lt;p&gt;Also, despite the fact that I asked for it, you&#039;re right that that records stuff is confusing. I&#039;m confused about the difference between http://hdl.handle.net/2027/mdp.39015079651611  and  http://catalog.hathitrust.org/Record/003384758&lt;/p&gt;

&lt;p&gt;I guess that&#039;s in part because HT is still working things out. There doesn&#039;t seem to be any reason to ever send the user to that /Record URL, since it doesn&#039;t even point to the item or provide access to searching or full text if available!&lt;/p&gt;

&lt;p&gt;I still tend to err on the side of including extra info, cause someone might need it, and when they do you&#039;re unlikely to have time to go back and add it. On the other hand, maybe this is too confusing.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>For your last question, I would pick #2.</p>

<p>Also, despite the fact that I asked for it, you&#8217;re right that that records stuff is confusing. I&#8217;m confused about the difference between <a href="http://hdl.handle.net/2027/mdp.39015079651611" rel="nofollow">http://hdl.handle.net/2027/mdp.39015079651611</a>  and  <a href="http://catalog.hathitrust.org/Record/003384758" rel="nofollow">http://catalog.hathitrust.org/Record/003384758</a></p>

<p>I guess that&#8217;s in part because HT is still working things out. There doesn&#8217;t seem to be any reason to ever send the user to that /Record URL, since it doesn&#8217;t even point to the item or provide access to searching or full text if available!</p>

<p>I still tend to err on the side of including extra info, cause someone might need it, and when they do you&#8217;re unlikely to have time to go back and add it. On the other hand, maybe this is too confusing.</p>]]></content:encoded>
	</item>
</channel>
</rss>

