Please: don’t return your books
February 12, 2013 at 4:16 pmCategory:Uncategorized
So, I’m at code4lib 2013 right now, where side conversations and informal exchanges tend to be the most interesting part.
Last night I had an conversation with the inimitable Michael B. Klein, and after complaining about faculty members that keep books out for decades at a time, we ended up asking a simple question:
> How much more shelving would we need if everyone returned their books?
Assuming we could get them all checked in and such, well, where would we put them?
I’m looking at this in the simplest, most conservative way possible:
- Assume they’re all paperbacks, so we don’t worry about how thick a cover is (cover width = 0)
- Assume items for which we don’t have page count information are “average”
Starting data
What’s my current situation at Michigan?
- Total bibs: about 10M (but that includes a bunch of HathiTrust items and other electronic-only items that could never be checked out)
- Total items checked out right now: 162,080
The first problem I run into is that I don’t know how many pages are in a given book. Well, in theory I can look in MARC field 300$a, and it will tell me.
Finding the number of pages in a book
I went through a recent dump of all our records and pulled out page counts from the 300 (those that matched the regular expression $$a\d+\s+[pP].).
Problem solved, right? Well, kind of
- 3,085,433 total bibs with page count data (about 30%)
- 40,872 checked out items with page count data (about 25%)
OK, so I don’t have data for everything. Plus, some of those are multi-volume works that list the total page count, even though only a single volume may be checked out.
We’ll have to drop down into statistics:
- Average number of pages in a checked-out item: 270
- Median number of pages in a checked-out item: 244
The median is lower, so we’ll go with that. Being conservative, remember?
Bringing it all together
Obviously we need to make a lot of assumptions.
- All paperbacks (== no space allowance for covers)
- 244 pages per item (the median of checked out items for which we have data)
- Pages = 244 * 162,080 = 39,547,520 pages
So…what’s the damage?
But how to do the calculation?
It turns out that simply googling book spine width calculator a few come up.
I picked one and input 39,547,520 pages and assumed 50lb paper (the lightest paper in the tool).
Total width: 77,241.25 inches, or 6437 feet, or 1.22 miles
1.22 miles???
Well, we had a lot of assumptions,but most of them were pretty conservative. And I have no idea if the book spine calculator is at all accurate.
But…it’s gonna be a big number no matter what. Add in that many of them are hardcover, and this seems like a pretty good guess at a lower end.
What is this good for again?
Oh, nothing at all. Just a little fun while I’m at code4lib.
Next steps
Well, the best next step would be to walk away. This is a huge waste of time.
But…we could look in the 020s for a hint of whether it’s hardcover or paperback (which is really hard. And maybe try to figure out if multiple volumes of a multi-volume work are all checked out and take that into account.
But really: this is enough for me. Whether Michael wants to pursue it further on his own, well, that’s up to him.
I had a class in Library School in the 1990′s on operations research in the library. There was some interesting stuff on adjusting loan periods so that some number of books was always checked out in order to make sure there was enough shelf space.
A bunch of researchers applied Operations Research (queueing theory/monte carlo simulations, and more math stuff) to the relationship between library loan periods/policies and shelving needs (as well as user satisfaction etc)
Here is an excerpt from some of Buckland’s research in the 1960′s. It doesn’t actually talk about deliberately making the loan period longer in order to reduce the need for shelf space though. I’m pretty sure either his book or subsequent research dealt with this issue however. On the other hand, this research on user’s behavior from 1968 sounds relevant for today.
http://people.ischool.berkeley.edu/~buckland/lancasterlru.pdf
During 1967 and 1968 a series of measurements were undertaken which showed
that library users could find the books they were looking for about 6 times out of 10;
that the major cause of nonavailability was that the book was out on loan to someone else;
that borrowed books tended to remain out for the full length of the loan period;
that in practice a loan period was determined not by written policies but by when overdue fines began;
that disappointed would-be borrowers did not often avail themselves of the procedures for recalling books back from loan;
and that in-library book use tended to have a stable relationship to circulation in any given library (Hindle and Buckland, 1978).
A Monte Carlo simulation was used to avoid the limitations of queuing theory. A flow chart of borrowing activities was programmed so that a computer could simulate the sequence of users seeking a single book, its repeatedly being borrowed and returned, and how often a copy was not available when sought. The simulation was flexible enough to show the effects of changes in the pattern and level of demand, in the length of the loan period, and/or of changing the number of copies of that book.
For more details see Buckland’s book on this “Book availability and the library user”: http://mirlyn.lib.umich.edu/Record/000014253
Tom