Re: interesting Times article about Amazon search suggestions

From: Andrew Larrick (alarri@law.columbia.edu)
Date: Mon Mar 20 2006 - 17:54:08 PST


Charles,

I'm not sure I'm going to bite at your devil's advocacy of preferring
computer-generated identification of similar items to classification (per
your reference to Dewey) or subject-cataloging mechanisms. Certainly, in
either case, we are relying upon "other people" to make that similarity
determination -- library catalogers following an ordained (but transparent
and well-described) set of rules or the usage and browsing habits of the
mass of other users as that feedback was collected in an automated system.
Perhaps you are right that the very disorganization of that browsing mass
makes it harder for concerted bias to creep in than does the elaboration
and use of a rules-based cataloging or classification system, but it also
seems to lead (per this Amazon example) to a potentially dangerous layer of
opacity (i.e. literally no one quite has their finger on how it works,
which makes it easier to secretly meddle with). It seems to me that we may
profitably make use of them both, in appropriate contexts.

I thought that this little story about Amazon might pique the interest of
my colleagues, because I've been forced to think a lot lately about the
library catalog and, in particular, about the difficulty many law students
(and staff, and faculty, etc.) seem to have with it -- both in terms of
getting a nice initial result set (the contrasting alternatives there would
seem to rely more on pattern-matching to produce "natural language"
ranking) and in terms of finding additional "similar" items after a
first good hit (for which the collaborative-filtering systems are a
contrasting example). Our students often want to do very broad searches
and find the "best" books to use for those searches -- ideally they come to
us and we help, or instruct them in some effective heuristics like scoping
to the reserve room, limiting by date, looking for open records for
treatises that are kept updated, etc. But our collection is of a nature
where - even if they do try to use LC subject headings in a reasonably
intelligent way - we have a lot of density of items piled up on a handful
of the whole universe of subject headings.

From things that have been said to reference staff on the desk, and that
I've observed on patron computers, I suspect a lot of our patrons are also
(first?) trying to find books in Amazon. This drew my attention to what
seemed like a cautionary story of the Amazon database's function as a
searching tool coming into at least potential conflict with Amazon's real
core function as a bookseller.

I am also *far* from convinced that we ought to be thinking about a
'library version' of a search tool that used collaborative-filtering or
suggested searches based on monitoring the browsing patterns of our user
groups -- we neither have the amount of patron traffic on our systems that
it would require to work well nor are we without serious ethical objections to
the required monitoring (even for pure aggregating purposes). But we
certainly *are* in competition with systems that use such computerized
searching and matching technologies.

Andrew

p.s. RLG's "RedLightGreen" doesn't do this collaborative-filtering type
stuff, but it is a very interesting experiment in integrating
relevance-ranked "natural language" searching into library systems and in
leveraging LC subject headings into a "user friendly" way to push "related
items" at users.


