{"id":208,"date":"2008-12-22T15:18:39","date_gmt":"2008-12-22T20:18:39","guid":{"rendered":"http:\/\/dilettantes.code4lib.org\/?page_id=208"},"modified":"2009-01-05T12:22:19","modified_gmt":"2009-01-05T17:22:19","slug":"in-search-of-a-really-%e2%80%9cnext-generation%e2%80%9d-catalog","status":"publish","type":"page","link":"https:\/\/rossfsinger.me\/blog\/papers\/in-search-of-a-really-%e2%80%9cnext-generation%e2%80%9d-catalog\/","title":{"rendered":"In Search of a Really \u201cNext Generation\u201d Catalog"},"content":{"rendered":"<blockquote><p>This is a preprint for a column I wrote in the Journal of Electronic Resources Librarianship called <em>Memo from the Systems Office<\/em>. The edited version appeared in Volume 20, Issue 3<span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=The%20Journal%20of%20Electronic%20Resources%20Librarianship&amp;rft.issn=1941-126X&amp;rft.vol=20&#038;rft.issue=3&#038;rft.year=2008&#038;rft.id=info:doi\/10.1080\/19411260802412752\">&nbsp;<\/span><\/p><\/blockquote>\n<p><strong>In Search of a Really \u201cNext Generation\u201d Catalog<\/strong><\/p>\n<p>Ever since North Carolina State University Libraries launched their Endeca-based OPAC replacement at the beginning of 2006, the library world has been completely obsessed with ditching its old, tired catalog interfaces (and with good reason) for the greener pastures of more sophisticated indexing, more accurate relevance ranking, dust jackets and the most coveted feature of all: facets. Despite the fact that Medialab had brought AquaBrowser to the U.S. 
market nearly a year earlier, NC State set the rules of the game and the target that the rest of the profession was to aim for. And, indeed, it was quite a radical and welcome change; the interface to the catalog had barely changed since it had been migrated from command line terminals to the web. The ink had barely dried on NC State\u2019s press release before a whole host of similar products were announced by the library vendors.<\/p>\n<p>We are now starting to see the fruits of their labor, as well as a handful of open source projects, making their way into production. Alongside AquaBrowser and Endeca, we now have Innovative Interfaces\u2019 Encore, Ex Libris\u2019 Primo, Prism 3 by my employer, Talis, and OCLC\u2019s Worldcat Local. In the near future, VTLS and SirsiDynix should also be rolling out similar products. If the price tag on any of these offerings is too much for your library to afford, Villanova University, the University of Virginia, and Plymouth State University have all created comparable functionality in their free and open source projects: VuFind, Blacklight and Scriblio, respectively.<\/p>\n<p>An environmental scan of all the options certainly finds more similarities between the products than differences: facets, dust jackets, user-created tagging and bookmarkable URLs. Much of the functionality is thanks to the Solr project from the Apache Software Foundation: a full text indexer that produces Google-like search results as well as a simple faceting engine to further limit the scope of the search context. In most respects, Solr\u2019s feature set is nearly indistinguishable from Endeca\u2019s Profind product. Couple that with the fact that Solr is an open source application that is free to integrate into your product, and it is easy to see why it is a popular 
choice. Many of the next generation catalog replacements (although not all) use Solr to do the heavy lifting, so the differences lie mainly in the minutiae of how the data is indexed and presented. There are exceptions, of course, such as AquaBrowser\u2019s visualized display of search results or the way that Worldcat Local includes other OCLC data sources, such as its union catalog, name authorities and article databases, in its search results. It is probably not coincidental that these are two of the products that are not based on Solr.<\/p>\n<p>In all fairness, these products do offer a significant advantage over Endeca: since Profind has no out-of-the-box interface, all development must be done locally. On top of the sticker price, a library would also need to find the resources to build something to actually use it, requirements outside the reach of the great majority of institutions. The library vendors, on the other hand, offer relatively turn-key solutions, giving libraries a much simpler path to improved functionality.<\/p>\n<p>As much of an improvement as these OPAC replacements are (and, certainly, they are a vast improvement over the status quo), they are all still based on some fundamentally flawed principles: they are all still relatively closed-world silos intended to index MARC records. This is no criticism of the MARC format, catalogers or cataloging practices, but the way that data is represented in catalog records is ill-suited to a next generation OPAC. The records are sparse, in many cases very sparse. Records are seldom updated. There is also no distinction of discrete concepts within the record: what exists is a blob of metadata about a work with strings identifying the creator or subject. To relate different blobs, the OPAC matches on those strings. This constraint 
is an unnecessary holdover in a modern library system. The creators, the subjects, the publishers, all of the distinct concepts that appear in the self-contained record should be first-class citizens, their own distinct records with their own distinct behaviors, rather than being merely strings to search on. Another shortcoming is that all records are displayed more or less identically. Despite the differences between a map and a journal, and, more importantly, a user\u2019s expectation of what they would use a map or a journal for, they both appear in roughly the same template with many of the same labels and options. What we have now are systems that ape Amazon\u2019s look and feel with a minute fraction of the kind of data that makes Amazon compelling.<\/p>\n<p>These new, shiny silos also do very little to harness the potential of the communities they serve. There is practically nothing the users can do to influence the way the system works and very little they can do to add useful data to the records beyond comments, ratings and tagging. These, sadly, have minimal value given the small user populations of the majority of libraries, especially compared to the number of resources in the collection. Almost no effort is made to integrate the data into the larger information ecosystem of the web; instead, data sources must be pulled in and indexed internally to be acknowledged. As the catalog takes a less central role in the library, this approach becomes less practical.<\/p>\n<p>There are some glimmers of hope. VuFind, for example, has a plugin for crude author pages, based on name authority records, pulling biographical data from Wikipedia along with local holdings for the author\u2019s works. Worldcat Local goes further with this, integrating its Worldcat Identities service with 
authors, showing all works by a given person and noting what is available locally and what might be held at nearby libraries. Prism 3, while not yet taking advantage of it, is built on Talis\u2019 semantic web-based Platform, giving it the potential to tap into, integrate and, conversely, feed open data from all over the net.<\/p>\n<p>Scriblio takes a more direct approach to integrating into the wider web. Built on the blogging platform WordPress, it taps into the existing social framework optimized for Web 2.0-style services (what Casey Bisson, Scriblio\u2019s creator, calls \u201cthe Google Economy\u201d). Instead of recreating bookmarking services or methods to define identity or prevent comment spam, Scriblio is able to defer to services such as del.icio.us, OpenID and Akismet, and to use Technorati and Google Blog Search, as well as trackbacks, to discover content linking to the catalog. WordPress\u2019s broad selection of plugins gives Scriblio the opportunity to include data from third parties such as last.fm for records about musical works, Flickr for images, and maps from Google or Yahoo.<\/p>\n<p>Although Scriblio is a perfect netizen, it is a little unclear how effective it is as a library discovery system. Many of the criticisms leveled against the other next generation catalog replacements also apply to Scriblio: the data is largely unenhanced and immutable; the community has little effect on the records; and there is no distinction in the display for different kinds of data. Scalability becomes another issue. WordPress was never designed for the amount of data that a large research library would have. After all, it is a personal publishing platform intended for content creation with minimal emphasis on searching. Scriblio\u2019s interface in particular 
seems to be optimized for the display of monographs. Serials, databases and how they relate to one another, an integral part of the modern research library, may not translate nearly as well.<\/p>\n<p>Another intriguing alternative is BiblioCommons. BiblioCommons, from a company of the same name, adds a rich social network to the library by leveraging user-contributed content and borrowing history and by creating personal profiles that in turn form communities of interest around similar tastes. The focus is on the users, user behavior and user experience rather than on the bibliographic metadata, which would allow communities to shape their libraries in ways that work best for them. Creating services based on user data requires populations large enough to generate the activity needed to make informed decisions. This model works well for public libraries with large pools of borrowers. Being more general in scope, smaller public library districts can feed into the same user base, provided the groups are relatively culturally and linguistically homogeneous. Academic libraries do not have this luxury, however; it remains to be seen how BiblioCommons\u2019 model adapts to meet their needs.<\/p>\n<p>The downside of BiblioCommons is that, even after nearly two years of presentations and announcements, as of this writing there are no production instances and all of the betas are private. There is no way to know how well the approach scales, works or fits in with the rest of the information landscape because it has yet to be subjected to the rigors of real-world usage. Regardless of how BiblioCommons fares, its method and user-centric approach are worth a look.<\/p>\n<p>Just as the interfaces are changing in style from their ancestor, the card catalog, so too are catalogers realizing that the underlying data must evolve. Development of 
RDA (Resource Description and Access), FRBR and the Dublin Core Abstract Model (DCAM) is inspired, in part, by the inefficiencies in applying MARC and traditional cataloging techniques to the modern information universe. There is a classic \u201cchicken or the egg\u201d scenario here, though. RDA is unlikely to gain much support until it is adopted by the cataloging modules in the library management systems, but, as this would probably be the most significant change to online library systems since their creation, adoption would require a major overhaul on the part of the vendors, further slowing its uptake.<\/p>\n<p>Rather than the integrated library system, perhaps it should be the discovery systems that drive the adoption and proliferation of RDA and the DCAM in libraries. With their focus on resources rather than records, on integrating data from other sources and on generating a collaborative web of data, these standards seem a perfect fit for the kinds of next generation search and discovery systems needed in our increasingly distributed and digitized landscape.<\/p>\n<p>Disintegrate the bibliographic data from the inventory control system, let it incorporate and display however it wants or needs to, and leave the circulation, purchasing and serials prediction to the back-office application. Like it or not, it is the direction the current crop of catalog replacements is taking us anyway; it is time to shed the trappings of the card catalog and reconfigure our assets to work with the web instead of around it. Until we start to work with the data as it is intended, rather than how it has traditionally been structured, these next generation tools that we find so innovative will merely underwhelm.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is a preprint for a column I wrote in the Journal of Electronic Resources Librarianship called Memo from the Systems Office. 
The edited version appeared in Volume 20, Issue 3&nbsp; In Search of a Really \u201cNext Generation\u201d Catalog Ever since North Carolina State University Libraries launched their Endeca based OPAC replacement in the beginning [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":145,"menu_order":0,"comment_status":"open","ping_status":"open","template":"","meta":{"footnotes":""},"class_list":["post-208","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/208","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/comments?post=208"}],"version-history":[{"count":3,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/208\/revisions"}],"predecessor-version":[{"id":230,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/208\/revisions\/230"}],"up":[{"embeddable":true,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/145"}],"wp:attachment":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/media?parent=208"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}