{"id":204,"date":"2008-12-22T15:11:43","date_gmt":"2008-12-22T20:11:43","guid":{"rendered":"http:\/\/dilettantes.code4lib.org\/?page_id=204"},"modified":"2009-01-05T12:22:46","modified_gmt":"2009-01-05T17:22:46","slug":"the-knowledgebase-kibbutz","status":"publish","type":"page","link":"https:\/\/rossfsinger.me\/blog\/papers\/the-knowledgebase-kibbutz\/","title":{"rendered":"The Knowledgebase Kibbutz"},"content":{"rendered":"<blockquote><p>This is a preprint for a column I wrote in the Journal of Electronic Resources Librarianship called <em>Memo from the Systems Office<\/em>. The edited version appeared in Volume 20, Issue 2.<span class=\"Z3988\" title=\"ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.jtitle=The%20Journal%20of%20Electronic%20Resources%20Librarianship&amp;rft.issn=1941-126X&amp;rft.vol=20&#038;rft.issue=2&#038;rft.year=2008&#038;rft.id=info:doi\/10.1080\/19411260802272776\">&nbsp;<\/span><\/p><\/blockquote>\n<p><strong>The Knowledgebase Kibbutz<\/strong><\/p>\n<p>As libraries&#8217; collections increasingly go digital, so too grows their dependence on knowledgebases to access and maintain these electronic holdings. Somewhat unlike other library-based knowledge management systems (catalogs, institutional repositories, etc.), the knowledgebases of link resolvers or electronic resource management systems hold data that is not generally modeled, created or updated by librarians (although, admittedly, a lot of local work is done to modify or fix this data). As with the subscription-based resources they track, it is difficult to know exactly what libraries are allowed to <em>do<\/em> with their knowledgebase data. What <em>is<\/em> known is that a handful of companies are doing roughly the same work and their customers are often simultaneously fixing the same errors and inaccuracies in the data these vendors are aggregating. The 
entire process is proprietary, inefficient and redundant and is controlled by a bevy of players that have no incentive to change. A centralized, standardized approach, maintained by librarians, publishers and vendors, could not only reduce total cost of ownership, but also improve the quality of the data as well as the services built upon such a repository.<\/p>\n<p>Last spring, the UK Serials Group (UKSG) issued a report titled &#8220;Link Resolvers and the Serials Supply Chain&#8221; [1] aimed squarely at this issue (full disclosure: my employer, Talis, is a member of the UKSG Knowledge Bases and Related Technologies working group [2]). It discussed the reasons the current environment evolved the way it did, the problems and inefficiencies that exist with the status quo, and a possible, centralized alternative to improve the situation. Interestingly, the proposal put forth assumed a single organization would be responsible for shepherding such a service instead of relying on a distributed, social software based, Web 2.0 method to maintain this knowledge (although, obviously, someone or some thing would still have to host such a service). In the course of the report, the authors make a passing, somewhat dismissive mention of the Jointly Administered Knowledge Environment (JAKE), which, although it ran out of steam years before the term &#8220;Web 2.0&#8221; was even coined, was based entirely on those principles.<\/p>\n<p>JAKE, although now dead and buried (jake-db.org was decommissioned January 31st, 2007 after years of neglect [3]), was hardly a &#8220;failure&#8221; as the UKSG report would indicate. It was started in 1999 by the Cushing\/Whitney Medical Library at Yale, and its functionality went on to form the basis of two different currently available link resolvers (OCLC Openly Informatics&#8217; WorldCat Link Manager [4], <span class=\"secondary-bf\">n\u00e9e 1Cate<\/span>, and Simon Fraser 
University&#8217;s CUFTS [5]). Although no working copies of JAKE remain, a Google search for &#8220;jointly administered knowledge environment&#8221; yields over 25,000 hits, of which a sizable proportion are library web pages pointing at one of the four former JAKE locations. While this certainly appears to be an example of &#8220;the tragedy of the commons&#8221; (many people benefiting, few people contributing), the reality of why JAKE failed to take firmer root is probably more complicated than that. Since it had no direct interaction with commercial link resolvers and, as such, provided a questionable real-life benefit to library users, the motivation for librarians not directly involved in the project to maintain it was fairly minimal. Since JAKE predated the rise of the social web, there was no precedent for how to cultivate a community to keep it running. Since JAKE did not fit into the workflows of the librarians or the publishers, there was little incentive to invest a lot of effort in the submission or upkeep of the data. JAKE was doomed not because it wasn&#8217;t useful but because it could not articulate its usefulness.<\/p>\n<p>The danger in the UKSG approach is the creation of another closed, subscription-based service like Crossref or WorldCat. While the credibility and existing relationships with the major stakeholders (publishers, aggregators, libraries, vendors) that an organization like OCLC or Crossref already has would certainly be desirable and give an immediate boost to such a project, past performance offers little hope for clearing up the issue of what subscribers are actually allowed to do with the data found in a centralized, paid-access knowledgebase. It also does nothing to address the openness required to build a community on this service, which in turn raises questions about how the data contained within would be 
maintained. Along with people contributing to the database and journal information, it would also need application developers, from libraries, library vendors, publishers and database vendors and beyond, to write applications that depend not only on the existence of this service but on the accuracy of its data. If business interests are staked on the success of this venture, and even more importantly, <em>multiple and diverse business interests<\/em>, the chances of its survival improve.<\/p>\n<p>By no means would this project be trivial. From a technical, political or administrative perspective, there are countless pitfalls. Modeling this data is hard. To back up this statement, one need only look at the arcane ways libraries and publishers have tried to model it in the past (MARC21 Format for Holdings Data [6], ONIX for Serials Online Holdings [7], etc.). Reaching agreement on the criteria for inclusion in such a knowledgebase would take months of debate. Adoption of this resource would require librarians to feel confident in its accuracy; just because the data will be available to correct does not mean that anybody will want to update vast percentages of it. The major link resolver and ERMS vendors have little incentive to participate: regardless of how awkward or inefficient their current processes are, they work, and maintaining the status quo does not require a massive refactoring of workflows and code. Perhaps the biggest question would be, who would host this service? Who has the infrastructure to support a project like this? Who will pay for the servers, bandwidth and upkeep? Would the stakeholders feel comfortable if this were provided by a vendor, regardless of how open the community is? When the community reaches an impasse, who 
will have the authority to make executive decisions? These questions are all currently unanswered, and the prospect of an organization like NISO (or another similar body) being tasked to solve them means that there are no solutions in sight.<\/p>\n<p>The upside of this service would be so great, though. Beyond the obvious potential it brings to the evolution of the knowledgebase as we currently know it (a standard for publishers and providers to point to for their holdings submissions; a centralized source to add and disseminate targets provided by librarians when the vendor community cannot or will not; and a means to eliminate the redundancies of maintaining multiple knowledgebases between vendor offerings), the community-contributed knowledgebase also creates opportunities for new services. Besides a crop of new or improved OpenURL resolver, ERMS and Metasearch offerings that would likely spring forth, more tangentially related applications would be possible as well. With a centralized registry of primarily electronic resources, a uniform identifier can be given to items such as databases or e-book packages, since there is, shamefully, nothing that addresses that need currently. Preprint\/post-print\/working paper coverage can be associated with original resources. Relationships can be defined between publications and web services that fall outside of the traditional library purview: journal tables of contents found at CiteULike [8], full text coverage at Google Book Search, and more. By keeping the data open, people from outside the library community can utilize, reconstitute and extend our data, providing libraries with services that they would not have imagined or had the means to create.<\/p>\n<p>Even the pitfalls are surmountable. Organizations such as the Internet Archive&#8217;s Open Library and Wikipedia have had to deal with the realities of 
openly editable and therefore highly dynamic content. Mechanisms, whether automated or manual, could be put into place to monitor or spot check edits for accuracy. Libraries could lock their resolvers to specific edits of a particular resource until they (or a predetermined &#8220;trusted&#8221; agent: a consortial partner or their vendor, say) approve the most recent edition. Library software vendor buy-in can be achieved in two ways: comprehensive publisher and aggregator support or simple economics. As the UKSG report states, the publishing community would relish a single, definitive means for formatting and distributing their holdings. Compared with many ad-hoc arrangements for resolver vendors, subscription agents, and ERMS products, a consistent and streamlined process is a much easier business case to sell to the publishers. If the publishers commit to solely producing their holdings through a centralized service, the vendors have little choice but to acknowledge and use it. As to the latter approach, it is time-consuming and therefore expensive to maintain a large and accurate knowledgebase. Simon Fraser University&#8217;s subscription costs for their CUFTS link resolver are almost exclusively used for cost recovery for their knowledgebase maintenance. OhioLINK and the Colorado Alliance of Research Libraries (CARL) must face similar predicaments for their OLinks [9] and GoldRush [10] products, respectively. The smaller traditional commercial vendors are probably looking at an even bleaker return on investment. This is, no doubt, why OCLC Openly&#8217;s WorldCat Link Manager&#8217;s knowledgebase is used behind the scenes by the majority of link resolvers on the market today. However, if the link resolver and ERMS suppliers outside of the big three (Ex Libris, Serials Solutions and OCLC 
Openly) contributed to a collaborative knowledgebase, they might, together, have a large enough market to influence the data providers to contribute to it (which, in turn, may be an end around to the first solution). The non-participating vendors could use the community as a means to share unsupported targets created by their customers by providing an import\/export mechanism to their current, proprietary knowledgebases. There are multiple benefits to having a resource like this; it does not necessarily have to power the link resolver product directly as long as it can be integrated seamlessly. The vendors jumped at supporting Google&#8217;s demands for Google Scholar export, so why not this?<\/p>\n<p>Most likely, the best solution would be for somebody to just get a handful of the stakeholders together and create something that works, preferably with an OpenURL link resolver or electronic resource management system modified or built to use it (open source or commercial, it does not matter). The knowledgebase crisis is not going away, and as the digital universe expands, especially to new and different formats, it will only get more difficult to manage. By tapping into the power of the entire community, from the beginning of the publishing chain to the end user, the knowledgebase becomes self-sustaining and finds new and interesting uses along the way. The world does not need another closed-access, subscription-based library data silo. We have enough of those already.<\/p>\n<p>Notes:<\/p>\n<ol>\n<li><a href=\"http:\/\/uksg.org\/sites\/uksg.org\/files\/uksg_link_resolvers_final_report.pdf\" target=\"_blank\">http:\/\/uksg.org\/sites\/uksg.org\/files\/uksg_link_resolvers_final_report.pdf<\/a><\/li>\n<li><a href=\"http:\/\/uksg.org\/kbart\" target=\"_blank\">http:\/\/uksg.org\/kbart<\/a><\/li>\n<li><a 
href=\"http:\/\/sourceforge.net\/mailarchive\/forum.php?thread_name=FB9F8B50-9938-470E-8AE7-16E8AE5EDD62%40umich.edu&amp;forum_name=jake-list\" target=\"_blank\">http:\/\/sourceforge.net\/mailarchive\/forum.php?thread_name=FB9F8B50-9938-470E-8AE7-16E8AE5EDD62%40umich.edu&amp;forum_name=jake-list<\/a><\/li>\n<li><a href=\"http:\/\/www.oclc.org\/linkmanager\/\" target=\"_blank\">http:\/\/www.oclc.org\/linkmanager\/<\/a><\/li>\n<li><a href=\"http:\/\/cufts.lib.sfu.ca\/\" target=\"_blank\">http:\/\/cufts.lib.sfu.ca\/<\/a><\/li>\n<li><a href=\"http:\/\/www.loc.gov\/marc\/holdings\/echdhome.html\" target=\"_blank\">http:\/\/www.loc.gov\/marc\/holdings\/echdhome.html<\/a><\/li>\n<li><a href=\"http:\/\/www.editeur.org\/onixserials\/ONIX_SOH1.1.html\" target=\"_blank\">http:\/\/www.editeur.org\/onixserials\/ONIX_SOH1.1.html<\/a><\/li>\n<li><a href=\"http:\/\/www.citeulike.org\/journals\" target=\"_blank\">http:\/\/www.citeulike.org\/journals<\/a><\/li>\n<li><a href=\"http:\/\/olinks.ohiolink.edu\/\" target=\"_blank\">http:\/\/olinks.ohiolink.edu\/<\/a><\/li>\n<li><a href=\"http:\/\/www.coalliance.org\/index.php?option=com_weblinks&amp;catid=21&amp;Itemid=41\" target=\"_blank\">http:\/\/www.coalliance.org\/index.php?option=com_weblinks&amp;catid=21&amp;Itemid=41<\/a><\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>This is a preprint for a column I wrote in the Journal of Electronic Resources Librarianship called Memo from the Systems Office. The edited version appeared in Volume 20, Issue 2.&nbsp; The Knowledgebase Kibbutz As libraries&#8217; collections increasingly go digital, so too grows their dependence on knowledgebases to access and maintain these electronic holdings. Somewhat 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":145,"menu_order":0,"comment_status":"open","ping_status":"open","template":"","meta":{"footnotes":""},"class_list":["post-204","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/204","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/comments?post=204"}],"version-history":[{"count":4,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/204\/revisions"}],"predecessor-version":[{"id":232,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/204\/revisions\/232"}],"up":[{"embeddable":true,"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/pages\/145"}],"wp:attachment":[{"href":"https:\/\/rossfsinger.me\/blog\/wp-json\/wp\/v2\/media?parent=204"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}