The WAG the Dog Web Localizer

Table of Contents:

Thoughts behind the localizer
Using the localizer
What the localizer does
Downloading the localizer
Acknowledgements

What is the localizer?

The Web Localizer is an attempt to create a framework that takes web resources that were not written or intended for your use or community and rewrites them so they can work within your controlled environment. The goal is to extend the library (or any service, group, community or individual that can define its relation to other objects on the web) into the places and interfaces that people are already using (and that probably are not “endorsed” or “supported” by the library). Google Scholar and Elsevier’s Scirus are perfect examples of these sorts of services. Their creators don’t actually care whether the user has access to the asset that a particular link points to; that is for the user and content provider to work out. The library, however, should care.

We waste a lot of time and breath worrying about “appropriate copy”, so it is understandable that we should be concerned that our users may go to these sites and not have access to resources that the library pays for.

Library websites and resources tend to be awkward and non-intuitive (for many reasons, certainly one of which is the rather wide diversity of content and services libraries provide) and most likely wouldn’t be the first location that springs to a user’s mind when trying to solve problems while on the web. My overarching plan is to de-emphasize the importance of “the library website” and instead push our content contextually out to users in the places they would think to look. Ah, I notice you just searched for “Hegel’s Dialectic” in Wikipedia (which, oddly, has no matches). This search would return 3 results in our catalog. It also returns 253 results from our metasearch application. Would you like to see those results? And so on. My views on this are briefly shown here.

Similarly, it makes little sense for a link to a resource on an external web page (say, a subject guide from MIT) to deny me access to a resource that my library has rights to, simply because the link goes through the remote site’s proxy server instead of my own. The localizer addresses this problem, as well.

Any relevant service could be pushed to the user, where the user actually is. You could push links to your virtual reference system to the exact spot the user is, rather than requiring them to leave and find your link (most likely somewhere on your website). With OpenURL, you could do some post-processing and find similar items for the user. You could get the LCSHs of a particular journal and, if they fall in certain ranges, present links to appropriate databases or journals with canned queries.

Localization basically requires two approaches:

Service Autodiscovery, which is the ideal, but requires cooperation on the part of the content provider. Let me go on record as saying this is probably one of the more important proposals to aid in the effort of keeping libraries relevant.
A lot of screen scraping. This, obviously, is an unsustainable approach (in the long run), but with an active community of “parser developers” this is possibly a way to make localization work until the above ideal is realized. See Jake as an example.

With these in place, an institution would set up rules for how certain types of metadata or sites are handled.

How to use the Web Localizer:

Web Localizer

Drag the above link to your toolbar. (Internet Explorer users: right-click the link and select “Add to favorites...”. Ignore any warnings it gives you.)

Go to: Google Scholar

Perform a search, then click the bookmark that was created on the toolbar.

Note: all “localization” is performed in the context of Georgia Tech at the moment. Customization comes later.

Another example: Go to Prairie View A&M’s Database listing and click on the bookmarklet.

What you should see:

Links to hosts that are available through Georgia Tech’s EZProxy server should be proxied [1].
Books are checked to see if they are held locally, both in GaTech’s catalog and the GIL Universal Catalog [2].
Articles and citations are checked against SFX to see if there are fulltext holdings [3].
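
The first behavior in the list above, proxying links to hosts that EZProxy knows about, can be sketched roughly as follows. This is an illustration in Python, not the localizer's own PHP code. The proxy address, the host list, and the function names are all assumptions made for the example, and a real EZProxy configuration has more matching rules than a plain list of host names. The only EZProxy detail relied on is its usual `login?url=` starting-point form.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration; a real deployment would read
# these from the institution's own EZProxy configuration.
PROXY_PREFIX = "https://ezproxy.example.edu/login?url="
PROXIED_HOSTS = {"jstor.org", "www.sciencedirect.com"}


def is_proxied_host(host, proxied_hosts):
    """True if host equals, or is a subdomain of, a configured host."""
    host = host.lower()
    return any(host == h or host.endswith("." + h) for h in proxied_hosts)


def localize_link(url, proxied_hosts=PROXIED_HOSTS, prefix=PROXY_PREFIX):
    """Route the URL through the proxy if its host is configured."""
    host = urlsplit(url).hostname or ""
    if is_proxied_host(host, proxied_hosts):
        return prefix + url
    # Hosts the proxy does not know about are left untouched.
    return url
```

Applied to every link on a page, `localize_link` leaves unknown hosts alone and prefixes the rest, which is the whole of the rewrite described here.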

1. Every link on the page is checked against the EZProxy hosts file; if there is a match, the localizer rewrites the URL to go through EZProxy.

2. The localizer follows the link to Openworldcat and, if present, grabs the ISBN. It then does an xISBN concordance search and emulates “FRBRization” by searching the resulting ISBN set against GT’s local catalog via Z39.50. If there are no results, it attempts the same search against the GIL Universal Catalog, also via Z39.50.

3. The localizer takes what it assumes to be the “source title” (which is the first part of the last line in a paragraph) and checks to see if it returns a result against Jake. The idea is that there is no point in overburdening our SFX server with bogus queries (there’s all sorts of stuff that appears in that last line). If Jake returns a hit, the localizer uses the SFX API to determine if fulltext is available for the article in question. If so, an icon indicating this is presented. If the API doesn’t return a fulltext hit, a “check in SFX” icon appears instead, since (a) the API is not 100% accurate and (b) I’m not checking for other services.
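
Notes 2 and 3 both describe a short chain of lookups with a fallback, and the control flow can be sketched as below. This is a Python illustration under stated assumptions, not the localizer's PHP code. The function names and the returned labels are hypothetical. The actual xISBN, Z39.50, Jake and SFX calls are passed in as plain callables, because their real interfaces are not shown here.

```python
def find_holding(isbn, related_isbns, catalogs):
    """Return the name of the first catalog holding any related edition.

    related_isbns: callable, isbn -> iterable of ISBNs (stands in for xISBN).
    catalogs: ordered list of (name, has_any) pairs, where has_any(isbns)
              stands in for a Z39.50 search and returns True or False.
    """
    isbns = [isbn] + [i for i in related_isbns(isbn) if i != isbn]
    for name, has_any in catalogs:
        # Catalogs are tried in order, so the local one goes first.
        if has_any(isbns):
            return name
    return None


def article_icon(citation, known_to_jake, sfx_has_fulltext):
    """Decide which icon, if any, to show next to a citation.

    citation: dict with a guessed "source_title" entry.
    known_to_jake: callable, title -> bool (stands in for the Jake check).
    sfx_has_fulltext: callable, citation -> bool (stands in for the SFX API).
    """
    title = citation.get("source_title", "")
    if not title or not known_to_jake(title):
        # Unrecognized title: do not send a query to SFX at all.
        return None
    if sfx_has_fulltext(citation):
        return "fulltext"
    # No fulltext hit reported, so offer a manual check instead.
    return "check-sfx"
```

Passing the lookups in as arguments keeps the ordering logic (local catalog before union catalog, Jake before SFX) separate from the services themselves, so each can be swapped for another institution's equivalent.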

Downloading the PHP Web Localizer for your own purposes:

The PHP Web Localizer is available on the WAG the Dog Sourceforge Project page.
There are very few requirements for getting it running (a web server, PHP and PHP/Yaz are basically all), so you can begin localizing quickly.

Acknowledgements:

This is by no means my idea. The localizer is an outgrowth of the WAG the Dog project with Peter Binkley and Art Rhyno. It also takes a whole lot of ideas from Dan Chudnov and Jeremy Frumkin and their amazing aforementioned article on autodiscovery (alliteration).
