Before I left for Guatemala, Ian Davis at Talis asked if I could give him a dump of our MARC records to load into Talis Platform. I had been talking in the #code4lib channel about how I was pushing the idea of using Talis Source to make simple, ad-hoc union catalogs; we could make one for Georgia Tech & Emory (we have joint degree programs) or Arche or Georgia Tech/Atlanta-Fulton Public Library, etc. My thinking was that by utilizing the Talis Platform, we could forgo much of the headache in actually making a union catalog for somewhat marginal use cases (the public library one notwithstanding).
About a week after I got back from Guatemala, I had an email from Richard Wallis with some URLs to play around with to access my Bigfoot store. He showed me search services, facet services and augment services. I wasn’t able to really dive into it at the time, but since I’m working on a total site search project for the library, I thought this would be a good chance to kick the tires a bit and see about including catalog results.
After two days of poking around, I have formed some opinions of it, have some recommendations for it, and have written a Ruby library to access it.
1) The Item Service
This is certainly the most straightforward and, for many people, the most useful service of the bunch. The easiest way to think of the item service is as an HTTP-based Lucene service (a la Solr or Lucene-WS) over your bib records. It returns something OpenSearch-y (it claims to be an RSS 1.0 document), but it doesn’t validate. That being said, FeedTools happily consumed it (more on that later) and the semantics should be familiar to anyone who has looked at OpenSearch before. Each item node also contains a Dublin Core representation of the record and a link to a marcxml representation. I’m not sure if there’s a description document for Bigfoot.
Although the query syntax is pure Lucene (title:"The Lexus and the Olive Tree"), the downside is that the available indexes aren’t documented anywhere, and I doubt there would be any way to add new ones (for example, my guess is I wouldn’t be able to get an index for 490/440$v that I use for the Umlaut). I don’t see returning the results as OAI_DC being too much of a problem, since the RSS item includes a title (which would have been tricky between the DC and the marcxml). My Ruby library might not generate valid DC; I haven’t really looked into it.
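To make the query syntax concrete, here is a minimal sketch of building a fielded phrase query and turning it into a request URL. The base URL and the `query` parameter name are placeholders invented for illustration (the post doesn’t give the real ones), and `phrase_query` and `item_search_url` are hypothetical helpers, not part of bigfoot-ruby.

```ruby
require "uri"

# Placeholder endpoint: the real store URL is an assumption, not documented here.
ITEM_SERVICE = "http://example.org/stores/my-store/items"

# Build a Lucene phrase query such as title:"The Lexus and the Olive Tree".
# Embedded double quotes and backslashes are backslash-escaped.
def phrase_query(field, phrase)
  escaped = phrase.gsub(/(["\\])/) { "\\#{$1}" }
  %(#{field}:"#{escaped}")
end

# Append the query to the base URL as a form-encoded parameter.
# The parameter name "query" is an assumption.
def item_search_url(query, base = ITEM_SERVICE)
  "#{base}?#{URI.encode_www_form(query: query)}"
end

puts item_search_url(phrase_query("title", "The Lexus and the Olive Tree"))
```

Keeping the query building separate from the URL building means the same query string could be reused against the facet service, which takes basically the same syntax.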
The docs also mention you can POST items to your Bigfoot store, but they don’t mention what your data needs to look like (MARC?) or what credentials you need to add something (I mean, it must be more than just your store name, right?). My hope is to add this functionality to bigfoot-ruby soon (especially since my data is from a bulk export from last October).
2) The Facet Service
This one is definitely intriguing, since faceted searching is all the rage right now. The search syntax is basically the same as the Item Service, except you also send a comma-delimited list of the fields you would like facets for. What you get back is either an XML or XHTML document of your results.
For each field you request, you get back a set of terms (you can specify how many you want, with a default of 5) that appear most frequently in your field. You also get an approximation for how many results you would get in that facet and a url to search on that facet. It’s quite fast, although, realistically, you can’t do much with the output of facet search alone.
Again, it’s difficult to know what you can facet on (subject, creator and date are all useful, and I’m sure there are others), and the facet that (for me, at least) held the most promise, type, is too broad to do much with (it uses Leader position 7, but lumps the BKS and SER types all in a label called “text”). I would like to see Talis implement something like my MARC::TypedRecord concept so one could facet on things like government document or conference. You could separate newspapers from journals and globes from maps. Still, the text analysis of the non-fixed fields is powerful and useful and beats the hell out of trying to implement something like that locally.
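As a rough illustration of what a finer-grained type facet could look at, here is a minimal sketch that splits books from serials using two MARC 21 leader bytes: Leader/06 (type of record) and Leader/07 (bibliographic level), counted from zero. This is not the MARC::TypedRecord implementation; `coarse_type` and its labels are hypothetical, and a real classifier would need far more (007/008 data to tell globes from maps, for instance).

```ruby
# Classify a record from its 24-character MARC leader.
# Leader/06 "a" = language material, "e" = cartographic material.
# Leader/07 "m" = monograph, "s" = serial.
def coarse_type(leader)
  type_of_record = leader[6]
  bib_level = leader[7]

  case type_of_record
  when "a"
    case bib_level
    when "m" then :book
    when "s" then :serial
    else :text # other language material, left undifferentiated
    end
  when "e" then :cartographic
  else :other
  end
end

puts coarse_type("00000nam a2200000 a 4500") # a monograph
puts coarse_type("00000nas a2200000 a 4500") # a serial
```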
In bigfoot-ruby, I have provided two ways to do a faceted search: you can just do the search and get back Facet objects containing the terms and search URLs, or you can facet with items, which executes the item searches automatically (and in turn gets a definitive number of results for each query). Since I didn’t bother to implement threading, getting facets with items can be pretty slow.
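Since the item search behind each facet term is independent of the others, the searches could in principle run concurrently. A minimal sketch with plain Ruby threads follows; `FacetTerm` and `counts_for` are hypothetical names, not bigfoot-ruby’s actual classes, and `fetch_count` stands in for whatever performs the HTTP item search and returns the hit count.

```ruby
# Hypothetical stand-in for a facet term and the URL that searches on it.
FacetTerm = Struct.new(:label, :search_url)

# Run one item search per facet term, each in its own thread.
# `fetch_count` is any callable that takes a search URL and returns a hit count.
# Returns a Hash of label => count.
def counts_for(terms, fetch_count)
  threads = terms.map do |term|
    Thread.new { [term.label, fetch_count.call(term.search_url)] }
  end
  # Thread#value waits for the thread to finish and returns its result.
  threads.map(&:value).to_h
end

# Usage with a stub in place of a real HTTP call:
terms = [
  FacetTerm.new("Globalization", "http://example.org/search?q=a"),
  FacetTerm.new("Capitalism", "http://example.org/search?q=bb")
]
stub = ->(url) { url.length }
p counts_for(terms, stub)
```

Passing the fetcher in as a callable keeps the threading logic testable without touching the network. One thread per term is fine for a handful of facet terms, but a larger request would want a cap on concurrency.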
3) The Augment Service
To be honest, I’m having a hard time figuring out useful scenarios for the augment service. The idea is that you give it the URI of an RSS feed, and this service will enhance it with data from your Bigfoot store (at least, that is sort of how I understand it works). Richard’s example for me was to feed it the output of an xISBN query (which isn’t in RSS 1.0, AFAIK, but, for the sake of example…) and the augment service would fill in the data for ISBNs your library holds. The API example page mentions Wikipedia, but I don’t know where, other than the Talis Platform, you can get Wikipedia entries formatted properly. I tried sending it the results of an Umlaut2 OpenSearch query, but it didn’t do anything with it. Presumably this RSS 1.0 feed needs the bib data to be sent in a certain way (my guess is in OAI_DC, like the Item Service), but I’m not sure. The only use case I can think of for this service is a much simpler way to check for ISBN concordance (rather than isbn:(123456789X|223456789X|323456789X|etc.))
Overall, I’m really impressed with the Talis API. It is a LOT easier to use than, say, Z39.50, and because it uses OpenSearch, it seems more natural to integrate into existing web services than SRU.
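For comparison, the item-service alternative (one big fielded query over a list of ISBNs) can be sketched as below. `isbn_query` is a hypothetical helper, not part of bigfoot-ruby. The "|" separator mirrors the shorthand in the example above; standard Lucene query syntax spells alternation as OR, so the separator is left as a parameter rather than assuming which form the store accepts.

```ruby
# Build a single fielded query matching any of the given ISBNs.
# Hyphens and whitespace are stripped, a trailing "x" check digit is
# upcased, and duplicates are dropped.
def isbn_query(isbns, separator = "|")
  cleaned = isbns.map { |isbn| isbn.gsub(/[-\s]/, "").upcase }.uniq
  "isbn:(#{cleaned.join(separator)})"
end

puts isbn_query(["1-23456-789-x", "223456789X", "323456789X"])
puts isbn_query(["123456789X", "223456789X"], " OR ")
```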
Bigfoot-ruby is definitely a work in progress. I think I would like to split the Search class into ItemService and FacetService. I don’t like how results is an Array for items and a Hash for facets. Just seems sloppy. I need to document it, of course, and I would like to implement Item POST. This project also made me realize how bloody slow FeedTools is. I am currently using it in both the Umlaut and the Finding Aids to provide OpenSearch, but I think it’s really too sluggish to justify itself.
Thanks, Talis, for getting me started with Bigfoot and giving me the opportunity to play around with it. Also, thanks to Ed Summers for fixing SVN on Code4lib.org. Otherwise, you wouldn’t be able to download bigfoot-ruby and futz around with it yourself.