Sick of hearing about linked data? You’re not alone

‘This looks a little bit complicated’ … you don’t say… #lodlam #lasum2016 (@lissertations, 8 Dec 2016)

I’m not attending ALIA Information Online this year, largely because the program was broadly similar to NDFNZ (which I attended last year) and I couldn’t justify the time off work. Instead I’m trying to tune into #online17 on Twitter, in between dealing with mountains of work and various personal crises.

As usual, there’s a lot of talk about linked data. Pithy pronouncements on linked data. Snazzy slides on linked data. Trumpeting tweets about linked data.

You know what?

I’m sick of hearing about linked data. I’m sick of talking about linked data. I’m fed up to the back teeth with linked data plans, proposals, theories, suggestions, exhortations, the lot. I’ve had it. I’ve had enough.

What will it take to make linked data actually happen?

Well, for one thing, ‘linked data’ could mean all sorts of things. Bibframe, that much-vaunted replacement for everyone’s favourite 1960s data structure MARC, is surely years away. RDF and its query language SPARQL are here right now, but the learning curve is steep and getting them to interoperate with legacy library data and systems is difficult. Whatever OCLC is working on has the potential to monopolise and commercialise the entire project. If people use ‘linked data’ to mean ‘indexed by Google’, well, there’s already a term for that. It’s called SEO, or ‘search engine optimisation’, and marketing types are quite good at it. (I have written on this topic before, for those interested.)

Furthermore, linked data is impossible to implement on an individual level. Making linked data happen in a given library service, including—

  • modifying one’s ILS to play nicely with linked data
  • training your cataloguing and metadata staff (should you have any) on writing linked data
  • ensuring your vendors are willing to provide linked data
  • teaching your floor staff about interpreting linked data
  • convincing your bureaucracy to pay for linked data and
  • educating the public on what the hell linked data is

—requires the involvement of dozens of people and is far above my pay grade. Most of those people can be relied upon to care very little, or not at all, about metadata of any kind. Without rigorous description and metadata standards, not to mention work on vocabularies and authority control, our linked data won’t be worth a square inch of screen real estate. The renewed focus on library customer service relies on staff knowing what materials and services their library offers. This is impossible without good metadata, which in turn is impossible without good staff. I can’t do it alone, and I shouldn’t have to.

Here, the library data ecosystem is so tightly wrapped around the MARC structure that I don’t know if any one entity will ever break free. Libraries demand MARC records because their software requires them. Their software requires MARC records because vendors wrote it that way. Vendors wrote the software that way because libraries demand it. It’s a vicious cycle, and one that vendors currently have little incentive to break.

I was overjoyed to hear recently of the Oslo Public Library’s decision a few years ago to ditch MARC completely and catalogue in RDF using the Koha open-source ILS. They decided there was no virtue in waiting for a standard that may never come, and decided to Make Linked Data Happen on their own. The level of resultant original cataloguing is quite high, but tools like MARC2RDF might ameliorate that to an extent. Somehow, I can’t see my workplace making a similar decision. It’d be awesome if we did, though.
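For anyone wondering what 'cataloguing in RDF' looks like at its smallest, here is a rough sketch in plain Python. It is not Oslo's data model or anything Koha produces: the example.org URIs are made up, Dublin Core terms are used only because they are widely recognised, and the helper names are hypothetical. It writes two statements about a book in N-Triples, the simplest RDF serialisation.

```python
# Dublin Core terms namespace (a real, widely used vocabulary).
DCTERMS = "http://purl.org/dc/terms/"

def uri(value):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return f"<{value}>"

def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

# Hypothetical identifiers; a real catalogue would mint or reuse its own.
BOOK = "http://example.org/work/great-expectations"
AUTHOR = "http://example.org/person/charles-dickens"

TRIPLES = [
    (uri(BOOK), uri(DCTERMS + "title"), literal("Great Expectations")),
    (uri(BOOK), uri(DCTERMS + "creator"), uri(AUTHOR)),
]

if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
```

The second statement is where the 'linked' part comes in: the creator is a URI rather than a text string, so other records (and other libraries) can point at the same thing.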

I don’t yet know what will make linked data happen for the rest of us. I feel like we’ve spent years convincing traditionally-minded librarians of the virtues of linked data with precious little to show for it. We’re having the same conversations over and over. Making the same pronouncements. The same slides. The same tweets. All for something that our users will hopefully never notice. Because if we do our jobs right and somehow pull off the biggest advancement in library description since the invention of MARC, our users will have no reason to notice—discovery of library resources will be intuitive at last.

Now that would be something worth talking about.

Linked data: the saviour of libraries in the internet age?

Another day, another depressing article about the future of libraries in the UK. I felt myself becoming predictably frustrated by the usual ‘libraries are glorified waiting rooms for the unemployed’ and ‘everything’s on the internet anyway’ comments.

I also found myself trying to come up with ways to do something about it. Don’t get me wrong, I like a good whinge as much as the next man, but whinging only sustains me for so long. Where possible I like to find practical solutions to life’s problems. The issue of mass library closures in the UK might seem too much for one librarian to solve—especially a student librarian on the other side of the world with absolutely no influence in UK politics. But I won’t let that put me off.

Consider the following: Google is our first port of call in any modern information search, right? When we want to know something, we google it. That’s fine. Who determines what appears in search results? Google’s super-secret Algorithm, harnessing an army of spiders to index most corners of the Web. How do web admins try to get their sites to appear higher in search results? Either the dark art of search engine optimisation (SEO), which is essentially a game of cat-and-mouse with the Algorithm, or the fine art of boutique metadata, which is embedded in a Web page’s <meta> tags and used to lure spiders.
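To make the <meta> tag idea concrete, here is a small sketch using only Python's standard library. It reads the name/content pairs out of a page roughly the way a crawler might. The sample page, its wording, and the class and function names are all invented for illustration, and none of this says anything about how Google actually weighs such tags.

```python
from html.parser import HTMLParser

class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from a page's <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Ignore tags such as <meta charset="..."> that carry no name/content pair.
        if name and content:
            self.meta[name.lower()] = content

# Invented OPAC page for a single title.
SAMPLE_PAGE = """
<html><head>
<title>Great Expectations | Example Library</title>
<meta charset="utf-8">
<meta name="description" content="Borrow Great Expectations at your local library today!">
<meta name="keywords" content="Dickens, fiction, classics">
</head><body></body></html>
"""

def extract_meta(html):
    """Return a dict of meta tag names to their content."""
    parser = MetaTagCollector()
    parser.feed(html)
    return parser.meta

if __name__ == "__main__":
    for name, content in extract_meta(SAMPLE_PAGE).items():
        print(f"{name}: {content}")
```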

Despite falling patronage and the ubiquity of online information retrieval, libraries are absolutely rubbish at SEO. When people google book or magazine titles (to give but one example), libraries’ OPACs aren’t appearing in search results. People looking for recreational reading material are libraries’ target audience, and yet we’re essentially invisible to them.

Even if I accept the premise that ‘everything’s on the internet’ (hint: no), how do people think content ends up on the internet in the first place? People put things online. Librarians could put things online if their systems supported them. Librarians could quite easily feed the internet and reclaim their long-lost status as information providers in a literal sense.

The ancient ILS used by my workplace is an aggravating example of this lack of support. If our ILS were a person it would be a thirteen-year-old high schooler, skulking around the YA section and hoping nobody notices it’s not doing much work. Our OPAC, for reasons I really don’t understand, has a robots.txt warding off Google and other web crawlers. The Web doesn’t notice it and patrons don’t either. It doesn’t help that MARC is an inherently web-unfriendly metadata standard; Google doesn’t know or care what a 650 field is, and it’s not about to start learning.
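A robots.txt file is short enough to test by hand. The sketch below uses Python's standard urllib.robotparser to ask whether a crawler would be allowed to fetch a catalogue record under two policies. Both policies and the opac.example.org URLs are invented; they are not my workplace's actual file.

```python
from urllib.robotparser import RobotFileParser

def crawler_may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# A policy that wards off every crawler from the whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# A hypothetical alternative: hide patron accounts, leave records crawlable.
ALLOW_RECORDS = """\
User-agent: *
Disallow: /account/
"""

RECORD_URL = "https://opac.example.org/record/12345"
ACCOUNT_URL = "https://opac.example.org/account/loans"

if __name__ == "__main__":
    for label, policy in [("block everything", BLOCK_EVERYTHING),
                          ("allow records", ALLOW_RECORDS)]:
        allowed = crawler_may_fetch(policy, "Googlebot", RECORD_URL)
        print(f"{label}: record page crawlable = {allowed}")
```

Under the first policy no record page is crawlable at all, however good the metadata behind it. The second keeps private pages out of reach without hiding the collection.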

(Screenshot below obscures the name of my workplace in the interests of self-preservation)

[Screenshot]

Down with this sort of thing.

Perhaps in recognition of this problem, vendor products such as SirsiDynix’s Bluecloud Visibility promise to convert MARC records to linked data in Bibframe and make a library’s OPAC more appealing to web crawlers. I have no idea if this actually works or not (though I’m dying to find out). For time-poor librarians and cash-strapped consortia, an off-the-shelf solution would have numerous benefits.
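As a rough idea of what a MARC-to-web conversion involves, here is a sketch that maps a few MARC fields onto schema.org JSON-LD, the kind of markup search engines document for web pages. This is an assumption-laden illustration, not how any vendor product works and not Bibframe: the record is a hypothetical, heavily simplified dict rather than real MARC, the field mapping is one of many possible, and the function names are made up.

```python
import json

# Hypothetical simplified record: MARC tag -> list of {subfield code: value}.
SAMPLE_RECORD = {
    "100": [{"a": "Dickens, Charles"}],
    "245": [{"a": "Great expectations"}],
    "650": [{"a": "Orphans", "v": "Fiction"},
            {"a": "Young men", "v": "Fiction"}],
}

def first_subfield(record, tag, code):
    """Return the first value of a subfield in a tag, or None if absent."""
    for field in record.get(tag, []):
        if code in field:
            return field[code]
    return None

def marc_to_schema_org(record):
    """Map title (245 $a), author (100 $a) and subjects (650 $a) to a schema.org Book."""
    doc = {"@context": "https://schema.org", "@type": "Book"}
    title = first_subfield(record, "245", "a")
    if title:
        doc["name"] = title
    author = first_subfield(record, "100", "a")
    if author:
        doc["author"] = {"@type": "Person", "name": author}
    subjects = [field["a"] for field in record.get("650", []) if "a" in field]
    if subjects:
        doc["about"] = subjects
    return doc

def to_script_tag(doc):
    """Wrap the JSON-LD in the script element used to embed it in a page."""
    return '<script type="application/ld+json">' + json.dumps(doc) + "</script>"

if __name__ == "__main__":
    print(to_script_tag(marc_to_schema_org(SAMPLE_RECORD)))
```

The 650 fields become plain 'about' values here, which throws away subdivisions and authority links. A serious conversion would need to carry those across, which is where the real work lies.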

But even the Google screenshot included in the article, featuring a suitably enhanced OPAC, has its problems. Firstly, the big eye-catching infobox to the right makes no mention of the library, but includes links to Scribd and Kobo, who have paid for such prominence. Secondly, while the OPAC appears at the top of the search results, the blurb in grey text includes boring bibliographical information instead of an eye-catching abstract, or even something like ‘Borrow “Great Expectations” at your local library today!’. Surely I’m not the only one who notices things like this…?

I’m keen to do a lot more research in this area to determine whether the promise of linked data will make library collections discoverable for today’s users and bring people back to libraries. I know I can’t fix the ILS. I can’t re-catalogue every item we have. I can’t even make a script do this for me. For now, research is the most practical thing I can do to help solve this problem. Perhaps one day I’ll be able to do more.

Further reading

Fujikawa, G. (2015). The ILS and Linked Data: a White Paper. Emeryville, CA: Innovative Interfaces. Retrieved from https://www.iii.com/sites/default/files/Linked-Data-White-Paper-August-2015.pdf

Papadakis, I. et al. (2015). Linked Data URIs and Libraries: The Story So Far. D-Lib 21(5-6), May-June 2015. Retrieved from http://dlib.org/dlib/may15/papadakis/05papadakis.html

Schilling, V. (2012). Transforming Library Metadata into Linked Library Data: Introduction and Review of Linked Data for the Library Community, 2003–2011. ALCTS Research Topics in Cataloguing and Classification. Retrieved from http://www.ala.org/alcts/resources/org/cat/research/linked-data