Five things I learned from #VALATechCamp


A few days ago I had the pleasure and privilege of attending the inaugural VALA Tech Camp, a two-day symposium for librarians in tech and technologists in libraries. I learnt a lot and had an excellent time, thanks in large part to the Herculean efforts of the organising committee. Below are a few scattered and not entirely comprehensive thoughts on the event:

Coding is easy! Coding is hard! When the committee asked for suggestions on what to include in the camp, I asked for fairly basic stuff—an intro to Python, for example, for those of us at the n00b end of the spectrum. A 2-hour crash course in Python wound up being the first event on day 1, so I felt more or less obliged to attend. I had previously tried several times to teach myself Python (out of books, on Codecademy, from YouTube videos) but had realised I needed an actual person to teach me the basics.
By the end of the session I had achieved the following:

I was not expecting 56 people to be so supportive of my own personal Wow! signal, so that was super nice. The workshop really did feel like the booster I needed to get me started in Python.
Later in the day (and continuing on day 2) was ‘Hacky Hour’, essentially free time to work on coding projects. I started out doing some web scraping with ParseHub and Beautiful Soup, then got bored and wound up with a Trove API key trying to rewrite Libraries Australia SOLR queries as Trove API queries (with mixed results), then got bored again and started writing a Bash script to extract metadata from a PDF into a CSV or TXT file.
The latter occupied my time and imagination even after I returned to the hotel, culminating in me figuring out how to export metadata from a PDF to a CSV, then to OpenRefine, then to MARC! I was thrilled to have actually achieved something concrete that I could take back to work and actually use. If that was all I got out of VALA tech camp, it would have been worth it.
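The CSV half of that workflow is simpler than it sounds. Here's a minimal sketch in Python; the metadata dicts are made-up stand-ins for whatever a tool like pdfinfo or exiftool would actually return from a PDF's Info dictionary:

```python
import csv
import io

# Hypothetical metadata records, as they might come out of a PDF's
# Info dictionary via pdfinfo or exiftool. Field names and values
# here are illustrative, not taken from any particular tool.
records = [
    {"Title": "Annual Report 2016", "Author": "Example Library",
     "CreationDate": "2016-06-30"},
    {"Title": "Strategic Plan", "Author": "Example Library",
     "CreationDate": "2015-01-12"},
]

def metadata_to_csv(rows):
    """Write a list of metadata dicts out as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["Title", "Author", "CreationDate"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

print(metadata_to_csv(records))
```

From there, the CSV can go straight into OpenRefine for cleanup before being crosswalked to MARC.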

There’s a huge gap between what tech can do and what people think tech can do. Ingrid Mason spoke at length about the gap in not just digital literacy, but digital infrastructure literacy. You might know how to use wifi, but would you know how to fix your wifi if it broke? (I know I wouldn’t, and I’m more tech literate than the average person.)
There’s also the problem that extremely clever people are constantly creating new ways to do things and new ways to solve problems, including library problems. How much of that knowledge actually trickles down to those of us at the coalface? It’s something I’m keen to explore and maybe, hopefully, change.

I was surprised by how much I already knew. One of my problems in tech is that I know I have a very uneven skillset. I am a total Python n00b, yet I can cobble together a Bash script. I’m totally across LOD and RDF triples, but didn’t know how SPARQL worked (until I attended the SPARQL talk!). I understood the mechanics of web scraping, but not how to properly harness web scraping tools. Even at the talks where I came armed with a little background knowledge (UX, APIs, the importance of good documentation), I left feeling twice as knowledgeable, which is an excellent outcome.
I particularly enjoyed the SPARQL talk because it explained linked data concepts in a way ordinary people could understand. The presenters’ use of Wikidata as an example SPARQL interface was an inspired choice: it made an otherwise arcane and distant concept concrete and accessible to a lay user.
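To give a flavour of why Wikidata works so well as a teaching interface, this is roughly what a query against it looks like. The identifiers are my best recollection of the real Wikidata IDs (P50 for ‘author’, Q5686 for Charles Dickens), so double-check them before running:

```sparql
# Find works authored by Charles Dickens, with English labels.
# (Identifiers are from memory and worth verifying.)
SELECT ?work ?workLabel WHERE {
  ?work wdt:P50 wd:Q5686 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
```

Paste something like that into the Wikidata Query Service and you get back a plain, readable table, which is exactly why it makes triples feel tangible.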

Tech people are less intimidating than I thought. The attendee profile of VALA Tech Camp certainly skewed older, maler and more experienced than NLS8, which at first was a bit scary for this young, female n00b, but this is precisely why I went in the first place: to learn, and to find out what others are doing. I struck up some great convos with attendees of all genders doing excellent things. I wound up on an all-ladies table for the first Hacky Hour, the ‘Number 1 Ladies Solving Each Other’s Data Collection Problems’ table (moniker by me). In each situation people were only too happy to help and to chat.
Interestingly, I realised that in order for me to do better in tech, I would probably feel more comfortable in a women-only environment, like PyLadies or RubyGirls or something. I’ll look into local chapters and see if I could contribute. Seeing other women do super well in library tech was really empowering and wonderful, and I’d love to see more of it.

You can do the thing! 👍 Several short talks focussed on getting out there and just making stuff happen, including Justine on podcasting in libraries and Athina on running a cryptoparty in a public library. It was really inspiring to hear of people taking initiative and making excellent things happen.
On a much smaller scale, I found myself much more able to get out there and do things I find really difficult. Yes, I can go and make small talk to people! Yes, I can summon the courage to thank people for writing things that have meant a lot to me! Yes, I can do the thing! Yes I can.

Yes I can.

Sick of hearing about linked data? You’re not alone

‘This looks a little bit complicated’ … you don’t say… #lodlam #lasum2016 @lissertations 8 Dec 2016

I’m not attending ALIA Information Online this year, largely because the program was broadly similar to NDFNZ (which I attended last year) and I couldn’t justify the time off work. Instead I’m trying to tune into #online17 on Twitter, in between dealing with mountains of work and various personal crises.

As usual, there’s a lot of talk about linked data. Pithy pronouncements on linked data. Snazzy slides on linked data. Trumpeting tweets about linked data.

You know what?

I’m sick of hearing about linked data. I’m sick of talking about linked data. I’m fed up to the back teeth with linked data plans, proposals, theories, suggestions, exhortations, the lot. I’ve had it. I’ve had enough.

What will it take to make linked data actually happen?

Well, for one thing, ‘linked data’ could mean all sorts of things. Bibframe, that much-vaunted replacement for everyone’s favourite 1960s data structure MARC, is surely years away. RDF and its query language SPARQL are here right now, but the learning curve is steep and interoperability with legacy library data and systems is poor. Whatever OCLC is working on has the potential to monopolise and commercialise the entire project. If people use ‘linked data’ to mean ‘indexed by Google’, well, there’s already a term for that. It’s called SEO, or ‘search engine optimisation’, and marketing types are quite good at it. (I have written on this topic before, for those interested.)

Furthermore, linked data is impossible to implement on an individual level. Making linked data happen in a given library service, including—

  • modifying one’s ILS to play nicely with linked data
  • training your cataloguing and metadata staff (should you have any) on writing linked data
  • ensuring your vendors are willing to provide linked data
  • teaching your floor staff about interpreting linked data
  • convincing your bureaucracy to pay for linked data and
  • educating the public on what the hell linked data is

—requires the involvement of dozens of people and is far above my pay grade. Most of those people can be relied upon to care very little, or not at all, about metadata of any kind. Without rigorous description and metadata standards, not to mention work on vocabularies and authority control, our linked data won’t be worth a square inch of screen real estate. The renewed focus on library customer service relies on staff knowing what materials and services their library offers. This is impossible without good metadata, which in turn is impossible without good staff. I can’t do it alone, and I shouldn’t have to.

Here, the library data ecosystem is so tightly wrapped around the MARC structure that I don’t know if any one entity will ever break free. Libraries demand MARC records because their software requires it. Their software requires MARC records because vendors wrote it that way. Vendors wrote the software that way because libraries demand it. It’s a vicious cycle, and one that vendors currently have little incentive to break.

I was overjoyed to hear recently of the Oslo Public Library’s decision a few years ago to ditch MARC completely and catalogue in RDF using the Koha open-source ILS. They decided there was no virtue in waiting for a standard that may never come, and decided to Make Linked Data Happen on their own. The level of resultant original cataloguing is quite high, but tools like MARC2RDF might ameliorate that to an extent. Somehow, I can’t see my workplace making a similar decision. It’d be awesome if we did, though.

I don’t yet know what will make linked data happen for the rest of us. I feel like we’ve spent years convincing traditionally-minded librarians of the virtues of linked data with precious little to show for it. We’re having the same conversations over and over. Making the same pronouncements. The same slides. The same tweets. All for something that our users will hopefully never notice. Because if we do our jobs right and somehow pull off the biggest advancement in library description since the invention of MARC, our users will have no reason to notice—discovery of library resources will be intuitive at last.

Now that would be something worth talking about.

Linked data: the saviour of libraries in the internet age?

Another day, another depressing article about the future of libraries in the UK. I felt myself becoming predictably frustrated by the usual ‘libraries are glorified waiting rooms for the unemployed’ and ‘everything’s on the internet anyway’ comments.

I also found myself trying to come up with ways to do something about it. Don’t get me wrong, I like a good whinge as much as the next man, but whinging only sustains me for so long. Where possible I like to find practical solutions to life’s problems. The issue of mass library closures in the UK might seem too much for one librarian to solve—especially a student librarian on the other side of the world with absolutely no influence in UK politics. But I won’t let that put me off.

Consider the following: Google is our first port of call in any modern information search, right? When we want to know something, we google it. That’s fine. Who determines what appears in search results? Google’s super-secret Algorithm, harnessing an army of spiders to index most corners of the Web. How do web admins try and get their sites to appear higher in search results? Either the dark art of search engine optimisation (SEO), which is essentially a game of cat-and-mouse with the Algorithm, or the fine art of boutique metadata, which is embedded in a Web page’s <meta> tags and used to lure spiders.
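For illustration, the second approach involves markup along these lines; the tags are standard HTML, but the content values here are made up:

```html
<!-- Hypothetical catalogue record page: metadata embedded for crawlers -->
<meta name="description"
      content="Borrow Great Expectations by Charles Dickens at your local library.">
<meta name="robots" content="index, follow">
```

A crawler that reads those tags has something human-friendly to show in its results blurb, instead of whatever raw bibliographic data it can scrape off the page.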

Despite falling patronage and the ubiquity of online information retrieval, libraries are absolutely rubbish at SEO. When people google book or magazine titles (to give but one example), libraries’ OPACs aren’t appearing in search results. People looking for recreational reading material are libraries’ target audience, and yet we’re essentially invisible to them.

Even if I accept the premise that ‘everything’s on the internet’ (hint: no), how do people think content ends up on the internet in the first place? People put things online. Librarians could put things online if their systems supported them. Librarians could quite easily feed the internet and reclaim their long-lost status as information providers in a literal sense.

The ancient ILS used by my workplace is an aggravating example of this lack of support. If our ILS were a person it would be a thirteen-year-old high schooler, skulking around the YA section and hoping nobody notices it’s not doing much work. Our OPAC, for reasons I really don’t understand, has a robots.txt warding off Google and other web crawlers. The Web doesn’t notice it and patrons don’t either. It doesn’t help that MARC is an inherently web-unfriendly metadata standard; Google doesn’t know or care what a 650 field is, and it’s not about to start learning.
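For anyone wondering what ‘warding off’ crawlers actually takes, a blanket robots.txt needs only two lines to tell every bot to index nothing:

```
User-agent: *
Disallow: /
```

That is the entire mechanism keeping a catalogue out of Google.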

(Screenshot obscures the name of my workplace in the interests of self-preservation)

Down with this sort of thing.

Perhaps in recognition of this problem, vendor products such as SirsiDynix’s Bluecloud Visibility promise to convert MARC records to linked data in Bibframe and make a library’s OPAC more appealing to web crawlers. I have no idea if this actually works or not (though I’m dying to find out). For time-poor librarians and cash-strapped consortia, an off-the-shelf solution would have numerous benefits.

But even the Google screenshot included in the article, featuring a suitably enhanced OPAC, has its problems. Firstly, the big eye-catching infobox to the right makes no mention of the library, but includes links to Scribd and Kobo, who have paid for such prominence. Secondly, while the OPAC appears at the top of the search results, the blurb in grey text includes boring bibliographical information instead of an eye-catching abstract, or even something like ‘Borrow “Great Expectations” at your local library today!’. Surely I’m not the only one who notices things like this…?

I’m keen to do a lot more research in this area to determine whether the promise of linked data will make library collections discoverable for today’s users and bring people back to libraries. I know I can’t fix the ILS. I can’t re-catalogue every item we have. I can’t even make a script do this for me. For now, research is the most practical thing I can do to help solve this problem. Perhaps one day I’ll be able to do more.

Further reading

Fujikawa, G. (2015). The ILS and Linked Data: a White Paper. Emeryville, CA: Innovative Interfaces.

Papadakis, I., et al. (2015). Linked Data URIs and Libraries: The Story So Far. D-Lib Magazine, 21(5–6).

Schilling, V. (2012). Transforming Library Metadata into Linked Library Data: Introduction and Review of Linked Data for the Library Community, 2003–2011. ALCTS Research Topics in Cataloguing and Classification.