w/c 27 Feb 2012 – Discovery News Roundup

March 4, 2012

Here’s my round up of news from the world of Discovery and beyond over the past few weeks. As with previous posts, many of the items were gleaned from the #ukdiscovery twitter hashtag which you can dip into whenever you like by opening up this FiveFilters ‘newspaper’ pdf [update: URL fixed].

Last week the Discovery team published Issue 6 of the Discovery Newsletter which included the following articles among others:

  • an article on how the Copac Collections Management Tool project is aiming to help collections managers.
  • an introduction to ‘Will’s World’ – one of the JISC-funded large-scale exemplar projects.
  • an invitation for supply chain organisations such as system vendors and publishers to engage with the Discovery initiative.

If you’d like to receive future newsletters by email you simply need to drop us a line at rdtf-discovery@sero.co.uk and you’ll be added to the distribution list.

It was interesting to read Harvard’s announcement of the changes they will be undergoing in order to unify their 73 (!) libraries. Much of the announcement concentrated on structural changes but this sentence caught my eye and it seems to suggest that some game changing LIS developments could be in the offing: “The changes will position the Library to lead in scholarly communication and open access, to design next generation search and discovery services, and to accelerate digitization and digital preservation.

Of course Harvard’s Library Lab team are already involved in designing next generation search and discovery services as part of the Digital Public Library of America (DPLA) Beta Sprint initiative – the scale of the data they’re dealing with is pretty impressive but it was the live demo of their “pre-alpha” ShelfLife/LibraryCloud system that took my breath away and got me thinking about new possibilities for discovery interfaces.

When I first read this short blogpost from the Louie B. Nunn Center for Oral History, University of Kentucky I initially dismissed it as not quite newsworthy enough to include in this digest … but I kept thinking about the story after I had clicked away from it.  It seems to me that the ‘Oral History Metadata Synchronizer’ (OHMS) tool that they’ve developed with their digital library division has huge potential for improving the visibility of audio collections and connecting them to other relevant resources. The story of how the Nunn Center have used OHMS to preserve and share interviews with survivors of the Haiti earthquake is a moving reminder that metadata is (at the risk of getting poetic and misty eyed) more than sterile information, and the discovery it enables is human as much as it is digital.

Staying on the subject of audio collections, the Music Library Association is currently working on a final version of their Music Discovery Requirements document and they are currently inviting thoughts and suggestions. This presentation by Nara Newcomer provides useful background on the aim of the Music Discovery Requirements document.

The Discovery programme is particularly focused on the business case for adopting open metadata so it was interesting to read this white paper from Nielsen which reports on the effect of supplying (or not supplying) metadata within the book industry. One of the key conclusions reads: “Overall we see clear indications that supplying a set of full enhanced metadata for product records helps to maximise sales, and that this relationship between enhanced metadata and sales is even stronger for the online retail sector.” Of course UK university libraries are not in the business of book retail and this report could simply serve to make publishers more commercially protective over the metadata they create but all the same it is good to have some high profile research published in this area. It’s a pity that they don’t separate out enhanced metadata from the provision of a cover images in their analysis – from research I’ve been involved in previously I suspect there might be some interesting findings that remain hidden by the approach they’ve taken.

Europeana have published data for 2.4 million items under an open metadata licence as part of its Linked Open Data pilot. The data is provided by eight national libraries and a number of cultural heritage organisations (including some from the UK) and there’s also a convincing animation on the ‘what and why’ of linked data which, pleasingly, keeps the end user at the forefront of the discussion. Europeana also launched the ‘European Library Standards Handbook’ which is their guide for libraries who are providing content to data aggregators – it includes a legal overview as well as a technical guide. If you are interested in linked open data then you might want to follow the University of Bristol’s ‘Bricolage’ project which is JISC-funded and will be publishing catalogue metadata from their Penguin Archive and Geology Museum collections.

Earlier this week I found myself having one of those ‘am I the only person not at this event?’ moments as my Twitterstream gradually filled up with all manner of interesting and diverting tweets from the OCLC EMEA Regional Council Annual Meeting.  Owen Stephens captured some of the knowledge that was shared around the topic of APIs in his blogposts written on the day. One of the sessions that seemed to be particularly well received was Alison Cullingford’s presentation on recent survey findings from the RLUK Unique and Distinct Collections project so it will be interesting to read the report when it is published. The meeting also brought news that an open data commons licence is being considered for WorldCat:

WorldCat: open data commons licence is being considered and will be discussed with OCLC membership through Global Council #EMEARC

— Simon Bains (@simonjbains) February 29, 2012

I will not pretend to be an expert but these guides that the Archives Hub have added to their website look very useful for anyone who is interested in accessing Archives Hub data using SRU and OAI-PMH interfaces.

I’ll finish up by sharing some interesting news in the wider world of open data and metadata:

  • The JISC Managing Research Data Programme is doing some heavy lifting in terms of building a registry of metadata standards  (for UK university research datasets) – I’m sure they would be pleased to hear from you if you have any insights you’d like to share with them.
  • The Government’s call for input to their consultation on “open standards for software interoperability, data and document formats” is ongoing and it doesn’t close until 3 May so there’s plenty of time left to think about what the direct and indirect supply chain ripples might be.
  • In my last news digest I mentioned that ‘big data’ suddenly seemed to be everywhere – This week Nick Edouard’s reflective post over on the BuzzData blog struck a chord with me, particularly his point that “Open-data initiatives are good for many reasons, not least because they can radically improve internal data-sharing.” Often the discussion around open data tends towards a leap of faith/altruistic model but keeping focused on the ‘what’s in it for us?’ question seems a surer way of securing the internal resources needed to release data in the first place.

In closing, a couple of blogposts I’ve read recently have got me thinking about the importance of identifying a vision that other people can quickly understand and get behind:

I think that the Discovery vision packs a similar punch but perhaps it could be more emotive?: “[Our vision] is about making resources more discoverable both by people and machines.” Is that a vision which speaks to you? Have you found the words to succinctly describe your institution’s vision for resource discovery? Please do share your thoughts in the comments below.


w/c 6 Feb 2012 – Discovery News Round-up

February 9, 2012

Here’s my round up of news from the world of Discovery and beyond over the past few weeks. Many of the items were gleaned from the #ukdiscovery twitter hashtag which you can dip into whenever you like by opening up this FiveFilters ‘newspaper’ pdf that I generated.

Last week Joy Palmer shared plans for the next phase of guidance materials and workshops here on the Discovery blog and is looking for your feedback on the outlined approach so please do wade in and let us know what you think. And bonus points for anyone who can suggest a better title for the event than ‘Un’developer hands-on development event. The best I can come up with is ‘Can’t Code, Won’t Code’ so the field is wide open.

The National Information Standards Organization (NISO) are currently inviting public comment on the working group recommendations that have come out of the joint NISO and NFAIS (the National Federation of Advanced Information Services) project to develop Recommended Practice on Online Supplemental Journal Article Materials. The main aim of the project is to improve the ‘discoverability and findability’ of journal supplemental materials for librarians and would-be readers by establishing and maintaining links to the related article. The comment period runs until 29th February and, although the recommendations are aimed mainly at publishers, they are also interested in feedback from the wider scholarly community. [via @simonhodson99]

One of the key NISO/NFAIS recommendations is around consistency and, interestingly, this was also one of the key discussion points raised during recent focus groups run by the JISC/AHRC-funded Open Access e-Books research project (OAPEN-UK). So far the project have heard from humanities and social sciences (HSS) monograph publishers, authors/readers and institutional representatives and next week they are running focus groups for research funders, e-book aggregators and learned societies. Incidentally, if you are interested in taking part in one of those focus groups then further details can be found on their Events page. [via @publishersrcly]

A couple of weeks ago it seemed to be ‘Big Data’ week on my twitter stream – all and sundry were tweeting about it and it wasn’t just the data geeks any more. It certainly seemed to suggest, as reported in this Museum Geek post, that “the era of Big Data has begun” but it struck me that the conversation around big data seems to be moving on from mostly logistical or functional discussions about gathering, storing, sharing and making use of data to a realisation that generating and circulating more data doesn’t solve anything on its own (see GigaOm’s article which likens it to virtual landfill via @paulmiller). In the world of building websites there’s a saying that ‘content is king’ but in the world of data it would appear that ‘content + context = king and queen’. Which had me pondering whether the Discovery initiative could usefully consider establishing Open Paradata Guidelines to sit alongside our Open Metadata Principles. And coming from a humanities background myself I found Michael Kramer’s assertion that “data is always already meta-data” an interesting point to mull over.

The Data Catalogs website, which was launched last summer, aims to be “the most comprehensive list of open data catalogs in the world”. I’m sure it’s relatively early days yet but there are already 212 catalogues listed and the list of experts involved in the website is impressive. It looks like it will grow into a useful centralised resource, particularly if a more advanced search is added, but I noticed that not all of the entries state what their metadata license is – it seems to me that there’s an opportunity to improve consistency and clarity by making that a mandatory field. What did impress/surprise me though is that any visitor to the website can improve a record simply by clicking on the ‘Please help improve this page by adding more information’ link at the bottom of the record and editing the fields that appear [via @rufuspollock]. If you are interested in the issues around licensing open data then Naomi Korn and Professor Charles Oppenheim’s practical guide is worth a read.

And finally, a few items of interest from the wider world of Discovery:

  • This article about book mashups on the Programmable Web ‘API News’ blog got me thinking about countless possibilities for making library and museum and gallery collections more visible and connected in new ways. Then this morning someone tweeted about the strangely hypnotic Flight Radar website and I wondered if one day I might find myself gazing at a map that shows books flying overhead as they wend their way from place to place as inter-library loans.
  • March is looking set to be Culture Hack Month, with events taking place on both sides of the Pennines. Hack for Culture takes place on the 3rd and 4th March in Liverpool and is bringing interested parties together “to explore the possibilities offered by joint experimentation with a wide variety of hidden cultural data sets”.  The 24 hour-long CultureCode Hack takes place towards the end of March in Newcastle and will give cultural and arts organisations with open data the opportunity to work with developers and designers to create something new. You can take a peek at the hacks that were developed the Culture Hack North event in Leeds last year to get an idea of what can be produced in such a short amount of time.

w/c 16 Jan 2012 – Discovery News Round-up

January 16, 2012

This is the first of my regular round-up of what’s happening in the world of resource discovery. Twice a month I’ll be sharing what I’ve found during my internet travels and also highlighting things that have caught my eye under the #UKDiscovery Twitter hashtag. You can also see the latest tweets from that hashtag compiled into an eye-pleasing PDF format (created via the FiveFilters PDF newspaper maker).

Firstly, I want to share the output of the JISC Activity Data Synthesis project which I was involved with last year. The project website was published in November 2011, which already seems like a lifetime ago, but hopefully the collective wisdom gathered together there will be useful for some time to come.

JISC Activity Data website screenshot

The JISC Activity Data programme was a collection of nine projects which, although not directly part of the Discovery initiative, covered some relevant terrain – particularly around issues of licensing, metadata and open data. Other strong themes that emerged during the course of the programme were ‘big data’ (particularly so for the Exposing VLE Activity Data project), data storage and data visualisation. If you’re interested in getting to grips with data visualisation then the online talk that Tony Hirst kindly did for us as part of our virtual exchange sessions is well worth a watch. Five of the projects were focused on library activity data so they are worth exploring if that’s the domain you’re involved with: AEIOU, LIDP, RISE, SALT and OpenURL.

Now onto the highlights of things I’ve come across over the past few weeks: