Some preliminary highlights from the Discovery programme

November 15, 2012

The Discovery programme is nearing its completion date of December 2012. Most of the projects have finished or are wrapping up. Our efforts are now directed towards gathering together all that we have learned and produced in the programme.

The programme has covered a lot of ground, so pulling everything together will take us some time. While that happens, I thought it might be worth listing a selection of preliminary highlights of the programme. This post is based on a talk I gave at the RLUK conference, so the focus is on libraries and archives rather than museums.

Future approaches to Discovery

It is not clear what the future is for resource discovery. It is unlikely that there will be just one approach to resource discovery for libraries, museums and archives. The future is likely to be plural. While Discovery has not developed firm answers on what the future is, we have experimented with a range of approaches and have identified those that are promising.

These approaches are recorded in the Discovery case studies and guidance site. They can be used to inform future plans in libraries, museums and archives. If the approaches seem promising enough, they can be emulated, or the tools that have been developed can be used. We are planning to produce a toolkit so that all these tools are in one place.

What is clear is that we are not alone in experimenting with these kinds of approaches. This is a global movement with many and diverse institutions exploring similar approaches. The case studies and guidance recognise this by including explorations of the approaches of the Wellcome Trust, the Rijksmuseum and the Victoria and Albert Museum.

Innovative cataloguing

Resource discovery starts with cataloguing. The focus of the programme was not on cataloguing, but a couple of interesting innovative approaches have emerged from the programme.

The Institute of Education decided to explore new ways of cataloguing their collection. This involves creating basic records in Drupal, enriching these records with professional cataloguer input, then exporting them into the LMS. This may sound a roundabout way of doing things as I have written it, but it was 3.5 times quicker, and therefore cheaper, than the current approaches. It allows the cataloguer to concentrate on enriching the record by adding index terms. Full figures are available on their blog. They also developed lightweight ways to catalogue uncatalogued material, which offer a significant saving in researcher time when using the material. More detail on this is on the blog.

The second exploration of catalogues focused on the collection as a whole. The Copac collections management project used the Copac data to create a tool that allows librarians to analyse their collections and decide which items can be removed from the collection and which are rare and need to be retained. This tool has been trialled by a number of libraries. During their trial, the University of Manchester found that the tool was 86% more effective than manual checking of the collection. Details on how this figure was arrived at can be found in the case study.

Greater impact through linking

Linking items in collections with relevant items in other collections offers the possibility of enabling richer resource discovery services and supports new and emerging research interests. Linked data is an intriguing option for enabling this. I don’t think the Discovery programme has come up with a definitive answer on whether linked data is the future for libraries, museums and archives. But I think that the evidence is fairly strong that it will be a part of the future.

The programme included a number of projects experimenting with linked data for libraries and archives, and there is work to be done to gather all of these together. However, there are some headlines that we can report now:

  • The use case in archives seems to be strong, as linking resources by place and person is something that should be useful to researchers and students.
  • The Step Change project worked with Axiell to update CALM so that archives can create linked data records from within CALM. This functionality will be included in the next update and has the potential to benefit the large number of archives that use CALM. This linked data creation functionality is also available as a standalone tool called Alicat.
  • Cambridge were able to create linked data records for 2.3 million books for their project which cost just under £40,000.
  • The Archives Hub project Linking Lives has worked to use people as hooks to explore archive collections. This uses linked data, and the model they have developed is being reused internationally.
  • The Pelagios project has created a way to use linked data to identify ancient places in archive collections and there is a vibrant community growing around their approach.

Of course, the Discovery programme is not alone in investigating linked data. The Library of Congress, OCLC, the British Library, Europeana and the DPLA are all using or investigating some form of linked data technology in pursuing their aims.

Linked data is not the only option for bringing different collections together and allowing people to use them in new ways. This can also be done with APIs, and there are two Discovery exemplar projects doing just this for Shakespeare and for WW1. Work on these is still underway, but both are looking promising and offer some very interesting lessons for how to aggregate collections to enable new forms of resource discovery and research.

Enhanced shared services

We already have many shared services that help people discover those resources. Throughout the programme we have worked with those services to enhance them to help realise the resource discovery taskforce vision. It is worth a separate post on all of the ways the services have been developed so, for now, I will just list the services that have been developed in the programme:

Business case

These are challenging economic times, so it was important to address the business case for libraries, museums and archives to invest effort in improving resource discovery. The results of this work can be seen in the business case section of the Discovery guidance. We worked with senior managers from libraries, museums and archives throughout the programme to ensure that what we were doing addressed their needs. As part of this work we produced a series of videos in which a selection of senior managers talk about their needs, challenges and predictions, and they make for interesting viewing.

What’s next?

We are in the process of reviewing the Discovery programme and the resource discovery taskforce vision that kicked it all off. This review will produce a set of recommendations on what we should do next. These will be available in January. We will be looking to pull all of the outputs from the programme into a form that makes it easy for people to learn from the programme and to use what has been produced. We are also in the process of putting together an event for 2013 that brings together people from around the world who are working on resource discovery challenges, to see what we can learn from each other. More information on all of these things will follow soon.


New Discovery open metadata projects

February 3, 2012

Five new Discovery projects started this week. They are all focused on the creation and release of open metadata from libraries, museums and archives in line with the Discovery open metadata and technical principles.

The projects are:

  • Bricolage – will publish catalogue metadata as Linked Open Data for two of its most significant collections: the Penguin Archive, a comprehensive collection of the publisher’s papers and books; and the Geology Museum, a 100,000 specimen collection housing many unique and irreplaceable resources. University of Bristol
  • Open Education Metadata UK – will publish metadata sourced from four significant UK education collections as Open Data in a variety of formats, for anyone to reuse as linked data in their own applications. In addition, subsets of two collections which have high latent potential for linked data will be catalogued. Institute of Education
  • Open Book – will release open metadata for the Fitzwilliam’s Designated Collection (over 150,000 records) and linked open data for the internationally important collection of illuminated manuscripts in the Fitzwilliam Museum (approximately 500 manuscript records). The Fitzwilliam Museum, University of Cambridge
  • Music Collections at Cardiff University: Advancing the Resource – focuses on a collection of manuscript and printed music from the eighteenth and nineteenth centuries, a resource of nearly 3000 items largely unknown to the wider scholarly community. This project will catalogue the material online, and make the data available through the Archives Hub and COPAC, as well as RISM (UK) (Répertoire International des Sources Musicales). Cardiff University
  • Trenches to Triples – will provide Linked Data markup to 200 collection level descriptions and 6,000 item level catalogue entries relating to the First World War from the Liddell Hart Centre for Military Archives and will also provide a demonstrator for using Linked Data to make appropriate connections between image databases, Serving Soldier, and detailed catalogues. King’s College London

The projects are just getting started, but all will have blogs that record their progress. Look out for further information on the projects via the Discovery site. All of the learning and outputs from these projects will be summarised on the Discovery website to ensure that others can benefit from what the projects learn and produce.

I have written an overview of all the current Discovery work on the JISC website.


Discovery exemplars

January 13, 2012

The Discovery programme is, in many ways, a slippery beast. It is not building one specific thing; rather, it is advocating a range of approaches that, if taken by libraries, museums and archives, should lead to better resource discovery services. This can make it difficult to explain. This is compounded by the fact that we are learning as we go, so messages are starting simple and high level and getting gradually richer and more granular as we learn more. Despite this, persuading people to adopt new approaches to licensing, technology and institutional processes is the key to achieving the aims of the Discovery programme. To help cope with this contradiction we resolved to build two exemplar services that show what is possible if the Discovery principles are adopted by collection owners and service builders.

We have now funded these projects and the work is starting to get underway.

EDINA are building Shakespeare’s Registry, an aggregation of online sources of digital resources relating to William Shakespeare, covering performance, interpretative and contextual resources, in order to demonstrate the value and principles of metadata aggregation as part of the JISC/RLUK Discovery initiative.

Mimas are working with King’s College London to develop an API to enable people to explore content about World War One. They will work with other partners to develop two innovative interfaces built on top of the API. More detail on the background and intentions of this exemplar can be found on the dedicated blog.

The projects will comply with the principles laid out in the Discovery open metadata and technical principles. As well as developing useful resources, they will learn valuable lessons about how best to go about building resources that comply with the principles. Both aggregations will use APIs to aggregate the content and will do so via open interfaces rather than negotiating special access to the content. Both projects will focus on encouraging others to build on top of the APIs they develop rather than focusing on their own vision for an interface. This also means that both projects need to take an open approach to the metadata they aggregate and adopt suitable Creative Commons or Open Data Commons licences (pdf).

So, both projects have a lot on their plate and have challenging timescales. Both are scheduled to deliver the exemplar by July 2012. They both have the potential to be rich and interesting resources and will definitely learn useful lessons. We will update you on their progress via this blog.


Phase 2 of the Discovery programme and some new projects

November 22, 2011

The Discovery programme runs until the end of 2012. To enable us to learn as we go, we have split the programme into three distinct phases. Phase 1 is complete and we have moved into phase 2 of the Discovery programme. Phase 1 focused on open metadata and is summarised on the JISC website. You will be able to read summaries of the lessons from phase 1 on the Discovery site later this week.

Phase 2 is described on the JISC website. In this phase we will continue the efforts to understand the best approaches to open metadata and to produce advice, guidance and advocacy on the productive approaches we uncover. However, it also includes projects designed to reuse open metadata to address problems and meet use cases for libraries, museums and archives and their users.

There are 9 of these projects:

Contextual wrappers 2 – will develop resource discovery services for people interested in museum resources by using open, integrated and contextualised collections information from across University Museums. Fitzwilliam Museum, University of Cambridge.
Problem or use case it is addressing: Enriching museum metadata to support new forms of discovery

Copac collection management – developing a tool that libraries can use to analyse their collections in comparison to other university libraries by using the data stored in Copac. Mimas
Problem or use case it is addressing: Enabling libraries to analyse their collections and evaluate them against other libraries to make effective judgements on what to keep and what to discard.

Digital Bodleian – will provide linked data, APIs and a new user interface for the Bodleian’s substantial collection of digital assets. University of Oxford
Problem or use case it is addressing: Providing coherent discovery for varied collections and enabling and promoting reuse of the metadata about those collections

DiscoverEDINA – a three-pronged project that is developing software to extract metadata from, and embed new metadata in, multimedia content; developing a crowdsourcing tool for enriching the metadata about the content in JISC MediaHub; and enhancing the open linked data available from SUNCAT. EDINA.
Problem or use case it is addressing: Enriching metadata and enabling and promoting reuse

Pelagios 2 – developing a toolkit for Ancient World resources so that collection providers can expose their metadata and find and visualise geospatial connections between open collections of Ancient World resources. Open University.
Problem or use case it is addressing: Making it easy to make open metadata available, visualising it and enabling and promoting reuse.

Search25 – will provide a new open resource discovery experience at the M25 consortium regional level, offering researchers the advantages of both local/regional and national resource discovery.
Problem or use case it is addressing: Providing coherent discovery for varied collections and enabling and promoting reuse of the metadata about those collections

Servicecore – will develop the core repository tool to enable users to find papers that are on similar topics in the UK’s repositories.
Problem or use case it is addressing: Enriching metadata about articles in repositories and enabling and promoting reuse of that metadata.

Step Change – will embed functionality to create linked data into the cataloguing interface of the CALM archive software, will enhance the UK archival thesaurus so it can be used as a semantic tagging tool and will work with Historypin to enable the geographical exploration of archive content.
Problem or use case it is addressing: Enriching metadata about archives contents, visualising it and enabling and promoting reuse of that metadata.

The cutting edge – will create a new online resource to support teaching and research into tools with sharp edges. It will do this by bringing together metadata from several important collections housed within the Great North Museum.
Problem or use case it is addressing: Providing coherent discovery and exploration for varied collections and enabling and promoting reuse of the metadata about those collections

These projects all started recently and will run until July 2012. They are a varied and exciting series of projects and I am looking forward to seeing what they produce and learn.