Discovery

A metadata ecology for UK education & research

Embedding the principles: Our approach to guidance materials and workshops (so far)

February 1, 2012

update from Joy Palmer

Over the last couple of months we’ve been meeting regularly to firm up our approaches for the case studies and supporting guidance materials (‘we’ being me, David Kay, Owen Stephens, Jane Stevenson, Adrian Stevenson, Diana Massam, and more recently Ed Bremner from UKOLN). We’re also committed to running a series of workshop events from May through to September, so there’s been lots of discussion of who we’re targeting and what we want to achieve in all of this.  We’ll be saying more soon about the case studies, but in terms of guidance materials and workshops, here’s what we’re thinking so far:

Guidance materials

The Discovery key ‘themes’ are the cornerstone for the case studies and also all guidance materials:

  • Adopting open licensing – guidance
  • Establishing reasonable terms and conditions
  • Using easily understood data models
  • Deploying persistent identifiers/URIs
  • Implementing (and using) APIs
  • Establishing data relationships (interoperability/standards)
  • Optimising content for discovery via the web
  • Ensuring data is sustainable
  • Collecting data to measure use

We have all agreed that these materials are not ‘how to’ guides, but guidance for the libraries, museums and archives sector on how to get started and what questions to ask, together with demystification and explanation of key concepts for an educated but non-technical audience (to boost confidence in decision-making).

We’ll need to get the balance right – not assuming too much knowledge, but at the same time not patronising our audience. The materials will be available in a variety of formats, including text guides, multimedia clips, examples from the case studies, and links to relevant sources (especially those created by UKOLN and CETIS).

As we create content and plan for the workshops we’re thinking about what questions audiences will come with. Here’s a few I prepared earlier – admittedly coming from my own experience:

What’s a persistent URI? How is this different to a URL? What does ‘persistent’ mean? How is this different to a unique URI? A cool URI? Why does it matter if my API is RESTful? Why is it better than Z39.50? My developer/systems provider tells me X API is better, but we’ve been using Y for years – which is right?

Right now, Adrian Stevenson is developing the first sample ‘overview guide’ for us to look at and refine as a standard approach (probably on APIs).  The information for each will include:

Definition and context

e.g. What do we mean by an API? What are some examples? Are all APIs made equal?

e.g. What do we mean by an ‘identifier’?

Common misconceptions/assumptions (evidence from our discussion shows we’ll have our work cut out on this one when it comes to defining an ‘identifier’)

Why does it matter? This will tie back to local and perhaps broader business cases and drivers.

What might this mean for you and your institution? Here we’ll link back to the case studies and examples, and also provide questions for people to use to assess their own contexts and inform ‘next step’ decisions.

Workshop Approaches

We’re thinking of running three workshops twice:

  1. A Licensing Clinic
  2. Making the most of your data on the web
  3. An ‘Un’ developer hands-on development event (Awful title, I know. Bear with me here, and read on – this is where we could use some feedback)

Audiences for the Licensing Clinic and Making the Most of your Data on the Web will be educated but largely non-technical. They are people in their organisations who are responsible for decision-making around collections and online discovery (or for informing decisions).  Their aim is to improve the discoverability of content and collections on the web.

We know individuals will be starting from many different contexts and scenarios (and as much as possible the case studies will capture this range, even while we know we can’t predict every nuance). Many will have technical developer resource available to them, but for many this resource will be very limited, if not non-existent.

Attendees will participate in discussion, be willing to speak in detail about their own context and challenges, ask questions, and will come away with an action plan.

Each workshop will be designed to give attendees stronger background knowledge, help them ask the right questions, and help them complete a plan for tackling the challenges they face.

The Licensing Clinic will work to demystify the differing licensing types, and discuss tactics institutions can take in making licensing decisions. We know, based on the success of the recent OKFN workshop, that there is a real appetite for this type of content, but we hope Discovery’s will be different in its focus on practical action plans – the ‘clinic’ approach.

The Making the Most of Your Data on the Web workshop will work to demystify the key technical concepts for Discovery, demonstrate what they mean in real-world contexts,  build confidence among those who are making decisions in this area, and help them identify courses of action available to them.

The ‘Un’developer event? This workshop idea is less well-formed in our heads – both in terms of audience and objectives. But it’s emerged as an idea after quite a bit of discussion and our collective hunch that there is a community out there of ‘tech savvy web enthusiasts’ who aren’t professional developers/programmers, but who can do a lot for their organisation in terms of implementing change to enhance web discoverability (and already do). They are often responsible for their institution’s website, for example. Their job descriptions don’t say ‘programmer’, but every so often they write some sort of script to make things work; they may have tried mashing data or working with Yahoo Pipes or Google APIs. They might use Codecademy, but they’re perhaps not confident enough to post to stackoverflow.com. We know a good proportion of the institutions we’re addressing through Discovery simply don’t have access to technical programmers – even system support can be a real challenge to secure. But out there are a lot of individuals figuring out the web side of things themselves and getting on with it. We want to figure out if, by supporting this group, we can help embed some of the Discovery technical principles on the ground.

Sidenote: This is not to say we’re not going to be doing anything specifically for developers. Quite the contrary. Other hack events are going to be planned in collaboration with DevCSI later on this year.

—-

So feedback is very welcome — particularly from those who might recognise themselves (or not) in that un-developer group.  Are we onto something here? Let us know your thoughts and questions (comment here, or tweet #ukdiscovery, or email me directly at joy.palmer@manchester.ac.uk)

Tagged: businesscase, devcsi, metadata, opendata, ukdiscovery
Posted by joypalmer


Emerging bibliographic tools and technologies

October 19, 2011

On the 5th October 2011 I attended a workshop on ‘emerging bibliographic tools’ organised by JISC. The idea of the workshop was to bring together a small group of people with experience of a wide variety of tools used to transform, publish, and otherwise manipulate bibliographic data.

The day kicked off (after introductions) with simply capturing the whole range of activity, formats and tools that the attendees thought were relevant to exploiting bibliographic data. The nature of this session made it rather a whistlestop tour of technology and terminology, including:

  • Linked Data and RDF
  • NoSQL and related tools such as CouchDB, MongoDB (document stores) and Redis (a key-value store)
  • Big data (defined as ‘data bigger than you’re used to handling’) and Hadoop/MapReduce
  • Identifiers – the challenges of finding and exploiting appropriate ones such as DOI, ISBN, AuthorClaim and ORCID
  • Automatic metadata creation from full text resources
  • Visualisation tools – from Google Charts to R
  • Ontologies and representations – from MARC to BibJSON to RIS to BibTeX to Bibliographic Ontology to Schema.org
  • ‘Data reconciliation’ tools such as Google Refine and the Stanford Data Wrangler
  • Indexing technologies: Solr/Lucene, SolrMARC, Sphinx
  • Code libraries for MARC: PyMARC, ruby-marc, MARC::Record, MARC4J
  • Spidering/Web crawling technology: CrystalEye, PubCrawler, Nutch
  • … and more

However, there was also time to discuss some aspects in more detail, going beyond just the tech, and starting to talk about the skills required to manipulate bibliographic data, and potential developments that might support those working with data, such as identifier lookups, visualisations, and data transformation services.
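As a rough illustration of the kind of data transformation mentioned here, the sketch below renders a flat record as a BibTeX entry. It is not something presented at the workshop: the function name `to_bibtex`, the choice of fields, and the sample record are all assumptions made for the example.

```python
def to_bibtex(key, record):
    """Render a flat record dict as a BibTeX @book entry.

    `record` is a hypothetical, simplified structure; real data
    (MARC, BibJSON, RIS) would need mapping into it first.
    """
    lines = [f"@book{{{key},"]
    # Only emit fields that are present and non-empty.
    for field in ("author", "title", "publisher", "year"):
        value = record.get(field)
        if value:
            lines.append(f"  {field} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Made-up sample record, for illustration only.
sample = {"author": "Doe, Jane", "title": "An Example Book", "year": "2011"}
print(to_bibtex("doe2011", sample))
```

A real transformation service would also have to deal with character escaping, repeated fields, and the richer structure of MARC records, all of which this sketch ignores.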

After lunch we picked up on these latter points, looking for the opportunities, challenges and gaps that existed. The morning discussion had highlighted the incredible range of relevant technologies, and one of the challenges identified in the afternoon was keeping on top of existing and new initiatives, with mentoring and online community support identified as opportunities.

In the morning a healthcare metaphor was introduced with some discussion of a ‘Data Doctor’ role for organisations – someone with the technical skills, domain knowledge, and data expertise, who would be responsible for ensuring that the organisation’s data was in ‘good health’ (see also ‘data scientist’). In the afternoon, this concept was expanded with the idea of a ‘data health check’ service: somewhere you could load data to identify possible problems and, crucially, be pointed to suggested workflows and resources for improving the data.
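To make the ‘data health check’ idea a little more concrete, here is a minimal sketch of what the simplest possible check might look like. It is an assumption-laden illustration rather than a description of any service discussed on the day: the record structure, the `REQUIRED_FIELDS` list and the function names are hypothetical, and the only real rule used is the standard ISBN-13 check digit calculation.

```python
def is_valid_isbn13(isbn):
    """Return True if the string has 13 digits and a correct ISBN-13 check digit."""
    digits = [int(c) for c in isbn if c.isdigit()]  # ignore hyphens and spaces
    if len(digits) != 13:
        return False
    # ISBN-13 weights alternate 1, 3, 1, 3, ...; the weighted sum must be a multiple of 10.
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


# Hypothetical minimum set of fields a record is expected to carry.
REQUIRED_FIELDS = ("title", "author", "isbn")


def health_check(records):
    """Return a list of (record index, problem description) pairs."""
    problems = []
    for index, record in enumerate(records):
        for field in REQUIRED_FIELDS:
            if not record.get(field):
                problems.append((index, f"missing {field}"))
        isbn = record.get("isbn")
        if isbn and not is_valid_isbn13(isbn):
            problems.append((index, f"invalid ISBN-13: {isbn}"))
    return problems


# Made-up records: the second has no title and a wrong check digit.
records = [
    {"title": "A", "author": "B", "isbn": "978-0-306-40615-7"},
    {"title": "", "author": "B", "isbn": "978-0-306-40615-8"},
]
for index, problem in health_check(records):
    print(f"record {index}: {problem}")
```

A real service would go well beyond this, for example by checking identifiers against external lookups and suggesting fixes, but even a report of this kind gives a non-specialist a starting list of things to repair.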

Perhaps the most crucial issues identified in the afternoon were around skills and sustainability. As we see an increasing need to manipulate data and publish it simultaneously in multiple formats to serve different audiences and needs, we need to find staff with appropriate skills, and ensure managers understand the business case for this work and the skills needed to support it.

At times, the range and scope of the technologies, tools and issues identified by the workshop were overwhelming, as acronyms and jargon flew freely around the room. However, the opportunities opened up by new ways of working with bibliographic (and other) data are exciting, and I strongly believe that we can take advantage of these to produce richer expressions of our data than ever before.

The technologies and tools identified by the workshop will form the basis of a short guide which will be published by the Discovery initiative.

Tagged: marc, metadata, tools, ukdiscovery
Posted by ostephens

