My mad idea: OpenTag

Supporting the ecosystem of open content through a shared tag tool

This is an idea that has been buzzing about in the back of my head, based on many, many conversations with clever colleagues who will, no doubt, see the flaw in my plan. But here it is anyway. This will make little sense to people outside of educational technology.

We have a problem to solve:

In order to share educational resources more effectively, providers and users need a richer exchange of pedagogical and contextual metadata.

As I understand it, most repositories deal with Dublin Core metadata, some with variants on the richer IEEE LOM, such as UK LOM. The fields exist in the software. But just because a field exists doesn’t mean it has any information in it, or that the information is meaningful, or that the information is consistent with other repositories. The logic of what metadata should be shared is mostly agreed, but the extent to which each field is filled in, and with what vocabularies, is where the decisions are.

The question is about the consistency of the information collected about the resource, and how far other services can rely on there being that information in the metadata record.

For open content for education (OER if you want to call it that), I suggest there are several dimensions. Catherine Bruen from NDLR and I sketched this out at the Oriole Retreat:


This diagram is DELIBERATELY not a definition of openness. It is a representation of the dimensions against which we judge “open”. I deliberately include cost and access because, as some readers might know, I wouldn’t rule out an ecosystem of resources free at the point of use, via OAuth, so that providers can have tracking information. Or a ubiquitous micropayments system that keeps costs down by providing usage data back to providers. It’s not that these things are desirable compared to pure open. It’s that they might happen. I’m interested in what might happen as well as what we would like to happen. Reality is usually a compromise between vision and constraints.

I am thinking that the ideal is on the far edges, so that a very open OER has a larger image than a partly open one. So a resource might score highly on one dimension and low on another. We’ll come back to that in a minute.

Where is the information to map the resource against these dimensions? Well, the metadata is sometimes available from the depositor, though if the requirements are too onerous, the benefit of deposit vs the effort required can tip towards not depositing. Various solutions are being explored. For example, mandatory deposit (removing the element of choice), though often that reduces the metadata requirements on the depositor. Being able to incorporate metadata post-deposit helps, whether from the originator, from cataloguers and creators, or from re-users. The Learning Registry aims to derive metadata from multiple systems, aggregate it, and feed it back to the resource. SWORD makes deposit easier.

There is, then, a question of what metadata is really required to meet the use cases (as explored in October 2010) in a way that strikes a balance between depositor effort and user benefit. (It’s that “reality is usually a compromise between vision and constraints” point again.)

Services that do collect and display rich metadata often find that the richness gets lost on export. Aggregators and other sorts of third-party services have not been able to rely on the metadata being there, so they haven’t built any functionality around it. It makes perfect sense: you don’t want a set of filters or icons that remain blank for most of the content in your service. You need, maybe, 50% completion for it to be useful to users rather than annoying.
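To make that "maybe 50%" idea a bit more concrete, here is a minimal sketch of how an aggregator might measure how often each field is actually filled in before deciding to build a filter or icon around it. Everything in it is an assumption for illustration: the field names, the sample records, and the `field_completion` and `usable_fields` helpers are hypothetical, not part of any repository's real API.

```python
from typing import Dict, Iterable, List, Optional

# A metadata record is modelled as a simple field -> value mapping (assumption).
Record = Dict[str, Optional[str]]


def field_completion(records: List[Record], fields: Iterable[str]) -> Dict[str, float]:
    """Return, for each field, the fraction of records with a non-empty value."""
    total = len(records)
    rates: Dict[str, float] = {}
    for field in fields:
        # Missing keys, None and whitespace-only strings all count as "not filled in".
        filled = sum(1 for r in records if (r.get(field) or "").strip())
        rates[field] = filled / total if total else 0.0
    return rates


def usable_fields(rates: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Fields complete enough to be worth building a filter or icon around."""
    return sorted(f for f, rate in rates.items() if rate >= threshold)


# Hypothetical harvested records, invented purely for this example.
records: List[Record] = [
    {"title": "Intro to Statistics", "licence": "CC BY", "education_level": ""},
    {"title": "Cell Biology Slides", "licence": "CC BY-SA", "education_level": "HE"},
    {"title": "French Verbs Quiz", "licence": None},
    {"title": "Circuit Lab Notes", "licence": "CC BY", "education_level": None},
]

rates = field_completion(records, ["title", "licence", "education_level"])
print(rates)
print(usable_fields(rates))
```

In this made-up sample only the title and licence fields clear the threshold, so a service would have no reason to build an education-level filter yet. The 0.5 default is just my rough guess from above, not a tested figure.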

To make the effort worth it you need:

  • A use case to build metadata for
  • Enough other services planning to use the same metadata
  • Enough content depositors willing to see the benefit in providing richer metadata

I think we have clearer use cases now than a few years ago:

  • Use Case for End Use: OERu course authors
  • Use Case for Discovery: Learning Registry
  • Use Case for Deposit: into and between repository services such as Jorum and Connexions

In terms of enough services and enough depositors understanding the benefit of more consistent and complete metadata… I think we are getting there. The Learning Registry could potentially provide a driver for services starting to trust that they can rely on metadata being there. In fact, I have been thinking that this space is maturing quite fast (see November 2011). Kathi Fletcher’s OER roadmap work articulates this opportunity really well, and she’s working on OERPub, which is like SWORD for OER.

So is this metadata issue worth another look?

Is there space for a shared solution to tagging open content for education with more consistent metadata?

The way tagging works is mostly like this:


You might even do a look up to an external authority file rather than create it yourself – this is the way Names works, for example.
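As a rough sketch of that lookup model: a tagging tool takes whatever the depositor types and resolves it against an authority file, storing the authority's identifier rather than the free text. The authority entries, identifiers, and the `build_index` and `resolve_tag` helpers below are all hypothetical. A real service would fetch the file from an external provider rather than hard-code it.

```python
from typing import Dict, List, Optional

# Hypothetical authority file: canonical identifier -> preferred label and known variants.
AUTHORITY: Dict[str, Dict[str, object]] = {
    "subject:maths": {"label": "Mathematics", "variants": ["maths", "math", "mathematics"]},
    "subject:bio": {"label": "Biology", "variants": ["biology", "life sciences"]},
}


def build_index(authority: Dict[str, Dict[str, object]]) -> Dict[str, str]:
    """Map every known variant (case-insensitively) to its canonical identifier."""
    index: Dict[str, str] = {}
    for term_id, entry in authority.items():
        variants: List[str] = list(entry["variants"])  # type: ignore[arg-type]
        for variant in variants:
            index[variant.casefold()] = term_id
    return index


def resolve_tag(free_text: str, index: Dict[str, str]) -> Optional[str]:
    """Return the canonical identifier for a typed tag, or None if it is unknown."""
    return index.get(free_text.strip().casefold())


index = build_index(AUTHORITY)
print(resolve_tag(" Maths ", index))        # subject:maths
print(resolve_tag("Life Sciences", index))  # subject:bio
print(resolve_tag("Knitting", index))       # None
```

The point of the design is that two repositories whose depositors type "Maths" and "mathematics" end up holding the same identifier, which is what lets a third party rely on the field.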

So using that sort of model, what about taking some of the tagging job and creating a shared service for it? A service that accepts that different vocabularies will change at different speeds? That coordinates the provision of authority files in those areas that are key to educational content? That shares the work of creating more attractive tagging interfaces for different devices and third-party systems? That gives third-party developers information in one place about the extent of metadata they can expect to build for?

This is the sort of development taking place in the open access repositories space, and I think we should consider whether we are ready to do that for open content for education yet.


I’m pretty sure this would work with SWORD and with the Learning Registry. It’s another piece of the puzzle.

It is not a standard. It is not a vocabulary. It is a vocabulary service provider. It doesn't mandate, it doesn't validate (though validation tools could be built around it).

It can support the work of advocates of accessibility, open licensing, etc by giving them vocabularies they can build tools and validators around. They can build all sorts of "how open are you" tools. And they can use the vocabularies of self classification as collections/filter criteria, to include only those resources that meet their requirements.

It can support the work of third party providers by letting them see what they can expect from content providers. It brokers between provision and use of metadata.

The working groups might already exist – the vocabularies might already exist – for example the OER Commons Evaluation Tool for pedagogy, Creative Commons tools for licensing, and this sort of guidance. I would love to see a shared global subject vocabulary tool. Even just for the top 20-50 categories. Imagine what that could do for subject-based services! (It would need much more tagging than that, obviously, but it would help developers and intermediaries focus in on relevant collections from around the world.)

Now to go back to the use cases. The same sorts of interfaces that are built to support tagging could also be used to provide filtering. So if the content HAS to be editable using free software, or it HAS to be for group teaching, or it HAS to be renderable to screen readers, there is a focus for where that use case gets articulated, and the arguments for meeting it get pushed around the network of content providers. Enhancements to consistent metadata would be use case driven.
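Here is a small sketch of how the same vocabularies could drive filtering, assuming each resource carries tags drawn from shared vocabularies. The vocabulary names, the tag values, and the `meets` and `filter_resources` helpers are invented for illustration. None of them come from an existing service.

```python
from typing import Dict, List, Set

# Each resource's tags: vocabulary name -> set of values drawn from that vocabulary.
Tags = Dict[str, Set[str]]


def meets(tags: Tags, required: Dict[str, str]) -> bool:
    """True only if the resource carries every required value (the "HAS to be" test)."""
    return all(value in tags.get(vocab, set()) for vocab, value in required.items())


def filter_resources(resources: List[dict], required: Dict[str, str]) -> List[str]:
    """Return the titles of the resources that satisfy every requirement."""
    return [r["title"] for r in resources if meets(r["tags"], required)]


# Hypothetical tagged resources, invented for this example.
resources = [
    {"title": "Algebra Workbook",
     "tags": {"editable_with": {"free-software"}, "teaching_mode": {"group", "individual"}}},
    {"title": "Chemistry Video",
     "tags": {"editable_with": {"proprietary"}, "accessibility": {"screen-reader"}}},
    {"title": "History Timeline",
     "tags": {"editable_with": {"free-software"}, "teaching_mode": {"individual"},
              "accessibility": {"screen-reader"}}},
]

# A use case expressed as requirements: editable with free software AND for group teaching.
print(filter_resources(resources, {"editable_with": "free-software", "teaching_mode": "group"}))
# An accessibility advocate's collection: renderable to screen readers.
print(filter_resources(resources, {"accessibility": "screen-reader"}))
```

A resource that has no tag for a required vocabulary is simply excluded, which is the incentive I'm describing: a use case that matters puts pressure on content providers to fill that field in.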

Am I mad? Perhaps I'm feeling overly optimistic today.

Feedback very welcome … I don't know the "how", I just know the "what" and "why" and some of the "who".


