Announcing Ozmeka - Linked-data enhancements to Omeka

In 2015, at the University of Technology Sydney and Western Sydney University, we did extensive work on using Omeka as a linked-data platform for research collections. While the work was completed mid-year, it took us a while to reach out to the Omeka community about it, which we're now doing.

In summary, we created an automated build process for firing up new machines using Vagrant, improved the Item Relations plugin in several ways including making it easier to find items you wish to link to, and made it easier to look up a variety of external thesauri when entering metadata.

You can read about the work on our blog: https://eresearch.uts.edu.au/2016/03/01/ozmeka.htm

Here's the text of the post; hop over to the blog for the formatted version. We'd love any feedback here in the forum!

--------------

What is Ozmeka?
Ozmeka is a set of plugins, or forks of plugins, to make Omeka into more of a linked data platform. That is, to enable Omeka repositories to embrace linked data principles, the first two of which are:

Use URIs as names for things.
Use HTTP URIs so that people can look up those names.

The new version of Omeka, known as Omeka S, promises to fix a lot of the issues we were targeting with our work on Ozmeka, but it is not ready for production use yet.

Where can I get it?
Everything we're talking about here is available on GitHub, under the Ozmeka project. There is a GitHub Ozmeka repository where we did some lightweight project management, with milestones and issues, but the plugins are all in individual repositories. We'll talk about them below.

What specifically did you produce?
We made changes to a number of Omeka plugins, which we hope improved them, and produced some scripts for pushing data into Omeka.

Item Relations
The Omeka Item Relations plugin already existed, and it was one of the main reasons we chose Omeka, as it allowed repository items not only to have URIs but also to be related to each other using terms from standard vocabularies. This meant that instead of having metadata like:

This item <dc:Creator> "Some name"

You can have:

This item <dc:Creator> #21

Where #21 is a reference to another item in Omeka. This is a huge step up in metadata quality over using strings as values.
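
To make the difference concrete, here is a minimal Python sketch of the two shapes of metadata. It is an illustration only, not how the Item Relations plugin stores its data: the Relation class, the items dictionary and the helper functions are hypothetical, and the /items/show/ URL pattern is assumed from a default Omeka 2.x install.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One statement of the form: subject item, predicate, object item."""
    subject_id: int
    predicate: str
    object_id: int


# Hypothetical stand-in for the items table: item ID -> title.
items = {7: "A photograph", 21: "Some name"}

# String-valued metadata: the creator is just text, with nothing to look up.
as_string = {"subject_id": 7, "predicate": "dc:creator", "value": "Some name"}

# Relation-valued metadata: the creator is item #21, which has its own record.
as_relation = Relation(subject_id=7, predicate="dc:creator", object_id=21)


def resolve(relation, items):
    """Follow the relation to the title of the item it points at."""
    return items[relation.object_id]


def item_uri(base_url, item_id):
    """Build a URI for an item, assuming the default /items/show/ID pattern."""
    return f"{base_url.rstrip('/')}/items/show/{item_id}"
```

With the relation form, renaming item #21 changes what every related item displays, whereas the string form would need each copy of the name to be edited by hand.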

What we did:

Added a lookup to item relations, so you can find the thing you're trying to relate to. Previously you had to know the ID. Yes, the user interface could be improved but it works efficiently.

Added Item Relations to the API, so that external code can relate items. This is used in some of the utilities.

Added English glosses to relations so they make sense when displayed on the page. For example, the creator relation from the Dublin Core metadata standard is ambiguous. Does "This item Creator Somebody" mean that somebody created the item, or that somebody was created by the item? We added the ability for relations to be displayed like: "This item was created by Somebody".

Added the ability to create new Item Relations vocabularies (if only via the API).

These all seem like good candidates to roll into the ItemRelations plugin.
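
The English glosses mentioned above amount to a lookup from a relation's term to a readable phrase. A rough Python sketch of the idea follows; the GLOSSES table and the describe function are made up for illustration and are not taken from the plugin's code.

```python
# Hypothetical gloss table: vocabulary term -> phrase that reads as English.
GLOSSES = {
    "dc:creator": "was created by",
}


def describe(subject, predicate, obj, glosses=GLOSSES):
    """Render a relation as a sentence, falling back to the bare term."""
    phrase = glosses.get(predicate, predicate)
    return f"{subject} {phrase} {obj}"
```

Falling back to the bare term means a relation without a gloss is still displayed, just less readably.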

Added URIs to the core data model / worked on lookups
One of the limitations of Omeka 2.x is that it only accepts simple text values for metadata. Omeka S fixes this: metadata values can be text, a URI or a reference to another item. After much soul searching, Thom added a new database field to allow any metadata item to have a URI, and linked this into some new work on external lookups, so that we can let people select metadata values from external services, or input a URI as a value. The lookups presently cover SCOT (the Australian Schools Online Thesaurus), Geonames and the preexisting US Library of Congress lookup. Given that this is the approach taken in Omeka S, this looks like an obvious thing to retrofit to Omeka 2.x.
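
As a sketch of what "a metadata value with an optional URI" means, the Python below models a value as text plus an optional URI. The MetadataValue class and its field names are assumptions for illustration; they do not reflect the actual database schema change, and the example.org URI is a placeholder rather than a real thesaurus entry.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class MetadataValue:
    """A display label, optionally backed by a URI from an external service."""
    text: str
    uri: Optional[str] = None

    def is_linked(self) -> bool:
        """True when the value carries a usable HTTP(S) URI."""
        if not self.uri:
            return False
        parsed = urlparse(self.uri)
        return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# A plain text value, as plain Omeka 2.x would store it.
plain = MetadataValue("Sydney")

# The same label with a URI attached (placeholder URI, not a real record).
linked = MetadataValue("Sydney", "http://example.org/place/1")
```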

Sundry
We also have a simple modal dialogue to ensure visitors accept Terms and Conditions, an extra option (filter by Subject) in Search and a lightweight plugin which renders attached HTML files inline in an item.
The Seasons theme has also been slightly modified so that linked images are presented on a page with their metadata instead of as raw image files.

Auto-ozmeka: making it easy to deploy a new server
Auto-ozmeka is a collection of scripts that use Git, Vagrant, Ansible and VirtualBox to create a virtual machine. The VM will have a fresh installation of Omeka and the suite of Ozmeka plugins ready to use. Solr search is preconfigured.

Enable the Ozmeka plugins you wish to use in the web GUI, add any others you may need, and you're ready to start developing your linked-data website.

In the near future we expect to be extending this script to provision production environments as well.

Utilities
In addition to the plugins, we developed a few tools for using Omeka over the API. Work started on this at Western Sydney, and continued at UTS. The main tool, xlsx2omeka.py, takes a spreadsheet of data and imports it into an Omeka repository, including the ability to create collections and upload multiple different types of item and multiple attachments per item.
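
To give a feel for what such an importer has to do, here is a minimal sketch that turns one spreadsheet row into the kind of JSON body the Omeka 2.x API accepts when creating an item. This is not the code in xlsx2omeka.py. The row_to_item function is hypothetical, and the element IDs in ELEMENT_IDS are assumed values that should be looked up from the target installation's elements resource rather than trusted.

```python
import json

# Assumed mapping from column names to Omeka element IDs. These numbers are
# an assumption and can differ between installations.
ELEMENT_IDS = {"Title": 50, "Description": 41}


def row_to_item(row, element_ids=ELEMENT_IDS, collection_id=None):
    """Convert one row (column name -> cell value) into an item payload."""
    element_texts = [
        {"element": {"id": element_ids[column]}, "html": False, "text": str(value)}
        for column, value in row.items()
        # Skip unknown columns and empty cells.
        if column in element_ids and value not in (None, "")
    ]
    item = {"public": False, "element_texts": element_texts}
    if collection_id is not None:
        item["collection"] = {"id": collection_id}
    return item


row = {"Title": "A letter", "Description": "", "Notes": "ignored"}
payload = row_to_item(row, collection_id=3)
body = json.dumps(payload)  # the string that would be sent in a POST request
```

The sketch stops at building the request body; sending it, handling API keys and uploading attachments are left out.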

Our spreadsheet uploader works, but it's clumsy. I (Peter) have started working on a new approach that uses CSV files rather than spreadsheets and does not require as much fiddling around, but it is not yet very mature and is undocumented. The idea with the new approach is to allow the same data to be uploaded to multiple types of repository. We're also working on an uploader for Fedora 4, which is very much a work in progress.

Hi,

Interesting! I'm going to try them.

About the import process, you can have a look at Archive Folder (https://github.com/Daniel-KM/ArchiveFolder), which can import ODS (spreadsheet) files, XML METS and ALTO, XML EAD, and unformatted files.

Daniel Berthereau
Infodoc & Knowledge management