Omeka and FAIR data principles

The Omeka Team has long had a commitment to interoperability and reuse facilitated through structured metadata – from the principles that fueled Omeka Classic’s initial design to the decision to put linked open data at the heart of the data model for Omeka S. As a result, we are committed to supporting the efforts of our users to adhere to the guidelines laid out in the FAIR Principles. At the same time, we recognize that the world of cultural heritage stewardship is implicated in centuries of dispossession and colonial appropriation, so we are equally invested in supporting efforts to implement the CARE principles for Indigenous Data Governance across our platforms. (We’ll discuss that in a subsequent post.)

FAIR Principles

Born out of an initial workshop in Leiden in 2014, the FAIR Guiding Principles were articulated and refined, then published in Scientific Data in 2016. The principles were designed to facilitate the creation and publication of research and collections data that is accessible and reusable through both human and machine discovery. Over the last dozen years, data management plans have become a regular feature of both government and private funding applications. As a result, digital humanists and stewards of digital cultural heritage collections have become accustomed to articulating a concrete plan for stewarding their data beyond the life of basic project development. Nonetheless, the products of that work have often been heterogeneous and difficult for other scholars to work with. Adherence to FAIR principles helps to smooth the edges of that heterogeneity and to increase the likelihood of data reuse. The principles “act as a guide to data publishers and stewards to assist them in evaluating whether their particular implementation choices are rendering their digital research artefacts Findable, Accessible, Interoperable, and Reusable.”

Omeka Classic’s architecture was set in the mid-2000s and as such does not offer as many FAIR-supporting elements as Omeka S, which was designed as a linked open data application in the mid-2010s. Nonetheless, both platforms have a number of features built into their core and available extensions that facilitate the publication of FAIR digital assets.

This work is as much social as it is technical. All description and data work happens in a context, and that context is essential to governing the way that the data is formed, published, and reused. Consensus on description standards must be achieved within a community of practice so that members of that community invest in and deploy the shared methods for knowledge representation within their field. As a result, a number of the key FAIR principles can only be determined by users and their colleagues. Once communities have settled on their approach to description, individual projects can use the FAIR Implementation Profile to evaluate how well their work satisfies the FAIR goals. The profile's guiding questions can help users understand the decisions they need to make to satisfy the principles of the model.

FAIR in the context of Omeka Classic and Omeka S

The following is an effort to articulate the Omeka features that support the individual elements of the FAIR principles as laid out in the implementation profile. The profile refers to both metadata and datasets. In Omeka Classic and Omeka S, datasets can be packaged and attached as media/files to Items, thus enjoying all of the descriptive capacity of Items in each platform.

Findable

F1. (Meta)data are assigned a globally unique and persistent identifier

What globally unique, persistent, resolvable identifier service do you use for metadata records, and/or datasets?

Omeka S: ARK (via EZID) or DOI (via DataCite) identifiers are available through the Persistent Identifier module; Omeka S also produces URIs rooted in the installation's own URL.
Omeka Classic: No persistent identifier feature.
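As a rough illustration of what "resolvable" means for the two identifier schemes named above, the following Python sketch turns a bare ARK or DOI into a URL that a public resolver can dereference. The function name and the sample identifiers in the usage note are hypothetical. The resolver hosts (n2t.net for ARKs, doi.org for DOIs) are the general-purpose public resolvers and are an assumption here, not something specific to the Persistent Identifier module.

```python
def resolver_url(pid: str) -> str:
    """Return a resolver URL for a bare ARK or DOI string (illustrative helper)."""
    pid = pid.strip()
    if pid.startswith("ark:"):
        # ARKs are commonly dereferenced through the N2T resolver.
        return "https://n2t.net/" + pid
    if pid.lower().startswith("doi:"):
        # Drop the "doi:" scheme label so only the DOI name remains.
        pid = pid[4:]
    if pid.startswith("10."):
        # Every DOI name begins with the "10." directory indicator.
        return "https://doi.org/" + pid
    raise ValueError(f"not an ARK or DOI: {pid!r}")
```

For example, resolver_url("ark:/12345/x9abc") yields "https://n2t.net/ark:/12345/x9abc", where the ARK itself is a made-up value.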

F2. Data are described with rich metadata (defined by R1 below)

Which metadata schemas do you use for findability?

Omeka S: Omeka S comes pre-loaded with the following vocabularies: Dublin Core; Dublin Core Type; Bibliographic Ontology; and Friend of a Friend. Users can import any other LOD Vocab that might be domain appropriate.
Omeka Classic: Dublin Core metadata element set is used to structure description for items, files, and collections.
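To make the role of these vocabularies concrete, the sketch below expands a prefixed term such as dcterms:title into its full URI. The prefix-to-namespace table reflects our understanding of the defaults for the four pre-loaded vocabularies; treat it as an assumption and confirm it against the Vocabularies page of your own installation. The function and constant names are hypothetical.

```python
# Namespace URIs for the four pre-loaded vocabularies (assumed defaults).
PRELOADED_VOCABULARIES = {
    "dcterms": "http://purl.org/dc/terms/",
    "dctype": "http://purl.org/dc/dcmitype/",
    "bibo": "http://purl.org/ontology/bibo/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_term(term: str) -> str:
    """Expand a prefixed term like 'dcterms:title' into a full URI."""
    prefix, separator, local_name = term.partition(":")
    if not separator or prefix not in PRELOADED_VOCABULARIES:
        raise KeyError(f"unknown vocabulary prefix in {term!r}")
    return PRELOADED_VOCABULARIES[prefix] + local_name
```

Any vocabulary a user imports would add another entry to the table in the same way.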

F3. Metadata clearly and explicitly include the identifier of the data they describe

What is the schema that links the persistent identifiers of your data to the metadata description?

Omeka S: In the JSON-LD output, a PID appears as the value of a property in the Item description. The API output also provides context: media have stable URIs for the installation, Item sets are identified, and Sites where content is published are listed. These URIs are stable within the installation, but they are not technically PIDs.
Omeka Classic: No persistent identifier feature.
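The relationship between an Item, its PID, and its surrounding context in the Omeka S JSON-LD can be sketched as follows. The sample record is a trimmed, made-up Item representation, and the choice of dcterms:identifier as the property holding the PID is an assumption for illustration only; the property actually used depends on how the installation is configured.

```python
# A trimmed, hypothetical Omeka S Item representation (JSON-LD as a dict).
SAMPLE_ITEM = {
    "@id": "https://example.org/api/items/1",
    "@type": "o:Item",
    "o:id": 1,
    "o:media": [{"@id": "https://example.org/api/media/2", "o:id": 2}],
    "o:item_set": [{"@id": "https://example.org/api/item_sets/3", "o:id": 3}],
    "o:site": [{"@id": "https://example.org/api/sites/1", "o:id": 1}],
    "dcterms:identifier": [{"type": "literal", "@value": "ark:/12345/x9abc"}],
}


def describe_identifiers(item, pid_property="dcterms:identifier"):
    """Collect the PID values and installation-level URIs found in an Item."""

    def linked(key):
        # Each linked resource is a small object carrying its own "@id".
        return [ref["@id"] for ref in item.get(key, []) if "@id" in ref]

    pids = []
    for value in item.get(pid_property, []):
        # Literal values use "@value"; URI values use "@id".
        pid = value.get("@value") or value.get("@id")
        if pid:
            pids.append(pid)

    return {
        "item": item.get("@id"),
        "pids": pids,
        "media": linked("o:media"),
        "item_sets": linked("o:item_set"),
        "sites": linked("o:site"),
    }
```

Calling describe_identifiers(SAMPLE_ITEM) separates the one value that is a true PID from the URIs that are only stable within the installation.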

F4. (Meta)data are registered or indexed in a searchable resource

In which registry are your metadata records, and/or datasets indexed?

User determined. The United States does not have a single aggregator of digital cultural heritage data. Europeana maintains a data repository, with its own data model, that serves that purpose for the European Union. Other regions and fields have a range of options that support and disseminate their work.

Accessible

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol

A1.1 The protocol is open, free, and universally implementable

Which standardized communication protocol do you use for metadata records, and/or datasets?

Omeka S: https and REST API.
Omeka Classic: https and REST API.
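Retrieval by identifier over https can be sketched in a few lines. Both platforms serve their REST API beneath an /api/ path on the installation, so the helper below only assembles the request URL; the base URL is a placeholder, the function name is hypothetical, and the page parameter in the usage note is offered as an example of a query parameter rather than a full account of either API.

```python
from urllib.parse import urlencode


def api_url(base_url, resource, resource_id=None, **params):
    """Build a REST API URL such as https://example.org/api/items/1."""
    url = f"{base_url.rstrip('/')}/api/{resource}"
    if resource_id is not None:
        # A single record is addressed by appending its numeric ID.
        url += f"/{resource_id}"
    if params:
        # Remaining keyword arguments become the query string.
        url += "?" + urlencode(params)
    return url
```

For example, api_url("https://example.org", "items", 1) addresses one Item, and api_url("https://example.org", "items", page=2) addresses the second page of the Item list.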

A1.2 The protocol allows for an authentication and authorisation procedure, where necessary

Which authentication & authorisation service do you use for metadata records, and/or datasets?

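Omeka S: https and REST API; requests are authenticated with an API key (a key identity and key credential pair).
Omeka Classic: https and REST API; requests are authenticated with an API key.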
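Both REST APIs accept API keys as query parameters: Omeka S expects a key_identity and key_credential pair, while Omeka Classic expects a single key parameter. The following sketch appends whichever credentials are supplied to an existing API URL. The function name and the sample values are hypothetical, and real keys should only ever be sent over https.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_api_key(url, *, key=None, key_identity=None, key_credential=None):
    """Append Omeka Classic ('key') or Omeka S (identity/credential) credentials."""
    if key is not None:
        credentials = [("key", key)]
    elif key_identity and key_credential:
        credentials = [
            ("key_identity", key_identity),
            ("key_credential", key_credential),
        ]
    else:
        raise ValueError("supply key, or both key_identity and key_credential")

    parts = urlsplit(url)
    # Preserve any query parameters already present on the URL.
    query = parse_qsl(parts.query) + credentials
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Requests made without credentials still work for public resources; keys matter for write operations and for material that is not public.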

A2. Metadata are accessible, even when the data are no longer available

Which metadata preservation policy do you use?

Omeka S: User determined; Files can be removed at any time.
Omeka Classic: User determined; Files can be removed at any time.

Interoperable

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

Which knowledge representation languages (allowing machine interoperation) do you use for metadata records, and/or datasets?

Omeka S: By default, Omeka S embeds JSON-LD in resource browse and show pages so that machines can discover the metadata. This can be disabled in the Installation Administrative Settings for Display. In addition, the Output Formats module can visibly expose supplementary formats to human visitors: JSON-LD (application/ld+json); Notation3 (text/n3); N-Triples (application/n-triples); RDF/XML (application/rdf+xml); and Turtle (text/turtle). Finally, all information and formats are also accessible through the API.
Omeka Classic: Standard output formats include omeka-xml, omeka-json, dcmes-xml, json, and rss2, which are all API accessible. Additionally, there is an available OAI-PMH Repository plugin.
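To show how a machine might pick up the JSON-LD that Omeka S embeds in a page, here is a minimal sketch using only the Python standard library. It assumes the metadata sits in a script element whose type is application/ld+json, which is the usual convention for embedded JSON-LD; the class name, function name, and sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return every embedded JSON-LD document found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# A made-up page standing in for an Omeka S show page.
SAMPLE_PAGE = (
    '<html><head><script type="application/ld+json">'
    '{"@id": "https://example.org/api/items/1", '
    '"dcterms:title": [{"@value": "Sample"}]}'
    "</script></head><body></body></html>"
)
```

Running extract_jsonld(SAMPLE_PAGE) returns a list containing one dictionary whose "@id" is the Item's API URI.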

I2. (Meta)data use vocabularies that follow FAIR principles

Which structured vocabularies do you use to annotate your metadata records, and/or datasets?

Omeka S: User determined; see Value Suggest module (available Vocabularies).
Omeka Classic: User determined; see LC Suggest plugin.

I3. (Meta)data include qualified references to other (meta)data

Which models, schema(s) do you use for your metadata records, and/or datasets?

Omeka S: User determined through the creation of Resource Templates from any uploaded LOD Vocab.
Omeka Classic: Dublin Core metadata element set is used to structure description of items, files, and collections; users may add appropriate elements to Item description through the use of Item Type metadata.

Reusable

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes

R1.1. (Meta)data are released with a clear and accessible data usage license

Which usage license do you use for your metadata records, and/or datasets?

Omeka S: User determined. See the Rights Statement vocabulary in the Value Suggest module.
Omeka Classic: User determined.
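One way to check whether an Omeka S Item carries a machine-readable usage license is to look for URI values on its rights properties. The sketch below does that against a made-up Item; the choice of dcterms:rights and dcterms:license as the properties to inspect is an assumption, since each project decides where its license statement lives, and the function name is hypothetical.

```python
def usage_licenses(item, properties=("dcterms:rights", "dcterms:license")):
    """Return the URI values found on an Item's rights-related properties."""
    found = []
    for prop in properties:
        for value in item.get(prop, []):
            # URI-typed values carry "@id"; literal values carry "@value".
            uri = value.get("@id")
            if uri:
                found.append(uri)
    return found


# A hypothetical Item with one URI rights statement and one free-text note.
SAMPLE_RIGHTS_ITEM = {
    "dcterms:rights": [
        {
            "type": "uri",
            "@id": "http://rightsstatements.org/vocab/InC/1.0/",
            "o:label": "In Copyright",
        },
        {"type": "literal", "@value": "Contact the archive for permissions."},
    ],
}
```

Here usage_licenses(SAMPLE_RIGHTS_ITEM) returns only the rightsstatements.org URI, because the free-text note is not something a machine can resolve.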

R1.2. (Meta)data are associated with detailed provenance

Which metadata schemas do you use for describing the provenance of your metadata records, and/or datasets?

Omeka S: User determined through the creation of Resource Templates from any uploaded LOD Vocab (e.g. DCAT, PROV-O, ODRL). These can be applied to Item Sets or directly in the Item metadata.
Omeka Classic: User determined through the augmentation of Item Type metadata.

R1.3. (Meta)data meet domain-relevant community standards

Who is the community, and what are their domain-relevant community standards?

Omeka S: User determined.
Omeka Classic: User determined.
