CSV Import

The CSV Import module allows you to import items, item sets, media, and users into your Omeka S install from a CSV (comma-separated values), TSV (tab-separated values), or ODS (OpenDocument Spreadsheet) file. This module is only available to Global Administrator and Supervisor users.

CSV Import requires your Omeka S installation to have PHP working in order to run background import jobs. Before using CSV Import, you should confirm that PHP is being recognized from the System Information page.

Prepare your CSV file

Most spreadsheet editors (including Microsoft Excel, Google Sheets, and Apple Numbers) can export to CSV, TSV, or ODS format.

Note

CSV files for import must be encoded in UTF-8, so when exporting or saving a new document, be sure to check that the encoding is UTF-8.
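
If you are not sure how an exported file is encoded, a short script can check it and save a UTF-8 copy. The following is a minimal sketch, not part of CSV Import: the function name `ensure_utf8` and the Windows-1252 fallback are assumptions made for illustration, so adjust the fallback to whatever your spreadsheet software actually produces.

```python
from pathlib import Path


def ensure_utf8(src, dst, fallback_encoding="cp1252"):
    """Write a UTF-8 copy of src to dst and return the encoding src was read with."""
    raw = Path(src).read_bytes()
    try:
        # "utf-8-sig" reads plain UTF-8 and also drops a leading byte order mark.
        text = raw.decode("utf-8-sig")
        used = "utf-8"
    except UnicodeDecodeError:
        # Assumption: files that are not valid UTF-8 use the fallback encoding.
        text = raw.decode(fallback_encoding)
        used = fallback_encoding
    Path(dst).write_bytes(text.encode("utf-8"))
    return used
```

If the function returns the fallback encoding, the original file was not valid UTF-8, and the copy at `dst` is the one to upload.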

Most CSV Import options rely on you only importing one type of data: a list of items, a list of item sets, a list of media, etc. There is the option for a mixed-resource import, requiring one column that identifies the type of each row.

If the spreadsheet is already created, consider which columns you want to match to which vocabulary properties. Your CSV file must have a header row in order for the module to process it correctly, so you may need to add a row at the top with column names.

If you have multiple values for a single property, you can separate them within a cell using a multivalue separator. For example, for a work with multiple authors (E.B. White and William Strunk Jr.), the Creator column could contain "E.B. White;William Strunk Jr.", with a semicolon (;) as the multivalue separator. When imported into Omeka S, each of these appears as a separate value for the property (Creator: "E.B. White" and Creator: "William Strunk Jr."). The import result is the same whether or not you leave a space after the separator (as in "E.B. White; William Strunk Jr.").
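
The effect of a multivalue separator can be sketched in a few lines of Python. This only illustrates the behavior described above and is not the module's own code; the function name `split_multivalue` is hypothetical, and dropping empty pieces is an assumption.

```python
def split_multivalue(cell, separator=";"):
    """Split one spreadsheet cell into separate values, trimming surrounding spaces."""
    return [part.strip() for part in cell.split(separator) if part.strip()]


# With or without a space after the separator, the resulting values are the same.
without_space = split_multivalue("E.B. White;William Strunk Jr.")
with_space = split_multivalue("E.B. White; William Strunk Jr.")
```

Both calls produce the same two values, one for each Creator entry.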

Column names

You can manually map each column to its corresponding property, and you are required to manually map non-metadata columns, such as the file URL for upload. The module will automatically map metadata columns by the names provided in the header row, if they conform to the property terms of your installation's vocabularies in the format prefix:property. For example, a CSV file with a column header "dcterms:title" would automap to the Dublin Core Title property when the CSV is loaded for mapping. You can modify these automapped columns before import.

To find the terms you should use for your column headers, go to the Vocabularies tab from the admin dashboard. Click on the number of properties for the vocabulary you want to use (for example, Dublin Core in the image below).

In the table of vocabulary properties, there is a column for Term. Use the Term as the column heading for the property you want to automap in CSV Import. For example, "dcterms:abstract" would automap to the Dublin Core property "Abstract" and "foaf:firstName" would automap to the Friend of a Friend property "firstName".
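
Before uploading, it can help to see which of your column headers are written in the prefix:property shape. The sketch below builds a small sample file in memory and lists the headers that would need manual mapping. The pattern only checks the shape of a header, not whether the term exists in your installation's vocabularies, and the sample data and the names `looks_like_term` and `needs_manual_mapping` are made up for illustration.

```python
import csv
import io
import re

# Shape check only: a prefix, a colon, and a local name, with no spaces.
TERM_PATTERN = re.compile(r"^[A-Za-z][\w-]*:[A-Za-z][\w-]*$")


def looks_like_term(header):
    """Return True if the header has the prefix:property shape."""
    return bool(TERM_PATTERN.match(header.strip()))


# Build a small sample CSV in memory (hypothetical data).
rows = [
    {
        "dcterms:title": "The Elements of Style",
        "dcterms:creator": "E.B. White;William Strunk Jr.",
        "Notes": "checked 2 copies",
    }
]
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)

# Read the header row back and list the columns that do not look like terms.
buffer.seek(0)
headers = next(csv.reader(buffer))
needs_manual_mapping = [h for h in headers if not looks_like_term(h)]
```

Here `needs_manual_mapping` contains only "Notes", the one header that is not written as a term.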

arrow points to the Term column for Dublin Core properties.

There is a setting in the initial import settings to automap with simple labels. This will work with columns whose names match a vocabulary label, for example "title" or "abstract", without supplying the term. Note that this option defaults to Dublin Core (dcterms:title and dcterms:abstract) before proceeding through other installed vocabularies.

If your column names are not exact and the automapping feature does not recognize them, you should still label them something helpful so that you can manually map them while importing.

If you plan to batch-import metadata or properties supplied by a module (such as latitude and longitude from the Mapping module), or to use data types supplied by a module (such as those from the Value Suggest module), install and configure those modules first. This ensures that the fields exist in your site's data model before you try to import information into them. Data may be lost if you uninstall those modules later.

Initial import settings

Start an import by clicking on the CSV Import tab on the left-hand navigation. This will open the initial "Import Settings" page. For most spreadsheets directly exported from a software program into the correct format, these settings can be left on their defaults.

Use the "Choose File" button to select a spreadsheet from your computer.
From the CSV column delimiter dropdown, choose the character that separates values in a row. This should match the formatting of your file:
- comma (default)
- semi-colon
- colon
- tabulation
- carriage return
- space
- pipe (|).
From the CSV column enclosure dropdown, choose the option that encloses long text in your file, if applicable:
- double quote (default)
- quote
- hash (#).
From the Import type dropdown, select what you are importing:
- Items
- Item sets
- Media (a column matching Media to already existing Items is required)
- Mixed resources (spreadsheet can include Item sets, Items, and Media; a column identifying the type of each row is required)
- Users.
Check the box to Automap with simple labels. CSV Import will automatically map column headings written in the prefix:property format; if you check this box, it will also automatically map any column headings that match existing vocabulary property labels (such as "Title").
Comments will appear on the "Past Imports" page; you may find this useful to make a note about what is being imported and any settings you have chosen on this page, for example if you are working in batches or may wish to undo an import later.
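
If you are unsure which delimiter and enclosure your file uses, you can preview how the first rows split with a given combination before uploading. This is a minimal sketch using Python's standard csv module. It does not reproduce how CSV Import parses files, and the sample text and the function name `preview_rows` are assumptions for illustration.

```python
import csv
import io


def preview_rows(text, delimiter=",", enclosure='"', limit=3):
    """Parse the first few rows of CSV text with the given delimiter and enclosure."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=enclosure)
    return [row for _, row in zip(range(limit), reader)]


# A semicolon-delimited sample whose first cell contains a semicolon inside quotes.
sample = (
    "dcterms:title;dcterms:description\n"
    '"Letters; 1901-1905";"Family correspondence"\n'
)

right_choice = preview_rows(sample, delimiter=";")
wrong_choice = preview_rows(sample, delimiter=",")
```

With the matching delimiter, each row splits into two cells and the enclosed semicolon stays inside the title. With the wrong delimiter, the header row comes back as a single cell, which is a sign that the dropdown choice does not match the file.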

Click the "Next" button to continue with the import process.

Import items

To import items, select "Items" under the "Import type" on the first page.

When you click "Next", the page will load with the following tabs:

Map to Omeka S data

This tab displays a table with the columns from your spreadsheet as rows. Each row displays:

A checkbox
The column header from the spreadsheet
A plus symbol button for adding or modifying a mapping
A wrench symbol button for spreadsheet column options
A column displaying properties mapped, either automatically or manually
A trash can to delete existing mappings
A column to show the particular options selected (such as whether to look for multivalue separators, or visibility for that column).

Mappings for a spreadsheet with ten columns. Some of the columns, such as those named Description and Title, have automatically been mapped to Dublin Core properties.

Mapping options

To map a column header to a vocabulary property, click on the plus symbol button. This will open a drawer on the right-hand side of the screen.

The drawer has multiple options for mapping:

Properties allows you to select a property to map the column data to, from any of the installed vocabularies. Use the "Filter" field to search for a specific property.

Properties option open, showing all of the installed vocabularies for the Omeka S installation: Dublin Core, Bibliographic Ontology, Friend of a Friend, Scripto and OWL-Time Ontology.

Item-specific data has a dropdown to set an Item set by selected property. If you have a column identifying an Item set to which you want to add each item (rather than putting all of the imported items into the same Item sets on the "Basic Settings" tab), you can set how it maps using this dropdown. You can either use the Item set's internal ID, or any one of its properties (such as title).

Generic data also has a dropdown where you can set one of four options:

Resource template (by label): Set the template for an item by name. The name of the template as entered in the spreadsheet and the name of the template in Omeka S must match exactly.
Resource class (by term): Set the resource class for an item. The term for the class in the spreadsheet and in the Omeka S installation must match exactly; reference the Vocabularies tab of your installation. For example, enter "dctype:Dataset", "dcterms:Location", "bibo:Interview", or "foaf:Person" with a colon separating the vocabulary prefix and the term, without spaces.
Owner (by email address): Set an item's owner by email address. This must be the email address associated with the user's account in the Omeka S installation.
Visibility public/private: Set the visibility of the item. Use "private" or "public" in the spreadsheet.
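
Because these values must match exactly, it can be worth checking them in your spreadsheet before importing. The following sketch checks one row at a time. The column names (`visibility`, `resource_class`, `owner_email`) are hypothetical, and the checks only test the shape of each value; they cannot confirm that a class or user actually exists in your installation.

```python
import re

# Shape check for a class term such as "foaf:Person": prefix, colon, name, no spaces.
CLASS_TERM = re.compile(r"^[A-Za-z][\w-]*:[A-Za-z][\w-]*$")


def check_generic_data(row):
    """Return a list of problems found in one row (an empty list means none)."""
    problems = []
    if row.get("visibility") not in ("public", "private"):
        problems.append("visibility must be 'public' or 'private'")
    if not CLASS_TERM.match(row.get("resource_class", "")):
        problems.append("resource class must look like prefix:Term, without spaces")
    if "@" not in row.get("owner_email", ""):
        problems.append("owner must be an email address")
    return problems


good = {"visibility": "public", "resource_class": "foaf:Person", "owner_email": "editor@example.org"}
bad = {"visibility": "Public", "resource_class": "foaf: Person", "owner_email": "editor@example.org"}
```

Calling `check_generic_data(good)` returns an empty list, while `check_generic_data(bad)` reports the capitalized visibility value and the space inside the class term. Treating capitalization as an error is a deliberately strict assumption of this sketch.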

Media source allows you to import media along with your items, by selecting the sourcing method from the dropdown:

HTML
IIIF image (link)
IIIF presentation (link)
oEmbed (link)
URL
YouTube (link).

Other options may appear here based on your active modules, such as File Sideload.

Be sure to click the "Apply Changes" button at the bottom of the drawer, or nothing you set here will be kept.

To remove a mapping, click the trash can icon in the row for that data mapping. It will remove only the mapping, not the column data.

If you have data in a column in your CSV that you do not want to bring in to your Omeka S installation, simply do not map that column to a property or data type.

Column options

Column options are in addition to mappings. If you add options without also mapping column data to resource, media, or other data, nothing will be imported. If you have multiple mappings set up on a single column in your data, these options will apply to all of them.

To access options for a column of your CSV (represented by a row in the import table), click the wrench icon for that column heading.

This will open a drawer on the right side of the browser window with the following options:

Use multivalue separator: Check this box to use the multivalue separator for data in this column. You set the multivalue separator in the initial import page, but you can change it in the Basic Settings tab.
Language: Set the language for this column using the IETF Language tag for the language in which the text is written. This will override what you have entered in basic settings.
Import values as private: Check this box to set all property values in this column private.
Data type: A dropdown with at least three options, which correspond to the values one can use when adding properties to an item:
- Import as text (default).
- Import as URI reference. You can set the label for a URI by including the desired text after a space, for example: http://example.com Label Text Goes Here.
- Import as Omeka S resource. This will create linked resources. If you select this option, you must use the Resource identifier property dropdown that follows to choose which property is matched against the column's values to find the intended resource in your installation. The property's values must be unique, so "Title" may not be a good choice.
  - You can use the internal Omeka ID. A resource's ID is the number sequence at the end of the URL when on the view or edit page, so for /admin/item/11576 the ID is 11576. You can also see the resource's ID in the right-hand drawer on the resource's view page. Items, item sets, and media all have IDs.
  - You can include resources that are being made in the same CSV, as long as the resources being linked to have already been created in earlier rows and can be found with the unique property value indicated here. If you wish to do this, we recommend setting the batch number low (even to 1) on the Advanced Settings tab, to ensure resources are being fully created before another new resource tries to link to them.
- If you have certain modules installed, such as Numeric Data Types, there may be additional data type options supplied by those modules.
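
The "URI followed by a label" cell format described above can be pictured as splitting the cell at its first space. This sketch shows that reading of the format; it is not the module's own parsing code, and the function name `split_uri_and_label` is hypothetical.

```python
def split_uri_and_label(cell):
    """Split a cell into (uri, label); label is None when no text follows the URI."""
    uri, _, label = cell.strip().partition(" ")
    return uri, (label.strip() or None)


with_label = split_uri_and_label("http://example.com Label Text Goes Here")
without_label = split_uri_and_label("http://example.com")
```

One consequence of this format is that the URI itself cannot contain a space, since everything after the first space is treated as the label.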

Be sure to click the "Apply changes" button at the bottom of the drawer in order to save your changes. To remove a column option setting, click the wrench icon again and undo your changes manually.

Batch edit

When you select one or more rows in the table (columns from your CSV file), you can use the "Batch edit options" button to apply the column options described above — multivalue separator, language, data type, and property privacy — to multiple CSV columns at once.

a screenshot of the Mapping tab, with two column boxes checked. On the right side of the screen, a drawer offers options for changing the settings.

Be sure to click the "Apply Changes" button at the bottom of the drawer in order to save your changes.

Item import basic settings

These settings apply to the entire CSV you are importing. Note that these settings can be overridden in the Map to Omeka S data tab: if a column is mapped for template, class, or owner, those values will override these settings, and so will column options for language and privacy.

Resource template: Select a resource template from the drop-down menu to apply to the imported items. You can use the search field at the top of the dropdown to narrow results or find a particular template. Note that resource templates may have required fields, and items will not import without all the required fields of the selected template. For example, if your spreadsheet has entries without a dcterms:title value, and the resource template requires titles, those rows will not import and errors will appear in the log.
Class: Select a class from the drop-down menu to apply to the imported items. You can use the search field at the top of the dropdown to narrow results or find a particular class.
Owner: Set the owner for the Items by selecting a user from the drop-down menu. You can use the search field at the top of the dropdown to narrow results or find a particular user.
Visibility: Set the visibility of the imported items as public or private.
Item sets: Add the imported items to a specific item set or sets using the dropdown menu.
Sites: Add the imported items to the specified site or sites. Global and user-specific default sites will be preselected here.
Multivalue separator: Enter the multivalue separator character here, if you have used one.
- The columns of data in your CSV are separated by the column delimiter (a comma by default). Within a column, you can use a different special character, for example a semicolon, to separate multiple values. This is how you can specify multiple creators, multiple subjects, and so on.
Language: Set the language of the values in the spreadsheet using the appropriate IETF Language tag.
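
Since rows that lack a field required by the selected resource template will not import, you may want to find those rows before starting. The sketch below lists the spreadsheet rows where a required column is empty. It assumes that you know which columns your template requires, that no cell contains a line break (so row numbers line up with the file), and the function name `rows_missing` and the sample data are hypothetical.

```python
import csv
import io


def rows_missing(csv_text, required_columns):
    """Return (row_number, [empty required columns]) for each incomplete row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    incomplete = []
    # Row 1 is the header row, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        empty = [c for c in required_columns if not (row.get(c) or "").strip()]
        if empty:
            incomplete.append((row_number, empty))
    return incomplete


sample = (
    "dcterms:title,dcterms:creator\n"
    "Charlotte's Web,E.B. White\n"
    ",William Strunk Jr.\n"
)
report = rows_missing(sample, ["dcterms:title"])
```

In this sample, `report` points at row 3, which has a creator but no title.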

Note

If you are uploading different formats of data (for example, some text-based creator names and some URI-based creator links) into the same field (dcterms:creator, in this case), use two columns (named something helpful like "dcterms:creator-text" and "dcterms:creator-uri"). Upon import, map both columns to the same property, then use the wrench icon to open the column options and select the correct data type for each column.
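
If your spreadsheet currently mixes both kinds of value in one column, you can split it into the two columns suggested above before importing. This is a minimal sketch: it assumes a source column named "creator" and treats any value starting with http:// or https:// as a URI, which may be too simple for your data.

```python
def split_creator(value):
    """Return (text_value, uri_value) for one creator cell."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return "", value
    return value, ""


# Hypothetical rows with a mixed "creator" column.
rows = [{"creator": "E.B. White"}, {"creator": "http://example.com/strunk"}]
for row in rows:
    text, uri = split_creator(row.pop("creator"))
    row["dcterms:creator-text"] = text
    row["dcterms:creator-uri"] = uri
```

After the loop, each row has the two new columns, with one of them left empty.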

Item import advanced settings

There are two options on the "Advanced Settings" tab.

Advanced settings page showing only the Action dropdown and the field for number of rows to process.

The "Action" setting determines what the import process does with each row of your CSV. The options are:

Create a new resource: Default option. Each row in the CSV will become a new resource.
Append data to the resource: Add new data to the resource, based on an identifier for an existing resource. (Cannot be undone.) This option allows you to supply multiple values for the same item; each row will be appended (that is, you can append one title to an item in one row, and append another title to the same item in another row). Note that you cannot supply resource template or class assignations in the rows of your CSV with an Append process; you will get an error.
Revise data of the resource: Replace existing data of the resource with data from the CSV, except if the corresponding cell in the CSV is empty. (Cannot be undone.)
Update data of the resource: Replace existing data of the resource with data from the CSV, even when the corresponding cell in the CSV is empty. (Cannot be undone.)
Replace all data of the resource: Remove all properties of the resource, and fill with new information from the sheet. (Cannot be undone.)
Delete the resource: Delete all matching resources. (Cannot be undone.)
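
The difference between Append, Revise, Update, and Replace is easiest to see on a small example. The following is a toy model based on one reading of the descriptions above, not the module's implementation. Resources and rows are modeled as dictionaries mapping a property to a list of values, an empty list stands for an empty cell, and the function name `apply_action` is hypothetical.

```python
def apply_action(action, existing, row):
    """Return the resource's values after applying one CSV row (toy model)."""
    result = {prop: list(values) for prop, values in existing.items()}
    if action == "append":
        # Add the row's values next to the existing ones.
        for prop, values in row.items():
            result.setdefault(prop, []).extend(values)
    elif action == "revise":
        # Replace only the properties whose cell is not empty.
        for prop, values in row.items():
            if values:
                result[prop] = list(values)
    elif action == "update":
        # Replace every property in the row; an empty cell clears the property.
        for prop, values in row.items():
            if values:
                result[prop] = list(values)
            else:
                result.pop(prop, None)
    elif action == "replace":
        # Drop everything and keep only what the row supplies.
        result = {prop: list(values) for prop, values in row.items() if values}
    else:
        raise ValueError(f"unknown action: {action}")
    return result


existing = {"dcterms:title": ["Old"], "dcterms:creator": ["A"], "dcterms:date": ["1900"]}
row = {"dcterms:title": ["New"], "dcterms:creator": []}  # the creator cell is empty
```

In this model, Append leaves the item with both titles, Revise changes the title and keeps the creator, Update changes the title and clears the creator, and Replace leaves only the new title because the date column is not in the CSV at all.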

If you select any action other than the default "Create a new resource", three additional settings will appear on the tab. These settings help the process determine which resources to take action on.

Resource identifier column: Select from a dropdown of the columns in your CSV. This is the data from your spreadsheet that identifies existing items in your Omeka S installation. Choose a unique identifier (for example, you might use the "Title" column from your CSV). This column does not need to be mapped in the other tab.
Resource identifier property: Select from a dropdown of all properties in your Omeka S installation. This should be the equivalent property in your Omeka S install to the column you selected above (for example, dcterms:title). This will only work with exact matches. If you have more than one resource with matching data, it will only take action on the oldest resource.
Action on unidentified resources: This option determines what to do when no matching resource exists in the Omeka S installation, when your selected action applies to an existing resource ("Append", "Revise", "Update", or "Replace"). This option is not used when the main action is "Create" or "Delete". Your options are:
- Skip the row and ignore its contents
- Create a new resource with the information supplied.
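
The way these three settings work together can be sketched as a lookup followed by a decision. This is an illustrative model only: resources are plain dictionaries, the resource with the lowest ID is assumed to be the oldest, and the names `find_target` and `plan_row` are hypothetical.

```python
def find_target(resources, identifier_property, identifier_value):
    """Return the oldest resource whose property has exactly this value, or None."""
    matches = [
        r for r in resources
        if identifier_value in r["values"].get(identifier_property, [])
    ]
    if not matches:
        return None
    # Assumption: a lower ID means the resource was created earlier.
    return min(matches, key=lambda r: r["id"])


def plan_row(resources, identifier_property, identifier_value, on_unidentified="skip"):
    """Decide what happens to one row: act on a match, create, or skip."""
    target = find_target(resources, identifier_property, identifier_value)
    if target is not None:
        return ("act_on", target["id"])
    if on_unidentified == "create":
        return ("create", None)
    return ("skip", None)


resources = [
    {"id": 12, "values": {"dcterms:title": ["Charlotte's Web"]}},
    {"id": 7, "values": {"dcterms:title": ["Charlotte's Web"]}},
    {"id": 9, "values": {"dcterms:title": ["Stuart Little"]}},
]
```

With these sample resources, a row identified by the title "Charlotte's Web" acts on resource 7 only, and a row with the title "charlotte's web" finds nothing because the match must be exact.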

In addition to the above, the Advanced Settings tab has an option to set the number of rows to process by batch. By default this is set to 20. However, if you are running into errors with an import you may want to set it to 5 or even 1 in order to troubleshoot and determine the source of the error.
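
To picture what the batch size changes, the sketch below groups rows the way a batch setting of 20 would, compared with a setting of 1. It only illustrates the grouping and says nothing about how the module schedules its work; the function name `batches` is hypothetical.

```python
def batches(rows, size=20):
    """Yield consecutive groups of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


rows = list(range(45))  # stand-in for 45 spreadsheet rows
default_sizes = [len(group) for group in batches(rows)]
one_by_one = [len(group) for group in batches(rows, size=1)]
```

With 45 rows and the default setting there are three batches (20, 20, and 5 rows). With a setting of 1, every row is its own batch, which makes it easier to tell which row caused an error.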

Note

Appending data allows you to supply multiple rows with the same identifier; each row's values will be appended alongside those from earlier rows.

Revising, Updating, and Replacing data will erase data that was supplied in earlier rows of your CSV, if later rows use the same identifier. If you wish to import multiple values (for example, two Creator values) in these processes, you can either: put them in two columns in the same row, mapped to the same property; or, put them in a single cell and use multivalue separators. Do not forget to specify your multivalue separator in the "Basic Settings" tab and check the "Use multivalue separator" box in the options (wrench icon) for each column.
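
If your spreadsheet already has several rows per identifier, you can collapse them into one row per identifier before a Revise, Update, or Replace import. This is a minimal sketch under a few assumptions: the identifier column is named "identifier", the multivalue separator is a semicolon, and no value contains the separator itself. The function name `merge_rows` is hypothetical.

```python
def merge_rows(rows, id_column, separator=";"):
    """Combine rows that share an identifier, joining their values with the separator."""
    merged = {}
    for row in rows:
        target = merged.setdefault(row[id_column], {id_column: row[id_column]})
        for column, value in row.items():
            if column == id_column or not value:
                continue
            if target.get(column):
                # Skip values that are already present for this identifier.
                if value not in target[column].split(separator):
                    target[column] += separator + value
            else:
                target[column] = value
    return list(merged.values())


rows = [
    {"identifier": "b1", "dcterms:creator": "E.B. White"},
    {"identifier": "b1", "dcterms:creator": "William Strunk Jr."},
    {"identifier": "b2", "dcterms:creator": "Anonymous"},
]
merged = merge_rows(rows, "identifier")
```

The two rows for "b1" become one row whose Creator cell reads "E.B. White;William Strunk Jr.", ready to be imported with the multivalue separator enabled for that column.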

Complete import

Once you have completed mappings, column options, and settings, click the "Import" button in the upper right corner of the browser window. This should start the import and redirect you to the "Past Imports" tab. You should see a confirmation message in green at the top of the screen saying "Importing in Job ID [number]".