Automating item creation with API and YAML

I've just upgraded to Omeka 2.1, largely because I wanted to play with the API. I'm using an Omeka installation to organize primary sources for a personal research project, and I'd like to streamline the item entry process, which typically involves lots of copying and pasting from a boilerplate file into the web form.

What I have in mind is a Python script that could use the API client to take a file of notes that I have prepared and create items in my database using POST requests.

I would (ideally) use YAML to keep my notes on the items, and then use either the Python script or a snippet in my text editor to fill in all of the fields that stay largely the same across items. The script would convert the YAML to JSON for posting, too.

In theory I think I see how this will work, but since I'm new to the API, I'm curious how much information the POST request has to contain. You can see from an example item on my site that I don't use very many of the DC fields:

http://wcaleb.rice.edu/omeka/items/show/70

What I don't know for sure is whether I would have to, for example, create the full elements_set and elements tree for each item, or just part of it. Any pointers to resources that would help would be appreciated.

To clarify, I'm not asking for help with the Python script or YAML-to-JSON part of this. I'm more trying to get a sense of the minimum that a POST request would have to include in order to create items roughly like the one I linked to.

From what I understand and saw on Twitter, I think you've pretty much got it. But I'll try to recap here to pull it together and let others see more.

So, you have some boilerplate fields. Looks like DC Rights and DC Publisher (maybe more) from your example.

If you want to just create a pile of new Items in Omeka with the boilerplate data filled in, what you'd need is to figure out the ids of the boilerplate elements, then fill in the text value and POST them up.

There are lots of ways to dig up the ids of the elements. If you are happy digging around the database, you can find the ids directly that way. Alternatively, as it seems you found, you can get a list of the elements at api/elements. From what you posted on Twitter, this does the trick: http://wcaleb.rice.edu/omeka/api/elements?pretty_print

You can read through that to find the ids you need. If you know the id of the element set (e.g., Dublin Core), you can also narrow things down with that: http://wcaleb.rice.edu/omeka/api/elements?pretty_print&element_set=1

If there are a lot of elements, you might need to page through: http://wcaleb.rice.edu/omeka/api/elements?pretty_print&page=2
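
For anyone scripting this, here is a rough Python sketch of that paging loop. It assumes that a page past the end of the list comes back as an empty JSON array and that no API key is needed to read api/elements on a public site; both are assumptions worth checking against your own installation. The function names and the max_pages cap are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import urlencode


def fetch_page(base_url, page, element_set=None):
    """GET one page of api/elements and return the decoded JSON list."""
    params = {"page": page}
    if element_set is not None:
        params["element_set"] = element_set
    url = f"{base_url.rstrip('/')}/api/elements?{urlencode(params)}"
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all_elements(base_url, element_set=None, get_page=fetch_page, max_pages=100):
    """Request successive pages until one comes back empty (assumed end marker)."""
    elements = []
    for page in range(1, max_pages + 1):
        batch = get_page(base_url, page, element_set)
        if not batch:
            break
        elements.extend(batch)
    return elements
```

Something like fetch_all_elements("http://wcaleb.rice.edu/omeka", element_set=1) should then return every element in that set as one list. The get_page argument is only there so the loop can be tried out without touching the network.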

Once you get the element id(s), the POST is pretty straightforward to create some mostly empty items with the boilerplate:

{
  "element_texts": [
    {
      "html": false,
      "text": "Published by me",
      "element": {"id": 45}
    },
    {
      "html": false,
      "text": "Fight! For your right! To Faaaaiiiiir Use!",
      "element": {"id": 47}
    }
  ]
}

POSTing that JSON will create an item with those elements. In my db, element id 47 is DC Rights and 45 is DC Publisher.
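
In Python, building and sending that body might look roughly like the sketch below. It assumes the API key is passed as a key query parameter on api/items and that the body is sent as application/json; check both against the API docs for your version. build_item_payload and post_item are hypothetical helper names, not part of any Omeka client.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_item_payload(texts_by_element_id, html=False):
    """Turn {element_id: text} into the element_texts structure shown above."""
    return {
        "element_texts": [
            {"html": html, "text": text, "element": {"id": element_id}}
            for element_id, text in texts_by_element_id.items()
        ]
    }


def post_item(base_url, api_key, payload):
    """POST the payload to api/items; the 'key' parameter is an assumption."""
    url = f"{base_url.rstrip('/')}/api/items?{urlencode({'key': api_key})}"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the ids from my db, build_item_payload({45: "Published by me", 47: "Fight! For your right! To Faaaaiiiiir Use!"}) produces the same structure as the JSON above.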

That seems to be the minimal thing. It sounds like your YAML might have additional data, but that'd just mean digging it up and figuring out the element ids to map it onto.

Hope that helps
Patrick

Thanks for the help, Patrick!

One additional question: you said, "In my db, element id 47 is DC Rights ..." Does that mean that element ids vary by database, so that the id for a Dublin Core field for me might not be the same as for someone else?

It's a fairly rare case that there would be variation from installation/db to installation/db, but I think it's a possibility. It's probably an overabundance of caution to double-check for each installation, but if I remember right some weird cases could produce a difference.

In case anyone else comes by, here's the script I'm working on.

The script also includes a Python dictionary mapping all of the element ids to their names, but per Patrick's last point, I'll probably need to adjust the script so that it gets this information automatically from the specific installation of Omeka.
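
A rough sketch of what that adjustment could look like, assuming each record returned by api/elements has "id" and "name" keys and that the records come from a single element set (element names are not guaranteed to be unique across sets). The helper names are hypothetical.

```python
def index_elements_by_name(elements):
    """Map element name -> id from a list of api/elements records."""
    return {element["name"]: element["id"] for element in elements}


def notes_to_element_texts(notes, name_to_id):
    """Convert notes keyed by element name into element_texts entries."""
    unknown = sorted(name for name in notes if name not in name_to_id)
    if unknown:
        # Fail loudly rather than silently dropping a field from the item.
        raise KeyError(f"No element id found for: {', '.join(unknown)}")
    return [
        {"html": False, "text": str(text), "element": {"id": name_to_id[name]}}
        for name, text in notes.items()
    ]
```

The notes dict here stands in for whatever the YAML file parses to, so the same notes file should work against any installation once the mapping has been fetched from it.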