Weird Omeka API Import error

Greetings all.

Today I tried to import data from a site on Omeka.net to a newly upgraded standalone Omeka site. I turned the API on at the omeka.net site and installed the plugin on the target site. All looked good.

The origin site has 196 items in one collection and 0 items in a second collection.

On the target site, I put in the API URL and key, then hit Submit.

It appears to import element sets, elements, item types, collections, and items, then hits an error and stops. (I glean this by repeatedly hitting the "Click to update status" link.)

Here's the weird part: when I look at the target site there are no new items. There are, however, new item types (good) and 196 new collections (bad). Given that I tried this numerous times before figuring out what was happening and then writing in to this forum, I now have nearly 2000 collections on my site--every time I clicked submit, nearly 200 new collections were created (super duper bad).

If I examine any of these new collections, it becomes clear that the API has imported each item from the origin collection not as an item but as a collection unto itself.

To see what I mean, go here: http://lgira.mesmernet.org/collections/browse

Anyone have an idea about why this happened and how to make the import work correctly?

ken.

PS: Is there any problem with me going into the database and deleting all these collections directly, rather than one at a time via the backend of the site?

Could you give a link to the origin, omeka.net site? That'll let us try the import and debug from here.

Regarding deleting the data, there should be a list of sites you have imported from at the bottom of the API Import page that will let you undo the import. If that list isn't there, then that's an additional problem for us to address.

Hi Patrick. Sure. The origin is ups.omeka.net.

Alas, there is only one item in the undo list, for the last time I submitted an import job. Because I did not undo each time (I didn't realize the nature of the problem until I'd tried to import the records several times), it seems that the ones that weren't undone right away are in there permanently now (until I tackle it with SQL).

Thanks in advance for your help!

ken.

Hmmm...a try on our end worked as expected. What version of the Omeka API Import plugin are you using?

On the undoing, even though there's just one entry it should remove everything from that API endpoint, so running that should clear everything out.

Running:

Omeka 2.0.4
PHP 5.3.27 (cgi-fcgi)
OS Linux 2.6.9-103.ELsmp i686
MySQL Server 5.5.30
OmekaApiImport 1.1

Unfortunately, I only ever get one "undo" item in the list, presumably for the last import I tried. If I don't use it, and instead try importing again, I don't then have two undo items; I still just have one. And if I use that undo, I then have none...and it doesn't delete any of the collections.

To put numbers to this:

I started with 1985 collections, ran the import, then had 2322 collections, ran the one undo that appeared, and after that completed successfully (so the message says), I still had 2322 collections.

Another weird thing I just noticed: the origin site has 335 items in it, but when I import them, it creates 337 new collections; leaving aside the item vs. collection issue, it seems like at least the number of items should be the same (i.e., 335 in origin, 335 in target).

Very mysterious....

Thanks again Patrick.

ksm.

New Info:

Upgraded to Omeka 2.2.2

Going to Omeka API Import plugin shows the following error (after taking about a minute to resolve--weird since the rest of the site works quickly):

The background.php.path in config.ini is not valid. The correct path must be set for the import to work.

Tried the import--no luck. Undo--no luck.

A forum search suggested that changing background.php.path in config.ini to /usr/local/bin/php might fix the problem, but it turned out that this was the path already there. So I deleted the path to see if Omeka would auto-detect it--nope, still throws the same error.

Finally, I went into phpMyAdmin to delete the 2300+ bogus collections and discovered that I don't know what other table omeka_collections is connected to. omeka_collections itself has 2322 records, but there's almost no data in each record, so there must be another table linked to omeka_collections with a key. Emptying omeka_collections alone, then, would still leave my database full of junk somewhere else, which I'd hate to do....

All ideas and insights welcome.

ken.

Update:

All that junk data is being sent to omeka_element_texts which currently has 38,000+ records in it.

Is scrubbing it as easy as deleting the bad "collection" records in omeka_element_texts and deleting all the bad collections from omeka_collections?

Clearly something in the plugin is causing records to be marked as collections instead of items.

Alternative: can I change all the record types in omeka_element_texts that are mislabeled "collection" to "item" and solve the problem (apart from a jillion duplicates)?

ken.

You might check your application/config folder and make sure the right config.ini file is being used.

I'm not sure why your items are coming in as collections - when I tried pulling, I got two collections and a large number of items.

You should only have one "undo" button per api location, but it should have deleted collections as well as items. Have you tried uninstalling and reinstalling the plugin, now that you've updated your core Omeka install?

Weirder and weirder!

On the php path, if you have ssh access to the server, running "which php" should tell you the correct path.
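If it helps, here is a minimal sketch of the same check in Python rather than at the shell. The function name find_php is made up for illustration; it only tests whether a configured path points at an executable file and otherwise falls back to a PATH search, which is roughly what "which php" does. It does not tell you whether the binary is the one Omeka needs.

```python
import os
import shutil

def find_php(candidate=None):
    """Return a usable PHP path, or None if none is found."""
    # Prefer the configured path when it points at an executable file.
    if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    # Otherwise search PATH, much like `which php` would.
    return shutil.which("php")

# The path below is the one mentioned earlier in this thread.
configured = "/usr/local/bin/php"
print(find_php(configured) or "no PHP binary found")
```

Whatever path it prints is a candidate for background.php.path in config.ini.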

On cleaning up the database, it might be quickest to just drop all the tables and reinstall if there's nothing else in the database.

If you go through individual tables, you'd want to empty out collections and element_texts for collections (changing the record_type column on element_texts won't work, since the ids won't match).
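To make the table-by-table cleanup concrete, here is a rough sketch, not a tested procedure for a live site. It uses Python's built-in sqlite3 with a simplified, made-up schema as a stand-in for the real MySQL tables; only the table names and the record_id and record_type columns come from this thread, and the helper name delete_bogus_collections is hypothetical. Back up the database before trying anything like this on real data.

```python
import sqlite3

def delete_bogus_collections(conn, bogus_ids):
    """Delete the given collections and their element texts.

    Returns (collections_removed, element_texts_removed).
    """
    if not bogus_ids:
        return (0, 0)
    ids = list(bogus_ids)
    marks = ",".join("?" for _ in ids)
    with conn:  # one transaction: both deletes are committed together
        texts = conn.execute(
            "DELETE FROM omeka_element_texts "
            f"WHERE record_type = 'Collection' AND record_id IN ({marks})",
            ids,
        ).rowcount
        colls = conn.execute(
            f"DELETE FROM omeka_collections WHERE id IN ({marks})",
            ids,
        ).rowcount
    return (colls, texts)

# Demo on a throwaway in-memory database (simplified schema, invented rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE omeka_collections (id INTEGER PRIMARY KEY);
    CREATE TABLE omeka_element_texts (
        id INTEGER PRIMARY KEY, record_id INTEGER, record_type TEXT, text TEXT);
    INSERT INTO omeka_collections (id) VALUES (1), (2), (3);
    INSERT INTO omeka_element_texts (record_id, record_type, text) VALUES
        (1, 'Collection', 'a real collection'),
        (2, 'Collection', 'an item imported as a collection'),
        (3, 'Collection', 'another mis-imported item'),
        (2, 'Item', 'a genuine item that happens to have id 2');
""")
print(delete_bogus_collections(conn, [2, 3]))
```

Note the filter on record_type: item ids and collection ids overlap, so deleting by record_id alone would also remove texts belonging to real items. That overlap is also why relabeling "Collection" rows as "Item" would attach the texts to the wrong records.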

In addition to uninstalling and reinstalling the plugin, it might be worth investigating whether there is a conflict with another plugin by deactivating all the other plugins and trying the import.

Okay, here's the latest.

I deleted all the bad collections.

I uninstalled then reinstalled the API Import plugin.

I adjusted the background.php.path parameter in /application/config/config.ini to be /usr/local/bin/php. (This was to address the error message I got after installing the plugin that read: "The background.php.path in config.ini is not valid. The correct path must be set for the import to work.")

I ran the import sequence again.

No error messages were generated but the plugin clearly got stuck on the "Importing Collections" phase. Over the course of an hour, I hit the update status link about ten times and each time it returned "Importing Collections."

After an hour, I looked at the items and collections pages. The result: 1 collection was added, but that "collection" was really just a single item (like before, but only one collection instead of hundreds). No items were imported.

I clicked on the "Undo Previous Imports" check box and clicked submit. The listed import label next to the check box disappeared, but the single new collection/item remained. In other words, nothing was undone.

I repeated the import steps once again to see if anything would change on a second run-through. It didn't.

I then looked at omeka_element_texts and discovered that each of the two times I had just tried the import, 9 records were added to the table marked as Record Type "Collection." In other words, there are two nearly identical sets of entries--9 records each--in which only the ID and Record ID fields differ.

Finally, I disabled all my plugins except for Omeka Api Import and tried again.

Success! Three collections with the correct items within them!

So there's a conflict with one of the plugins. These are the ones I have installed:

COinS
Collection Tree
Commenting
CSV Import
Derivative Images
Docs Viewer
Dropbox
Dublin Core Extended
Embed Codes
Exhibit Builder
Geolocation
Hide Elements
HTML5 Media
Omeka Api Import
PDF Text
Record Relations
Reports
Search By Metadata
SimpleContactForm
Simple Pages
Simple Vocab

I'll see if I can figure out which one is the culprit and add it to this thread later.

ken.

Alright, I reactivated:

Embed Codes
Exhibit Builder
Geolocation
Hide Elements
HTML5 Media
Omeka Api Import
PDF Text
Record Relations
Reports
Search By Metadata
SimpleContactForm
Simple Pages
Simple Vocab

and re-ran the import. Smooth sailing. Because of the (brilliant) way that the API import plugin is written, it doesn't create duplicates, so that makes the testing much easier.

I then reactivated:

COinS
Commenting
Derivative Images
Docs Viewer
Dropbox

Result: No problem.

Finally, I reactivated the three plugins that seemed to me most likely to be causing trouble:

CSV Import (no problem)

Dublin Core Extended (No problem)

Collection Tree (Froze on Importing Collections, but did not add any additional collections or items)

So there's the conflict: the Omeka Api Import plugin conflicts with the Collection Tree plugin for some reason.
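For anyone repeating this kind of hunt, the batch-by-batch reactivation can be tightened into a halving search. Below is a minimal Python sketch of that idea. It is not anything Omeka provides: import_works is a hypothetical callback standing in for "activate exactly these plugins, run the import by hand, and report whether it succeeded", and the sketch assumes there is a single culprit that breaks the import on its own.

```python
def find_conflict(plugins, import_works):
    """Return the one plugin whose activation breaks the import, or None.

    `import_works(active)` is a hypothetical callback: it should return
    True when the import succeeds with exactly `active` plugins enabled.
    """
    suspects = list(plugins)
    if not suspects or import_works(suspects):
        return None  # nothing in this set conflicts
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        # If the first half alone breaks the import, the culprit is in it;
        # otherwise it must be in the remaining plugins.
        suspects = half if not import_works(half) else suspects[len(half):]
    return suspects[0]

# Simulated run using the plugin list from this thread (minus the importer).
PLUGINS = [
    "COinS", "Collection Tree", "Commenting", "CSV Import",
    "Derivative Images", "Docs Viewer", "Dropbox", "Dublin Core Extended",
    "Embed Codes", "Exhibit Builder", "Geolocation", "Hide Elements",
    "HTML5 Media", "PDF Text", "Record Relations", "Reports",
    "Search By Metadata", "SimpleContactForm", "Simple Pages", "Simple Vocab",
]

def simulated_import(active):
    # Stand-in for a real import attempt, based on the result reported above.
    return "Collection Tree" not in active

print(find_conflict(PLUGINS, simulated_import))
```

With twenty plugins this takes a handful of import runs rather than one per plugin, though it would miss a conflict that only appears when two plugins are active together.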

Hope this helps.

ken.

Excellent diagnostic work! Helps immensely!

Collection Tree does a lot of complex work, and it doesn't tap into the API, so its data can't be pulled into a new site.

There are a couple different places where I can see a conflict happening, but it'll take some digging to find, confirm, and test them.

The workaround for the foreseeable future, then, is probably what you discovered: deactivate it during an import, and reactivate it afterward.

Many thanks, and glad it got sorted in the end!

Happy to help the cause.

And for the record, Collection Tree does seem to work just fine after reactivating it.

Cheers,

ksm.