OAI-PMH harvesting problem

Hi, I'm trying out the OAI-PMH plugin (current version) on Omeka 1.4.2 in MAMP (current version) under OS X 10.7.2. I've checked my PHP path (v5.3.6) and set the plugin memory limit initially to 256MB and now to -1.

My issue is that I've not had any luck harvesting repositories. The two I've tried so far are http://www.helmholtz-berlin.de/pubbin/oai and http://gdz.sub.uni-goettingen.de/oai2/. In both cases I can see the relevant sets, and the plugin recognizes a link to the oai_dc scheme. But when I select a set to harvest, nothing appears to happen, even after several minutes. What I see in the status view is, for example:

ID 3
Set Spec blumenbachiana
Set Name Blumenbachiana
Metadata Prefix oai_dc
Base URL http://gdz.sub.uni-goettingen.de/oai2/
Status Starting
Initiated 2011-11-03 14:13:15
Completed [not completed]

Is there a problem with the two listed OAI repositories? Something else?

Many thanks.

Interesting. I was able to do an import from the second one, but not the first. For the first, it didn't recognize a mapping for the fields.

As for what you are seeing, it might just be that the interface doesn't update automatically via AJAX, so you only see progress when you reload the page. The first thing to check is to go back to the OAI-PMH Harvester tab and see whether it reports progress, or to look under Items to see whether they really did show up.

The other possibility is the plugin configuration: make sure it has the correct path to the PHP Command Line Interface.
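
If you want to rule out the repository itself, you can also request the records directly, outside of Omeka. Here is a minimal sketch in Python that builds a standard OAI-PMH ListRecords request for the set you mentioned. The function name `list_records_url` is just something made up for illustration, and this is not what the plugin runs internally.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # "set" is optional in OAI-PMH; leave it out to request the whole repository
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


url = list_records_url(
    "http://gdz.sub.uni-goettingen.de/oai2/",
    set_spec="blumenbachiana",
)
print(url)
```

Pasting the printed URL into a browser should show the raw XML if the repository is responding for that set.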

Thank you for the quick reply.

I reloaded the page a few times, so I don't think it's an AJAX issue. I confirmed the path was correct via "which php" and by checking the version of PHP at that path. However, I wonder if MAMP might be responsible?

I'd really love to get this to work so if there is any further debugging etc. I can do please just let me know.

With MAMP it's very easy to pick the "wrong" PHP binary, because once MAMP is installed there are often three different PHP binaries:

  • The one that came with Mac OS in /usr/bin
  • MAMP's php-5.2 in /Applications/MAMP/bin/php5.2/bin/
  • MAMP's php-5.3 in /Applications/MAMP/bin/php5.3/bin/

If you're running Omeka on MAMP, you want to pick the MAMP path for the version of PHP you're using, and set that one as the value for background.php.path in Omeka's application/config/config.ini.
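
To see which of those binaries actually exist on your machine, something like the following Python sketch may help. The directories are the ones listed above; the trailing `php` file name and the helper name `existing_binaries` are my assumptions, and the exact MAMP layout can differ between MAMP versions.

```python
import os

# Directories from the list above; the "php" file name at the end is an assumption
CANDIDATES = [
    "/usr/bin/php",
    "/Applications/MAMP/bin/php5.2/bin/php",
    "/Applications/MAMP/bin/php5.3/bin/php",
]


def existing_binaries(paths):
    """Return the paths that point to an existing, executable file."""
    return [p for p in paths if os.path.isfile(p) and os.access(p, os.X_OK)]


for path in existing_binaries(CANDIDATES):
    print(path)
```

Whatever it prints is a candidate for background.php.path. Running each one with `-v` on the command line shows which PHP version it is.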

Aha, yes! That was the issue. Thanks. With the very latest version of MAMP, the correct 5.3.x PHP path is:

/Applications/MAMP/bin/php/php5.3.6/bin/php

I entered this and tried to harvest again. I'm now seeing this error:

ID 1
Set Spec blumenbachiana
Set Name Blumenbachiana
Metadata Prefix oai_dc
Base URL http://gdz.sub.uni-goettingen.de/oai2/
Status In Progress
Initiated 2011-11-03 17:34:33
Completed [not completed]
Status Messages Error: OaipmhHarvester_Harvest_OaiDc::harvestRecord(): Node no longer exists in /Applications/MAMP/htdocs/omeka142/plugins/OaipmhHarvester/libraries/OaipmhHarvester/Harvest/OaiDc.php on line 68 (2011-11-03 17:34:35)

The process completed with 94 items but all fields are blank and all Items are untitled. I should add that I also have the extended DC plugin installed (don't know if that would make a difference).

Could I ask one of the Omeka devs to please take a second look at this to see if the problem lies with the data source or with the plugin?

Many thanks in advance.

I tried a harvest with version 1.0 of the OAI-PMH Harvester, using the same base URL and set that you posted, and got 94 items, all with completely filled-in metadata.
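
If you want to check the data source independently of the plugin, you could save one oai_dc record from the repository's response and see which Dublin Core fields it actually contains. Below is a rough Python sketch of that idea. The sample record is a made-up placeholder, not real data from the repository, and `dc_fields` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core elements used by oai_dc
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical placeholder record; substitute a record copied from the repository
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
  <dc:creator>Example creator</dc:creator>
  <dc:title>Second title</dc:title>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Collect the non-empty Dublin Core elements of a record, grouped by element name."""
    root = ET.fromstring(xml_text)
    fields = {}
    for el in root.iter():
        # Keep only elements in the dc namespace that carry some text
        if el.tag.startswith("{" + DC_NS + "}") and el.text and el.text.strip():
            name = el.tag.split("}", 1)[1]
            fields.setdefault(name, []).append(el.text.strip())
    return fields


print(dc_fields(SAMPLE))
```

If the real records come back with titles and other fields filled in here, the blank items are more likely a plugin or setup issue than a problem with the repository.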

It worked this time - thanks for checking it as well.