Sensible handling of repeated fields during CSV import · Legacy Forums

bkalish September 21, 2011

I'm curious as to how folks have handled the difficulty of importing data from programs which put repeated fields into a single column on export. For example, PastPerfect puts its subjects into a single column in most of its output formats, delimiting the subjects with newlines. It does the same thing for most other fields which can have more than one value.

Currently, the only way I can think of to get my data into Omeka in a sensible form is to export to CSV, and then use a script to determine the maximum number of subjects per record and create a new CSV with one subject per column. This seems like a lot of work! I almost wonder if it would be easier to modify the CSV Import plugin to let the user select a delimiter, so that multiple Dublin Core fields could be created from a single CSV column.
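For what it's worth, the script approach described above could look roughly like the sketch below. It is only an illustration in Python, not anything the CSV Import plugin provides: the function name `expand_repeated_column` and the numbered output headers ("Subject 1", "Subject 2", ...) are made up for the example, and it assumes the export quotes cells that contain newlines so that a standard CSV parser can read them.

```python
import csv
import io


def expand_repeated_column(csv_text, column, delimiter="\n"):
    """Split one multi-valued column into numbered single-value columns.

    Hypothetical helper: the name and the "<column> <n>" header scheme
    are assumptions made for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fieldnames = reader.fieldnames or []

    # Split each record's cell on the delimiter, dropping empty pieces.
    split_values = []
    for row in rows:
        cell = row.get(column) or ""
        split_values.append([v.strip() for v in cell.split(delimiter) if v.strip()])

    # The widest record decides how many new columns are needed.
    width = max((len(values) for values in split_values), default=0)
    new_columns = [f"{column} {i + 1}" for i in range(width)]

    # Put the new columns where the original column used to be.
    out_fields = []
    for name in fieldnames:
        if name == column:
            out_fields.extend(new_columns)
        else:
            out_fields.append(name)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=out_fields, lineterminator="\n")
    writer.writeheader()
    for row, values in zip(rows, split_values):
        new_row = {k: v for k, v in row.items() if k in out_fields}
        for name, value in zip(new_columns, values):
            new_row[name] = value
        # Records with fewer values get empty cells in the extra columns.
        writer.writerow(new_row)
    return out.getvalue()


if __name__ == "__main__":
    sample = 'Title,Subject\nVase,"Pottery\nCeramics"\nMap,Cartography\n'
    print(expand_repeated_column(sample, "Subject"))
```

Each of the numbered columns would then have to be mapped to the same element by hand during import, which is the tedious part a delimiter option in the plugin would remove.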

patrickmj September 21, 2011

That is an oft-requested feature, and is in active development alongside the requirements for being able to export/import from Omeka.net to a standalone Omeka installation.

Thanks!

bkalish September 21, 2011

How active is this active development? I'm very tempted to try to hack together a modification myself, but I won't bother if this feature will be available in a few weeks. And if it isn't projected to be completed so soon, is there a better way for me to help than simply working on an independent fix which might duplicate or be incompatible with the work someone else is doing?

John Flatness September 21, 2011

If you're looking to get involved, your best move would probably be to take a look at the CSV Import plugin's development at its GitHub page.

You can see the current state of the development, comment on code changes, or make a pull request and submit actual code changes for suggested features or fixes.

This advice applies for any of the Omeka plugins we develop ourselves, as well as for Omeka itself.

As for the current state of this particular feature, I can't say off the top of my head with any precision how long it will be until there's a release, other than to say that "active" really does mean active. Patrick might have more to say, though.

patrickmj September 22, 2011

We'll need to have something in place by mid-October to allow for importing the CSV that can be requested from Omeka.net. That will have a fairly inelegant mechanism for multiple values, but no configuration of delimiters. However, there will be instructions for making your CSV match the exported CSV from Omeka.net, which should get us halfway there.

bkalish September 23, 2011

Thanks Patrick and John!

bkalish September 23, 2011

Patrick, as I'm new to GitHub, perhaps you could explain the Switch Branches, Switch Tags, and Branch List to me. How do these work? I assume I need to go to one place for active development and another place for the stable version.

Iwe July 20, 2012

How is the work on this topic going? I can't find any follow-up code that suggests this was fixed. Has anyone developed something else that can import some kind of repeating field or relation values?

Is it worth it to start adding to the CSV plugin, or developing something that can do this?

patrickmj July 20, 2012

Unfortunately, we haven't made much progress on abstracting the code to import from Omeka.net to any CSV file.

Iwe July 24, 2012

How about importing repeated field data into the Omeka database?