CSV import and downloading files with spaces in the filename/URL

I'm trying to import a list of documents into Omeka using the CSV import plugin. The CSV file is well-formed, and if the "Stop Import If A File For An Item Cannot Be Downloaded?" option is unchecked, all records are created. However, one of the fields is a URL to a file I would like to import into Omeka, and although the URL to the original file was shown on each record, no file was actually imported.

After undoing and re-running the import with the "Stop Import" option checked, the import stops at the first record, reporting both "Import Error: Invalid File Download" and "URL is not readable or does not exist:". Copying the offending URL into a new browser tab shows that the URL does work correctly, and the PDF displays.

The first record's URL has spaces in the filename, and that is what causes the problem. I rearranged my CSV file so that the first record had no spaces in its URL, and that record imported fine. The import then stopped at the second record, whose filename did contain spaces.

Is there a workaround for this?

Many thanks,

Tom

You should be able to replace any spaces in your URLs with %20.

You won't have to change where your files are actually located. Just replacing the spaces with %20 in your CSV file should solve the problem.
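If the CSV has many rows, the replacement can be scripted rather than done by hand. The following is only a minimal sketch in Python, not part of Omeka or the CSV import plugin. The function names `encode_url` and `encode_csv_column` and the column name `file` are made up for illustration, and it assumes the CSV has a header row. It percent-encodes only the path part of each URL and leaves any existing %20 untouched.

```python
import csv
import io
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode spaces (and other unsafe characters) in the URL path.

    '%' is treated as safe so that an already-encoded '%20' is not
    encoded a second time. The query string and fragment are left as-is.
    """
    parts = urlsplit(url.strip())
    path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


def encode_csv_column(src_text, column):
    """Return CSV text with every URL in the named column encoded.

    Hypothetical helper: `column` is whatever header your own CSV
    uses for the file URL.
    """
    reader = csv.DictReader(io.StringIO(src_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = encode_url(row[column])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "title,file\nReport,http://oldurl.com/documents/my file.pdf\n"
    print(encode_csv_column(sample, "file"))
```

To use it on a real file, read the file's text, pass it to `encode_csv_column` with your own column name, and write the result to a new CSV so the original stays untouched.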

Thanks John! I really should have known that. Yesterday was a bad day ;-)

When the import was running, I did run into a few problems. The first was that the import stopped on an error, but the "undo" link was hidden because the offending URL was very long and pushed the link out of the visible part of the div. It might be a good idea to put the undo link underneath an error message? That said, I was able to get the link from the source, so all was fine.

The second is to do with the display of imported file names. I am importing them from another website, and in the "Document Item Type Metadata" and "Download Associated Files" sections the full (old) path of the file is displayed as the link text in both admin and public views, even though the underlying hyperlink contains the correct path to the imported file on the local Omeka server.

The third is to work out why the import silently pauses, with no error and no sign of continuing. Perhaps that's a RAM/PHP memory limit issue, although PHP's memory limit is set to 256MB, which seems like quite a lot. One to investigate...

Thanks for your help,

Tom

If I don't add the files, then all records import fine. I'm now using the Dropbox module to associate the files with the records. This works well, particularly as some records have several files attached to them.

This is what I'm doing in case it's of use to anyone:

1. Import records using CSV import.
2. Upload files to the Dropbox module's folder.
3. Edit a record, go into the files tab, and select the file(s) from the list.

It might be a good idea to put the undo link underneath an error message? That said, I was able to get the link from the source, so all was fine.

Thanks for this suggestion, Tom. We'll add a ticket for it in Trac.

As for the problem with display of imported file names, I don't quite follow. Are the links to the imported files not working, or something else?

As for the silent pause, it might be that one of the files you're trying to import is bigger than your server's POST limit for file uploads. The Dropbox plugin was designed to get around this limitation, since you have to FTP the files to your server first. Not sure this actually is the issue, but it seems the most likely.

Best,
Jeremy
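One rough way to test the size-limit theory is to compare each file's size against the limits in php.ini (`upload_max_filesize`, `post_max_size` and `memory_limit` are the usual PHP directives). The sketch below is in Python and only illustrates the arithmetic. The function names and sample sizes are made up, and whether these limits apply at all to files fetched by URL during a CSV import is an assumption that has not been confirmed here. It converts PHP's shorthand size notation (K, M, G as powers of 1024) to bytes and lists the files that exceed a given limit.

```python
import re

# PHP shorthand suffixes are powers of 1024.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def php_size_to_bytes(value):
    """Convert a php.ini size such as '256M' or '8m' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?)\s*", value.upper())
    if match is None:
        raise ValueError("unrecognised size: %r" % value)
    return int(match.group(1)) * _UNITS[match.group(2)]


def files_over_limit(sizes, limit):
    """Return the names of files whose size in bytes exceeds the limit.

    `sizes` maps a file name to its size in bytes (hypothetical input,
    gathered however suits you); `limit` is a php.ini style value.
    """
    limit_bytes = php_size_to_bytes(limit)
    return [name for name, size in sizes.items() if size > limit_bytes]


if __name__ == "__main__":
    sample_sizes = {"small.pdf": 500 * 1024, "large.pdf": 3 * 1024 ** 2}
    print(files_over_limit(sample_sizes, "2M"))
```

If every file comes in well under the configured limits, the silent pause probably has a different cause.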

Hi Jeremy,

Thanks for your reply. I hadn't thought about the POST limit; I will check that on a future import.

As to the file names (the ones that did import successfully):
On the public view of an item, the link to an attached file should display just the filename (filename.pdf), and clicking it downloads the file. For me, however, the anchor text displays as http://oldurl.com/documents/filename.pdf instead of filename.pdf, while the underlying hyperlink correctly points to myomekainstall.com/archive/files/filename.pdf. For example:

<a href="http://myomekainstall.com/archive/files/filename.pdf">http://oldurl.com/documents/filename.pdf</a>

In any case, I am now going down the route of using Dropbox together with CSV import, so I will proceed with my test collection that way.

Many thanks,

Tom