Request for more mime types

Hi all,

As we continue to work on upgrading our April 16 Archive to Omeka, we've identified what seems to be a problem with the mime_types in the file_lookup table. In our current database/archive, we have files representing the following mime_types, but these don't yet exist in the new file_lookup table:

application/msword
audio/mpeg
text/pdf
application/octet-stream
application/vnd.ms-powerpoint
application/pdf
audio/x-ms-wma

We could obviously add these ourselves, but it seems this would be a common need for Omeka users. In the interest of standardization, can the file_lookup table be expanded to include additional mime types, including these?

(Incidentally, I'm not sure why we have text/pdf and application/pdf - it seems we could easily standardize on one or the other.)
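For what it's worth, application/pdf is the registered type, so folding text/pdf into it before migration is one option. A minimal sketch of that cleanup, assuming the archive's types are available as plain strings (the `MIME_ALIASES` map and `normalize_mime` helper are hypothetical names, not part of Omeka):

```python
# Hypothetical alias map: nonstandard MIME type -> preferred type.
MIME_ALIASES = {
    "text/pdf": "application/pdf",
}

def normalize_mime(mime_type):
    """Trim and lowercase a MIME type, then fold known aliases into one form."""
    cleaned = mime_type.strip().lower()
    return MIME_ALIASES.get(cleaned, cleaned)

# The types listed above, as found in the existing archive.
archive_types = [
    "application/msword",
    "audio/mpeg",
    "text/pdf",
    "application/octet-stream",
    "application/vnd.ms-powerpoint",
    "application/pdf",
    "audio/x-ms-wma",
]

# After normalization, text/pdf and application/pdf collapse into one entry.
distinct_types = sorted({normalize_mime(t) for t in archive_types})
print(distinct_types)
```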

Thanks again for your continued efforts!

Brent

Hi Brent,
Thanks for pointing this out. We will look into this soon and get back to you.
Sheila

Brent,

The long and short of it is, it would be great to add these mime_types, but it probably won't happen anytime soon.

The mime_types included in the file_lookup table are the ones we know how to extract information from automatically, that is, types from which we can pull various sorts of metadata. So while we could add yours to the table, it wouldn't serve much purpose until we learn how to pull in information for the types you list.

Could you help clarify what happens when an object/item is uploaded that does not match one of the existing mime_types? It seems our solution would be to either: a) replicate this outcome for the existing items in our database, or b) maintain our own customized list of mime_types. But, if we did the latter (b), are there system implications related to not knowing how to automatically handle them?

Hi, the MIME types that are currently stored in the file_meta_lookup table are there primarily for convenience's sake. We use a third-party library (getID3) to automatically extract extra metadata from these files and store it in the database (currently in the files_video and files_images tables). That extra metadata can be useful to have for display purposes, but it is always stored in the file regardless of whether or not it has been extracted into the database.

An example would be the width or bit depth of an image, which is useful to have available but is also encoded in the file itself. If a certain image MIME type were not in the file_meta_lookup table, Omeka simply wouldn't extract any image-specific metadata for that type of file.
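To make that behavior concrete, here is a rough sketch of the lookup-gated flow described above. It is not Omeka's actual code: the table contents, the `ingest` function, and the `extract_metadata` placeholder (standing in for a library such as getID3) are all assumptions for illustration.

```python
# Hypothetical stand-in for the file_meta_lookup table:
# MIME type -> table that would hold the extracted metadata.
FILE_META_LOOKUP = {
    "image/jpeg": "files_images",
    "image/png": "files_images",
    "video/mpeg": "files_video",
}

def extract_metadata(path, mime_type):
    """Placeholder for a real extractor; returns dummy values only."""
    return {"source": path, "mime_type": mime_type}

def ingest(path, mime_type):
    """Store the file either way; extract extra metadata only for known types."""
    record = {"path": path, "mime_type": mime_type,
              "extra_table": None, "extra": None}
    table = FILE_META_LOOKUP.get(mime_type)
    if table is not None:
        # Known type: pull metadata out of the file and note where it goes.
        record["extra_table"] = table
        record["extra"] = extract_metadata(path, mime_type)
    # Unknown type: the file is still kept, just without extracted metadata.
    return record
```

In this sketch, `ingest("report.pdf", "application/pdf")` still returns a record for the file, only with no extracted metadata attached, which mirrors what happens to types missing from the table.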

I see audio, pdf and MSWord MIME types on your list. Currently, Omeka doesn't know how to extract metadata from those kinds of files, so it won't make a difference if you put them in the file_meta_lookup table.

In conclusion, having more mime types in that table won't change the way Omeka works. getID3 is pretty versatile in terms of extracting metadata from files, so if you want to tinker with how it might work on audio files, feel free and let me know how it goes. I hope that helps. If this continues to be confusing, please let me know.