Simple Search Workaround

I noticed the simple search is very literal. For example, the searches “doctor 1938” and “doctors 1938” will probably return a different number of results in a different order. The first search looks for doctor, while the second looks for doctors. Sometimes relevant records are buried a few pages into the search results.

As a potential workaround, I’ve been thinking of adding keyword variations for items to a metadata field that I don’t use. The field would then be set to not display. I don’t think that all search terms belong in the subject, description, tags, or other relevant fields, but it might be good to have them somewhere in the record. To improve the search above, for any record that contains one of these terms, I’d add the words doctor, doctors, dr., drs. to a field I’m not currently using.
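To make the idea concrete, here is a minimal Python sketch of how the extra terms for a record could be worked out before pasting them into the unused field. The variant list and the function name are made up for illustration; nothing here talks to Omeka itself.

```python
import re

# Hypothetical variant groups: spellings that should all find the same records.
VARIANT_GROUPS = [
    {"doctor", "doctors", "dr.", "drs."},
]

def extra_search_terms(record_text):
    """Return the sorted variants to store in the hidden field for this record."""
    words = set()
    for token in re.findall(r"[a-z]+\.?", record_text.lower()):
        words.add(token)              # keeps abbreviations such as "dr."
        words.add(token.rstrip("."))  # also matches "doctor." at a sentence end
    extras = set()
    for group in VARIANT_GROUPS:
        if words & group:             # the record uses at least one variant
            extras |= group
    return sorted(extras)

print(extra_search_terms("Portrait of Dr. Smith, 1938"))
```

A record that mentions none of the variants gets an empty list, so the hidden field would stay blank for it.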

Has anyone ever done something like this to improve search results? Are there any potential problems or issues that I should take into consideration? Also, I was hoping to use the batch edit feature in the admin, but it looks like you can’t use batch edit to add data to a field like coverage. Is there a way to change that page to include another metadata field? And is there a way to batch edit more than 10 items at a time?

There are definitely some serious limitations to Omeka's default search. This post summarizes these and provides an explanation.

I have similar issues. For example, I have lots of DC titles that look like this: "Der Film Nr. 10". What gets indexed and searched against is "film" and nothing else. Not at all good. Since I'm using a web host, I can't change my global MySQL settings (see link above) to work around these issues. Hopefully when MySQL 5.6 becomes widely available, Omeka (v2?) will be able to take advantage of its improved full-text search features. This would improve things, though certainly not fix everything.
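Here is a rough Python sketch of why only "film" survives. It assumes MySQL's default minimum word length of four characters for full-text indexes plus a stop-word list. The tokenizer is a simplified stand-in rather than MySQL's actual parser, and the stop-word set is just a tiny sample.

```python
import re

MIN_WORD_LEN = 4  # assumed default minimum word length for the full-text index
SAMPLE_STOPWORDS = {"about", "been", "have", "with"}  # tiny illustrative sample

def indexed_words(title):
    """Return the words from a title that would make it into the index."""
    tokens = re.findall(r"[a-z0-9_']+", title.lower())
    return [
        t for t in tokens
        if len(t) >= MIN_WORD_LEN and t not in SAMPLE_STOPWORDS
    ]

print(indexed_words("Der Film Nr. 10"))  # prints ['film']
```

"Der", "Nr" and "10" all fall under the length limit, which is why a keyword search for the issue number never matches.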

Much like your workaround, I've thought of putting in a second DC Title field with something like DerFilm010 and then asking people to search against that instead. In the end, though, I think I will either add a Google Custom Search box on a separate page (that's easiest, though not super elegant) or else look into using Solr. Since I'm using Omeka for my personal use, this would cost money. I'm now going to find out if I can run a pre-packaged Solr virtual appliance (e.g. Blaze) on Amazon for this purpose at little or no cost. A free (with restrictions) online indexing service like IndexTank would be another option if an Omeka plugin for the service were available.

Regarding your batch edit question, I believe there is a workaround using CSV import: place your changes in a spreadsheet, export to CSV, and then reimport into Omeka. A bit onerous, but worthwhile if there are many changes. I believe this way of using the CSV import plugin is briefly covered in its video tutorial, but perhaps an Omeka staffer may have a better solution for you too.
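If the spreadsheet is large, the "place your changes" step could be scripted. Below is a minimal Python sketch that adds a column of extra search terms to CSV text. The column names ("Title", "Coverage") and the function name are hypothetical, and it only prepares the file; how the plugin maps columns on reimport is something to check in its own documentation.

```python
import csv
import io

def add_terms_column(csv_text, source_col, target_col, trigger, terms):
    """Fill target_col with terms on every row whose source_col mentions trigger."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames)
    if target_col not in fieldnames:
        fieldnames.append(target_col)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if trigger in row[source_col].lower():
            row[target_col] = terms  # overwrites any existing value
        else:
            row[target_col] = row.get(target_col) or ""
        writer.writerow(row)
    return out.getvalue()

sample = "Title,Description\nDoctor's office,A photo from 1938\nTown hall,Built 1902\n"
print(add_terms_column(sample, "Title", "Coverage", "doctor", "doctor doctors dr. drs."))
```

Rows that don't mention the trigger word keep an empty value in the new column.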

I need to re-emphasize something I missed when first reading over the above post explaining the search situation. If you use the advanced search option and specify a filter like DC Title, Omeka switches over to MySQL's regex search w/Boolean options. Now I can ask for an exact match for "Der Film Nr. 4" in the Dublin Core title field only, and Omeka will find it. Partial matching works too, as expected.

Why didn't I discover this before by myself? I was distracted by the interface. Take a look at the advanced search interface inside Omeka when logged in as super. The distinction between search by keywords and narrow search by specific fields is clear. The fact that the Add a Field button is in green helps a lot, as does the fact that its text field doesn't appear until you try to actually add a custom field. One could quibble about the "Select Below" on the pop-up. "Select from List" may be slightly better.

Now compare this to the advanced search in the regular (non-admin) interface. I had always assumed that the text field above "Add a field" was for Add a Field, and that the "Search for Keyword" text field on top should also be used with the field delimiters. This is partly due to layout and coloring (look again at the admin search if you think it shouldn't make a difference) and partly because I wasn't yet aware that there was both a full-text search and a separate regex search with different features & limitations. Besides, the labeling & layout suggest that you should first search by keyword(s) and then narrow by fields. Well, this won't work if your initial search term has fewer than four characters or has the misfortune to be on MySQL's list of stop-words.

I've learned something here, and now I can add this knowledge to my site and guide my users to make better use of the various search options. If this is persuasive, maybe the Omeka team could revisit how the search page is structured and labelled to help others avoid this type of confusion as well. A short gloss (like the ones shown below the DC input fields) might be worth considering too. I found those to be unobtrusive and very useful.