Adding item search text from a plugin

Hi,

my assistant and I created a plugin that implements the Display filter for some Metadata elements. While a complete description of our plugin (which is still in development as I am typing this) would go beyond the scope of this forum post, let's just say for now that we use the Display filter to change and enrich the element's output a great deal.

As the displayed element now differs from the database representation and contains more information than before, we would like to modify its footprint in Omeka's in-site search. We figured that it all boils down to what's stored in the omeka_search_texts database table – which is filled by the item's addSearchText() calls.

However, we couldn't figure out a proper way to carry this out. We added our own addSearchText() calls to hookAfterSaveItem() – like this (shortened code):

$itemId = intval($args["record"]["id"]);
$item = get_record_by_id('Item', $itemId);
$item->addSearchText("... our enriched search content …");
$item->save();

… and this kind of worked – until we had a second plugin trying to do the exact same thing: This led to the situation that only the plugin whose hookAfterSaveItem() was called last was able to add search text, purging the search text that the first plugin had added just a moment before.

Even worse: If a different piece of code uses something like (shortened)

update_item($itemId);

none of our hookAfterSaveItem() methods were called – leading to the situation that none of our plugins was able to add search text.

What else can we do? We combed through Omeka's core code, trying to figure out how element texts and tag texts are added to the search text. We figured that mixins might be the right way to go – question is: Can we implement a mixin that adds search text to the existing record type "Item"?

Thank you very much for any pointer where to go from here.

I actually had a similar problem recently. The sequence in which the element texts are added to the search text is a little tricky. And it's a little weirder, because the search text gets completely rewritten each time the afterSave function is called. I suspect that's why your two plugins clobber each other.

An alternate approach is to operate directly on the search text records in the afterSave hook like this (this mimics what the Search mixin does):

// look up the existing search text
$searchText = $this->_db->getTable('SearchText')->findByRecord('Item', $item->id);

// searchText should already exist, but if something goes wrong, create it
if (!$searchText) {
    $searchText = new SearchText;
    $searchText->record_type = 'Item';
    $searchText->record_id = $item->id;
    $searchText->public = $item->public;
    $searchText->title = metadata($item, array('Dublin Core', 'Title'));
}
$searchText->text .= ' ' . $enrichedSearchTexts;
$searchText->save();

This will look up the search text record for the item (or create it if for some reason it doesn't exist). Then you can directly add your $enrichedSearchTexts.
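To illustrate the difference between the two strategies, here is a minimal plain-PHP sketch. It is a toy model, not Omeka code: the array $table stands in for the omeka_search_texts table, and rebuildAndAdd() and appendOnly() are hypothetical helpers invented for this illustration. It assumes that the first approach effectively rebuilds the entry for each plugin, which matches the behaviour reported above.

```php
<?php
// Toy model only: $table stands in for the omeka_search_texts table,
// keyed by item ID. None of these functions exist in Omeka.

// Strategy A: rebuild the whole entry, then add this plugin's text.
function rebuildAndAdd(array &$table, int $itemId, string $baseText, string $extra): void
{
    $table[$itemId] = $baseText . ' ' . $extra;
}

// Strategy B: leave the stored entry alone and append to it.
function appendOnly(array &$table, int $itemId, string $extra): void
{
    $table[$itemId] = ($table[$itemId] ?? '') . ' ' . $extra;
}

$base = 'Title Description';

$tableA = [1 => $base];
rebuildAndAdd($tableA, 1, $base, 'relation-comments');  // first plugin
rebuildAndAdd($tableA, 1, $base, 'referenced-titles');  // second plugin

$tableB = [1 => $base];
appendOnly($tableB, 1, 'relation-comments');            // first plugin
appendOnly($tableB, 1, 'referenced-titles');            // second plugin

echo $tableA[1], PHP_EOL; // Title Description referenced-titles
echo $tableB[1], PHP_EOL; // Title Description relation-comments referenced-titles
```

With strategy A only the last plugin's text survives; with strategy B both contributions end up in the stored text.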

That is great in theory - and I definitely considered modifying the omeka_search_texts table manually from within hookAfterSave(), although I thought it might be cleaner to use the regular API.

However – do me a favor and try calling update_item($itemId). No after-save hook is being fired; I couldn't believe it myself either. Nevertheless, the regular search text including all metadata and tags is being recreated, thus purging all previously manually added search texts.

Question is: Can I fire all hookAfterSave() hooks with the help of fire_plugin_hook()? Additional question: What if I am already inside a hookAfterSave()? ;-)

That direct modification of the search texts table is, if not the only way to do it, the most straightforward way. That's because of the sequence of actions taken after saving. First, the Item model runs through its mixins. One mixin, ElementText, collects all the search texts based just on the elements, then the Search mixin saves that to a new search text record in the table. All that happens _before_ any plugin hooks are run.

That explains the behavior in your 'however' paragraph, if I'm following right -- the plugin hook isn't fired, but all that happens behind the scenes (see the save() and runCallbacks() functions in application/libraries/Omeka/Record/AbstractRecord.php for the gory details).

So, the SearchText for the item is "updated" with just the element info, clobbering your enhancements before the plugin hooks are run at all. There's no way around rebuilding your enhanced texts each save in hookAfterSaveItem().
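The order of events described above can be sketched in plain PHP. This is a simplified toy model under stated assumptions, not Omeka's actual implementation: saveItem() is a hypothetical function, $table stands in for omeka_search_texts, and the closures in $hooks stand in for the plugins' after-save hooks.

```php
<?php
// Toy model of the order described above; not Omeka's actual code.
// $table stands in for omeka_search_texts, $hooks for plugin callbacks.

function saveItem(array &$table, int $itemId, string $elementText, array $hooks): void
{
    // Step 1 (core): the entry is rewritten from the element texts alone.
    $table[$itemId] = $elementText;

    // Step 2 (plugins): each after-save hook runs and may append to it.
    foreach ($hooks as $hook) {
        $table[$itemId] = $hook($table[$itemId]);
    }
}

$hooks = [
    function (string $text): string { return $text . ' relation-comments'; },
    function (string $text): string { return $text . ' referenced-titles'; },
];

$table = [];
saveItem($table, 7, 'Title Description', $hooks);
$afterFirstSave = $table[7];

saveItem($table, 7, 'Title Description', $hooks);
$afterSecondSave = $table[7];

// A save during which no hook contributes leaves only the core text.
saveItem($table, 7, 'Title Description', []);
$afterBareSave = $table[7];

var_dump($afterFirstSave === $afterSecondSave); // bool(true)
echo $afterBareSave, PHP_EOL;                   // Title Description
```

Because the core step starts from scratch on every save, appending in the hooks does not pile up duplicates across saves, but it also means the enrichment is lost whenever the hooks do not contribute.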

To the last question: that way there be dragons. fire_plugin_hook() is much more intended for use in a theme, for example to display plugin content for the public_items_show hook. With directly (re)changing the search text in the after-save plugin hook, it shouldn't be necessary.

Thank you very much - indeed, this reflects my findings (which I gained from creating debug log outputs inside the core).

So my question remains: If update_item() is used, it will re-generate the standard search text entry, clobbering all enhanced text manually added to the respective record before.

As I use update_item() (as you've probably already guessed - in a third plugin, but not inside an afterSave hook), after doing so I should find a way to fire the other plugins' afterSave hooks to let them do their magic to enhance the search text. If not through fire_plugin_hook, then perhaps through actually saving that item.

I'll look into that - thanks again!

... In case you're wondering what in heaven's name we're doing here:

The "enhanced search content" derives from two plugins:

1. Our highly enhanced ItemRelations that lets you enter a short text annotation to a relation - which should go into the searchable text.

2. Our latest invention: Item References, which modifies an element to carry just that - references to other items. Storing item IDs as numerical text, but being displayed as clickable titles, linked to the referenced items; obviously, the referenced item titles should go into the searchable text.
... Which is only half the treat, the bonus being that the referring item can now draw a Google map of all the referenced items' geo locations, i.e. multiple markers in one map. This is actually pretty neat.

PLUS - as, in addition to just drawing multiple markers, we can also draw a multi-segment line from marker to marker (which is incredibly useful to visualize a sequence of locations), it became necessary to re-order references, in other words: to re-order texts stored as multiple values in one element. ... Which currently leads to the aforementioned update_item() call, triggering the re-generation of the item's (unfortunately) un-enhanced search text.

That does sound interesting!

Without running through your code or something that tries to mimic it, yes, I'd expect update_item to clobber enhanced data. But I'd also expect the afterSaveItem hooks to run and rebuild that. Unless! There might be a complexity to the sequence. If update_item is invoked in an afterSaveItem hook, and that happens to run last in the sequence of plugin hooks, that would undo everything.

Even though separation of work via different plugins makes sense (and is my usual impulse), some of these quirks we're uncovering might make one big plugin make sense. Hard to know without details, and sounds like a judgement call.

I don't think it's really intended to be used this way, but get_specific_plugin_hook_output() might be another approach. That lets you run a specific plugin's hook. The intent is to get the output, but if it just does something and doesn't return anything, it'll still run, and that might be closer to what you need.

Short answer: No, update_item is not called from a plugin's afterSaveItem, but from within an index controller. (That plugin is performing a couple of actions between some dialog screens.)

Even shorter answer: Now that I'm back in my office, I'll dig deeper into what you told me. – Seriously, thank you very much for the input, I'll keep you in the loop.

So yes – as expected, using your implementation to manually add search text works fine for adding from two different afterSaveItem hooks without clobbering the previous enriched text.

But no – update_item() does not seem to fire the afterSaveItem hooks. *headscratch*

------------

Next experiment: Replace

update_item($itemId);

with

$item = get_record_by_id('Item', $itemId);
$item->save();

No luck either.

... But then it dawned on me: In fact, both update_item() and $item->save() do fire the afterSaveItem hooks. However, the $args object is empty, and in particular $args['post'] is empty as well. Consequently, the usual first line in every afterSaveItem hook

if (!$args['post']) { return; }

catapults us right out of it.

... Isn't it great that each problem, once identified, presents a new one with a nice red bow around it? ;-)

I established a workaround in which I no longer have to use either update_item() or $item->save(), so my former afterSaveItem() additions to the search texts are not in danger of being clobbered anymore.

… Still: What if someone else uses these methods? Wouldn't all afterSaveItem() hooks be useless if their $args object is empty?

UPDATE: Sorry for spamming your inbox …

Well – $args is not undefined, it's definitely there, in both cases (i.e. update_item() and $item->save()).

I'm still trying to figure out how to make my afterSaveItem hooks work without $args['post'] – which fortunately is trivial most of the time.

$args will always be there, and $args['post'] will always be there, if only set to false.

It sounds like somehow sometimes there isn't POST data getting passed around. Is the code you are working on available somewhere? This sounds like we're getting to the level where that will help, especially to see the details of what you are aiming to accomplish.

Something else that might be helpful, $args should have $args['insert'] set to true for new items, and false if it is being updated.
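The early-return guard discussed above can be rearranged so that work which does not depend on form data still runs on a programmatic save. The following is a plain-PHP sketch for illustration only: both hook functions and rebuildSearchText() are hypothetical, and $args is a hand-built array that merely imitates the shape described in this thread ('record', 'post' set to false, 'insert').

```php
<?php
// Hypothetical hook bodies for illustration; rebuildSearchText() is a
// made-up stand-in for whatever re-creates the enriched search text.

function rebuildSearchText(int $itemId, array &$log): void
{
    $log[] = "rebuilt search text for item $itemId";
}

// Variant 1: guard first. A save without form data does nothing at all.
function hookGuardFirst(array $args, array &$log): void
{
    if (!$args['post']) {
        return;
    }
    rebuildSearchText($args['record']['id'], $log);
}

// Variant 2: do the POST-independent work first, guard only the rest.
function hookGuardLate(array $args, array &$log): void
{
    rebuildSearchText($args['record']['id'], $log);

    if (!$args['post']) {
        return;
    }
    $log[] = 'processed form data';
}

// Shape of $args as described in this thread for a save without form data.
$args = ['record' => ['id' => 42], 'post' => false, 'insert' => false];

$logFirst = [];
hookGuardFirst($args, $logFirst);

$logLate = [];
hookGuardLate($args, $logLate);

echo count($logFirst), PHP_EOL; // 0
echo $logLate[0], PHP_EOL;      // rebuilt search text for item 42
```

In the second variant only the part that really needs submitted form data sits behind the guard, so a save triggered from code still rebuilds the search text.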

Yes, I don't know why I said that $args wasn't there.

The problems have all been solved by now:

1. $args['post'] inside the afterSaveItem hook is indeed only necessary when reacting to a click on the "Save" button, to modify / extend data updates.

2. But I can easily retrieve $args["record"]["id"] even if afterSaveItem is invoked via $item->save() or update_item().

3. Together with your separate implementation of addSearchText, I now have my afterSaveItem hooks in order, re-creating all my enriched search text inside omeka_search_texts.

… Still: You can see a lot of my stuff (everything I've mentioned above) at https://github.com/GerZah – which is:

– Enhanced Item Relations
https://github.com/GerZah/plugin-ItemRelations
— providing an item select screen instead of the edit field for the numerical ID
— supporting multiple user-defined vocabularies
— allowing to enter and later edit comments to each relationship that is created

– Enhanced Geo Locations
https://github.com/GerZah/plugin-Geolocation
— supporting map overlays, e.g. to have historical maps

– Item References
https://github.com/GerZah/plugin-ItemReferences
— supporting the definition of 1st and 2nd level item references
— collecting referenced geo locations to display them in a map together (also 2nd level)
— … supports Enhanced GeoLocations and its map overlays
— … requires Enhanced Item Relations (as it is re-using the item select screen to create references)

– Reorder Element Texts
https://github.com/GerZah/plugin-ReorderTextElements
— to be able to re-sort multiple text fields of one element
— which is useful to re-order item references to create a specific reference order
— … supports Item References

… And there's more that might be useful for others:

– Conditional Elements
https://github.com/GerZah/plugin-ConditionalElements
— show or hide metadata fields based on selections on higher level metadata
— thus removing clutter from item types with lots and lots of fields that don't apply all the time
— … requires Simple Vocabulary, as conditions are based on dropdown list element values

– Reassign Files
https://github.com/GerZah/plugin-ReassignFiles
— move uploaded files from one item to another
— which can be extremely useful to move multiple files into one item after bulk-uploading using Dropbox

– Date Search / Range Search / Measurement Search
https://github.com/GerZah/plugin-DateSearch
https://github.com/GerZah/plugin-RangeSearch
https://github.com/GerZah/plugin-MeasurementSearch
— These three related plugins, especially the latter two are rather specific for our use case.
— However, most prominently using DateSearch as an example:
—— providing a date-picker to select dates and timespans to be placed inside metadata edit fields
—— supporting both Gregorian and Julian dates
—— enabling a date search (hence the name) that goes beyond full-text-search
—— Example: search for a timespan (like 1980-1990) and also find items with e.g. 1985
— Range Search supports value triples with units including conversion, plus search beyond FTS.
— Measurement Search supports WxHxD triples and can search beyond FTS.

Wow! Lots of interesting stuff. We're actually just about to do some outreach work to learn more, and potentially share more, about the kind of work people are doing (post is scheduled for today). Would love to hear more.

I'd love to see a couple of our plugins on the official list. I think the more generic ones like Conditional Elements, Item References, Reassign Files, and Reorder Text Elements could be really useful for a lot of people. ... Item References is hard to separate from our version of Item Relations, as it shares the item ID picker.

The enhanced versions of Item Relations and Geo Locations might also deliver improvements that the interested public might gain from, but probably not in the form of forks. They derived from concrete requirements in our project, as did the more specific ones, *-Search. Still, I'm kind of proud of the data processing in them, which has so much more functionality than plain full text search.