Plugins/Text Analysis

version 2.0

The Text Analysis plugin connects your Omeka Classic site with Watson Natural Language Understanding to enable text analysis on individual items. You must create an IBM Bluemix account to use this plugin.

Because Bluemix includes paid plans, only SuperUsers and Admin users can use the Text Analysis plugin.

Configuration

After you have installed the plugin, go to the Plugins tab in the upper navigation, and scroll down to Text Analysis. Click the blue Configure button.

On the Configure Plugin page, enter your Bluemix username and password.

[Image: Configuration settings]

Be sure to save changes after entering your credentials.

Using Text Analysis

Note: Because usage may incur costs with IBM Bluemix, use of this plugin is restricted to SuperUsers and Admin users only.

Once you have saved your credentials, you can view text analysis on any item in your collection.

Select an Item

Go to the Items tab in the left-hand navigation and click the title of any item (this plugin works best with longer blocks of text). Click the title rather than the Edit button, so that you land on the items/show page and not the items/edit page.

On the items/show page, the Text Analysis block is the last of the right-hand blocks (below the Edit, View Public, and Delete buttons, etc.).

[Image: A large red arrow points to the Text Analysis block]

Select an element from the dropdown in the Text Analysis block; you should be able to select any Dublin Core or Item Type element that has text entered in that item.

The checkboxes below the dropdown menu allow you to limit which features to analyze. By default they are all checked, but you can uncheck up to three if you want. Your options are (see below for more information):

  • Entities
  • Keywords
  • Categories
  • Concepts

Once you have chosen an element, click the Analyze This Element button.

[Image: Close-up of the Text Analysis block]
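To make the checkbox choices more concrete, the sketch below builds the kind of JSON body that a Watson Natural Language Understanding analyze request takes, with one entry under "features" per selected option. It is an illustration only: the function name build_analyze_body is hypothetical, the mapping from checkboxes to feature names is an assumption, and the plugin's actual request code is not shown in this documentation. No network call is made.

```python
import json

# Feature names used in the "features" object of an analyze request.
# Assumption: each checkbox in the Text Analysis block maps to one of these.
FEATURES = ("entities", "keywords", "categories", "concepts")


def build_analyze_body(text, selected=FEATURES):
    """Return a JSON-serializable request body for the selected features."""
    unknown = set(selected) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    if not selected:
        # Mirrors the form: at most three of the four boxes can be unchecked.
        raise ValueError("select at least one feature")

    features = {}
    for name in selected:
        # Assumption: sentiment is requested for entities and keywords only,
        # because those are the results tables that show a sentiment column.
        if name in ("entities", "keywords"):
            features[name] = {"sentiment": True}
        else:
            features[name] = {}
    return {"text": text, "features": features}


# Example: analyze an element with only Entities and Concepts checked.
body = build_analyze_body("Sample element text.", ["entities", "concepts"])
print(json.dumps(body, indent=2))
```

Unchecking a box simply leaves that feature out of the request, which is why the corresponding results cannot appear for that analysis.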

View Analysis Results

The Text Analysis page will load. It has the following tabs, most of which correspond to Watson Natural Language Understanding features:

Overview

The first section summarizes the item and element analyzed, the word count for the element along with the number of unique words, the character count, the text size, and the full text of the element being analyzed.

[Image: Overview for sample item]
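As a rough illustration of the Overview figures, the sketch below computes the same kinds of counts for a string. The function name overview is hypothetical, and the tokenization rule (lowercased runs of letters, with an optional internal apostrophe) and the choice of UTF-8 bytes for text size are assumptions; the plugin may count words and size differently.

```python
import re

# Assumption: a "word" is a run of letters, optionally with one internal
# apostrophe (so "don't" counts as a single word). Digits are ignored.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def overview(text):
    """Return summary counts for one element's text."""
    words = WORD_PATTERN.findall(text.lower())
    return {
        "word_count": len(words),
        "unique_words": len(set(words)),
        "character_count": len(text),
        # Assumption: "text size" means the size of the text in UTF-8 bytes.
        "size_bytes": len(text.encode("utf-8")),
    }


print(overview("The cat saw the dog."))
```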

Frequencies

The Frequencies tab is a table listing each word, its total count within the element, and its frequency relative to the total word count.
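The relative frequency described here is a word's count divided by the total number of words in the element. A minimal sketch of that calculation follows; the function name frequency_table and the tokenization rule are assumptions made for illustration, not the plugin's own implementation.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally with one internal
# apostrophe, compared case-insensitively.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def frequency_table(text):
    """Return (word, count, relative_frequency) rows, most frequent first."""
    words = WORD_PATTERN.findall(text.lower())
    total = len(words)
    counts = Counter(words)
    # Relative frequency = count of the word / total word count.
    rows = [(word, count, count / total) for word, count in counts.items()]
    # Sort by descending count, then alphabetically to keep ties stable.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


for word, count, relative in frequency_table("The cat saw the dog."):
    print(f"{word:<6} {count:>3} {relative:.1%}")
```

In this example "the" appears twice among five words, so its relative frequency is 2 / 5.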

Entities

The Entities tab displays a table of named entities in the text, with columns for entity, type, sentiment, count, and relevance. When possible, entities are linked to external sources, including maps, DBpedia, and the entity's official website if it can be determined.

[Image: Entities table for Pride and Prejudice text sample]

The following image has an example of an entity with external links:

[Image: Textanalysis ent2.png]
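For readers curious how a table like the Entities tab could be derived, the sketch below flattens an entities result into rows with the same columns. The field names (text, type, sentiment.label, count, relevance, disambiguation.dbpedia_resource) are assumed to follow the Watson Natural Language Understanding entities response; the sample dictionary is hand-written for illustration and is not real analysis output, and entity_rows is a hypothetical helper rather than part of the plugin.

```python
def entity_rows(response):
    """Flatten an entities result into table rows, most relevant first."""
    rows = []
    for entity in response.get("entities", []):
        sentiment = entity.get("sentiment", {})
        links = entity.get("disambiguation", {})
        rows.append({
            "entity": entity.get("text", ""),
            "type": entity.get("type", ""),
            "sentiment": sentiment.get("label"),
            "count": entity.get("count"),
            "relevance": entity.get("relevance"),
            # Not every entity can be disambiguated, so this may be None.
            "dbpedia": links.get("dbpedia_resource"),
        })
    rows.sort(key=lambda row: row["relevance"] or 0, reverse=True)
    return rows


# Hand-written sample in the assumed response shape (values are made up).
sample = {
    "entities": [
        {"text": "Mr. Bennet", "type": "Person", "count": 2,
         "relevance": 0.5, "sentiment": {"label": "neutral", "score": 0.0}},
        {"text": "London", "type": "Location", "count": 1, "relevance": 0.8,
         "sentiment": {"label": "positive", "score": 0.3},
         "disambiguation": {
             "dbpedia_resource": "http://dbpedia.org/resource/London"}},
    ]
}

for row in entity_rows(sample):
    print(row)
```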

Taxonomy

This section applies the AlchemyAPI taxonomy grouping labels to your content, with a confidence marker and a score.


Concepts

In this section, concepts are given in a table with the concept and its relevance. The words and phrases are generated computationally and may not appear in the text itself. For more on how concepts work, see the documentation from AlchemyAPI.


Keywords

This section includes keywords, the sentiment associated with each based on context, and each keyword's relevance in the overall text sample.
