Using Google's Search Appliance with Omeka

I need to use Google's search appliance on my Omeka site. However, I can't use their simple Page Layout wizard. I need to put it into my website's design.

I've had little experience with PHP. Has anyone done this by creating the search form, sending it to the Google Search Appliance (my site's already indexed and I have a URL to the appliance), creating a processing page to process the XML, and then creating a results page?
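The first step of that workflow (sending the query to the appliance yourself instead of letting the form post to it) might look roughly like the sketch below. The function name `gsa_search_url` is hypothetical; the host and the `site`/`client` values are the ones from the form posted later in this thread. It assumes that leaving out `proxystylesheet` makes the appliance return raw XML rather than styled HTML, and that `start` is the paging offset, so verify both against your own appliance.

```php
<?php
// Hypothetical helper: builds the URL for a raw-XML query to the appliance.
// Host, collection, and front-end names are copied from the search form
// in this thread; adjust them for your own setup.
function gsa_search_url($query, $start = 0)
{
    $params = array(
        'q'      => $query,
        'site'   => 'africanamericanart', // collection to search
        'client' => 'africanamericanart', // front end
        'output' => 'xml_no_dtd',         // ask for XML output
        'start'  => (int) $start,         // result offset for paging (assumed)
    );
    // http_build_query() URL-encodes each value; '&' is passed explicitly
    // so the result does not depend on the arg_separator.output setting.
    return 'http://si-pwebsrch02.si.edu/search?' . http_build_query($params, '', '&');
}

echo gsa_search_url('Malcolm X'), "\n";
```

The processing page would then fetch that URL (for example with `file_get_contents()`, if `allow_url_fopen` is enabled on the server) and hand the XML to whatever builds the results page.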

I'm not totally sure of the specifics of what Google product you're using and what you're planning, but I'll point out that the site search here on Omeka.org is a Google custom search engine, and we were able to integrate it into our design by simply using the code their wizard generated and modifying the CSS.

We're using the Google Search Appliance.

The problem for my site is the code that goes in the head of my design. This is the code for our header.php file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title><?php echo settings('site_title'); echo $title ? ' | ' . $title : ''; ?></title>

<!-- Meta -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="description" content="<?php echo settings('description'); ?>" />

<?php echo auto_discovery_link_tag(); ?>

<!-- Stylesheets -->
<link rel="stylesheet" media="screen" href="<?php echo html_escape(css('screen')); ?>" />
<link rel="stylesheet" media="print" href="<?php echo html_escape(css('print')); ?>" />

<!-- JavaScripts -->
<?php echo js('default'); ?>

<script type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/jquery-1.4.4.min.js"></script>

<script type="text/javascript" charset="utf-8">
jQuery.noConflict();
</script>

<script type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/slicker.js"></script>

<!--JS for limiting num of char in textfield -->
<script language="javascript" type="text/javascript">
function limitText(limitField, limitCount, limitNum) {
	if (limitField.value.length > limitNum) {
		limitField.value = limitField.value.substring(0, limitNum);
	} else {
		limitCount.value = limitNum - limitField.value.length;
	}
}
</script>

<!-- link to the JavaScript files -->
<script language="javascript" type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/hoverIntent.js"></script>
<script language="javascript" type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/superfish.js"></script> 

<!-- initialize Superfish -->
<script language="javascript" type="text/javascript"> 

    $(document).ready(function(){
        $("ul.sf-menu").superfish();
    }); 

</script>

<!-- Plugin Stuff -->
<?php echo plugin_header(); ?>

</head>
<body<?php echo $bodyid ? ' id="'.$bodyid.'"' : ''; ?><?php echo $bodyclass ? ' class="'.$bodyclass.'"' : ''; ?>>

	<div id="wrap">
		<div id="header">
<div id="site-title"><a href="http://africanamericanart.si.edu/index.php"><img src="http://africanamericanart.si.edu/themes/fromscratch/images/logo.jpg" alt="Oh Freedom!" width="460" height="85" border="0" /></a>
<div class="section_color"></div>

<div id="search-wrap">
<p><a href="/myomeka">Login</a> | <a href="/myomeka/register">Register</a></p>
<!-- Search My Google Search Appliance -->
<form method="get" action="http://si-pwebsrch02.si.edu/search">
  <table>
    <tr>
      <td>
        <input type="text" name="q" size="25" maxlength="255" value=""/>
        <input type="submit" name="btnG" value="Search"/>
        <input type="hidden" name="site" value="africanamericanart"/>
        <input type="hidden" name="client" value="africanamericanart"/>
        <input type="hidden" name="proxystylesheet" value="africanamericanart"/>
        <input type="hidden" name="output" value="xml_no_dtd"/>
      </td>
    </tr>
  </table>
</form>
<!-- End Search My Google Search Appliance-->

<!--myOmeka Search code -->
<!--
<?php echo my_omeka_user_status(); ?>

				    <?php echo simple_search(); ?>
				    <?php echo link_to_advanced_search(); ?>
-->
</div><!-- close search-wrap -->

</div><!-- close site-title -->
	<div id="primary-nav">

    			 <div class="tab_navigation">
    	<ul class="sf-menu">
    	   <li class="nav-introduction"><a href="#">Introduction</a>
    	     <ul style="width: 108px;">
    			<li class="nav-about-oh-freedom"><a href="/intro">About Oh Freedom!</a></li>
				<li class="nav-faq"><a href="/faq">FAQ</a></li>
				<li class="nav-site-guide"><a href="/site_map">Site Map</a></li>
 				<li><a href="mailto:africanamericanart@si.edu">Contact Us</a></li>
             </ul>
          </li>
		   <li class="nav-explore-history-in-art"><a href="#">Explore History in Art</a>
		   	<ul style="width: 183px;">
	           <li class="nav-timeline"><a href="/timeline">Timeline</a></li>
               <li class="nav-art"><a href="/art">Art</a></li>
               <li class="nav-artists"><a href="/artists">Artists</a></li>
               <li class="nav-collect"><a href="/myomeka/help">Collect</a></li>
           </ul>
         </li>
         <li class="nav-lesson-plans"><a href="#">Lesson Plans</a>
           <ul style="width: 110px;">
	           <li class="nav-overview"><a href="/overview">Overview</a></li>
              <li class="nav-create-a-lesson-plan"><a href="/share">Create a Lesson Plan</a></li>
              <li class="nav-view-lesson-plans"><a href="/items?type=10">View Lesson Plans</a></li>
              <li class="nav-start-a-conversation"><a href="/discuss">Start a Conversation</a></li>
           </ul>
         </li>
         <li class="nav-more-resources"><a href="#">More Resources</a>
          <ul style="width: 138px;">
	          <li class="nav-curriculum-standards"><a href="/teaching">Teaching with Oh Freedom!</a></li>
              <li class="nav-teacher-bibliography"><a href="/teacher_bibliography">Teacher Bibliography</a></li>
              <li class="nav-student-bibliography"><a href="/student_bibliography">Student Bibliography</a></li>
              <li class="nav-general-bibliography"><a href="/general_bibliography">General Bibliography</a></li>
              <li class="nav-glossary"><a href="/glossary">Glossary</a></li>
          </ul>
       </li>
    </ul>

              </div><!-- close tab-navigation -->
    </div><!-- close primary-nav -->
  </div><!-- close header-->		 	

<div id="content">

When I enter that into the Header textbox in the wizard, a lot of the <head> code is put within Google's body tag. Here's what the results page code looks like:

<html><head>
   <meta name="robots" content="NOINDEX,NOFOLLOW">
   <meta http-equiv="content-type" content="text/html; charset=UTF-8">

   <title>Search Results:
      Malcolm X
   </title><style>
      <!--
body,td,div,.p,a,.d,.s{font-family:palatino, "times new roman", serif}
body,td,div,.p,a,.d{font-size: }
body,div,td,.p,.s{color:#000000}
body,.d,.p,.s{background-color:#ffffff}
.s{font-size: 80%}
.g{margin-top: 1em; margin-bottom: 1em}
.s td{width:34em}
.l{font-size: }
.l{color: #0000cc}
a:link,.w,.w a:link{color:#0000cc}
.f,.f:link,.f a:link{color:#7777cc}
a:visited,.f a:visited{color:#551a8b}
a:active,.f a:active{color:#ff0000}
.t{color:#000000}
.t{background-color:#ffffff}
.z{display:none}
.i,.i:link{color:#a90a08}
.a,.a:link{color:#008000}
div.n {margin-top: 1ex}
.n a{font-size: 10pt; color:#000000}
.n .i{font-size: 10pt; font-weight:bold}
.q a:visited,.q a:link,.q a:active,.q {color:#0000cc;}
.b,.b a{font-size: 12pt; color:#0000cc; font-weight:bold}
.d{margin-right:1em; margin-left:1em;}
div.oneboxResults {
    height: expression( this.scrollHeight > 499 ? "500px" : "auto" ); /* sets
    max-height for IE 499, 500 mismatch is to avoid IE6 freeze bug*/
    max-height:500px;
    overflow:visible;}
--></style><script type="text/javascript">
      <!--
        function resetForms() {
          for (var i = 0; i < document.forms.length; i++ ) {
              document.forms[i].reset();
          }
        }
        // Search query
	var page_query = "Malcolm+X"
	// Starting page offset, usually 0 for 1st page, 10 for 2nd, 20 for 3rd.
	var page_start = ""
	// Front end that served the page.
	var page_site = "africanamericanart"
      //--></script></head>
   <body onLoad="resetForms()" dir="ltr"> <!-- Please enter html code below. -->

      <!-- Please enter html code below. -->
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

      <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
      <title>Oh Freedom! | Search Results</title>

      <!-- Meta -->
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
      <meta name="description" content="" />

      <?php echo auto_discovery_link_tag(); ?>

      <!-- Stylesheets -->

      <link rel="stylesheet" media="screen" href="http://africanamericanart.si.edu/themes/fromscratch/css/google_search.css" />
      <link rel="stylesheet" media="print" href="http://africanamericanart.si.edu/themes/fromscratch/css/print.css" />
      <link rel="stylesheet" media="screen" href="http://africanamericanart.si.edu/themes/fromscratch/css/about.css" />  

      <style>
      p {text-indent:0;}

      </style>

      <!-- JavaScripts -->
      <script type="text/javascript" src="http://africanamericanart.si.edu/application/views/scripts/javascripts/prototype.js" charset="utf-8"></script>
      <script type="text/javascript" src="http://africanamericanart.si.edu/application/views/scripts/javascripts/prototype-extensions.js"
      charset="utf-8"></script>
      <script type="text/javascript" src="http://africanamericanart.si.edu/application/views/scripts/javascripts/scriptaculous.js?load=effects,dragdrop"
      charset="utf-8"></script>

      <script type="text/javascript" src="http://africanamericanart.si.edu/application/views/scripts/javascripts/search.js" charset="utf-8"></script>

      <script type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/jquery-1.4.4.min.js"></script>

      <script type="text/javascript" charset="utf-8">
      jQuery.noConflict();
      </script>

      <script type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/slicker.js"></script>

      <!--JS for limiting num of char in textfield -->
      <script language="javascript" type="text/javascript">
      function limitText(limitField, limitCount, limitNum) {
      	if (limitField.value.length > limitNum) {
      		limitField.value = limitField.value.substring(0, limitNum);
      	} else {
      		limitCount.value = limitNum - limitField.value.length;
      	}
      }
      </script>

      <!-- link to the JavaScript files -->
      <script language="javascript" type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/hoverIntent.js"></script>

      <script language="javascript" type="text/javascript" src="http://africanamericanart.si.edu/themes/fromscratch/js/superfish.js"></script>

      <!-- initialize Superfish -->
      <script language="javascript" type="text/javascript"> 

      $(document).ready(function(){
      $("ul.sf-menu").superfish();
      }); 

      </script>

      <!-- Plugin Stuff -->
      <link rel="stylesheet" media="screen" href="http://africanamericanart.si.edu/plugins/MyOmeka/views/shared/css/myomeka.css"
      />

      </head>
      <body id="search">

      	<div id="wrap">
      		<div id="header" style="width=920px;">
      <div id="site-title"><a href="http://africanamericanart.si.edu/index.php"><img src="http://africanamericanart.si.edu/themes/fromscratch/images/logo.jpg"
      alt="Oh Freedom!" width="460" height="85" border="0" /></a>
      <div class="section_color"></div>

      <div id="search-wrap">
      <p><a href="http://africanamericanart.si.edu/myomeka">Login</a> | <a href="http://africanamericanart.si.edu/myomeka/register">Register</a></p>

      <!-- Search My Google Search Appliance -->
      <form method="get" action="http://si-pwebsrch02.si.edu/search">
      <table>
      <tr>
      <td>
      <input type="text" name="q" size="25" maxlength="255" value=""/>
      <input type="submit" name="btnG" value="Search"/>
      <input type="hidden" name="site" value="africanamericanart"/>
      <input type="hidden" name="client" value="africanamericanart"/>

      <input type="hidden" name="proxystylesheet" value="africanamericanart"/>
      <input type="hidden" name="output" value="xml_no_dtd"/>
      </td>
      </tr>
      </table>
      </form>
      <!-- End Search My Google Search Appliance-->

      <!--myOmeka Search code -->
      <!--
      <?php echo my_omeka_user_status(); ?>

      				    <?php echo simple_search(); ?>
      				    <?php echo link_to_advanced_search(); ?>
      -->

      </div><!-- close search-wrap -->

      </div><!-- close site-title -->
      	<div id="primary-nav">

      			 <div class="tab_navigation">
      	<ul class="sf-menu">
      	   <li class="nav-introduction"><a href="#">Introduction</a>
      	     <ul style="width: 108px;">
      			<li class="nav-about-oh-freedom"><a href="http://africanamericanart.si.edu/intro">About Oh Freedom!</a></li>
      				<li class="nav-faq"><a href="http://africanamericanart.si.edu/faq">FAQ</a></li>

      				<li class="nav-site-guide"><a href="http://africanamericanart.si.edu/site_map">Site Map</a></li>
      				<li><a href="mailto:africanamericanart@si.edu">Contact Us</a></li>
      </ul>
      </li>
      		   <li class="nav-explore-history-in-art"><a href="#">Explore History in Art</a>
      		   	<ul style="width: 183px;">
      	           <li class="nav-timeline"><a href="http://africanamericanart.si.edu/timeline">Timeline</a></li>

      <li class="nav-art"><a href="http://africanamericanart.si.edu/art">Art</a></li>
      <li class="nav-artists"><a href="http://africanamericanart.si.edu/artists">Artists</a></li>
      <li class="nav-collect"><a href="http://africanamericanart.si.edu/myomeka/help">Collect</a></li>
      </ul>
      </li>
      <li class="nav-lesson-plans"><a href="#">Lesson Plans</a>
      <ul style="width: 110px;">

      	           <li class="nav-overview"><a href="http://africanamericanart.si.edu/overview">Overview</a></li>
      <li class="nav-create-a-lesson-plan"><a href="http://africanamericanart.si.edu/share">Create a Lesson Plan</a></li>
      <li class="nav-view-lesson-plans"><a href="http://africanamericanart.si.edu/items?type=10">View Lesson Plans</a></li>
      <li class="nav-start-a-conversation"><a href="http://africanamericanart.si.edu/discuss">Start a Conversation</a></li>
      </ul>
      </li>
      <li class="nav-more-resources"><a href="#">More Resources</a>

      <ul style="width: 138px;">
      	          <li class="nav-curriculum-standards"><a href="http://africanamericanart.si.edu/teaching">Teaching with Oh Freedom!</a></li>
      <li class="nav-teacher-bibliography"><a href="http://africanamericanart.si.edu/teacher_bibliography">Teacher Bibliography</a></li>
      <li class="nav-student-bibliography"><a href="http://africanamericanart.si.edu/student_bibliography">Student Bibliography</a></li>
      <li class="nav-general-bibliography"><a href="http://africanamericanart.si.edu/general_bibliography">General Bibliography</a></li>
      <li class="nav-glossary"><a href="http://africanamericanart.si.edu/glossary">Glossary</a></li>

      </ul>
      </li>
      </ul>

      </div><!-- close tab-navigation -->
      </div><!-- close primary-nav -->
      </div><!-- close header-->

You can see that my head code appears within their body code. Even with this non-validating code, it looks good in Firefox, Chrome, and Safari but not in IE. Take a look at this page in IE and compare it by looking at the same page in Firefox: http://si-pwebsrch02.si.edu/search?q=Malcolm+X&btnG=Search&site=africanamericanart&client=africanamericanart&proxystylesheet=africanamericanart&output=xml_no_dtd

So, Google says I need to parse the XML from the GSA myself and then build my own results page.

Since Omeka's built-in search is not very robust, it would be helpful for Omeka users who want to use Google's Search Appliance to have a template to work with. There are many here at the Smithsonian who have done this with ColdFusion but not with PHP.
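As a starting point for such a template, here is a minimal sketch of the "parse the XML" step in PHP using SimpleXML. The sample XML is made up for illustration and trimmed to a few elements; the element names (`GSP`, `RES`, `R`, `U`, `T`, `S`) are what I understand the appliance's XML output to use, so compare them against a real response before relying on this. The function name `gsa_parse_results` is hypothetical.

```php
<?php
// Illustrative sample only: a real appliance response has many more fields.
$sampleXml = <<<XML
<GSP VER="3.2">
  <Q>Malcolm X</Q>
  <RES SN="1" EN="2">
    <M>2</M>
    <R N="1"><U>http://example.org/a</U><T>First &lt;b&gt;result&lt;/b&gt;</T><S>Snippet one</S></R>
    <R N="2"><U>http://example.org/b</U><T>Second result</T><S>Snippet two</S></R>
  </RES>
</GSP>
XML;

// Hypothetical helper: turns the response XML into a plain PHP array.
function gsa_parse_results($xml)
{
    $results = array();
    $doc = simplexml_load_string($xml);
    if ($doc === false || !isset($doc->RES)) {
        return $results; // malformed XML, or no results section
    }
    foreach ($doc->RES->R as $r) {
        $results[] = array(
            'url'     => (string) $r->U,
            // Titles and snippets can carry highlighting markup; strip it here.
            'title'   => strip_tags((string) $r->T),
            'snippet' => strip_tags((string) $r->S),
        );
    }
    return $results;
}

// Print the results as a simple list, escaping everything for HTML output.
echo "<ul>\n";
foreach (gsa_parse_results($sampleXml) as $hit) {
    printf(
        "  <li><a href=\"%s\">%s</a><br />%s</li>\n",
        htmlspecialchars($hit['url']),
        htmlspecialchars($hit['title']),
        htmlspecialchars($hit['snippet'])
    );
}
echo "</ul>\n";
```

Because the markup is generated inside the theme, the results list sits between the normal Omeka header and footer, which avoids the nested `<head>` problem shown above.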


John, we are using the Google Search Appliance (is there more than one Google search product?).

I notice that your code for returning search queries is this:

<h1>Search</h1>
<div id="cse" style="width: 100%;">Loading</div>

<script src="http://www.google.com/jsapi" type="text/javascript"></script>
<script type="text/javascript">
  function parseQueryFromUrl () {
    var queryParamName = "q";
    var search = window.location.search.substr(1);
    var parts = search.split('&');
    for (var i = 0; i < parts.length; i++) {
      var keyvaluepair = parts[i].split('=');
      if (decodeURIComponent(keyvaluepair[0]) == queryParamName) {
        return decodeURIComponent(keyvaluepair[1].replace(/\+/g, ' '));
      }
    }
    return '';
  }
  google.load('search', '1', {language : 'en'});
  google.setOnLoadCallback(function() {
    var customSearchControl = new google.search.CustomSearchControl('017451697004202169597:kve64ytzwfy');
    customSearchControl.setResultSetSize(google.search.Search.FILTERED_CSE_RESULTSET);
    var options = new google.search.DrawOptions();
    options.setAutoComplete(true);
    customSearchControl.draw('cse', options);
    var queryFromUrl = parseQueryFromUrl();
    if (queryFromUrl) {
      customSearchControl.execute(queryFromUrl);
    }
  }, true);
</script>

What is http://www.google.com/jsapi? In the wizard I was using, there is no reference to this. Can you give me the URL of the Google search appliance you're using? I'm just wondering whether we could use this, and what changes I'd have to make to the code above.

I am trying to take your code for your search results and modify it for our site. But my search results are coming up with results from your site. So I need to know what in your search result code I need to change. It isn't apparent to me.

Here's your code:

<style type="text/css">
#cse * {
    font-family: inherit !important;
}
#cse a:active, #cse a:active * { color: #800; }
#cse a:hover, #cse a:hover * { color: #600; }
#cse a, #cse a * { color: #900; }
#cse .gs-webResult div.gs-visibleUrl-short {
    display: none;
}
#cse .gs-webResult div.gs-visibleUrl-long {
    display: block;
}
#cse .gs-webResult div.gs-per-result-labels {
    display: none;
}
#cse table {
    margin-bottom: 0;
}
#cse tr {
    border: none;
}
#cse .gsc-tabHeader {
    padding: .5em;
}
#cse .gsc-resultsHeader {
    display: none;
}
#cse .gsc-cursor-box {
    margin-top: 1em;
}
#cse form.gsc-search-box {
    width: 69%;
}
</style>
<div id="primary">
<h1>Search</h1>
<div id="cse" style="width: 100%;">Loading</div>

<script src="http://www.google.com/jsapi" type="text/javascript"></script>
<script type="text/javascript">
  function parseQueryFromUrl () {
    var queryParamName = "q";
    var search = window.location.search.substr(1);
    var parts = search.split('&');
    for (var i = 0; i < parts.length; i++) {
      var keyvaluepair = parts[i].split('=');
      if (decodeURIComponent(keyvaluepair[0]) == queryParamName) {
        return decodeURIComponent(keyvaluepair[1].replace(/\+/g, ' '));
      }
    }
    return '';
  }
  google.load('search', '1', {language : 'en'});
  google.setOnLoadCallback(function() {
    var customSearchControl = new google.search.CustomSearchControl('017451697004202169597:kve64ytzwfy');
    customSearchControl.setResultSetSize(google.search.Search.FILTERED_CSE_RESULTSET);
    var options = new google.search.DrawOptions();
    options.setAutoComplete(true);
    customSearchControl.draw('cse', options);
    var queryFromUrl = parseQueryFromUrl();
    if (queryFromUrl) {
      customSearchControl.execute(queryFromUrl);
    }
  }, true);
</script>
</div>

Actually, I see the variable (the custom ID) you have for your search. And I found the Google instruction page that allows you to sign up for a custom search (for future reference, it's http://www.google.com/cse/).

So I see what I need to do. Thx.

Yes, that seems to have been the source of confusion here.

Google Custom Search, which is what we use here on Omeka.org to allow unified search across our WordPress, bbPress and MediaWiki installations, uses Google's existing web index.

From what I can tell, the Google Search Appliance is a physical product that allows you to index and search internal-only content or content that needs different indexing or searching than Google provides for the web as a whole.

Google Custom Search is a very simple solution for searching pages that are web-accessible, and it also can be used for free (even without ads for non-profits). I don't have any experience with the Search Appliance, so I can't really judge the differences.

Right. These are two different solutions.

Okay, so I'm trying a different direction. I'm trying to create a custom search results page within my theme. I'm using a link to our Google Search Appliance results.

I've created a simple page called search.
This search page needs to be able to access the Google search stylesheet (search.xsl), and I have a feeling it's not able to. When I do a search, the search simple page loads but shows no results (http://africanamericanart.si.edu/search). Try doing a search to see what I mean.

I've tried to enable error logging (I made application/logs/errors.log writable with mode 666), but no errors are logged.

I've placed the xsl file within the items folder.

Here is the code for the search results simple page. You can see that I call the XSL stylesheet with the following line on this page:

$xslt = "/items/search.xsl";

I can send you the code for both the search results page and the XSL stylesheet if you send me an email address (I tried to post them here with the requisite ticks, but it didn't post).

I have discovered that, indeed, Omeka is not allowing the search results to be parsed through the search.xsl stylesheet. For some reason it's invisible to Omeka.

To test for this, I placed the xsl sheet on our own servers here at SI and it works fine. But, I'd like to have everything within the Omeka instance.

Omeka people, I'd like to hear back from you on why I am unable to parse my Google stylesheet if it's located within Omeka. Thanks.

If you send files to me at patrickmjchnm@gmail.com, I'll see if we can figure out what's happening. But, when I go to the search page you linked to, I'm seeing what look like search results. Do you have an example of a search that doesn't work as expected?

Yes, Patrick, that's because I placed my XSL stylesheet on our museum's server and linked to it. When I place the stylesheet within my Omeka instance (which is not on our museum's server), I get blank results. That is, nothing. Looking at the source of the blank results page, it's showing nothing. So it's not reading the stylesheet.

I will send you the files. How do you propose to experiment? What files do you want? Just the stylesheet? I've created a simple page within Omeka for the search results. I can send you the code for that (of course, it won't have the header and footer). Is that what you want?

I'll try to recreate what you are describing on my system as closely as I can, so both the stylesheet and the code of the simple page will be a good start.

Off the top of my head, I can't think of a reason Omeka would prevent the stylesheet from loading or being processed. My first guess is that the XML and XSL modules for PHP are not on the server where your Omeka instance is, though that would also be surprising since they are pretty common modules.
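A quick way to rule that guess in or out is to ask PHP directly which extensions are loaded. This is only a sketch: the function name `missing_extensions` is hypothetical, and the list of extensions is my assumption about what an XSLT-based results page needs.

```php
<?php
// Hypothetical helper: returns the names from $required that are not loaded.
function missing_extensions(array $required)
{
    $missing = array();
    foreach ($required as $ext) {
        if (!extension_loaded($ext)) {
            $missing[] = $ext;
        }
    }
    return $missing;
}

// Assumed requirements for loading XML and applying an XSL stylesheet.
$missing = missing_extensions(array('libxml', 'dom', 'xsl'));

if (count($missing) === 0) {
    echo "All XML/XSL extensions are loaded.\n";
} else {
    echo 'Missing extensions: ', implode(', ', $missing), "\n";
}
```

Run it on the server that hosts the Omeka instance; if `xsl` shows up as missing there, the stylesheet path is not the problem.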

I'll see what I can come up with using your files and code. Thanks!

Ah...I think I should have looked more closely at the earlier info!

I got it working okay. I'm hoping that the magic is in getting the right path in the $xslt variable so Omeka looks in the right spot. I put the xsl file in the plugins directory, then used this:

$xslt = PLUGIN_DIR . "/search.xsl";

BASE_DIR gets you to the root of the installation.

I'm hoping that with one of those constants Omeka will be able to find its way to the xsl.
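To make the path point concrete, here is a hedged sketch. One plausible reading of the earlier failure is that "/items/search.xsl" was treated as a path from the root of the server's filesystem rather than from the Omeka installation, which is why prefixing a directory constant helps. The helper names are hypothetical, and `dirname(__FILE__)` stands in for `PLUGIN_DIR` so the snippet also runs outside Omeka.

```php
<?php
// Hypothetical helper: joins a base directory and a file name, and returns
// the full path only if PHP can actually read the file there.
function resolve_stylesheet($baseDir, $relative)
{
    $path = rtrim($baseDir, '/') . '/' . ltrim($relative, '/');
    return is_readable($path) ? $path : false;
}

// Hypothetical helper: applies an XSL stylesheet to an XML string.
// Returns false if the xsl extension is missing or either document fails to load.
function transform_results($xmlString, $xslPath)
{
    if (!class_exists('XSLTProcessor')) {
        return false;
    }
    $xml = new DOMDocument();
    $xsl = new DOMDocument();
    if (!$xml->loadXML($xmlString) || !$xsl->load($xslPath)) {
        return false;
    }
    $processor = new XSLTProcessor();
    $processor->importStylesheet($xsl);
    return $processor->transformToXml($xml);
}

// Inside Omeka you would pass PLUGIN_DIR (or BASE_DIR plus a subfolder)
// instead of dirname(__FILE__).
$xslt = resolve_stylesheet(dirname(__FILE__), 'search.xsl');
if ($xslt === false) {
    echo "search.xsl was not found or is not readable\n";
}
```

Checking `is_readable()` before the transform turns a silently blank results page into an explicit message, which would have made the original problem easier to spot.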

It seems to be working. Thanks.