Thursday, 6 April 2017

Solr Keyword Search with hyphen not working

Hello Devs,

While working hands-on with Solr, I ran into a problem searching for keywords that contain a hyphen. As a quick overview, I created a list of items in Sitecore with the following fields:

Employee Name, Employee Birthdate, Employee City and Employee Bio. I read their values from my custom Solr index.

When we search by Employee Name (e.g. John - Smith), no results are returned even though the value is present in the index.

The reason is that the Sitecore field types “html|rich text|single-line text|multi-line text|text|memo|image|reference” are mapped to a text field, which uses solr.StandardTokenizerFactory.

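The Standard Tokenizer treats whitespace and punctuation as delimiters and discards them (except dots that are not followed by whitespace). A hyphen in the value is therefore dropped and never reaches the index. To read more, see the Solr documentation on tokenizers.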
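
To make the difference concrete, here is a small Python sketch that imitates the two tokenizers. It is only a rough approximation written for illustration: the real solr.StandardTokenizerFactory follows Unicode text segmentation rules, and the function names below are made up for this example.

```python
import re

def standard_like_tokens(text):
    # Rough stand-in for the Standard Tokenizer (an assumption, not the
    # real algorithm): split on whitespace and punctuation, but keep dots
    # that sit between word characters (e.g. "example.com").
    return re.findall(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*", text)

def keyword_lowercase_tokens(text):
    # Stand-in for KeywordTokenizer + LowerCaseFilter: the whole value
    # becomes one lowercased token, hyphen and spaces included.
    return [text.lower()]

name = "John - Smith"
print(standard_like_tokens(name))
print(keyword_lowercase_tokens(name))
```

With the first approach the hyphen disappears and the name is split into separate tokens; with the second the full value, hyphen included, is kept as a single token that a search for the whole name can match.
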
First, add the field (e.g. Employee Name) in

Sitecore.ContentSearch.Solr.DefaultIndexConfiguration.config under

<fieldNames hint="raw:AddFieldByFieldName"> 

section, or create a custom index configuration file and add the required fields as below:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <!-- Configuration sections for indexes -->
      <indexConfigurations>
        <!-- If an index has no configuration specified, it will use the configuration below. The configuration is not merged if the index also has
         configuration, it is either this configuration or the index configuration. -->
        <defaultSolrIndexConfiguration type="Sitecore.ContentSearch.SolrProvider.SolrIndexConfiguration, Sitecore.ContentSearch.SolrProvider">

          <!-- DEFAULT FIELD MAPPING
               This field map allows you to take full control over how your data is stored in the index. This can affect the way data is queried, performance of searching and how data is retrieved and casted to a proper type in the API.
            -->
          <fieldMap type="Sitecore.ContentSearch.SolrProvider.SolrFieldMap, Sitecore.ContentSearch.SolrProvider">
            <fieldNames hint="raw:AddFieldByFieldName">
              <field fieldName="Employee Name" returnType="string" />
            </fieldNames>
          </fieldMap>
        </defaultSolrIndexConfiguration>
      </indexConfigurations>
    </contentSearch>
  </sitecore>
</configuration>

Next, in the schema.xml file of the Solr core you created for the index, comment out

<fieldType name="string" class="solr.StrField" sortMissingLast="true" />

and add the following in its place:

<fieldType name="string" class="solr.TextField" sortMissingLast="true">
  <!-- KeywordTokenizer keeps the whole value as one token; LowerCaseFilter makes matching case-insensitive -->
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

Once you have made all these changes, restart the Solr service and rebuild the index. Now search for a keyword containing a hyphen (-) and you should see the results.
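
If you want to check the result directly against Solr rather than through Sitecore, a query URL can be built along these lines. This is only a sketch: the base URL, the core name sitecore_master_index and the Solr field name employee_name_s are assumptions for illustration, so check the Solr admin UI for the names your index really uses. The helper build_select_url is also made up for this example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_select_url(base_url, core, field, value):
    # Wrap the value in double quotes so the whole name is sent as one
    # phrase; escape backslashes and quotes inside the value first.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "wt": "json"}
    return f"{base_url}/{core}/select?{urlencode(params)}"

# All four arguments below are assumed values, not taken from a real setup.
url = build_select_url(
    "http://localhost:8983/solr",
    "sitecore_master_index",
    "employee_name_s",
    "John - Smith",
)
print(url)
```

Opening the printed URL in a browser should return the matching document as JSON once the index has been rebuilt.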

Happy Coding 😊
