Sitecore Azure Search overview

Abstract

An introduction to Sitecore Azure Search for Sitecore administrators to read before installation or use.

Caution

Azure Cognitive Search will be discontinued in the future and Sitecore will no longer provide support for this service in future releases.

The Sitecore Azure Search provider integrates the Sitecore Search engine with the Microsoft Azure Cognitive Search service, which is part of the Microsoft Azure computing platform. You can read more about the Microsoft Azure Cognitive Search service on the Microsoft website.

This topic applies to Sitecore Experience Platform 8.2 Update-1 and later and describes the features and limitations of Azure Cognitive Search.

The Microsoft Azure Cognitive Search service provides the following features:

  • Extreme scalability, simplicity, and stability.

  • A highly available infrastructure with 99.95% uptime as a part of the Microsoft Azure service level agreement (SLA).

  • An easy way to scale up and scale down as needed.

The Sitecore Azure Search provider includes the following features:

  • Support for all Sitecore search-driven UIs, including user-typed queries and faceted searches.

  • Support for the majority of LINQ expressions, to enable rapid development of search-powered applications.

  • Native support for fundamental data types, such as numbers and dates, in faceting and range queries.

  • Flexible configuration and precise control over the schema of the indexes.

  • Support for running Sitecore in geo-replicated scenarios.

Note

Sitecore Azure Search behaves slightly differently from the Lucene and Solr search providers; this is important to consider if you are going to switch between search providers. Read more about Sitecore Azure Search limitations and behavioral differences in the Limitations of Azure Cognitive Search section.

Sitecore Azure Search is the default provider for Sitecore instances that are deployed using the Sitecore Azure SDK. It supports on-premises and IaaS deployments. Follow the instructions in Walkthrough: Configure Azure Cognitive Search to configure Sitecore Azure Search.

Compared with Sitecore Search on Lucene and Solr, Sitecore Search on Azure Cognitive Search has several limitations:

Limitation

Description

Automatic tokenization by the Azure Cognitive Search service of document field values and queries when searching and faceting.

This means that:

  • Substring search predicates that are limited to a single term, for instance .StartsWith(), .EndsWith(), and .Contains(), match parts of terms and match terms that are located anywhere in the field value. When multiple terms are passed, each term is searched separately, which can return more results than expected.

  • Regular expressions that span multiple terms (that is, contain spaces) return 0 results.

  • Multiple terms that are passed to .Wildcard() are interpreted as individual wildcards in a field-scoped query.

  • When a value in a faceted field contains multiple words, the facet values are calculated from the individual terms, not from the whole field value (unlike Lucene and Solr).
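The faceting difference described above can be illustrated with a small sketch. This is plain Python, not Sitecore or Azure code: the function names and sample values are hypothetical, and splitting on whitespace and lowercasing is only an assumed stand-in for whatever analyzer the service actually applies.

```python
from collections import Counter

def facet_whole_values(values):
    """Count each complete field value as one facet bucket
    (the whole-value behavior the text attributes to Lucene and Solr)."""
    return Counter(values)

def facet_tokenized(values):
    """Count each individual term as its own facet bucket.
    Assumption: terms are whitespace-separated and lowercased."""
    counts = Counter()
    for value in values:
        # Use a set so a term repeated inside one value is counted once per document.
        counts.update(set(value.lower().split()))
    return counts

# Hypothetical values of a faceted "color" field in three documents.
colors = ["Dark Blue", "Light Blue", "Red"]
whole = facet_whole_values(colors)
terms = facet_tokenized(colors)

print(dict(whole))
print(dict(terms))
```

With whole-value faceting, "Dark Blue" and "Light Blue" are separate buckets. With term-based faceting, both documents contribute to a single "blue" bucket, and neither multi-word value appears as a facet.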

Date-time and numeric values

The Azure Cognitive Search service stores date-time and numeric values as native types and only allows filtering on these fields. Search and filter parts can only be combined with the logical operator AND (&&). As a result:

  • Complex queries involving fields with different types that are combined with the logical operator OR (||) can return an error.

  • .Union() and .Except() operators may generate queries that return an error, depending on the types of the fields used.

  • Certain user queries in the Content Editor that span multiple fields with different types (such as creation date or version) return an error.

Fields

An Azure Cognitive Search index can contain a maximum of 1000 fields. This may be an issue for the Master and Web indexes, which both have a default setup that starts with ~550 fields. If you reach the 1000-field limit, create a new index that is dedicated to indexing your custom templates and fields, then exclude your custom fields from the Master and Web indexes.

Note

The limitation of 1000 fields per index means the Azure Cognitive Search capabilities for multilingual solutions are also limited.

Fuzzy query semantics

These are different in Azure Cognitive Search, for example:

  • .Like(pattern, similarity) interprets the similarity parameter as the Damerau-Levenshtein distance (a value between 0 and 2). This is different from the way Lucene implements the similarity parameter in Sitecore.

  • The similarity and slop parameters cannot be combined in the Azure Cognitive Search Lucene syntax. This means that multiple-word fuzzy queries, such as .Like(), are always interpreted as a phrase query with a slop.
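To make the edit-distance interpretation concrete, here is a minimal sketch of how a distance of 0 to 2 behaves. It is plain Python, not what Sitecore or Azure Cognitive Search executes: the function names are hypothetical, and the code uses the optimal string alignment variant of Damerau-Levenshtein distance, which is an assumption about the exact variant.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: the number of insertions, deletions,
    substitutions, and adjacent transpositions needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, for example "or" <-> "ro".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

def within_fuzzy_distance(term: str, candidate: str, max_edits: int = 2) -> bool:
    """True if candidate is within max_edits edits of term (hypothetical helper)."""
    return osa_distance(term, candidate) <= max_edits

print(osa_distance("sitecore", "sitecroe"))       # one transposition
print(within_fuzzy_distance("search", "serch"))   # one deletion
print(within_fuzzy_distance("index", "indices"))  # three edits, outside the limit
```

Because the value is a count of edits rather than a ratio, code written for a provider that expects a fractional similarity may need its parameter reviewed when switching to Azure Cognitive Search.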

Joining queries

Operators that join queries, such as .GroupJoin() and .SelfJoin(), are not supported and result in an error.

Language-specific analysis

This is not supported prior to Sitecore version 8.2.7.

Maximum content length

For filterable, sortable, or facetable fields, the maximum length is 32766 bytes.
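Because the limit is expressed in bytes rather than characters, non-ASCII content reaches it sooner than a character count suggests. The following sketch shows a pre-indexing check in plain Python. The helper name is hypothetical, and measuring the value as UTF-8 is an assumption about how the service counts bytes.

```python
MAX_FIELD_BYTES = 32766  # limit quoted above for filterable, sortable, or facetable fields

def fits_filterable_field(value: str, limit: int = MAX_FIELD_BYTES) -> bool:
    """Return True if the value's encoded size is within the limit.
    Assumption: the size is measured on the UTF-8 encoding of the value."""
    return len(value.encode("utf-8")) <= limit

ascii_text = "a" * 32766     # 1 byte per character, exactly at the limit
accented_text = "é" * 16384  # 2 bytes per character in UTF-8, 32768 bytes in total

print(fits_filterable_field(ascii_text))
print(fits_filterable_field(accented_text))
```

Under this assumption, the accented string exceeds the limit even though it has roughly half as many characters as the ASCII string.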

Media indexing

This is not supported.

Pivot faceting

Used with the FacetPivotOn operator. This is not supported.

Range queries

These queries are always expressed as filters, which means:

  • Combining range queries with Search using the logical operator OR (||) produces an error.

  • Range queries on string fields always operate on the whole field value without tokenization and are case-sensitive.
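The second point can be surprising in practice, so here is a small illustration in plain Python. It is not Sitecore or Azure code: the helper name and sample titles are hypothetical, and comparing strings by character code is an assumed model of a case-sensitive, whole-value comparison.

```python
def in_string_range(value: str, lower: str, upper: str) -> bool:
    """Case-sensitive comparison of the whole field value against the bounds.
    Python compares strings by code point, so uppercase letters sort before lowercase."""
    return lower <= value <= upper

# Hypothetical values of a string field in three documents.
titles = ["apple pie", "Banana bread", "cherry tart"]

# Range "a" to "d": the whole value is compared, not its individual terms.
matches = [t for t in titles if in_string_range(t, "a", "d")]
print(matches)
```

In this model, "Banana bread" falls outside the range because an uppercase "B" sorts before a lowercase "a", and the term "bread" inside it does not help because the value is not tokenized for the comparison.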

Retrieve specific fields from documents with Azure Cognitive Search

Although Azure Cognitive Search supports this, the functionality is not currently exposed through the Sitecore Content Search API.

Same name fields

The Azure Cognitive Search service has a strong schema. This means, for example, that fields cannot have the same name but different types in different documents.

searchContext.GetTermsByFieldName("fieldname", "prefix")

This is not supported. Use the following alternatives:

  • queryable.FacetOn(doc => doc[fieldName])

  • queryable.Where(doc => doc[fieldName].StartsWith(prefix)).FacetOn(doc => doc[fieldName])

Switch-on rebuild

This is not supported prior to Sitecore version 8.2.7.

The following Azure Cognitive Search features are not currently supported by the Sitecore provider:

  • Geospatial data types

  • Scoring profiles

  • Indexers

  • Suggestions

  • Highlighters