
Valid for Sitecore 6.0.0, 5.3.2
2.  Internal Architecture

As with many other frameworks, you need some insight into the internal processes to create effective solutions.

Queries are unusual in that they can be resolved either directly in the database, at the data provider level, or by the data manager tier. When a query is resolved, the data manager tries its data providers first and falls back to a higher-level (and slower) API if the providers cannot handle the query.
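The resolution order described above can be pictured with a small, self-contained sketch. This is only a model of the idea, not Sitecore's actual implementation: `QueryHandler`, `QueryResolver`, and `LastResolvedBy` are hypothetical names invented for illustration, and results are plain strings rather than Items.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of provider-first query resolution; these are NOT Sitecore classes.
// A handler returns null to signal "I do not support this query".
public delegate List<string> QueryHandler(string query);

public sealed class QueryResolver
{
    private readonly List<QueryHandler> providers = new List<QueryHandler>();
    private readonly QueryHandler fallback;

    // Records which tier answered the last query: "provider" or "fallback".
    public string LastResolvedBy = "";

    public QueryResolver(QueryHandler fallback)
    {
        this.fallback = fallback;
    }

    public void AddProvider(QueryHandler provider)
    {
        providers.Add(provider);
    }

    public List<string> Select(string query)
    {
        // Ask each provider first; the first non-null answer wins.
        foreach (QueryHandler provider in providers)
        {
            List<string> result = provider(query);
            if (result != null)
            {
                LastResolvedBy = "provider";
                return result;
            }
        }

        // No provider handled the query, so use the slower general-purpose path.
        LastResolvedBy = "fallback";
        return fallback(query);
    }
}
```

In this model, a provider that returns null for any query it does not recognize causes `Select` to hand that query to the fallback handler, which stands in for the higher-level API.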

The SQL Server data provider supports only a small subset of queries: '/sitecore/content/home', which resolves an Item by path, and '//home', which resolves an Item by name.

What does this really mean? Imagine that you have a large site that uses an SQL Server database and you're trying to find some content Items. Consider a simple case where we need to find all content Items named 'needle':

Item content = Sitecore.Context.Database.Items["/sitecore/content"];
Item[] needles = content.Axes.SelectItems("//needle");

The SQL Server data provider supports this kind of query, so the query gets resolved directly in the database fairly quickly even though we have a large number of content Items.

Then we complicate the requirements a bit: imagine that the Items include a checkbox field called 'IsHidden' and we only want needles that are not hidden:

Item content = Sitecore.Context.Database.Items["/sitecore/content"];
Item[] needles = content.Axes.SelectItems("//needle[@IsHidden != 1]");

Predicates are not supported by the SQL Server data provider, so it ignores this query. We don't have any other data providers associated with this database, so the data manager resorts to a higher-level query API. This means that all Items in the query scope (all descendants of /sitecore/content in this example) are loaded so that the predicate can be evaluated against each Item, and only the matching Items are returned. The difference in performance between database evaluation and Item-by-Item evaluation can be quite dramatic for large sets of Items.

Now that we understand how the data manager and data providers work together to evaluate the query, let’s improve our solution:  

Item content = Sitecore.Context.Database.Items["/sitecore/content"];
// Limit ourselves to a data provider-supported query
Item[] allNeedles = content.Axes.SelectItems("//needle");

// Do the additional filtering in code: the equivalent of the [@IsHidden != 1] predicate.
List<Item> nonHiddenNeedles = new List<Item>();
foreach(Item item in allNeedles)
{
   if (item["IsHidden"] != "1")
   {
      nonHiddenNeedles.Add(item);
   }
}

By using the '//needle' query, we obtain all Items named 'needle' in a very efficient way because the query is resolved in the database. There are probably only a few such Items, so iterating through each of them to check the 'IsHidden' field should also be fairly inexpensive, and it only requires a few additional lines of code.

To sum it up, you need to understand what kinds of queries are supported by your database backend when querying against large Item sets. This also points to a best practice when integrating external data providers: support common query scenarios to avoid performance problems.
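As a rough illustration of what "supporting common query scenarios" could involve, the sketch below classifies a query string into the two shapes the SQL Server data provider is described as handling (by path and by name) and treats everything else as unsupported. The `QueryShape` enum, the `QueryShapes` class, and the regular expressions are assumptions made for this example; they are not part of the Sitecore API, and real query syntax has more cases than these patterns cover.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the Sitecore API.
public enum QueryShape { Path, Name, Unsupported }

public static class QueryShapes
{
    // One or more "/segment" parts with no predicates, e.g. /sitecore/content/home
    private static readonly Regex PathPattern = new Regex(@"^(/[^/\[\]]+)+$");

    // "//" followed by a single name with no predicates, e.g. //home
    private static readonly Regex NamePattern = new Regex(@"^//[^/\[\]]+$");

    public static QueryShape Classify(string query)
    {
        if (string.IsNullOrEmpty(query))
        {
            return QueryShape.Unsupported;
        }
        if (NamePattern.IsMatch(query))
        {
            return QueryShape.Name;
        }
        if (PathPattern.IsMatch(query))
        {
            return QueryShape.Path;
        }
        // Predicates, axes, and anything else are left to the higher-level API.
        return QueryShape.Unsupported;
    }
}
```

An external provider built along these lines could answer the `Path` and `Name` shapes from its own store and decline the rest, so that the most common lookups never reach the slower evaluation path.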

