Product catalog data

Abstract

Decide whether to store product data in Sitecore content, retrieve the data with a data provider, or index the product data.

There are three common models for storing and retrieving product catalog data in a Commerce Connect solution: the product synchronization, data provider, and index hybrid models. The Commerce Connect service layers work independently of each other, so you can use any of these models for product data and still use the remaining service layers for integration and built-in engagement.

This topic describes the three models and compares them.

Commerce Connect has built-in support for storing product data in Sitecore content. Commerce Connect contains a product data model and a product synchronization service layer for exchanging product data with one or more external systems.

External commerce systems store product data in different ways. The Commerce Connect product data model is designed to provide a scalable standard data model regardless of the external system that is used.

The model has the following advantages:

  • A solid base for typical e-commerce scenarios.

  • No redundant product data.

  • Minimal product data that needs to be synchronized.

  • An easy way for e-commerce vendors to map product data.

  • A standard structure that enables developers to replace the external commerce system with minimal effort and a known structure to manage products.

  • Access to Sitecore XP features, such as Item Buckets and ContentSearch.

It is best practice to use the product synchronization model and store product data as Sitecore content in the following situations:

  • To add marketing information to products – if you want to augment the product data that is stored in the external commerce system, for example, with presentation data or marketing content.

  • To support multiple data sources – if data is being gathered from a number of different systems.

  • When retrieving data from the external system is slow – if one or more external systems cannot deliver product catalog data quickly enough.

To add marketing information to products

The kind of information stored in the commerce system is typically limited. One of the premises of Commerce Connect is that the external commerce system holds only the core product data. Core data is static product information that is shared across all channels, and it is not usually stored in a presentable form. Information such as marketing content must come from other external systems or be added in Sitecore.

Also, Sitecore can handle different information for different channels. The product model does not accommodate different messaging for different channels, but this information can easily be added either in the form of individual fields on the product template itself or as subitems in a Channel subfolder. Sitecore becomes the aggregator of product information for customer presentation in all supported channels. Presentation and sales texts for different channels can be managed in Sitecore before rendering.
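The channel-specific messaging described above can be pictured with a minimal sketch. It is not Sitecore code: `ProductItem`, `channel_texts`, and `sales_text_for` are hypothetical names that stand in for a product item with per-channel fields or subitems, and the fallback order is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProductItem:
    """Hypothetical stand-in for a product stored as Sitecore content."""
    product_id: str
    name: str                 # core data from the external commerce system
    sales_text: str = ""      # default marketing text added in Sitecore
    # Per-channel overrides, e.g. fields or subitems in a Channel subfolder.
    channel_texts: dict[str, str] = field(default_factory=dict)


def sales_text_for(item: ProductItem, channel: str) -> str:
    """Return the channel text, else the default text, else the product name."""
    return item.channel_texts.get(channel) or item.sales_text or item.name


shoe = ProductItem(
    "p1",
    "Trail Shoe",
    sales_text="Light and durable.",
    channel_texts={"mobile": "Light. Durable."},
)
print(sales_text_for(shoe, "mobile"))
print(sales_text_for(shoe, "web"))
```

In this sketch, the core data stays untouched while each channel can carry its own text, which mirrors the idea of Sitecore acting as the aggregator before rendering.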

To support multiple data sources

In some scenarios, the product data is provided by multiple sources. Classification and categorization are important parts of the product catalog data and of how it is presented to the visitor. The classification used in, for instance, an ERP system is usually custom-made and rarely fits the categorization needed for storefront presentation. The data needed for presentation classification, as well as specifications, can therefore reside in a classification system outside the external commerce system and be incorporated as part of the synchronization. Two standard classification systems are UNSPSC® and CNET DataSource™.

  • The United Nations Standard Products and Services Code (UNSPSC) was created for coding products and services unambiguously according to industry-agreed naming standards, to streamline commerce among companies, particularly over the Internet. UNSPSC is an open, global electronic commerce standard that provides a logical framework for classifying goods and services.

  • CNET Content Solutions offers a product called DataSource. DataSource converts non-standardized product information from multiple sources into consistent content for electronic product catalogs. It is also a catalog mapping service that automatically matches external catalogs to the CNET Content Solutions DataSource product database to enrich those catalogs with DataSource specifications, product attributes, and rich content.

The logic that determines where the data is retrieved from must be built into the processors and pipelines that make up the product synchronization layer. The synchronization process is broken down to handle the individual objects of the data model separately. Each pipeline operates on a single object type in the data model. How the data is obtained and the data objects are populated is up to the implementation of the integration. Separate pipeline processors can be implemented so that each processor has responsibility for retrieving parts of the information for the object and populating the object with that information before the synchronization with Sitecore happens.
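The division of work between processors can be sketched as follows. This is a conceptual illustration in Python, not the Commerce Connect pipeline API: the `Product` class, the two processor functions, and the in-memory source dictionaries are all hypothetical, and the classification code is a made-up placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Product:
    """Hypothetical product object that the processors populate."""
    product_id: str
    name: str = ""
    classification: str = ""
    specifications: dict[str, str] = field(default_factory=dict)


# A processor receives the product being built and fills in the part it owns.
Processor = Callable[[Product], None]

# Made-up source data, keyed by product ID.
ERP_DATA = {"p1": {"name": "Trail Shoe"}}
CLASSIFICATION_DATA = {"p1": {"code": "CAT-FOOTWEAR", "specs": {"weight": "280 g"}}}


def read_core_data_from_erp(product: Product) -> None:
    """Populate the core fields from the (hypothetical) ERP source."""
    product.name = ERP_DATA[product.product_id]["name"]


def read_classification(product: Product) -> None:
    """Populate classification and specifications from a second source."""
    entry = CLASSIFICATION_DATA.get(product.product_id)
    if entry is None:
        return  # no classification known; leave the fields empty
    product.classification = entry["code"]
    product.specifications.update(entry["specs"])


def run_pipeline(product_id: str, processors: list[Processor]) -> Product:
    """Run each processor in order against one object of a single type."""
    product = Product(product_id)
    for processor in processors:
        processor(product)
    return product


product = run_pipeline("p1", [read_core_data_from_erp, read_classification])
print(product)
```

Because each processor owns one source, adding a third system in this sketch means adding one more function to the list rather than changing the existing ones.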

When retrieving data from the external system is slow

Not all external systems perform equally well, and ERP systems in particular are often slow to respond. A data provider can also perform poorly when the external system resides on a different server instance and latency is high. In such scenarios, working with a data provider can negatively affect performance, and it is preferable to have product data natively available in Sitecore content.

Support for distributed systems

Latency might be high if the Content Management (CM), Content Delivery (CD), and external commerce system instances are in different geographic locations, or even just on different server instances on the same network. For example, the CM and CD might be in the cloud, while the external commerce system is on the premises. In this scenario, it is probably neither possible nor practical to use real-time calls or an item data provider to retrieve data from the external commerce system.

In addition, using an item data provider makes the system vulnerable in case of connectivity problems between Sitecore and the external system. Once data is synchronized to Sitecore content, it is immediately and always available.

Keeping the product repository in Sitecore up-to-date is not resource intensive

Because price and stock information is dynamic, it is not treated as part of the core product data, and there are separate service layers to retrieve this information as needed. This ensures that the product repository in Sitecore does not contain prices or stock information, only static data. Even with a million products in the repository, only a small fraction of these will typically have to be updated, added, or removed on a regular basis.
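One way to see why the update set stays small is to compare a fingerprint of each product's static core data against the fingerprint recorded at the previous synchronization. The sketch below assumes that approach purely for illustration; it is not how Commerce Connect is documented to detect changes, and `fingerprint` and `plan_sync` are hypothetical helpers.

```python
import hashlib
import json


def fingerprint(core_data: dict) -> str:
    """Return a stable hash of a product's static core data."""
    payload = json.dumps(core_data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plan_sync(external: dict[str, dict], stored: dict[str, str]) -> dict[str, list[str]]:
    """Compare external products with the fingerprints stored at the last sync.

    `external` maps product ID to core data (no prices or stock levels).
    `stored` maps product ID to the fingerprint recorded in the repository.
    """
    current = {pid: fingerprint(data) for pid, data in external.items()}
    return {
        "add": sorted(pid for pid in current if pid not in stored),
        "update": sorted(
            pid for pid in current if pid in stored and stored[pid] != current[pid]
        ),
        "remove": sorted(pid for pid in stored if pid not in current),
    }


previous = {"p1": {"name": "A"}, "p2": {"name": "B"}, "p3": {"name": "C"}}
stored = {pid: fingerprint(data) for pid, data in previous.items()}
latest = {"p1": {"name": "A"}, "p2": {"name": "B2"}, "p4": {"name": "D"}}
print(plan_sync(latest, stored))
```

Because prices and stock levels are excluded from the fingerprinted data, frequent price or stock changes would not mark a product as changed in this sketch.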

An example of product data being synchronized into a product repository in Sitecore can be seen in our StarterKit solution, which is integrated with nopCommerce. The solution is located on GitHub.

Creating an integration with Commerce Connect using product synchronization is a large task. However, creating and supporting a Sitecore data provider that yields adequate performance is an equally large task.

The implementation of a data provider depends on how the product data is structured in the external commerce system. Two commerce systems that use a data provider are uCommerce and Sitecore Commerce. In both systems, product data is structured as nested categories containing products. One difference between them is that the product items contain different fields. You can find an example solution with Sitecore Commerce and a Reference Storefront on GitHub.

A hybrid solution is to create a Sitecore product index based on product data that is stored either in the external system or in Sitecore content. For example, Sitecore Commerce uses indexes as the primary means of driving the catalog browsing experience. When the product details are rendered, they are read live from the external catalog using a data provider. The indexes are built by crawling the catalog data stored in Sitecore content.

When querying the index, you can use the Connect product object model as part of the returned objects. The Sitecore ContentSearch implementation enables you to populate the objects with data from the index before returning the result. Information that is not stored in the index can be loaded from the external commerce system or from content, using the product ID returned by the index as the reference.

To have Sitecore populate the returned objects automatically, the index field names must match the object properties or be configured using attributes.
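The mapping rule and the follow-up lookup by product ID can be illustrated with a small sketch. It is written in Python and does not use the ContentSearch API: `ProductResult`, `FIELD_MAP`, the index field names, and the `load_description` callback are hypothetical, with `FIELD_MAP` playing the role that mapping attributes play when a field name differs from the property name.

```python
from dataclasses import dataclass, fields
from typing import Callable, Optional


@dataclass
class ProductResult:
    """Hypothetical result object populated from an index document."""
    product_id: str = ""
    name: str = ""
    brand: str = ""
    description: Optional[str] = None  # not indexed; loaded on demand


# Explicit mappings for index fields whose names differ from the properties.
FIELD_MAP = {"product_id": "externalid", "brand": "manufacturer_s"}


def from_index_document(doc: dict) -> ProductResult:
    """Populate a result from matching or explicitly mapped index fields."""
    result = ProductResult()
    for prop in fields(ProductResult):
        index_field = FIELD_MAP.get(prop.name, prop.name)
        if index_field in doc:
            setattr(result, prop.name, doc[index_field])
    return result


def with_details(
    result: ProductResult, load_description: Callable[[str], str]
) -> ProductResult:
    """Load data missing from the index, using the product ID as the key."""
    if result.description is None:
        result.description = load_description(result.product_id)
    return result


doc = {"externalid": "p1", "name": "Trail Shoe", "manufacturer_s": "Acme"}
result = from_index_document(doc)
result = with_details(result, lambda pid: f"Details for {pid}")
print(result)
```

The lambda stands in for a call to the external commerce system or to Sitecore content; in this sketch it runs only for data that the index did not supply.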

A data provider solution might seem the most suitable because product data is typically owned by the external commerce system and naturally belongs there. However, there are a number of reasons to store product data in Sitecore and/or an index.

The following comparison shows the differences between the three data storage and retrieval models: product synchronization, data provider, and index hybrid.

Catalog data can be updated (writable/updatable)

  • Product synchronization: Yes. Once data has been synchronized, the content can be updated in Sitecore.

  • Data provider: Depends on the external system and the data provider. For example, Commerce Server can be updated, but Microsoft Dynamics AX for Retail cannot be updated.

  • Index hybrid: No. The index is static.

Possible to augment data

  • Product synchronization: Yes. Additional fields and/or items can be added to the model.

  • Data provider: Depends on the external system and the data provider implementation.

  • Index hybrid: No. Will require additional content to be maintained separately in Sitecore.

Support for multiple sources

  • Product synchronization: Yes. Processors for retrieving data from multiple systems can be injected into the synchronization pipelines.

  • Data provider: Potentially. Normally there is only one source.

  • Index hybrid: Potentially. Normally there is only one source.

Latency in accessing catalog data

  • Product synchronization: None. Only depends on the speed of Sitecore.

  • Data provider: Depends on the integration and the proximity of the external system.

  • Index hybrid: Depends on the integrated search engine.

Time and resources needed to expose and maintain the catalog data

  • Product synchronization: Yes. Regular product synchronization is needed.

  • Data provider: No.

  • Index hybrid: Yes. Regular re-indexing is needed.

A solid base for typical e-commerce scenarios

  • Product synchronization: Yes. Designed with storefront scenarios in mind.

  • Data provider: Depends on the virtual structure and data exposed by the data provider.

  • Index hybrid: Depends on how complete the index content is and on whether the external system must be called often to get the needed information.

Redundant product data can be avoided

  • Product synchronization: Yes.

  • Data provider: No. The following become a challenge when indexing the catalog data from Sitecore content. Typically, data for a product is exposed as a single Sitecore item, so if multiple products have the same values, the data appears multiple times. Because products can belong to multiple categories and catalogs, they appear in multiple places in the content tree structure.

  • Index hybrid: No. Data for a product is stored in every index document, so if multiple products have the same values, the data appears multiple times. Because products can belong to multiple categories and catalogs, they appear in multiple places in the index.

An easy way for e-commerce vendors to map product data

  • Product synchronization: Yes.

  • Data provider: Depends on the structure of data in the external system and how it needs to be exposed in Sitecore as a tree structure.

  • Index hybrid: Depends on the structure of data in the external system and how it needs to be mapped to a flat index structure.

A standard structure that allows developers to replace the external commerce system with minimal effort and a known structure to manage products

  • Product synchronization: Yes.

  • Data provider: No. Every data provider exposes a different virtual structure for the catalog data.

  • Index hybrid: Yes. The crawler and indexer can be created so that the index maintains the same fields and data.