Batch aggregation configuration
Optimize the performance of batch aggregation.
Batch aggregation enables you to group interactions together into batches to improve the performance of your aggregation processing framework. In the Sitecore Experience Database (xDB), batch aggregation is part of the standard installation, but you can also change the default settings depending on your specific hardware and solution architecture. You can test the performance of your batch aggregation configuration using the performance counters in this topic.
You should change the default batch aggregation configuration if you want to:
Improve the performance of batch aggregation to suit your solution.
Enable batch aggregation to use your own custom aggregations.
To change the default batch aggregation configuration you need to edit the Sitecore.Analytics.Processing.Aggregation.Services.config
file.
Note
Do not make changes directly to the configuration file, but instead create your own custom configuration patch file that performs the required changes during run time.
The following settings enable you to adjust the configuration of batch aggregation to improve the performance of your Sitecore solution during aggregation processing.
Configuration node | Description and example |
---|---|
| The Adjust this setting according to the performance of your individual Sitecore solution. Consider the following factors when you set the maximum batch size: - The number of SQL Server disks you have. - Processing power (type and number of processors). - Network latency. - The specific configuration of your aggregation/processing server. The default setting is 64 interactions: <MaximumBatchSize>64</MaximumBatchSize> |
| Multiplexing collapses multiple aggregation threads into a single transaction just before commit to make processing faster and to reduce the load on SQL Server. The multiplexing time out setting enables you to specify the maximum time the multiplexer can wait for parallel executing aggregation agents before storing the data set in the reporting database. Setting the value to 1 second (default value) tells the system to save the combined results produced by aggregation agents each second. The default setting is 1 second: <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout> |
| Replace the standard implementation of the batch aggregator with your own custom implementation. <Aggregator> <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout> </Aggregator> |
Sitecore uses table-valued parameters (TVP) to send aggregated information to SQL server.
For information on how to use table-valued parameters read the Microsoft article Use Table-Valued Parameters (Database Engine).
Using TVP provides a faster way of communicating with SQL Server. It enables you to pass a table as a parameter to a function or stored procedure after the data from the collection database has been grouped together. The use of TVP is optional and is implemented separately for each fact and dimension table.
The SQL table mappings in the Sitecore configuration file have the following properties to support TVP:
IsMultiRow
– shows whether the stored procedure accepts TVP.TableType
– the name of the SQL Server user-defined type that is accepted by the stored procedure. This type must correspond to the structure of the table. It must have the same columns in the same order and use the same types. For more information on user-defined types, read the Microsoft article Working with User-Defined Types in SQL Server.
SQL table mappings are located at the following path: /sitecore/aggregation/routines/ExecRoutineStatementBuilder/mappings
and defined in the Sitecore.Analytics.Processing.Aggregation.config
file.
It is recommended that you create custom aggregations that support for TVP. See the example for a description of how to configure Sitecore to do this.
You can customize the following examples to change your default batch aggregation configuration.
Example 1: Changing the default configuration settings
This is an example of a configuration patch file that you can use to change the default batch aggregation configuration file settings. It shows how to change the MaximumBatchSize
, MultiplexingTimeout
, Interval
and MaxThreads
parameters for both the live and history aggregators.
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/"> <sitecore> <aggregation> <!-- Configure the aggregator agent: --> <aggregator> <Aggregator> <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout> </Aggregator> <MaximumBatchSize>128</MaximumBatchSize> </aggregator> <!-- Configure the historyWorker agent: --> <historyWorker> <Aggregator> <MultiplexingTimeout>0.00:00:01</MultiplexingTimeout> </Aggregator> <MaximumBatchSize>128</MaximumBatchSize> </historyWorker> <!-- Aggregation Module: --> <module type="Sitecore.Analytics.Aggregation.AggregationModule"> <BackgroundServices> <aggregator> <Interval>0.00:00:15</Interval> <MaxThreads>1</MaxThreads> </aggregator> <history> <Interval>0.00:00:15</Interval> <MaxThreads>1</MaxThreads> </history> </BackgroundServices> </module> </sitecore> </configuration>
Example 2: Enabling TVP support for a custom aggregation
This is a code sample that you can use to create your own configuration file to enable a custom aggregation to work with batch aggregation. It shows how to integrate a custom processor into the aggregation pipeline for interactions and how to configure mappings for a custom fact table with support for SQL Server TVP.
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/"> <sitecore> <aggregation> <routines> <ExecRoutineStatementBuilder> <mappings> <!-- Reference to the fact table and associated stored procedure and table type --> <SqlMappingEntity uid="MyCustomFact" type="Sitecore.Analytics.Aggregation.SqlMappingEntity, Sitecore.Analytics.Sql"> <Table>Fact_CustomFact</Table> <Routine>Add_CustomFact_Tvp</Routine> <IsMultiRow>True</IsMultiRow> <TableType>CustomFact_Type</TableType> </SqlMappingEntity> </mappings> </ExecRoutineStatementBuilder> </routines> </aggregation> <pipelines> <group groupName="analytics.aggregation"> <pipelines> <interactions> <!-- Registration of the processor that will write to the new fact table --> <processor uid="MyCustomFact" type="Customization.CustomProcessor, Customization" /> </interactions> </pipelines> </group> </pipelines> </sitecore> </configuration>
If you want to adjust the performance of your batch aggregation processing configuration, or you want to monitor and troubleshoot the aggregation process, you can use the following counters to measure performance.
Counter | Description |
---|---|
Average batch size – live | Displays the average number of interactions contained in each batch processed during live aggregation. |
Average batch size – history | Displays the average number of interactions contained in each batch processed during history aggregation. Under normal operation, this counter is equal to |
Number of batches containing failing items – live | Displays the number of live aggregation batches (per second) containing interactions that have failed to be aggregated or saved to the reporting database. Under normal operation, this counter is always zero. |
Number of batches containing failing items – history | Displays the number of history aggregation batches (per second) containing interactions that have failed to be aggregated or saved to the reporting database. Under normal operation, this counter is always zero. |
Average write time – live | Displays the average time (in milliseconds) that it takes during live aggregation for a batch to be saved to the reporting database. A number larger than the configured multiplexing timeout indicates a bottleneck. |