Identifying Performance Symptoms through Logs and Metrics
The first clue came from analyzing response times using Sitecore logs and performance metrics. Building on the earlier issue investigation (as detailed in the presentation), I looked at response duration and isolated instances where responses exceeded 5000 milliseconds (5 seconds). This data, visualized through a scatter chart, pointed toward periods of high latency when certain background processes were running. Two symptoms stood out:
- Frequent slowdowns during content publishing.
- General performance lag during content authoring, even when not publishing.
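As a rough illustration of the filtering step described above, the sketch below counts responses slower than the 5000 ms cut-off per hour. It assumes the durations have already been extracted from the logs into (timestamp, duration) pairs; the function name and the sample values are made up for the example and are not taken from the actual investigation.

```python
from collections import Counter
from datetime import datetime

SLOW_THRESHOLD_MS = 5000  # the 5-second cut-off used in the investigation


def slow_responses_by_hour(samples, threshold_ms=SLOW_THRESHOLD_MS):
    """Count responses slower than threshold_ms, grouped by hour.

    `samples` is an iterable of (timestamp, duration_ms) pairs. How they are
    extracted from the logs is environment-specific and not shown here.
    """
    counts = Counter()
    for timestamp, duration_ms in samples:
        if duration_ms > threshold_ms:
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            counts[hour] += 1
    return dict(sorted(counts.items()))


# Made-up sample data, for illustration only.
samples = [
    (datetime(2024, 1, 15, 10, 5), 1200),
    (datetime(2024, 1, 15, 10, 17), 5400),
    (datetime(2024, 1, 15, 10, 42), 7300),
    (datetime(2024, 1, 15, 11, 3), 900),
    (datetime(2024, 1, 15, 14, 30), 6100),
    (datetime(2024, 1, 15, 14, 45), 5000),  # exactly 5000 ms is not counted
]

for hour, count in slow_responses_by_hour(samples).items():
    print(hour.isoformat(), count)
```

Lining the hourly counts up against the times when background jobs ran is one way to reproduce the correlation the scatter chart showed.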
Verifying Infrastructure and Eliminating Environmental Causes
I reviewed the environment to rule out potential hardware or infrastructure problems:
- Server Metrics: Both CM and CD servers showed normal CPU, processing-thread, and memory utilization under average load, except during content authoring activities.
- Database Performance: SQL Server queries were running within expected timeframes, with no significant slowdowns.
- Network Latency: No signs of network congestion or packet loss were detected.
Investigating Publishing Behavior
One of the first areas I explored was the publishing process itself. Logs and audit trails made it evident that the Related items option was being used excessively: content authors were triggering Smart Publish and Republish with both the Sub items and Related items options for routine content updates. This caused:
- Increased load on the CM server, leading to slower processing times.
- Unnecessary re-indexing, adding further strain to the system.
To address this:
- I advised the content team on the significance and use cases of Smart Publish and Republish, especially the Sub items and Related items options.
- Additionally, I reviewed the logs to identify where excessive Republish actions were occurring and reconfigured the publishing settings to minimize unnecessary usage of this feature (Custom).
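A minimal sketch of how such a review could be tallied is shown below. It assumes publish operations have already been pulled out of the audit trail into simple records; the `PublishEvent` type, its fields, and the sample users are hypothetical and do not reflect Sitecore's actual log format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishEvent:
    """Hypothetical record for one publish operation taken from the audit trail."""
    user: str
    mode: str        # "incremental", "smart" or "republish"
    subitems: bool   # Sub items option ticked
    related: bool    # Related items option ticked


def heavy_publishes_per_user(events):
    """Count Smart/Republish operations that included both subitems and related items."""
    counts = Counter(
        event.user
        for event in events
        if event.mode in {"smart", "republish"} and event.subitems and event.related
    )
    return dict(counts)


# Made-up events, for illustration only.
events = [
    PublishEvent("author_a", "republish", True, True),
    PublishEvent("author_a", "smart", True, True),
    PublishEvent("author_b", "smart", False, False),
    PublishEvent("author_b", "republish", True, True),
    PublishEvent("author_c", "incremental", True, True),
]

print(heavy_publishes_per_user(events))
```

A tally like this makes it easier to see which authors or workflows would benefit most from guidance on the publishing options.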
Uncovering the Real Pitfall: Indexing Jobs Overloading Resources
Frequent publishing at short intervals, especially when subitems and related items are included, combined with an asynchronous indexing strategy set to a 00:00:05 interval with checkThreshold options, can lead to significant strain on the indexing process in Sitecore SXA environments.
The trend below indicates a consistent pattern of indexing activity over several days, involving multiple indexes such as sitecore_master_index, sitecore_sxa_master_index, and auxiliary indexes like sitecore_marketing_asset_index_master. The job execution frequency shows regular periodic activity with notable spikes, particularly on October 17 and 19. On October 17, the sitecore_sxa_master_index experienced a significant spike at 2:30 PM, with 156 jobs executed, while the sitecore_master_index peaked on October 19 at 10:30 AM with 124 jobs. These surges suggest high content publication volumes or significant index rebuild triggers during these times.

The trend below narrows the focus to the most critical and frequently active indexes: sitecore_master_index and sitecore_sxa_master_index. The pattern continues, with the sitecore_master_index showing heightened indexing activity on October 19, likely driven by heavy publishing cycles or large content changes. In contrast, indexes like sitecore_fxm_web_index and other marketing-related indexes show lower job counts, indicating less frequent use.
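Trends like these can be derived by bucketing indexing jobs per index and time window. The sketch below is one way to do that, assuming the job start times have already been extracted as (index name, timestamp) pairs; the 30-minute window, the helper names, and the sample data are assumptions for illustration, not the actual job data behind the charts.

```python
from collections import Counter
from datetime import datetime


def bucket_start(timestamp, minutes=30):
    """Round a timestamp down to the start of its bucket (minutes must divide 60)."""
    floored = timestamp.minute - timestamp.minute % minutes
    return timestamp.replace(minute=floored, second=0, microsecond=0)


def job_counts(jobs, minutes=30):
    """Count jobs per (index name, bucket start)."""
    counts = Counter()
    for index_name, started_at in jobs:
        counts[(index_name, bucket_start(started_at, minutes))] += 1
    return counts


def peak(counts, index_name):
    """Return (bucket start, job count) for the busiest bucket of one index, or None."""
    per_index = {b: n for (name, b), n in counts.items() if name == index_name}
    if not per_index:
        return None
    return max(per_index.items(), key=lambda item: item[1])


# Made-up job start times, for illustration only.
jobs = [
    ("sitecore_master_index", datetime(2024, 1, 15, 10, 31)),
    ("sitecore_master_index", datetime(2024, 1, 15, 10, 44)),
    ("sitecore_master_index", datetime(2024, 1, 15, 10, 59, 30)),
    ("sitecore_master_index", datetime(2024, 1, 15, 11, 2)),
    ("sitecore_sxa_master_index", datetime(2024, 1, 15, 14, 35)),
    ("sitecore_sxa_master_index", datetime(2024, 1, 15, 14, 36)),
]

counts = job_counts(jobs)
print(peak(counts, "sitecore_master_index"))
```

Comparing each index's peak bucket against the publishing schedule helps confirm whether spikes line up with heavy publishing or with rebuilds.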
Key Observations
- Peak Activity Windows: The job count spikes around October 17 and 19 suggest potential system strain due to heavy content publishing or full index rebuilds, and mark periods of high resource consumption.
- Job Count Distribution: While indexing jobs are processed consistently, there is room for optimization, particularly during peak times. This includes opportunities to enhance asynchronous indexing strategies or stagger publishing events to mitigate bottlenecks.
- Room for Optimization: The intensity of job executions for master indexes points to the need for refined indexing strategies, such as adjusting the frequency of asynchronous jobs to better distribute load throughout the day.
- High CPU and memory usage: The system has to repeatedly assess and process a large number of content updates, including related and subitems, which amplifies the workload on the search and indexing infrastructure.
- Search relevancy delays: Although the strategy aims for near real-time indexing, the system may struggle to maintain accuracy and consistency due to the continuous stream of updates. Items can remain in a partially indexed state, causing temporary discrepancies in search results.
- Increased risk of data contention: With the checkThreshold option enabled, Sitecore attempts to intelligently manage indexing load based on resource usage. However, with such frequent triggers, the system may fail to adequately stagger operations, leading to contention for system resources.
- Potential index corruption: The frequent, asynchronous updates increase the risk of partial or failed indexing operations, especially when the system is under heavy load. This can result in corrupt or incomplete indexes, further degrading search functionality.
- Performance degradation: Publishing operations, even in an async mode, can cause latency in other critical processes like content editing, search queries, and rendering, particularly if the indexing system is constantly in use without adequate recovery time.
Overall, while frequent updates aim to keep the index up-to-date, this aggressive interval can lead to diminished performance, slower user interactions, and potential inconsistencies in search results, especially in environments handling a large volume of content changes. Adjusting the publishing frequency or optimizing the indexing strategy, including batch processing and setting more reasonable intervals, would help mitigate these challenges.
Solution:
- Initially, the core index was scheduled to run every 1 minute, but since changes in this collection are infrequent, we extended the interval to 60 minutes.
- The sitecore_master_index and sitecore_sxa_master_index were originally scheduled to run every 5 seconds, which we adjusted to a 10-minute interval to optimize performance.
- Content that isn’t time-sensitive is considered for scheduled publishing during off-peak hours. This ensures that the CM server isn’t overloaded during periods when authors are actively working on content updates.
- We enforced publishing workflows that automate routine tasks and encourage best practices, such as avoiding unnecessary full republishes and ensuring that only approved content is published.
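To put the interval changes in perspective, the arithmetic below compares how many times per day an interval-based strategy can fire before and after the change. This is only an upper bound on strategy wake-ups, not a count of actual indexing jobs, and the dictionary layout is just an illustration.

```python
from datetime import timedelta

DAY = timedelta(days=1)


def max_runs_per_day(interval):
    """Upper bound on how often an interval-based strategy can fire in 24 hours."""
    return DAY // interval


# (before, after) intervals as described in the list above.
INTERVALS = {
    "core index": (timedelta(minutes=1), timedelta(minutes=60)),
    "sitecore_master_index": (timedelta(seconds=5), timedelta(minutes=10)),
    "sitecore_sxa_master_index": (timedelta(seconds=5), timedelta(minutes=10)),
}

for name, (before, after) in INTERVALS.items():
    old, new = max_runs_per_day(before), max_runs_per_day(after)
    print(f"{name}: {old} -> {new} runs/day ({old // new}x fewer)")
# core index: 1440 -> 24 runs/day (60x fewer)
# sitecore_master_index: 17280 -> 144 runs/day (120x fewer)
# sitecore_sxa_master_index: 17280 -> 144 runs/day (120x fewer)
```

The trade-off is freshness: with a 10-minute interval, a change may take up to roughly that long to appear in the master indexes.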
Best Practices for Indexing and Publishing in Solr-based Sitecore XM Environments
Based on this experience, here are some key best practices to ensure optimal performance in Solr-based Sitecore environments, especially with SXA:
Understanding Indexing Requirements
- Identify Content Types: Determine which content types are frequently queried or accessed. Prioritize indexing those items.
- Define Search Use Cases: Understand how users will search and what data they need. This will inform which fields should be indexed.
- Sitecore Content Search Index: Use Sitecore's built-in content search index (e.g., sitecore_master_index, sitecore_web_index) for querying content efficiently.
- Custom Indexes: Create custom indexes if we have specific requirements not met by default indexes, ensuring they are tailored to our specific content needs.
- Field Mapping: Map fields appropriately in the index configuration to ensure that only necessary fields are indexed, which reduces the index size and improves performance.
- Boosting and Analyzers: Use boosting for critical fields that require higher search priority. Choose appropriate analyzers based on the content and the search requirements (e.g., StandardAnalyzer, WhitespaceAnalyzer).
- Schedule Incremental Indexing: Use incremental indexing to update only the changed items instead of re-indexing the entire dataset, which saves time and resources.
- Disable Automatic Indexing for Unused Fields: Disable automatic indexing on fields that are not necessary for searching to reduce index updates.
- Use Monitoring Tools: Implement monitoring tools (like Sitecore's built-in tools or external monitoring) to track index performance and identify slow queries or potential issues.
- Log Search Queries: Log search queries to analyze their frequency and performance, helping to identify which queries might need optimization or which fields require additional indexing.
- Rebuild Indexes: Schedule regular index rebuilds to clear out stale data and optimize performance, particularly after significant content changes.
- Index Health Checks: Regularly check the health of indexes for corruption or errors and fix issues proactively.
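For the health-check item above, one lightweight option is Solr's ping handler, which is available per core at `/admin/ping`. The sketch below is a minimal example of polling it; the base URL and the helper names are assumptions, and a real setup would likely need authentication, TLS settings, and alerting that are not shown here. Note that a successful ping only shows the core is responding; it does not prove the index content is complete or uncorrupted.

```python
import json
from urllib.request import urlopen


def ping_url(base_url, core):
    """Build the ping URL for one Solr core, e.g. <base>/<core>/admin/ping."""
    return f"{base_url.rstrip('/')}/{core}/admin/ping?wt=json"


def is_healthy(body):
    """Return True if a ping response body reports status OK."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False
    return isinstance(payload, dict) and payload.get("status") == "OK"


def check_core(base_url, core, timeout=5):
    """Ping one core over HTTP; any connection or HTTP error counts as unhealthy."""
    try:
        with urlopen(ping_url(base_url, core), timeout=timeout) as response:
            return is_healthy(response.read().decode("utf-8"))
    except OSError:
        return False


# Example usage (hypothetical host, not called here):
# check_core("https://solr.example.local:8983/solr", "sitecore_master_index")
```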
By applying these insights and best practices, we'll not only prevent the frustrating slowdowns that can plague content-heavy environments but also empower our content authors to work more efficiently. Sitecore is an incredibly powerful platform, but like any complex system, it needs to be fine-tuned for optimal performance.
When we optimize our publishing and indexing strategies, we're not just improving server performance; we’re creating a smoother, faster experience for content authors and end-users alike. And in the fast-paced world of digital experience, every second counts.
So, take the time to investigate, tweak, and monitor our environments. The results will speak for themselves: faster load times, smoother publishing processes, and a more resilient system ready to handle whatever our team throws at it.
Our Sitecore environment and our content authors will thank us!