Introduction

Cosmos DB Change Feed provides a persistent log of document changes. When consumer applications process changes slower than they're generated, lag increases, causing delayed data synchronization and stale downstream systems.

Symptoms

Change feed lag increasing:

```bash # Monitor change feed processor metrics # Estimated lag = Current LSN - Last processed LSN

MetricValue
ChangeFeedProcessor.Lag10000+ documents
Processing latency> 5 minutes

Processor health check warning:

csharp
// Change feed processor shows warning
processor.GetEstimatedLag() // Returns high value
processor.GetHealthStatus() // Shows unhealthy

Lease container contention:

bash
# In lease container logs:
"Lease acquisition failed due to contention"
"Another processor instance already owns this lease"

Common Causes

  1. 1.Consumer throughput insufficient - Can't keep up with change rate
  2. 2.Lease container throttling - Lease operations hitting RU limits
  3. 3.Too few processor instances - Not enough parallel processing
  4. 4.Batch size too small - Processing few documents per batch
  5. 5.Slow downstream processing - External service bottleneck
  6. 6.Partition skew - Hot partition with more changes
  7. 7.Network latency - Cross-region read latency

Step-by-Step Fix

  1. 1.Check logs for specific error messages
  2. 2.Verify configuration settings
  3. 3.Test network connectivity
  4. 4.Review recent changes
  5. 5.Apply corrective action
  6. 6.Verify the fix

Step 1: Monitor Change Feed Metrics

```csharp // Get change feed processor metrics var processor = container.GetChangeFeedProcessorBuilder<dynamic>( "myProcessor", async (changes, cancellationToken) => { // Process changes }) .WithInstanceName("instance-1") .WithLeaseContainer(leaseContainer) .Build();

await processor.StartAsync();

// Check lag var lag = await processor.GetEstimatedLag(); Console.WriteLine($"Estimated lag: {lag}"); ```

Step 2: Check Lease Container RU/s

```bash # Get lease container throughput az cosmosdb sql container show \ --account-name my-cosmos \ --resource-group my-rg \ --database-name lease-db \ --name leases \ --query '{Throughput:resource.throughput}'

# Lease container needs sufficient RU/s # Each processor instance periodically updates its lease # Rule of thumb: 100-500 RU/s per processor instance ```

Step 3: Increase Lease Container RU/s

```bash # Increase lease container throughput az cosmosdb sql container throughput update \ --account-name my-cosmos \ --resource-group my-rg \ --database-name lease-db \ --name leases \ --throughput 1000

# Or enable autoscale az cosmosdb sql container throughput update \ --account-name my-cosmos \ --resource-group my-rg \ --database-name lease-db \ --name leases \ --max-throughput 2000 ```

Step 4: Optimize Batch Size

csharp
// Increase batch size for more throughput
var processor = container.GetChangeFeedProcessorBuilder<dynamic>(
    "myProcessor",
    async (changes, cancellationToken) => {
        foreach (var change in changes)
        {
            await ProcessChangeAsync(change);
        }
    })
    .WithInstanceName("instance-1")
    .WithLeaseContainer(leaseContainer)
    .WithMaxItems(1000) // Increase from default 100
    .WithPollInterval(TimeSpan.FromMilliseconds(100)) // Faster polling
    .Build();

Step 5: Scale Processor Instances

```csharp // Deploy more processor instances // Each instance handles a subset of partitions // Maximum instances = number of physical partitions

// Run multiple processor instances (same processor name!) // They will automatically distribute partitions among themselves

// Instance 1: var processor1 = container.GetChangeFeedProcessorBuilder<dynamic>( "myProcessor", // Same processor name ProcessChangesAsync) .WithInstanceName("instance-1") // Unique instance name .WithLeaseContainer(leaseContainer) .Build();

// Instance 2: var processor2 = container.GetChangeFeedProcessorBuilder<dynamic>( "myProcessor", // Same processor name ProcessChangesAsync) .WithInstanceName("instance-2") // Different instance name .WithLeaseContainer(leaseContainer) .Build(); ```

Step 6: Optimize Downstream Processing

```csharp // Use batch processing instead of individual items async Task ProcessChangesAsync( IReadOnlyCollection<dynamic> changes, CancellationToken cancellationToken) { // Batch process all changes var batch = new List<WriteModel<BsonDocument>>();

foreach (var change in changes) { batch.Add(new ReplaceOneModel<BsonDocument>( Builders<BsonDocument>.Filter.Eq("_id", change.id), change.ToBsonDocument(), new ReplaceOptions { IsUpsert = true })); }

if (batch.Count > 0) { await mongoCollection.BulkWriteAsync(batch); } } ```

Step 7: Use In-Memory Lease Container (Development)

```csharp // For development/testing, use in-memory lease container // WARNING: Not for production - leases lost on restart

var processor = container.GetChangeFeedProcessorBuilder<dynamic>( "myProcessor", ProcessChangesAsync) .WithInstanceName("instance-1") .WithLeaseContainer(leaseContainer) .WithStartFromBeginning() // Or WithStartTime(DateTime.UtcNow.AddHours(-1)) .Build(); ```

Step 8: Check for Blocked Processing

```csharp // Add timeout to processing async Task ProcessChangesAsync( IReadOnlyCollection<dynamic> changes, CancellationToken cancellationToken) { using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken); cts.CancelAfter(TimeSpan.FromMinutes(5)); // Timeout

try { foreach (var change in changes) { cts.Token.ThrowIfCancellationRequested(); await ProcessChangeAsync(change, cts.Token); } } catch (OperationCanceledException) { // Log timeout, don't block processor _logger.LogWarning("Change feed processing timed out"); throw; // Will retry } } ```

Step 9: Monitor Processor Health

```csharp // Implement health check public class ChangeFeedHealthCheck : IHealthCheck { private readonly ChangeFeedProcessor _processor;

public Task<HealthCheckResult> CheckHealthAsync( HealthCheckContext context, CancellationToken cancellationToken = default) { var lag = _processor.GetEstimatedLag().Result;

if (lag < 1000) return Task.FromResult(HealthCheckResult.Healthy($"Lag: {lag}")); else if (lag < 10000) return Task.FromResult(HealthCheckResult.Degraded($"Lag: {lag}")); else return Task.FromResult(HealthCheckResult.Unhealthy($"Lag: {lag}")); } } ```

Step 10: Use Push Model for Real-time

```csharp // For lower latency, use push model (Azure Functions) // Azure Functions Cosmos DB trigger automatically uses change feed

[FunctionName("ProcessChanges")] public static async Task Run( [CosmosDBTrigger( databaseName: "my-db", containerName: "my-container", Connection = "CosmosConnection", LeaseContainerName = "leases", CreateLeaseContainerIfNotExists = true)] IReadOnlyList<dynamic> changes, ILogger log) { foreach (var change in changes) { await ProcessChangeAsync(change); } } ```

Change Feed Processor Configuration

SettingDefaultRecommendation
MaxItems100500-1000 for throughput
PollInterval1s100-500ms for low latency
LeaseAcquireInterval13sLower for faster scaling
LeaseExpirationInterval60sLower for faster failover
LeaseRenewInterval17sAdjust based on workload

Verification

```bash # After changes, monitor lag # Should see decreasing lag

# Check processor health var health = await processor.GetHealthStatus(); var lag = await processor.GetEstimatedLag();

Console.WriteLine($"Health: {health}, Lag: {lag}");

ChangeFeedProcessor.Lag< 1000
Processing latency< 1 second
  • [Fix Azure Cosmos DB Throttling](/articles/fix-azure-cosmos-db-throttling)
  • [Fix Azure Cosmos DB Partition Hotspot](/articles/fix-azure-cosmos-db-partition-hotspot)
  • [Fix Azure Cosmos DB SQL Query Slow](/articles/fix-azure-cosmos-db-sql-query-slow)
  • [Technical troubleshooting: Fix Azure Aks Pod Crashloopbackoff Issue in Azure](azure-aks-pod-crashloopbackoff)
  • [Technical troubleshooting: Fix Azure Api Management Policy Expression Runtime](azure-api-management-policy-expression-runtime-error)
  • [Technical troubleshooting: Fix Azure App Configuration Feature Flag Not Refre](azure-app-configuration-feature-flag-not-refreshing)
  • [Technical troubleshooting: Fix Azure App Service 503 Always On Disabled Issue](azure-app-service-503-always-on-disabled)
  • [Technical troubleshooting: Fix Azure Application Gateway Err SSL Unrecognized](azure-application-gateway-err-ssl-unrecognized-name-alert)

<script type="application/ld+json"> { "@context": "https://schema.org", "@type": "TechArticle", "headline": "Fix Azure Cosmos DB Change Feed Lag", "description": "Troubleshoot Cosmos DB change feed lag. Optimize consumer throughput, lease container settings, and processor scaling.", "url": "https://www.fixwikihub.com/fix-azure-cosmos-db-change-feed-lag", "publisher": { "@type": "Organization", "name": "FixWikiHub", "url": "https://www.fixwikihub.com" }, "author": { "@type": "Person", "name": "FixWikiHub Editorial Team" }, "datePublished": "2026-04-02T12:07:18.357Z", "dateModified": "2026-04-02T12:07:18.357Z" } </script>