Introduction

Azure SQL failover groups provide geo-replication and automated failover for SQL databases. When synchronization fails, the secondary database becomes stale, which weakens disaster recovery readiness: a failover may be blocked, or it may lose recent transactions.

Symptoms

Replication state warning:

```bash
$ az sql failover-group show \
    --name my-failover-group \
    --resource-group my-rg \
    --server my-server \
    --query '{State:replicationState,Role:replicationRole}'

{
  "State": "SUSPENDED",
  "Role": "Primary"
}
# A healthy group reports CATCH_UP; SUSPENDED means replication has stopped.
# The failover group resource does not expose lag; see Step 6 for replication_lag_sec.
```

Secondary offline:

```json
{
  "error": {
    "code": "GeoReplicationSecondaryOffline",
    "message": "The secondary database is offline or unreachable"
  }
}
```

Failover failed:

```bash
$ az sql failover-group set-primary \
    --name my-failover-group \
    --resource-group my-rg \
    --server my-server-secondary

"Error: Failover cannot be performed because the secondary database is not synchronized"
```

Common Causes

  1. Secondary server offline - Secondary SQL server not accessible
  2. Network connectivity - Cross-region network issues
  3. Secondary throttling - Secondary DTU/vCore limits exceeded
  4. Primary high write rate - Replication can't keep up
  5. Database removed - Database removed from failover group
  6. Region outage - Secondary region unavailable
  7. Configuration conflict - Misconfigured failover group

Step-by-Step Fix

Step 1: Check Failover Group Status

```bash
# Get failover group status
az sql failover-group show \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --query '{Name:name,State:replicationState,Databases:databases,Partner:partnerServers}'

# Check both primary and secondary
az sql failover-group show \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server-secondary \
  --query 'replicationState'
```

Step 2: Check Secondary Database Health

```bash
# Check secondary server status
az sql server show \
  --name my-server-secondary \
  --resource-group my-rg \
  --query '{Name:name,State:state,PublicAccess:publicNetworkAccess}'

# Check secondary database status
az sql db show \
  --name my-db \
  --server my-server-secondary \
  --resource-group my-rg \
  --query '{Name:name,Status:status,Sku:sku}'

# Check replication link status (run in the user database, not master)
sqlcmd -S my-server-secondary.database.windows.net -d my-db -U admin -P password -Q "
SELECT partner_server, partner_database, role_desc,
       replication_state_desc, last_replication
FROM sys.dm_geo_replication_link_status
"
```

Step 3: Check Network Connectivity

```bash
# Test connectivity to secondary
# From primary region VM:
sqlcmd -S my-server-secondary.database.windows.net -d master -U admin -P password -Q "SELECT @@VERSION"

# Check if secondary has firewall rules
az sql server firewall-rule list \
  --server my-server-secondary \
  --resource-group my-rg \
  --query '[].{Name:name,Start:startIpAddress,End:endIpAddress}'

# Allow Azure services if needed
az sql server firewall-rule create \
  --name AllowAzureServices \
  --server my-server-secondary \
  --resource-group my-rg \
  --start-ip-address 0.0.0.0 \
  --end-ip-address 0.0.0.0
```

Step 4: Check Secondary Resource Limits

```bash
# Check secondary resource usage over the last hour
sqlcmd -S my-server-secondary.database.windows.net -d my-db -U admin -P password -Q "
SELECT AVG(avg_cpu_percent)     AS avg_cpu,
       MAX(avg_cpu_percent)     AS max_cpu,
       AVG(avg_data_io_percent) AS avg_io,
       MAX(avg_data_io_percent) AS max_io
FROM sys.dm_db_resource_stats
WHERE end_time > DATEADD(hour, -1, GETUTCDATE())
"

# If the max values exceed 90%, the secondary is throttling: upgrade its tier
```

```bash
# Upgrade secondary tier if throttling
az sql db update \
  --name my-db \
  --server my-server-secondary \
  --resource-group my-rg \
  --edition Standard \
  --capacity 200  # Match primary
```
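The 90% rule from this step can be applied mechanically once the peak values are in hand. The following is a minimal sketch, assuming `max_cpu` and `max_io` have already been read from the query output; `is_throttling` is a hypothetical helper name, not part of any Azure tool.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when either peak exceeds the threshold.
# Arguments: max_cpu max_io [threshold]; the threshold defaults to 90 (percent).
is_throttling() {
  local max_cpu="$1" max_io="$2" threshold="${3:-90}"
  # awk is used because the resource stats are decimal values
  awk -v c="$max_cpu" -v i="$max_io" -v t="$threshold" \
    'BEGIN { exit !(c + 0 > t + 0 || i + 0 > t + 0) }'
}

# Sample values standing in for real query output
if is_throttling 97.5 42.0; then
  echo "secondary is throttling: consider scaling up"
else
  echo "secondary has headroom"
fi
```

Keeping the threshold as an optional third argument makes it easy to alert earlier (for example at 80%) without editing the function.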

Step 5: Replicate Missing Databases

```bash
# List databases in failover group
az sql failover-group show \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --query 'databases'

# List all databases on primary
az sql db list \
  --server my-server \
  --resource-group my-rg \
  --query '[].name'

# Add missing database to failover group
az sql failover-group update \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --add-db my-new-db
```
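Comparing the two listings by eye gets error-prone with more than a handful of databases. The sketch below shows one way to compute the difference, assuming both listings have been saved as newline-separated text and that group members may appear as full resource IDs; `missing_from_group` is a hypothetical helper, and the sample data is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print databases that exist on the primary but are not
# in the failover group. Both arguments are newline-separated lists.
#   $1  database names on the primary
#   $2  failover group members; resource IDs are reduced to their last path segment
# Assumes database names contain no whitespace.
missing_from_group() {
  local primary_dbs="$1" group_dbs="$2" db member found
  for db in $primary_dbs; do
    [ "$db" = "master" ] && continue   # system database, not a group member
    found=no
    for member in $group_dbs; do
      [ "${member##*/}" = "$db" ] && found=yes
    done
    [ "$found" = no ] && echo "$db"
  done
  return 0
}

# Sample data standing in for real CLI output
primary="master
orders
inventory
reporting"
group="/subscriptions/SUB/resourceGroups/my-rg/providers/Microsoft.Sql/servers/my-server/databases/orders
/subscriptions/SUB/resourceGroups/my-rg/providers/Microsoft.Sql/servers/my-server/databases/inventory"

missing_from_group "$primary" "$group"
```

With the sample data this prints `reporting`, the one database that would still need to be added to the group.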

Step 6: Check Replication Lag

```bash
# Check replication lag from the primary (run in the user database)
sqlcmd -S my-server.database.windows.net -d my-db -U admin -P password -Q "
SELECT partner_server, partner_database, role_desc,
       replication_state_desc, last_replication, replication_lag_sec
FROM sys.dm_geo_replication_link_status
"

# Lag > 60 seconds indicates an issue
# Check primary log write rate as a percentage of the service tier limit
sqlcmd -S my-server.database.windows.net -d my-db -U admin -P password -Q "
SELECT TOP 10 end_time, avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC
"
```
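The 60-second rule of thumb can be turned into a small check for scripts or cron jobs. This is a sketch under the assumption that the lag value has been extracted from the query output as a bare integer (or `NULL` when no value is reported); `lag_status` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a replication lag reading.
# Arguments: lag_seconds [threshold]; the threshold defaults to 60 seconds.
lag_status() {
  local lag="$1" threshold="${2:-60}"
  case "$lag" in
    ''|NULL|*[!0-9]*) echo "unknown"; return 0 ;;   # empty, NULL, or not an integer
  esac
  if [ "$lag" -gt "$threshold" ]; then
    echo "lagging"
  else
    echo "ok"
  fi
}

# Sample readings standing in for real query output
for lag in 12 60 300 NULL; do
  printf '%s -> %s\n' "$lag" "$(lag_status "$lag")"
done
```

Treating `NULL` as "unknown" rather than "ok" avoids reporting a healthy link when the query simply returned no lag value.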

Step 7: Force Failover Group Refresh

```bash
# If replication is stuck, remove and re-add the database
az sql failover-group update \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --remove-db my-db

# Confirm my-db no longer appears in the group before re-adding it
az sql failover-group show \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --query 'databases'

# Re-add to failover group
az sql failover-group update \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --add-db my-db
```
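Between the remove and the re-add, it helps to wait until the group membership has actually changed rather than sleeping for a fixed time. The sketch below shows a generic bounded retry loop; `wait_until` and `db_absent` are hypothetical helpers, and the membership listing is passed in as text so the example runs without Azure access.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a check command repeatedly until it succeeds
# or the attempt budget is used up.
#   $1 max attempts, $2 seconds between attempts, remaining args: the command
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Hypothetical check: succeeds when the database is absent from the listing.
# Members may be full resource IDs; only the last path segment is compared.
db_absent() {
  local db="$1" listing="$2" member
  for member in $listing; do
    [ "${member##*/}" = "$db" ] && return 1
  done
  return 0
}

# Sample listing standing in for the failover group's database list
listing="/subscriptions/SUB/resourceGroups/my-rg/providers/Microsoft.Sql/servers/my-server/databases/other-db"

if wait_until 3 0 db_absent my-db "$listing"; then
  echo "my-db removed from group"
else
  echo "my-db still present after 3 checks"
fi
```

In real use, the check passed to `wait_until` would be a function that fetches the current membership on every attempt, with a delay of several seconds instead of `0`.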

Step 8: Check for Region Issues

```bash
# Look up the regions involved and their paired regions
# (list-locations reports no health status; check Azure Service Health for outages)
az account list-locations \
  --query "[?name=='westus' || name=='eastus'].{Name:name,DisplayName:displayName,Paired:metadata.pairedRegion[0].name}"

# Confirm which region hosts the secondary server
az resource show \
  --ids /subscriptions/SUB/resourceGroups/my-rg/providers/Microsoft.Sql/servers/my-server-secondary \
  --query '{Name:name,Location:location}'

# If the secondary region has issues, consider failing over to a new region
```

Step 9: Verify Grace Period Settings

```bash
# Check failover policy and grace period (reported in minutes)
az sql failover-group show \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --query '{Policy:readWriteEndpoint.failoverPolicy,GracePeriodMinutes:readWriteEndpoint.failoverWithDataLossGracePeriodMinutes}'

# A grace period that is too short may cause premature failover
# Increase if needed (the CLI flag takes hours)
az sql failover-group update \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --grace-period 2  # 2 hours
```
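Units are easy to mix up here. The sketch below assumes the current grace period is available in minutes (as in the `failoverWithDataLossGracePeriodMinutes` property) while the `--grace-period` flag takes hours; both helper names are hypothetical and the sample value is made up.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for comparing a grace period reported in minutes
# with a target expressed in hours.

# Round minutes up to whole hours.
grace_minutes_to_hours() {
  echo $(( ($1 + 59) / 60 ))
}

# Succeeds when the current grace period (minutes) is below the target (hours).
grace_below_target() {
  [ "$1" -lt $(( $2 * 60 )) ]
}

current_minutes=60   # sample value standing in for the CLI output
target_hours=2

if grace_below_target "$current_minutes" "$target_hours"; then
  echo "current: $(grace_minutes_to_hours "$current_minutes")h, raise to ${target_hours}h"
else
  echo "grace period already meets the target"
fi
```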

Step 10: Set Up Monitoring

```bash
# Create alert for replication lag
az monitor metrics alert create \
  --name sql-failover-lag \
  --resource-group my-rg \
  --scopes /subscriptions/SUB/resourceGroups/my-rg/providers/Microsoft.Sql/servers/my-server/databases/my-db \
  --condition "avg replication_lag_seconds > 60" \
  --window-size 5m

# Create alert for secondary offline
az monitor activity-log alert create \
  --name sql-failover-offline \
  --resource-group my-rg \
  --condition category='ResourceHealth' and resourceType='Microsoft.Sql/servers/databases' and status='Unavailable'
```

Failover Group Replication States

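| State | Description | Action |
|-------|-------------|--------|
| CATCH_UP | Secondary is transactionally consistent and synchronizing continuously | Normal operation; monitor lag |
| SEEDING | Initial seeding in progress | Wait for completion |
| PENDING | Secondary creation is scheduled | Wait for seeding to start |
| SUSPENDED | Replication has stopped | Fix connectivity or secondary capacity |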
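For scripted checks, the state lookup can be expressed as a simple `case` statement. This is a sketch only, assuming the geo-replication link states `CATCH_UP`, `SEEDING`, `PENDING`, and `SUSPENDED`; `state_action` is a hypothetical helper and the action strings are illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a replication state to a suggested action.
# Surrounding double quotes (as in JSON output) are stripped first.
state_action() {
  local state
  state=$(printf '%s' "$1" | tr -d '"')
  case "$state" in
    CATCH_UP)  echo "healthy: monitor lag" ;;
    SEEDING)   echo "initial copy in progress: wait" ;;
    PENDING)   echo "secondary creation scheduled: wait" ;;
    SUSPENDED) echo "replication stopped: check connectivity and secondary capacity" ;;
    *)         echo "unrecognized state: inspect the link manually" ;;
  esac
}

# Sample states standing in for real CLI output
for s in '"CATCH_UP"' SUSPENDED UNKNOWN_VALUE; do
  printf '%s => %s\n' "$s" "$(state_action "$s")"
done
```

The catch-all branch matters: any state the script does not recognize is surfaced for manual inspection instead of being silently treated as healthy.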

Verification

```bash
# After fixing replication issues
az sql failover-group show \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server \
  --query '{State:replicationState,Databases:databases}'

# Should show:
# State: CATCH_UP
# Databases: [list of databases]

# Test failover capability (during maintenance window)
az sql failover-group set-primary \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server-secondary

# Verify failover succeeded
az sql failover-group show \
  --name my-failover-group \
  --resource-group my-rg \
  --server my-server-secondary \
  --query 'replicationRole'

# Should show: Primary
```

Prevention

To prevent Azure SQL failover group sync issues from recurring, implement these proactive measures:

1. Monitor Replication Lag

```yaml
groups:
- name: azure-sql-replication
  rules:
  - alert: AzureSQLFailoverGroupLag
    expr: |
      azure_sql_failover_group_lag_seconds > 60
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Azure SQL failover group replication lag exceeds 60 seconds"
```

2. Configure Appropriate DTU/vCore

```bash
# Ensure secondary has same or higher capacity
az sql db create --name my-db --resource-group my-rg --server my-server --service-objective S3

# For geo-replicated secondary
az sql db replica create --name my-db --partner-server my-server-secondary --resource-group my-rg --server my-server --service-objective S3

# Monitor DTU usage
az monitor metrics list --resource /subscriptions/.../servers/my-server/databases/my-db --metric dtu_consumption_percent
```

3. Test Failover Regularly

```bash
# Schedule quarterly failover test
# Test failover (non-production hours)
az sql failover-group set-primary --name my-failover-group --resource-group my-rg --server my-server-secondary

# Verify applications connect
# Check connection strings point to failover group listener

# Failback
az sql failover-group set-primary --name my-failover-group --resource-group my-rg --server my-server
```
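Because a drill changes which server is primary, it can be worth reviewing the exact command sequence before running it. The following sketch wraps the failover and failback calls in a dry-run mode; `run`, `failover_drill`, and the `DRY_RUN` variable are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical drill wrapper. With DRY_RUN=1 (the default here) commands are
# only printed, so the sequence can be reviewed before anything is failed over.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Arguments: failover group, resource group, primary server, secondary server
failover_drill() {
  local group="$1" rg="$2" primary="$3" secondary="$4"
  # Promote the secondary
  run az sql failover-group set-primary --name "$group" --resource-group "$rg" --server "$secondary"
  # Application connectivity checks would go here
  # Fail back to the original primary
  run az sql failover-group set-primary --name "$group" --resource-group "$rg" --server "$primary"
}

failover_drill my-failover-group my-rg my-server my-server-secondary
```

Setting `DRY_RUN=0` executes the same two commands for real, so the reviewed sequence and the executed sequence cannot drift apart.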

Best Practices Checklist

  • [ ] Monitor replication lag
  • [ ] Size secondary appropriately
  • [ ] Test failover quarterly
  • [ ] Use failover group listener in connection strings
  • [ ] Monitor DTU usage on both servers
  • [ ] Document failover procedures
