Introduction
AWS Cloud Map provides service discovery for ECS services. When DNS queries fail to resolve service endpoints, containers can't discover other services by name, breaking inter-service communication despite services being healthy and running.
Symptoms
DNS resolution failure in containers:
```bash $ nslookup my-service.my-namespace.local Server: 10.0.0.2 Address: 10.0.0.2#53
** server can't find my-service.my-namespace.local: NXDOMAIN ```
Application errors:
```bash $ curl http://my-service.my-namespace.local:8080 curl: (6) Could not resolve host: my-service.my-namespace.local
# Python socket.gaierror: [Errno -2] Name or service not known
# Java java.net.UnknownHostException: my-service.my-namespace.local ```
Service discovery health checks failing:
```bash $ aws servicediscovery get-service --service-id srv-12345
"HealthCheckConfig": { "Type": "HTTP", "ResourcePath": "/health", "FailureThreshold": 1 } ```
Common Causes
- 1.Wrong namespace type - Using HTTP namespace instead of DNS
- 2.Service not registered - Instances not registering with Cloud Map
- 3.DNS namespace not associated - VPC not linked to private DNS namespace
- 4.Wrong DNS format - Incorrect hostname format for namespace type
- 5.Health check failures - Instances deregistered due to failed health checks
- 6.Missing service discovery configuration - ECS service not configured for discovery
- 7.VPC DNS resolution disabled - VPC setting prevents DNS queries
Step-by-Step Fix
- 1.Check logs for specific error messages
- 2.Verify configuration settings
- 3.Test network connectivity
- 4.Review recent changes
- 5.Apply corrective action
- 6.Verify the fix
Step 1: List Service Discovery Resources
```bash # List all namespaces aws servicediscovery list-namespaces \ --query 'Namespaces[*].[Id,Name,Type]'
# List services in namespace aws servicediscovery list-services \ --filters Name=NAMESPACE_ID,Values=ns-12345 \ --query 'Services[*].[Id,Name,DnsConfig]'
# Get service details aws servicediscovery get-service --service-id srv-12345 ```
Step 2: Check Namespace Type and Configuration
```bash # Get namespace details aws servicediscovery get-operation --operation-id OP_ID # Or aws servicediscovery list-namespaces --filters Name=NAME,Values=my-namespace
# Namespace types: # - DNS_PRIVATE: Uses Route 53 private hosted zone, resolves via DNS # - HTTP: Uses API calls, no DNS resolution ```
If using HTTP namespace, services won't resolve via DNS - you must use Cloud Map API:
# For HTTP namespace, discover via API
aws servicediscovery discover-instances \
--namespace-name my-namespace \
--service-name my-service \
--health-status HEALTHYStep 3: Verify DNS Namespace VPC Association
```bash # Get private DNS namespace details aws servicediscovery get-namespace --namespace-id ns-12345
# Check Route 53 hosted zone VPC associations aws route53 list-hosted-zones-by-vpc \ --vpc-id vpc-12345 \ --vpc-region us-east-1
# Or get hosted zone ID from namespace and check associations HOSTED_ZONE_ID=$(aws servicediscovery get-namespace --namespace-id ns-12345 \ --query 'Namespace.Properties.DnsProperties.HostedZoneId' --output text)
aws route53 get-hosted-zone --id $HOSTED_ZONE_ID \ --query 'VPCs[*].[VPCId,VPCRegion]' ```
If VPC not associated:
# Associate VPC with private hosted zone
aws route53 associate-vpc-with-hosted-zone \
--hosted-zone-id $HOSTED_ZONE_ID \
--vpc VPCId=vpc-12345,VPCRegion=us-east-1Step 4: Check ECS Service Discovery Configuration
```bash # Get ECS service discovery configuration aws ecs describe-services --cluster my-cluster --services my-service \ --query 'services[*].serviceRegistries'
# Should show: # [{ # "registryArn": "arn:aws:servicediscovery:...", # "port": 8080, # "containerName": "my-container" # }]
# If empty, service discovery not configured ```
Add service discovery to ECS service:
```bash # Create service discovery service first if needed aws servicediscovery create-service \ --name my-service \ --namespace-id ns-12345 \ --dns-config Type=A,DnsRecords=[{Type=A,TTL=60}] \ --health-check-custom-config FailureThreshold=1
# Update ECS service with discovery aws ecs create-service \ --cluster my-cluster \ --service-name my-service \ --task-definition my-task \ --service-registries 'registryArn=arn:aws:servicediscovery:region:account:service/srv-12345' ```
Step 5: Verify Service Instance Registration
```bash # List instances registered with service aws servicediscovery list-instances --service-id srv-12345 \ --query 'Instances[*].[Id,Attributes]'
# Check specific instance aws servicediscovery get-instance --service-id srv-12345 --instance-id ins-12345
# Attributes should include: # AWS_INSTANCE_IPV4: 10.0.1.50 # AWS_INSTANCE_PORT: 8080 ```
If no instances registered:
```bash # Check ECS task events aws ecs describe-tasks --cluster my-cluster --tasks TASK_ID \ --query 'tasks[*].containers[*].serviceDiscoveryEndpoints'
# Manual instance registration (for debugging) aws servicediscovery register-instance \ --service-id srv-12345 \ --instance-id unique-id \ --attributes AWS_INSTANCE_IPV4=10.0.1.50,AWS_INSTANCE_PORT=8080 ```
Step 6: Check Health Check Status
```bash # Get service health check configuration aws servicediscovery get-service --service-id srv-12345 \ --query 'Service.HealthCheckConfig'
# Health check types: # - HTTP: HTTP GET to specified path # - HTTPS: HTTPS GET # - TCP: TCP connection attempt # - CUSTOM: ECS health check managed
# List instances with health status aws servicediscovery list-instances --service-id srv-12345 \ --query 'Instances[*].[Id,HealthStatus]' ```
Instances with UNHEALTHY status won't be returned in discovery queries.
Fix health check issues:
```bash # Update health check configuration aws servicediscovery update-service \ --service-id srv-12345 \ --health-check-config Type=HTTP,ResourcePath=/health,FailureThreshold=3
# Or use custom health check (managed by ECS) aws servicediscovery update-service \ --service-id srv-12345 \ --health-check-custom-config FailureThreshold=2 ```
Step 7: Verify Correct DNS Query Format
```bash # For private DNS namespace with DNS records: # Format: service-name.namespace-name
# Example: my-service.my-namespace.local # NOT: my-service.my-namespace.local.example.com
# Test DNS resolution dig my-service.my-namespace.local @10.0.0.2
# Or nslookup my-service.my-namespace.local 10.0.0.2
# Check from inside container kubectl exec -it container -- nslookup my-service.my-namespace.local ```
Step 8: Check VPC DNS Settings
```bash # Get VPC DNS configuration aws ec2 describe-vpcs --vpc-ids vpc-12345 \ --query 'Vpcs[*].[EnableDnsSupport,EnableDnsHostnames]'
# Both should be true for service discovery to work: # EnableDnsSupport: true (DNS resolution enabled) # EnableDnsHostnames: true (DNS hostnames enabled) ```
If DNS disabled:
```bash # Enable DNS support aws ec2 modify-vpc-attribute --vpc-id vpc-12345 --enable-dns-support
# Enable DNS hostnames aws ec2 modify-vpc-attribute --vpc-id vpc-12345 --enable-dns-hostnames ```
Step 9: Check Route 53 Resolver Endpoints
For cross-VPC or hybrid resolution:
```bash # List Resolver endpoints aws route53resolver list-resolver-endpoints \ --filters Name=VPCId,Values=vpc-12345
# Check Resolver rules aws route53resolver list-resolver-rules \ --filters Name=VPCId,Values=vpc-12345
# Forwarding rules for external DNS aws route53resolver get-resolver-rule --resolver-rule-id rslvr-12345 ```
Step 10: Debug from Container
```bash # Run test container aws ecs run-task --cluster my-cluster --task-definition debug-task
# In container, test DNS nslookup my-service.my-namespace.local dig my-service.my-namespace.local
# Check DNS server cat /etc/resolv.conf
# Test Cloud Map API discovery aws servicediscovery discover-instances \ --namespace-name my-namespace \ --service-name my-service \ --health-status HEALTHY \ --query 'Instances[*].Attributes' ```
Step 11: Verify Service Discovery Service ARN
```bash # Get service discovery service ARN aws servicediscovery get-service --service-id srv-12345 \ --query 'Service.Arn'
# ECS service must reference correct ARN aws ecs describe-services --cluster my-cluster --services my-service \ --query 'services[*].serviceRegistries[*].registryArn'
# ARNs must match ```
Verification
```bash # Test DNS resolution from task aws ecs execute-command --cluster my-cluster --task TASK_ID \ --container my-container --command "nslookup my-service.my-namespace.local"
# Or run test task aws ecs run-task --cluster test-cluster --task-definition dns-test
# Should return IP address of healthy instances aws servicediscovery discover-instances \ --namespace-name my-namespace \ --service-name my-service \ --health-status HEALTHY
# Should return instance with AWS_INSTANCE_IPV4 attribute ```
Related Issues
- [Fix AWS ECS Task Stuck in Pending](/articles/fix-aws-ecs-task-pending)
- [Fix AWS ECS Service Unstable](/articles/fix-aws-ecs-service-unstable)
- [Fix DNS Resolution Failure](/articles/fix-dns-resolution-failure)
Related Articles
- [AWS troubleshooting: Fix IAM Permission Denied - Complete Tro](fix-iam-permission-denied)
- [AWS cloud troubleshooting: AWS ACM Certificate Pending Validation Because the](aws-acm-certificate-pending-validation-wrong-route53-zone)
- [AWS cloud troubleshooting: AWS ALB Returns 502 Because the Target Closed the ](aws-alb-502-target-closed-connection-keepalive-timeout-mismatch)
- [AWS cloud troubleshooting: Fix AWS ALB CreateListener TargetGroupNotFound Err](aws-alb-createlistener-targetgroupnotfound)
- [AWS cloud troubleshooting: Fix Aws Alb Lambda 502 Bad Gateway Issue in AWS](aws-alb-lambda-502-bad-gateway)
<script type="application/ld+json"> { "@context": "https://schema.org", "@type": "TechArticle", "headline": "Fix AWS ECS Service Discovery Not Resolving", "description": "Troubleshoot ECS service discovery DNS resolution issues. Fix Cloud Map namespaces, service registration, and DNS configuration.", "url": "https://www.fixwikihub.com/fix-aws-ecs-service-discovery-not-resolving", "publisher": { "@type": "Organization", "name": "FixWikiHub", "url": "https://www.fixwikihub.com" }, "author": { "@type": "Person", "name": "FixWikiHub Editorial Team" }, "datePublished": "2026-04-01T15:42:55.218Z", "dateModified": "2026-04-01T15:42:55.218Z" } </script>