Introduction

Amazon EC2 instances run on physical host machines in AWS data centers. Each EC2 instance has two types of status checks: system status checks and instance status checks. System status checks detect problems with the underlying infrastructure—power, networking, hardware, or the hypervisor itself. Instance status checks detect problems within the guest operating system or the instance configuration.

When a system status check fails, the problem is below the virtual machine layer. Rebooting the operating system from within the instance typically has no effect because the issue exists in the physical infrastructure hosting the VM. The resolution requires migrating the instance to healthy hardware through AWS-controlled operations like stop-start or instance recovery.

Understanding the distinction between system and instance status checks is critical for efficient troubleshooting. Misdiagnosing a system check failure as an application or OS issue leads to wasted time and prolonged outages.

Symptoms

When EC2 system status check fails due to underlying system impairment, you will observe these symptoms:

  • EC2 console shows "Status check failed (System check)" for the instance
  • Instance is unreachable via SSH, RDP, or network
  • The instance appears as "running" but cannot be accessed
  • Reboot from within the OS or AWS console doesn't resolve the issue
  • CloudWatch alarm triggers for status check failure
  • Instance worked previously with no configuration changes

EC2 console display:

``` Instance State: running Status Checks: System reachability: 0/2 checks passed (impaired) Instance reachability: 2/2 checks passed

Status check message: Impaired ```

AWS CLI output:

```bash aws ec2 describe-instance-status --instance-ids i-0123456789abcdef0

# Output: { "InstanceStatuses": [{ "InstanceId": "i-0123456789abcdef0", "InstanceState": {"Name": "running"}, "InstanceStatus": { "Status": "ok", "Details": [{"Status": "passed", "Name": "reachability"}] }, "SystemStatus": { "Status": "impaired", "Details": [{"Status": "failed", "Name": "reachability"}] } }] } ```

CloudWatch alarm:

bash
Alarm Name: EC2-StatusCheck-System
State: ALARM
Metric: StatusCheckFailed_System
Value: 1.0 (failure)

Common Causes

Several factors cause EC2 system status check failures:

  1. 1.Hardware failure on host: Physical server hardware (CPU, memory, storage, network interface) fails or becomes unreliable.
  2. 2.Power or cooling issues: Data center power fluctuations or cooling problems affect the physical host.
  3. 3.Network impairment: Network connectivity issues between the host and AWS network infrastructure.
  4. 4.Hypervisor problems: Issues with the virtualization layer managing the instance.
  5. 5.Scheduled maintenance: AWS-initiated host maintenance causing temporary impairment.
  6. 6.Host capacity issues: The physical host becomes overloaded or unstable.
  7. 7.Storage system problems: EBS or instance store connectivity issues from the host.
  8. 8.AWS infrastructure events: Rare but significant infrastructure problems affecting multiple hosts.

Step-by-Step Fix

Follow these steps to diagnose and resolve EC2 system status check failures:

Step 1: Confirm which status check is failing

Verify the system check is the issue:

```bash # Check instance status aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].{System: SystemStatus.Status, Instance: InstanceStatus.Status}'

# Output confirming system check failure: { "System": "impaired", "Instance": "ok" }

# Get detailed status check results aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].SystemStatus' ```

Step 2: Check console output for additional context

Review the instance console output:

```bash # Get console output aws ec2 get-console-output \ --instance-id i-0123456789abcdef0 \ --query 'Output' \ --output text > console-output.txt

# Review for OS-level issues cat console-output.txt

# If console shows normal OS startup, problem is likely infrastructure # If console shows kernel panic or errors, might be instance-level issue ```

Step 3: Attempt instance reboot first

Try a simple reboot (may resolve some transient issues):

```bash # Reboot the instance aws ec2 reboot-instances --instance-ids i-0123456789abcdef0

# Wait 2-5 minutes sleep 300

# Check status again aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].SystemStatus.Status'

# If still "impaired", proceed to stop-start ```

Step 4: Stop and start the instance

Migrate to new host via stop-start:

```bash # Stop the instance (this moves it to new hardware when started) aws ec2 stop-instances --instance-ids i-0123456789abcdef0

# Wait for instance to stop aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0

# Verify stopped state aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].State.Name'

# Start the instance aws ec2 start-instances --instance-ids i-0123456789abcdef0

# Wait for instance to run aws ec2 wait instance-running --instance-ids i-0123456789abcdef0

# Check status checks aws ec2 wait instance-status-ok --instance-ids i-0123456789abcdef0

# Expected: Both status checks now pass ```

Step 5: Consider implications of stop-start

Before stopping, understand the consequences:

bash # Check if instance has instance store volumes (lost on stop) aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].BlockDeviceMappings[?DeviceName!=/dev/sda1`]'

# Check public IP (changes on stop-start unless Elastic IP) aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].PublicIpAddress'

# Check if Elastic IP is associated aws ec2 describe-addresses \ --filters "Name=instance-id,Values=i-0123456789abcdef0" ```

If instance has instance store or public IP that must be preserved:

```bash # Allocate Elastic IP if needed aws ec2 allocate-address --domain vpc

# Associate Elastic IP before stopping aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-12345678

# Now stop-start is safe ```

Step 6: Use instance recovery if stop-start is not viable

For instances that cannot be stopped:

```bash # Request instance recovery aws ec2 recover-instance --instance-id i-0123456789abcdef0

# Recovery attempts to restore on same or different hardware # Without stopping the instance

# Monitor recovery aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 ```

Step 7: Create replacement if recovery fails

If the instance cannot be recovered:

```bash # Create AMI from the instance (if still accessible at disk level) aws ec2 create-image \ --instance-id i-0123456789abcdef0 \ --name "recovery-$(date +%Y%m%d)" \ --no-reboot

# Launch new instance from AMI aws ec2 run-instances \ --image-id ami-newlycreated \ --instance-type t3.medium \ --subnet-id subnet-12345 \ --key-name my-key

# Or launch from original AMI and restore data from EBS snapshot ```

Step 8: Verify recovery

Confirm the instance is healthy:

```bash # Check status checks aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].{System: SystemStatus.Status, Instance: InstanceStatus.Status}'

# Expected output: { "System": "ok", "Instance": "ok" }

# Test connectivity ssh ec2-user@$(aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].PublicIpAddress' \ --output text)

# Verify application health curl http://instance-ip/health ```

Verification

After recovery, verify complete functionality:

```bash # Verify status checks pass aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].SystemStatus.Status' # Expected: "ok"

# Check CloudWatch metrics are normal aws cloudwatch get-metric-statistics \ --namespace AWS/EC2 \ --metric-name StatusCheckFailed_System \ --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \ --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \ --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \ --period 300 \ --statistics Average # Expected: All values should be 0.0

# Verify application responds curl -f http://instance-ip:8080/health || echo "Health check failed" ```

Prevention

To prepare for and prevent system status check failures:

  1. 1.Use Elastic IPs: Prevent IP changes during stop-start recovery.
bash
aws ec2 allocate-address --domain vpc
aws ec2 associate-address --instance-id i-xxxx --allocation-id eipalloc-xxxx
  1. 1.Use EBS-backed instances: EBS persists data across stop-start.
bash
# Check root device type
aws ec2 describe-instances --query 'Reservations[0].Instances[0].RootDeviceType'
# "ebs" is preferred over "instance-store"
  1. 1.Set up CloudWatch alarms for status checks:
bash
aws cloudwatch put-metric-alarm \
  --alarm-name "ec2-system-check-i-xxxx" \
  --metric-name StatusCheckFailed_System \
  --namespace AWS/EC2 \
  --dimensions Name=InstanceId,Value=i-xxxx \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:region:account:alerts
  1. 1.Configure automatic recovery:
bash
aws cloudwatch put-metric-alarm \
  --alarm-name "ec2-auto-recover-i-xxxx" \
  --metric-name StatusCheckFailed_System \
  --namespace AWS/EC2 \
  --dimensions Name=InstanceId,Value=i-xxxx \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:automate:region:ec2:recover
  1. 1.Use Auto Scaling Groups: ASG automatically replaces unhealthy instances.
  2. 2.Maintain backups and AMIs: Regular AMI creation for quick replacement.
bash
# Automated AMI creation
aws ec2 create-image \
  --instance-id i-xxxx \
  --name "backup-$(date +%Y%m%d)" \
  --no-reboot \
  --description "Automated backup"
  1. 1.Document recovery procedures: Create runbooks with specific instance IDs and recovery steps.
markdown
## Instance Recovery Runbook
1. Check status: `aws ec2 describe-instance-status`
2. Attempt reboot: `aws ec2 reboot-instances`
3. Stop-start: `aws ec2 stop-instances && aws ec2 start-instances`
4. Create replacement: Launch from AMI ami-xxxx

Related Articles

  • [AWS troubleshooting: Fix IAM Permission Denied - Complete Tro](fix-iam-permission-denied)
  • [AWS cloud troubleshooting: AWS ACM Certificate Pending Validation Because the](aws-acm-certificate-pending-validation-wrong-route53-zone)
  • [AWS cloud troubleshooting: AWS ALB Returns 502 Because the Target Closed the ](aws-alb-502-target-closed-connection-keepalive-timeout-mismatch)
  • [AWS cloud troubleshooting: Fix AWS ALB CreateListener TargetGroupNotFound Err](aws-alb-createlistener-targetgroupnotfound)
  • [AWS cloud troubleshooting: Fix Aws Alb Lambda 502 Bad Gateway Issue in AWS](aws-alb-lambda-502-bad-gateway)

<script type="application/ld+json"> { "@context": "https://schema.org", "@type": "TechArticle", "headline": "AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed Because the U", "description": "Professional guide to fix AWS EC2 Instance Status Check Failed Because the Underlying System Was Impaired. AWS cloud troubleshooting with step-by-step solutions. Learn best practices and prevention strategies.", "url": "https://www.fixwikihub.com/aws-ec2-instance-status-check-failed-system-impaired", "publisher": { "@type": "Organization", "name": "FixWikiHub", "url": "https://www.fixwikihub.com" }, "author": { "@type": "Person", "name": "FixWikiHub Editorial Team" }, "datePublished": "2026-01-22T17:34:14.810Z", "dateModified": "2026-01-22T17:34:14.810Z" } </script>