Home / AWS / AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed Because the U

AWS

AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed Because the U

Fix EC2 instances with a failed system status check by distinguishing host impairment from guest OS issues and migrating the instance to healthy hardware when needed.

Published: Jan 22, 20269 min readBy FixWikiHub Editorial Team

Abstract illustration for a troubleshooting knowledge base category.

Introduction

Amazon EC2 instances run on physical host machines in AWS data centers. Each EC2 instance has two types of status checks: system status checks and instance status checks. System status checks detect problems with the underlying infrastructure—power, networking, hardware, or the hypervisor itself. Instance status checks detect problems within the guest operating system or the instance configuration.

When a system status check fails, the problem is below the virtual machine layer. Rebooting the operating system from within the instance typically has no effect because the issue exists in the physical infrastructure hosting the VM. The resolution requires migrating the instance to healthy hardware through AWS-controlled operations like stop-start or instance recovery.

Understanding the distinction between system and instance status checks is critical for efficient troubleshooting. Misdiagnosing a system check failure as an application or OS issue leads to wasted time and prolonged outages.

Symptoms

When EC2 system status check fails due to underlying system impairment, you will observe these symptoms:

EC2 console shows "Status check failed (System check)" for the instance
Instance is unreachable via SSH, RDP, or network
The instance appears as "running" but cannot be accessed
Reboot from within the OS or AWS console doesn't resolve the issue
CloudWatch alarm triggers for status check failure
Instance worked previously with no configuration changes

EC2 console display:

``` Instance State: running Status Checks: System reachability: 0/2 checks passed (impaired) Instance reachability: 2/2 checks passed

Status check message: Impaired ```

AWS CLI output:

```bash aws ec2 describe-instance-status --instance-ids i-0123456789abcdef0

# Output: { "InstanceStatuses": [{ "InstanceId": "i-0123456789abcdef0", "InstanceState": {"Name": "running"}, "InstanceStatus": { "Status": "ok", "Details": [{"Status": "passed", "Name": "reachability"}] }, "SystemStatus": { "Status": "impaired", "Details": [{"Status": "failed", "Name": "reachability"}] } }] } ```

CloudWatch alarm:

bash

Alarm Name: EC2-StatusCheck-System
State: ALARM
Metric: StatusCheckFailed_System
Value: 1.0 (failure)

Common Causes

Several factors cause EC2 system status check failures:

1.Hardware failure on host: Physical server hardware (CPU, memory, storage, network interface) fails or becomes unreliable.
2.Power or cooling issues: Data center power fluctuations or cooling problems affect the physical host.
3.Network impairment: Network connectivity issues between the host and AWS network infrastructure.
4.Hypervisor problems: Issues with the virtualization layer managing the instance.
5.Scheduled maintenance: AWS-initiated host maintenance causing temporary impairment.
6.Host capacity issues: The physical host becomes overloaded or unstable.
7.Storage system problems: EBS or instance store connectivity issues from the host.
8.AWS infrastructure events: Rare but significant infrastructure problems affecting multiple hosts.

Step-by-Step Fix

Follow these steps to diagnose and resolve EC2 system status check failures:

Step 1: Confirm which status check is failing

Verify the system check is the issue:

```bash # Check instance status aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].{System: SystemStatus.Status, Instance: InstanceStatus.Status}'

# Output confirming system check failure: { "System": "impaired", "Instance": "ok" }

# Get detailed status check results aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].SystemStatus' ```

Step 2: Check console output for additional context

Review the instance console output:

```bash # Get console output aws ec2 get-console-output \ --instance-id i-0123456789abcdef0 \ --query 'Output' \ --output text > console-output.txt

# Review for OS-level issues cat console-output.txt

# If console shows normal OS startup, problem is likely infrastructure # If console shows kernel panic or errors, might be instance-level issue ```

Step 3: Attempt instance reboot first

Try a simple reboot (may resolve some transient issues):

```bash # Reboot the instance aws ec2 reboot-instances --instance-ids i-0123456789abcdef0

# Wait 2-5 minutes sleep 300

# Check status again aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].SystemStatus.Status'

# If still "impaired", proceed to stop-start ```

Step 4: Stop and start the instance

Migrate to new host via stop-start:

```bash # Stop the instance (this moves it to new hardware when started) aws ec2 stop-instances --instance-ids i-0123456789abcdef0

# Wait for instance to stop aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0

# Verify stopped state aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].State.Name'

# Start the instance aws ec2 start-instances --instance-ids i-0123456789abcdef0

# Wait for instance to run aws ec2 wait instance-running --instance-ids i-0123456789abcdef0

# Check status checks aws ec2 wait instance-status-ok --instance-ids i-0123456789abcdef0

# Expected: Both status checks now pass ```

Step 5: Consider implications of stop-start

Before stopping, understand the consequences:

bash # Check if instance has instance store volumes (lost on stop) aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].BlockDeviceMappings[?DeviceName!=/dev/sda1`]'

# Check public IP (changes on stop-start unless Elastic IP) aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].PublicIpAddress'

# Check if Elastic IP is associated aws ec2 describe-addresses \ --filters "Name=instance-id,Values=i-0123456789abcdef0" ```

If instance has instance store or public IP that must be preserved:

```bash # Allocate Elastic IP if needed aws ec2 allocate-address --domain vpc

# Associate Elastic IP before stopping aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-12345678

# Now stop-start is safe ```

Step 6: Use instance recovery if stop-start is not viable

For instances that cannot be stopped:

```bash # Request instance recovery aws ec2 recover-instance --instance-id i-0123456789abcdef0

# Recovery attempts to restore on same or different hardware # Without stopping the instance

# Monitor recovery aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 ```

Step 7: Create replacement if recovery fails

If the instance cannot be recovered:

```bash # Create AMI from the instance (if still accessible at disk level) aws ec2 create-image \ --instance-id i-0123456789abcdef0 \ --name "recovery-$(date +%Y%m%d)" \ --no-reboot

# Launch new instance from AMI aws ec2 run-instances \ --image-id ami-newlycreated \ --instance-type t3.medium \ --subnet-id subnet-12345 \ --key-name my-key

# Or launch from original AMI and restore data from EBS snapshot ```

Step 8: Verify recovery

Confirm the instance is healthy:

```bash # Check status checks aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].{System: SystemStatus.Status, Instance: InstanceStatus.Status}'

# Expected output: { "System": "ok", "Instance": "ok" }

# Test connectivity ssh ec2-user@$(aws ec2 describe-instances \ --instance-ids i-0123456789abcdef0 \ --query 'Reservations[0].Instances[0].PublicIpAddress' \ --output text)

# Verify application health curl http://instance-ip/health ```

Verification

After recovery, verify complete functionality:

```bash # Verify status checks pass aws ec2 describe-instance-status \ --instance-ids i-0123456789abcdef0 \ --query 'InstanceStatuses[0].SystemStatus.Status' # Expected: "ok"

# Check CloudWatch metrics are normal aws cloudwatch get-metric-statistics \ --namespace AWS/EC2 \ --metric-name StatusCheckFailed_System \ --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \ --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \ --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \ --period 300 \ --statistics Average # Expected: All values should be 0.0

# Verify application responds curl -f http://instance-ip:8080/health || echo "Health check failed" ```

Prevention

To prepare for and prevent system status check failures:

1.Use Elastic IPs: Prevent IP changes during stop-start recovery.

bash

aws ec2 allocate-address --domain vpc
aws ec2 associate-address --instance-id i-xxxx --allocation-id eipalloc-xxxx

1.Use EBS-backed instances: EBS persists data across stop-start.

bash

# Check root device type
aws ec2 describe-instances --query 'Reservations[0].Instances[0].RootDeviceType'
# "ebs" is preferred over "instance-store"

1.Set up CloudWatch alarms for status checks:

bash

aws cloudwatch put-metric-alarm \
  --alarm-name "ec2-system-check-i-xxxx" \
  --metric-name StatusCheckFailed_System \
  --namespace AWS/EC2 \
  --dimensions Name=InstanceId,Value=i-xxxx \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:region:account:alerts

1.Configure automatic recovery:

bash

aws cloudwatch put-metric-alarm \
  --alarm-name "ec2-auto-recover-i-xxxx" \
  --metric-name StatusCheckFailed_System \
  --namespace AWS/EC2 \
  --dimensions Name=InstanceId,Value=i-xxxx \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:automate:region:ec2:recover

1.Use Auto Scaling Groups: ASG automatically replaces unhealthy instances.
2.Maintain backups and AMIs: Regular AMI creation for quick replacement.

bash

# Automated AMI creation
aws ec2 create-image \
  --instance-id i-xxxx \
  --name "backup-$(date +%Y%m%d)" \
  --no-reboot \
  --description "Automated backup"

1.Document recovery procedures: Create runbooks with specific instance IDs and recovery steps.

markdown

## Instance Recovery Runbook
1. Check status: `aws ec2 describe-instance-status`
2. Attempt reboot: `aws ec2 reboot-instances`
3. Stop-start: `aws ec2 stop-instances && aws ec2 start-instances`
4. Create replacement: Launch from AMI ami-xxxx

[AWS troubleshooting: Fix IAM Permission Denied - Complete Tro](fix-iam-permission-denied)
[AWS cloud troubleshooting: AWS ACM Certificate Pending Validation Because the](aws-acm-certificate-pending-validation-wrong-route53-zone)
[AWS cloud troubleshooting: AWS ALB Returns 502 Because the Target Closed the ](aws-alb-502-target-closed-connection-keepalive-timeout-mismatch)
[AWS cloud troubleshooting: Fix AWS ALB CreateListener TargetGroupNotFound Err](aws-alb-createlistener-targetgroupnotfound)
[AWS cloud troubleshooting: Fix Aws Alb Lambda 502 Bad Gateway Issue in AWS](aws-alb-lambda-502-bad-gateway)

Was this guide helpful?

Related search paths

People also search for

If the symptom is close but not identical, these search paths usually surface the right neighboring fixes faster than scrolling the full archive.

AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed AWS AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed troubleshooting AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed fix Fix EC2 instances with a failed system status check by distinguishing host impairment from guest OS issues and migrating the instance to healthy hardware when needed AWS Fix EC2 instances with a failed system status check by distinguishing host impairment from guest OS issues and migrating the instance to healthy hardware when needed

Explore Related Topics

Browse Guides from Other Categories

Discover troubleshooting guides from related categories to expand your knowledge.

FAQ

AWS Troubleshooting FAQs

Common questions about troubleshooting and preventing similar issues

How do I know if this aws-errors troubleshooting guide applies to my situation?

This guide is designed for aws-errors issues. If you're experiencing similar symptoms described in the article, follow the step-by-step instructions. Start with the most common causes and work through the diagnostic process.

Is it safe to follow these aws-errors troubleshooting steps?

Yes, all steps are designed to be safe and non-destructive. We recommend creating backups before making significant changes and testing each step before proceeding to the next.

How long does it typically take to resolve this type of aws-errors issue?

Most aws-errors issues can be resolved within 30 minutes to 2 hours, depending on the complexity and root cause. Follow the troubleshooting flow to identify and fix the problem efficiently.

How can I prevent this aws-errors issue from happening again?

Regular maintenance, monitoring, and following best practices for aws-errors configuration can help prevent recurrence. Consider implementing automated checks and alerts for early detection.

Written by

FixWikiHub Editorial Team

Our editorial team consists of experienced DevOps engineers, systems administrators, and cloud architects with hands-on experience in production environments across AWS, Azure, GCP, and on-premises infrastructure.

Every guide undergoes technical review for accuracy and is updated when software versions, commands, or best practices change.

Last updated: Jan 22, 2026

About our team

Important Notice

Disclaimer & Safety Guidelines

The troubleshooting steps in this guide are provided for educational and informational purposes. Before applying any changes to production systems:

Test in a staging environment first — Always verify commands and configurations in a non-production environment before deploying to live systems.
Create backups — Ensure you have current backups of databases, configurations, and critical files before making changes.
Understand the impact — Review how each step may affect your specific environment, dependencies, and users.
Consult official documentation — This guide supplements, but does not replace, official vendor documentation and best practices.

FixWikiHub is not responsible for any damages arising from the use of this content. See our Terms of Use for more information.

Resources

Official Documentation & Further Reading

For authoritative information, consult the official documentation for the technologies discussed in this guide. Our troubleshooting content supplements, but does not replace, vendor documentation.

AWS Documentation — Official Amazon Web Services guides and API references
Kubernetes Documentation — Official Kubernetes documentation
Nginx Documentation — Official Nginx web server documentation
Apache Documentation — Official Apache HTTP Server documentation
Docker Documentation — Official Docker container documentation

AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed Because the U

Introduction

Symptoms

Common Causes

Step-by-Step Fix

Step 1: Confirm which status check is failing

Step 2: Check console output for additional context

Step 3: Attempt instance reboot first

Step 4: Stop and start the instance

Step 5: Consider implications of stop-start

Step 6: Use instance recovery if stop-start is not viable

Step 7: Create replacement if recovery fails

Step 8: Verify recovery

Verification

Prevention

Related Articles

People also search for

Browse Guides from Other Categories

WordPress

SSL

DNS

AWS Troubleshooting FAQs

FixWikiHub Editorial Team

Disclaimer & Safety Guidelines

Official Documentation & Further Reading

AWS cloud troubleshooting: AWS EC2 Instance Status Check Failed Because the U

Introduction

Symptoms

Common Causes

Step-by-Step Fix

Step 1: Confirm which status check is failing

Step 2: Check console output for additional context

Step 3: Attempt instance reboot first

Step 4: Stop and start the instance

Step 5: Consider implications of stop-start

Step 6: Use instance recovery if stop-start is not viable

Step 7: Create replacement if recovery fails

Step 8: Verify recovery

Verification

Prevention

Related Articles

People also search for

Share this guide

More AWS Troubleshooting Guides

Browse Guides from Other Categories

AWS Troubleshooting FAQs

FixWikiHub Editorial Team

Disclaimer & Safety Guidelines

Official Documentation & Further Reading