Introduction

As infrastructure scales from dozens to hundreds or thousands of hosts, Ansible playbook execution time can grow from minutes to hours. A playbook that completes in 2 minutes on 10 hosts might take 4 hours on 500 hosts if not properly optimized. The bottleneck is rarely the tasks themselves but rather the parallelism configuration, fact gathering overhead, SSH connection management, and execution strategy choices.

Slow execution impacts deployment windows, incident response times, and team productivity. Understanding and tuning Ansible's parallelism controls is essential for production-scale automation.

Symptoms

Playbooks take excessively long to complete:

```bash
$ time ansible-playbook deploy.yml -i production

PLAY [Deploy application] ***********
# Each task takes minutes as hosts process sequentially

TASK [Update packages] **************
Tuesday 14:00:00 - changed: [server-001]
Tuesday 14:01:30 - changed: [server-002]
Tuesday 14:03:00 - changed: [server-003]
# ... continuing one at a time ...

real 4h32m15s
user 0m45.123s
sys 0m12.456s
```

Fact gathering dominates execution time:

```
TASK [Gather Facts] ************
ok: [web-server-001]
ok: [web-server-002]
# 10 minutes of fact gathering for 100 hosts

PLAY RECAP *****************
web-server-001 : ok=5 changed=2 unreachable=0 failed=0
# Total time: 45 minutes, of which 30 minutes was fact gathering
```

Low CPU utilization on control node:

```bash
# During playbook run (-o picks the oldest matching process, i.e. the parent)
$ top -p "$(pgrep -o -f ansible-playbook)"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
12345 admin     20   0  245624  45124  12345 S   5.2   1.2   145:23.45 ansible-playbook
# Only 5% CPU - control node is mostly idle waiting for hosts
```

Tower job queue backing up:

```
Tower Dashboard:
  Running Jobs: 5 (each taking 2+ hours)
  Pending Jobs: 47
  Average Job Duration: 2h 15m
```

Connection timeouts during large-scale runs:

```
TASK [Deploy application] *******************************************************
fatal: [server-150]: FAILED! => {"msg": "Failed to connect to the host via ssh: Connection timed out during SSH handshake"}
```

Common Causes

1. Default Forks Too Low

The default forks = 5 means only 5 hosts execute simultaneously:

```bash
# With 100 hosts and forks=5:
# Each task batch processes 5 hosts at a time
# 100 hosts / 5 forks = 20 sequential batches
# If each batch takes 30 seconds: 20 * 30s = 10 minutes per task
```
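The batch arithmetic above can be sketched as a small shell function. This is a simplified model, not a description of how Ansible schedules work internally: it assumes every batch takes the same time, and `estimate_task_seconds` and its parameter names are hypothetical, chosen only for illustration.

```bash
# Rough per-task wall-clock estimate when hosts are processed in batches.
# Assumption: every batch of hosts takes the same number of seconds.
estimate_task_seconds() {
  local hosts=$1 forks=$2 batch_seconds=$3
  # Ceiling division: a partial final batch still costs a full round
  local batches=$(( (hosts + forks - 1) / forks ))
  echo $(( batches * batch_seconds ))
}

estimate_task_seconds 100 5 30    # 20 batches * 30s, prints 600
estimate_task_seconds 100 50 30   # 2 batches * 30s, prints 60
```

In this model, raising forks from 5 to 50 cuts a 10-minute task to about one minute, provided the control node can sustain 50 concurrent workers.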

2. Fact Gathering Not Cached

Every play gathers facts from scratch:

```yaml
- name: Play 1
  hosts: all
  # Implicit: gather_facts: yes (default)
  # Gathers facts from ALL hosts

- name: Play 2
  hosts: all
  # Gathers facts AGAIN from ALL hosts
```

3. SSH Connection Overhead

Without pipelining and connection reuse, each task pays the full SSH round trip on every host:

```
# Without pipelining:
# 1. SSH connect -> 2. Transfer module -> 3. Execute -> 4. Return result -> 5. Disconnect
# Overhead: 1-3 seconds per host per task

# With 100 hosts and 20 tasks:
# 100 * 20 * 2s = 4000s = about 67 minutes of overhead alone, summed across hosts
```
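The overhead figure is a total summed across all hosts; how much of it shows up as wall-clock time depends on how many hosts run at once. A minimal sketch of both numbers follows, assuming a fixed per-task SSH cost and evenly sized batches. The helper name `ssh_overhead_seconds` and its parameters are hypothetical.

```bash
# Prints "<cumulative seconds> <wall-clock seconds>" for SSH overhead.
# Assumptions: fixed overhead per host per task, hosts processed in
# batches of size <forks>.
ssh_overhead_seconds() {
  local hosts=$1 tasks=$2 per_task_seconds=$3 forks=$4
  local cumulative=$(( hosts * tasks * per_task_seconds ))
  local batches=$(( (hosts + forks - 1) / forks ))
  local wall_clock=$(( batches * tasks * per_task_seconds ))
  echo "$cumulative $wall_clock"
}

ssh_overhead_seconds 100 20 2 5    # prints "4000 800"
ssh_overhead_seconds 100 20 2 50   # prints "4000 80"
```

The cumulative cost does not change with forks; only reducing the per-task overhead (pipelining, connection reuse) shrinks it.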

4. Serial Execution Limiting Parallelism

Using serial: 1 or low serial values:

```yaml
- name: Rolling deploy
  hosts: webservers
  serial: 1  # Only 1 host at a time!
  # For 50 hosts, this means 50 sequential deployments
```

5. Strategy Plugin Overhead

The linear strategy waits for every host to finish a task before any host starts the next one:

```yaml
- name: Deploy
  hosts: all
  strategy: linear  # Default - waits for slowest host
  # If one host is slow, all others wait
```
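The cost of that waiting can be seen with a small model. The sketch below is illustrative only: `linear_vs_free` is a hypothetical helper, the durations are made-up numbers, and it assumes forks is at least the number of hosts so that every host runs concurrently. Under linear, each task costs as much as its slowest host; without the per-task barrier, the play costs as much as the slowest host's total.

```bash
# Each argument is one host's per-task durations in seconds, e.g. "10 50".
linear_vs_free() {
  local -a slowest_per_task=()
  local free_total=0 linear_total=0 host host_total i t
  for host in "$@"; do
    host_total=0
    i=0
    for t in $host; do          # intentional word splitting on spaces
      host_total=$(( host_total + t ))
      if (( t > ${slowest_per_task[i]:-0} )); then
        slowest_per_task[i]=$t
      fi
      i=$(( i + 1 ))
    done
    if (( host_total > free_total )); then
      free_total=$host_total
    fi
  done
  for t in "${slowest_per_task[@]}"; do
    linear_total=$(( linear_total + t ))
  done
  echo "linear=$linear_total free=$free_total"
}

# Host A is slow on task 2, host B is slow on task 1
linear_vs_free "10 50" "50 10"   # prints "linear=100 free=60"
```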

6. Callback Plugin Overhead

Some callbacks add significant overhead:

```ini
# ansible.cfg
[defaults]
# Heavy callbacks
# (profile_tasks and timer ship in the ansible.posix collection)
callbacks_enabled = ansible.posix.profile_tasks, ansible.posix.timer, my_custom_logging
# Each callback processes every event
```

Step-by-Step Fix

Step 1: Diagnose Performance Bottlenecks

Measure where time is spent:

```bash
# Enable profiling
export ANSIBLE_CALLBACKS_ENABLED=profile_tasks,profile_roles

ansible-playbook deploy.yml

# Output shows time per task:
# Tuesday 14:30:00 - TASK: Gather Facts (0:02:15.123)
# Tuesday 14:32:15 - TASK: Install packages (0:05:30.456)
# Tuesday 14:37:45 - TASK: Configure app (0:01:45.789)
```

Time the playbook components:

```bash
# Time fact gathering only
time ansible all -m setup

# Time a single task
time ansible all -m ping

# Check SSH connection overhead
time ansible all -m command -a "echo test"
```
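To compare these measurements side by side, a small wrapper can print one line per step. This is a sketch rather than an established tool: `time_step` is a hypothetical helper that relies on bash's built-in `SECONDS` counter, so it only has whole-second resolution.

```bash
# Run a command, discard its output, and report elapsed whole seconds.
time_step() {
  local label=$1
  shift
  local start=$SECONDS
  "$@" >/dev/null 2>&1
  local status=$?
  echo "$label: $(( SECONDS - start ))s (exit $status)"
}

# Harmless demonstration with a no-op command
time_step "noop" true

# Against a real inventory, the calls would look like:
#   time_step "facts" ansible all -m setup
#   time_step "ping"  ansible all -m ping
#   time_step "echo"  ansible all -m command -a "echo test"
```

If the `facts` line dominates, fact caching is the first thing to tune; if `ping` and `echo` are both slow, connection overhead is the more likely culprit.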

Monitor control node resources:

```bash
# Watch during playbook run
watch -n 1 "ps aux | grep ansible; echo '---'; free -h; echo '---'; uptime"
```

Step 2: Increase Forks Configuration

Set appropriate forks for your environment:

```bash
# Run with more forks
ansible-playbook deploy.yml -f 50

# Or configure permanently in ansible.cfg
```

```ini
# ansible.cfg
[defaults]
# Set forks based on your infrastructure
# Rule of thumb: forks = number of hosts / 10, minimum 10, maximum 100-200
forks = 50

# For very large runs (1000+ hosts)
# forks = 100

# Consider control node resources:
# Each fork uses ~10-30MB memory
# 100 forks = ~1-3GB additional memory
```

Calculate optimal forks:

```bash
#!/bin/bash
# calculate_forks.sh

# --list-hosts prints a "hosts (N):" header line first, so skip it
HOST_COUNT=$(ansible all --list-hosts | tail -n +2 | wc -l)
CONTROL_MEMORY_GB=$(free -g | awk '/^Mem:/{print $2}')

# Estimate: 20MB per fork, leave 50% memory for other processes
MAX_FORKS_BY_MEMORY=$((CONTROL_MEMORY_GB * 1024 / 20 / 2))

# Estimate: forks = host_count / 10
SUGGESTED_FORKS=$((HOST_COUNT / 10))

# Take the smaller estimate, then clamp to the range 10-100
FINAL_FORKS=$((SUGGESTED_FORKS < MAX_FORKS_BY_MEMORY ? SUGGESTED_FORKS : MAX_FORKS_BY_MEMORY))
FINAL_FORKS=$((FINAL_FORKS > 100 ? 100 : FINAL_FORKS))
FINAL_FORKS=$((FINAL_FORKS < 10 ? 10 : FINAL_FORKS))

echo "Host count: $HOST_COUNT"
echo "Control node memory: ${CONTROL_MEMORY_GB}GB"
echo "Suggested forks: $FINAL_FORKS"
```

Step 3: Enable SSH Pipelining

Reduce SSH connection overhead:

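```ini
# ansible.cfg
[defaults]
pipelining = True

[ssh_connection]
# Set each key only once per section; a duplicated key is a parse error.
# Pipelining with sudo requires requiretty to be disabled in sudoers on managed hosts.
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ControlPath=/tmp/ansible-ssh-%h-%p-%r

# For faster SSH connections
scp_if_ssh = smart
transfer_method = smart
```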

Verify pipelining is working:

```bash
# Run with debug output and look for pipelining messages
# ("pipelin" matches both "pipeline" and "pipelining")
ANSIBLE_DEBUG=1 ansible all -m ping 2>&1 | grep -i pipelin

# Compare performance
time ansible all -m ping                           # With pipelining
time ANSIBLE_PIPELINING=False ansible all -m ping  # Without pipelining
```

Step 4: Configure Fact Caching

Cache facts to avoid repeated gathering:

```ini
# ansible.cfg
[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /var/cache/ansible/facts
# 24 hours (on its own line: ansible.cfg only accepts ';' for inline comments)
fact_caching_timeout = 86400

# Or use Redis for distributed caching
# fact_caching = redis
# fact_caching_connection = localhost:6379:0
```

Create the cache directory:

```bash
sudo mkdir -p /var/cache/ansible/facts
sudo chown "$USER:$USER" /var/cache/ansible/facts
```

Use selective fact gathering:

```yaml
# playbook.yml
- name: First play - gather facts once
  hosts: all
  gather_facts: yes
  tasks:
    - name: Cache facts
      set_fact:
        facts_cached: true
      delegate_to: localhost
      delegate_facts: true

- name: Second play - use cached facts
  hosts: all
  gather_facts: no  # Don't re-gather
  tasks:
    - name: Use cached fact
      debug:
        msg: "IP is {{ ansible_default_ipv4.address }}"
```

Clear cache when needed:

```bash
# Clear the fact cache for every host in the inventory before the run
ansible-playbook deploy.yml --flush-cache

# Or remove the cache files directly
rm -rf /var/cache/ansible/facts/*
```
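Before wiping the whole cache, it can help to see which entries are already past the timeout. The sketch below assumes the `jsonfile` cache from the configuration above, with one file per host, and assumes expiry tracks each file's modification time. `stale_fact_files` is a hypothetical helper name.

```bash
# List cache files last modified longer ago than the cache timeout.
stale_fact_files() {
  local cache_dir=$1 timeout_seconds=$2
  # Nothing to report if the cache directory does not exist yet
  [ -d "$cache_dir" ] || return 0
  # find -mmin works in minutes, so convert the timeout
  local minutes=$(( timeout_seconds / 60 ))
  find "$cache_dir" -type f -mmin +"$minutes"
}

# 86400 matches fact_caching_timeout in ansible.cfg
stale_fact_files /var/cache/ansible/facts 86400
```

An empty result means every cached host is still within the timeout, so a slow run is not explained by expired facts.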

Step 5: Optimize Strategy Configuration

Use the free strategy for faster execution:

```yaml
# playbook.yml
- name: Fast parallel deployment
  hosts: all
  strategy: free  # Don't wait for other hosts
  tasks:
    - name: Task 1
      # Hosts proceed to Task 2 as soon as they finish Task 1
      # instead of waiting for all hosts to finish Task 1
      debug:
        msg: "Task 1"  # placeholder module so the example runs

    - name: Task 2
      # Some hosts may start this while others are still on Task 1
      debug:
        msg: "Task 2"  # placeholder module so the example runs
```

```ini
# ansible.cfg
[defaults]
# Strategy for faster execution
strategy = free

# For ordered execution with some parallelism
# strategy = linear (default)
```

For rolling updates with proper parallelism:

```yaml
- name: Rolling deployment
  hosts: webservers
  serial: "20%"  # Process 20% of hosts at a time
  # For 100 hosts: 20 at a time
  # Better than serial: 1

  tasks:
    - name: Deploy
      # ...
```
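The effect of a percentage `serial` value on the number of passes can be estimated up front. This sketch assumes the percentage is rounded down to a whole number of hosts with a minimum of one host per pass, and that a final partial pass takes the remainder; `serial_passes` is a hypothetical helper, not an Ansible command.

```bash
# Prints "<hosts per pass> <number of passes>" for a percentage serial value.
serial_passes() {
  local hosts=$1 percent=$2
  local batch_size=$(( hosts * percent / 100 ))
  # Assumption: at least one host is processed per pass
  if (( batch_size < 1 )); then
    batch_size=1
  fi
  local passes=$(( (hosts + batch_size - 1) / batch_size ))
  echo "$batch_size $passes"
}

serial_passes 100 20   # prints "20 5"
serial_passes 50 2     # prints "1 50", equivalent to serial: 1
```

Small percentages on small inventories collapse to one host per pass, which brings back the fully sequential behaviour the percentage was meant to avoid.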

Step 6: Optimize Task Execution

Reduce unnecessary operations:

```yaml
# Bad: Runs on every host every time
- name: Install package
  yum:
    name: nginx
    state: present  # Always checks/installs

# Better: Conditional execution
# (ansible_facts.packages is only populated after the package_facts module has run)
- name: Install package
  yum:
    name: nginx
    state: present
  when: "'nginx' not in ansible_facts.packages"  # Only runs if not already installed

# Best: Use check mode in CI, then apply
- name: Check package status
  command: rpm -q nginx
  register: nginx_check
  changed_when: false
  failed_when: false

- name: Install package
  yum:
    name: nginx
    state: present
  when: nginx_check.rc != 0
```

Use async for long-running tasks:

```yaml
- name: Long-running update
  yum:
    name: "*"
    state: latest
  async: 3600  # 1 hour timeout
  poll: 0      # Don't wait, move on
  register: update_async

- name: Check update status later
  async_status:
    jid: "{{ update_async.ansible_job_id }}"
  register: update_result
  until: update_result.finished
  retries: 120
  delay: 30
```
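In the example, `retries` times `delay` (120 * 30s) equals the `async` timeout of 3600 seconds, so the status check can keep polling for as long as the job is allowed to run. A quick sanity check for other values might look like the following sketch; `poll_window_covers` is a hypothetical helper, and the rule it encodes is a reasonable convention rather than something Ansible enforces.

```bash
# Succeeds when retries * delay is at least the async timeout (all in seconds).
poll_window_covers() {
  local async_timeout=$1 retries=$2 delay=$3
  (( retries * delay >= async_timeout ))
}

if poll_window_covers 3600 120 30; then
  echo "polling window covers the async timeout"
else
  echo "polling may give up before the job finishes"
fi
```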

Step 7: Configure Tower/AWX Performance

For Tower environments:

```python
# /etc/tower/settings.py
# Increase parallel job capacity
# Environment variable values are passed as strings.
# The fact cache plugin is selected with ANSIBLE_CACHE_PLUGIN* variables.
AWX_TASK_ENV = {
    'ANSIBLE_FORKS': '50',
    'ANSIBLE_PIPELINING': 'True',
    'ANSIBLE_GATHERING': 'smart',
    'ANSIBLE_CACHE_PLUGIN': 'jsonfile',
    'ANSIBLE_CACHE_PLUGIN_CONNECTION': '/var/lib/awx/facts_cache',
}

# Configure instance capacity
CLUSTER_HOST_CAPACITY = 100  # Max concurrent forks per instance
```

For Kubernetes AWX:

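```yaml
# awx-deployment.yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  # The AWX operator takes extra task environment variables
  # as a YAML list inside a block string
  task_extra_env: |
    - name: ANSIBLE_FORKS
      value: "50"
    - name: ANSIBLE_PIPELINING
      value: "True"
```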

Verification

Test performance improvements:

```bash
# Run with timing
time ansible-playbook deploy.yml -f 50

# Compare before/after:
# Before (forks=5): 45 minutes
# After (forks=50): 8 minutes

# Check fact cache is working
ls -la /var/cache/ansible/facts/
# Should see files for each host

# Run again to see cache benefit
time ansible-playbook deploy.yml -f 50
# Second run should be faster due to cached facts

# Verify pipelining with debug
ANSIBLE_DEBUG=1 ansible all -m ping 2>&1 | grep -c "Pipelining"
# Should show pipelining is enabled
```

Monitor resource usage during execution:

```bash
# Watch control node during playbook run
watch -n 1 "ps aux | grep ansible-playbook | head -1; echo '---'; uptime"

# Expected: Higher CPU usage (efficient), stable memory
```

  • [ansible-ssh-unreachable-host-key-verification-failed](/articles/ansible-ssh-unreachable-host-key-verification-failed) - SSH connection issues
  • [ansible-inventory-dynamic-cloud-source-failed](/articles/ansible-inventory-dynamic-cloud-source-failed) - Inventory performance
  • [ansible-handler-not-triggered-notify-missing](/articles/ansible-handler-not-triggered-notify-missing) - Handler execution optimization
