Introduction
In batch processing workflows with Ansible, a common pattern is to process records through multiple stages: read source data, transform records, validate results, and write to the destination. When validation fails at the final stage, the expectation is that no changes should persist. However, if each stage commits its work independently, partial results remain in the database, leaving the system in an inconsistent state.
This issue manifests when batch jobs fail validation but leave orphaned records, partial updates across related tables, or API calls that were executed but whose results were never acknowledged. The problem is particularly acute when processing financial transactions, inventory updates, or configuration changes where partial application causes data corruption.
Symptoms
The playbook processes a batch but fails at validation:
```
TASK [Process batch records] ************
ok: [localhost] => (item={'id': 1, 'amount': 100.00})
ok: [localhost] => (item={'id': 2, 'amount': 250.00})
ok: [localhost] => (item={'id': 3, 'amount': -50.00})  # Invalid negative amount
ok: [localhost] => (item={'id': 4, 'amount': 175.00})

TASK [Validate all records] *************
fatal: [localhost]: FAILED! => {"msg": "Validation failed: Record 3 has invalid amount -50.00"}
```
But checking the database shows records 1, 2, and 4 were committed:
```sql
SELECT * FROM transactions WHERE batch_id = 'batch-2026-04-12-001';

 id | amount | batch_id             | status
----+--------+----------------------+---------
  1 | 100.00 | batch-2026-04-12-001 | pending
  2 | 250.00 | batch-2026-04-12-001 | pending
  4 | 175.00 | batch-2026-04-12-001 | pending
(3 rows)

-- Record 3 is missing, but related records from 1, 2, 4 exist
```
Related tables show partial updates:
```sql
SELECT * FROM transaction_audit WHERE batch_id = 'batch-2026-04-12-001';

 id | transaction_id | action  | created_at
----+----------------+---------+---------------------
  1 |              1 | CREATED | 2026-04-12 08:25:01
  2 |              2 | CREATED | 2026-04-12 08:25:02
  3 |              4 | CREATED | 2026-04-12 08:25:04
```
API calls were made but never marked complete:
```bash
# API endpoint shows partial state
$ curl -s https://api.example.com/batches/batch-2026-04-12-001 | jq
{
  "id": "batch-2026-04-12-001",
  "status": "processing",
  "records": [
    {"id": 1, "status": "created"},
    {"id": 2, "status": "created"},
    {"id": 4, "status": "created"}
  ],
  "validation": "never_run"
}
```
Common Causes
1. Auto-Commit in Database Modules
Each Ansible database task opens its own connection and commits when the task finishes, so an operation spread across several tasks is not atomic:
```yaml
- name: Insert record
  postgresql_query:
    query: "INSERT INTO table VALUES (...)"
  # Commits immediately after this task

- name: Insert related record
  postgresql_query:
    query: "INSERT INTO related_table VALUES (...)"
  # If this fails, first insert is already committed
```
2. Validation After Write Operations
The playbook writes data before validation completes:
```yaml
- name: Write all records
  postgresql_query:
    query: "INSERT INTO records VALUES (...)"
  loop: "{{ records }}"

- name: Validate written records
  postgresql_query:
    query: "SELECT COUNT(*) FROM records WHERE valid = false"
  # Too late - invalid records already written
```
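One way to avoid this ordering problem is to check the batch in memory before any write task runs. The following is a minimal sketch, assuming `records` is a list of dictionaries with a numeric `amount` field, as in the Symptoms output; the `invalid_records` fact name is chosen here for illustration only.

```yaml
# Sketch: reject the whole batch before anything is written.
# Assumes `records` holds dicts with numeric `amount` values.
- name: Collect records with negative amounts
  ansible.builtin.set_fact:
    invalid_records: "{{ records | selectattr('amount', 'lt', 0) | list }}"

- name: Stop before writing if any record is invalid
  ansible.builtin.assert:
    that:
      - invalid_records | length == 0
    fail_msg: "Validation failed for ids: {{ invalid_records | map(attribute='id') | list }}"
```

Because the assertion fails the play before the first insert, there is nothing to roll back for rules that can be checked without the database.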
3. API Calls Without Compensation
REST API calls execute immediately with no rollback mechanism:
```yaml
- name: Create resources via API
  uri:
    url: "https://api.example.com/resources"
    method: POST
    body: "{{ item }}"
  loop: "{{ resources }}"
  # Each POST is immediately applied - no transaction
```
4. File-Based State Without Atomic Writes
Writing state files without atomic replace:
```yaml
- name: Write partial results
  copy:
    content: "{{ partial_results }}"
    dest: /var/lib/batch/state.json
  # File written immediately, no rollback on failure
```
5. Missing Savepoint or Checkpoint Mechanism
Long-running batches have no checkpoints for partial rollback:
```yaml
- name: Process 10000 records
  command: process-batch --input records.csv
  # No way to resume or rollback if fails at record 7500
```
Step-by-Step Fix
Step 1: Implement Transaction Boundaries
Wrap all write operations in one explicit transaction. Each `postgresql_query` task opens its own database connection, so a `BEGIN` issued in one task does not carry over to the next. Keep the inserts and the validation in a single task so that they share one transaction:
```yaml
- name: Process batch with transaction
  hosts: localhost
  vars:
    batch_id: "batch-{{ ansible_date_time.iso8601_basic_short }}"

  tasks:
    - name: Process records
      block:
        # All statements below run on one connection. The module commits only
        # when every statement succeeds; a raised exception aborts the
        # transaction, and closing the connection rolls it back.
        - name: Insert and validate records in one transaction
          community.postgresql.postgresql_query:
            db: appdb
            # No query arguments are passed, so numeric values are cast with
            # the int/float filters before being rendered into the SQL text.
            query: |
              {% for item in records %}
              INSERT INTO transactions (id, amount, batch_id, status)
              VALUES ({{ item.id | int }}, {{ item.amount | float }}, '{{ batch_id }}', 'pending');
              {% endfor %}
              DO $$
              BEGIN
                IF EXISTS (
                  SELECT 1 FROM transactions
                  WHERE batch_id = '{{ batch_id }}' AND amount < 0
                ) THEN
                  RAISE EXCEPTION 'Validation failed: batch contains negative amounts';
                END IF;
              END
              $$;

      rescue:
        - name: Report failure
          debug:
            msg: "Batch {{ batch_id }} failed validation, all changes rolled back"
```
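When the batch can be expressed as a SQL script, another way to get a real transaction boundary is to hand the whole script to `psql` in one task. This is a sketch under stated assumptions: the script path `/var/lib/batch/batch.sql` is hypothetical, and `psql` is assumed to reach `appdb` without a password prompt.

```yaml
# Sketch: run a prepared SQL script as one transaction.
# /var/lib/batch/batch.sql is a hypothetical path used for illustration.
- name: Apply the batch script in one transaction
  ansible.builtin.command:
    argv:
      - psql
      - --dbname=appdb
      - --single-transaction
      - --set=ON_ERROR_STOP=1
      - --file=/var/lib/batch/batch.sql
  changed_when: true
```

With `--single-transaction`, psql wraps the script in one transaction, and `ON_ERROR_STOP` makes the first failing statement end the run with a non-zero exit code, so the task fails and nothing from the script is committed.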
Step 2: Implement Two-Phase Commit Pattern
Separate preparation from commitment for better control:
```yaml
- name: Two-phase batch processing
  hosts: localhost
  vars:
    # A date-based id keeps the same value for the whole play
    batch_id: "batch-{{ ansible_date_time.iso8601_basic_short }}"
    staging_table: "transactions_staging"

  tasks:
    - name: Phase 1 - Prepare all records in staging
      block:
        # A TEMP table would vanish when this task's connection closes,
        # so staging uses a regular table and every query filters by batch_id.
        - name: Create staging table
          community.postgresql.postgresql_query:
            db: appdb
            query: >
              CREATE TABLE IF NOT EXISTS {{ staging_table }} AS
              SELECT * FROM transactions WITH NO DATA

        - name: Load records to staging
          community.postgresql.postgresql_query:
            db: appdb
            query: >
              INSERT INTO {{ staging_table }} (id, amount, batch_id)
              VALUES (%s, %s, %s)
            positional_args:
              - "{{ item.id }}"
              - "{{ item.amount }}"
              - "{{ batch_id }}"
          loop: "{{ records }}"

        - name: Validate staged records
          community.postgresql.postgresql_query:
            db: appdb
            query: >
              SELECT COUNT(*) as invalid_count
              FROM {{ staging_table }}
              WHERE batch_id = %s AND (amount < 0 OR amount > 1000000)
            positional_args:
              - "{{ batch_id }}"
          register: validation

        - name: Check validation result
          set_fact:
            validation_passed: "{{ validation.query_result[0].invalid_count == 0 }}"

    - name: Phase 2 - Commit if validation passed
      when: validation_passed | bool
      block:
        # A single INSERT ... SELECT is atomic: all rows move or none do
        - name: Move staged records to production
          community.postgresql.postgresql_query:
            db: appdb
            query: >
              INSERT INTO transactions (id, amount, batch_id, status)
              SELECT id, amount, batch_id, 'committed'
              FROM {{ staging_table }}
              WHERE batch_id = %s
            positional_args:
              - "{{ batch_id }}"

        - name: Record batch completion
          community.postgresql.postgresql_query:
            db: appdb
            query: >
              INSERT INTO batch_history (id, record_count, status, completed_at)
              VALUES (%s, %s, 'success', NOW())
            positional_args:
              - "{{ batch_id }}"
              - "{{ records | length }}"

    - name: Phase 2 - Abort if validation failed
      when: not (validation_passed | bool)
      block:
        - name: Discard staged records
          community.postgresql.postgresql_query:
            db: appdb
            query: "DELETE FROM {{ staging_table }} WHERE batch_id = %s"
            positional_args:
              - "{{ batch_id }}"

        - name: Record batch failure
          community.postgresql.postgresql_query:
            db: appdb
            query: >
              INSERT INTO batch_history (id, record_count, status, error_message, completed_at)
              VALUES (%s, %s, 'failed', 'Validation failed', NOW())
            positional_args:
              - "{{ batch_id }}"
              - "{{ records | length }}"

        - name: Fail the batch
          fail:
            msg: "Batch {{ batch_id }} failed validation - no records committed"
```
Step 3: Implement Compensating Transactions for APIs
For API-based operations, implement undo/compensation logic:
```yaml
- name: API batch with compensation
  hosts: localhost
  vars:
    batch_id: "batch-{{ ansible_date_time.iso8601_basic_short }}"

  tasks:
    - name: Create resources via API
      block:
        - name: Create each resource
          uri:
            url: "https://api.example.com/resources"
            method: POST
            body_format: json
            body: "{{ item }}"
            status_code: 201
          loop: "{{ resources }}"
          register: create_results

        - name: Validate created resources
          uri:
            url: "https://api.example.com/batches/{{ batch_id }}/validate"
            method: POST
            status_code: 200
          register: validation

        - name: Fail if validation error
          fail:
            msg: "Validation failed: {{ validation.json.errors }}"
          when: not validation.json.valid

      rescue:
        # Derive the ids here rather than inside the block, so resources
        # created before a mid-loop failure are cleaned up as well.
        - name: Track created resource IDs
          set_fact:
            created_resources: >-
              {{ create_results.results | default([])
                 | selectattr('status', 'defined')
                 | selectattr('status', 'equalto', 201)
                 | map(attribute='json.id') | list }}

        - name: Compensating transaction - delete created resources
          uri:
            url: "https://api.example.com/resources/{{ item }}"
            method: DELETE
            status_code: 204
          loop: "{{ created_resources }}"

        - name: Log compensation
          debug:
            msg: "Rolled back {{ created_resources | length }} resources after validation failure"

        - name: Rethrow original error
          fail:
            msg: "{{ ansible_failed_result.msg | default('Batch failed') }}"
```
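Compensation can itself be interrupted, so it helps if the delete step is safe to run more than once. The sketch below assumes the API answers `404` for a resource that is already gone, which the original does not state; the retry count and delay are illustrative values, and `created_resources` is the list of ids collected in the playbook above.

```yaml
# Sketch: make the compensating delete repeatable.
# Assumes the API returns 404 for an already-deleted resource.
- name: Delete created resources, tolerating ones already gone
  ansible.builtin.uri:
    url: "https://api.example.com/resources/{{ item }}"
    method: DELETE
    status_code: [204, 404]
  loop: "{{ created_resources }}"
  register: delete_result
  retries: 3
  delay: 5
  until: delete_result is succeeded
```

Accepting both status codes lets a second run of the rescue section finish cleanly instead of failing on resources that an earlier attempt already removed.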
Step 4: Use Atomic File Operations
Write files atomically using temporary files:
```yaml
- name: Atomic file-based batch state
  hosts: localhost
  vars:
    batch_id: "batch-{{ ansible_date_time.iso8601_basic_short }}"
    state_dir: /var/lib/batch
    state_file: "{{ state_dir }}/current_state.json"
    temp_file: "{{ state_dir }}/.tmp_state.json"

  tasks:
    - name: Ensure state directory exists
      file:
        path: "{{ state_dir }}"
        state: directory
        mode: '0755'

    - name: Process records and collect results
      set_fact:
        processed_results: "{{ processed_results | default([]) + [processed_item] }}"
      vars:
        processed_item:
          id: "{{ item.id }}"
          status: "processed"
          timestamp: "{{ ansible_date_time.iso8601 }}"
      loop: "{{ records }}"

    - name: Validate all results
      fail:
        msg: "Validation failed: Found invalid results"
      when: processed_results | selectattr('status', 'equalto', 'error') | list | length > 0

    - name: Write to temporary file first
      copy:
        content: "{{ {'batch_id': batch_id, 'records': processed_results, 'updated': ansible_date_time.iso8601} | to_nice_json }}"
        dest: "{{ temp_file }}"
        mode: '0644'

    # The temp file sits in the same directory, so mv is a same-filesystem
    # rename that swaps the state file in one step. No `creates:` guard is
    # used, because it would skip the replace once a state file exists.
    - name: Atomically replace state file
      command: mv {{ temp_file }} {{ state_file }}
      changed_when: true
```
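The `copy` module can also check a file before it reaches its destination, through its `validate` parameter. A minimal sketch, assuming `python3` is available on the target host and that `batch_id` and `processed_results` were set by earlier tasks:

```yaml
# Sketch: let copy stage, check, and move the file in one task.
# Assumes python3 exists on the target; %s is the staged file path.
- name: Write state file only if it parses as JSON
  ansible.builtin.copy:
    content: "{{ {'batch_id': batch_id, 'records': processed_results} | to_nice_json }}"
    dest: /var/lib/batch/current_state.json
    mode: '0644'
    validate: python3 -m json.tool %s
```

The validation command runs against the staged copy, and the destination is only updated when the command exits successfully, so a malformed file never replaces the current state.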
Step 5: Implement Batch Processing with Checkpoints
Add checkpoint-based processing for large batches:
```yaml
- name: Checkpoint-based batch processing
  hosts: localhost
  vars:
    checkpoint_file: /var/lib/batch/checkpoint.json
    batch_size: 100

  tasks:
    - name: Load checkpoint if exists
      slurp:
        src: "{{ checkpoint_file }}"
      register: checkpoint_data
      ignore_errors: yes

    - name: Set start position from checkpoint
      set_fact:
        start_position: "{{ (checkpoint_data.content | b64decode | from_json).position | default(0) }}"
        batch_id: "{{ (checkpoint_data.content | b64decode | from_json).batch_id | default('batch-' ~ ansible_date_time.iso8601_basic_short) }}"
      when: checkpoint_data is succeeded

    - name: Initialize checkpoint for new batch
      set_fact:
        start_position: 0
        batch_id: "batch-{{ ansible_date_time.iso8601_basic_short }}"
      when: checkpoint_data is failed

    # A single INSERT ... SELECT is atomic and is committed when the task
    # ends, so no explicit BEGIN/COMMIT is needed around it.
    - name: Process records from checkpoint
      community.postgresql.postgresql_query:
        db: appdb
        query: >
          INSERT INTO transactions (id, amount, batch_id)
          SELECT id, amount, %s
          FROM source_records
          WHERE id > %s
          ORDER BY id
          LIMIT %s
        positional_args:
          - "{{ batch_id }}"
          - "{{ start_position }}"
          - "{{ batch_size }}"

    # Queried in its own task so that query_result holds this row
    - name: Read last processed id
      community.postgresql.postgresql_query:
        db: appdb
        query: "SELECT MAX(id) as last_id FROM transactions WHERE batch_id = %s"
        positional_args:
          - "{{ batch_id }}"
      register: batch_result

    - name: Update checkpoint
      copy:
        content: "{{ {'batch_id': batch_id, 'position': batch_result.query_result[0].last_id, 'updated': ansible_date_time.iso8601} | to_nice_json }}"
        dest: "{{ checkpoint_file }}"

    - name: Validate batch before final commit
      community.postgresql.postgresql_query:
        db: appdb
        query: >
          SELECT COUNT(*) as invalid_count
          FROM transactions
          WHERE batch_id = %s AND amount < 0
        positional_args:
          - "{{ batch_id }}"
      register: validation

    - name: Rollback invalid batch
      when: validation.query_result[0].invalid_count > 0
      block:
        - name: Delete invalid batch records
          community.postgresql.postgresql_query:
            db: appdb
            query: "DELETE FROM transactions WHERE batch_id = %s"
            positional_args:
              - "{{ batch_id }}"

        - name: Clear checkpoint
          file:
            path: "{{ checkpoint_file }}"
            state: absent

        - name: Fail the batch
          fail:
            msg: "Batch rolled back due to validation failure"

    - name: Clear checkpoint on success
      file:
        path: "{{ checkpoint_file }}"
        state: absent
      when: validation.query_result[0].invalid_count == 0
```
Verification
Test that partial results are properly rolled back:
```bash
# Run batch with intentionally invalid data
ansible-playbook batch_process.yml -e '{"records": [{"id": 1, "amount": 100}, {"id": 2, "amount": -50}]}'

# Verify no records were committed (replace 'last_batch' with the batch_id of the run)
psql -d appdb -c "SELECT COUNT(*) FROM transactions WHERE batch_id = 'last_batch';"
# Should return 0

# Check batch history shows failure
psql -d appdb -c "SELECT * FROM batch_history ORDER BY completed_at DESC LIMIT 1;"
# Should show status = 'failed'
```
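The same check can be run from a playbook, so that a leftover row fails the test run instead of relying on someone reading `psql` output. This is a sketch that assumes `batch_id` holds the id of the batch that was just rejected; the `leftover_check` and `leftover` names are illustrative.

```yaml
# Sketch: fail the verification play if the rejected batch left rows behind.
# Assumes batch_id is the id of the batch that just failed validation.
- name: Count rows left behind by the failed batch
  community.postgresql.postgresql_query:
    db: appdb
    query: "SELECT COUNT(*) AS leftover FROM transactions WHERE batch_id = %s"
    positional_args:
      - "{{ batch_id }}"
  register: leftover_check

- name: Assert that the rollback was complete
  ansible.builtin.assert:
    that:
      - leftover_check.query_result[0].leftover == 0
    fail_msg: "Batch {{ batch_id }} left {{ leftover_check.query_result[0].leftover }} rows behind"
```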
Test API compensation:
```bash
# Run API batch that fails validation
ansible-playbook api_batch.yml -e '{"resources": [{"name": "valid"}, {"name": "invalid"}]}'

# Verify no resources were left behind
curl -s https://api.example.com/resources | jq 'length'
# Should show no new resources
```
Test atomic file writes:
```bash
# Simulate failure during file write
ansible-playbook atomic_file_batch.yml

# Check that state file is either complete or absent
if [ -f /var/lib/batch/current_state.json ]; then
  # jq exits non-zero if the file is not well-formed JSON
  jq empty /var/lib/batch/current_state.json && echo "File exists and is valid"
else
  echo "File does not exist (rolled back)"
fi
```
Related Issues
- [ansible-batch-importer-duplicates-rows-after-a-retry](/articles/ansible-batch-importer-duplicates-rows-after-a-retry) - Duplicate record issues
- [ansible-lock-contention-grows-after-an-index-change](/articles/ansible-lock-contention-grows-after-an-index-change) - Transaction lock issues
- [ansible-migration-holds-the-main-table-longer-than-the-deployment-window](/articles/ansible-migration-holds-the-main-table-longer-than-the-deployment-window) - Database migration issues