Introduction
During a partial secret rotation in Ansible Tower or AWX (where some credentials are updated while others remain unchanged), the runtime may end up using a mix of old and new secret sources. This causes inconsistent authentication behavior: some components authenticate with new credentials while others use stale ones. The issue often manifests as intermittent failures, where operations work sporadically depending on which credential version is used.
This typically occurs during:
- Gradual credential rotation across multiple systems
- Failed or incomplete secret rotation automation
- Rolling deployment where old pods use old secrets
- Split-brain scenarios during HA failover with different credential versions
- Manual rotation that only updated database records but not mounted secrets
Symptoms
Authentication succeeds intermittently:
```bash
# First attempt succeeds
$ curl -X POST -k -u admin:password https://tower.example.com/api/v2/job_templates/1/launch/
{"job": 123, "status": "pending"}

# Second attempt fails
$ curl -X POST -k -u admin:password https://tower.example.com/api/v2/job_templates/1/launch/
{"error": "Authentication failed: Invalid credentials"}

# Third attempt succeeds again
$ curl -X POST -k -u admin:password https://tower.example.com/api/v2/job_templates/1/launch/
{"job": 124, "status": "pending"}
```
In job execution logs:
```
TASK [Connect to database] ***********
fatal: [db-server]: FAILED! => {
    "msg": "Access denied for user 'app_user'@'10.0.1.100' (using password: YES)"
}

... but earlier in the same job:

TASK [Verify application health] *************
ok: [app-server] => {
    "msg": "Successfully authenticated to application with credentials"
}
```
In Tower logs, showing mixed credential versions:
2024-03-15 14:23:45,123 INFO awx.main.models.credential Injecting credential 42 (version: new)
2024-03-15 14:24:12,456 WARNING awx.main.models.credential Credential 42 file path /var/run/secrets/old/aws_key still exists
2024-03-15 14:25:01,789 INFO awx.main.models.credential Injecting credential 42 (version: old) - using cached path
2024-03-15 14:26:34,012 ERROR awx.main.tasks Job failed due to credential mismatch: API expects new token, runner provided old token

In Kubernetes pod logs:
```bash
# Pod 1 (old) uses old secret
$ kubectl logs tower-worker-old-abc123
Credential path: /var/run/secrets/aws_key (version: 2024-03-01)

# Pod 2 (new) uses new secret
$ kubectl logs tower-worker-new-def456
Credential path: /var/run/secrets/aws_key (version: 2024-03-15)

# Both pods exist simultaneously
$ kubectl get pods -l app=tower-worker
NAME                      READY   STATUS    AGE
tower-worker-old-abc123   1/1     Running   15d   # Has old secrets
tower-worker-new-def456   1/1     Running   1h    # Has new secrets
```
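To quantify the mix in Tower log lines like the excerpt shown earlier in this section, the injected versions can be tallied per credential. This is a minimal sketch: the function name `tally_injected_versions` is hypothetical, and it assumes the `Injecting credential <id> (version: <label>)` line format used in this article, which may differ in your deployment.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count how often each version label was injected
# for one credential. Reads log lines on stdin.
# Assumes the "Injecting credential <id> (version: <label>)" format
# shown in this article and labels without spaces.
tally_injected_versions() {
    local cred_id="$1"
    sed -n "s/.*Injecting credential ${cred_id} (version: \([^)]*\)).*/\1/p" \
        | sort | uniq -c | awk '{print $2 ": " $1}'
}
```

For example, `tally_injected_versions 42 < tower.log` (the log file name is a placeholder) prints one `label: count` line per version. More than one line for the same credential suggests a mixed state.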
Database shows one credential version, mounted files show another:
```bash
# Database has new credential
$ sudo -u postgres psql -d awx -c "SELECT id, name, updated FROM main_credential WHERE id=42;"
 id |   name    |       updated
----+-----------+---------------------
 42 | AWS Creds | 2024-03-15 08:00:00

# But mounted file has old content
$ cat /var/run/secrets/aws_key
AKIAIOSFODNN7EXAMPLE  # Old key from 2024-03-01

# New key should be
$ cat /etc/tower/new_secrets/aws_key
AKIAI44QHJHJ8EXAMPLE  # New key from 2024-03-15
```
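Comparing sources with `cat`, as in the commands above, prints the secrets into terminal scrollback. One hedged alternative is to compare short SHA-256 fingerprints instead. The helper names below are hypothetical, and `sha256sum` from GNU coreutils is assumed to be installed.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: compare secrets by fingerprint instead of printing them.
# Assumes sha256sum (GNU coreutils) is available.

# Print the first 12 hex characters of the SHA-256 of a string.
fingerprint() {
    printf '%s' "$1" | sha256sum | cut -c1-12
}

# Fingerprint a file's contents. Trailing newlines are ignored because
# command substitution strips them.
file_fingerprint() {
    fingerprint "$(cat "$1")"
}

# Return 0 if the file holds the expected value, 1 otherwise.
file_matches_value() {
    [ "$(file_fingerprint "$1")" = "$(fingerprint "$2")" ]
}
```

A call such as `file_matches_value /var/run/secrets/aws_key "$EXPECTED_KEY"` (where `EXPECTED_KEY` is a placeholder variable) then answers the "same or different" question without echoing either value.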
Common Causes
Mixed credential usage occurs due to incomplete rotation:
1. Database updated but files not replaced: Tower's credential records were updated in the database, but the actual secret files on disk (mounted secrets, credential injection paths) still contain old values.
2. Stale Celery worker processes: Workers started before the rotation have cached credential values or paths. New workers use new credentials, old workers use old ones.
3. Rolling deployment timing: During Kubernetes rolling updates, pods with old secrets and pods with new secrets may coexist. Jobs routed to old pods fail, jobs routed to new pods succeed.
4. Secret cache in credential plugins: Custom credential plugins may cache secret values for performance, not refreshing until explicit restart.
5. Multiple credential sources not synchronized: Credentials are stored in multiple places (database, Vault, mounted files, environment variables). Updating one source but not others causes inconsistency.
6. HA instances not synchronized: In Tower HA setups, different instances may have different versions of mounted secrets or different database replication timing.
7. Credential injection timing: Tower injects credentials at job start, but the injection may use database values while the runtime reads from files that weren't updated.
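The stale-worker and plugin-cache causes share one mechanism: a value read once at startup does not follow later changes to the file it came from. The following is an illustration only, using a temporary file and made-up values rather than anything Tower-specific.

```bash
#!/usr/bin/env bash
# Illustration only: a value read once at startup does not follow
# later changes to the file. All names and values are made up.
secret_file=$(mktemp)
printf '%s\n' 'old-key' > "$secret_file"

# A long-running "worker" reads the secret once and keeps it in memory.
cached_key=$(cat "$secret_file")

# Rotation rewrites the file while the worker keeps running.
printf '%s\n' 'new-key' > "$secret_file"

# A freshly started worker reads the file again and sees the new value.
fresh_key=$(cat "$secret_file")

echo "long-running worker: $cached_key"
echo "new worker:          $fresh_key"
rm -f "$secret_file"
```

Until the long-running process is restarted or told to reload, both values stay in use side by side, which matches the intermittent pattern described under Symptoms.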
Step-by-Step Fix
Step 1: Identify All Credential Sources
Find where each credential is stored:
```bash
# List all credential storage locations
CREDENTIAL_ID=42

# Tower database
sudo -u postgres psql -d awx -c "
SELECT id, name, credential_type_id, inputs, updated, created
FROM main_credential
WHERE id = $CREDENTIAL_ID;"

# Mounted secret files
ls -la /var/run/secrets/ /etc/tower/secrets/

# Kubernetes secrets
kubectl get secrets -l app=tower -o yaml | grep -A10 "aws_key"

# Vault secrets (if using HashiCorp Vault)
vault kv get secret/tower/aws_credentials

# Environment variables in running processes
for pid in $(pgrep -f celery); do
    sudo cat "/proc/$pid/environ" 2>/dev/null | tr '\0' '\n' | grep -i aws
done

# Credential type injectors
curl -k -u admin:password https://localhost/api/v2/credential_types/ | \
    jq '.results[] | {id, name, injectors}'
```
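Once the file locations are known, modification times give a quick hint about which files the rotation missed. The sketch below assumes you keep (or create) a marker file whose mtime records when the rotation started. The function name and the marker convention are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list secret files that were NOT modified after
# a rotation marker.
# Usage: files_not_rotated <marker_file> <dir>...
# The marker is any file whose mtime records when the rotation started.
files_not_rotated() {
    local marker="$1"
    shift
    # "! -newer" matches files whose mtime is at or before the marker's mtime.
    find "$@" -type f ! -newer "$marker" 2>/dev/null | sort
}
```

For example, `files_not_rotated /tmp/rotation.marker /var/run/secrets /etc/tower/secrets` lists candidates for stale files (the marker path is a placeholder). Treat mtime as a hint only: a file can be rewritten with old content, so confirm by comparing values.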
Step 2: Check Credential Version Consistency
Compare all versions:
```bash
#!/bin/bash
# check_credential_consistency.sh
# Usage: ./check_credential_consistency.sh <credential_id>

CRED_ID=${1:?usage: $0 <credential_id>}

echo "=== Credential $CRED_ID Consistency Check ==="

# Database value
echo "Database value:"
sudo -u postgres psql -d awx -t -c "
SELECT inputs FROM main_credential WHERE id = $CRED_ID;" | \
    jq -r '.password // .token // .key'

# Mounted file value
echo "Mounted file values:"
for path in /var/run/secrets /etc/tower/secrets; do
    if [ -f "$path/cred_$CRED_ID" ]; then
        echo "$path/cred_$CRED_ID: $(cat "$path/cred_$CRED_ID")"
    fi
done

# Kubernetes secret value
echo "Kubernetes secret:"
kubectl get secret tower-credentials -o jsonpath="{.data.cred_$CRED_ID}" | base64 -d
echo

# Running worker cache (environment of running Celery workers)
echo "Worker process cache:"
for pid in $(pgrep -f "celery.*worker"); do
    cred=$(sudo cat "/proc/$pid/environ" 2>/dev/null | tr '\0' '\n' | grep "cred_$CRED_ID")
    echo "PID $pid: $cred"
done
```
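Reading the script's output by eye gets error-prone as the number of sources grows. As a rough sketch, the comparison itself can be automated by feeding `name value` pairs into a function that reports fingerprints and returns a non-zero status on any disagreement. The function name `sources_agree` and the input format are assumptions for illustration; `sha256sum` is assumed to be available.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide whether all credential sources agree.
# Input on stdin: one "<source-name> <value>" pair per line.
# Output: "<source-name> <fingerprint>" per source.
# Exit status: 0 only if at least one source was read and all values match.
sources_agree() {
    local name value fp distinct
    local -a fps=()
    while read -r name value; do
        [ -n "$name" ] || continue
        # Fingerprint the value so the secret itself is never printed.
        fp=$(printf '%s' "$value" | sha256sum | cut -c1-12)
        echo "$name $fp"
        fps+=("$fp")
    done
    [ "${#fps[@]}" -gt 0 ] || return 1
    distinct=$(printf '%s\n' "${fps[@]}" | sort -u | wc -l)
    [ "$distinct" -eq 1 ]
}
```

A caller could emit lines such as `database <value>` and `file <value>` from the checks above and pipe them in. The exit status then makes the result usable in a cron job or CI step.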
Step 3: Synchronize All Credential Sources
Update all sources to the same value:
```bash
# Get the correct (new) credential values
NEW_AWS_KEY=$(vault kv get -field=access_key secret/tower/aws_new)
# Assumes the same Vault path also stores the secret under "secret_key"
NEW_AWS_SECRET=$(vault kv get -field=secret_key secret/tower/aws_new)

# Update Tower database
curl -X PATCH -k -u admin:password \
    -H "Content-Type: application/json" \
    -d "{\"inputs\": {\"username\": \"$NEW_AWS_KEY\", \"password\": \"$NEW_AWS_SECRET\"}}" \
    https://localhost/api/v2/credentials/42/

# Update mounted files
printf '%s\n' "$NEW_AWS_KEY" | sudo tee /var/run/secrets/aws_key > /dev/null
printf '%s\n' "$NEW_AWS_SECRET" | sudo tee /var/run/secrets/aws_secret > /dev/null
sudo chmod 600 /var/run/secrets/aws_key /var/run/secrets/aws_secret

# Update Kubernetes secret
kubectl create secret generic tower-credentials \
    --from-literal=aws_key="$NEW_AWS_KEY" \
    --from-literal=aws_secret="$NEW_AWS_SECRET" \
    --dry-run=client -o yaml | kubectl apply -f -

# Update Vault
vault kv put secret/tower/aws_credentials \
    access_key="$NEW_AWS_KEY" \
    secret_key="$NEW_AWS_SECRET"

# Remove old credential files
sudo rm -f /etc/tower/secrets/aws_key /etc/tower/secrets/aws_secret
```
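Writing a file in place and tightening its permissions afterwards leaves a short window in which a reader can see a half-written or too-permissive file. A minimal sketch of a safer write follows, assuming the caller already has write access to the target directory. The helper name `write_secret` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a secret so that readers never see a partial
# file and the file is never group- or world-readable, even briefly.
# Usage: write_secret <path> <value>
write_secret() {
    local path="$1" value="$2" tmp
    (
        umask 077                                 # anything created here is owner-only
        tmp=$(mktemp "${path}.XXXXXX") || exit 1  # temp file in the same directory
        printf '%s\n' "$value" > "$tmp" || { rm -f "$tmp"; exit 1; }
        mv -f "$tmp" "$path"                      # rename replaces the target in one step
    )
}
```

Because the temporary file lives next to the target, the final `mv` is a rename within one filesystem, so a concurrent reader sees either the old content or the new content.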
Step 4: Clear Cached Credentials
Force workers to use fresh credentials:
```bash
# Restart Celery workers to clear cache
sudo supervisorctl restart tower-worker:*

# Clear cached credential keys in Redis, if any exist
# (DEL does not expand glob patterns, so match keys with --scan first)
redis-cli --scan --pattern 'cred*' | xargs -r redis-cli DEL

# Clear Django cache
sudo -u awx awx-manage clear_cache

# Restart callback receiver
sudo supervisorctl restart tower-callback-receiver

# Full Tower restart if needed
sudo ansible-tower-service restart

# For Kubernetes, force pod restart
kubectl rollout restart deployment/tower
kubectl rollout restart deployment/tower-worker
```
Step 5: Implement Atomic Rotation
Use atomic rotation to prevent partial states:
```bash
#!/bin/bash
# atomic_credential_rotation.sh
# Usage: ./atomic_credential_rotation.sh <credential_id> <new_key> <new_secret>

set -e

CRED_ID=$1
NEW_KEY=$2
NEW_SECRET=$3

echo "Starting atomic rotation for credential $CRED_ID"

# Step 1: Create new files in staging location (same filesystem, so mv is a rename)
STAGING_PATH="/var/run/secrets/staging"
sudo mkdir -p "$STAGING_PATH"
printf '%s\n' "$NEW_KEY" | sudo tee "$STAGING_PATH/aws_key" > /dev/null
printf '%s\n' "$NEW_SECRET" | sudo tee "$STAGING_PATH/aws_secret" > /dev/null
sudo chmod 600 "$STAGING_PATH/aws_key" "$STAGING_PATH/aws_secret"

# Step 2: Update Kubernetes secret (atomic)
kubectl create secret generic tower-credentials-new \
    --from-literal=aws_key="$NEW_KEY" \
    --from-literal=aws_secret="$NEW_SECRET"

# Step 3: Update database (atomic); jq builds the JSON so quoting stays valid
curl -X PATCH -k -u admin:password \
    -H "Content-Type: application/json" \
    -d "$(jq -n --arg k "$NEW_KEY" --arg s "$NEW_SECRET" \
        '{inputs: {username: $k, password: $s}}')" \
    "https://localhost/api/v2/credentials/$CRED_ID/"

# Step 4: Atomic file swap
sudo mv /var/run/secrets/aws_key /var/run/secrets/aws_key.old
sudo mv /var/run/secrets/aws_secret /var/run/secrets/aws_secret.old
sudo mv "$STAGING_PATH/aws_key" /var/run/secrets/aws_key
sudo mv "$STAGING_PATH/aws_secret" /var/run/secrets/aws_secret

# Step 5: Restart workers
sudo supervisorctl restart tower-worker:*
sleep 10

# Step 6: Verify new credentials work (launching a job requires POST)
JOB_ID=$(curl -s -k -X POST -u admin:password \
    https://localhost/api/v2/job_templates/1/launch/ | jq -r '.job // empty')

# An empty job ID means the launch did not return a job
if [ -z "$JOB_ID" ]; then
    echo "Rotation failed - rolling back"
    sudo mv /var/run/secrets/aws_key.old /var/run/secrets/aws_key
    sudo mv /var/run/secrets/aws_secret.old /var/run/secrets/aws_secret
    sudo supervisorctl restart tower-worker:*
    exit 1
fi

# Step 7: Cleanup old files
sudo rm -f /var/run/secrets/aws_key.old /var/run/secrets/aws_secret.old
sudo rmdir "$STAGING_PATH"

echo "Atomic rotation complete for credential $CRED_ID"
```
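Each `mv` in the file swap is atomic on its own, but the key and the secret are still swapped in separate steps, so a reader can briefly pair a new key with an old secret. A different technique, not used by the script above, is to keep each version in its own directory and flip a single symlink. The sketch below assumes GNU coreutils (`mv -T`) and a hypothetical layout of `<base>/<version>/` plus `<base>/current`. The function name is also hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: switch every file of a credential in one rename by
# flipping a "current" symlink between versioned directories.
# Assumes GNU coreutils (mv -T).
# Layout: <base>/<version>/aws_key, <base>/<version>/aws_secret,
#         <base>/current -> <version>
activate_version() {
    local base="$1" version="$2"
    [ -d "$base/$version" ] || { echo "missing $base/$version" >&2; return 1; }
    # Build the new link under a temporary name, then rename it over "current".
    ln -sfn "$version" "$base/current.tmp" || return 1
    mv -T "$base/current.tmp" "$base/current"
}
```

Consumers would read `<base>/current/aws_key` and `<base>/current/aws_secret`. Rolling back is the same call with the previous version name, for example `activate_version /var/run/secrets 2024-03-01` (the path and version label are placeholders).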
Step 6: Fix Rolling Deployment Issues
Ensure all pods use the same credential version:
# Kubernetes deployment with proper secret update strategy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tower-worker
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0  # Keep full capacity: old pods are removed only after their replacements are Ready
      maxSurge: 1
  template:
    spec:
      containers:
      - name: tower-worker
        volumeMounts:
        - name: credentials
          mountPath: /var/run/secrets
          readOnly: true
      volumes:
      - name: credentials
        secret:
          secretName: tower-credentials

Force complete rollout:
```bash
# Ensure all pods are restarted after secret change
kubectl rollout restart deployment/tower-worker
kubectl rollout status deployment/tower-worker --timeout=300s

# Verify all pods have new secret (set OLD_KEY to the retired key value)
OLD_KEY="AKIAIOSFODNN7EXAMPLE"
if kubectl get pods -l app=tower-worker -o name | while read -r pod; do
        key=$(kubectl exec "$pod" -- cat /var/run/secrets/aws_key)
        echo "$pod: $key"
    done | grep "$OLD_KEY"; then
    echo "Some pods still have old credentials"
else
    echo "All pods have new credentials"
fi
```
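Checking for the old key only catches pods that still hold that one known value. A slightly stricter sketch compares each pod against the expected new key and lists any pod that differs. The function name `stale_pods` is hypothetical, and the input format is the `pod: key` output produced by the loop above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report pods whose mounted key differs from the
# expected one.
# Input on stdin: "<pod-name>: <key>" per line (a trailing colon on the
# pod name is optional).
# Prints the names of stale pods; exit status is 0 only if none are stale.
stale_pods() {
    local expected="$1"
    awk -v want="$expected" '
        { pod = $1; sub(/:$/, "", pod) }
        $2 != want { print pod; stale = 1 }
        END        { exit stale }
    '
}
```

Piping the per-pod loop into `stale_pods "$NEW_KEY"` (with `NEW_KEY` as a placeholder variable) yields a list that can be fed to `kubectl delete` for a targeted restart, or an empty result once the rollout is complete.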
Step 7: Add Credential Version Tracking
Track credential versions for debugging:
```bash
# Add version tracking to credentials
# (may be rejected unless the credential type defines a "version" input field)
curl -X PATCH -k -u admin:password \
    -H "Content-Type: application/json" \
    -d '{
        "inputs": {
            "username": "AKIAI44QHJHJ8EXAMPLE",
            "password": "new_secret_key",
            "version": "2024-03-15-v2"
        }
    }' \
    https://localhost/api/v2/credentials/42/

# Add version to mounted file
# (consumers must strip the trailing comment before using the key)
sudo bash -c 'echo "AKIAI44QHJHJ8EXAMPLE # version: 2024-03-15-v2" > /var/run/secrets/aws_key'

# Log version in job events
cat > /etc/tower/conf.d/version_logging.py << 'EOF'
import logging

logger = logging.getLogger('awx.main.tasks')

def log_credential_version(credential_id, path):
    try:
        with open(path, 'r') as f:
            content = f.read()
        version = content.split('# version:')[-1].strip() if '# version:' in content else 'unknown'
        logger.info(f"Credential {credential_id} using version: {version}")
    except Exception as e:
        logger.warning(f"Could not determine credential version: {e}")
EOF
```
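The same version trailer can be read from the shell during an incident, without going through Tower's logging. A minimal sketch, assuming the single-line `<key> # version: <label>` file format written above. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the "# version: <label>" trailer from a
# secret file. Prints "unknown" when the file is unreadable or has no
# trailer, like the fallback in the Python function above.
credential_version() {
    local version
    [ -r "$1" ] || { echo "unknown"; return 0; }
    version=$(sed -n 's/.*# version:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | tail -n 1)
    echo "${version:-unknown}"
}
```

Running `credential_version /var/run/secrets/aws_key` on each host or pod gives a label that is safe to paste into a ticket, unlike the key itself.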
Verification
Confirm all components use the same credential version:
```bash
# Check consistency
./check_credential_consistency.sh 42

# Expected output: All sources show same value
=== Credential 42 Consistency Check ===
Database value:
AKIAI44QHJHJ8EXAMPLE
Mounted file values:
/var/run/secrets/aws_key: AKIAI44QHJHJ8EXAMPLE
Kubernetes secret:
AKIAI44QHJHJ8EXAMPLE

# Test multiple job launches (launching a job requires POST)
for i in {1..10}; do
    result=$(curl -s -k -X POST -u admin:password \
        https://localhost/api/v2/job_templates/1/launch/ | jq -r '.job // .error')
    echo "Attempt $i: $result"
done

# Expected: All 10 attempts succeed with job IDs

# Check job execution logs for consistent authentication
curl -k -u admin:password "https://localhost/api/v2/jobs/?status=successful&order_by=-created&page_size=5" | \
    jq '.results[] | {id, status, job_template}'

# Verify all Kubernetes pods have same secret version
kubectl get pods -l app=tower -o name | while read -r pod; do
    kubectl exec "$pod" -- cat /var/run/secrets/aws_key | cut -d' ' -f1
done | sort | uniq -c
# Expected: Single line showing count matching pod count
```
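To turn the repeated-launch test into a pass/fail check, the `Attempt N: result` lines can be summarized. This is a sketch under the assumption that a successful launch prints a numeric job ID and a failure prints error text, as in the loop above. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: summarize launch attempts.
# Input on stdin: lines such as "Attempt 3: 124" (job ID) or
# "Attempt 4: <error text>".
# A purely numeric result counts as a launched job, anything else as a failure.
# Exit status: 0 only if no attempt failed.
summarize_attempts() {
    awk -F': ' '
        $2 ~ /^[0-9]+$/ { ok++; next }
                        { failed++ }
        END {
            printf "launched=%d failed=%d\n", ok, failed
            exit (failed > 0)
        }
    '
}
```

Piping the launch loop into `summarize_attempts` prints a single summary line, and the non-zero exit status on any failure lets a scheduled check raise an alert.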
Related Issues
- [ansible-credential-rotation-best-practices](/articles/ansible-credential-rotation-best-practices)
- [ansible-runtime-keeps-reading-old-credential-file](/articles/ansible-runtime-keeps-reading-an-old-credential-file-after-secret-mount-move)
- [ansible-kubernetes-secret-update-propagation](/articles/ansible-kubernetes-secret-update-propagation)