# Fix Thanos Sidecar Upload Failures
You're monitoring Thanos and see `thanos_shipper_upload_failures_total` increasing, or your Prometheus blocks aren't being uploaded to object storage. The Thanos sidecar is responsible for uploading Prometheus TSDB blocks to object storage for long-term retention.

## Introduction

The Thanos sidecar runs alongside Prometheus and:

- Uploads TSDB blocks to object storage (S3, GCS, Azure)
- Serves Prometheus data via the StoreAPI
- Optionally watches and reloads Prometheus configuration and rule files

When uploads fail, you lose long-term metric retention.

## Symptoms

Common symptoms include:

- `thanos_shipper_upload_failures_total` keeps increasing
- New TSDB blocks do not appear in the object storage bucket
- The sidecar logs contain errors such as `Access Denied`, `403 Forbidden`, `context deadline exceeded`, or `No space left on device`

## Common Causes

- Misconfigured object storage settings
- Missing or incorrect credentials
- Network connectivity issues
- Version compatibility problems
- Resource exhaustion or limits (disk space, timeouts)
- Missing bucket permissions

## Step-by-Step Fix

Check Thanos sidecar logs:

```bash
# Kubernetes (the sidecar runs as a container in the Prometheus pod)
kubectl logs -l app=prometheus -c thanos-sidecar -n monitoring

# Docker
docker logs thanos-sidecar

# Systemd
journalctl -u thanos-sidecar -f
```
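
When the logs are long, it can help to sort error lines into the causes covered in this guide. The sketch below is a hypothetical helper (`classify_upload_error` is not part of Thanos) that matches only the example error strings used in the cause sections of this article. Real log lines may be worded differently, so treat an `unknown` result as a prompt to read the line yourself.

```bash
# Hypothetical helper, not part of Thanos: map one log line to a cause in this guide.
# The patterns are this article's example error strings, not an exhaustive list.
classify_upload_error() {
  local line
  # Lowercase the line so matching is case-insensitive
  line=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$line" in
    *"access denied"*|*"invalid credentials"*)        echo "credentials" ;;
    *"forbidden"*)                                    echo "bucket-permissions" ;;
    *"connection timeout"*|*"network unreachable"*)   echo "network" ;;
    *"not ready"*|*"not initialized"*)                echo "prometheus-not-ready" ;;
    *"deadline exceeded"*|*"upload timeout"*)         echo "upload-timeout" ;;
    *"no space left on device"*)                      echo "disk-space" ;;
    *"already exists"*|*"conflict"*)                  echo "upload-conflict" ;;
    *"checksum mismatch"*|*"invalid block"*)          echo "invalid-block" ;;
    *"incompatible version"*|*"unsupported format"*)  echo "version-mismatch" ;;
    *)                                                echo "unknown" ;;
  esac
}
```

Feeding each log line through the function and piping the results to `sort | uniq -c` gives a rough count per cause, which shows where to start reading.
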
Check upload metrics:

```bash
# Query Prometheus for upload failures (Prometheus must scrape the sidecar's :10902/metrics)
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=thanos_shipper_upload_failures_total' | jq .

# Check upload success rate
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(thanos_shipper_uploads_total[5m])' | jq .

# Check that the sidecar can reach Prometheus (1 = up)
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=thanos_sidecar_prometheus_up' | jq .
```
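
The upload counters can also be read straight from the sidecar's own `/metrics` endpoint, which is useful when Prometheus is not scraping the sidecar. The sketch below sums one metric across all of its label sets from Prometheus text-format input. `sum_metric` is a hypothetical helper, and it assumes samples carry no trailing timestamp, which is the usual case for `/metrics` output.

```bash
# Hypothetical helper: sum every sample of one metric from text-format metrics on stdin.
# Assumes each sample line is "<name>{labels} <value>" with no trailing timestamp.
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)           # drop the label set, if any
      if (metric == name) total += $NF   # the last field is the sample value
    }
    END { printf "%d\n", total }
  '
}
```

Assuming the sidecar's HTTP port is 10902 as elsewhere in this guide, `curl -s http://localhost:10902/metrics | sum_metric thanos_shipper_upload_failures_total` prints the total number of failed uploads since the sidecar started.
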
Check object storage connectivity:
```bash
# For S3
aws s3 ls s3://your-thanos-bucket/

# For GCS
gsutil ls gs://your-thanos-bucket/

# Test bucket access
aws s3api head-bucket --bucket your-thanos-bucket
```

## Common Causes and Solutions

### Cause 1: Object Storage Credentials Invalid

Error: `Access Denied`, `Invalid credentials`

Solution: Verify the credentials configuration:

```yaml
# Kubernetes secret for S3
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
type: Opaque
stringData:
  object-storage.yaml: |
    type: S3
    config:
      bucket: your-thanos-bucket
      endpoint: s3.amazonaws.com
      region: us-east-1
      access_key: AKIAIOSFODNN7EXAMPLE
      secret_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```

Test credentials:
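```bash
# Verify AWS credentials
aws sts get-caller-identity

# Test bucket write (create a small file first so the copy has something to upload)
echo "thanos upload test" > /tmp/test.txt
aws s3 cp /tmp/test.txt s3://your-thanos-bucket/test.txt
aws s3 rm s3://your-thanos-bucket/test.txt
```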
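
Before testing credentials against AWS, it can be worth confirming that the object storage file the sidecar reads contains the keys used in the example above. The sketch below is a plain-text check, not a YAML parser, and `check_objstore_config` is a hypothetical helper name. It only looks for `type`, `bucket`, and `endpoint`, on the assumption that those are the settings most often left out or empty.

```bash
# Hypothetical helper: report which of the keys from this guide's S3 example
# are missing (or have no value) in an objstore config file.
check_objstore_config() {
  local file="$1" missing=0 key
  for key in type bucket endpoint; do
    # A key counts as present when it is followed by some non-comment value
    if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*[^[:space:]#]" "$file"; then
      echo "missing or empty: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

The function prints one line per problem and returns a non-zero status if any key is missing, so it can gate a deployment script. A quoted empty string such as `bucket: ""` still counts as present in this simple check.
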

### Cause 2: Bucket Permissions Missing

Error: `403 Forbidden`

Solution: Add a proper bucket policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/thanos-role"
      },
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::your-thanos-bucket",
        "arn:aws:s3:::your-thanos-bucket/*"
      ]
    }
  ]
}
```

Apply policy:
```bash
aws s3api put-bucket-policy \
  --bucket your-thanos-bucket \
  --policy file://bucket-policy.json
```

### Cause 3: Network Connectivity Issues

Error: `Connection timeout`, `Network unreachable`

Solution: Check network configuration:

```bash
# Test connectivity
curl -I https://s3.amazonaws.com

# Check DNS resolution
nslookup s3.amazonaws.com

# For Kubernetes, check egress (busybox ships without curl, so use an image that includes it)
kubectl run test --image=curlimages/curl --rm -it --restart=Never -- curl -I https://s3.amazonaws.com
```
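
The checks above target the public AWS endpoint. To test the endpoint the sidecar is actually configured with, the host can be read from the object storage file first. The sketch below is a plain-text extraction rather than a YAML parser, and `objstore_endpoint` is a hypothetical helper name.

```bash
# Hypothetical helper: print the first `endpoint:` value from an objstore config file.
# Plain-text matching only; trailing comments and surrounding quotes are removed.
objstore_endpoint() {
  sed -n 's/^[[:space:]]*endpoint:[[:space:]]*//p' "$1" \
    | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' \
    | tr -d "\"'" \
    | head -n 1
}
```

With the file path used elsewhere in this guide, `curl -I "https://$(objstore_endpoint /etc/thanos/object-storage.yaml)"` then tests the configured endpoint instead of a hard-coded one.
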
For private endpoints:
```yaml
# object-storage.yaml
type: S3
config:
  bucket: your-thanos-bucket
  endpoint: s3.internal.company.com # Private endpoint
  region: us-east-1
  insecure: false
  signature_version2: false
```

### Cause 4: Prometheus Not Ready

Error: `Prometheus not ready`, `TSDB not initialized`

Solution: Ensure Prometheus is fully started:

```bash
# Check Prometheus health
curl http://localhost:9090/-/healthy

# Check Prometheus readiness
curl http://localhost:9090/-/ready

# Check TSDB status
curl http://localhost:9090/api/v1/status/tsdb

# The sidecar waits for Prometheus on startup, up to --prometheus.ready_timeout (default 10m)
# In Kubernetes, a readiness probe on the Prometheus container makes its state visible
```
```yaml
# Kubernetes deployment
spec:
  containers:
    - name: prometheus
      readinessProbe:
        httpGet:
          path: /-/ready
          port: 9090
        initialDelaySeconds: 30
    - name: thanos-sidecar
      # Wait for Prometheus
```

### Cause 5: Block Upload Timeout

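Error: `Upload timeout`, `context deadline exceeded`

Solution: The documented sidecar flags do not include an upload timeout such as `--upload.timeout`. For S3, the client timeouts and the multipart part size are set in the object storage configuration instead:

```yaml
# object-storage.yaml
type: S3
config:
  bucket: your-thanos-bucket
  endpoint: s3.amazonaws.com
  region: us-east-1
  part_size: 67108864              # multipart upload part size in bytes (64 MiB)
  http_config:
    response_header_timeout: 5m    # raise if the endpoint is slow to respond
    idle_conn_timeout: 1m30s
```

In Kubernetes, the sidecar arguments only need to point at that file:

```yaml
args:
  - sidecar
  - --tsdb.path=/var/lib/prometheus/data
  - --prometheus.url=http://localhost:9090
  - --objstore.config-file=/etc/thanos/object-storage.yaml
```

### Cause 6: Disk Space Issues
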
Error: `No space left on device`

Solution: Check and clean disk space:

```bash
# Check Prometheus data directory
df -h /var/lib/prometheus

# Check TSDB blocks
ls -la /var/lib/prometheus/data/

# Prometheus removes old blocks according to its retention settings; check the head stats
curl -s http://localhost:9090/api/v1/status/tsdb | jq '.data.headStats'
```
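
To turn the `df` check into something a script or cron job can act on, the use percentage can be parsed and compared against a threshold. This is a minimal sketch: `df_use_percent` and `disk_ok` are hypothetical helper names, and the default of 80 percent is an arbitrary illustrative threshold, not a Thanos or Prometheus recommendation.

```bash
# Hypothetical helper: read `df -P <path>` output on stdin and print the
# use percentage of the first filesystem as a bare number.
df_use_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed when usage is below the limit.
# The default limit of 80 is only an example value.
disk_ok() {
  local used="$1" limit="${2:-80}"
  [ "$used" -lt "$limit" ]
}
```

For example, `disk_ok "$(df -P /var/lib/prometheus | df_use_percent)" || echo "Prometheus volume is nearly full"` prints a warning once usage reaches the threshold.
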
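Configure retention. In Prometheus 2.x, retention is set with command-line flags rather than in `prometheus.yml`:

```yaml
# Prometheus container arguments
args:
  - --storage.tsdb.retention.time=15d
  - --storage.tsdb.retention.size=50GB
```

### Cause 7: Concurrent Upload Conflicts
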
Error: `Block already exists`, `Conflict`

Solution: This is usually harmless, because Thanos handles conflicts:

```bash
# Check if blocks are being uploaded by multiple sidecars
kubectl get pods -l app=prometheus -n monitoring

# Ensure unique instance labels
# Each Prometheus should have unique external_labels
```
```yaml
# Prometheus configuration
global:
  external_labels:
    cluster: 'prod'
    replica: 'prometheus-1' # Unique per instance
```

### Cause 8: Invalid Block Data

Error: `Invalid block`, `checksum mismatch`

Solution: Verify TSDB integrity:

```bash
# List local blocks
promtool tsdb list /var/lib/prometheus/data

# Analyze the most recent block (or pass a block ID as a second argument)
promtool tsdb analyze /var/lib/prometheus/data

# Verify the blocks already uploaded to the bucket
thanos tools bucket verify --objstore.config-file=/etc/thanos/object-storage.yaml
```

### Cause 9: Thanos Version Mismatch

Error: `Incompatible version`, `unsupported format`

Solution: Ensure Thanos and Prometheus versions are compatible:

```bash
# Check Thanos version
thanos --version

# Check Prometheus version
prometheus --version

# Recommended: Thanos v0.30+ with Prometheus v2.40+
```
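
The recommended minimums above can be checked in a script by comparing version numbers field by field. The sketch below uses a hypothetical helper, `version_ge`, that compares only the numeric major, minor, and patch parts and ignores suffixes such as `-rc.1`. It expects bare version strings, so the number has to be extracted from the `--version` output first.

```bash
# Hypothetical helper: return 0 if version $1 >= version $2.
# Accepts "v0.32.0" or "2.45.0"; only major.minor.patch are compared.
version_ge() {
  local a b i x y
  # Strip a leading "v" and pad so every version has at least three fields
  a="${1#v}.0.0"
  b="${2#v}.0.0"
  for i in 1 2 3; do
    # Take field i and drop any non-numeric suffix such as "-rc.1"
    x=$(printf '%s' "$a" | cut -d. -f"$i" | sed 's/[^0-9].*//')
    y=$(printf '%s' "$b" | cut -d. -f"$i" | sed 's/[^0-9].*//')
    x=${x:-0}
    y=${y:-0}
    if [ "$x" -gt "$y" ]; then return 0; fi
    if [ "$x" -lt "$y" ]; then return 1; fi
  done
  return 0
}
```

With the versions from the deployment later in this guide, `version_ge v0.32.0 0.30.0 && version_ge 2.45.0 2.40.0` succeeds, while `version_ge 2.39.1 2.40.0` fails.
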

## Complete Thanos Sidecar Configuration

### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:v2.45.0
          args:
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/var/lib/prometheus/data
            - --storage.tsdb.retention.time=15d
            # Equal min/max block durations disable local compaction, which sidecar uploads expect
            - --storage.tsdb.min-block-duration=2h
            - --storage.tsdb.max-block-duration=2h
            - --web.enable-remote-write-receiver
            - --web.enable-lifecycle
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: data
              mountPath: /var/lib/prometheus/data
          readinessProbe:
            httpGet:
              path: /-/ready
              port: 9090
            initialDelaySeconds: 30
        - name: thanos-sidecar
          image: thanosio/thanos:v0.32.0
          args:
            - sidecar
            # The sidecar reads blocks from the same directory Prometheus writes to
            - --tsdb.path=/var/lib/prometheus/data
            - --prometheus.url=http://localhost:9090
            - --objstore.config-file=/etc/thanos/object-storage.yaml
            - --grpc-address=0.0.0.0:10901
            - --http-address=0.0.0.0:10902
          ports:
            - containerPort: 10901
              name: grpc
            - containerPort: 10902
              name: http
          volumeMounts:
            - name: object-storage
              mountPath: /etc/thanos
            - name: data
              mountPath: /var/lib/prometheus/data
      volumes:
        - name: config
          configMap:
            name: prometheus-config
        - name: data
          emptyDir: {}
        - name: object-storage
          secret:
            secretName: thanos-object-storage
```
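
### Prometheus Configuration

```yaml
global:
  external_labels:
    cluster: 'prod'
    replica: 'prometheus-1'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  # Scrape the sidecar so its upload metrics can be queried from Prometheus
  - job_name: 'thanos-sidecar'
    static_configs:
      - targets: ['localhost:10902']

# Optional: only needed if you also run Thanos Receive; the sidecar does not use remote write
remote_write:
  - url: http://thanos-receive:10908/api/v1/receive
```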

## Verification

After fixing issues:

```bash
# Check upload success rate (query Prometheus, which scrapes the sidecar)
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(thanos_shipper_uploads_total[5m])' | jq .

# Verify no new failures in the last hour
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=increase(thanos_shipper_upload_failures_total[1h])' | jq .

# Check blocks in object storage (each block directory contains a meta.json)
aws s3 ls s3://your-thanos-bucket/ --recursive | grep "meta.json"

# Verify Thanos query can access data
curl -s 'http://thanos-query:10902/api/v1/query?query=up' | jq .
```
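
To track progress over time, the bucket listing can be reduced to a single number of uploaded blocks. The sketch below counts keys of the form `<ULID>/meta.json` in a recursive `aws s3 ls` listing read from stdin. `count_uploaded_blocks` is a hypothetical helper, and it assumes block directories are named by 26-character ULIDs, optionally under a prefix.

```bash
# Hypothetical helper: count blocks in an `aws s3 ls --recursive` listing on stdin.
# A block is counted once per "<ULID>/meta.json" key; other files are ignored.
count_uploaded_blocks() {
  # grep -c exits non-zero when the count is 0, so mask that status
  grep -Ec '(^|[[:space:]/])[0-9A-Z]{26}/meta\.json$' || true
}
```

Running `aws s3 ls s3://your-thanos-bucket/ --recursive | count_uploaded_blocks` before and after a block boundary should show the count going up if uploads are working again.
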

## Monitoring Alerts

```yaml
# Alert for upload failures
groups:
  - name: thanos-sidecar
    rules:
      - alert: ThanosSidecarUploadFailures
        expr: rate(thanos_shipper_upload_failures_total[5m]) > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Thanos sidecar upload failures"
          description: "Thanos sidecar {{ $labels.instance }} is failing to upload blocks"
      - alert: ThanosSidecarPrometheusDown
        expr: thanos_sidecar_prometheus_up == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Thanos sidecar cannot reach Prometheus"
          description: "Sidecar {{ $labels.instance }} reports Prometheus as down, so no blocks are being shipped"
      # Prometheus cuts a block roughly every 2 hours, so look back further than one block interval
      - alert: ThanosSidecarNoUploads
        expr: increase(thanos_shipper_uploads_total[3h]) == 0
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "No Thanos sidecar uploads"
          description: "No blocks uploaded in the last 3 hours"
```

Published 2026-04-27 by the FixWikiHub Editorial Team.