Your OpenTelemetry Collector is failing to process telemetry data, showing errors in logs, or metrics/traces/logs aren't reaching their destinations. The Collector is central to your observability pipeline, so errors here cascade across all monitoring.
Introduction
This article walks through diagnosing and fixing OpenTelemetry Collector errors. These errors typically surface in production environments and can cause service disruptions if not addressed promptly.
Symptoms
Common error messages include:
- `failed to push telemetry: exporter is down`
- `receiver error: connection refused`
- `processor memory_limiter exceeded`

Common Causes
- Invalid or incomplete configuration
- Missing or incorrect credentials
- Network connectivity issues
- Version compatibility problems
- Resource exhaustion or limits
- Permission or access denied
Step-by-Step Fix
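1. Check the Collector logs for the specific error message.
2. Verify the configuration of the failing component.
3. Test network connectivity to receivers and exporter backends.
4. Review recent changes to the configuration, deployment, or backends.
5. Apply the corrective action that matches the cause.
6. Verify the fix.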
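Step 1 can be partly automated. The sketch below reads Collector log lines on standard input and counts them by likely failure area, using the message fragments quoted in this article. The function name `triage_logs` and the category labels are illustrative assumptions, not part of the Collector, and the patterns will need extending for real logs.

```shell
#!/bin/sh
# triage_logs: read log lines on stdin and count them per failure area.
# Patterns are the message fragments quoted in this article (hypothetical helper).
triage_logs() {
  awk '
    { line = tolower($0) }
    line ~ /memory_limiter|memory limiter/ { n["memory"]++;   next }
    line ~ /failed to load config|yaml:/   { n["config"]++;   next }
    line ~ /x509|certificate|tls/          { n["tls"]++;      next }
    line ~ /exporter|failed to push/       { n["exporter"]++; next }
    line ~ /receiver/                      { n["receiver"]++; next }
    END {
      # Print every category in a fixed order, including those with zero hits
      split("config exporter receiver memory tls", order, " ")
      for (i = 1; i <= 5; i++) printf "%s=%d\n", order[i], n[order[i]]
    }
  '
}

# Demo on sample lines taken from the Symptoms list
printf '%s\n' \
  'failed to push telemetry: exporter is down' \
  'receiver error: connection refused' \
  'processor memory_limiter exceeded' | triage_logs
```

In a cluster, the same function could be fed with `kubectl logs -l app=otel-collector -n monitoring | triage_logs`; the category with the highest count indicates which section of this article to start with.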
Understanding Collector Architecture
The OpenTelemetry Collector has four main component types:
- Receivers: Accept telemetry from various sources
- Processors: Transform/enrich telemetry data
- Exporters: Send telemetry to backends
- Extensions: Additional functionality (health check, zpages, etc.)
Error patterns:
```
failed to push telemetry: exporter is down
receiver error: connection refused
processor memory_limiter exceeded
pipeline not initialized: configuration error
```
Initial Diagnosis
Check Collector status and logs:
```bash
# Check Collector pod/service status
kubectl get pods -l app=otel-collector -n monitoring
systemctl status otel-collector

# Check Collector logs (-E enables the | alternation)
kubectl logs -l app=otel-collector -n monitoring | grep -iE "error|fail"
journalctl -u otel-collector | grep -iE "error|fail"

# Check Collector metrics endpoint
curl -s http://localhost:8888/metrics | grep -E "otelcol_receiver|otelcol_exporter|otelcol_processor"

# Check Collector health (health_check extension)
curl -s http://localhost:13133/

# Get Collector configuration
kubectl get configmap otel-collector-config -n monitoring -o yaml

# Check internal telemetry
curl -s http://localhost:8888/metrics | grep otelcol_process_uptime
```
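The metrics endpoint returns Prometheus text format: one line per label combination, plus `# HELP` and `# TYPE` comment lines, so a plain `grep` usually yields several lines rather than one number. A minimal sketch for summing all series of one metric follows. The helper name `metric_sum` is hypothetical, the sample input is made up for illustration, and the code assumes samples carry no trailing timestamp.

```shell
#!/bin/sh
# metric_sum NAME: sum every series of metric NAME read from stdin
# (Prometheus text format). Comment lines are skipped.
metric_sum() {
  awk -v name="$1" '
    /^#/ { next }
    {
      metric = $1
      sub(/\{.*/, "", metric)            # strip the label set, if any
      if (metric == name) total += $NF   # sample value is the last field
    }
    END { printf "%d\n", total }
  '
}

# Demo on made-up sample data
cat <<'EOF' | metric_sum otelcol_exporter_send_failed_spans_total
# TYPE otelcol_exporter_send_failed_spans_total counter
otelcol_exporter_send_failed_spans_total{exporter="otlp/tempo"} 3
otelcol_exporter_send_failed_spans_total{exporter="otlphttp/loki"} 4
otelcol_exporter_sent_spans_total{exporter="otlp/tempo"} 120
EOF
# prints 7
```

Against a live Collector the input would come from `curl -s http://localhost:8888/metrics | metric_sum <name>`. Exact metric names, including whether they carry a `_total` suffix, depend on the Collector version.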
Common Cause 1: Configuration Syntax Errors
Invalid YAML or wrong configuration structure.
Error pattern:
```
failed to load config: yaml: line 10: mapping values are not allowed here
service::pipelines::traces: no receivers defined
```
Diagnosis:
```bash
# Check configuration file syntax
cat /etc/otelcol/config.yaml

# Validate YAML syntax (requires PyYAML)
python3 -c "import yaml; yaml.safe_load(open('/etc/otelcol/config.yaml'))"

# Or use yq
yq eval /etc/otelcol/config.yaml

# Check Collector startup logs for config errors
kubectl logs -l app=otel-collector -n monitoring | grep -iE "config|error" | head -20

# Use config validation tool if available
otelcol validate --config=/etc/otelcol/config.yaml
```
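One frequent source of YAML parse errors is a tab character used for indentation, which YAML does not allow. The sketch below lists such lines before the file ever reaches the Collector. `find_tab_indent` is a hypothetical helper name, and the check covers only this one class of mistake.

```shell
#!/bin/sh
# find_tab_indent FILE: print "LINE_NUMBER:content" for each line whose
# leading whitespace contains a tab. Exit status 0 means tabs were found.
find_tab_indent() {
  tab=$(printf '\t')
  grep -n "^ *${tab}" "$1"
}

# Usage (path taken from the commands above):
#   find_tab_indent /etc/otelcol/config.yaml && echo "replace tabs with spaces"
```

A clean result does not prove the file is valid, so it complements the YAML parser and `otelcol validate` rather than replacing them.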
Solution:
Fix configuration syntax:
```yaml
# Correct OpenTelemetry Collector configuration structure
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
  batch:
    timeout: 10s
    send_batch_size: 1024

exporters:
  otlp:
    endpoint: tempo:4317
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:9090
  otlphttp:
    # Assumed log backend (Loki OTLP ingestion path); adjust to your backend
    endpoint: http://loki:3100/otlp

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679

service:
  # Extensions only run when listed here
  extensions: [health_check, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]  # memory_limiter goes first
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp]
```
Common syntax fixes:
```yaml
# WRONG: Missing receiver in pipeline
service:
  pipelines:
    traces:
      processors: [batch]
      exporters: [otlp]
---
# CORRECT: Include receivers
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
---
# WRONG: Invalid field name
processors:
  memory_limiter:
    limit: 512  # Wrong field name
---
# CORRECT: Use the correct field
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
```
Common Cause 2: Exporter Connection Failures
Exporters cannot reach destination backends.
Error pattern:
```
Exporter "otlp" failed to export items: connection refused
failed to push to endpoint: context deadline exceeded
```
Diagnosis:
```bash
# Check exporter metrics
curl -s http://localhost:8888/metrics | grep -E "otelcol_exporter.*failed|otelcol_exporter.*sent"

# Test backend connectivity
# Port 4317 speaks gRPC, so test the TCP port rather than an HTTP path
nc -zv tempo 4317
curl -v http://prometheus:9090/-/healthy
curl -v http://loki:3100/ready

# Check from the Collector pod (requires nc in the image; minimal Collector images may not include it)
kubectl exec -it otel-collector-pod -n monitoring -- nc -zv tempo 4317

# Check exporter errors in logs
kubectl logs -l app=otel-collector -n monitoring | grep -iE "exporter|failed"

# Monitor export volume (raw counters; sample twice to see growth)
curl -s http://localhost:8888/metrics | grep "otelcol_exporter_sent_spans"
# For a rate, run this PromQL query in Prometheus instead:
#   rate(otelcol_exporter_sent_spans_total[5m])
```
Solution:
Fix exporter connectivity:
```yaml
# OTLP exporter configuration
exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
    timeout: 30s
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 5000

  prometheusremotewrite:
    # Prometheus must run with --web.enable-remote-write-receiver to accept this
    endpoint: http://prometheus:9090/api/v1/write
    tls:
      insecure: true

  otlphttp/loki:
    # otlphttp appends /v1/logs to the endpoint; Loki's OTLP base path is /otlp
    endpoint: http://loki:3100/otlp
    tls:
      insecure: true
```
```bash
# Verify network connectivity
# For Kubernetes, check service endpoints
kubectl get endpoints tempo -n monitoring
kubectl get endpoints prometheus -n monitoring
```
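The `retry_on_failure` settings decide how long an exporter keeps retrying before it gives up on a batch. The sketch below prints an approximate retry schedule for a given `initial_interval`, `max_interval`, and `max_elapsed_time`. It is a simplification under stated assumptions: the growth factor of 1.5 is assumed as the default multiplier, random jitter is ignored, and the time spent in each failed send is not counted. `retry_schedule` is a hypothetical helper name.

```shell
#!/bin/sh
# retry_schedule INITIAL MAX_INTERVAL MAX_ELAPSED [MULTIPLIER]
# Prints the waits between retries (seconds) until the retry budget is used up.
# Assumptions: exponential growth by MULTIPLIER (default 1.5), no jitter.
retry_schedule() {
  awk -v init="$1" -v max="$2" -v budget="$3" -v mult="${4:-1.5}" '
    BEGIN {
      wait = init; elapsed = 0; n = 0
      while (elapsed + wait <= budget) {
        n++
        elapsed += wait
        printf "retry %d after %.1fs (elapsed %.1fs)\n", n, wait, elapsed
        wait = wait * mult
        if (wait > max) wait = max   # cap at max_interval
      }
      printf "retries=%d\n", n
    }
  '
}

# Values from the otlp/tempo exporter above: 5s, 30s, 300s
retry_schedule 5 30 300
```

Under these assumptions the example configuration allows 12 retries over roughly five minutes. If the backend is regularly down for longer than that, either raise `max_elapsed_time` or rely on the sending queue to buffer data in the meantime.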
Common Cause 3: Memory Limiter Exceeded
The memory_limiter processor refuses incoming data once Collector memory usage reaches its configured limit.
Error pattern:
```
Memory limiter exceeded: dropping telemetry
otelcol_processor_refused_spans_total increasing
```
Diagnosis:
```bash
# Check memory limiter metrics
# (names vary by Collector version; refusals are commonly reported as
#  otelcol_processor_refused_* with a processor="memory_limiter" label)
curl -s http://localhost:8888/metrics | grep -E "otelcol_processor.*memory_limiter"

# Check refused items
curl -s http://localhost:8888/metrics | grep otelcol_processor_refused

# Monitor Collector memory usage
kubectl top pods -l app=otel-collector -n monitoring
ps aux | grep otelcol

# Check memory configuration
kubectl get configmap otel-collector-config -n monitoring -o yaml | grep -A 5 "memory_limiter"
kubectl describe pod -l app=otel-collector -n monitoring | grep -A 5 "Limits"

# Look for memory errors in logs
kubectl logs -l app=otel-collector -n monitoring | grep -iE "memory|limit"
```
Solution:
Adjust memory limits:
```yaml
# Memory limiter processor configuration
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 1024        # Increase based on available memory
    spike_limit_mib: 256

  # Increase batch sizes if needed
  batch:
    timeout: 10s
    send_batch_size: 1024
    send_batch_max_size: 2048

  # If memory is still exceeded, consider sampling
  probabilistic_sampler:
    sampling_percentage: 50  # Sample 50% of traces
```
```yaml
# For container deployment, also set container limits
# Kubernetes deployment
resources:
  limits:
    memory: 2Gi
  requests:
    memory: 1Gi
```
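The two memory_limiter values interact: the processor starts refusing data at a soft limit equal to `limit_mib - spike_limit_mib`, and `limit_mib` itself should stay below the container memory limit so the limiter can react before the kernel kills the process. The following sketch does that arithmetic for a proposed configuration. `memory_limiter_check` is a hypothetical helper, and all inputs are plain MiB integers.

```shell
#!/bin/sh
# memory_limiter_check LIMIT_MIB SPIKE_LIMIT_MIB CONTAINER_LIMIT_MIB
# Prints the derived soft limit and the headroom left below the container limit.
# Returns 1 when the combination cannot work as intended.
memory_limiter_check() {
  limit_mib=$1; spike_mib=$2; container_mib=$3
  if [ "$spike_mib" -ge "$limit_mib" ]; then
    echo "invalid: spike_limit_mib must be smaller than limit_mib"
    return 1
  fi
  soft_mib=$((limit_mib - spike_mib))          # refusals start here
  headroom_mib=$((container_mib - limit_mib))  # room for non-heap memory
  echo "soft_limit_mib=$soft_mib hard_limit_mib=$limit_mib headroom_mib=$headroom_mib"
  if [ "$headroom_mib" -le 0 ]; then
    echo "warning: limit_mib is not below the container limit"
    return 1
  fi
}

# Values from the configuration above: 1024 MiB limit, 256 MiB spike, 2Gi container
memory_limiter_check 1024 256 2048
# prints: soft_limit_mib=768 hard_limit_mib=1024 headroom_mib=1024
```

With the example values, refusals begin at 768 MiB, well before the 2Gi container limit. Raising `limit_mib` without raising the container limit shrinks the headroom and eventually trades refused data for OOM kills.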
Common Cause 4: Receiver Protocol Issues
Receivers are not accepting telemetry from sources.
Error pattern:
```
Receiver error: failed to accept connection
otlp receiver: grpc protocol error
```
Diagnosis:
```bash
# Check receiver metrics
curl -s http://localhost:8888/metrics | grep -E "otelcol_receiver.*accepted|otelcol_receiver.*refused"

# Check if receivers are listening
netstat -tlnp | grep -E ":(4317|4318)"

# Test receiver connectivity
# Any gRPC response, even a reflection error, shows the port is reachable
grpcurl -plaintext localhost:4317 list
# A GET gets an HTTP error status (the endpoint expects POST), which still confirms the listener is up
curl -v http://localhost:4318/v1/traces

# Check receiver errors
kubectl logs -l app=otel-collector -n monitoring | grep -iE "receiver|protocol"

# Check incoming telemetry volume (raw counters; sample twice to see growth)
curl -s http://localhost:8888/metrics | grep "otelcol_receiver_accepted_spans"
# For a rate, run this PromQL query in Prometheus instead:
#   rate(otelcol_receiver_accepted_spans_total[5m])
```
Solution:
Fix receiver configuration:
```yaml
# OTLP receiver configuration
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        max_recv_msg_size_mib: 16
        keepalive:
          server_parameters:
            max_connection_age: 120s
            max_connection_age_grace: 30s
          enforcement_policy:
            min_time: 10s
            permit_without_stream: false
      http:
        endpoint: 0.0.0.0:4318
        max_request_body_size: 16777216  # bytes (16 MiB)
        cors:
          allowed_origins:
            - "*"

  # For Jaeger receiver
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
      thrift_binary:
        endpoint: 0.0.0.0:6832

  # For Kafka receiver
  kafka:
    brokers: ["kafka:9092"]
    topic: "otel-spans"
    encoding: otlp_proto
    group_id: "otel-collector"
```
Common Cause 5: Pipeline Not Started
Pipelines fail to initialize due to missing components.
Error pattern:
```
Pipeline "traces" not started: no exporters
Service pipeline initialization failed
```
Diagnosis:
```bash
# Check pipeline status in logs
kubectl logs -l app=otel-collector -n monitoring | grep -iE "pipeline|initialized|started"

# Verify service configuration
grep -A 20 "^service:" /etc/otelcol/config.yaml

# Check component registration
kubectl logs -l app=otel-collector -n monitoring --since=5m | grep -i "registered"

# Check metrics for pipeline activity (data entering and leaving)
curl -s http://localhost:8888/metrics | grep -E "otelcol_receiver_accepted|otelcol_exporter_sent"
```
Solution:
Fix pipeline definition:
```yaml
# Complete pipeline configuration
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:
    timeout: 10s

exporters:
  otlp:
    endpoint: tempo:4317
    tls:
      insecure: true

extensions:
  health_check:
    endpoint: 0.0.0.0:13133

# Service MUST reference defined components
service:
  extensions: [health_check]  # Must be defined in extensions section
  pipelines:
    traces:
      receivers: [otlp]       # Must be defined in receivers section
      processors: [batch]     # Must be defined in processors section
      exporters: [otlp]       # Must be defined in exporters section
```
```yaml
# Each component name in a pipeline must match its definition exactly
# Common mistake: typo in component name
service:
  pipelines:
    traces:
      receivers: [otlp_receivr]   # WRONG - typo
      exporters: [otlp_exporter]  # WRONG - name doesn't match
```
Common Cause 6: Batch Processor Timeout
The batch processor holds data too long before sending it.
Error pattern:
```
Batch processor timeout exceeded
Data delayed in batch processor
```
Diagnosis:
```bash
# Check batch processor metrics
curl -s http://localhost:8888/metrics | grep -E "otelcol_processor_batch"

# Check what triggers sends: the timeout or the batch size
curl -s http://localhost:8888/metrics | grep -E "otelcol_processor_batch_(timeout|batch_size)_trigger_send"

# Monitor batch sizes
curl -s http://localhost:8888/metrics | grep "otelcol_processor_batch_batch_send_size"

# Check for timeout issues in logs
kubectl logs -l app=otel-collector -n monitoring | grep -iE "batch|timeout"
```
Solution:
Optimize batch processor:
```yaml
processors:
  # Balance between throughput and latency
  batch:
    timeout: 5s                # Max wait before sending batch
    send_batch_size: 512       # Send when this many items accumulated
    send_batch_max_size: 1024  # Never exceed this size

  # For high-volume with acceptable latency
  # (named instances let several variants coexist; reference one per pipeline)
  batch/high_volume:
    timeout: 30s
    send_batch_size: 10000
    send_batch_max_size: 20000

  # For low-latency requirements
  batch/low_latency:
    timeout: 1s
    send_batch_size: 100
    send_batch_max_size: 200
```
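Which of the two settings actually governs latency depends on the incoming item rate: at low rates the `timeout` fires first, at high rates `send_batch_size` does. The sketch below estimates this for a steady rate. It is a back-of-the-envelope model, not the processor's implementation, and `batch_trigger` is a hypothetical helper name.

```shell
#!/bin/sh
# batch_trigger RATE_ITEMS_PER_S SEND_BATCH_SIZE TIMEOUT_S
# Estimates which trigger fires first at a steady rate, the resulting
# batch size, and how long the first item of a batch waits.
batch_trigger() {
  awk -v rate="$1" -v size="$2" -v timeout="$3" '
    BEGIN {
      if (rate <= 0) { print "rate must be positive"; exit 1 }
      fill = size / rate   # seconds needed to collect a full batch
      if (fill <= timeout)
        printf "trigger=size items=%d wait_s=%.2f\n", size, fill
      else
        printf "trigger=timeout items=%d wait_s=%.2f\n", rate * timeout, timeout
    }
  '
}

# The balanced profile above (512 items, 5s) at two different loads
batch_trigger 50 512 5     # low traffic
batch_trigger 1000 512 5   # high traffic
```

At 50 items per second the timeout decides and batches stay well under `send_batch_size`, so lowering `timeout` is what reduces delay. At 1000 items per second batches fill in about half a second and the timeout is irrelevant.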
Common Cause 7: Processor Transform Errors
Transform processor failing due to invalid operations.
Error pattern:
```
Transform processor error: invalid attribute operation
```
Diagnosis:
```bash
# Check processor metrics
curl -s http://localhost:8888/metrics | grep otelcol_processor

# Check transform errors
kubectl logs -l app=otel-collector -n monitoring | grep -iE "transform|processor"
```
```yaml
# Inspect the data with the debug exporter
# (debug is an exporter, not a processor; add it to the pipeline's exporters list)
exporters:
  debug:
    verbosity: detailed
```
Solution:
Fix transform processor configuration:
```yaml
processors:
  # Correct attribute transformations
  attributes:
    actions:
      - key: environment
        value: production
        action: insert
      - key: deployment.environment
        from_attribute: environment
        action: upsert
      - key: sensitive.data
        action: delete

  # Transform processor (OTTL) for trace transformations
  transform:
    trace_statements:
      - context: span
        statements:
          - set(attributes["service.name"], "my-service")
          # Keep http.url in the list, otherwise the next statement has nothing to rewrite
          - keep_keys(attributes, ["service.name", "operation", "http.url"])
          # replace_pattern rewrites the matched part; replace_match would replace the whole value
          - replace_pattern(attributes["http.url"], "^http://", "https://")

  # Resource processor
  resource:
    attributes:
      - key: k8s.cluster.name
        value: production-cluster
        action: insert
      - key: service.instance.id
        from_attribute: pod.name
        action: upsert
```
Common Cause 8: TLS/Certificate Issues
TLS configuration problems blocking secure connections.
Error pattern:
```
TLS error: certificate verify failed
x509: certificate signed by unknown authority
```
Diagnosis:
```bash
# Test TLS connection
openssl s_client -connect tempo:4317 -showcerts </dev/null

# Check certificate validity dates, issuer, and subject
openssl s_client -connect tempo:4317 </dev/null 2>/dev/null | openssl x509 -noout -dates -issuer -subject

# Check Collector TLS configuration
kubectl get configmap otel-collector-config -n monitoring -o yaml | grep -A 10 "tls"

# Look for certificate errors in logs
kubectl logs -l app=otel-collector -n monitoring | grep -iE "cert|tls|x509"
```
Solution:
Configure TLS properly:
```yaml
exporters:
  # Exporter with TLS (mutual TLS when cert_file and key_file are set)
  otlp/tls:
    endpoint: tempo:4317
    tls:
      insecure: false
      ca_file: /etc/otelcol/certs/ca.crt
      cert_file: /etc/otelcol/certs/client.crt
      key_file: /etc/otelcol/certs/client.key
      min_version: "1.2"

  # Testing only: keep TLS but skip certificate verification
  otlp/skip_verify:
    endpoint: tempo:4317
    tls:
      insecure_skip_verify: true

  # Testing only: disable TLS entirely (plaintext)
  otlp/plaintext:
    endpoint: tempo:4317
    tls:
      insecure: true

# Receiver TLS configuration
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /etc/otelcol/certs/server.crt
          key_file: /etc/otelcol/certs/server.key
```
Verification
After fixing Collector issues:
```bash
# Check Collector is healthy
curl -s http://localhost:13133/
curl -s http://localhost:8888/metrics | grep otelcol_process_uptime

# Verify receivers are accepting
curl -s http://localhost:8888/metrics | grep "otelcol_receiver_accepted"

# Verify exporters are sending
curl -s http://localhost:8888/metrics | grep "otelcol_exporter_sent"

# Check no refused items
curl -s http://localhost:8888/metrics | grep "otelcol_processor_refused"
# Should be 0 or minimal

# Test telemetry flow
# Send test trace (traceId must be 32 hex characters, spanId 16)
curl -X POST http://localhost:4318/v1/traces \
  -H "Content-Type: application/json" \
  -d '{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"test"}}]},"scopeSpans":[{"spans":[{"traceId":"5b8efff798038103d269b633813fc60c","spanId":"eee19b7ec3c1b174","name":"test-operation"}]}]}]}'

# Verify backend received it
curl -s http://tempo:3200/api/traces/5b8efff798038103d269b633813fc60c | jq '.'

# Use zpages for diagnostics
# Navigate to http://localhost:55679/debug/servicez
# Check pipeline status
```
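OTLP identifiers have fixed sizes: a trace ID is 16 bytes and a span ID is 8 bytes, written as 32 and 16 hexadecimal characters in OTLP/JSON. A hard-coded ID also makes repeated tests hard to tell apart. The sketch below generates a fresh pair for each test and builds the JSON body. `random_hex` and `test_trace_payload` are hypothetical helper names, and the span fields beyond those used in the command above (`kind`, the two timestamps) are additions for illustration.

```shell
#!/bin/sh
# random_hex N: print N random bytes as 2*N lowercase hex characters.
random_hex() {
  od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n'
}

# test_trace_payload TRACE_ID SPAN_ID: print an OTLP/JSON body with one span
# that starts and ends at the current time.
test_trace_payload() {
  now_ns="$(date +%s)000000000"
  printf '{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"test"}}]},"scopeSpans":[{"spans":[{"traceId":"%s","spanId":"%s","name":"test-operation","kind":1,"startTimeUnixNano":"%s","endTimeUnixNano":"%s"}]}]}]}\n' \
    "$1" "$2" "$now_ns" "$now_ns"
}

trace_id=$(random_hex 16)   # 16 bytes -> 32 hex characters
span_id=$(random_hex 8)     #  8 bytes -> 16 hex characters
echo "trace id: $trace_id"
test_trace_payload "$trace_id" "$span_id"
```

The payload can be piped to the receiver with `test_trace_payload "$trace_id" "$span_id" | curl -X POST http://localhost:4318/v1/traces -H "Content-Type: application/json" -d @-`, and the printed trace ID is then the one to look up in the backend.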
Prevention
Monitor Collector health:
```yaml
groups:
  - name: otel_collector_health
    rules:
      - alert: OpenTelemetryCollectorDown
        expr: up{job="otel-collector"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "OpenTelemetry Collector is down"

      - alert: OpenTelemetryCollectorExporterErrors
        expr: rate(otelcol_exporter_send_failed_spans_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "OpenTelemetry Collector exporter errors"

      - alert: OpenTelemetryCollectorMemoryLimit
        # Metric and label names vary by Collector version; confirm the series on /metrics.
        # rate() is used so the alert clears once refusals stop.
        expr: rate(otelcol_processor_refused_spans_total{processor="memory_limiter"}[5m]) > 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "OpenTelemetry Collector memory limiter dropping data"

      - alert: OpenTelemetryCollectorReceiverErrors
        expr: rate(otelcol_receiver_refused_spans_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "OpenTelemetry Collector receiver refusing spans"
```
Regular Collector health check:
```bash
#!/bin/bash
# otelcol-health.sh

METRICS_URL="http://localhost:8888/metrics"

# Sum all series whose name starts with the given prefix,
# skipping the "# HELP" and "# TYPE" comment lines
sum_metric() {
  curl -s "$METRICS_URL" | awk -v m="$1" '!/^#/ && index($1, m) == 1 {s += $NF} END {printf "%d\n", s}'
}

# Check health endpoint
curl -sf http://localhost:13133/ >/dev/null && echo "Health OK" || echo "Health FAILED"

# Check refused metrics
REFUSED=$(sum_metric otelcol_processor_refused_spans)
if [ "$REFUSED" -gt 0 ]; then
  echo "WARNING: $REFUSED spans refused"
fi

# Check export errors
FAILED=$(sum_metric otelcol_exporter_send_failed_spans)
if [ "$FAILED" -gt 0 ]; then
  echo "WARNING: $FAILED spans failed to export"
fi

# Check uptime
UPTIME=$(sum_metric otelcol_process_uptime)
echo "Collector uptime: $UPTIME seconds"
```
OpenTelemetry Collector errors typically stem from configuration syntax, network connectivity, or resource limits. Validate configuration first, check receiver/exporter connectivity, and monitor memory usage to ensure reliable telemetry collection.