![OTEL](./otel.png) ![Looks Good To Me](./lgtm.png)
### OpenTelemetry

![2023-11-30 #7 What is OpenTelemetry ?](o11y-weekly-otel-post-qrcode.gif)

- OTLP protocol: gRPC/HTTP protobuf services
- Semantic Conventions: attributes and resources
- Instrumentations
- Collectors / Exporters: SDK and agent
## Start up the demo!

![demo code link](./democode-qrcode.gif)

![Architecture](./docker-compose.png)
### OpenTelemetry Logs with Loki

![Loki](./loki.svg)

### OTEL Collector Agent mode

- Java Logback File Appender
- OpenTelemetry Collector Contrib Agent

![loki otel agent](./loki-agent.png)
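### Java Logback File Appender

```xml
<!-- The appender name "FILE" is illustrative -->
<appender name="FILE" class="ch.qos.logback.core.FileAppender">
  <file>log/${SERVICE_NAME}.log</file>
  <append>true</append>
  <encoder>
    <pattern>timestamp=%d{yyyy/MM/dd HH:mm:ss.SSSSSSSSS}\t service.version=${service.version}\t traceId=%X{trace_id}\t spanId=%X{span_id}\t message=%msg%n</pattern>
  </encoder>
</appender>
```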
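The log line above is a sequence of `key=value` pairs, so the collector could split it into attributes before exporting. A minimal sketch, assuming Logback renders `\t` as a tab character and that a `regex_parser` operator is added to the `filelog/app` receiver; the attribute names (`service_version`, `trace_id`, `span_id`, `message`) are illustrative and not part of the demo:

```yaml
receivers:
  filelog/app:
    include: [ /app/log/*.log ]
    multiline:
      line_start_pattern: timestamp=
    operators:
      # Assumption: fields are separated by a tab followed by optional whitespace,
      # matching the Logback pattern on the slide. (?s) lets the message span
      # several lines (for example a stack trace).
      - type: regex_parser
        regex: '(?s)^timestamp=(?P<timestamp>[^\t]+)\t\s*service\.version=(?P<service_version>[^\t]*)\t\s*traceId=(?P<trace_id>[^\t]*)\t\s*spanId=(?P<span_id>[^\t]*)\t\s*message=(?P<message>.*)$'
```

Named capture groups end up as log record attributes, which can then be queried in Loki or promoted to labels if their cardinality is low.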
### OTEL Loki Pipeline

```yaml
exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
    default_labels_enabled:
      level: true

receivers:
  filelog/app:
    include: [ /app/log/*.log ]
    storage: file_storage/app
    multiline:
      line_start_pattern: timestamp=
    resource:
      service.name: ${env:SERVICE_NAME}
      service.namespace: ${env:SERVICE_NAMESPACE}
      host.name: ${env:HOSTNAME}
      deployment.environment: ${env:DEPLOYMENT_ENVIRONMENT}

processors:
  batch/app:
  resource/app/loki:
    attributes:
      - action: insert
        key: loki.resource.labels
        value: service.name, service.namespace, service.version, host.name, deployment.environment, service.instance.id
      - action: insert
        key: loki.format
        value: raw

service:
  pipelines:
    logs/app:
      receivers: [filelog/app]
      processors: [batch/app, resource/app/loki]
      exporters: [loki]
```
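The `storage: file_storage/app` setting refers to a collector extension that is not shown on the slide. A minimal sketch of how it might be declared, assuming a checkpoint directory of `/var/lib/otelcol/file_storage` (a hypothetical path; any directory that exists and is writable by the collector should do):

```yaml
extensions:
  file_storage/app:
    # Assumption: hypothetical path. The filelog receiver stores its read
    # offsets here so a collector restart does not re-ingest old lines.
    directory: /var/lib/otelcol/file_storage

service:
  # The extension has to be enabled here in addition to being declared above.
  extensions: [file_storage/app]
```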
### Java Agent / OTLP mode

- Logback OpenTelemetry Appender
- OpenTelemetry Java Agent

![loki OTLP](loki-otlp.png)
### Maven Dependency

```xml
<dependency>
  <groupId>io.opentelemetry.instrumentation</groupId>
  <artifactId>opentelemetry-logback-appender-1.0</artifactId>
  <version>2.0.0-alpha</version>
</dependency>
```
### Logback OTEL Configuration

```xml
```
### OpenTelemetry Java Agent

![java agent OTEL](./java-agent-otel.qrcode.gif)

```ini
command=java
    -javaagent:/app/opentelemetry-javaagent.jar
    -Dservice.name=%(ENV_SERVICE_NAME)s
    -Dservice.namespace=%(ENV_SERVICE_NAMESPACE)s
    -Dhost.name=%(host_node_name)s
    -Ddeployment.environment=%(ENV_DEPLOYMENT_ENVIRONMENT)s
    -Dotel.resource.attributes=service.name=%(ENV_SERVICE_NAME)s,service.namespace=%(ENV_SERVICE_NAMESPACE)s,deployment.environment=%(ENV_DEPLOYMENT_ENVIRONMENT)s,host.name=%(host_node_name)s
    -jar /app/main.jar
    --spring.application.name=%(ENV_SERVICE_NAME)s
```
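In OTLP mode the agent pushes logs over the network instead of writing a file, so the collector needs an OTLP receiver rather than `filelog`. A minimal sketch of what the receiving side might look like, assuming the collector listens on the default OTLP/HTTP port 4318 and forwards to the same Loki endpoint as the file-based pipeline; the pipeline name `logs/otlp` is illustrative:

```yaml
receivers:
  otlp:
    protocols:
      http:
        # Assumption: default OTLP/HTTP port, reachable by the Java agent.
        endpoint: 0.0.0.0:4318

exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push

service:
  pipelines:
    logs/otlp:
      receivers: [otlp]
      exporters: [loki]
```

Resource attributes such as `service.name` arrive with the OTLP payload, so this variant does not need a `resource` block on the receiver.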
### Metrics with OTLP Mimir

![Mimir](./mimir.svg)

![OTLP JVM QR code](./otlp-jvm-qrcode.gif)

![Java metrics](./java-metrics.png)

### RED Method

![RED method](./metrics-red.png)

### Metrics to Traces Data Link

![Metrics to Traces Data Link](./metrics-traces-datalink.png)

### JVM Metrics

![JVM Metrics](./metrics-jvm-utilization.png)
### OTLP Micrometer registry

```xml
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-registry-otlp</artifactId>
</dependency>
```
### Configuration

```yaml
management:
  otlp:
    metrics:
      export:
        enabled: true
        step: 10s
        url: http://mimir:9009/otlp/v1/metrics
  metrics:
    tags:
      deployment.environment: '${deployment.environment}'
      host.name: '${host.name}'
      service:
        name: '${service.name}'
        namespace: '${service.namespace}'
        version: '@project.version@'
    distribution:
      percentiles:
        all: 0.5, 0.95, 0.99
```
### Tracing with Tempo
![traces error](./traces-error.png)
![traces durations](./traces-durations.png)
### Head Sampling

- Start with a low ratio: parent-based sampling at 1% or 10%
- Increase it only if needed, keeping an eye on observability backend load, billing, and rate limiting
### Head Sampling Configuration

```yaml
environment:
  - SERVICE_NAME=client
  - SERVICE_NAMESPACE=demo
  - DEPLOYMENT_ENVIRONMENT=dev
  - OTEL_TRACES_SAMPLER=parentbased_traceidratio
  - OTEL_TRACES_SAMPLER_ARG=0.1
  - OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcontribcol-gateway:4318
  - OTEL_METRICS_EXPORTER=none
```
### Traces monitoring

![Pipeline monitoring](traces-pipeline-monitoring.png)

### OTEL metrics

```yaml
receivers:
  prometheus/gateway:
    config:
      scrape_configs:
        - job_name: otelcol-contrib/gateway
          scrape_interval: 10s
          static_configs:
            - targets: [0.0.0.0:8888]

service:
  telemetry:
    metrics:
      level: detailed
  pipelines:
    metrics/gateway:
      receivers: [prometheus/gateway]
      exporters: [otlphttp/gateway/mimir]
```
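The pipeline above exports to `otlphttp/gateway/mimir`, which is not defined on the slide. A minimal sketch of such an exporter, assuming Mimir is reachable at the same host and port used in the Micrometer configuration:

```yaml
exporters:
  otlphttp/gateway/mimir:
    # Assumption: base URL only. The otlphttp exporter appends /v1/metrics itself,
    # which gives http://mimir:9009/otlp/v1/metrics.
    endpoint: http://mimir:9009/otlp
```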
### OTEL Collector host metrics

![OTEL Collector host metrics Dashboard](./otelcol-hostmetrics-qrcode.gif)

![OTEL Collector Host Metrics](otelcol-hostmetrics.png)

### Otelcol Configuration

```yaml
receivers:
  # otelcontribcol metrics + host metrics
  prometheus/gateway:
    config:
      scrape_configs:
        - job_name: otelcol-contrib/gateway
          scrape_interval: 10s
          static_configs:
            - targets: [0.0.0.0:8888]
  hostmetrics/gateway:
    collection_interval: 10s
    scrapers:
      cpu:
        metrics:
          system.cpu.logical.count:
            enabled: true
      memory:
        metrics:
          system.memory.utilization:
            enabled: true
          system.memory.limit:
            enabled: true
      load:
      disk:
      filesystem:
        metrics:
          system.filesystem.utilization:
            enabled: true
      network:
      paging:
      processes:
      process:
        mute_process_user_error: true
        metrics:
          process.cpu.utilization:
            enabled: true
          process.memory.utilization:
            enabled: true
          process.threads:
            enabled: true
          process.paging.faults:
            enabled: true

processors:
  batch/gateway:
  attributes/gateway:
    actions:
      - key: service.namespace
        action: upsert
        value: gateway
      - key: service.name
        action: upsert
        value: otelcol-contrib/gateway
  resourcedetection/system:
    detectors: ["system"]
    system:
      hostname_sources: ["os"]
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - set(attributes["host.name"], resource.attributes["host.name"])
          - set(attributes["process.command"], resource.attributes["process.command"])
          - set(attributes["process.command_line"], resource.attributes["process.command_line"])
          - set(attributes["process.executable.name"], resource.attributes["process.executable.name"])
          - set(attributes["process.executable.path"], resource.attributes["process.executable.path"])
          - set(attributes["process.owner"], resource.attributes["process.owner"])
          - set(attributes["process.parent_pid"], resource.attributes["process.parent_pid"])
          - set(attributes["process.pid"], resource.attributes["process.pid"])

service:
  telemetry:
    metrics:
      level: detailed
    logs:
      level: info
  pipelines:
    metrics/gateway:
      receivers: [prometheus/gateway, hostmetrics/gateway]
      processors: [attributes/gateway, resourcedetection/system, transform, batch/gateway]
      exporters: [otlphttp/gateway/mimir]
```
### Gateway Tail Sampling

![Gateway Tail Sampling](./traces-pipeline-monitoring.png)

### Latency Policy

```yaml
# skip traces where latencies are < 100ms
{ name: latency-policy, type: latency, latency: {threshold_ms: 100} },
```

### Error Policy

```yaml
{ name: status_code-error-policy, type: status_code, status_code: {status_codes: [ERROR]} },
```

### Exclude Error Policy

```yaml
{
  name: http-status-code-error-policy,
  type: string_attribute,
  string_attribute:
    {
      key: error.type,
      values: [4..],
      enabled_regex_matching: true,
      invert_match: true,
    },
},
```
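The three policies are entries of the `policies` list of the `tail_sampling` processor. A minimal sketch of how they might be assembled on the gateway, assuming a `decision_wait` of 10s and a traces pipeline named `traces/gateway`; the `otlp` receiver and `otlp/tempo` exporter are hypothetical names that would be defined elsewhere in the same file:

```yaml
processors:
  tail_sampling:
    # Assumption: how long spans of a trace are buffered before a decision is made.
    decision_wait: 10s
    policies:
      [
        { name: latency-policy, type: latency, latency: {threshold_ms: 100} },
        { name: status_code-error-policy, type: status_code, status_code: {status_codes: [ERROR]} },
        {
          name: http-status-code-error-policy,
          type: string_attribute,
          string_attribute: { key: error.type, values: [4..], enabled_regex_matching: true, invert_match: true },
        },
      ]

service:
  pipelines:
    traces/gateway:
      receivers: [otlp]              # hypothetical receiver name
      processors: [tail_sampling]
      exporters: [otlp/tempo]        # hypothetical exporter name
```

How an `invert_match` policy combines with the other decisions depends on the collector version, so it is worth checking the sampling counters exposed by the collector's own metrics after enabling it.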
demo