

Observability Guide

Overview

HotPlex exposes OpenTelemetry traces, Prometheus metrics, and HTTP health check endpoints. This guide also covers its structured JSON logs.

OpenTelemetry Tracing

Configuration

export OTEL_ENDPOINT="localhost:4317"
export OTEL_SERVICE_NAME="hotplex"
export OTEL_SAMPLING_RATE="1.0"
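
As an illustration of how a launcher or test harness might read these variables before starting HotPlex, here is a minimal Python sketch. The `OtelConfig` class and `load_otel_config` function are hypothetical names, the fallback values simply mirror the example above, and the assumption that `OTEL_SAMPLING_RATE` is a fraction between 0 and 1 is not confirmed by this guide.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OtelConfig:
    endpoint: str
    service_name: str
    sampling_rate: float


def load_otel_config(env=None):
    """Read the three OTEL_* variables and validate the sampling rate.

    Fallbacks mirror the example values in this guide; they are not
    documented HotPlex defaults.
    """
    env = os.environ if env is None else env
    rate = float(env.get("OTEL_SAMPLING_RATE", "1.0"))
    # Assumption: the rate is the fraction of traces to keep.
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"OTEL_SAMPLING_RATE must be in [0, 1], got {rate}")
    return OtelConfig(
        endpoint=env.get("OTEL_ENDPOINT", "localhost:4317"),
        service_name=env.get("OTEL_SERVICE_NAME", "hotplex"),
        sampling_rate=rate,
    )
```

Passing a plain dictionary instead of the process environment, for example `load_otel_config({"OTEL_SAMPLING_RATE": "0.25"})`, makes the validation easy to exercise without exporting anything.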

Spans

Span Name              Description
session.execute        Full session execution
tool.use               Tool invocation
security.danger_block  WAF block event

Attributes

Attribute         Description
session.id        Session identifier
namespace         Namespace
tool.name         Tool name
tool.id           Tool invocation ID
danger.operation  Blocked operation

Prometheus Metrics

Endpoints

GET /metrics

Metrics

Metric                            Type       Description
hotplex_sessions_active           gauge      Active sessions
hotplex_sessions_total            counter    Total sessions created
hotplex_sessions_errors           counter    Session errors
hotplex_tools_invoked             counter    Tool invocations
hotplex_dangers_blocked           counter    WAF blocks
hotplex_request_duration_seconds  histogram  Request latency
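
To show how these metrics could be consumed outside Prometheus, the sketch below parses a scrape body in the Prometheus text exposition format and derives a session error ratio. The sample values in `SAMPLE` are invented, `parse_metrics` and `error_ratio` are hypothetical helpers, and the code assumes the counters and the gauge are exposed without labels under exactly the names listed above.

```python
SAMPLE = """\
# HELP hotplex_sessions_active Active sessions
# TYPE hotplex_sessions_active gauge
hotplex_sessions_active 3
# TYPE hotplex_sessions_total counter
hotplex_sessions_total 40
# TYPE hotplex_sessions_errors counter
hotplex_sessions_errors 2
hotplex_request_duration_seconds_bucket{le="0.5"} 37
"""


def parse_metrics(text):
    """Parse unlabeled samples from Prometheus text exposition format."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        name, _, rest = line.partition(" ")
        if "{" in name:
            continue  # labeled series (e.g. histogram buckets) are skipped here
        # The first field after the name is the value; a timestamp may follow.
        samples[name] = float(rest.split()[0])
    return samples


def error_ratio(samples):
    """Fraction of created sessions that ended in an error."""
    total = samples.get("hotplex_sessions_total", 0.0)
    if total == 0:
        return 0.0
    return samples.get("hotplex_sessions_errors", 0.0) / total
```

In practice the text would come from a `GET /metrics` request rather than a literal string. Histogram buckets are deliberately ignored, so latency percentiles still need Prometheus or Grafana.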

Grafana Dashboard

Import the dashboard from docs/grafana-dashboard.json.

Health Checks

Endpoints

GET /health       # Basic health
GET /health/ready # Readiness probe
GET /health/live  # Liveness probe

Response

{
  "status": "healthy",
  "checks": {
    "engine": true,
    "pool": true
  }
}
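
A monitoring script could interpret this response along the following lines. This is a sketch under assumptions: `is_healthy` and `failing_checks` are hypothetical helpers, and the guide does not document what an unhealthy response looks like, so the code treats any status other than "healthy" or any check that is not `true` as a failure.

```python
import json


def failing_checks(body):
    """Return the names of checks that are not true, sorted."""
    checks = json.loads(body).get("checks", {})
    return sorted(name for name, ok in checks.items() if ok is not True)


def is_healthy(body):
    """True when status is "healthy" and every listed check passed."""
    doc = json.loads(body)
    return doc.get("status") == "healthy" and not failing_checks(body)
```

Applied to the response shown above, `is_healthy` returns `True` and `failing_checks` returns an empty list.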

Logging

Structured Logging

{"level":"info","msg":"session started","session_id":"abc123","namespace":"default"}

Log Levels

Level  Use Case
debug  Detailed debugging
info   Normal operations
warn   Recoverable errors
error  Failures
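
Because each log entry is a single JSON object, the levels above can be used to filter a log stream with a few lines of code. The sketch below assumes one JSON object per line with a `level` field as in the example. `filter_logs` is a hypothetical helper, and the severity ordering debug < info < warn < error is the conventional one rather than something this guide states.

```python
import json

# Conventional severity ordering, lowest to highest.
LEVELS = {"debug": 0, "info": 1, "warn": 2, "error": 3}


def filter_logs(lines, min_level="warn"):
    """Return parsed records whose level is at or above min_level."""
    threshold = LEVELS[min_level]
    kept = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON
        if not isinstance(record, dict):
            continue
        level = record.get("level")
        if isinstance(level, str) and LEVELS.get(level, -1) >= threshold:
            kept.append(record)
    return kept
```

For example, `filter_logs(lines, "info")` keeps the "session started" entry shown above, while the default threshold of "warn" drops it.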

Released under the MIT License.