Observability (Loki & Tempo)

Coro agents treat production telemetry as primary evidence. When code is ambiguous, the base runtime guidance explicitly prefers running live LogQL / TraceQL queries over guessing from static reads (see /concepts/architecture/ and the agent runtime's .claude/CLAUDE.md).

Three MCP tools support this workflow:

| Tool | Purpose |
| --- | --- |
| loki_query | Execute LogQL (range queries) via the configured Loki base URL |
| tempo_get_trace | Fetch trace JSON by hex trace id |
| tempo_search | Run TraceQL search templates exposed by Grafana Tempo |

All three tools share the clients held in the tool context. These clients are instantiated at runner bootstrap (createLokiClient, createTempoClient) from in-memory Settings fields.
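As a rough illustration of what these tools ultimately request, the sketch below builds URLs for Loki's `query_range` endpoint and Tempo's trace and search endpoints. The helper names (`joinUrl`, `buildLokiRangeUrl`, `buildTempoTraceUrl`, `buildTempoSearchUrl`) and the default limits are assumptions made for this example; they are not the runner's createLokiClient / createTempoClient.

```typescript
// Hypothetical helpers for illustration only; the real clients may differ.

// Join a base URL and an API path, tolerating trailing slashes on the base.
function joinUrl(baseUrl: string, path: string): URL {
  return new URL(baseUrl.replace(/\/+$/, "") + path);
}

// Build a Loki range-query URL covering the `windowMinutes` before `endMs`.
function buildLokiRangeUrl(
  baseUrl: string,
  logql: string,
  endMs: number,
  windowMinutes: number = 5,
  limit: number = 100, // assumed default for this sketch
): string {
  const startMs = endMs - windowMinutes * 60_000;
  const url = joinUrl(baseUrl, "/loki/api/v1/query_range");
  url.searchParams.set("query", logql);
  // Loki accepts start/end as Unix epoch nanoseconds; append six zeros to ms.
  url.searchParams.set("start", `${Math.floor(startMs)}000000`);
  url.searchParams.set("end", `${Math.floor(endMs)}000000`);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Build a Tempo trace-by-id URL, rejecting ids that are not hex.
function buildTempoTraceUrl(baseUrl: string, traceId: string): string {
  if (!/^[0-9a-fA-F]{1,32}$/.test(traceId)) {
    throw new Error(`not a hex trace id: ${traceId}`);
  }
  return joinUrl(baseUrl, `/api/traces/${traceId}`).toString();
}

// Build a Tempo TraceQL search URL.
function buildTempoSearchUrl(
  baseUrl: string,
  traceql: string,
  limit: number = 20, // assumed default for this sketch
): string {
  const url = joinUrl(baseUrl, "/api/search");
  url.searchParams.set("q", traceql);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}
```

For example, `buildLokiRangeUrl("http://loki:3100", '{app="api"} |= "error"', Date.now())` targets the last five minutes of logs for a hypothetical `app="api"` stream.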

Configuring backends

There is no dashboard Settings field for Loki or Tempo today. Configure backends on the runner host via environment variables read at bootstrap (packages/runner/src/runner/build-settings.ts):

| Variable | Meaning |
| --- | --- |
| LOKI_BASE_URL | Grafana Loki querier reachable from the runner |
| LOKI_API_KEY / LOKI_USERNAME | Optional tenancy / auth pairing |
| TEMPO_BASE_URL | Tempo HTTP API gateway |
| TEMPO_API_KEY | Optional bearer/API token |

Kubernetes / systemd units should inject these centrally; developers can export locally for debugging.
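A minimal sketch of how these variables might be folded into settings at bootstrap follows. The variable names come from the table above; the function `readObservabilitySettings` and the shape of its return value are hypothetical and do not reflect the actual contents of build-settings.ts.

```typescript
// Hypothetical sketch; the real build-settings.ts may be structured differently.

type Env = Record<string, string | undefined>;

interface BackendSettings {
  baseUrl: string;
  apiKey?: string;
  username?: string;
}

interface ObservabilitySettings {
  loki?: BackendSettings;
  tempo?: BackendSettings;
}

// Treat unset, empty, and whitespace-only values as "not configured".
function nonEmpty(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// A backend is only configured when its base URL is present.
function readObservabilitySettings(env: Env): ObservabilitySettings {
  const settings: ObservabilitySettings = {};

  const lokiUrl = nonEmpty(env["LOKI_BASE_URL"]);
  if (lokiUrl) {
    settings.loki = {
      baseUrl: lokiUrl,
      apiKey: nonEmpty(env["LOKI_API_KEY"]),
      username: nonEmpty(env["LOKI_USERNAME"]),
    };
  }

  const tempoUrl = nonEmpty(env["TEMPO_BASE_URL"]);
  if (tempoUrl) {
    settings.tempo = {
      baseUrl: tempoUrl,
      apiKey: nonEmpty(env["TEMPO_API_KEY"]),
    };
  }

  return settings;
}
```

In a Node process this would be called as `readObservabilitySettings(process.env)`. Taking the environment as a parameter keeps the function easy to exercise in tests without mutating global state.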

Codifying conventions: backend URLs are supplied via env vars on the runner host, but tenant memory (including snippets under memory/snippets/) is the right place for recommended LogQL dashboards, label cardinality tips, and escalation contacts that everyone should reuse.

When evaluations call tools

Evaluator and QA-phase agents emphasise verifying acceptance criteria against real traffic whenever possible, for example by confirming a canary rollout before closing the job loop. Teach teams to cite dashboard deep links and the query text in evaluation artefacts (post_artifact report-md).

Operational hygiene

  • Rate limits: heavy queries inflate the wall-clock time of a phase, so start explorations with a narrow time window (5m) and widen only if needed.
  • Secrets: do not return raw stack traces that contain PII; summarise counts and cite exemplar trace ids that point to Tempo instead.
  • Offline fallback: when the env vars are absent, the clients return available: false responses, and agents escalate instead of hallucinating infrastructure.
  • Skill bundles such as observability-additions (layered under .claude/skills/): see /guides/add-skill/.
  • Grafana- or Datadog-specific MCP adjuncts beyond the built-ins: see /guides/byo-mcp/.
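The rate-limit and offline-fallback points above can be sketched as a small planning step that runs before any query is sent. Only the `available: false` flag and the 5m starting window come from this page; the function `planLokiQuery`, the `reason` field, and the 60-minute cap are assumptions made for illustration.

```typescript
// Hypothetical sketch; field names other than `available` are assumptions.

interface Unavailable {
  available: false;
  reason: string;
}

interface Planned {
  available: true;
  logql: string;
  windowMinutes: number;
}

type QueryPlan = Unavailable | Planned;

const DEFAULT_WINDOW_MINUTES = 5; // narrow first window, per the guidance above
const MAX_WINDOW_MINUTES = 60; // assumed cap for this sketch

function planLokiQuery(
  baseUrl: string | undefined,
  logql: string,
  windowMinutes: number = DEFAULT_WINDOW_MINUTES,
): QueryPlan {
  // Without a configured backend, report unavailability instead of guessing.
  if (!baseUrl) {
    return { available: false, reason: "LOKI_BASE_URL is not set on the runner host" };
  }
  // Clamp the window to [1, MAX_WINDOW_MINUTES] so one query cannot stall a phase.
  const clamped = Math.min(Math.max(1, Math.floor(windowMinutes)), MAX_WINDOW_MINUTES);
  return { available: true, logql, windowMinutes: clamped };
}
```

A caller branches on `plan.available`: when it is `false`, the agent surfaces `plan.reason` and escalates; otherwise it runs the query with the clamped window.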