title: Observability
description: Fourteen in-app dashboards surfacing what a Spring AI Playground agent did - token economics, tool and MCP behaviour, RAG quality, host runtime, Ollama monitoring, and a live trace tail.
Where: top navigation → Observability.
The observability layer is the visibility arm of Spring AI Playground’s safety model - the user-facing surface that answers what the agent did, in what order, against which integration, at what cost. Where the sandbox prevents unsafe actions at the call boundary, this layer captures every action that did happen and presents it through fourteen dashboards in the desktop app.
The pages under this section document the user surface. For the trace pipeline, storage tiers, configuration, and external export paths, see AI Agent Observability Architecture.
The dashboards are designed around three roles, all of which can be the same person on a desktop deployment:
Every dashboard is read-only and passive - opening it never alters trace data or model behaviour.
The fourteen dashboards are grouped into four sections in the left sidebar. Each group answers a different category of question:
flowchart TB
OV["Overview"]
subgraph U["AI Usage"]
direction TB
TC["Tokens & Cost"]
AM["AI Models"]
end
subgraph S["AI Stack"]
direction TB
TS["Tool Studio"]
MS["MCP Servers"]
MI["MCP Inspector"]
VD["Vector Database"]
AC["Agentic Chat"]
SF["Safety"]
end
subgraph R["Runtime"]
direction TB
HO["Host"]
OL["Ollama"]
WA["Web Application"]
LG["Logs"]
TR["Traces"]
end
OV --> U
OV --> S
OV --> R
The Overview tab is the landing surface - every group has its own dedicated tabs for depth, but Overview shows one panel from each so an operator can spot anomalies at a glance and drill in from there.
Every dashboard shares one ObservabilityGlobalSettings singleton - so picking Last 1H on the Tokens & Cost tab and clicking over to AI Models shows the same hour, and changing the refresh interval applies everywhere at once. Three surfaces touch this state:
- Header time-window chips: Last 5m · 10m · 20m · 30m · 1h · 3h (default 30m). Clicking a chip switches the sliding window and re-ticks the charts.
- Header refresh-interval chips: Off · 1s · 2s · 5s · 10s · 30s · 60s (default 5s). When Off, charts only update on manual refresh.
- Settings drawer, with the following sections:

| Section | What it does |
|---|---|
| Refresh interval | Wider preset chips (3s · 5s · 10s · 30s) plus a numeric Custom field. Identical state to the header chip; opening either edits the same value. |
| Time range | Toggle between Sliding window (the same 6 presets as the header) and Fixed range (From + To DateTimePickers). Fixed range caps at 180 minutes; values outside that window are clipped server-side. When a fixed range is active, the header chips become read-only and auto-refresh pauses (the data window is static). |
| Per-tab settings | Optional - the current dashboard injects its own panel here. Example: Logs adds a “Reset to live tail” button. |
An Apply button at the bottom commits staged changes; closing the drawer without Apply discards them.
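The 180-minute cap on fixed ranges mentioned in the table could be enforced roughly as follows. This is a minimal sketch, not the project's code: `FixedRangeClamp`, `Range`, and `clamp` are hypothetical names, and it assumes that clipping keeps the To value and pulls From forward, which the text above does not specify.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper; not taken from the Spring AI Playground sources. */
final class FixedRangeClamp {

    /** The documented cap for a fixed range: 180 minutes. */
    static final Duration MAX_SPAN = Duration.ofMinutes(180);

    /** Immutable From/To pair. */
    record Range(Instant from, Instant to) {}

    /**
     * Clips a requested range to at most MAX_SPAN.
     * Assumption: 'to' is kept and 'from' is moved forward when the span is too wide.
     */
    static Range clamp(Instant from, Instant to) {
        if (to.isBefore(from)) {
            throw new IllegalArgumentException("'to' must not be before 'from'");
        }
        Instant earliest = to.minus(MAX_SPAN);
        return new Range(from.isBefore(earliest) ? earliest : from, to);
    }
}
```

Under that assumption, a five-hour request ending at 12:00 is narrowed to 09:00 through 12:00, while a thirty-minute request passes through unchanged.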
Code:
- ObservabilityGlobalSettings - src/main/java/.../webui/observability/components/ObservabilityGlobalSettings.java (window enum + refresh choices + listener fan-out)
- ObservabilitySettingsPanel - .../webui/observability/components/ObservabilitySettingsPanel.java (the drawer body)
- TimeWindowPicker / RefreshIntervalPicker - header chips
- ObservabilityView.installSettingsDrawer(...) - drawer mount

Host and Web Application ignore the time window: Host shows always-live metrics with rolling history retained by the dedicated SystemMetricsRingBuffer, and Web Application reads MeterRegistry gauges live and counters as lifetime totals. Both still honor the refresh interval.
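To make the shared-state behaviour in this section concrete, here is a small sketch of one settings holder that fans changes out to every subscribed dashboard. It is an illustration only: `SharedSettingsSketch`, `Window`, `Snapshot`, `subscribe`, and `apply` are hypothetical names, and the real ObservabilityGlobalSettings may be structured differently. Only the presets and the defaults (30m window, 5s refresh) come from the text above.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical stand-in for a shared settings singleton. */
final class SharedSettingsSketch {

    /** The six sliding-window presets listed for the header chips. */
    enum Window {
        LAST_5M(5), LAST_10M(10), LAST_20M(20), LAST_30M(30), LAST_1H(60), LAST_3H(180);

        final Duration span;

        Window(long minutes) {
            this.span = Duration.ofMinutes(minutes);
        }
    }

    /** One immutable view of the settings; Duration.ZERO stands for refresh "Off". */
    record Snapshot(Window window, Duration refresh) {}

    private static final SharedSettingsSketch INSTANCE = new SharedSettingsSketch();

    private final List<Consumer<Snapshot>> listeners = new CopyOnWriteArrayList<>();
    private volatile Snapshot current = new Snapshot(Window.LAST_30M, Duration.ofSeconds(5));

    private SharedSettingsSketch() {}

    static SharedSettingsSketch get() {
        return INSTANCE;
    }

    Snapshot current() {
        return current;
    }

    /** Registers a dashboard callback and returns a handle that unregisters it. */
    Runnable subscribe(Consumer<Snapshot> listener) {
        listeners.add(listener);
        return () -> listeners.remove(listener);
    }

    /** Commits new settings and notifies every subscribed dashboard. */
    void apply(Snapshot next) {
        current = next;
        listeners.forEach(l -> l.accept(next));
    }
}
```

Because every tab reads from and subscribes to the same instance, a change made on one tab is what the next tab sees when it opens. That is the effect described above for picking Last 1H on one tab and switching to another.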
Dashboards are scoped by the kind of action that produced the data, not by whether a chat happened. Each surface in the app - Agentic Chat, Tool Studio, MCP Server (Inspector), Vector Database, and the running JVM itself - emits its own observation stream, and the dashboards crop those streams differently:
flowchart LR
subgraph SRC["Where data comes from"]
direction TB
S1["Chat turn"]
S2["Tool Studio<br/>Run test"]
S3["MCP Inspector<br/>browse · invoke"]
S4["Vector Database<br/>index · search"]
S5["JVM running"]
S6["Any logger"]
S7["Ollama server"]
end
Trace["TraceRecord"]
Prim["MCP primitive<br/>observations"]
Met["MeterRegistry +<br/>SystemMetrics"]
Log["Rolling app log"]
OllamaApi["Ollama<br/>/api/ps · /api/tags"]
S1 --> Trace
S2 --> Trace
S4 --> Trace
S3 --> Prim
S5 --> Met
S6 --> Log
S7 --> OllamaApi
subgraph TDASH["Trace-fed dashboards (8)"]
direction TB
D1["Overview"]
D2["Tokens & Cost"]
D3["AI Models"]
D4["Tool Studio"]
D5["MCP Servers"]
D6["Vector DB"]
D7["Agentic Chat"]
D8["Traces"]
end
Trace --> TDASH
Prim --> MI["MCP Inspector"]
Met --> RUN["Host ·<br/>Web Application"]
OllamaApi --> OLL["Ollama"]
Log --> LG["Logs"]
The five streams are independent and the dashboards mix them differently:
- TraceRecord stream - every chat turn becomes one TraceRecord, but so does every Tool Studio test run and every Vector Database operation that fires through Spring AI. That single record carries gen_ai.* / spring.ai.tool / db.vector.client.operation child spans and surfaces across Overview, Tokens & Cost, AI Models, Tool Studio, MCP Servers, Vector Database, Agentic Chat, and Traces. A Tool Studio test that never touches chat still populates the Tool Studio dashboard plus Overview / Traces.
- MCP primitive observations - browse and invoke actions in MCP Inspector feed the MCP Inspector dashboard.
- MeterRegistry + system metrics - JVM heap, GC, threads, CPU, HTTP, Tomcat sessions, and logback level counts are always live (no user action needed) and feed Host and Web Application.
- Rolling app log - output from any logger feeds Logs.
- Ollama - the Ollama dashboard reads /api/ps and /api/tags directly (running models, VRAM, installed inventory) - independent of traces and MeterRegistry.

So clicking through MCP Inspector primitives, running a Tool Studio test, or uploading a document in Vector Database all generate data on their own dashboards even without sending a single chat message. Conversely, only the chat surface generates the conversation-level aggregates on Agentic Chat.
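The source-to-stream-to-dashboard routing in the diagram above can be written down as a small lookup table. The sketch below is purely illustrative: `StreamRoutingSketch`, `Stream`, `Source`, and `dashboardsFor` are hypothetical names rather than application classes. The Safety dashboard is left out because this page does not say which stream feeds it.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lookup table mirroring the routing diagram. */
final class StreamRoutingSketch {

    /** The five independent observation streams. */
    enum Stream { TRACE_RECORD, MCP_PRIMITIVE, METRICS, APP_LOG, OLLAMA_API }

    /** Where data comes from, and the stream each source emits into. */
    enum Source {
        CHAT_TURN(Stream.TRACE_RECORD),
        TOOL_STUDIO_RUN(Stream.TRACE_RECORD),
        VECTOR_DB_OPERATION(Stream.TRACE_RECORD),
        MCP_INSPECTOR_ACTION(Stream.MCP_PRIMITIVE),
        JVM_RUNNING(Stream.METRICS),
        ANY_LOGGER(Stream.APP_LOG),
        OLLAMA_SERVER(Stream.OLLAMA_API);

        final Stream stream;

        Source(Stream stream) {
            this.stream = stream;
        }
    }

    /** Dashboards fed by each stream, as drawn in the diagram. */
    static final Map<Stream, List<String>> FEEDS = new EnumMap<>(Stream.class);

    static {
        FEEDS.put(Stream.TRACE_RECORD, List.of(
                "Overview", "Tokens & Cost", "AI Models", "Tool Studio",
                "MCP Servers", "Vector Database", "Agentic Chat", "Traces"));
        FEEDS.put(Stream.MCP_PRIMITIVE, List.of("MCP Inspector"));
        FEEDS.put(Stream.METRICS, List.of("Host", "Web Application"));
        FEEDS.put(Stream.APP_LOG, List.of("Logs"));
        FEEDS.put(Stream.OLLAMA_API, List.of("Ollama"));
    }

    /** Returns the dashboards that can show data from the given stream. */
    static List<String> dashboardsFor(Stream stream) {
        return FEEDS.get(stream);
    }
}
```

The table records only which dashboards a stream can reach. It does not model how each trace-fed dashboard filters the records: a Tool Studio run, for example, shares the TraceRecord stream with chat turns but contributes no conversation-level aggregates to Agentic Chat.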