Overview
Sprint Vibe is a real-time observability platform for AI agent systems. It captures every tool call, session lifecycle event, agent spawn, context compaction, and error — across any agent platform — into a single dashboard.
Whether you're running a single Claude Code session or orchestrating a team of agents across multiple platforms, Sprint Vibe gives you visibility into what your agents are doing, what they've built, and what it costs.
What Sprint Vibe captures
Supported platforms
Quick Start
Get an agent reporting to Sprint Vibe in under 2 minutes.
1. Create a project
Sign in at sprintvibe.app, click New Project. You'll get a project with an API key that starts with sv_live_.
2. Tell your agent to use Sprint Vibe
The fastest way to get started: paste this into your agent's prompt or system instructions:
## Observability
This project uses Sprint Vibe for agent observability.
API endpoint: https://sprintvibe.app/api/v1/ingest
API key: sv_live_YOUR_KEY_HERE
Report activity by POSTing JSON to the endpoint with header:
Authorization: Bearer sv_live_YOUR_KEY_HERE
Required fields:
- session_id: your session/thread ID
- event_type: session_start | tool_call | tool_result | error | session_end
Optional fields:
- agent_name, model, tool_name, tool_input, payload

3. Or use platform-native hooks (recommended)
For deeper integration, use the platform-specific setup below. Native hooks capture everything automatically — you don't need to instrument your prompts.
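Under the hood, the prompt-based integration is a single HTTP POST per event. A minimal sketch in Python (the endpoint, header, and field names follow the quick-start above; the `build_event`/`send_event` helpers are illustrative, not an official SDK):

```python
import json
import urllib.request

SV_URL = "https://sprintvibe.app/api/v1/ingest"
SV_KEY = "sv_live_YOUR_KEY_HERE"

# The two required fields, and the event types from the quick-start prompt.
REQUIRED = {"session_id", "event_type"}
EVENT_TYPES = {"session_start", "tool_call", "tool_result", "error", "session_end"}

def build_event(session_id: str, event_type: str, **optional) -> dict:
    """Assemble one event, enforcing the required fields."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event_type: {event_type}")
    return {"session_id": session_id, "event_type": event_type, **optional}

def send_event(event: dict) -> None:
    """POST a single event; fire-and-forget so telemetry never breaks the agent."""
    req = urllib.request.Request(
        SV_URL,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {SV_KEY}"},
    )
    try:
        urllib.request.urlopen(req, timeout=5)
    except OSError:
        pass  # swallow network errors

event = build_event("thread-42", "tool_call",
                    tool_name="web_search", tool_input={"query": "docs"})
```

Calling `send_event(event)` would then deliver it to the ingest endpoint.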
Data Model
Every event in Sprint Vibe follows the same structure regardless of which platform sent it.
Event types
| Event | Description | Key Data |
|---|---|---|
| session_start | Agent session begins | agent_name, model, cwd, permission_mode |
| session_end | Agent session ends | reason (clear, logout, exit) |
| tool_call | Agent invokes a tool (before execution) | tool_name, tool_input (full parameters) |
| tool_result | Tool execution completes | tool_name, tool_input, tool_response |
| tool_error | Tool execution fails | tool_name, error message, is_interrupt |
| user_prompt | User sends a message to the agent | prompt text |
| agent_stop | Agent turn completes (response delivered) | stop reason |
| subagent_start | Subagent/teammate spawned | agent_id, agent_type (Bash, Explore, Plan) |
| subagent_stop | Subagent finishes | agent_id, agent_type, transcript_path |
| context_compaction | Context window compressed | trigger (auto/manual) |
| notification | System notification | message, notification_type |
| permission_request | Agent requested permission (user allowed or denied) | tool_name, tool_input, permission_decision |
| teammate_idle | Teammate agent became idle | teammate_name, team_name |
| task_completed | Task in task list marked done | task_id, task_subject, teammate_name, team_name |
| status_update | Agent status change | Custom status payload |
Session fields
| Field | Description | Values |
|---|---|---|
| session_id | Unique identifier for the session | Required. String. |
| platform | Which agent platform | claude_code, codex, openclaw, gemini, custom |
| agent_name | Human-readable agent name | e.g. "claude-opus", "planner", "qa-bot" |
| agent_role | Role in an agent team | lead, teammate, subagent |
| team_name | Group agents into teams | e.g. "backend", "frontend", "qa" |
| model | AI model being used | e.g. "claude-opus-4-6", "gpt-4", "gemini-2.5-pro" |
| parent_session_id | Links subagent to its parent | Parent session_id string |
| channel | Message source (multi-channel agents) | whatsapp, telegram, slack, discord |
| cwd | Working directory | Filesystem path |
| permission_mode | Agent permission level | default, plan, dontAsk, bypassPermissions |
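Putting the event types and session fields together, a fully annotated event from a subagent might look like the dictionary below (all values are illustrative; only the first two keys are required):

```python
# An illustrative tool_call event from a QA subagent.
event = {
    "session_id": "sess-7f3a",            # required
    "event_type": "tool_call",            # required
    "platform": "claude_code",
    "agent_name": "qa-bot",
    "agent_role": "subagent",
    "team_name": "qa",
    "model": "claude-opus-4-6",
    "parent_session_id": "sess-main",     # links this subagent to its parent
    "cwd": "/repo/app",
    "permission_mode": "default",
    "tool_name": "Read",
    "tool_input": {"file_path": "lib/utils.ts"},
}

# Everything beyond session_id and event_type is optional.
missing = {"session_id", "event_type"} - event.keys()
```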
What you see in the dashboard
Each event type surfaces differently in Sprint Vibe:
Claude Code
Claude Code has a built-in hooks system that fires on every tool call, session event, agent spawn, and context compaction. Sprint Vibe captures all of these automatically with zero code changes to your workflow.
Deepest integration — captures everything.

What Claude Code reports
With hooks configured, Sprint Vibe automatically captures:
| Event | Description | Key Data |
|---|---|---|
| SessionStart | Session begins or resumes | source (startup/resume/compact), model, agent_type |
| SessionEnd | Session closes | reason (clear/logout/exit) |
| UserPromptSubmit | User sends a message | Full prompt text |
| PreToolUse | Before every tool call | tool_name, tool_input (file paths, commands, search queries, URLs) |
| PostToolUse | After every tool call succeeds | tool_name, tool_input, tool_response (full output) |
| PostToolUseFailure | Tool call failed or errored | tool_name, tool_input, error message, is_interrupt |
| PermissionRequest | User allowed or denied a tool call | tool_name, tool_input, permission_suggestions |
| Stop | Agent turn completes | Session context, stop_hook_active |
| SubagentStart | Subagent spawned | agent_id, agent_type (Bash, Explore, Plan) |
| SubagentStop | Subagent finishes work | agent_id, agent_type, agent_transcript_path |
| TeammateIdle | Teammate agent became idle | teammate_name, team_name |
| TaskCompleted | Task marked done by agent | task_id, task_subject, task_description, teammate_name |
| Notification | Permission prompts, idle alerts, auth events | message, title, notification_type |
| PreCompact | Context about to be compacted | trigger (auto/manual) |
Tool call detail
Every tool call includes the full input parameters. Here's what Sprint Vibe captures for each tool type:
Setup
Step 1: Create the hook script at .claude/hooks/sprint-vibe-hook.sh in your project root:
#!/usr/bin/env bash
INPUT=$(cat)
curl -s -X POST "https://sprintvibe.app/api/v1/ingest" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d "$INPUT" \
--connect-timeout 3 \
--max-time 5 \
> /dev/null 2>&1 &
exit 0

Step 2: Make it executable:
chmod +x .claude/hooks/sprint-vibe-hook.sh

Step 3: Register the hooks in .claude/settings.local.json:
{
"hooks": {
"SessionStart": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"SessionEnd": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"UserPromptSubmit": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"PreToolUse": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"PostToolUse": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"PostToolUseFailure": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"PermissionRequest": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"Stop": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"SubagentStart": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"SubagentStop": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"TeammateIdle": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"TaskCompleted": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"Notification": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
],
"PreCompact": [
{ "matcher": "", "hooks": [{ "type": "command", "command": ".claude/hooks/sprint-vibe-hook.sh" }] }
]
}
}

Step 4: Start Claude Code. Every action is now captured live.
What shows up in the dashboard
With all hooks enabled, you'll see:
- File operations — "Editing components/header.tsx", "Reading lib/utils.ts (lines 1-50)", "Creating app/new-page.tsx"
- Shell commands — "Running: git status", "Running: npm test", "Running: docker build"
- Searches — "Searching for useRealtime in *.tsx", "Globbing **/*.test.ts"
- Web activity — "Fetching docs.anthropic.com", "Searching: Next.js 16 middleware"
- Subagents — "Spawned Explore agent", "Spawned Bash agent", "Task completed"
- Permission decisions — "User denied Bash: rm -rf /", "User allowed Write: config.ts"
- Tool failures — "Edit failed: old_string not found in file", "Bash timed out after 120s"
- Team coordination — "Teammate idle: frontend-agent", "Task completed: Fix auth bug"
- Session lifecycle — start, stop, resume, context compaction warnings
- User prompts — what you asked the agent to do
OpenAI Codex CLI
Codex CLI does not have a blocking hooks system like Claude Code. Instead, it offers two integration paths: the notify callback and OpenTelemetry export. Sprint Vibe supports both.
What Codex can report
| Event | Description | Key Data |
|---|---|---|
| agent-turn-complete | Agent finishes a turn (via notify) | thread-id, turn-id, cwd, last-assistant-message |
| codex.tool_decision | Tool approved/denied (via OTel) | tool name, approval source, duration |
| codex.tool_result | Tool execution result (via OTel) | duration, success, output snippet |
| codex.api_request | API call to OpenAI (via OTel) | status, duration, errors, token counts |
| codex.user_prompt | User prompt submitted (via OTel) | length (content redacted by default) |
Option A: Notify callback (simplest)
Add to ~/.codex/config.toml:
notify = ["bash", "-c", "curl -s -X POST https://sprintvibe.app/api/v1/ingest -H 'Content-Type: application/json' -H 'Authorization: Bearer YOUR_API_KEY' -d \"$1\" &", "--"]

This fires on every agent-turn-complete event. The JSON payload includes the thread ID, turn ID, working directory, and the assistant's response. Sprint Vibe maps thread-id to session_id automatically.
Option B: OTel bridge (full observability)
For deeper coverage, use Codex's OpenTelemetry export with a lightweight bridge that forwards structured logs to Sprint Vibe.
[otel]
environment = "production"
[otel.exporter.otlp-http]
endpoint = "https://sprintvibe.app/api/v1/otel"
headers = { "Authorization" = "Bearer YOUR_API_KEY" }

This captures tool calls, API requests, token usage, and approval decisions as structured OTel logs and traces.
Option C: SDK wrapper (most control)
Wrap your Codex API calls to emit events at each stage:
const SV_URL = "https://sprintvibe.app/api/v1/ingest"
const SV_KEY = "YOUR_API_KEY"
async function track(sessionId: string, eventType: string, data: Record<string, unknown> = {}) {
fetch(SV_URL, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${SV_KEY}` },
body: JSON.stringify({ session_id: sessionId, event_type: eventType, platform: "codex", ...data }),
}).catch(() => {})
}
// Track session lifecycle
const threadId = process.env.CODEX_THREAD_ID || crypto.randomUUID()
await track(threadId, "session_start", { agent_name: "codex", model: "codex" })
// Track tool calls
await track(threadId, "tool_call", { tool_name: "code_edit", tool_input: { file: "app.ts" } })
await track(threadId, "tool_result", { tool_name: "code_edit", payload: { success: true } })
// Track completion
await track(threadId, "session_end")

OpenClaw
OpenClaw is a self-hosted multi-agent AI gateway that connects messaging platforms (WhatsApp, Telegram, Discord, Slack) to AI agents. Sprint Vibe integrates via OpenClaw's Gateway WebSocket protocol for real-time event streaming.
What OpenClaw reports
| Event | Description | Key Data |
|---|---|---|
| agent (ws event) | Agent run events — streamed in real-time | runId, status, tool calls, outputs |
| presence | Connection/disconnection events | Client state, version |
| session lifecycle | Session create, reset, compact | sessionKey, sessionId, reason |
| message routing | Inbound messages across channels | channel, agent, content |
Native tool:call / tool:result hook events are not yet available (GitHub #10502). The WebSocket agent event stream is the richest source of data currently available.

Option A: WebSocket listener (real-time)
Connect to OpenClaw's Gateway WebSocket and forward events to Sprint Vibe:
import WebSocket from "ws"
const SV_URL = "https://sprintvibe.app/api/v1/ingest"
const SV_KEY = "YOUR_API_KEY"
const OC_URL = "ws://127.0.0.1:18789"
const OC_TOKEN = "your-gateway-token"
const ws = new WebSocket(OC_URL)
ws.on("open", () => {
// Handshake
ws.send(JSON.stringify({
type: "req", id: "1", method: "connect",
params: {
minProtocol: 1, maxProtocol: 1,
client: { id: "sprintvibe", displayName: "Sprint Vibe", version: "1.0.0", platform: "node", mode: "remote" },
auth: { token: OC_TOKEN }
}
}))
})
ws.on("message", (raw) => {
const msg = JSON.parse(raw.toString())
if (msg.type === "event") {
fetch(SV_URL, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${SV_KEY}` },
body: JSON.stringify({
session_id: msg.payload?.sessionKey || msg.payload?.runId || "openclaw-main",
event_type: msg.event === "agent" ? "tool_call" : "status_update",
platform: "openclaw",
agent_name: msg.payload?.agentName || "openclaw",
channel: msg.payload?.channel,
payload: msg.payload,
}),
}).catch(() => {})
}
})

Option B: Webhook hook (push into Sprint Vibe)
Use OpenClaw's custom hook mappings to forward events. Add to openclaw.json:
{
"hooks": {
"enabled": true,
"token": "your-shared-secret",
"mappings": {
"sprintvibe": {
"action": "agent",
"transform": { "module": "sprintvibe-bridge" }
}
},
"transformsDir": "./transforms"
}
}

The channel field maps directly — WhatsApp, Telegram, Discord, Slack messages all appear tagged in the Sprint Vibe feed so you can see which channel triggered which agent action.
Gemini / Antigravity
Google Gemini agents (including those running via the Antigravity framework) integrate via Sprint Vibe's REST API. Instrument your agent code to report events at each stage.
TypeScript integration
const SV_URL = "https://sprintvibe.app/api/v1/ingest"
const SV_KEY = "YOUR_API_KEY"
async function track(sessionId: string, eventType: string, data: Record<string, unknown> = {}) {
fetch(SV_URL, {
method: "POST",
headers: { "Content-Type": "application/json", "Authorization": `Bearer ${SV_KEY}` },
body: JSON.stringify({ session_id: sessionId, event_type: eventType, platform: "gemini", ...data }),
}).catch(() => {})
}
// Session start
await track("gemini-001", "session_start", {
agent_name: "antigravity-planner",
model: "gemini-2.5-pro",
})
// Tool calls
await track("gemini-001", "tool_call", {
agent_name: "antigravity-planner",
tool_name: "web_search",
tool_input: { query: "latest API docs" },
})
await track("gemini-001", "tool_result", {
tool_name: "web_search",
payload: { results_count: 10, top_result: "developers.google.com/..." },
})
// Session end
await track("gemini-001", "session_end")

Python integration
import httpx
SV_URL = "https://sprintvibe.app/api/v1/ingest"
SV_KEY = "YOUR_API_KEY"
def track(session_id: str, event_type: str, **kwargs):
httpx.post(SV_URL, headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {SV_KEY}",
}, json={
"session_id": session_id,
"event_type": event_type,
"platform": "gemini",
**kwargs,
}, timeout=5)
# Usage
track("gemini-001", "session_start", agent_name="planner", model="gemini-2.5-pro")
track("gemini-001", "tool_call", tool_name="code_gen", tool_input={"prompt": "build a form"})
track("gemini-001", "session_end")

Custom Agents
Any agent system — LangChain, CrewAI, AutoGen, local LLMs (Ollama, llama.cpp), or your own framework — can report to Sprint Vibe with a single HTTP POST per event.
Python SDK
import httpx
from uuid import uuid4
class SprintVibe:
def __init__(self, api_key: str, url: str = "https://sprintvibe.app"):
self.url = f"{url}/api/v1/ingest"
self.headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}",
}
def event(self, session_id: str, event_type: str, **kwargs):
httpx.post(self.url, headers=self.headers, json={
"session_id": session_id,
"event_type": event_type,
**kwargs,
}, timeout=5)
def session(self, agent_name: str = None, model: str = None, **kwargs):
"""Start a tracked session. Returns a session helper."""
sid = str(uuid4())
self.event(sid, "session_start", agent_name=agent_name, model=model, **kwargs)
return SprintVibeSession(self, sid)
class SprintVibeSession:
def __init__(self, sv: SprintVibe, session_id: str):
self.sv = sv
self.session_id = session_id
def tool_call(self, tool_name: str, tool_input: dict = None, **kwargs):
self.sv.event(self.session_id, "tool_call", tool_name=tool_name, tool_input=tool_input, **kwargs)
def tool_result(self, tool_name: str, payload: dict = None, **kwargs):
self.sv.event(self.session_id, "tool_result", tool_name=tool_name, payload=payload, **kwargs)
def error(self, tool_name: str, message: str, **kwargs):
self.sv.event(self.session_id, "tool_error", tool_name=tool_name, payload={"error": message}, **kwargs)
def end(self):
self.sv.event(self.session_id, "session_end")
# ── Usage with LangChain ──
sv = SprintVibe("sv_live_...")
session = sv.session(agent_name="langchain-agent", model="gpt-4", team_name="research")
session.tool_call("web_search", {"query": "quantum computing papers 2026"})
session.tool_result("web_search", {"results": 15})
session.tool_call("summarize", {"text": "..."})
session.tool_result("summarize", {"summary": "Key findings..."})
session.end()

TypeScript SDK
class SprintVibe {
private url: string
private headers: Record<string, string>
constructor(apiKey: string, url = "https://sprintvibe.app") {
this.url = `${url}/api/v1/ingest`
this.headers = { "Content-Type": "application/json", "Authorization": `Bearer ${apiKey}` }
}
async event(sessionId: string, eventType: string, data: Record<string, unknown> = {}) {
fetch(this.url, {
method: "POST",
headers: this.headers,
body: JSON.stringify({ session_id: sessionId, event_type: eventType, ...data }),
}).catch(() => {})
}
session(opts: { agentName?: string; model?: string; teamName?: string } = {}) {
const sid = crypto.randomUUID()
this.event(sid, "session_start", {
agent_name: opts.agentName, model: opts.model, team_name: opts.teamName,
})
return {
toolCall: (toolName: string, toolInput?: object) =>
this.event(sid, "tool_call", { tool_name: toolName, tool_input: toolInput }),
toolResult: (toolName: string, payload?: object) =>
this.event(sid, "tool_result", { tool_name: toolName, payload }),
error: (toolName: string, message: string) =>
this.event(sid, "tool_error", { tool_name: toolName, payload: { error: message } }),
end: () => this.event(sid, "session_end"),
id: sid,
}
}
}
// ── Usage ──
const sv = new SprintVibe("sv_live_...")
const session = sv.session({ agentName: "my-agent", model: "llama-3.2", teamName: "backend" })
await session.toolCall("code_edit", { file: "app.ts", action: "add_function" })
await session.toolResult("code_edit", { success: true, lines_changed: 15 })
await session.end()

cURL (one-liner)
curl -X POST https://sprintvibe.app/api/v1/ingest \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sv_live_YOUR_KEY" \
-d '{
"session_id": "my-session-001",
"event_type": "tool_call",
"platform": "custom",
"agent_name": "my-agent",
"model": "llama-3.2",
"tool_name": "web_search",
"tool_input": {"query": "hello world"},
"team_name": "research"
}'

Platform Features
Sprint Vibe goes beyond event logging. These features turn raw agent telemetry into actionable insights for teams running AI agents at scale.
A/B Testing
Spin up multiple Claude Code instances on the same task, each in a different branch. Watch them work in parallel on the dashboard. Pick the one that did it better. Merge. Done.
How it works
What the comparison shows
| Metric | Description | Key Data |
|---|---|---|
| Time | How long each instance took to complete the quest | Session duration, time-to-first-edit |
| Cost | Total tokens and estimated spend per branch | Token count, $ estimate |
| Approach | Which files each instance read, what order, what strategy | Tool call sequence, file access pattern |
| Output | The actual diff — lines added/removed, files touched | Git diff per branch |
| Errors | How many times each instance hit errors and how it recovered | tool_error count, retry patterns |
| Quality | Tests pass, build succeeds, lint clean, screenshot diff score | Quality gate results per branch |
Why this matters
Agent output is non-deterministic. The same prompt, same model, same codebase can produce wildly different results. Sometimes Claude Code nails it in 2 minutes. Sometimes it goes in circles for 20 minutes. A/B testing lets you hedge — run the task 2-3x in parallel and pick the best result. The extra cost is insurance against a bad run.
Tagging sessions for comparison
To group sessions into an A/B comparison, pass the same quest context via metadata:
## Sprint Vibe
Quest: "Implement user settings page"
Quest ID: quest-settings-page
Branch: feature/settings-v1 # (or v2, v3 in other branches)
API Key: sv_live_YOUR_KEY

Or use the metadata field in hook payloads to tag the branch name. Sprint Vibe matches sessions by quest ID and shows them in a comparison grid.
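On the receiving side, an A/B comparison amounts to bucketing sessions by quest ID and ranking the per-branch metrics. A sketch (the session-summary shape and ranking key are illustrative; field names follow the tagging scheme above):

```python
# Illustrative per-session summaries, tagged with the same quest_id.
sessions = [
    {"quest_id": "quest-settings-page", "branch": "feature/settings-v1",
     "duration_s": 412, "cost_usd": 1.80, "tool_errors": 0},
    {"quest_id": "quest-settings-page", "branch": "feature/settings-v2",
     "duration_s": 987, "cost_usd": 3.10, "tool_errors": 4},
]

def compare(sessions: list[dict], quest_id: str) -> list[dict]:
    """Group sessions for one quest; rank branches by errors, then cost."""
    group = [s for s in sessions if s["quest_id"] == quest_id]
    return sorted(group, key=lambda s: (s["tool_errors"], s["cost_usd"]))

best = compare(sessions, "quest-settings-page")[0]
```

The first element of the ranking is the branch you would merge.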
Parallel Agent Teams
Run multiple agents on the same codebase simultaneously. Sprint Vibe tracks who is working on what, detects conflicts before they happen, and coordinates shared state.
File ownership map
Sprint Vibe parses every Read, Write, and Edit tool call to maintain a real-time map of which agent is touching which files. The dashboard shows:
Team coordination events
With all Claude Code hooks enabled, Sprint Vibe captures the full team lifecycle:
| Event | Description | Key Data |
|---|---|---|
| SubagentStart | A teammate or subagent is spawned | agent_id, agent_type, parent session |
| SubagentStop | A teammate finishes | agent_id, transcript_path for review |
| TeammateIdle | Agent has no more work | teammate_name, team_name |
| TaskCompleted | Agent marks a task done | task_id, task_subject, description |
| PermissionRequest | User approves or denies an agent action | tool_name, decision (allow/deny) |
Branch and worktree tracking
Sprint Vibe tracks the cwd (working directory) of every session. When agents use git worktrees or branches, the dashboard shows which branch each agent is on, which branches have merge conflicts with main, and suggests an optimal merge order based on dependency analysis.
Coordination timeline
A unified view with time on the horizontal axis and one lane per agent. Color-coded events with connecting lines show when one agent's output becomes another's input. Critical path highlighting shows which agent is the bottleneck. Deadlock detection flags circular dependencies between agents.
Screenshots & Visual Diffs
When agents build or modify UIs, text diffs don't tell the full story. Sprint Vibe captures screenshots of UI output and compares them visually.
Capture flow
Dashboard views
- Side-by-side — baseline (before agent) vs. current (after agent)
- Overlay mode — semi-transparent red/green showing what changed
- Slider mode — drag a divider between before and after
- Per-route tracking — captures every route the agent touched, not just one page
- Threshold alerts — if diff score exceeds your threshold, flag for human review
Integration with A/B testing
When comparing agent configurations, visual diff scores become a metric: "Opus produced a clean layout (0.02 diff score); Sonnet introduced a layout shift (0.15 diff score)." This is especially powerful for frontend quests where text diffs look identical but the rendered output is different.
Sending screenshots via API
curl -X POST https://sprintvibe.app/api/v1/artifacts \
-H "Authorization: Bearer sv_live_YOUR_KEY" \
-F "type=screenshot" \
-F "session_id=abc-123" \
-F "quest_id=quest-456" \
-F "route=/dashboard" \
-F "viewport=1280x720" \
-F "file=@screenshot.png"

Artifacts
Agent sessions produce more than code changes — build logs, test results, migrations, generated docs, dependency changes. Sprint Vibe extracts and structures these automatically from event payloads.
Auto-extracted artifact types
| Artifact | Description | Key Data |
|---|---|---|
| code_diff | Extracted from Edit tool calls | file_path, old_string, new_string, additions, deletions |
| file_created | Extracted from Write tool calls | file_path, content size, language |
| file_deleted | Extracted from Bash rm commands | file_path |
| test_results | Extracted from Bash running test commands | pass count, fail count, output |
| build_output | Extracted from Bash running build commands | success/failure, output, duration |
| dependency_change | Changes to package.json, Cargo.toml, requirements.txt | added/removed/updated packages |
| screenshot | Captured or uploaded UI screenshots | route, viewport, diff_score |
Attempt summary
Every quest attempt gets an automatic summary card: "3 files created, 12 files edited, +247/-89 lines, tests: 42 passed / 2 failed, build: success, estimated cost: $2.15." Click through to see each artifact with syntax highlighting, diff views, and collapsible build logs.
Code review view
Artifacts include unified diffs with syntax highlighting, grouped by file. Add annotations, flag issues, and link to the specific tool call that produced each change. Connects to your git repo for one-click PR creation from agent output.
AI Spec Decomposition
The biggest bottleneck in using agent teams is breaking work into agent-sized pieces. Sprint Vibe's AI decomposition takes a product spec and generates a structured quest tree that agents can execute.
Workflow
API
{
"spec": "Build a user settings page with: profile editing (name, avatar, bio), notification preferences (email, push, SMS toggles), connected accounts (GitHub, Google OAuth), danger zone (delete account with confirmation). Must be responsive and match the existing dashboard theme.",
"context": {
"tech_stack": "Next.js 16, Tailwind, Supabase",
"existing_patterns": "Server components for data, client components for interactivity"
}
}
// Response:
{
"quests": [
{
"type": "main", "title": "User Settings Page",
"sub_quests": [
{ "type": "sub", "title": "Profile editing section", "parallel_group": 1, "estimated_files": ["app/settings/page.tsx", "components/profile-form.tsx"] },
{ "type": "sub", "title": "Notification preferences", "parallel_group": 1, "estimated_files": ["components/notification-prefs.tsx", "lib/notifications.ts"] },
{ "type": "sub", "title": "Connected accounts", "parallel_group": 1, "estimated_files": ["components/connected-accounts.tsx", "app/api/oauth/route.ts"] },
{ "type": "sub", "title": "Danger zone (delete account)", "parallel_group": 2, "depends_on": ["Profile editing section"], "estimated_files": ["components/danger-zone.tsx", "app/api/account/delete/route.ts"] }
]
}
],
"conflicts": ["Profile editing and Connected accounts both likely touch app/settings/page.tsx"],
"estimated_total_cost": "$8.50 with Opus, $2.10 with Sonnet"
}

Auto-generated SideQuests
When a tool_error or PostToolUseFailure event fires, Sprint Vibe can auto-create a SideQuest linked to the error — extracting the error message as the quest title and the tool context as the description. These appear in the Backlog column for triage.
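The transformation from error event to backlog card is a simple mapping. A sketch (event field names follow the tables above; the card shape is hypothetical):

```python
def sidequest_from_error(event: dict) -> dict:
    """Turn a tool_error event into a Backlog SideQuest card."""
    err = event.get("payload", {}).get("error", "Unknown tool error")
    return {
        "type": "side",
        "status": "backlog",
        # Error message becomes the quest title, truncated for display.
        "title": f"{event.get('tool_name', 'tool')} failed: {err[:80]}",
        "description": f"Session {event['session_id']} hit an error "
                       f"calling {event.get('tool_name')}.",
        "linked_session_id": event["session_id"],
    }

card = sidequest_from_error({
    "session_id": "sess-1", "event_type": "tool_error",
    "tool_name": "Edit", "payload": {"error": "old_string not found in file"},
})
```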
Cost Optimization
Beyond tracking what you spent — Sprint Vibe shows what you wasted and how to spend less.
Waste detection
Sprint Vibe analyzes tool call patterns and flags optimization opportunities with estimated token savings:
| Pattern | Description | Estimated Waste |
|---|---|---|
| Redundant reads | Agent reads the same file 3+ times without editing | ~500-2000 tokens wasted per occurrence |
| Duplicate searches | Near-identical Grep/Glob patterns in the same session | ~200-800 tokens wasted per occurrence |
| Unnecessary git status | Agent runs git status repeatedly with no changes between | ~100-300 tokens wasted per call |
| Over-reading | Agent reads a 2000-line file when it only needs lines 50-80 | ~1000-5000 tokens wasted |
| Failed then retry | Agent tries the same approach 3+ times before changing strategy | Entire attempt cost |
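A redundant-read detector of the kind described in the table can be sketched in a few lines (the event shapes follow the ingest schema above; the threshold of 3 matches the table, and the edit-resets-the-counter rule is an assumption):

```python
from collections import Counter

def redundant_reads(events: list[dict], threshold: int = 3) -> set[str]:
    """Flag files read `threshold`+ times with no intervening edit."""
    reads: Counter = Counter()
    flagged: set[str] = set()
    for e in events:
        if e.get("event_type") != "tool_call":
            continue
        path = e.get("tool_input", {}).get("file_path")
        if e.get("tool_name") == "Read" and path:
            reads[path] += 1
            if reads[path] >= threshold:
                flagged.add(path)
        elif e.get("tool_name") in ("Edit", "Write") and path:
            reads[path] = 0  # an edit resets the redundancy counter
    return flagged

events = [{"event_type": "tool_call", "tool_name": "Read",
           "tool_input": {"file_path": "lib/utils.ts"}}] * 3
flagged = redundant_reads(events)
```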
Model recommendations
Sprint Vibe tracks which quest types succeed with which models and surfaces recommendations: "Your Opus sessions on test-writing quests have the same pass rate as Sonnet but cost 5x more. Consider routing trivial quests to Sonnet and reserving Opus for architectural tasks."
Context compaction analytics
Track when agents compact context and correlate with outcomes: "Sessions that compact before the 30-minute mark have a 40% higher failure rate." Sprint Vibe captures every PreCompact event with trigger type (auto/manual).
Token budget forecasting
Based on quest complexity estimates and historical data, Sprint Vibe predicts cost before execution: "This quest board has 8 quests estimated at ~$45 total with Opus, ~$12 with Sonnet."
Waste dashboard
{
"period": "2026-02-07",
"total_spend": "$34.20",
"effective_spend": "$28.50",
"waste_breakdown": {
"redundant_reads": "$2.10 (43 occurrences)",
"duplicate_searches": "$0.80 (12 occurrences)",
"failed_attempts": "$2.40 (3 quests retried from scratch)",
"over_qualified_model": "$0.40 (Opus used for 2 trivial test-writing quests)"
},
"recommendations": [
"Route test-writing quests to Sonnet (saves ~$0.20/quest)",
"Add offset/limit to file reads in components/ (agents read full files 67% of the time)",
"Session 'abc-123' read lib/utils.ts 7 times — consider pinning to context"
]
}

Replay & Debugging
When an agent produces bad output, you need to understand why. Sprint Vibe's replay system lets you step through any session to see exactly what happened and where it went wrong.
Session replay
- Playback controls — play, pause, step-forward, step-back, speed (1x/2x/5x)
- File state reconstruction — at any point in the timeline, see the state of every file the agent touched, reconstructed from Read/Write/Edit events
- Decision annotations — human-readable descriptions of what the agent was thinking at each step
- Branch points — highlights where the agent changed strategy (tried approach A, got an error, switched to approach B)
- Subagent branching — when a subagent spawns, the timeline branches showing parallel execution
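File state reconstruction works by folding Write and Edit events, in order, into per-file contents. A minimal sketch under the assumption that Edit payloads carry `old_string`/`new_string` as in the artifact table:

```python
def reconstruct(events: list[dict]) -> dict[str, str]:
    """Fold Write/Edit tool_call events into per-file contents."""
    files: dict[str, str] = {}
    for e in events:
        if e.get("event_type") != "tool_call":
            continue
        inp = e.get("tool_input", {})
        path = inp.get("file_path")
        if e.get("tool_name") == "Write":
            files[path] = inp.get("content", "")     # full-file write
        elif e.get("tool_name") == "Edit" and path in files:
            # Apply the edit exactly once, mirroring old_string/new_string.
            files[path] = files[path].replace(
                inp["old_string"], inp["new_string"], 1)
    return files

state = reconstruct([
    {"event_type": "tool_call", "tool_name": "Write",
     "tool_input": {"file_path": "a.ts", "content": "let x = 1"}},
    {"event_type": "tool_call", "tool_name": "Edit",
     "tool_input": {"file_path": "a.ts", "old_string": "1", "new_string": "2"}},
])
```

Truncating the event list at any index yields the file state at that point in the timeline, which is what the scrubber renders.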
Diff scrubber
A slider that scrubs through the session chronologically. At each position, shows the cumulative diff from session start. Lets you see exactly when a regression was introduced — "the bug appeared at event #47 when the agent edited auth.ts."
Failure path analysis
Sprint Vibe auto-detects tool_error and PostToolUseFailure events and backtracks to the decision that caused them: "Agent read api.ts, misunderstood the return type, wrote incorrect test assertions, tests failed, agent attempted 3 fixes before succeeding." The critical path is highlighted vs. recovery steps.
Session comparison
Select two sessions (same quest, different attempts or different agents) and view side-by-side. See where approaches diverge: "Opus explored the codebase for 2 minutes then went straight to implementation; Sonnet read 15 files before starting." Feeds directly into A/B testing insights.
Permission audit trail
Every PermissionRequest event is logged with the tool name, input, and whether the user allowed or denied it. The replay view highlights denied permissions in red — showing moments where the user intervened to stop an agent action. Use this to refine permission policies and understand which agents attempt risky operations.
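Extracting the denied-permission moments from an event stream is a one-line filter. A sketch (the `permission_decision` payload field is assumed from the PermissionRequest hook description above):

```python
def denied_actions(events: list[dict]) -> list[tuple]:
    """Return (tool_name, tool_input) pairs the user explicitly denied."""
    return [
        (e.get("tool_name"), e.get("tool_input"))
        for e in events
        if e.get("event_type") == "permission_request"
        and e.get("payload", {}).get("permission_decision") == "deny"
    ]

events = [
    {"event_type": "permission_request", "tool_name": "Bash",
     "tool_input": {"command": "rm -rf /"},
     "payload": {"permission_decision": "deny"}},
    {"event_type": "permission_request", "tool_name": "Write",
     "tool_input": {"file_path": "config.ts"},
     "payload": {"permission_decision": "allow"}},
]
denied = denied_actions(events)
```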
Quality Gates
Automated checks that run when an agent completes work. Catch failures before they reach production — without manual review bottlenecks.
Available checks
| Check | Description | Key Data |
|---|---|---|
| tests_pass | Run the project's test suite | Command, pass/fail count, output |
| build_succeeds | Verify the project compiles | Command, success/failure, output |
| lint_clean | No linting errors introduced | Command, warning/error count |
| type_check | TypeScript type safety | tsc --noEmit output |
| visual_regression | Screenshot diff within threshold | Diff score, threshold |
| acceptance_criteria | LLM-judge evaluates quest criteria | Criteria met/unmet, reasoning |
| no_secret_leak | Scan diffs for credentials | Patterns matched (API_KEY, SECRET, etc.) |
| diff_size | Changes within expected scope | Additions, deletions, files changed |
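As an illustration, the `no_secret_leak` check conceptually scans the added lines of a diff for credential patterns. This is a simplified sketch (real diff parsing would also skip `+++` file headers), not Sprint Vibe's actual implementation:

```python
import re

def scan_diff_for_secrets(diff: str, patterns: list[str]) -> list[str]:
    """Return added lines ('+' prefix) matching any credential pattern."""
    hits = []
    for line in diff.splitlines():
        if line.startswith("+") and any(re.search(p, line) for p in patterns):
            hits.append(line)
    return hits

diff = "+const token = 'sv_live_abc123'\n-old line\n+console.log('ok')"
print(scan_diff_for_secrets(diff, ["API_KEY", "SECRET", "sv_live_"]))
# ["+const token = 'sv_live_abc123'"]
```

Removed lines are ignored on purpose: deleting a credential from the codebase should not fail the gate.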
Gate configuration
```json
{
  "gates": [
    { "type": "tests_pass", "command": "npm test", "required": true },
    { "type": "build_succeeds", "command": "npm run build", "required": true },
    { "type": "type_check", "command": "npx tsc --noEmit", "required": true },
    { "type": "lint_clean", "command": "npm run lint", "required": false },
    { "type": "visual_regression", "max_diff_score": 0.05, "required": false },
    { "type": "acceptance_criteria", "method": "llm_judge", "required": true },
    { "type": "no_secret_leak", "patterns": ["API_KEY", "SECRET", "PASSWORD", "sv_live_"], "required": true }
  ],
  "trigger": "attempt_complete"
}
```

Dashboard integration
Quest cards on the Kanban board show gate status: green checkmark for passed, red X for failed, yellow partial for non-required checks failing. The "Review" column only receives quests that pass all required gates. Failed gates show actionable error messages with links to the specific tool calls that caused the failure.
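The "passes all required gates" rule can be sketched in a few lines; `quest_passes` is a hypothetical helper for illustration, not part of the Sprint Vibe SDK:

```python
def quest_passes(gates: list[dict], results: dict[str, bool]) -> bool:
    """A quest reaches Review only when every gate marked
    "required": true has a passing result. Non-required gates
    can fail (yellow partial status) without blocking the quest."""
    return all(
        results.get(gate["type"], False)
        for gate in gates
        if gate.get("required")
    )

gates = [
    {"type": "tests_pass", "required": True},
    {"type": "lint_clean", "required": False},
]
# Lint failed, but lint_clean is not required, so the quest still passes.
print(quest_passes(gates, {"tests_pass": True, "lint_clean": False}))  # True
```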
API Reference
POST /api/v1/ingest
Ingest a single event. This is the only endpoint you need.
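For a concrete starting point, here is a minimal ingest call from Python using only the standard library. The API key and session ID below are placeholders; the `urlopen` line is left commented so the sketch runs without network access:

```python
import json
import urllib.request

def build_ingest_request(api_key: str, event: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/ingest."""
    return urllib.request.Request(
        "https://sprintvibe.app/api/v1/ingest",
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_ingest_request(
    "sv_live_YOUR_KEY_HERE",
    {"session_id": "thread-42", "event_type": "tool_call", "tool_name": "Bash"},
)
# urllib.request.urlopen(req, timeout=5) would send it
```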
```http
POST https://sprintvibe.app/api/v1/ingest
Authorization: Bearer sv_live_...
Content-Type: application/json
```

Request body:

```jsonc
{
  // ── Required ──
  "session_id": "string",         // Unique session/thread identifier
  "event_type": "string",         // See event types table above

  // ── Identity ──
  "platform": "string",           // claude_code | codex | openclaw | gemini | custom
  "agent_name": "string",         // Human-readable agent name
  "agent_role": "string",         // lead | teammate | subagent
  "team_name": "string",          // Group agents into teams
  "model": "string",              // Model name (claude-opus-4-6, gpt-4, etc.)

  // ── Tool data ──
  "tool_name": "string",          // Name of tool being called
  "tool_input": {},               // Tool parameters (any JSON object)

  // ── Context ──
  "channel": "string",            // whatsapp | telegram | slack | discord | web
  "parent_session_id": "string",  // Links subagent to parent session
  "cwd": "string",                // Working directory
  "permission_mode": "string",    // default | plan | dontAsk | bypassPermissions

  // ── Payload ──
  "payload": {},                  // Any additional data (tool output, errors, etc.)
  "metadata": {},                 // Arbitrary metadata tags

  // ── Claude Code auto-detection ──
  // When hook_event_name is present, platform is auto-set to "claude_code"
  // and event_type is auto-mapped from the hook event name.
  "hook_event_name": "string"     // SessionStart | PreToolUse | PostToolUse | etc.
}
```

Success response:

```json
{
  "success": true,
  "event_id": "uuid"
}
```

Error responses:

```jsonc
// 401 - Missing or invalid API key
{ "error": "Missing API key" }
{ "error": "Invalid API key" }

// 500 - Server error
{ "error": "Failed to store event" }
```

GET /api/v1/health
Health check. No authentication required.
```json
{ "status": "ok" }
```

Rate limits
The ingestion API is designed for high-throughput agent telemetry. There are no rate limits during the beta period; once limits are introduced in production, expect per-key limits of at least 1,000 events/minute.
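Clients that emit bursty telemetry may want a guard on their side. Below is a sketch of a sliding-window throttle sized to the expected per-key cap; `EventThrottle` is an illustrative helper, not something Sprint Vibe ships:

```python
import time
from collections import deque

class EventThrottle:
    """Allow at most `limit` events per 60-second sliding window
    (e.g. 1,000/min to match the expected per-key cap)."""

    def __init__(self, limit: int):
        self.limit = limit
        self.sent = deque()  # timestamps of recent sends

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Evict timestamps that have fallen out of the 60s window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False

t = EventThrottle(limit=2)
print(t.allow(0.0), t.allow(1.0), t.allow(2.0), t.allow(61.0))
# True True False True
```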
Authentication
API keys
API keys authenticate agent events. Each key is scoped to a specific project within a tenant. The key format is sv_live_ followed by 64 hex characters. Only the SHA-256 hash is stored — the plaintext is shown once on creation.
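The key scheme described above can be illustrated with a short sketch. The helper names are illustrative, not Sprint Vibe internals, but the format (`sv_live_` + 64 hex characters) and the hash-only storage match the description:

```python
import hashlib
import re
import secrets

def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, sha256_hex). Only the hash would be
    stored; the plaintext is shown to the user exactly once."""
    key = "sv_live_" + secrets.token_hex(32)  # 32 random bytes -> 64 hex chars
    return key, hashlib.sha256(key.encode()).hexdigest()

def verify_api_key(presented: str, stored_hash: str) -> bool:
    """On ingest, hash the presented key and compare to the stored hash."""
    return hashlib.sha256(presented.encode()).hexdigest() == stored_hash

key, digest = generate_api_key()
assert re.fullmatch(r"sv_live_[0-9a-f]{64}", key)
assert verify_api_key(key, digest)
assert not verify_api_key("sv_live_" + "0" * 64, digest)
```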
Create keys in Project Settings → API Keys. You can create multiple keys per project (e.g. one per environment, one per agent team) and revoke them individually.
Dashboard authentication
The Sprint Vibe dashboard uses OAuth (GitHub and Google) via Supabase Auth. All dashboard queries are scoped to your tenant via Row Level Security — you can only see data from projects you have access to.
Security model
- API keys are SHA-256 hashed before storage — Sprint Vibe never stores plaintext keys
- All ingestion goes through a SECURITY DEFINER function — no direct table access
- Row Level Security on all tables — tenant isolation is enforced at the database level
- Hook scripts run curl in the background with a 5-second timeout — they never block your agent
- All data in transit is encrypted (HTTPS/TLS)
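The "never block the agent" behaviour of the hook scripts can be approximated in Python with a daemon thread and a hard timeout. This is a sketch of the pattern only; the real hooks shell out to curl:

```python
import threading
import urllib.request

def fire_and_forget(url: str, data: bytes, headers: dict,
                    timeout: float = 5.0) -> threading.Thread:
    """Send telemetry in the background; the caller never waits.
    Errors and timeouts are swallowed so the agent is never blocked."""
    def _send():
        try:
            req = urllib.request.Request(url, data=data,
                                         headers=headers, method="POST")
            urllib.request.urlopen(req, timeout=timeout)
        except Exception:
            pass  # observability must never crash the agent
    thread = threading.Thread(target=_send, daemon=True)
    thread.start()
    return thread  # returns immediately; _send runs in the background
```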