AgentHub by MVS

Run replay

Run replay re-executes a recorded run end to end against the same inputs, tool versions, and policy snapshot that were in effect when the original ran. It is the platform's tool for incident review and regression checks: deterministic, sandboxed, and side-effect-free.

Viewing a run

Open the dashboard and pick an agent. Each agent’s Runs tab lists every historical execution, sorted newest-first, with the trigger, the duration, the succeeded/failed status, and the cost. Click any row to land on the run detail page, which shows:

  • The input that started the run.
  • Every tool call with its arguments and outputs.
  • Retrieval contexts and the citations the model emitted.
  • Approval decisions, with the approver, the policy, and the timestamp at which each approval was consumed.
  • The final output and any structured artifacts.
  • A timeline of spans you can also open in your trace viewer.

Three replay modes

From the run detail page, click Replay and pick a mode:

  • Identical: same agent version, same model, same tool snapshot. Useful for verifying determinism after an infra change.
  • What-if: same inputs against a newer agent version, so you can see the diff before promoting it. Text outputs are diffed token by token; structured data gets a schema-aware diff.
  • Verbose: same execution with extra debug logging, including every model decision, every retrieval cut, and the full prompt as the runtime saw it. Useful for chasing hard-to-reproduce bugs.
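
To make the two What-if diff styles concrete, here is a minimal Python sketch. It is not the platform's diff engine: the function names, the whitespace tokenizer, and the MISSING marker are all assumptions made for illustration, and "schema-aware" is approximated by walking JSON-like dictionaries key by key.

```python
import difflib

# Hypothetical marker for a key that exists on only one side of the diff.
MISSING = "<missing>"


def text_diff(original: str, replayed: str) -> list[str]:
    """Token-level diff: split on whitespace and keep only changed tokens.

    Removed tokens are prefixed with "- ", added tokens with "+ ".
    """
    a, b = original.split(), replayed.split()
    # ndiff also emits "  " (unchanged) and "? " (hint) lines; drop those.
    return [d for d in difflib.ndiff(a, b) if d[:2] in ("- ", "+ ")]


def structured_diff(original, replayed, path=""):
    """Walk two JSON-like values and list (path, old, new) for each change.

    Dictionaries are compared key by key; any other value (including a
    list) is compared as a whole.
    """
    if isinstance(original, dict) and isinstance(replayed, dict):
        changes = []
        for key in sorted(set(original) | set(replayed)):
            child = f"{path}.{key}" if path else str(key)
            changes += structured_diff(
                original.get(key, MISSING), replayed.get(key, MISSING), child
            )
        return changes
    if original != replayed:
        return [(path, original, replayed)]
    return []
```

For example, structured_diff({"total": 10}, {"total": 12}) reports a single change at the path "total", while key order inside the dictionaries has no effect on the result.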

Sandbox semantics

Replays are sandboxed by default:

  • External tool calls are stubbed with the recorded responses: no email is sent and no CRM record is modified.
  • Approvals are not consumed; the original approval row stays valid for the original execution.
  • Idempotency keys are namespaced under the replay run id so dedupe in your downstream services still works.
  • Replay events carry a replay tag so the cost dashboard separates them from billable runs.

A replay can never double-charge a customer, send a duplicate email, or otherwise reach into the real world.
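
The stubbing and idempotency-key rules above can be pictured with a short Python sketch. Everything here is hypothetical: the RecordedToolStub class, the shape of the recorded calls, and the "replay/<run id>/<key>" layout are illustrative assumptions, not the platform's internals.

```python
import json


class RecordedToolStub:
    """Serves recorded tool outputs instead of calling the real tool."""

    def __init__(self, recorded_calls):
        # recorded_calls: list of {"tool", "args", "output"} dicts taken
        # from the original run (hypothetical shape).
        self._responses = {
            self._key(c["tool"], c["args"]): c["output"] for c in recorded_calls
        }

    @staticmethod
    def _key(tool, args):
        # Canonical JSON so argument order does not affect the lookup.
        return tool + ":" + json.dumps(args, sort_keys=True)

    def call(self, tool, args):
        key = self._key(tool, args)
        if key not in self._responses:
            # Fail closed: a call that was never recorded is refused
            # rather than forwarded to the real tool.
            raise LookupError(f"no recorded response for {tool}")
        return self._responses[key]


def replay_idempotency_key(replay_run_id: str, original_key: str) -> str:
    """Namespace an idempotency key under the replay run id (assumed layout)."""
    return f"replay/{replay_run_id}/{original_key}"
```

Refusing unrecorded calls, instead of passing them through, is the design choice that keeps a sketch like this side-effect free: the only responses a replay can see are the ones the original run already produced.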

Debugging traces

Replays propagate the original run’s W3C traceparent header, so the replay’s spans line up side-by-side with the original in your trace viewer. The detail page also exposes a flat span tree filtered to the current run, with per-span latency, model id, token counts, and tool name.
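
For readers unfamiliar with the header, a W3C traceparent value has four dash-separated hex fields: version, trace id (32 characters), parent span id (16 characters), and flags. The Python sketch below shows how a child span can keep the original trace id so that both sets of spans share one trace. It only handles version 00, and the helper names are illustrative; how the platform itself derives replay span ids is not documented here.

```python
import re
import secrets

# Version 00 layout: 00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str) -> dict:
    """Split a version-00 traceparent header into its fields."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        raise ValueError(f"not a version-00 traceparent: {header!r}")
    trace_id, parent_id, flags = match.groups()
    # All-zero ids are invalid according to the Trace Context spec.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("all-zero trace id or parent id")
    return {"trace_id": trace_id, "parent_id": parent_id, "flags": flags}


def child_traceparent(header: str) -> str:
    """Build a header for a new span that stays in the same trace."""
    fields = parse_traceparent(header)
    span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    while span_id == "0" * 16:      # regenerate the invalid all-zero id
        span_id = secrets.token_hex(8)
    return f"00-{fields['trace_id']}-{span_id}-{fields['flags']}"
```

Because the trace id is carried over unchanged, a trace viewer that groups spans by trace id will show the spans created this way next to the original ones.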

Replaying from the API

You can also start a replay from the API, which is handy for scripted regression suites.

# Replay an existing run in identical mode
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"mode": "identical"}' \
  "https://app.mvsagents.ai/api/v1/orgs/$ORG_ID/runs/$RUN_ID/replay"

# What-if against a specific agent version
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"mode": "what_if", "agent_version": "v1.4.2"}' \
  "https://app.mvsagents.ai/api/v1/orgs/$ORG_ID/runs/$RUN_ID/replay"
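
For a scripted regression suite, the same calls can be built in Python with only the standard library. This is a sketch that mirrors the curl commands above: the endpoint, headers, and body fields come from those commands, while the helper name build_replay_request is an assumption. The response format is not documented on this page, so the sketch stops at constructing the request.

```python
import json
import urllib.request

BASE_URL = "https://app.mvsagents.ai/api/v1"


def build_replay_request(token, org_id, run_id, mode, agent_version=None):
    """Build (but do not send) the POST request that starts a replay."""
    body = {"mode": mode}
    if agent_version is not None:
        # Only the what-if example above passes an agent version.
        body["agent_version"] = agent_version
    return urllib.request.Request(
        url=f"{BASE_URL}/orgs/{org_id}/runs/{run_id}/replay",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it, pass the request to urllib.request.urlopen(...), for example:
#   req = build_replay_request(token, org_id, run_id, "what_if", "v1.4.2")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```

Keeping request construction separate from sending makes the helper easy to unit-test without network access.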

Promote a replay to an eval case

If a what-if replay surfaces a regression, click Save as eval from the replay detail page. The replay’s inputs and expected outputs become a regression case in the agent’s eval suite, and from then on every promotion of that agent must pass it. See /docs/buildouts.
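
The promotion gate described above can be sketched as follows. This is an illustration only: the EvalCase fields, the can_promote helper, and the use of exact equality as the pass criterion are assumptions, and the real eval suite may score outputs differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    """A regression case captured from a replay (hypothetical shape)."""
    name: str
    inputs: dict
    expected_output: dict


def can_promote(agent_fn, eval_suite):
    """Run every case and return (ok, names_of_failed_cases).

    agent_fn stands in for the candidate agent version: it takes the
    case inputs and returns the output to compare.
    """
    failures = [
        case.name
        for case in eval_suite
        if agent_fn(case.inputs) != case.expected_output
    ]
    return (not failures, failures)
```

A candidate version is promotable in this sketch only when the list of failed cases is empty, which matches the rule that every promotion must pass each saved case.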

© 2026 MVS Cloud. AgentHub is operated by MVS Holdings.