This page covers the key concepts behind agent simulation: the two simulation modes and how Snowglobe’s mocking layer works. Understanding these will help you make better decisions when setting up your agent.

Two simulation modes

Snowglobe offers two modes for agent simulation, depending on how much data you have available.

Probe mode

Probe mode is the default. It requires minimal input: just your agent description, tool schemas, and a few tool input/output samples. Snowglobe generates a diverse set of conversations designed to exercise your tools broadly, covering happy paths, error cases, and edge cases. You can optionally pass historical data to give Snowglobe more context, but it isn’t required. Use probe mode when:
  • You’re building a new agent and don’t have production data yet
  • You want broad coverage across all your tools
  • You want to quickly discover how your agent handles unexpected inputs
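To make the probe-mode inputs concrete, here is an illustration of what a tool schema plus a few input/output samples might look like. The exact format Snowglobe expects may differ; the tool name, fields, and sample data below are all invented for illustration.

```python
# Hypothetical example of probe-mode inputs: one tool schema (JSON-Schema
# style parameters) plus input/output samples that show realistic data.

lookup_order_schema = {
    "name": "lookup_order",
    "description": "Fetch an order's status and line items by order ID.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "description": "Internal order ID"},
        },
        "required": ["order_id"],
    },
}

# A couple of samples give Snowglobe a sense of both success and failure shapes.
lookup_order_samples = [
    {
        "input": {"order_id": "ord_1042"},
        "output": {"status": "shipped", "items": [{"sku": "MUG-01", "qty": 2}]},
    },
    {
        "input": {"order_id": "ord_9999"},
        "output": {"error": "order not found"},
    },
]
```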

Distribution matching mode

Distribution matching mode is available after you provide historical conversation data to the Guardrails team. In this mode, Snowglobe tunes simulations to match the patterns in your real traffic: chat topics, message length, conversation length, tool call distributions, and more. Use distribution matching mode when:
  • You have production conversation data available
  • You want simulations that closely mirror real user behavior
  • You need to validate performance against realistic usage patterns
Distribution matching requires a tuning step by the Guardrails team. This mode is not available for greenfield applications without historical data. Email admin@guardrailsai.com to get started with distribution matching.

How tool mocking works

Agents use tools. Some tools perform simple lookups (retrieve the weather, check a status), while others execute complex workflows that modify state (create orders, update customer records, process refunds). When Snowglobe generates simulated conversations, it creates synthetic users, IDs, and scenarios that don’t exist in your real systems. Tools that query your database or API will fail because the referenced data isn’t there. This is the central challenge of agent simulation: how do you run end-to-end conversations when the simulated data doesn’t exist in your backend?
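The failure mode is easy to see in miniature. The sketch below uses an invented customer-lookup tool and an in-memory stand-in for a database: when a simulation invents a customer ID, the real lookup has nothing to return.

```python
# Minimal sketch of why unmocked tools break during simulation.
# The tool name, IDs, and data here are hypothetical.

CUSTOMER_DB = {"cust_001": {"name": "Ada", "plan": "pro"}}  # real records

def get_customer(customer_id: str) -> dict:
    """Query the (stand-in) customer database by ID."""
    record = CUSTOMER_DB.get(customer_id)
    if record is None:
        raise KeyError(f"customer not found: {customer_id}")
    return record

# A simulated conversation references an ID invented for the scenario,
# so the real backend has no matching record and the call fails.
try:
    get_customer("cust_sim_8f3a")
except KeyError as err:
    print(err)  # without mocking, the conversation breaks here
```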

Dynamic mocking

Snowglobe solves this with a dynamic mocking layer built into the SDK:
  1. You wrap your tool functions with the @snowglobe_tool decorator
  2. When your agent runs normally (outside of simulation), the decorator does nothing; your tools execute as usual
  3. During a simulation, when your agent makes a tool call, the request is intercepted and sent to Snowglobe instead of your real backend
  4. Snowglobe generates a realistic and logically consistent mock response based on the tool’s schema, the input arguments, and the conversation context
  5. The mock response is injected back into your agent’s flow as if the real tool had responded
Your agent completes the full conversation loop (reasoning, calling tools, processing responses, replying to the user) without any production data being read or modified.

What gets mocked

Snowglobe’s mock responses cover a range of scenarios, not just happy paths:
  • Successful responses with realistic data matching your tool’s output schema
  • Error responses that simulate downstream failures (e.g., “customer not found”, “order already cancelled”)
  • Edge cases like empty results, malformed data, or unexpected field values
This variety is intentional. It tests how your agent handles failure, not just success.
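As a concrete illustration, the three response categories might look like the payloads below. These are invented examples; real mocks are generated from your tool’s actual output schema and the conversation context.

```python
# Hypothetical mock payloads for an order-lookup tool, one per category.

happy_path = {"order_id": "ord_3021", "status": "shipped", "items": 2}
downstream_error = {"error": "order already cancelled", "code": 409}
edge_case = {"order_id": "ord_7718", "status": None, "items": 0}  # empty result

# An agent under test has to branch correctly on all three shapes.
for response in (happy_path, downstream_error, edge_case):
    outcome = "failure" if "error" in response else "success"
    print(outcome, response)
```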

Not all tools need mocking

Some tools work fine with simulated data as-is. For example, a weather lookup tool will return valid results regardless of whether the “user” asking is real or simulated. It just needs a location string, which your agent already provides. Snowglobe gives you control over which tools get mocked. You only need to apply the @snowglobe_tool decorator to functions where simulated inputs would cause failures. Typically, these are tools that:
  • Query a database for specific user or entity records
  • Modify state (create, update, delete operations)
  • Depend on IDs or references that must exist in your system
Tools that are stateless or accept generic inputs (like a calculator, a search engine, or a weather API) can often run undecorated during simulations.
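The split between decorated and undecorated tools might look like this. `@snowglobe_tool` is the decorator named in this doc, but its import path and the tool bodies below are assumptions for the sake of a runnable sketch.

```python
# Sketch of the decision rule: decorate tools whose inputs must exist in
# your backend; leave stateless, generic-input tools undecorated.

# from snowglobe import snowglobe_tool  # assumed import path

def snowglobe_tool(func):  # inert stand-in so this sketch runs on its own
    return func

@snowglobe_tool
def cancel_order(order_id: str) -> dict:
    """Modifies state and depends on a real order ID -> needs mocking."""
    return {"order_id": order_id, "status": "cancelled"}

def get_weather(city: str) -> dict:
    """Stateless and accepts any city string -> can run unmocked."""
    return {"city": city, "forecast": "sunny"}
```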

Zero impact on production code

The mocking layer has no effect on your production code:
  • The @snowglobe_tool decorator is completely inert during normal execution
  • No production data is read, written, or modified during simulations, aside from any historical conversation data you explicitly provide for distribution matching
  • You don’t need to set up test databases or seed fake data
  • Mock responses are generated dynamically per-conversation, so every simulation produces fresh scenarios
You can run hundreds of simulated conversations against your real agent code in minutes.
The getting started guide walks through every step of setup.

Getting started

Register tools, instrument your code, and run your first agent simulation.