Data Engineering · May 1, 2026 · 6 min read

Paper Trading Bot Backtest Parity Runbook

CuteMarkets Team

Research

Backtest-to-paper drift is one of the most expensive failure modes in systematic trading research. The strategy can be profitable in historical replay, the paper bot can be technically running, and the paper trades it places can still come from a different strategy than the one that was backtested.

That is why a paper trading bot needs a parity runbook.

The runbook is not a performance report. It is an operations checklist that asks whether the live route still matches the research route closely enough to be worth observing.

What Parity Means

Parity means the paper bot and backtest agree on the important decisions:

the same strategy profile
the same ticker universe
the same entry window
the same signal timing
the same contract-selection rules
the same option DTE window
the same quote-quality gates
the same risk controls
the same exit rules

Perfect equality is not always possible. The live path adds data latency, broker behavior, and fresh quote conditions that historical replay does not have. The goal is not to erase those differences. The goal is to classify them.

If the formal backtest rejects a setup because the option bar is unstable, and the paper bot also rejects it for a related quote-side reason, the mismatch may be acceptable. If the paper bot enters because it used a different timing rule, that is a more serious problem.

Step 1: Freeze The Candidate

Before running parity, freeze the candidate in a launch contract.

The contract should identify:

profile name
tickers
option DTE window
contract status rules
risk per trade
max trades per day
strict parity flag
quote freshness settings
routing policy
kill-switch baseline
artifact root

Do not run parity from memory. The launch contract is the source of truth.
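One way to make the launch contract hard to edit by accident is to load it into an immutable object. The sketch below is a minimal illustration, not the CuteMarkets format: the `LaunchContract` class, its field names, and the JSON layout are all assumptions chosen to mirror the list above.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LaunchContract:
    """Frozen description of one candidate. Field names are hypothetical."""
    profile_name: str
    tickers: Tuple[str, ...]
    dte_min: int
    dte_max: int
    contract_status_rules: str
    risk_per_trade: float
    max_trades_per_day: int
    strict_parity: bool
    quote_max_age_seconds: float
    routing_policy: str
    kill_switch_baseline: str
    artifact_root: str


def load_contract(text: str) -> LaunchContract:
    """Parse a JSON contract; missing or unknown keys raise TypeError."""
    raw = json.loads(text)
    raw["tickers"] = tuple(raw["tickers"])  # tuples keep the frozen object immutable
    return LaunchContract(**raw)


# Example values are placeholders, not a recommended configuration.
EXAMPLE_FIELDS = {
    "profile_name": "demo_profile",
    "tickers": ["SPY", "QQQ"],
    "dte_min": 0,
    "dte_max": 7,
    "contract_status_rules": "active_only",
    "risk_per_trade": 100.0,
    "max_trades_per_day": 2,
    "strict_parity": True,
    "quote_max_age_seconds": 5.0,
    "routing_policy": "paper_only",
    "kill_switch_baseline": "baseline_v1",
    "artifact_root": "artifacts/demo_profile",
}

contract = load_contract(json.dumps(EXAMPLE_FIELDS))
print(contract.profile_name, contract.tickers)
```

Because the dataclass is frozen, any attempt to change a field after loading raises an error, which keeps the parity run tied to the contract file rather than to in-memory edits.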

Step 2: Run A Targeted Test Gate

Run a small targeted test gate before paper validation. The tests should cover:

profile lookup
CLI argument rendering
contract loading
parity replay command generation
broker adapter dry-run behavior
state and funnel serialization
quote freshness and rejection logic
risk-control evaluation

This is not a full research sweep. It is a confidence check that the operational path still loads the expected code.
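As an illustration of how small such a gate can be, the sketch below tests only CLI argument rendering. The `render_parity_args` function and its flag names are hypothetical stand-ins for whatever the real entrypoint accepts; the point is that the gate is a handful of fast assertions, not a sweep.

```python
import unittest


def render_parity_args(contract: dict, session: str) -> list:
    """Render CLI arguments for one parity replay (flag names are hypothetical)."""
    args = [
        "--profile", contract["profile_name"],
        "--tickers", ",".join(contract["tickers"]),
        "--session", session,
        "--artifact-root", contract["artifact_root"],
    ]
    if contract.get("strict_parity", False):
        args.append("--strict-parity")
    return args


class TargetedGate(unittest.TestCase):
    CONTRACT = {
        "profile_name": "demo_profile",
        "tickers": ["SPY", "QQQ"],
        "artifact_root": "artifacts/demo_profile",
        "strict_parity": True,
    }

    def test_cli_rendering_is_deterministic(self):
        first = render_parity_args(self.CONTRACT, "2026-04-30")
        second = render_parity_args(self.CONTRACT, "2026-04-30")
        self.assertEqual(first, second)
        self.assertIn("--strict-parity", first)

    def test_missing_field_fails_loudly(self):
        broken = dict(self.CONTRACT)
        del broken["artifact_root"]
        with self.assertRaises(KeyError):
            render_parity_args(broken, "2026-04-30")


def run_gate() -> bool:
    """Run the gate and return True only if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TargetedGate)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    raise SystemExit(0 if run_gate() else 1)
```

Returning a non-zero exit code on failure lets a wrapper script refuse to start paper validation when the gate is red.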

Step 3: Check Import Origin

Import-origin mistakes are easy to miss in research repos with multiple workspaces, notebooks, or copied scripts.

Before running parity, print or check the module path for the key runtime objects:

CLI entrypoint
strategy profile registry
paper bot runtime
parity helper
broker adapter

The purpose is simple: make sure the bot is running the checked-out code you think it is running.
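A minimal sketch of that check, using only the standard library: it resolves each module's `__file__` and confirms it sits under the expected checkout. The module names below are standard-library stand-ins; in a real repo you would list your own entrypoint, registry, runtime, parity helper, and adapter modules, and pass the repository root.

```python
import importlib
from pathlib import Path


def module_origin(name: str) -> Path:
    """Return the resolved file path a module was actually imported from."""
    module = importlib.import_module(name)
    origin = getattr(module, "__file__", None)
    if origin is None:
        raise RuntimeError(f"{name} has no file origin (built-in or namespace package)")
    return Path(origin).resolve()


def check_origins(names, expected_root: Path) -> dict:
    """Map each module name to True when it was loaded from under expected_root."""
    root = expected_root.resolve()
    return {name: root in module_origin(name).parents for name in names}


# Stand-in modules: replace with the project's own runtime modules.
stand_ins = ["json", "dataclasses"]
stdlib_root = module_origin("dataclasses").parent

for name, ok in check_origins(stand_ins, stdlib_root).items():
    print(f"{name}: {module_origin(name)} ({'ok' if ok else 'UNEXPECTED ORIGIN'})")
```

If any module resolves outside the expected root, for example to a stale copy in another workspace or an installed package shadowing the checkout, stop before running parity.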

Step 4: Replay Benchmark Sessions

Pick benchmark sessions before looking at today's live behavior. Good benchmark sessions include:

one expected trade day
one expected no-trade day
one known quote-rejection day
one exit-heavy or stop-heavy day
one edge case around expiry or missing data

The parity command should run from the launch contract and write outputs into a run-labeled artifact directory.

For each benchmark, record:

backtest decision
paper-style decision
selected contract
entry time
exit time
rejection reason
quote-side decision
PnL difference if both paths traded
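The per-benchmark record can be captured as a small comparison of two decision objects. This is a sketch under assumptions: the `Decision` fields cover a subset of the list above, and the contract label and times in the example are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    """One path's outcome for a benchmark session (fields are illustrative)."""
    traded: bool
    contract: Optional[str] = None
    entry_time: Optional[str] = None
    exit_time: Optional[str] = None
    rejection_reason: Optional[str] = None
    pnl: Optional[float] = None


COMPARED_FIELDS = ["traded", "contract", "entry_time", "exit_time", "rejection_reason"]


def compare(backtest: Decision, paper: Decision) -> dict:
    """List the fields where the two paths disagree, plus PnL drift if both traded."""
    differing = [
        field for field in COMPARED_FIELDS
        if getattr(backtest, field) != getattr(paper, field)
    ]
    pnl_diff = None
    both_traded = backtest.traded and paper.traded
    if both_traded and backtest.pnl is not None and paper.pnl is not None:
        pnl_diff = round(paper.pnl - backtest.pnl, 2)
    return {"match": not differing, "differing_fields": differing, "pnl_diff": pnl_diff}


# Placeholder values for one expected-trade benchmark.
backtest = Decision(True, "CONTRACT_A", "09:45", "10:30", None, 120.0)
paper = Decision(True, "CONTRACT_A", "09:46", "10:30", None, 95.5)

record = compare(backtest, paper)
print(record)
```

A record like this does not say whether the difference is acceptable. It only makes the difference explicit so it can be labeled.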

Step 5: Classify Mismatches

Every mismatch needs a label.

Mismatch	Typical meaning
`signal_mismatch`	Entry setup or timing logic differs
`contract_mismatch`	Option selector chose a different instrument
`microstructure_reject`	Quote, spread, bar range, or liquidity gate blocked the trade
`data_missing`	Required bars, contracts, or quotes were unavailable
`broker_reject`	Paper broker did not accept the order
`state_mismatch`	Existing position or duplicate-entry guard changed behavior
`risk_control_block`	Kill switch, daily loss cap, or max-trade rule blocked entry

Some mismatches are productively conservative. Others invalidate the paper observation. The runbook should separate those categories instead of treating every difference as a generic failure.
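The labels in the table map naturally onto an enum, and the conservative-versus-invalidating split can be a single set. The sketch below is illustrative: which labels count as invalidating is an assumption made here for the example, and each team should set that split for its own strategy.

```python
from collections import Counter
from enum import Enum


class Mismatch(str, Enum):
    SIGNAL = "signal_mismatch"
    CONTRACT = "contract_mismatch"
    MICROSTRUCTURE = "microstructure_reject"
    DATA_MISSING = "data_missing"
    BROKER_REJECT = "broker_reject"
    STATE = "state_mismatch"
    RISK_CONTROL = "risk_control_block"


# ASSUMPTION: this split is an example policy, not a rule from the runbook.
INVALIDATING = {Mismatch.SIGNAL, Mismatch.CONTRACT, Mismatch.STATE}


def triage(labels) -> dict:
    """Count labels and flag the ones that invalidate the paper observation."""
    parsed = [Mismatch(label) for label in labels]  # unknown labels raise ValueError
    counts = Counter(m.value for m in parsed)
    blocking = sorted({m.value for m in parsed if m in INVALIDATING})
    return {
        "counts": dict(counts),
        "blocking": blocking,
        "observation_valid": not blocking,
    }


summary = triage(["microstructure_reject", "signal_mismatch", "microstructure_reject"])
print(summary)
```

Rejecting unknown labels at parse time keeps the taxonomy closed, so a new kind of mismatch has to be named and classified before it can appear in a report.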

Step 6: Run A Dry-Run Smoke

A dry-run smoke is a single live-data cycle that does not place a real paper order. It should verify:

credentials are present without printing them
data provider calls work
broker account lookup works or is safely mocked
state path is writable
trade, funnel, and execution artifacts can be created
kill switch baseline loads
no duplicate active-trade state exists unexpectedly

Run this during market hours if the bot depends on live quotes or live bars. A weekend smoke test can prove imports and paths, but it cannot validate market-session behavior.
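A few of these checks need nothing beyond the standard library. The sketch below covers credential presence, state-path writability, and the duplicate active-trade check. The environment variable name and the `active_trade.json` filename are hypothetical, and the data-provider and broker checks are left out because they depend on the specific client in use.

```python
import os
import tempfile
from pathlib import Path


def credentials_present(names, env=None) -> dict:
    """Report only whether each variable is set; values are never returned or printed."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in names}


def state_path_writable(state_dir: Path) -> bool:
    """Create the directory if needed and prove a file can be written there."""
    try:
        state_dir.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=state_dir, prefix=".smoke_"):
            pass  # the probe file is deleted when the context exits
        return True
    except OSError:
        return False


def no_unexpected_active_trade(state_dir: Path, filename: str = "active_trade.json") -> bool:
    """True when no leftover active-trade state file exists (filename is hypothetical)."""
    return not (state_dir / filename).exists()


def run_smoke(state_dir: Path, credential_names, env=None) -> dict:
    """Run the local checks and return a name -> passed mapping."""
    checks = {
        f"credential:{name}": ok
        for name, ok in credentials_present(credential_names, env).items()
    }
    checks["state_path_writable"] = state_path_writable(state_dir)
    checks["no_active_trade"] = no_unexpected_active_trade(state_dir)
    return checks


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        results = run_smoke(Path(scratch) / "state", ["DEMO_API_KEY"])
        for check, passed in results.items():
            print(f"{'PASS' if passed else 'FAIL'} {check}")
```

Reporting a boolean per credential, rather than the value, keeps the smoke output safe to paste into a review note.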

Step 7: Start Limited Paper

Only after parity and smoke should the bot run a limited paper loop.

Keep the first scope small:

one frozen candidate
one launch contract
one basket or one symbol group
one artifact root
low max trades per day
strict logging
daily review required

The first objective is not to increase activity. It is to collect clean drift evidence.

Step 8: Review The Session

The daily review should be short enough to complete every day.

Minimum checklist:

opened versus expected trades
no-trade symbols and rejection reasons
contract mismatches
parity mismatches
fill failures
broker rejects
stale orders
duplicate-entry prevention
kill-switch state
daily loss-cap state
open positions after the close
decision for next session

If the review cannot explain the day, do not scale the bot. Fix observability first.
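The "explain the day or do not scale" rule can be reduced to a set comparison. The sketch below assumes the review has three inputs: the symbols expected to trade, the symbols that actually opened, and a mapping from symbol to a recorded explanation. The decision strings are hypothetical labels.

```python
def review_session(expected: set, opened: set, explanations: dict) -> dict:
    """Compare expected and opened trades and decide whether the day is explained."""
    missing = sorted(expected - opened)      # expected to trade but did not
    unexpected = sorted(opened - expected)   # traded without being expected
    unexplained = [
        symbol for symbol in missing + unexpected
        if not explanations.get(symbol)
    ]
    decision = "hold_and_fix_observability" if unexplained else "continue_limited_paper"
    return {
        "missing": missing,
        "unexpected": unexpected,
        "unexplained": unexplained,
        "decision": decision,
    }


# Placeholder day: QQQ was skipped with a recorded reason, IWM opened with none.
report = review_session(
    expected={"SPY", "QQQ"},
    opened={"SPY", "IWM"},
    explanations={"QQQ": "microstructure_reject"},
)
print(report)
```

In this example the skipped trade is explained by a rejection label, but the unexpected one is not, so the session is held rather than scaled.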

What To Keep Public

For public research notes, publish the method and the failure classes:

backtest realism requirements
quote-aware fill assumptions
parity ladder
promotion discipline
paper review checklist

Avoid publishing private server paths, process IDs, secrets, broker account details, and exact operational state from live workspaces.

That distinction lets a team share useful engineering lessons without exposing the execution environment.

How CuteMarkets Fits

CuteMarkets provides the data surface that makes parity measurable:

contract lists for instrument identity
chain snapshots for current surface checks
quotes for spread and freshness
trades for print evidence
aggregates for historical replay
typed Python access through cutemarkets-python
public research runtime through cutebacktests

The broker still owns order placement. The paper bot owns state, risk controls, and review. The data API owns market evidence.

Backtest parity is the bridge between those pieces.
