Paper Trading Bot Backtest Parity Runbook
CuteMarkets Team
Research

Backtest-to-paper drift is one of the most expensive failure modes in systematic trading research. The strategy can be profitable in historical replay, the paper bot can be technically running, and the live paper trades can still come from a different strategy than the one that was backtested.
That is why a paper trading bot needs a parity runbook.
The runbook is not a performance report. It is an operations checklist that asks whether the live route still matches the research route closely enough to be worth observing.
What Parity Means
Parity means the paper bot and backtest agree on the important decisions:
- the same strategy profile
- the same ticker universe
- the same entry window
- the same signal timing
- the same contract-selection rules
- the same option DTE window
- the same quote-quality gates
- the same risk controls
- the same exit rules
Perfect equality is not always possible. A live route faces data latency, broker behavior, and fresh quote conditions that a historical replay does not. The goal is not to erase those differences. The goal is to classify them.
If the formal backtest rejects a setup because the option bar is unstable, and the paper bot also rejects it for a related quote-side reason, the mismatch may be acceptable. If the paper bot enters because it used a different timing rule, that is a more serious problem.
Step 1: Freeze The Candidate
Before running parity, freeze the candidate in a launch contract.
The contract should identify:
- profile name
- tickers
- option DTE window
- contract status rules
- risk per trade
- max trades per day
- strict parity flag
- quote freshness settings
- routing policy
- kill-switch baseline
- artifact root
Do not run parity from memory. The launch contract is the source of truth.
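As a minimal sketch of what a frozen launch contract could look like in code, the example below uses an immutable dataclass whose fields mirror the list above. The class name, field names, field types, and the `fingerprint` helper are all assumptions made for illustration, not a CuteMarkets schema. The fingerprint gives every parity run a short identifier that changes whenever any contract field changes.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LaunchContract:
    """Hypothetical launch contract; field names and types are assumptions."""

    profile_name: str
    tickers: tuple[str, ...]
    dte_min: int
    dte_max: int
    contract_status_rules: str
    risk_per_trade: float
    max_trades_per_day: int
    strict_parity: bool
    max_quote_age_seconds: float
    routing_policy: str
    kill_switch_baseline: str
    artifact_root: str

    def fingerprint(self) -> str:
        """Short stable hash of every field, for labeling run artifacts."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Placeholder values only; none of these come from a real configuration.
example = LaunchContract(
    profile_name="example_profile",
    tickers=("AAA", "BBB"),
    dte_min=7,
    dte_max=21,
    contract_status_rules="active_only",
    risk_per_trade=100.0,
    max_trades_per_day=2,
    strict_parity=True,
    max_quote_age_seconds=5.0,
    routing_policy="paper_only",
    kill_switch_baseline="baseline_v1",
    artifact_root="artifacts/example_run",
)
print(example.fingerprint())
```

Because the dataclass is frozen, any attempt to mutate a field after construction raises an error, which matches the rule that the contract is the source of truth rather than whatever the operator remembers.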
Step 2: Run A Targeted Test Gate
Run a small targeted test gate before paper validation. The tests should cover:
- profile lookup
- CLI argument rendering
- contract loading
- parity replay command generation
- broker adapter dry-run behavior
- state and funnel serialization
- quote freshness and rejection logic
- risk-control evaluation
This is not a full research sweep. It is a confidence check that the operational path still loads the expected code.
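One way to picture the gate is a small runner that executes a named set of checks and reports which ones failed. The sketch below covers only the quote-freshness item from the list above; `quote_is_fresh`, the two check functions, and `run_gate` are hypothetical names, and the freshness rule itself (age between zero and a maximum) is an assumption about how such a gate might work.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List


def quote_is_fresh(quote_time: datetime, now: datetime, max_age_seconds: float) -> bool:
    """Hypothetical gate: accept quotes aged between 0 and max_age_seconds."""
    age = (now - quote_time).total_seconds()
    return 0 <= age <= max_age_seconds


def check_fresh_quote_accepted() -> None:
    now = datetime(2024, 1, 2, 15, 0, tzinfo=timezone.utc)
    assert quote_is_fresh(now - timedelta(seconds=1), now, max_age_seconds=5)


def check_stale_quote_rejected() -> None:
    now = datetime(2024, 1, 2, 15, 0, tzinfo=timezone.utc)
    assert not quote_is_fresh(now - timedelta(seconds=30), now, max_age_seconds=5)


def run_gate(checks: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failures = []
    for name, check in checks.items():
        try:
            check()
        except Exception:  # any exception counts as a gate failure
            failures.append(name)
    return failures


GATE = {
    "quote_fresh_accepted": check_fresh_quote_accepted,
    "quote_stale_rejected": check_stale_quote_rejected,
}

failed = run_gate(GATE)
print("gate passed" if not failed else f"gate failed: {failed}")
```

In a real repository these checks would more likely live in an existing test suite; the plain runner here only keeps the sketch free of dependencies.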
Step 3: Check Import Origin
Import-origin mistakes are easy to miss in research repos with multiple workspaces, notebooks, or copied scripts.
Before running parity, print or check the module path for the key runtime objects:
- CLI entrypoint
- strategy profile registry
- paper bot runtime
- parity helper
- broker adapter
The purpose is simple: make sure the bot is running the checked-out code you think it is running.
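A rough sketch of that check, using only the standard library: import each module by name, read its `__file__`, and flag anything that does not resolve under the expected checkout root. The helper names are hypothetical, and the example uses the standard-library `json` module as a stand-in because the real module names for the CLI, registry, runtime, parity helper, and adapter depend on the repository.

```python
import importlib
from pathlib import Path
from typing import Iterable, List, Optional


def module_origin(module_name: str) -> Optional[Path]:
    """Return the resolved file a module was imported from, if it has one."""
    module = importlib.import_module(module_name)
    origin = getattr(module, "__file__", None)
    return Path(origin).resolve() if origin else None


def origins_outside_root(module_names: Iterable[str], expected_root: Path) -> List[str]:
    """List modules whose origin is missing or not under expected_root."""
    root = Path(expected_root).resolve()
    offenders = []
    for name in module_names:
        origin = module_origin(name)
        if origin is None:
            offenders.append(name)  # built-in or namespace module: no file to verify
            continue
        try:
            origin.relative_to(root)
        except ValueError:
            offenders.append(name)
    return offenders


# Stand-in example: replace "json" with the repository's own module names.
stdlib_root = module_origin("json").parent.parent
print(module_origin("json"))
print(origins_outside_root(["json"], stdlib_root))
```

An empty list means every named module was loaded from under the expected root. Any name in the list deserves a look before the parity run starts.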
Step 4: Replay Benchmark Sessions
Pick benchmark sessions before looking at today's live behavior. Good benchmark sessions include:
- one expected trade day
- one expected no-trade day
- one known quote-rejection day
- one exit-heavy or stop-heavy day
- one edge case around expiry or missing data
The parity command should run from the launch contract and write outputs into a run-labeled artifact directory.
For each benchmark, record:
- backtest decision
- paper-style decision
- selected contract
- entry time
- exit time
- rejection reason
- quote-side decision
- PnL difference if both paths traded
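The record above could be captured with a small structure like the following sketch. `PathResult` and `compare_session` are hypothetical names, times are kept as plain strings for simplicity, and the contract identifier in the usage example is a placeholder rather than a real symbol. The PnL difference is computed only when both paths traded, as the list specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PathResult:
    """Outcome of one path (backtest or paper-style) for one session."""

    traded: bool
    contract: Optional[str] = None
    entry_time: Optional[str] = None
    exit_time: Optional[str] = None
    rejection_reason: Optional[str] = None
    quote_decision: Optional[str] = None
    pnl: Optional[float] = None


def compare_session(session: str, backtest: PathResult, paper: PathResult) -> dict:
    """Build one benchmark record comparing the two paths field by field."""
    both_have_pnl = (
        backtest.traded and paper.traded
        and backtest.pnl is not None and paper.pnl is not None
    )
    return {
        "session": session,
        "backtest_decision": "trade" if backtest.traded else "no_trade",
        "paper_decision": "trade" if paper.traded else "no_trade",
        "decision_match": backtest.traded == paper.traded,
        "contract_match": backtest.contract == paper.contract,
        "entry_time_match": backtest.entry_time == paper.entry_time,
        "exit_time_match": backtest.exit_time == paper.exit_time,
        "backtest_rejection": backtest.rejection_reason,
        "paper_rejection": paper.rejection_reason,
        "backtest_quote_decision": backtest.quote_decision,
        "paper_quote_decision": paper.quote_decision,
        "pnl_difference": (paper.pnl - backtest.pnl) if both_have_pnl else None,
    }


# Placeholder values for illustration only.
bt = PathResult(True, "EXAMPLE-CALL-1", "09:45", "11:30", None, "accept", 120.0)
pp = PathResult(True, "EXAMPLE-CALL-1", "09:46", "11:30", None, "accept", 95.5)
record = compare_session("expected_trade_day", bt, pp)
print(record["entry_time_match"], record["pnl_difference"])
```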
Step 5: Classify Mismatches
Every mismatch needs a label.
| Mismatch | Typical meaning |
|---|---|
| signal_mismatch | Entry setup or timing logic differs |
| contract_mismatch | Option selector chose a different instrument |
| microstructure_reject | Quote, spread, bar range, or liquidity gate blocked the trade |
| data_missing | Required bars, contracts, or quotes were unavailable |
| broker_reject | Paper broker did not accept the order |
| state_mismatch | Existing position or duplicate-entry guard changed behavior |
| risk_control_block | Kill switch, daily loss cap, or max-trade rule blocked entry |
Some mismatches are productively conservative. Others invalidate the paper observation. The runbook should separate those categories instead of treating every difference as a generic failure.
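To make the separation concrete, the sketch below encodes the labels from the table as an enum and triages a list of observed mismatches against a policy set. The label values come from the table; the `INVALIDATING` set is an assumed example policy, not a rule stated in this runbook, and each team would decide its own membership.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, Set


class Mismatch(str, Enum):
    SIGNAL = "signal_mismatch"
    CONTRACT = "contract_mismatch"
    MICROSTRUCTURE = "microstructure_reject"
    DATA_MISSING = "data_missing"
    BROKER_REJECT = "broker_reject"
    STATE = "state_mismatch"
    RISK_CONTROL = "risk_control_block"


# ASSUMPTION: example policy only. Which labels invalidate a paper
# observation is a team decision; this set is purely illustrative.
INVALIDATING: Set[Mismatch] = {Mismatch.SIGNAL, Mismatch.CONTRACT, Mismatch.STATE}


def triage(labels: Iterable[str], invalidating: Set[Mismatch] = INVALIDATING) -> dict:
    """Count mismatches by label and flag those that invalidate the run.

    Unknown labels raise ValueError, so no mismatch goes unlabeled.
    """
    counts = Counter(Mismatch(label) for label in labels)
    blocking = sorted(m.value for m in counts if m in invalidating)
    return {
        "counts": {m.value: n for m, n in counts.items()},
        "blocking": blocking,
        "observation_valid": not blocking,
    }


print(triage(["microstructure_reject", "microstructure_reject", "data_missing"]))
print(triage(["signal_mismatch", "broker_reject"]))
```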
Step 6: Run A Dry-Run Smoke
A dry-run smoke is a single live-data cycle that does not place a real paper order. It should verify:
- credentials are present without printing them
- data provider calls work
- broker account lookup works or is safely mocked
- state path is writable
- trade, funnel, and execution artifacts can be created
- kill switch baseline loads
- no duplicate active-trade state exists unexpectedly
Run this during the session if the bot depends on live quotes or live bars. A weekend smoke can prove imports and paths, but it cannot validate market-session behavior.
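Two of the checks above, credential presence and a writable state path, can be sketched with the standard library alone. The function names and the environment variable names in the example are hypothetical. The report holds only variable names and booleans, so credential values never reach a log.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable, Mapping


def credential_present(env_var: str, environ: Mapping[str, str] = os.environ) -> bool:
    """True if the variable is set and non-blank; the value is never returned."""
    return bool(environ.get(env_var, "").strip())


def path_is_writable(directory: Path) -> bool:
    """Try to create and remove a temporary file inside the directory."""
    try:
        directory.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=directory, prefix=".smoke-"):
            pass  # file is deleted automatically on close
        return True
    except OSError:
        return False


def smoke_report(
    env_vars: Iterable[str],
    state_dir: str,
    environ: Mapping[str, str] = os.environ,
) -> dict:
    """Collect pass/fail flags without including any secret values."""
    report = {f"credential:{name}": credential_present(name, environ) for name in env_vars}
    report["state_path_writable"] = path_is_writable(Path(state_dir))
    return report


# Hypothetical variable names and a throwaway directory for illustration.
with tempfile.TemporaryDirectory() as tmp:
    fake_env = {"EXAMPLE_DATA_API_KEY": "not-a-real-secret"}
    print(smoke_report(["EXAMPLE_DATA_API_KEY", "EXAMPLE_BROKER_KEY"], tmp, fake_env))
```

Data provider calls, broker lookup, and the kill-switch baseline need the same treatment but depend on the specific provider and broker, so they are left out of this sketch.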
Step 7: Start Limited Paper
Only after parity and smoke should the bot run a limited paper loop.
Keep the first scope small:
- one frozen candidate
- one launch contract
- one basket or one symbol group
- one artifact root
- low max trades per day
- strict logging
- daily review required
The first objective is not to increase activity. It is to collect clean drift evidence.
Step 8: Review The Session
The daily review should be short enough to complete every day.
Minimum checklist:
- opened versus expected trades
- no-trade symbols and rejection reasons
- contract mismatches
- parity mismatches
- fill failures
- broker rejects
- stale orders
- duplicate-entry prevention
- kill-switch state
- daily loss-cap state
- open positions after the close
- decision for next session
If the review cannot explain the day, do not scale the bot. Fix observability first.
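The review can end in a mechanical decision so that "do not scale" is not left to mood. The sketch below takes a subset of the checklist and returns one of three outcomes. The field names, the outcome strings, and the specific rules are assumptions for illustration; the only rule taken from the text is that an unexplained day blocks scaling until observability is fixed.

```python
from dataclasses import dataclass


@dataclass
class DailyReview:
    """Hypothetical summary of one session, drawn from the checklist."""

    expected_trades: int
    opened_trades: int
    unexplained_mismatches: int
    broker_rejects: int
    stale_orders: int
    open_positions_after_close: int
    kill_switch_tripped: bool


def next_session_decision(review: DailyReview) -> str:
    """Return an assumed three-way decision for the next session."""
    # From the runbook: an unexplained day means fix observability first.
    if review.unexplained_mismatches > 0:
        return "fix_observability"
    # ASSUMPTION: operational hygiene issues pause the bot but are not
    # treated as observability failures.
    if (
        review.kill_switch_tripped
        or review.stale_orders > 0
        or review.open_positions_after_close > 0
    ):
        return "hold"
    return "continue_limited"


clean_day = DailyReview(2, 2, 0, 0, 0, 0, False)
murky_day = DailyReview(2, 1, 1, 0, 0, 0, False)
print(next_session_decision(clean_day))
print(next_session_decision(murky_day))
```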
What To Keep Public
For public research notes, publish the method and the failure classes:
- backtest realism requirements
- quote-aware fill assumptions
- parity ladder
- promotion discipline
- paper review checklist
Avoid publishing private server paths, process IDs, secrets, broker account details, and exact operational state from live workspaces.
That distinction lets a team share useful engineering lessons without exposing the execution environment.
How CuteMarkets Fits
CuteMarkets provides the data surface that makes parity measurable:
- contract lists for instrument identity
- chain snapshots for current surface checks
- quotes for spread and freshness
- trades for print evidence
- aggregates for historical replay
- typed Python access through cutemarkets-python
- public research runtime through cutebacktests
The broker still owns order placement. The paper bot owns state, risk controls, and review. The data API owns market evidence.
Backtest parity is the bridge between those pieces.