Data Engineering · May 1, 2026 · 6 min read

Paper Trading Bot Backtest Parity Runbook

CuteMarkets Team

Research

Backtest-to-paper drift is one of the most expensive failure modes in systematic trading research. A strategy can be profitable in historical replay, the paper bot can be running without errors, and the live paper trades can still come from a different strategy than the one that was backtested.

That is why a paper trading bot needs a parity runbook.

The runbook is not a performance report. It is an operations checklist that asks whether the live route still matches the research route closely enough to be worth observing.

What Parity Means

Parity means the paper bot and backtest agree on the important decisions:

  • the same strategy profile
  • the same ticker universe
  • the same entry window
  • the same signal timing
  • the same contract-selection rules
  • the same option DTE window
  • the same quote-quality gates
  • the same risk controls
  • the same exit rules

Perfect equality is not always possible. A live session adds data latency, broker behavior, and fresh quote conditions that historical replay does not reproduce. The goal is not to erase those differences but to classify them.

If the formal backtest rejects a setup because the option bar is unstable, and the paper bot also rejects it for a related quote-side reason, the mismatch may be acceptable. If the paper bot enters because it used a different timing rule, that is a more serious problem.

Step 1: Freeze The Candidate

Before running parity, freeze the candidate in a launch contract.

The contract should identify:

  • profile name
  • tickers
  • option DTE window
  • contract status rules
  • risk per trade
  • max trades per day
  • strict parity flag
  • quote freshness settings
  • routing policy
  • kill-switch baseline
  • artifact root

Do not run parity from memory. The launch contract is the source of truth.
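
One way to keep the contract from drifting is to load it into an immutable object and stamp every artifact with its fingerprint. The sketch below is a minimal illustration, not a CuteMarkets schema: the class name, the field names, and the example values are all assumptions, and it covers only a subset of the fields listed above.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LaunchContract:
    """Hypothetical launch contract; field names are illustrative only."""

    profile_name: str
    tickers: tuple[str, ...]
    dte_min: int
    dte_max: int
    risk_per_trade: float
    max_trades_per_day: int
    strict_parity: bool
    max_quote_age_seconds: float
    artifact_root: str

    def fingerprint(self) -> str:
        """Short, stable hash of the contract contents."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Example values are placeholders, not a recommended configuration.
contract = LaunchContract(
    profile_name="example_profile",
    tickers=("SPY", "QQQ"),
    dte_min=0,
    dte_max=7,
    risk_per_trade=100.0,
    max_trades_per_day=2,
    strict_parity=True,
    max_quote_age_seconds=5.0,
    artifact_root="artifacts/example_run",
)
print(contract.profile_name, contract.fingerprint())
```

Because the dataclass is frozen, an accidental `contract.max_trades_per_day = 10` raises an error instead of silently changing the candidate, and two runs can be compared by fingerprint before anyone compares trades.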

Step 2: Run A Targeted Test Gate

Run a small targeted test gate before paper validation. The tests should cover:

  • profile lookup
  • CLI argument rendering
  • contract loading
  • parity replay command generation
  • broker adapter dry-run behavior
  • state and funnel serialization
  • quote freshness and rejection logic
  • risk-control evaluation

This is not a full research sweep. It is a confidence check that the operational path still loads the expected code.
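
As an illustration of the size such a test should have, here is a sketch of a single gate test for quote freshness. The function `quote_is_fresh` and its parameters are hypothetical stand-ins for whatever freshness check the bot actually uses, and rejecting future-stamped quotes is an assumed policy.

```python
from datetime import datetime, timedelta, timezone


def quote_is_fresh(quote_time: datetime, now: datetime, max_age_seconds: float) -> bool:
    """Hypothetical freshness gate: accept quotes no older than max_age_seconds."""
    age = (now - quote_time).total_seconds()
    # A negative age means the quote is stamped in the future, which
    # usually signals a clock or feed problem, so it is rejected too.
    return 0 <= age <= max_age_seconds


def test_quote_freshness_gate() -> None:
    now = datetime(2026, 5, 1, 14, 30, tzinfo=timezone.utc)
    assert quote_is_fresh(now - timedelta(seconds=2), now, max_age_seconds=5)
    assert not quote_is_fresh(now - timedelta(seconds=30), now, max_age_seconds=5)
    assert not quote_is_fresh(now + timedelta(seconds=1), now, max_age_seconds=5)


test_quote_freshness_gate()
```

Each item in the list above can get one or two tests of this size. The gate stays fast enough to run before every paper session.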

Step 3: Check Import Origin

Import-origin mistakes are easy to miss in research repos with multiple workspaces, notebooks, or copied scripts.

Before running parity, print or check the module path for the key runtime objects:

  • CLI entrypoint
  • strategy profile registry
  • paper bot runtime
  • parity helper
  • broker adapter

The purpose is simple: make sure the bot is running the checked-out code you think it is running.
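
A minimal sketch of that check using only the standard library follows. The helper names are assumptions, and the example uses the stdlib `json` module as a stand-in because the bot's real module names are not given here. `Path.is_relative_to` requires Python 3.9 or later.

```python
import importlib
import inspect
from pathlib import Path


def module_origin(module_name: str) -> Path:
    """Return the resolved file path a module was actually imported from."""
    module = importlib.import_module(module_name)
    return Path(inspect.getfile(module)).resolve()


def loaded_from(module_name: str, expected_root: Path) -> bool:
    """True if the module's file lives under expected_root."""
    return module_origin(module_name).is_relative_to(expected_root.resolve())


# Stand-in example: in a real repo, pass the bot's own module names and
# the checkout directory you expect them to load from.
origin = module_origin("json")
print("json loaded from:", origin)
print("under its own package dir:", loaded_from("json", origin.parent))
```

Running this for each key runtime object and failing the gate when `loaded_from` returns `False` catches cases where a stale copy or a sibling workspace shadows the checked-out code.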

Step 4: Replay Benchmark Sessions

Pick benchmark sessions before looking at today's live behavior. Good benchmark sessions include:

  • one expected trade day
  • one expected no-trade day
  • one known quote-rejection day
  • one exit-heavy or stop-heavy day
  • one edge case around expiry or missing data

The parity command should run from the launch contract and write outputs into a run-labeled artifact directory.

For each benchmark, record:

  • backtest decision
  • paper-style decision
  • selected contract
  • entry time
  • exit time
  • rejection reason
  • quote-side decision
  • PnL difference if both paths traded
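
The fields above map naturally onto one record per benchmark session and symbol. The sketch below writes those records as JSON lines into a run-labeled directory. The record layout, file name, and helper names are assumptions for illustration, not an existing artifact format.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ParityRecord:
    """Hypothetical per-benchmark parity record."""

    session_date: str
    symbol: str
    backtest_decision: str  # e.g. "trade" or "no_trade"
    paper_decision: str
    selected_contract: Optional[str] = None
    entry_time: Optional[str] = None
    exit_time: Optional[str] = None
    rejection_reason: Optional[str] = None
    quote_side_decision: Optional[str] = None
    pnl_difference: Optional[float] = None  # only set when both paths traded

    @property
    def decisions_match(self) -> bool:
        return self.backtest_decision == self.paper_decision


def write_records(records: list[ParityRecord], artifact_root: Path, run_label: str) -> Path:
    """Write one JSON line per record under artifact_root/run_label."""
    run_dir = artifact_root / run_label
    run_dir.mkdir(parents=True, exist_ok=True)
    out_path = run_dir / "parity_records.jsonl"
    with out_path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(asdict(record), sort_keys=True) + "\n")
    return out_path


# Placeholder benchmark rows; dates and reasons are invented for the example.
records = [
    ParityRecord("2026-04-01", "SPY", "trade", "trade", pnl_difference=0.0),
    ParityRecord("2026-04-02", "SPY", "trade", "no_trade",
                 rejection_reason="quote_too_wide"),
]
out_path = write_records(records, Path(tempfile.mkdtemp()), "example_run")
print(out_path.name, [r.decisions_match for r in records])
```

Keeping both decisions in the same row makes the later mismatch classification a lookup rather than a join across two log formats.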

Step 5: Classify Mismatches

Every mismatch needs a label.

Mismatch                 Typical meaning
signal_mismatch          Entry setup or timing logic differs
contract_mismatch        Option selector chose a different instrument
microstructure_reject    Quote, spread, bar range, or liquidity gate blocked the trade
data_missing             Required bars, contracts, or quotes were unavailable
broker_reject            Paper broker did not accept the order
state_mismatch           Existing position or duplicate-entry guard changed behavior
risk_control_block       Kill switch, daily loss cap, or max-trade rule blocked entry

Some mismatches are productively conservative. Others invalidate the paper observation. The runbook should separate those categories instead of treating every difference as a generic failure.
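
A small enumeration keeps the labels consistent across replay output and daily reviews. In the sketch below, the label values come from the table above. The split into invalidating and possibly conservative labels is an assumed policy that each team should set for itself, and the names `Mismatch`, `INVALIDATING`, and `triage` are illustrative.

```python
from enum import Enum


class Mismatch(str, Enum):
    SIGNAL = "signal_mismatch"
    CONTRACT = "contract_mismatch"
    MICROSTRUCTURE_REJECT = "microstructure_reject"
    DATA_MISSING = "data_missing"
    BROKER_REJECT = "broker_reject"
    STATE = "state_mismatch"
    RISK_CONTROL_BLOCK = "risk_control_block"


# Assumed policy: labels that mean the paper bot followed different logic
# from the backtest invalidate the observation. The rest may be acceptable
# conservative rejections and still need a human look.
INVALIDATING = frozenset({Mismatch.SIGNAL, Mismatch.CONTRACT, Mismatch.STATE})


def triage(label: Mismatch) -> str:
    """Map a mismatch label to a coarse review bucket."""
    return "invalidating" if label in INVALIDATING else "conservative_candidate"


for label in Mismatch:
    print(f"{label.value:24s} {triage(label)}")
```

Parsing a label with `Mismatch("signal_mismatch")` raises `ValueError` on an unknown string, which stops ad hoc labels from creeping into the artifacts.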

Step 6: Run A Dry-Run Smoke

A dry-run smoke is a single live-data cycle that does not place a real paper order. It should verify:

  • credentials are present without printing them
  • data provider calls work
  • broker account lookup works or is safely mocked
  • state path is writable
  • trade, funnel, and execution artifacts can be created
  • kill switch baseline loads
  • no duplicate active-trade state exists unexpectedly

Run this during the session if the bot depends on live quotes or live bars. A weekend smoke can prove imports and paths, but it cannot validate market-session behavior.
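
Two of these checks, credential presence and a writable state path, need no market data and can be sketched with the standard library. The environment variable names below are placeholders, not the names any particular broker or data provider uses.

```python
import os
import tempfile
from collections.abc import Mapping
from pathlib import Path


def credentials_present(names: list[str], environ: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Report whether each variable is set and non-empty; never return values."""
    return {name: bool(environ.get(name)) for name in names}


def state_path_writable(state_dir: Path) -> bool:
    """Try to create and remove a temporary file inside state_dir."""
    try:
        state_dir.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=state_dir):
            pass
        return True
    except OSError:
        return False


# Placeholder variable names; a fake environment keeps the example safe to run.
fake_env = {"EXAMPLE_DATA_API_KEY": "not-a-real-key"}
print(credentials_present(["EXAMPLE_DATA_API_KEY", "EXAMPLE_BROKER_KEY"], fake_env))
print(state_path_writable(Path(tempfile.mkdtemp()) / "state"))
```

The credential check prints only booleans, so the smoke log can be shared or archived without leaking a key.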

Step 7: Start Limited Paper

Only after parity and smoke should the bot run a limited paper loop.

Keep the first scope small:

  • one frozen candidate
  • one launch contract
  • one basket or one symbol group
  • one artifact root
  • low max trades per day
  • strict logging
  • daily review required

The first objective is not to increase activity. It is to collect clean drift evidence.

Step 8: Review The Session

The daily review should be short enough to complete every day.

Minimum checklist:

  • opened versus expected trades
  • no-trade symbols and rejection reasons
  • contract mismatches
  • parity mismatches
  • fill failures
  • broker rejects
  • stale orders
  • duplicate-entry prevention
  • kill-switch state
  • daily loss-cap state
  • open positions after the close
  • decision for next session

If the review cannot explain the day, do not scale the bot. Fix observability first.
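
That rule can be encoded so the decision for the next session is derived from the review rather than from memory. The sketch below is one possible shape. The field names, the decision strings, and the choice to halt on a tripped kill switch are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DailyReview:
    """Hypothetical summary of one paper session."""

    expected_trades: int
    opened_trades: int
    unexplained_items: list[str] = field(default_factory=list)
    kill_switch_tripped: bool = False

    @property
    def trade_count_gap(self) -> int:
        return self.opened_trades - self.expected_trades

    def next_session_decision(self) -> str:
        if self.kill_switch_tripped:
            return "halt"
        if self.unexplained_items:
            # The day cannot be explained, so observability is fixed first.
            return "hold_fix_observability"
        return "continue_limited_paper"


review = DailyReview(
    expected_trades=2,
    opened_trades=1,
    unexplained_items=["missing rejection reason for second setup"],
)
print(review.trade_count_gap, review.next_session_decision())
```

A gap between expected and opened trades does not block the next session by itself in this sketch. It blocks only when the reviewer cannot attach an explanation, which is the condition the checklist is meant to surface.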

What To Keep Public

For public research notes, publish the method and the failure classes:

  • backtest realism requirements
  • quote-aware fill assumptions
  • parity ladder
  • promotion discipline
  • paper review checklist

Avoid publishing private server paths, process IDs, secrets, broker account details, and exact operational state from live workspaces.

That distinction lets a team share useful engineering lessons without exposing the execution environment.

How CuteMarkets Fits

CuteMarkets provides the data surface that makes parity measurable:

  • contract lists for instrument identity
  • chain snapshots for current surface checks
  • quotes for spread and freshness
  • trades for print evidence
  • aggregates for historical replay
  • typed Python access through cutemarkets-python
  • public research runtime through cutebacktests

The broker still owns order placement. The paper bot owns state, risk controls, and review. The data API owns market evidence.

Backtest parity is the bridge between those pieces.