VWAP Z-Score Strategy: How We Evaluated c36 and Why It Still Was Not Promoted

Daniel Ratke
Research & Engineering

Term map
Backtesting vocabulary for this article
Treat signal timestamp, point-in-time universe, quote-aware fill, reject reason, replay artifact, walk-forward test, and cache key as first-class terms. They separate reproducible research from a backtest that only preserves the final performance table.
Definitions for Point-in-time contracts, Quote-aware fills, Reject reasons, Replay artifact, Cache key, Signal timestamp, Look-ahead leakage, Walk-forward test, Slippage model, Same-bar fill, Promotion gate, and Options data API appear in the Terminology section at the end of this article.
Repository reference: cutebacktests
Abstract
A VWAP z-score strategy can be profitable and still fail the admission test that matters. That is one of the clearest lessons from the c36 branch in this repository. The high-quality version of the strategy produced +16004 PnL on 15 trades with DSR 0.6400, yet the repo still refused to promote it because the exact failed gate was trades_per_week_ok.
This shows how a portfolio-minded research process differs from a marketing-minded one. A marketing process would stop at the profit figure and the positive DSR. A portfolio process asks whether the strategy is active enough, clean enough, and additive enough to justify a slot. In c36's case, the answer was still no.
This c36 evaluation belongs with VWAP Mean Reversion Signal Quality and Density, Backtesting Engine Loop, and Backtesting Robustness. Keep VWAP deviation, z-score threshold, setup density, next-bar entry, stop/target logic, Sharpe, Sortino, DSR, PBO, and drawdown explicit.
Question
The useful question is not whether c36 made money. It did. The useful question is why the repo still kept it below c66 in the promotion ladder.
Many traders use "profitable" and "deployable" as if they meant the same thing. They do not. A profitable branch can still be too sparse, too unstable, or too narrow to serve the role a live portfolio needs it to serve.
Method: How the VWAP Z-Score Strategy Was Evaluated
As described in Episode 8, c36 is an option-native descendant of the c18 VWAP mean-reversion family. It uses VWAP residual z-scores, bounded VWAP slope, sigma constraints, relative-volume requirements, and short holding windows. The high-quality version requires stronger excursions and cleaner conditions. The opportunity version relaxes those conditions to gain trade count.
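To make the signal description concrete, here is a minimal sketch of a causal VWAP residual z-score and a quality-style filter. It is not the repo's c36 implementation. The typical-price VWAP, the rolling window, the `quality_setup` helper, and every threshold value are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Bar:
    high: float
    low: float
    close: float
    volume: float


def session_vwap(bars):
    """Cumulative session VWAP using the typical price (H + L + C) / 3."""
    out, cum_pv, cum_v = [], 0.0, 0.0
    for b in bars:
        typical = (b.high + b.low + b.close) / 3.0
        cum_pv += typical * b.volume
        cum_v += b.volume
        out.append(cum_pv / cum_v if cum_v > 0 else typical)
    return out


def vwap_zscores(bars, window=20):
    """Z-score of (close - VWAP) over a trailing window ending at each bar.

    Only bars up to and including the current one are used, so the value
    is available at the signal timestamp. Returns None while the window is
    still filling or when the residuals have zero dispersion.
    """
    vwaps = session_vwap(bars)
    residuals = [b.close - v for b, v in zip(bars, vwaps)]
    zs = []
    for i in range(len(residuals)):
        if i + 1 < window:
            zs.append(None)
            continue
        w = residuals[i + 1 - window : i + 1]
        sd = pstdev(w)
        zs.append((residuals[i] - mean(w)) / sd if sd > 0 else None)
    return zs


def quality_setup(z, vwap_slope, rel_volume,
                  z_min=2.5, slope_max=0.02, rel_vol_min=1.5):
    """Hypothetical quality filter; all three thresholds are placeholders."""
    if z is None:
        return False
    return (abs(z) >= z_min
            and abs(vwap_slope) <= slope_max
            and rel_volume >= rel_vol_min)
```

In this sketch, loosening `z_min` or `rel_vol_min` is how an opportunity-style variant would gain trade count. Whether the resulting trades keep the same quality is exactly what the evaluation has to check.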
The evaluation does not stop at raw PnL. The branch is judged in portfolio context. Is the quality branch active enough? Does the opportunity branch preserve the same quality as density increases? Does the strategy deserve promotion, open-paper status, or a lower rung?
Evidence / Results
The c36 results are now well defined in the repo:
c36_vwap_mr_option_native_quality_v1: +16004 PnL, 15 trades, DSR 0.6400; failed only trades_per_week_ok
c36_vwap_mr_option_native_opportunity_v1: 85 trades, +2987 PnL; denser version did not preserve the same quality profile
The portfolio map in Toward The One Piece Of Sharpe and PAPER_BOTS.md then places c36 below c66. It remains a backup_candidate or open_paper_only, not the lead paper bot.
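A promotion check of this kind can be sketched as a set of named boolean gates evaluated together, so the report can name the exact gate that failed. Only the gate name `trades_per_week_ok` and the quality-branch figures come from this article. The other gate names, the evaluation window length, and both thresholds below are hypothetical placeholders, not the repo's actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchResult:
    name: str
    pnl: float
    trades: int
    dsr: float
    weeks: float  # evaluation window length; NOT stated in the article


def evaluate_gates(result, min_trades_per_week, min_dsr):
    """Return every gate's outcome so a failure is named, not summarized."""
    trades_per_week = result.trades / result.weeks
    return {
        "pnl_positive": result.pnl > 0,                           # hypothetical gate name
        "dsr_ok": result.dsr >= min_dsr,                          # hypothetical gate name
        "trades_per_week_ok": trades_per_week >= min_trades_per_week,
    }


def failed_gates(gates):
    return [name for name, ok in gates.items() if not ok]


quality = BranchResult(
    name="c36_vwap_mr_option_native_quality_v1",
    pnl=16004.0,
    trades=15,
    dsr=0.64,
    weeks=52.0,  # assumed for illustration only
)
gates = evaluate_gates(quality, min_trades_per_week=1.0, min_dsr=0.5)  # assumed thresholds
print(failed_gates(gates))
```

Under these assumed inputs the only failing gate is `trades_per_week_ok`, which mirrors the shape of the reported outcome. The point of the structure is that profitability and activity are checked independently, so a strong PnL cannot mask a density failure.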
What Worked
What worked was the signal itself. c36 is not a fake branch. It is one of the repo's few strategies that produced a clean positive quality result under a relatively strict research process. That is why it remains in the current crew at all.
The branch also worked as a diagnostic case. It showed that the repo was willing to separate "interesting enough to keep studying" from "strong enough to promote." That distinction is one of the healthiest things about the recent research process.
What Failed
What failed was the role fit. The strategy was too sparse in its best form, and the denser version was not good enough to justify replacing the selective one. That is a portfolio failure, not a conceptual failure.
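One crude way to see the density tradeoff is PnL per trade, computed from the two reported results. This is a back-of-the-envelope lens, not the repo's own quality metric, and the PnL unit is whatever the repo reports. The `pnl_per_trade` helper is introduced here only for illustration.

```python
def pnl_per_trade(pnl, trades):
    """Average PnL per trade; raises if there are no trades to average over."""
    if trades <= 0:
        raise ValueError("trades must be positive")
    return pnl / trades


quality = pnl_per_trade(16004, 15)      # quality branch figures from the article
opportunity = pnl_per_trade(2987, 85)   # opportunity branch figures from the article

print(round(quality, 2))      # 1066.93
print(round(opportunity, 2))  # 35.14
```

The opportunity branch traded more than five times as often but earned roughly one thirtieth as much per trade, so the extra density came at the cost of most of the per-trade edge.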
The c36 result is valuable precisely because it is not dramatic. The branch did not die in a blow-up. It stopped below the promotion line because one important constraint stayed unresolved. This is how many real strategies remain stuck. They are good enough to keep alive and not good enough to scale.
Takeaway
The c36 VWAP z-score strategy made money, but it still was not promoted because the portfolio bar is higher than simple profitability. The exact failed gate, trades_per_week_ok, tells the whole story. The edge was real. The role fit was not.
If you want the wider signal-level view, VWAP Mean Reversion Backtest: The Logic, the Edge, and the Failure Modes covers the branch in more detail. If you want the density tradeoff, Intraday Mean Reversion Options: Why Signal Quality Drops When You Chase Density is the natural companion.
How the terminology applies
For the c36 evaluation, the backtesting workflow should treat point-in-time contracts, quote-aware fills, reject reasons, the replay artifact, the cache key, and the signal timestamp as operational state rather than glossary decoration. That framing keeps the research claim causal: the strategy can only select instruments, prices, and labels that existed at the decision time.
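The causal constraint on contract selection can be sketched as a filter on listing and expiration dates relative to the signal timestamp. The `Contract` fields and the `visible_contracts` helper are hypothetical names for illustration. A real options data API will expose its own contract schema, and the symbols below are made-up examples.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Contract:
    occ_symbol: str
    listed_on: date    # first date the contract existed (assumed field)
    expiration: date


def visible_contracts(contracts, signal_ts):
    """Keep only contracts that were listed and unexpired on the signal date."""
    d = signal_ts.date()
    return [c for c in contracts if c.listed_on <= d <= c.expiration]


universe = [
    Contract("SPY240119C00470000", date(2023, 11, 1), date(2024, 1, 19)),
    Contract("SPY240216C00470000", date(2024, 1, 10), date(2024, 2, 16)),  # listed later
]
signal_ts = datetime(2023, 12, 15, 10, 30)
selected = visible_contracts(universe, signal_ts)
print([c.occ_symbol for c in selected])
```

Here the second contract is excluded because it had not been listed yet on the signal date. Selecting it would be look-ahead leakage even if its quotes looked attractive in hindsight.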
A developer implementing this validation approach should record, beside the result, how the run handled look-ahead leakage, walk-forward testing, the slippage model, same-bar fills, the promotion gate, and the options data API, instead of leaving those words in a term card. Doing so turns attractive performance into an auditable record where fills, skips, thresholds, and replay inputs can be challenged independently.
The review artifact becomes more useful when OPRA-originating data, the OCC option symbol, bid/ask spread, midpoint, quote/trade condition, and quote vs trade semantics appear in the same body of evidence as the selected rows. When a result is promoted, these fields should appear in the run manifest, not only in a prose summary or final equity curve.
In production notes for this backtesting workflow, REST snapshot, WebSocket stream, entitlement gate, quote freshness, timestamp semantics, and pagination cursor define the checks that decide whether the workflow is reproducible. The result is a backtest that can be rerun, compared across threshold families, and rejected when the evidence is not strong enough.
The practical acceptance test is simple: another developer should be able to read the body, identify the exact inputs, reproduce the request sequence, and explain the accepted and rejected rows without relying on the bottom terminology grid. If a phrase appears in the page vocabulary, it should correspond to a stored field, a validation check, a replay step, or an implementation decision in the backtesting workflow.
This is also the reason the article should not measure success only by the final chart, table, or headline metric. The better standard is whether the data path, timing model, entitlement state, and evidence trail survive review. When those pieces are written directly into the body, the terminology becomes part of the workflow readers can implement.
The z-score is not the whole row
The c36 evaluation should keep VWAP z-score beside the option market that expressed it. The signal row needs underlying OHLCV aggregate fields, VWAP, z-score, signal timestamp, entry cutoff, and market session. The option row needs point-in-time contract discovery, selected OCC option symbol, DTE bucket, moneyness band, bid, ask, spread percent, quote freshness, implied volatility, Greeks, and open interest.
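The two rows described above can be sketched as a pair of records joined on the signal timestamp. The field list follows the paragraph, but the class names, field names, and types are assumptions. The repo's actual schema is not shown in this article.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class SignalRow:
    symbol: str
    signal_ts: datetime
    entry_cutoff: datetime
    session: str            # e.g. "regular"
    open: float
    high: float
    low: float
    close: float
    volume: float
    vwap: float
    zscore: float


@dataclass(frozen=True)
class OptionRow:
    signal_ts: datetime     # join key back to the SignalRow
    occ_symbol: str
    dte_bucket: str
    moneyness_band: str
    bid: float
    ask: float
    quote_ts: datetime
    implied_vol: float
    open_interest: int
    greeks: dict = field(default_factory=dict)

    @property
    def spread_pct(self):
        """Spread as a fraction of the midpoint."""
        mid = (self.bid + self.ask) / 2.0
        return (self.ask - self.bid) / mid

    @property
    def quote_age_seconds(self):
        """Quote freshness measured against the signal timestamp."""
        return (self.signal_ts - self.quote_ts).total_seconds()
```

Keeping `spread_pct` and `quote_age_seconds` as derived values means the stored row holds only observed data, and a reviewer can recompute the execution checks from it.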
That split explains why a high-quality branch can still miss promotion. A clean z-score can appear on a contract with thin top-of-book size, a stale NBBO, or a quote condition that blocks the fill model. A contract can also pass execution checks while the setup density remains too low for portfolio use. Those are separate gates, and the result should show which one failed.
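A sketch of how the execution gate could return a named reject reason instead of silently skipping a contract follows. The reason strings for missing data, no bid, stale quote, and wide spread follow this article's glossary. The `quote_after_signal` reason, the `Quote` type, and both thresholds are assumptions added for illustration, and the sketch does not model top-of-book size or quote condition codes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Quote:
    bid: float
    ask: float
    ts: datetime  # assumed to be in the same timezone as the signal timestamp


def reject_reason(quote: Optional[Quote],
                  signal_ts: datetime,
                  max_age: timedelta = timedelta(seconds=5),   # placeholder
                  max_spread_pct: float = 0.10) -> Optional[str]:  # placeholder
    """Return a reject reason string, or None if the quote is usable."""
    if quote is None:
        return "missing_data"
    if quote.bid <= 0:
        return "no_bid"
    if quote.ts > signal_ts:
        return "quote_after_signal"  # using it would be look-ahead leakage
    if signal_ts - quote.ts > max_age:
        return "stale_quote"
    mid = (quote.bid + quote.ask) / 2.0
    if (quote.ask - quote.bid) / mid > max_spread_pct:
        return "wide_spread"
    return None
```

Because this function says nothing about trade frequency, a contract can return `None` here while the branch still fails a density gate. Logging the two outcomes separately is what lets the result show which one failed.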
For future retests, store the cache key and replay manifest with the same care as the PnL summary. If the run changes because of a different quote window, schema version, or pagination policy, the reviewer should see that before comparing Sharpe, DSR, or trades per week.
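One way to make those differences visible is to derive the cache key from every input that can change the data. The sketch below hashes a canonical JSON payload with SHA-256. The field set follows the glossary definition of a cache key, while the function name, the `params` argument, and the example values are assumptions rather than the repo's actual scheme.

```python
import hashlib
import json


def cache_key(provider, endpoint, ticker, as_of, plan, schema_version, params=None):
    """Deterministic key: identical inputs give the same key, any change gives a new one."""
    payload = {
        "provider": provider,
        "endpoint": endpoint,
        "ticker": ticker,
        "as_of": as_of,                  # ISO-8601 string for the request window
        "plan": plan,
        "schema_version": schema_version,
        "params": params or {},          # e.g. quote window, pagination policy
    }
    # sort_keys makes the serialization independent of dict insertion order
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Example values are placeholders, not real endpoint or plan names.
k1 = cache_key("example_provider", "quotes", "SPY", "2024-01-05T10:30:00",
               "historical", "v1", {"window_s": 5, "page_limit": 1000})
k2 = cache_key("example_provider", "quotes", "SPY", "2024-01-05T10:30:00",
               "historical", "v2", {"window_s": 5, "page_limit": 1000})
print(k1 != k2)  # True: a schema change produces a different key
```

Storing the key in the replay manifest next to the metrics lets a reviewer confirm that two runs used the same inputs before comparing their Sharpe, DSR, or trades per week.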
Terminology
Market-data terms used in this article
These terms keep the article connected to the CuteMarkets knowledge base and to the exact API workflow behind the research.
Point-in-time contracts
Contract discovery anchored to the research date so a backtest does not use future listings.
Quote-aware fills
Entry and exit assumptions based on bid/ask quotes, quote age, spread width, and side-specific fill rules.
Reject reasons
Logged explanations for skipped contracts or fills, including stale quote, wide spread, no bid, or missing data.
Replay artifact
The saved request, selection, fill, reject, and metric record that lets another developer audit the backtest.
Cache key
The structured identifier that keeps provider, endpoint, ticker, timestamp, plan, and schema state from being mixed.
Signal timestamp
The exact time a strategy made a decision, used to reconstruct the visible universe and quote window causally.
Look-ahead leakage
A research error where a fill, contract, indicator, or label uses information unavailable at decision time.
Walk-forward test
A validation method that repeatedly trains and evaluates across separated time windows instead of trusting one optimized sample.
Slippage model
A fill-cost assumption based on bid/ask side, midpoint, spread percent, quote age, and liquidity policy.
Same-bar fill
An intraday backtest assumption that can become invalid when signal, entry, stop, and target ordering is ambiguous.
Promotion gate
The written threshold that decides whether a research candidate can move into paper trading or production monitoring.
Options data API
The product surface for chains, contracts, quotes, trades, aggregates, Greeks, IV, open interest, and expirations.
OPRA-originating data
The U.S. listed-options source context behind quotes, trades, exchange participation, and consolidated option-market records.
OCC option symbol
The exact option contract identifier that preserves root, expiration, call or put side, and strike.
Bid/ask spread
The execution interval between bid and ask that determines whether a contract is realistically tradable.
Midpoint
The computed center between bid and ask, useful as a reference price but not proof that an order would fill.
Quote/trade condition
The condition-code, exchange, correction, sequence, and timestamp context that explains how a quote or trade row can be used.
Quote vs trade semantics
The distinction between executable bid/ask markets, printed transactions, and bar-level summaries.
REST snapshot
A reproducible request for current or historical market state, used for initialization, backfills, and audit logs.
WebSocket stream
A persistent live connection that needs subscription topics, reconnect tracking, freshness labels, and REST repair paths.
Entitlement gate
The product, plan, quote, live, delayed, historical, or commercial-use boundary checked before data is shown.
Quote freshness
The age, timestamp, and live or delayed state of a bid/ask record before it is used in a scanner, backtest, or UI.
Timestamp semantics
The exchange, provider, ingestion, session, and application time context attached to a market-data record.
Pagination cursor
The continuation token or next URL that keeps large chains, trades, quotes, and historical windows complete.

Written by
Daniel Ratke
Research & Engineering
Daniel covers the deeper research notes: options backtesting, execution realism, robustness testing, data engineering, and strategy validation.