Case Study · April 17, 2026 · 4 min read

VWAP Mean Reversion Backtest: The Logic, the Edge, and the Failure Modes

CuteMarkets Team

Research

Repository reference: cutebacktests

Abstract

VWAP mean reversion is one of the most common intraday ideas because it has a clean market intuition. Short-horizon dislocations away from a central intraday benchmark may snap back once the move becomes stretched. The difficulty is not the intuition. The difficulty is turning that intuition into a strategy that preserves both quality and enough sample size to matter.

In this repository, the best example is c36, the option-native descendant of the c18 VWAP mean-reversion family. In Episode 8, the quality version, c36_vwap_mr_option_native_quality_v1, produced +16004 PnL on 15 trades with DSR 0.6400, while failing only trades_per_week_ok. The opportunity version reached 85 trades and +2987 PnL, but the quality shape decayed. That is a scientifically useful result because it shows a real edge and a real bottleneck at the same time.

Question

The practical question is not whether VWAP mean reversion makes sense. It is whether the edge survives once the setup is defined tightly enough to be causal and monetized honestly through options.

That is the question the c36 branch answers well. It is not a loose discretionary fade. It is a constrained intraday mean-reversion model with explicit VWAP residual z-scores, bounded VWAP slope, sigma controls, relative-volume filtering, and short holding periods. The research value of the branch is that it makes the selectivity versus density tradeoff visible.
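As a rough illustration of what a causal VWAP residual z-score can look like, here is a minimal sketch. It is not the repository's code: the function names, the 20-bar window, and the choice to standardize the residual against its own trailing mean and standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev

def session_vwap(prices, volumes):
    """Running VWAP: cumulative (price * volume) / cumulative volume."""
    vwap, cum_pv, cum_v = [], 0.0, 0.0
    for p, v in zip(prices, volumes):
        cum_pv += p * v
        cum_v += v
        # Fall back to the bar price if no volume has traded yet.
        vwap.append(cum_pv / cum_v if cum_v > 0 else float(p))
    return vwap

def residual_zscores(prices, vwap, window=20):
    """Z-score of (price - VWAP) against the trailing `window` residuals.

    Only bars strictly before bar i feed the mean and sigma, so the score
    at bar i never uses information from bar i or later (hypothetical
    design choice for this sketch).
    """
    residuals = [p - w for p, w in zip(prices, vwap)]
    scores = []
    for i, r in enumerate(residuals):
        hist = residuals[max(0, i - window):i]
        if len(hist) < 2:
            scores.append(None)  # not enough history yet
            continue
        sigma = pstdev(hist)
        scores.append((r - mean(hist)) / sigma if sigma > 0 else None)
    return scores

prices = [10, 11, 10, 11, 14]
volumes = [1, 1, 1, 1, 1]
z = residual_zscores(prices, session_vwap(prices, volumes), window=3)
print(z)  # first two entries are None; the last bar is strongly stretched
```

Excluding the current bar from the statistics is what keeps the score causal in this sketch: the stretch is measured against what was known before the bar printed.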

Method: How the VWAP Mean Reversion Backtest Was Structured

As described in Episode 8, c36 keeps the same core signal family while varying the degree of selectivity.

The quality version raises the entry threshold, requires stronger relative volume, narrows acceptable sigma and slope conditions, and cuts the time-in-trade budget. In plain language, it asks for cleaner dislocations and exits them quickly. The opportunity version loosens those requirements so more trades can form, even if the average setup is less extreme.
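The difference between the two versions can be pictured as two parameter sets applied to the same entry filter. The sketch below is illustrative only: the field names and every number are hypothetical, chosen to show the direction of each change rather than the values used in c36, and the sigma and slope conditions are modeled as simple upper bounds, which is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MeanReversionConfig:
    entry_z: float         # minimum |z| of the VWAP residual
    min_rel_volume: float  # minimum relative volume
    max_sigma: float       # upper bound on residual sigma (assumed form)
    max_abs_slope: float   # upper bound on |VWAP slope| (assumed form)
    max_hold_minutes: int  # time-in-trade budget

# Hypothetical values: quality is stricter on every axis than opportunity.
QUALITY = MeanReversionConfig(
    entry_z=2.5, min_rel_volume=1.5, max_sigma=0.8,
    max_abs_slope=0.02, max_hold_minutes=20,
)
OPPORTUNITY = MeanReversionConfig(
    entry_z=1.8, min_rel_volume=1.1, max_sigma=1.2,
    max_abs_slope=0.05, max_hold_minutes=45,
)

def passes_entry(cfg, z, rel_volume, sigma, vwap_slope):
    """True when a bar satisfies every entry filter in `cfg`."""
    return (
        abs(z) >= cfg.entry_z
        and rel_volume >= cfg.min_rel_volume
        and sigma <= cfg.max_sigma
        and abs(vwap_slope) <= cfg.max_abs_slope
    )

# A moderately stretched bar: admitted by opportunity, rejected by quality.
bar = dict(z=2.0, rel_volume=1.2, sigma=1.0, vwap_slope=0.03)
print(passes_entry(QUALITY, **bar), passes_entry(OPPORTUNITY, **bar))
```

Keeping the filter function fixed and varying only the configuration mirrors the experimental design described above: the signal family stays the same and only selectivity moves.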

The branch is then monetized through quote-aware single-leg option execution in the 0-2DTE window. This is an important detail because it keeps the comparison focused. The underlying mean-reversion family stays conceptually stable, and the main experimental question becomes whether widening the opportunity set preserves enough quality to justify the extra trades.
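One plausible reading of "quote-aware" is that fills are priced off the quoted bid and ask rather than the mid. The sketch below works under that assumption for a long single-leg position. The function names, the use of calendar days for DTE, and the contract multiplier of 100 are assumptions for illustration, not details taken from the repository.

```python
from datetime import date

def within_dte_window(trade_date, expiry, max_dte=2):
    """True when the contract expires 0..max_dte calendar days after trade_date."""
    dte = (expiry - trade_date).days
    return 0 <= dte <= max_dte

def long_round_trip_pnl(entry_ask, exit_bid, contracts=1, multiplier=100):
    """PnL of buying at the entry ask and selling at the exit bid."""
    return (exit_bid - entry_ask) * multiplier * contracts

# Entry quote 1.00 x 1.10, exit quote 1.40 x 1.50 (made-up numbers).
quote_aware = long_round_trip_pnl(entry_ask=1.10, exit_bid=1.40)
mid_to_mid = (1.45 - 1.05) * 100  # same trade priced at the mids
print(round(quote_aware, 2), round(mid_to_mid, 2))
print(within_dte_window(date(2026, 4, 17), date(2026, 4, 17)))  # same-day expiry
```

In this made-up example the mid-to-mid figure overstates the quote-aware PnL by the two half-spreads paid, which is the kind of gap that matters for short holding periods in near-expiry options.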

Evidence / Results

The repository's summary now gives a clean comparison:

  • c36_vwap_mr_option_native_quality_v1: +16004 PnL, 15 trades, DSR 0.6400; failed only trades_per_week_ok
  • c36_vwap_mr_option_native_opportunity_v1: +2987 PnL, 85 trades; the denser branch lost enough quality that it did not replace the higher-quality profile
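The size of the quality decay is easier to see per trade. The short calculation below uses only the PnL and trade counts reported above; the dictionary layout and helper name are just for this example, and PnL is left in whatever unit the repository reports.

```python
results = {
    "c36_vwap_mr_option_native_quality_v1": {"pnl": 16004, "trades": 15},
    "c36_vwap_mr_option_native_opportunity_v1": {"pnl": 2987, "trades": 85},
}

def pnl_per_trade(row):
    """Average PnL per trade for one backtest summary row."""
    return row["pnl"] / row["trades"]

for name, row in results.items():
    print(f"{name}: {pnl_per_trade(row):.2f} per trade")
# c36_vwap_mr_option_native_quality_v1: 1066.93 per trade
# c36_vwap_mr_option_native_opportunity_v1: 35.14 per trade
```

The opportunity version took roughly 5.7 times as many trades, yet its average PnL per trade was about one thirtieth of the quality version's, and its total PnL was lower as well.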

This is one of the most valuable negative-positive pairs in the repo. The quality branch says the edge is not imaginary. The opportunity branch says density cannot be purchased for free. The repo therefore ended with a strategy that was interesting enough to keep and not strong enough to promote.

What Worked

What worked was the signal logic itself. The repo did not find a dead branch here. It found a branch with a real positive profile that survived stricter evaluation than many other ideas in the same period. That is why c36 remains backup_candidate or open_paper_only in the portfolio map rather than being closed.

This also makes c36 a strong public case study. Many strategy writeups only show complete failures or obvious winners. This one shows something much closer to real research: a credible signal with a real operational weakness.

What Failed

What failed was density. The best-quality version simply did not trade often enough to satisfy the repo's portfolio admission bar. The exact failed condition was trades_per_week_ok. That is not a cosmetic failure. It means the branch could make money and still fail the job it needed to do as a component of a diversified portfolio.
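A density gate of this kind can be sketched as a simple rate check. The article does not state the threshold or the length of the backtest, so the 26-week span and the minimum of 1.0 trades per week below are placeholders, and this function is a hypothetical stand-in for the repository's trades_per_week_ok check rather than its actual implementation.

```python
def trades_per_week(n_trades, n_weeks):
    """Average number of trades per week over the backtest span."""
    if n_weeks <= 0:
        raise ValueError("n_weeks must be positive")
    return n_trades / n_weeks

def trades_per_week_ok(n_trades, n_weeks, min_per_week):
    """Hypothetical density gate: pass when the average rate meets the bar."""
    return trades_per_week(n_trades, n_weeks) >= min_per_week

# Placeholder span and threshold, NOT values from the repository.
N_WEEKS = 26
MIN_PER_WEEK = 1.0

print(trades_per_week_ok(15, N_WEEKS, MIN_PER_WEEK))  # sparse profile
print(trades_per_week_ok(85, N_WEEKS, MIN_PER_WEEK))  # dense profile
```

Under these placeholder numbers a 15-trade profile fails the gate regardless of how profitable it is, which is the shape of the failure described above: the check measures frequency, not PnL.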

The opportunity version then showed why loosening the filters was not an easy repair. Trade count rose sharply, but the quality of the branch did not remain strong enough. In other words, the branch could be selective and sparse or denser and weaker, but it did not yet find the middle ground that would justify promotion.

Takeaway

The c36 result is one of the best examples in this repo of how a good backtest can still stop short of deployment. The strategy had real signal and real profits in its quality form. It also had a real density problem. That combination is precisely why VWAP mean reversion remains a live research question here rather than a closed one.

If you want the options-expression angle of this tradeoff, "Intraday Mean Reversion Options: Why Signal Quality Drops When You Chase Density" goes one step further. If you want the c36 decision itself, "VWAP Z-Score Strategy: How We Evaluated c36 and Why It Still Was Not Promoted" focuses on the admission bar.