Episode 7: Failure Week Was Productive
CuteMarkets Team
Research

Scope
This episode focuses on the cluster of negative results logged around 2026-04-06 to 2026-04-08 in RUNS.md.
It is one of the most important episodes in the series because it saved time. Many ideas were tested, failed, and properly closed instead of being allowed to drift around the backlog forever.
Result Snapshot
| Lane | Outcome | Main blocker |
|---|---|---|
| c23 wave-failure reclaim | no_feasible_profile | too few trades, low trades/week, correlation issue |
| c26 gap reclaim continuation | no_feasible_profile | failed DSR, Sharpe, Sortino, PBO; sparse sample |
| c29 open-drive pullback | 0 trades | no effective sample |
| c30 ORB retest higher-low | no_feasible_profile | weak quality despite some activity |
| c32 gap-failure fade | no_feasible_profile | failed DSR, Sharpe, Sortino, trades/week |
| c37 debit-spread companion | 0 trades on SPY | structure too sparse |
| lfcm_catalyst_momentum | closed | 0 valid catalyst days even after data-path repair |
This is what a useful cemetery looks like.
Strategy Context: What These Models Were Actually Trying To Do
c23 was a failed-break reclaim model. In code terms, it looked for an early downside sweep through the opening structure, required the market to reclaim back above VWAP or the opening-range midpoint, and then entered long on follow-through. The quality version tightened the reclaim window, required relative volume, and demanded a stronger reclaim close. The logic is intuitively attractive because it tries to monetize failed downside auctioning. The repo result says that attractiveness was not enough: the setup remained too sparse and too correlated with other existing sleeves to earn a place.
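As a rough illustration of the sweep-then-reclaim logic, here is a minimal sketch. The `Bar` layout, the function name, and the `or_bars` and `reclaim_deadline` parameters are assumptions made for this example, not the repo's actual code. Relative volume, the stronger reclaim close, and the follow-through entry are left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Bar:
    high: float
    low: float
    close: float
    vwap: float  # session VWAP as of this bar (assumed to be precomputed)


def find_reclaim_bar(
    bars: Sequence[Bar], or_bars: int, reclaim_deadline: int
) -> Optional[int]:
    """Return the index of the first reclaim bar after a downside sweep, else None.

    or_bars: number of bars that form the opening range (hypothetical parameter).
    reclaim_deadline: last bar index at which a reclaim still counts (hypothetical).
    """
    if len(bars) <= or_bars:
        return None
    or_high = max(b.high for b in bars[:or_bars])
    or_low = min(b.low for b in bars[:or_bars])
    or_mid = (or_high + or_low) / 2.0

    swept = False
    for i in range(or_bars, min(len(bars), reclaim_deadline + 1)):
        bar = bars[i]
        if bar.low < or_low:
            swept = True  # downside sweep through the opening structure
        # Reclaim: a close back above VWAP or the opening-range midpoint.
        if swept and (bar.close > bar.vwap or bar.close > or_mid):
            return i
    return None
```

Tightening `reclaim_deadline` in this sketch is one way to see how a quality variant shrinks the sample: every session whose reclaim arrives late simply drops out.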
c26 was a gap-reclaim continuation model built on the event-drive variant. It required a meaningful gap up, then asked whether the session could hold support and continue after the reclaim. The quality version increased the minimum gap size, required stronger relative volume, and demanded a larger breakout fraction versus the opening range. This is a classic event-momentum hypothesis: a large pre-open dislocation plus early acceptance should continue. The repo found that the path did not generalize well enough. Even when the trade existed, quality metrics and overfitting diagnostics remained too weak.
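The three filters named above (gap size, relative volume, breakout fraction) can be sketched as follows. All function names and threshold values are hypothetical, and the definitions of gap and breakout fraction are the conventional ones, assumed here rather than taken from the repo.

```python
def gap_pct(prev_close: float, session_open: float) -> float:
    """Overnight gap as a fraction of the previous close."""
    if prev_close <= 0:
        raise ValueError("prev_close must be positive")
    return (session_open - prev_close) / prev_close


def breakout_fraction(price: float, or_high: float, or_low: float) -> float:
    """Distance above the opening-range high, in units of opening-range width."""
    width = or_high - or_low
    if width <= 0:
        raise ValueError("opening range must have positive width")
    return (price - or_high) / width


def passes_gap_continuation_filter(
    prev_close: float,
    session_open: float,
    price: float,
    or_high: float,
    or_low: float,
    rvol: float,
    *,
    min_gap_pct: float,
    min_rvol: float,
    min_breakout_fraction: float,
) -> bool:
    """True only if the gap, relative-volume, and breakout floors all hold."""
    return (
        gap_pct(prev_close, session_open) >= min_gap_pct
        and rvol >= min_rvol
        and breakout_fraction(price, or_high, or_low) >= min_breakout_fraction
    )
```

Running the same session through a looser and a stricter set of floors shows the trade-off: each raised threshold can only remove sessions from the sample, never add them.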
c29 and c30 were both long continuation families, but with different structural emphasis. c29 required a strong opening drive, then a shallow pullback that stayed above VWAP or the opening-range high, and finally a resumption of the original drive. c30 waited for an opening-range breakout, then required the retest to hold as a higher low above VWAP before taking the continuation break of the retest high. Both ideas are familiar to discretionary traders. The repo result is valuable because it shows how quickly these intuitive narratives become statistically fragile once you insist on explicit drive magnitude, retracement bounds, relative volume, and time-budget rules. c29 became so constrained that it produced zero trades in the tested lane. c30 produced some trades, but not enough quality.
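To make the stacking of constraints concrete, here is a minimal sketch of a c29-style pullback check. The function name and the `min_drive_pct` and `max_retrace` parameters are illustrative assumptions; the relative-volume and time-budget rules are not modeled.

```python
def is_valid_drive_pullback(
    session_open: float,
    drive_high: float,
    pullback_low: float,
    vwap: float,
    or_high: float,
    *,
    min_drive_pct: float,
    max_retrace: float,
) -> bool:
    """Check drive magnitude, pullback depth, and support in one pass."""
    drive = drive_high - session_open
    if session_open <= 0 or drive <= 0:
        return False
    # Explicit drive magnitude: the opening drive must be large enough.
    drive_ok = drive / session_open >= min_drive_pct
    # Retracement bound: the pullback may give back only part of the drive.
    shallow = (drive_high - pullback_low) / drive <= max_retrace
    # Support: the pullback low stays above VWAP or the opening-range high.
    held_support = pullback_low > vwap or pullback_low > or_high
    return drive_ok and shallow and held_support
```

Because the conditions are joined with `and`, the surviving sample is the intersection of all of them, which is one plausible way a lane ends up with zero trades.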
c32 was the mirror-image failure-fade idea. It looked for a gap-up session that failed to reclaim VWAP and then shorted the continuation of that failed bounce. The quality version made the gap threshold larger and shortened the deadline. This is a plausible opening-reversal archetype: strong overnight enthusiasm that cannot be maintained after the open. In the repo, however, the pattern did not survive the feasibility bar. It was able to tell a compelling market story more easily than it could produce robust out-of-sample evidence.
c37 was not a new underlying signal at all. It took the long-only VWAP mean-reversion logic from the c18 family and tried to express it through 2-5 DTE vertical debit spreads with quote-aware spread execution, rather than through the 0-2 DTE single-leg expression used by c36. That meant it inherited not only the mean-reversion assumptions of c18, but also additional structural requirements about short-leg bids, debit-to-width ratio, and spread quality. The important negative result here is structural: changing the monetization layer alone can be enough to extinguish a strategy's usable sample.
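The extra structural requirements can be illustrated with a small quote-aware check. This is a sketch under stated assumptions: the function names and the `max_debit_to_width` cap are hypothetical, and the debit is priced conservatively by paying the long leg's ask and receiving the short leg's bid.

```python
from typing import Optional


def spread_debit_to_width(
    long_strike: float, short_strike: float, long_ask: float, short_bid: float
) -> Optional[float]:
    """Conservative debit divided by strike width, or None if the quotes are unusable."""
    width = abs(short_strike - long_strike)
    if width <= 0 or long_ask <= 0 or short_bid <= 0:
        return None  # no width, or a leg with no usable quote
    debit = long_ask - short_bid
    if debit <= 0:
        return None  # crossed or stale quotes
    return debit / width


def passes_spread_filter(
    long_strike: float,
    short_strike: float,
    long_ask: float,
    short_bid: float,
    max_debit_to_width: float = 0.5,  # placeholder cap, not a repo value
) -> bool:
    ratio = spread_debit_to_width(long_strike, short_strike, long_ask, short_bid)
    return ratio is not None and ratio <= max_debit_to_width
```

A signal that fires on the underlying still produces no trade here whenever the short leg has no bid or the debit is too large relative to the width, which is how the monetization layer alone can thin out the sample.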
The LFCM catalyst lane failed for a different reason. It was never primarily an intraday price-pattern strategy. It depended on the existence of historically valid catalyst headlines plus premarket activity. By April 8, the repo had already repaired the premarket data path and allowed Alpaca as a secondary provider. The lane still produced zero valid catalyst days. That makes it one of the cleanest closures in the repo because the data excuse was removed before the idea was killed.
Why These Failures Matter
Each of these failures answers a different question.
c29 and c37 tell us there are ideas that do not even clear the sample-creation threshold. That is an early and clean rejection.
c23, c26, c30, and c32 tell us there are ideas that can create some trades but still fail the combination of robustness, return quality, and frequency needed for promotion.
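A feasibility gate that names its blockers, rather than returning a bare pass or fail, might look like the sketch below. The `LaneMetrics` and `Thresholds` types and every threshold value are placeholders chosen for illustration; the repo's actual bar is not stated in this episode, and the correlation check is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LaneMetrics:
    trades: int
    weeks: float
    sharpe: float
    sortino: float
    dsr: float  # deflated Sharpe ratio
    pbo: float  # probability of backtest overfitting


@dataclass
class Thresholds:
    # Placeholder values for illustration only.
    min_trades: int = 30
    min_trades_per_week: float = 1.0
    min_sharpe: float = 1.0
    min_sortino: float = 1.0
    min_dsr: float = 0.95
    max_pbo: float = 0.5


def blockers(m: LaneMetrics, t: Optional[Thresholds] = None) -> List[str]:
    """Return the named reasons a lane fails; an empty list means feasible."""
    t = t or Thresholds()
    if m.trades == 0:
        return ["no effective sample"]  # fails before quality can be measured
    found = []
    if m.trades < t.min_trades:
        found.append("too few trades")
    if m.weeks <= 0 or m.trades / m.weeks < t.min_trades_per_week:
        found.append("low trades/week")
    if m.sharpe < t.min_sharpe:
        found.append("failed Sharpe")
    if m.sortino < t.min_sortino:
        found.append("failed Sortino")
    if m.dsr < t.min_dsr:
        found.append("failed DSR")
    if m.pbo > t.max_pbo:
        found.append("failed PBO")
    return found
```

The early return for zero trades mirrors the split described above: some lanes fail at sample creation, while others trade and then fail on quality, robustness, or frequency.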
The LFCM lane tells us something even stronger. After the repo fixed the audit path and added Alpaca as the allowed secondary provider, the lane still had:
- 22529 ticker-days with premarket bars
- 0 valid catalyst headline days
- 0 candidate ticker-days
That is not a data excuse anymore. That is a strategy-universe result.
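The funnel behind those counts can be sketched as a set intersection. This assumes, for illustration only, that both inputs are keyed by (ticker, date) pairs and that a candidate needs both premarket bars and a valid catalyst headline; the function name and dictionary keys are hypothetical.

```python
from typing import Dict, Set, Tuple

TickerDay = Tuple[str, str]  # (ticker, ISO date); an assumed key format


def catalyst_funnel(
    premarket_days: Set[TickerDay], catalyst_days: Set[TickerDay]
) -> Dict[str, int]:
    """Count each funnel stage; candidates require both inputs to be present."""
    candidates = premarket_days & catalyst_days
    return {
        "ticker_days_with_premarket_bars": len(premarket_days),
        "valid_catalyst_headline_days": len(catalyst_days),
        "candidate_ticker_days": len(candidates),
    }
```

With an empty catalyst set, the candidate count is zero no matter how many premarket ticker-days exist, which is why fixing the premarket data path could not rescue the lane.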
What Worked
What worked was the decision process itself.
The repo did not do the usual thing where failed branches are left in a vague "interesting, revisit later" state. It named the blockers. In most cases those blockers were exactly the ones that matter for a live portfolio:
- sample too sparse
- quality metrics too weak
- overlap or correlation too high
- opportunity not strong enough after realistic filtering
This is one of the strongest credibility signals in the entire project. A public series that only reports survivors looks like marketing. A public series that reports why a lane was killed looks like research.
What Did Not Work
The obvious answer is "those models did not work." But there is a more general negative result here.
What did not work was the temptation to rescue every interesting intuition with one more parameter pass.
The repo could have easily spent another week on:
- looser thresholds for c29
- different spreads for c37
- more permissive catalyst heuristics for LFCM
Instead, the evidence said stop. That is especially important for the wave-style branches because discretionary intuition can keep those ideas alive far longer than the statistics warrant. A reclaim, a higher-low retest, or a failed gap often looks excellent on a chart after the fact. The repo's value here is that it translated those chart narratives into explicit entry windows, retracement bounds, RVOL floors, and regime filters, then showed that the resulting objects still did not clear the bar.
Why This Week Matters
This is the episode that teaches the audience what a serious kill decision looks like.
In mild One Piece language, not every island is hiding treasure. Some are just empty. The project got better because it stopped camping on empty islands.
Public Build Takeaway
Episode 7 should be one of the most shared posts in the series, because it spares other researchers from rediscovering the same dead ends without knowing why they failed.
The lesson is:
- publish the graveyard
- explain the blocker, not just the death certificate
- treat negative results as reusable information
That is how a public research journey becomes useful to other people rather than merely entertaining.