Research Series · April 8, 2026 · 3 min read

Episode 7: Failure Week Was Productive

CuteMarkets Team

Scope

This episode focuses on the cluster of negative results logged in RUNS.md from roughly 2026-04-06 to 2026-04-08.

It is one of the most important episodes in the series because it saved time. Many ideas were tested, failed, and properly closed instead of being allowed to drift around the backlog forever.

Result Snapshot

| Lane | Outcome | Main blocker |
| --- | --- | --- |
| c23 wave-failure reclaim | no_feasible_profile | too few trades, low trades/week, correlation issue |
| c26 gap reclaim continuation | no_feasible_profile | failed DSR, Sharpe, Sortino, PBO; sparse sample |
| c29 open-drive pullback | 0 trades | no effective sample |
| c30 ORB retest higher-low | no_feasible_profile | weak quality despite some activity |
| c32 gap-failure fade | no_feasible_profile | failed DSR, Sharpe, Sortino, trades/week |
| c37 debit-spread companion | 0 trades on SPY | structure too sparse |
| lfcm_catalyst_momentum | closed | 0 valid catalyst days even after data-path repair |

This is what a useful cemetery looks like.

Strategy Context: What These Models Were Actually Trying To Do

c23 was a failed-break reclaim model. In code terms, it looked for an early downside sweep through the opening structure, required the market to reclaim back above VWAP or the opening-range midpoint, and then entered long on follow-through. The quality version tightened the reclaim window, required relative volume, and demanded a stronger reclaim close. The logic is intuitively attractive because it tries to monetize failed downside auctioning. The repo result says that attractiveness was not enough: the setup remained too sparse and too correlated with other existing sleeves to earn a place.
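To make the c23 sequence concrete, here is a minimal sketch of a sweep, reclaim, and follow-through check. It is not the repo's code: the `Bar` type, the `reclaim_entry_index` function, and the `max_reclaim_bars` window are hypothetical names and values, and the sketch only considers the first reclaim after the sweep.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bar:
    low: float
    close: float
    vwap: float  # session VWAP as of this bar


def reclaim_entry_index(
    bars: List[Bar],
    or_low: float,
    or_mid: float,
    max_reclaim_bars: int = 6,  # hypothetical reclaim window, in bars
) -> Optional[int]:
    """Return the index of the follow-through bar, or None if no setup.

    Assumed rules (illustrative only):
      1. sweep: a bar trades below the opening-range low
      2. reclaim: within the window, a bar closes above VWAP or the OR midpoint
      3. follow-through: the next bar closes above the reclaim close
    """
    sweep_idx = next((i for i, b in enumerate(bars) if b.low < or_low), None)
    if sweep_idx is None:
        return None

    window_end = min(len(bars), sweep_idx + 1 + max_reclaim_bars)
    for i in range(sweep_idx + 1, window_end):
        bar = bars[i]
        if bar.close > bar.vwap or bar.close > or_mid:
            # Only the first reclaim is considered in this sketch.
            if i + 1 < len(bars) and bars[i + 1].close > bar.close:
                return i + 1
            return None
    return None
```

Even in this toy form, three conditions must line up inside a short window, which hints at why the real setup stayed sparse.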

c26 was a gap-reclaim continuation model built on the event-drive variant. It required a meaningful gap up, then asked whether the session could hold support and continue after the reclaim. The quality version increased the minimum gap size, required stronger relative volume, and demanded a larger breakout fraction versus the opening range. This is a classic event-momentum hypothesis: a large pre-open dislocation plus early acceptance should continue. The repo found that the path did not generalize well enough. Even when the trade existed, quality metrics and overfitting diagnostics remained too weak.
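The relationship between the base and quality versions can be sketched as one signal function with two parameter sets. Everything below is an assumption made for illustration: the `GapParams` fields, the numeric thresholds, and the reading of "hold support" as "the gap is not filled" are not taken from the repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapParams:
    min_gap: float            # minimum gap up, as a fraction of prior close
    min_rvol: float           # minimum relative volume
    min_breakout_frac: float  # move above OR high, as a fraction of OR range


# Hypothetical values: "quality" is simply a stricter copy of "base".
BASE = GapParams(min_gap=0.01, min_rvol=1.5, min_breakout_frac=0.25)
QUALITY = GapParams(min_gap=0.02, min_rvol=2.0, min_breakout_frac=0.50)


def gap_continuation_signal(
    prev_close: float,
    open_price: float,
    or_high: float,
    or_low: float,
    session_low: float,
    last_price: float,
    rvol: float,
    params: GapParams,
) -> bool:
    """True when a gap-up session holds support and breaks out far enough."""
    or_range = or_high - or_low
    if prev_close <= 0 or or_range <= 0:
        return False
    gap = (open_price - prev_close) / prev_close
    support_held = session_low > prev_close  # assumption: gap never filled
    breakout_frac = (last_price - or_high) / or_range
    return (
        gap >= params.min_gap
        and rvol >= params.min_rvol
        and support_held
        and breakout_frac >= params.min_breakout_frac
    )
```

Because every quality threshold is at least as strict as its base counterpart, the quality version can only ever trade a subset of the base sessions. Tightening improves selectivity at the direct cost of sample size.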

c29 and c30 were both long continuation families, but with different structural emphasis. c29 required a strong opening drive, then a shallow pullback that stayed above VWAP or the opening-range high, and finally a resumption of the original drive. c30 waited for an opening-range breakout, then required the retest to hold as a higher low above VWAP before taking the continuation break of the retest high. Both ideas are familiar to discretionary traders. The repo result is valuable because it shows how quickly these intuitive narratives become statistically fragile once you insist on explicit drive magnitude, retracement bounds, relative volume, and time-budget rules. c29 became so constrained that it produced zero trades in the tested lane. c30 produced some trades, but not enough quality.
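One way to see how a lane like c29 ends at zero trades is to count survivors after each explicit rule. The sketch below uses a hypothetical `filter_funnel` helper, with field names and thresholds invented for the example; it illustrates the mechanism and is not a reconstruction of the repo's filters.

```python
from typing import Callable, Dict, List, Tuple

Session = Dict[str, float]
Filter = Tuple[str, Callable[[Session], bool]]


def filter_funnel(sessions: List[Session], filters: List[Filter]) -> List[Tuple[str, int]]:
    """Apply filters in order and report how many sessions survive each stage."""
    survivors = list(sessions)
    report = []
    for name, keep in filters:
        survivors = [s for s in survivors if keep(s)]
        report.append((name, len(survivors)))
    return report


# Hypothetical c29-style rules; thresholds are placeholders, not repo values.
C29_LIKE_FILTERS: List[Filter] = [
    ("drive_magnitude", lambda s: s["drive_pct"] >= 0.005),
    ("retracement_bound", lambda s: s["retrace_frac"] <= 0.382),
    ("held_above_vwap", lambda s: s["pullback_low_minus_vwap"] >= 0.0),
    ("relative_volume", lambda s: s["rvol"] >= 1.5),
]
```

Each rule looks reasonable on its own. The funnel report makes the cumulative cost visible, so a zero at the last stage can be traced back to the stage where the sample actually collapsed.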

c32 was the mirror-image failure-fade idea. It looked for a gap-up session whose bounce failed to reclaim VWAP, then shorted the continuation lower. The quality version made the gap threshold larger and shortened the deadline. This is a plausible opening-reversal archetype: strong overnight enthusiasm that cannot be maintained after the open. In the repo, however, the pattern did not survive the feasibility bar. It told a compelling market story more easily than it produced robust out-of-sample evidence.
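A rough sketch of the fade logic, under stated assumptions: the function name `fade_entry_index`, the bar-count deadline, and the definition of a "failed bounce" (an up-close below VWAP followed by a new session low) are all hypothetical simplifications, not the repo's rules.

```python
from typing import Optional, Sequence


def fade_entry_index(
    closes: Sequence[float],
    vwaps: Sequence[float],
    gap: float,
    min_gap: float = 0.01,    # hypothetical minimum gap-up fraction
    deadline_bars: int = 12,  # hypothetical: entry must occur within this many bars
) -> Optional[int]:
    """Return the bar index of a short entry, or None if no setup."""
    if gap < min_gap or not closes:
        return None

    running_low = closes[0]
    bounced = False
    for i in range(min(len(closes), len(vwaps), deadline_bars)):
        if closes[i] > vwaps[i]:
            return None  # VWAP reclaimed: the fade thesis is void
        if i == 0:
            continue
        if closes[i] > closes[i - 1]:
            bounced = True  # bounce attempt that stayed below VWAP
        elif bounced and closes[i] < running_low:
            return i  # bounce failed: new session low after the bounce
        running_low = min(running_low, closes[i])
    return None
```

Raising `min_gap` or lowering `deadline_bars`, as the quality version is described as doing, can only remove entries from this function's output, never add them.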

c37 was not a new underlying signal at all. It took the long-only VWAP mean-reversion logic from the c18 family and tried to express it through 2-5 DTE vertical debit spreads with quote-aware spread execution, rather than through the 0-2 DTE single-leg expression used by c36. That meant it inherited not only the mean-reversion assumptions of c18, but also additional structural requirements about short-leg bids, debit-to-width ratio, and spread quality. The important negative result here is structural: changing the monetization layer alone can be enough to extinguish a strategy's usable sample.
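The structural requirements can be pictured as a gate that sits between the signal and the order. This is a minimal sketch assuming conservative fills (buy the long leg at the ask, sell the short leg at the bid); the `Quote` type, the function name, and all three thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    bid: float
    ask: float


def debit_spread_ok(
    long_leg: Quote,
    short_leg: Quote,
    long_strike: float,
    short_strike: float,
    min_short_bid: float = 0.05,        # hypothetical floor on the short-leg bid
    max_debit_to_width: float = 0.60,   # hypothetical cap on debit / strike width
    max_rel_quote_width: float = 0.25,  # hypothetical cap on (ask - bid) / mid per leg
) -> bool:
    """True when a vertical debit spread passes basic quote-quality checks."""
    width = abs(short_strike - long_strike)
    if width <= 0:
        return False
    if short_leg.bid < min_short_bid:
        return False  # nothing meaningful to collect on the short leg

    debit = long_leg.ask - short_leg.bid  # conservative fill assumption
    if debit <= 0 or debit / width > max_debit_to_width:
        return False

    for quote in (long_leg, short_leg):
        mid = (quote.bid + quote.ask) / 2
        if mid <= 0 or (quote.ask - quote.bid) / mid > max_rel_quote_width:
            return False  # quoted market too wide to trust
    return True
```

A signal that fires on a given day still produces no trade unless a spread passes every check at that moment, which is how the monetization layer alone can shrink the sample.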

The LFCM catalyst lane failed for a different reason. It was never primarily an intraday price-pattern strategy. It depended on the existence of historically valid catalyst headlines plus premarket activity. By April 8, the repo had already repaired the premarket data path and allowed Alpaca as a secondary provider. The lane still produced zero valid catalyst days. That makes it one of the cleanest closures in the repo because the data excuse was removed before the idea was killed.

Why These Failures Matter

Each of these failures answers a different question.

c29 and c37 tell us there are ideas that do not even clear the sample-creation threshold. That is an early and clean rejection.

c23, c26, c30, and c32 tell us there are ideas that can create some trades but still fail the combination of robustness, return quality, and frequency needed for promotion.
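The two rejection modes can be sketched as a single gate with an early exit. The outcome strings mirror the labels in the result snapshot, but the function name, the metric keys, and every threshold are assumptions made for the example rather than the repo's actual feasibility bar.

```python
from typing import Dict, List, Tuple


def evaluate_lane(
    n_trades: int,
    metrics: Dict[str, float],
    floors: Dict[str, float],
    pbo_cap: float,
) -> Tuple[str, List[str]]:
    """Return (outcome, blockers) for one lane.

    floors maps a metric name to the minimum acceptable value.
    pbo_cap is the maximum acceptable probability of backtest overfitting.
    """
    if n_trades == 0:
        # Early, clean rejection: there is nothing to measure.
        return "0 trades", ["no effective sample"]

    # A missing metric counts as a failure rather than a pass.
    blockers = [
        name
        for name, floor in floors.items()
        if metrics.get(name, float("-inf")) < floor
    ]
    if metrics.get("pbo", 1.0) > pbo_cap:
        blockers.append("pbo")

    return ("no_feasible_profile" if blockers else "feasible"), blockers
```

Returning the blocker list alongside the outcome is the design choice that matters here: a lane is closed with named reasons instead of a bare pass or fail.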

The LFCM lane tells us something even stronger. After the repo fixed the audit path and added Alpaca as the allowed secondary provider, the lane still had:

  • 22,529 ticker-days with premarket bars
  • 0 valid catalyst headline days
  • 0 candidate ticker-days

That is not a data excuse anymore. That is a strategy-universe result.
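The shape of that audit can be sketched as a set intersection. The `catalyst_funnel` helper and its key names are hypothetical, and any data passed to it in an example is synthetic; only the logic is the point.

```python
from typing import Dict, Set, Tuple

TickerDay = Tuple[str, str]  # (ticker, ISO date)


def catalyst_funnel(
    premarket_days: Set[TickerDay],
    catalyst_days: Set[TickerDay],
) -> Dict[str, int]:
    """Count each stage of a catalyst-lane audit.

    A candidate needs both premarket bars and a valid catalyst headline
    on the same ticker-day, so candidates are the intersection of the sets.
    """
    candidates = premarket_days & catalyst_days
    return {
        "premarket_ticker_days": len(premarket_days),
        "valid_catalyst_days": len(catalyst_days),
        "candidate_ticker_days": len(candidates),
    }
```

With an empty catalyst set the candidate count is zero no matter how large the premarket set grows. Repairing the premarket path could raise the first number only, which is why the closure no longer depends on data coverage.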

What Worked

What worked was the decision process itself.

The repo did not do the usual thing where failed branches are left in a vague "interesting, revisit later" state. It named the blockers. In most cases those blockers were exactly the ones that matter for a live portfolio:

  • sample too sparse
  • quality metrics too weak
  • overlap or correlation too high
  • opportunity not strong enough after realistic filtering

This is one of the strongest credibility signals in the entire project. A public series that only reports survivors looks like marketing. A public series that reports why a lane was killed looks like research.

What Did Not Work

The obvious answer is "those models did not work." But there is a more general negative result here.

What did not work was the temptation to rescue every interesting intuition with one more parameter pass.

The repo could have easily spent another week on:

  • looser thresholds for c29
  • different spreads for c37
  • more permissive catalyst heuristics for LFCM

Instead, the evidence said stop. That is especially important for the wave-style branches because discretionary intuition can keep those ideas alive far longer than the statistics warrant. A reclaim, a higher-low retest, or a failed gap often looks excellent on a chart after the fact. The repo's value here is that it translated those chart narratives into explicit entry windows, retracement bounds, RVOL floors, and regime filters, then showed that the resulting objects still did not clear the bar.

Why This Week Matters

This is the episode that teaches the audience what a serious kill decision looks like.

In mild One Piece language, not every island is hiding treasure. Some are just empty. The project got better because it stopped camping on empty islands.

Public Build Takeaway

Episode 7 should be one of the most shared posts in the series, because it saves other researchers from checking the same dead ends without context.

The lesson is:

  • publish the graveyard
  • explain the blocker, not just the death certificate
  • treat negative results as reusable information

That is how a public research journey becomes useful to other people rather than merely entertaining.