bsig

PL640

★ VVIX Top-Decile + VIX9D < VIX → Short SPY (Convexity Bid Precedes Vol Spike)

index equity (short) · ~10-day hold · 116 events

0.88

Sharpe

35.8%

CAGR

-59.2%

MaxDD ⚠

3.85

t-stat

0.90

OOS Sharpe

Mechanism

VVIX (the vol-of-VIX) pushing into its top decile signals heavy convexity buying — desks bidding up VIX options to hedge tails — while VIX9D < VIX means the 9-day front end is still calmer than 30-day vol, i.e. the spot tape remains complacent. That divergence — aggressive tail-hedging underneath a calm front end — has historically led volatility spikes and SPY drawdowns over the following ~2 weeks, as the hedging flow front-runs the realized move.

Rule

Short SPY when VVIX sits in the top decile (at or above the 90th percentile) of its trailing 252-day distribution while VIX9D / VIX ≤ 0.95 (calm front end). Hold ~10 trading days; exit early on a SPY profit move or stop.
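
A minimal sketch of the entry test described in the rule, assuming an inclusive percentile rank over the trailing window. The function names and the tie-handling convention are illustrative assumptions, not taken from the backtest file.

```python
def percentile_rank(window, value):
    """Fraction of observations in `window` that are <= value (0..1)."""
    return sum(1 for x in window if x <= value) / len(window)


def pl640_short_signal(vvix, vix9d, vix, lookback=252, pct=0.90, ratio_cap=0.95):
    """Return True when the latest bar satisfies both legs of the rule.

    vvix:       daily VVIX closes, oldest first
    vix9d, vix: latest VIX9D and VIX closes
    """
    if len(vvix) < lookback:
        return False  # not enough history to rank against
    window = vvix[-lookback:]
    in_top_decile = percentile_rank(window, vvix[-1]) >= pct
    calm_front_end = (vix9d / vix) <= ratio_cap
    return in_top_decile and calm_front_end
```

The exit logic (profit move or stop) is not specified numerically in the rule, so it is left out here.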

Caveats

Drawdown caveats: 116 events / 4,881 days, but the −59.2% MaxDD is real. The strategy shorts SPY into complacent tapes and bleeds during sustained bull runs, with returns concentrated in 2008/2020-type clusters. Short SPY carries financing/borrow cost. t-stat 3.85, OOS Sharpe 0.90. Passes the full winner gate (BH-significant at FDR=0.05, positive OOS Sharpe), but size for the drawdown and regime concentration.
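
For readers unfamiliar with the "BH-significant at FDR=0.05" part of the winner gate, the standard Benjamini-Hochberg step-up procedure can be sketched as follows. This is the textbook procedure, not the catalog's own harness, which is not shown here; the function name is an illustrative assumption.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

The point of the correction is that a signal must clear a threshold that tightens as more candidate signals are tested, so an isolated t-stat above 1.96 is not enough on its own.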

Source: yfinance ^VVIX / ^VIX / ^VIX9D / SPY; project queue spec

backtests/PL640_vvix_high_vix9d_low_short_spy.py

PL613

★ Lean-Hog Herd Contraction → Long Lean Hogs (HE=F)

commodity futures · ~5-month hold · 18 events

0.82

Sharpe

12.7%

CAGR

-30.8%

MaxDD

3.97

t-stat

0.89

OOS Sharpe

Mechanism

Cobweb hog cycle: the lean-hog front-month price is a regime proxy for USDA's Quarterly Hogs & Pigs breeding-herd data (PDF-only, not in FRED). A sustained low-margin price regime is the leading indicator that drives herd liquidation — producers cull the breeding herd when margins stay negative, which tightens supply after the ~10-month biological lag and squeezes prices higher. Buying the depressed regime front-runs that lagged supply response.

Rule

Long HE=F front-month when its trailing 252-day percentile rank ≤ 0.15 (bottom 15% of the window) AND it has closed below its trailing 252-day SMA for 60+ consecutive trading days. Hold until the earlier of 100 trading days (~5 months) or +18% above entry.
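
The two entry conditions can be sketched as below, assuming the SMA for each day is computed over the 252 closes ending on that day and that the percentile rank is inclusive. The function name and those conventions are assumptions for illustration.

```python
def pl613_long_signal(closes, lookback=252, rank_cap=0.15, min_streak=60):
    """closes: daily front-month closes, oldest first."""
    # Need enough bars to form an SMA for each day of the streak check.
    if len(closes) < lookback + min_streak - 1:
        return False
    window = closes[-lookback:]
    rank = sum(1 for x in window if x <= closes[-1]) / lookback
    if rank > rank_cap:
        return False
    # Count consecutive closes below the trailing SMA, walking back from today.
    streak = 0
    for end in range(len(closes), lookback - 1, -1):
        sma = sum(closes[end - lookback:end]) / lookback
        if closes[end - 1] < sma:
            streak += 1
            if streak >= min_streak:
                return True
        else:
            break
    return False
```

Requiring both a low rank and a long streak below the SMA is what separates a sustained depressed regime from a one-off price dip.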

Caveats

Sparse, regime-driven: only 18 events across 5,928 days, with a −30.8% MaxDD. HE=F front-month roll/contango is a real holding cost not fully captured, and the price proxy stands in for the actual NASS herd data. t-stat 3.97, OOS Sharpe 0.89. Passes the full winner gate (BH-significant at FDR=0.05, positive OOS Sharpe) but the small event count means meaningful regime dependence — size accordingly.

Source: yfinance HE=F; price-regime proxy for USDA NASS Quarterly Hogs and Pigs

backtests/PL613_hog_herd_contraction_long_he.py

PL742

★★ FBI NICS Background-Check Acceleration → Long Gun Makers (RGR + SWBI)

single-name equity · 30-day hold · 15 events

2.72

Sharpe

188.7%

CAGR

-18.2%

MaxDD

3.63

t-stat

3.13

OOS Sharpe

Mechanism

FBI NICS monthly background checks are a high-frequency leading indicator of firearm unit sales. When year-over-year check growth accelerates sharply (fear events, policy/legislative uncertainty, seasonal surges), the demand flows directly into the two liquid US gun manufacturers — Ruger (RGR) and Smith & Wesson (SWBI) — ahead of the earnings beats and analyst upgrades that follow. The acceleration (second derivative), not the level, front-runs the revenue surprise.

Rule

When FBI NICS monthly background checks show YoY growth accelerating ≥10pp month-over-month, go long equal-weight RGR + SWBI for 30 trading days from the approximate release date (~5 weeks after month-end).
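
A sketch of the acceleration test, assuming "accelerating ≥10pp month-over-month" means the latest YoY growth rate exceeds the prior month's YoY growth rate by at least 10 percentage points. The function name is hypothetical.

```python
def nics_acceleration_signal(checks, threshold_pp=10.0):
    """checks: monthly NICS totals, oldest first; needs at least 14 months."""
    if len(checks) < 14:
        return False

    def yoy(i):
        # Year-over-year growth in percent for month index i.
        return (checks[i] / checks[i - 12] - 1.0) * 100.0

    latest = yoy(len(checks) - 1)
    prior = yoy(len(checks) - 2)
    return (latest - prior) >= threshold_pp
```

For example, a move from +5% YoY to +20% YoY is a 15pp acceleration and would trigger; a move from +5% to +10% would not.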

Caveats

Regime-concentration caveats: 15 events / 450 active days. The 2020–2021 COVID demand surge dominates the in-sample window and inflates the 188.7% CAGR — expect materially lower forward returns. NICS checks include denied checks and permit re-checks (not a clean unit-sales proxy), and release dates are approximated. Two-name basket carries idiosyncratic single-name risk. Passes the full winner gate (BH-significant at FDR=0.05, OOS Sharpe 3.13), but size for the regime dependence.

Source: pipeline; FBI NICS monthly firearm background-check statistics (public) + yfinance RGR/SWBI

backtests/PL742_nics_accel_long_rgr_swbi.py

Z4

★ Wells Notice → Litigation Release Gap (Long Stressed Equity)

single-name equity · ~6-month hold · 10 events

0.66

Sharpe

23.0%

CAGR

-65%

MaxDD ⚠

1.25

t-stat

10

events ⚠

Mechanism

An SEC Wells Notice is a pre-litigation notification that issuers typically self-disclose: the issuer reveals in its 10-Q risk factors that SEC staff intends to recommend enforcement. This is the moment of maximum uncertainty, and the market typically prices in a worst-case outcome. Subsequent negotiation usually narrows the charges (settlements + reduced penalties), so a "relief rally" builds as partial information accumulates between Wells disclosure and final litigation release (~6-month median gap).

Rule

Long the issuer between Wells Notice self-disclosure (in 10-Q) and SEC litigation release. Hold from Wells disclosure date to litigation release date (median ~6 months).

Caveats

Heavy caveats: N=10 events. Curated panel skewed to small-caps (NKLA, BLNK type names); 8 of 18 attempted tickers delisted from yfinance — survivorship bias likely understates the volatility (the survivors did well; the bankrupted ones drop out). -65% MaxDD is real. If the Wells results in a referred criminal parallel case, the relief rally never materializes. Position-size accordingly — this is not a "set and forget" strategy.

Source: own research; SEC EDGAR Wells Notice 10-Q disclosures + Litigation Releases

backtests/Z4_wells_to_litigation_gap.py

Validation

Low

Confidence

Moderate (N=10)

Sample Size

Medium

Overfit Risk

Medium

Regime Risk

High

Cost Impact

Validation Summary

N=10 surviving events from a curated panel where 8 of 18 tickers were delisted from yfinance -- classic survivorship bias that flatters returns. The t-stat of 1.25 does not clear significance. Max drawdown of -65% on small-cap stressed names makes this extremely difficult to hold live, and borrow/spread costs on names like NKLA and BLNK would materially erode the 23% CAGR.

What Breaks This

A Wells Notice escalates to a parallel criminal referral, eliminating the "relief rally" entirely and producing total impairment.

Z6

★ Antitrust Complaint Filing → Long Defendant (Reversion)

single-name equity · 90-day hold · 17 events

0.70

Sharpe

16.4%

CAGR

-30%

MaxDD

1.47

t-stat

+5.7%

excess vs SPY

Mechanism

Government antitrust win rates in federal court have been below 50% over the last decade (DOJ + FTC combined). The market reacts negatively to complaint filings and often overweights the loss probability; the reaction is excessive partly because "probe" coverage has usually leaked the case before filing, so the filing itself carries little new information. The actual complaint filing thus marks a local information low: the defendant files motions to dismiss within ~30 days, and 50%+ of these motions succeed.

Rule

Long the defendant single-name on the close of any day a federal antitrust complaint is filed by DOJ Antitrust Division or FTC; exit after 90 trading days.

Caveats

17 events curated from 2015-2024 (Microsoft-Activision, Visa, Google search, etc.). Sharpe 0.70, +5.7pp excess CAGR vs SPY. Cases where DOJ secures a rare preliminary injunction (e.g., some recent merger-block actions) cause permanent impairment, not reversion. The strategy implicitly bets on continued government weakness in antitrust litigation — a regime that may shift under future administrations.

Source: own research; justice.gov/atr/antitrust-case-filings + ftc.gov/legal-library

backtests/Z6_antitrust_reversion.py

Validation

Medium

Confidence

Moderate (N=17)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

Clean mechanism (government antitrust win rates below 50%) with 17 events across mostly mega-cap defendants (MSFT, GOOGL, META, AAPL) so transaction costs are negligible. t-stat of 1.47 is below the 1.96 threshold, and the strategy implicitly bets on continued weak government enforcement -- a policy regime that could shift. Rule is simple (one parameter: 90-day hold), so overfitting risk is low.

What Breaks This

A future administration achieves a string of successful antitrust breakup orders, shifting the market's prior from "complaints are toothless" to "complaints are existential."

Demoted (Y5, the card that follows): no longer passes the tightened winner gate. Failed: Sharpe missing; CAGR missing; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=0, n_events=16.

Y5

★ De-SPAC Short Post High-Redemption Close

single-name equity short · 40-day hold · 16 events

2.58

t-stat

+14.2%

per-event excess vs SPY

75%

hit rate

16

events

Mechanism

When a SPAC completes its merger with >85% redemption rate, the resulting public entity has a tiny float relative to the warrant overhang + sponsor-promote share unlocks. The early post-close window features pump-driven mispricing as the tiny remaining float gets squeezed by retail; then 90-180 days later, PIPE lockups expire and sponsor-promote shares hit the tape (Gahng-Ritter-Zhang RFS 2023). Klausner-Ohlrogge-Ruan also documented ~50% underperformance over 12 months for high-redemption de-SPACs.

Rule

Short the de-SPAC common stock at the close of day +5 after deal close, when redemption rate exceeded 85% and the stock closed above $10. Cover at day +45 or on a 30% decline.
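
The entry filter and cover logic can be sketched as below. It assumes the "$10" test is applied to the day +5 close and that cover is checked on closing prices only; both are assumptions, as are the function names.

```python
def y5_should_short(redemption_rate, close_day5):
    """Entry filter: redemption above 85% and a day +5 close above $10."""
    return redemption_rate > 0.85 and close_day5 > 10.0


def y5_cover_day(entry_price, closes_after_entry, max_hold=40, target_decline=0.30):
    """Return the number of trading days held when the short is covered.

    closes_after_entry: closes for day +6, +7, ... after the deal close.
    Covers on the first close at or below a 30% decline, else at day +45
    (40 trading days after the day +5 entry).
    """
    target = entry_price * (1.0 - target_decline)
    for held, price in enumerate(closes_after_entry[:max_hold], start=1):
        if price <= target:
            return held
    return min(max_hold, len(closes_after_entry))
```

A live version would also need a borrow check and a squeeze stop, which the rule as stated does not define.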

Why it stands out

Strongest event-driven finding from this round. t=2.58 across 16 events. 75% hit rate. Mean +14.2% short excess per event over ~2 months. Mechanism is mechanical (forced supply unlock) and academically documented.

Caveats

Hard-to-borrow names with low float can short-squeeze 200%+ before the structural decline begins — position sizing must cap individual exposure and pre-check borrow. The strategy is regime-dependent on the SPAC boom of 2020-2023; future SPAC cohorts will differ. Curated event list from 16 deals; broader basket may dilute the per-event edge.

Source: Klausner-Ohlrogge-Ruan 2022; Gahng-Ritter-Zhang RFS 2023

backtests/Y5_de_spac_short.py

Validation

Medium

Confidence

Moderate (N=16)

Sample Size

Low

Overfit Risk

High

Regime Risk

High

Cost Impact

Validation Summary

Strongest event-driven t-stat in the catalog at 2.58 with 75% hit rate and a clear mechanical mechanism (warrant overhang + sponsor-promote unlock). However, this is entirely a 2020-2023 SPAC boom phenomenon -- future SPAC volume is a fraction of peak. Short borrow on low-float de-SPACs can exceed 50% annualized, which would consume most of the +14.2% per-event excess. Academic backing (Gahng-Ritter-Zhang RFS 2023) adds credibility.

What Breaks This

The SPAC structure evolves or volume dries up entirely, eliminating the trade universe; alternatively, a short squeeze on a tiny float wipes out multiple events' gains in one position.

V3

★ Felder Margin Debt / GDP Regime Indicator

equities · quarterly signal · 31-year history

0.77

Sharpe

11.2%

CAGR

-36.6%

MaxDD

0.66

SPY B&H Sharpe

Mechanism

Jesse Felder's variant of margin-debt analysis: instead of raw YoY change, scale FINRA margin debt by nominal GDP. Above ~2.5% of GDP marks leverage-greed extremes. When 12-month rolling change in (margin debt / GDP) rolls over from extremes, the marginal levered buyer has stopped buying — a slow equity beta-cut signal.

Rule

Cut SPY exposure to 50% when 12-month change in (FINRA margin debt / nominal GDP) rolls negative from a level above 2.5% of GDP; restore to 100% only after a fresh 12-month positive change.
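
The rule is a two-state machine (full exposure vs. half exposure). A minimal sketch follows, assuming "from a level above 2.5%" refers to the ratio one period before the change turns negative, and that `lag` is the number of observations in 12 months (12 for monthly data, 4 for quarterly). Those choices and the function name are assumptions.

```python
def v3_exposure(ratio, level_trigger=0.025, lag=12):
    """ratio: margin debt / nominal GDP per period, oldest first.

    Returns the SPY exposure (1.0 or 0.5) for each period.
    """
    exposure = []
    cut = False
    prev_change = None
    for i, r in enumerate(ratio):
        if i < lag:
            exposure.append(1.0)  # no 12-month change available yet
            continue
        change = r - ratio[i - lag]
        if not cut:
            rolled_negative = change < 0 and (prev_change is None or prev_change >= 0)
            if rolled_negative and ratio[i - 1] > level_trigger:
                cut = True
        elif change > 0:
            cut = False  # fresh positive 12-month change restores full exposure
        prev_change = change
        exposure.append(0.5 if cut else 1.0)
    return exposure
```

Because the state only changes on a sign flip, the overlay trades rarely, which is consistent with the near-zero transaction costs noted in the validation summary.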

Why it works

Sharpe 0.77 beats SPY's 0.66 across a 31-year sample with materially shallower drawdown (-36.6% vs ~-55%). GDP normalization is more robust than raw YoY because it adjusts for the absolute scale of the economy — margin debt of $700B meant something very different in 2000 vs 2024.

Caveats

Related variant of F03 (Margin Debt YoY, also in this sidebar at 11.5% CAGR). The two signals will have overlapping trade windows — picking one as primary makes more sense than running both at full size. V3's GDP-normalization is the more defensible variant. Quarterly publication lag delays the signal. Margin debt has structurally migrated to options + synthetic leverage post-2020, possibly weakening the FINRA proxy.

Source: Jesse Felder (The Felder Report)

backtests/V3_felder_margin_debt_gdp.py

Validation

High

Confidence

Strong (N=7900 days, 31yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

31-year sample with t-stat 4.33 is among the most robust in the catalog. The rule is dead simple (one threshold + GDP normalization) applied to quarterly data, making overfitting nearly impossible. Sharpe 0.77 beats SPY's 0.66 with materially lower drawdown (-36.6% vs -55%). The only caveat: excess CAGR is essentially zero -- the value is in risk-adjusted improvement, not absolute outperformance. Quarterly rebalance means near-zero transaction costs.

What Breaks This

Margin debt migrates structurally to options, swaps, and crypto leverage not captured by the FINRA series, making the proxy uninformative.

O5

★ BTC Difficulty Drop → Long Miner Basket

crypto / miner equities · 14-day hold · 48 events

0.80

Sharpe

32.5%

CAGR

2.45

t-stat

48

events

Mechanism

Bitcoin difficulty adjusts every 2,016 blocks (~14 days). A difficulty drop of >3% signals that weaker miners have capitulated and gone offline. The survivors capture a proportionally larger share of hashrate, and more revenue per machine, until equilibrium re-establishes (when difficulty rises again). Their stocks re-rate on the improved economics within the 14-day window before the next adjustment.

Rule

When BTC difficulty falls by more than 3% at an adjustment (change below -3%), go long an equal-weight basket {MARA, RIOT, CLSK} for the next 14 trading days (nominally one difficulty epoch; an epoch is ~14 calendar days).
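
Detecting qualifying adjustments from a series of per-retarget difficulty values can be sketched as below. The input layout (one value per retarget) and the function name are assumptions; a daily difficulty chart would first need to be reduced to its change points.

```python
def difficulty_drop_events(difficulty, threshold=-0.03):
    """difficulty: difficulty value at each retarget, oldest first.

    Returns the indices of retargets where difficulty fell by more than 3%.
    """
    events = []
    for i in range(1, len(difficulty)):
        change = difficulty[i] / difficulty[i - 1] - 1.0
        if change < threshold:
            events.append(i)
    return events
```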

Caveats

Negative difficulty adjustments sometimes coincide with broad crypto bear markets that swamp the per-miner improvement. The backtest pairs this rule with the original "mempool size" 90th-percentile rule (O4); both work but capture overlapping events.

Source: own research; api.blockchain.info/charts/difficulty

backtests/O5_btc_difficulty_drop.py

Validation

Medium

Confidence

Strong (N=48)

Sample Size

Low

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

Strong sample size (48 events) and clear mechanical mechanism (difficulty drops = miner capitulation = survivors capture more revenue). t-stat 2.45 clears significance. However, the miner equity basket (MARA/RIOT/CLSK) carries extreme volatility with -55% MaxDD and actually underperforms BTC buy-and-hold (excess CAGR -12%). The 32.5% CAGR is largely beta to BTC's secular appreciation, not alpha from the signal itself.

What Breaks This

Post-ETF institutional flows decouple miner equity prices from hashrate dynamics, or Bitcoin transitions to proof-of-stake (unlikely but structurally possible via fork).

O4

★ BTC Mempool Backlog → Long Miner Basket

crypto / miner equities · 20-day hold · 113 events

0.73

Sharpe

31.9%

CAGR

2.22

t-stat

113

events

Mechanism

When Bitcoin mempool congestion is high (unconfirmed transaction count above the 90th percentile of its history), transaction fees spike. Fees are a real revenue line for miners, 5-15% of revenue historically, but in mempool-congested periods they can briefly exceed 50%. Mining equities (MARA/RIOT/CLSK) re-rate on the upside surprise within ~3-4 weeks.

Rule

When BTC mempool unconfirmed-transaction count closes above its 90th percentile of trailing 1-year history, long equal-weight {MARA, RIOT, CLSK} for the next 20 trading days.

Caveats

Spam-driven mempool floods (e.g., 2023 ordinals/inscriptions craze) can cause backlog without sustainable fee revenue. The signal is correlated with O5 (difficulty drop) — many of the 113 events coincide. Don't double-count by trading both simultaneously.

Source: own research; mempool.space + blockchain.info APIs

backtests/O4_btc_mempool.py

Validation

Medium

Confidence

Strong (N=113)

Sample Size

Low

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

Largest event sample in the crypto batch (113 events) with t-stat 2.22 above significance. The fee-revenue mechanism is real but partially confounded by BTC price momentum (mempool fills during bull runs). MaxDD of -92.5% is catastrophic -- this is a levered BTC-beta play. Excess CAGR vs BTC B&H is -12.8%, meaning the signal adds no alpha over simply holding BTC. Many events overlap with O5 (difficulty drop), so running both double-counts.

What Breaks This

Layer-2 adoption (Lightning Network, rollups) structurally reduces mempool congestion, eliminating the fee-spike mechanism.

P6

★★ Howell Global Liquidity 13-Week Lead → Long BTC

crypto · 13-week structural · 2014–2026

1.04

Sharpe

43.0%

CAGR

-64.6%

MaxDD

4.29

t-stat

+29pp

excess vs BTC B&H

Mechanism

Michael Howell of CrossBorder Capital argues Bitcoin is "the most liquidity-sensitive asset on the planet." Risk-on flows into the longest-duration risk asset lag global central-bank liquidity changes by ~13 weeks because of intermediation and portfolio-rotation friction. When summed CB balance sheets (Fed + ECB + BOJ in USD) are growing AND accelerating, BTC has a structural tailwind.

Rule

Compute the 13-week change in summed (Fed total assets + ECB + BOJ in USD). Long BTC-USD when both the 13-week change is positive AND its second derivative is positive (accelerating). Flat/short when both turn negative.
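
A sketch of the two-condition test on a weekly liquidity series. It assumes the "second derivative" is the week-over-week change in the 13-week change, and it treats every state other than "both positive" as flat, since the rule does not say what to do when the two conditions disagree. The function name is hypothetical.

```python
def p6_position(liquidity, window=13):
    """liquidity: weekly summed CB balance sheets in USD, oldest first.

    Returns 1 (long BTC), 0 (flat), or None if the history is too short.
    """
    if len(liquidity) < window + 2:
        return None
    change_now = liquidity[-1] - liquidity[-1 - window]
    change_prev = liquidity[-2] - liquidity[-2 - window]
    accel = change_now - change_prev  # second difference: is growth speeding up?
    return 1 if (change_now > 0 and accel > 0) else 0
```

Steady linear growth gives a positive 13-week change but zero acceleration, so this sketch stays flat in that case.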

Why it stands out

Strongest YouTube-creator-sourced signal in the catalog. Sharpe 1.04, t-stat 4.29 across 10+ years of BTC data. +29 percentage points of excess CAGR over BTC buy-and-hold (43% vs 14% for buy-and-hold over the same window). The mechanism is mechanical: BTC is a duration asset, liquidity drives duration premia.

Caveats

PBOC and SNB balance sheet data are unavailable via free FRED, so the proxy uses Fed + ECB + BOJ only. The full CrossBorder Global Liquidity Index includes more components. BTC's -64.6% drawdown still bites; this is a CAGR-improving overlay on BTC, not a downside protector. Howell himself flagged a 2025 breakdown where M2 rose but BTC stalled.

Source: Michael Howell (CrossBorder Capital); Forward Guidance / Capital Wars

backtests/P6_howell_liquidity_btc.py

Validation

High

Confidence

Strong (4265 days, 10yr+)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

Strongest crypto signal: Sharpe 1.04, t-stat 4.29, +29pp excess CAGR over BTC buy-and-hold across 10+ years. The mechanism (central bank liquidity drives BTC as a duration asset) has theoretical backing and a well-known proponent (Howell/CrossBorder Capital). The rule is simple (two conditions: level and acceleration) and uses free FRED data. MaxDD of -64.6% is still brutal, and the proxy omits PBOC/SNB. Howell himself flagged a 2025 breakdown where M2 rose but BTC stalled.

What Breaks This

BTC's correlation with global liquidity decouples as ETF flows create a separate demand dynamic, or central banks coordinate a permanent QT regime.

P12

★ Steno Multi-CB Liquidity Flip → Long SPY + BTC

cross-asset · 8-week hold

0.65

Sharpe

16.2%

CAGR

-48.5%

MaxDD

2.23

t-stat

+2.4pp

excess vs SPY

Mechanism

Andreas Steno Larsen's framework combines growth, inflation, and liquidity. The liquidity component specifically tracks multi-CB nowcasts. When the summed Fed + ECB + BOJ balance sheet (in USD) goes from contracting to expanding on a 4-week basis, it signals a regime change that historically produces outsized 8-week equity + BTC returns.

Rule

Compute 4-week change in summed (Fed total assets + ECB + BOJ in USD). When the 4-week change flips from negative to positive, hold a 50/50 SPY + BTC basket for the next 8 weeks (40 trading days).
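
The flip detection can be sketched as a sign change in the 4-week difference. The function name is an assumption, and a change of exactly zero is treated as neither negative nor positive.

```python
def p12_flip_weeks(liquidity, window=4):
    """liquidity: weekly summed CB balance sheets in USD, oldest first.

    Returns the indices where the `window`-week change flips from
    negative to positive.
    """
    flips = []
    for i in range(window + 1, len(liquidity)):
        prev_change = liquidity[i - 1] - liquidity[i - 1 - window]
        change = liquidity[i] - liquidity[i - window]
        if prev_change < 0 and change > 0:
            flips.append(i)
    return flips
```

Each returned index would start an 8-week (40 trading day) hold of the 50/50 basket.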

Caveats

Same CB-coverage limitation as P6 (no PBOC/SNB). MaxDD of -48.5% comes from BTC half of the basket through 2022. The signal captures inflection points, not sustained regimes — it's a tactical overlay, not core allocation.

Source: Andreas Steno Larsen / Steno Research

backtests/P12_steno_global_liquidity.py

Validation

Medium

Confidence

Strong (86 flips, 11yr)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

Decent sample (86 flip events) with t-stat 2.23 above significance. The rule is straightforward (4-week CB balance sheet change flips sign). However, excess CAGR over SPY is only +2.4pp and MaxDD of -48.5% comes from the BTC half of the basket. The signal captures inflection points well but half the portfolio carries crypto-grade drawdowns. The same CB-coverage limitation as P6 applies (no PBOC/SNB).

What Breaks This

Central banks shift to passive balance-sheet management (no active QE/QT cycles), flattening the signal and producing only noise flips.

R-D11

★ FINRA Short Interest Long/Short Basket

equities · monthly rebalance · ~30 large/mid-caps

0.79

Sharpe

13.7%

CAGR

-17%

MaxDD

11

events ⚠️

Mechanism

Short interest measures the % of float currently sold short. Heavily-shorted names tend to underperform because (a) the shorts are informed and (b) borrow costs cap upside. Lightly-shorted names benefit from absence of supply pressure. FINRA's Reg SHO daily files (free, machine-readable) report short sale volume, from which a daily Short Volume Ratio (SVR) is computed. SVR is a flow measure of short selling, used here as a higher-frequency proxy for short interest rather than short interest itself.

Rule

Rank a fixed universe of ~30 liquid US large/mid-caps by trailing-month average daily Short Volume Ratio. Long the bottom decile (lowest SVR, about 3 names), short the top decile (highest SVR), equal-weight, monthly rebalance.
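
The monthly ranking step can be sketched as below. The tickers in the usage example are placeholders, and the function name is an assumption.

```python
def svr_long_short(avg_svr, fraction=0.10):
    """avg_svr: dict of ticker -> trailing-month average Short Volume Ratio.

    Returns (longs, shorts): the lowest-SVR and highest-SVR slices of the
    universe, each `fraction` of the names (at least one).
    """
    ranked = sorted(avg_svr, key=avg_svr.get)  # ascending by SVR
    n = max(1, int(len(ranked) * fraction))
    return ranked[:n], ranked[-n:]


# Usage with a placeholder 30-name universe.
universe = {f"T{i:02d}": 0.30 + 0.01 * i for i in range(30)}
longs, shorts = svr_long_short(universe)
```

With ~30 names each leg holds only about three stocks, so single-name noise will dominate any one month's result.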

Caveats

FINRA Reg SHO Daily API has a ~1-year rolling window cap — so historical sample is short. Underperforms long-only on the same universe (Sharpe 0.79 vs bench 2.92 in this sample) because we're in a meme-rally regime where bottom-SVR longs lag. Backtested 2024-2026 only; longer-term properties unknown without alternative data source.

Source: own research; FINRA Reg SHO Daily API

backtests/R-D11_short_interest.py

Validation

Low

Confidence

Weak (N=11, 1yr only)

Sample Size

Medium

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

Only 1 year of history (2024-2026) due to FINRA Reg SHO API's rolling window cap. The Sharpe of 0.79 looks decent but the benchmark (same-universe long-only) achieved Sharpe 2.92 -- meaning the L/S strategy massively underperforms simple buy-and-hold of the same 30 names. The short leg bleeds in a meme-rally regime. 11 monthly rebalance events is insufficient for any statistical confidence. The academic literature (Diether/Lee/Werner) supports the concept but this implementation cannot be validated.

What Breaks This

Continued bull market in heavily-shorted names (meme stocks, high-beta tech) causes the short leg to hemorrhage, overwhelming any long-side alpha.

R-N5

★ TSMC Monthly Revenue Surprise → AI-Semi Basket

AI/semi single-names · 1-quarter hold · 19 events

0.63

Sharpe

12.7%

CAGR

-45%

MaxDD

19

events

Mechanism

TSMC publishes monthly NT$ revenue around the 10th of each month, the highest-frequency public read on leading-edge logic demand. Revenue acceleration vs trailing average signals demand strength at TSMC, which propagates to its fabless customers (NVDA, AMD, AVGO) and equipment suppliers (ASML) before quarterly earnings pre-announcements.

Rule

When TSMC monthly YoY revenue growth exceeds the trailing-3-month average by 8 percentage points, long equal-weight {NVDA, AVGO, AMD, ASML} basket for ~1 quarter (originally 10 days, expanded for sample-size reasons).
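
A sketch of the surprise test, assuming the "trailing-3-month average" is the mean of the three prior months' YoY growth rates and excludes the current month. That convention and the function name are assumptions.

```python
def tsmc_surprise(yoy_growth_pct, threshold_pp=8.0, lookback=3):
    """yoy_growth_pct: monthly YoY revenue growth in percent, oldest first."""
    if len(yoy_growth_pct) < lookback + 1:
        return False
    prior = yoy_growth_pct[-1 - lookback:-1]  # the three months before the latest
    baseline = sum(prior) / lookback
    return yoy_growth_pct[-1] - baseline > threshold_pp
```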

Caveats

19 events across ~5 years from Macrotrends quarterly scrape. -45% MaxDD because basket carries 2022 semi drawdown. Backtest extended hold to 1 quarter from spec'd 10 days for statistical power; shorter hold might still work but wasn't tested.

Source: own research; TSMC monthly revenue via Macrotrends

backtests/R-N5_tsmc_revenue.py

Validation

Medium

Confidence

Moderate (N=19)

Sample Size

Medium

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

Mechanism is sound (TSMC as leading indicator for semi demand) and the basket is liquid (NVDA/AVGO/AMD/ASML). t-stat 2.29 clears significance with 19 events. However, excess CAGR vs the same basket buy-and-hold is deeply negative (-40%) -- the signal adds no timing value over simply owning the semi names. The 8pp threshold and quarterly hold expansion from the original 10-day spec introduce mild overfitting concern. MaxDD -45% is severe.

What Breaks This

TSMC stops publishing monthly revenue (they've discussed this), or the relationship between foundry revenue and fabless customer stock performance breaks as AI capex dominates the mix.

Demoted (F9, the card that follows): no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=60, n_events=10.

F9

★ USTR Section 301 Comment-Window Close → Long US Importers

consumer/retail · 5-day hold · 10 events

1.75

Sharpe

37.2%

CAGR

+52pp

excess vs SPY

58%

hit rate

10

events ⚠️

Mechanism

The USTR runs a public comment window before imposing or modifying tariffs under Section 301. During that window, the market prices in maximum tariff overhang on China-import-heavy US single names (AAPL, NKE, BBY, WSM, etc). Once the comment window closes WITHOUT immediate action, the overhang dissipates temporarily — a "no new news" window opens where the priced-in tariff fear unwinds before the actual rule takes effect 30-90 days later.

Rule

Long equal-weight basket of {AAPL, NKE, BBY, WSM, WHR} for 5 trading days starting the day after a USTR Section 301 public comment window closes, provided the final action announcement is at least 30 days away.

Why it stands out

Highest Sharpe (1.75) among the policy-catalyst batch. The mechanism is concrete (relief rally from tariff overhang dissipating). Hit rate 58% across 10 events is consistent with an asymmetric payoff — wins are bigger than losses.

Caveats

N=10 events. Sample period skewed toward 2018-2024 Trump-era + Biden-era trade actions; if future administrations don't use the Section 301 mechanism, no signal. The "30+ days until final action" filter requires judgment when applied real-time. Best as a tactical overlay, not a continuous strategy.

Source: own research; USTR Federal Register notices

backtests/F9_ustr_s301.py

Validation

Low

Confidence

Moderate (N=10)

Sample Size

Low

Overfit Risk

High

Regime Risk

Low

Cost Impact

Validation Summary

Highest Sharpe in the policy-catalyst batch at 1.75, but the t-stat is only 0.85, well below significance. The Sharpe is inflated by tiny exposure (only 60 days in-market total across 10 events). N=10 events entirely from the 2018-2025 Trump/Biden trade-war era. Hit rate is 58% with asymmetric payoffs, which is consistent with the mechanism, but the sample is too small and regime-specific to trust. The basket (AAPL/NKE/BBY/WSM/WHR) is liquid so costs are negligible.

What Breaks This

Future administrations abandon Section 301 in favor of other trade mechanisms (executive orders, IEEPA), eliminating the specific event calendar this signal depends on.

Demoted (F5, the card that follows): no longer passes the tightened winner gate. Failed: Sharpe 0.45 ≤ 0.50; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

F5

★ FDA PDUFA + Favorable AdComm Pre-Drift → Long Biotech

small/mid biotech · 10-day hold · 24 events

0.45

Sharpe

10.8%

CAGR

0.42

t-stat

24

events

Mechanism

FDA approval-likely PDUFA dates show a 3-4% pre-drift as event-gamma hedgers buy options into the announcement. The trick is filtering: most PDUFAs have binary outcomes, so the AdComm vote (held weeks before the PDUFA action date) acts as a leading indicator. Restricting to PDUFAs where the AdComm voted >70% favorable removes the binary-fail names and leaves a subsample with ~80%+ approval rate.

Rule

For any FDA PDUFA action date where the prior AdComm voted >70% favorable on the drug, long the sponsor's stock for the 10-day window ending T-1 before the PDUFA decision date.

Why it's interesting

24 events is a usable sample. The AdComm filter is the key insight — without it, you'd be exposed to binary-fail risk. With it, you're trading the approval-near-certain subset.

Caveats

t-stat of 0.42 is low: the Sharpe is positive but not statistically distinguishable from zero. Many PDUFA dates slip (CRL responses, manufacturing-readiness delays) and the original event list lost 4 tickers (BLUE/SAGE/MRTX/ITCI) to yfinance delistings, so live deployment needs robust handling of corporate-action events. Borrow / option costs on biotech names can be punitive.

Source: own research; FDA Drug Approvals calendar + AdComm transcripts

backtests/F5_fda_pdufa.py

Validation

Low

Confidence

Strong (N=24)

Sample Size

Medium

Overfit Risk

Low

Regime Risk

High

Cost Impact

Validation Summary

The AdComm filter is clever (restricts to high-approval-probability PDUFAs), but the t-stat of 0.42 is far below significance -- the positive Sharpe of 0.45 is indistinguishable from noise. 24 events is a decent sample yet still fails to produce a reliable signal. Biotech names carry punitive borrow costs and wide bid-ask spreads. Lost 4 tickers to delistings, introducing survivorship concern. The mean per-event return of +0.4% barely covers a round-trip spread.

What Breaks This

FDA accelerates its review timeline (PDUFA VII goals), compressing the pre-decision drift window to zero, or the AdComm process is reformed to reduce the information advantage.

N4

★ Semiconductor Billings 3-Month Acceleration → Long SOXX

semi equity ETF · 60-day hold · 110 events

0.55

Sharpe

12.0%

CAGR

-57%

MaxDD

110

events

Mechanism

Semiconductor industry sales (originally SIA's monthly press release; here proxied by Census M3 Computers & Electronics New Orders, FRED A36SNO) lead semi equipment book-to-bill by ~1 quarter. The second derivative — i.e., the rate-of-change of the 3-month moving average YoY rate — captures regime shifts in the cycle. SOXX reprices on revenue-acceleration regime BEFORE sell-side analysts move their numbers.

Rule

Monthly: compute the 3-month moving average of YoY % change in FRED A36SNO. If MoM change in that 3MMA-YoY rate exceeds +2 percentage points, go long SOXX for the next 60 trading days.
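
The second-derivative test can be sketched on any monthly new-orders level series. The function name is an assumption, and the sketch takes raw levels as input rather than fetching anything from FRED.

```python
def n4_signal(orders, threshold_pp=2.0):
    """orders: monthly new-orders level, oldest first; needs >= 16 months.

    True when the 3-month moving average of YoY % change rose by more
    than `threshold_pp` percentage points versus the prior month.
    """
    if len(orders) < 16:
        return False
    yoy = [(orders[i] / orders[i - 12] - 1.0) * 100.0 for i in range(12, len(orders))]
    ma_now = sum(yoy[-3:]) / 3
    ma_prev = sum(yoy[-4:-1]) / 3
    return ma_now - ma_prev > threshold_pp
```

Because the 3-month average moves by one third of the newest month's change, a single month needs to add roughly 6pp of YoY growth relative to the month it displaces to trigger the signal on its own.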

Why it works

Large event sample (110) gives statistical confidence at Sharpe 0.55. The signal captures cyclical inflection points in semis, where each stage of the cycle leads the next: order growth → revenue → earnings → multiple expansion.

Caveats

The original signal was based on SIA's monthly billings press release (semiconductors.org), whose archive is paywalled. The backtest substitutes FRED A36SNO (Census M3 Computers + Electronics New Orders), which captures most of the same demand signal but includes some adjacent non-semiconductor categories. The -57% drawdown is large; the signal goes long in upcycles but doesn't filter out subsequent crashes (2008, 2022).

Source: own research; FRED A36SNO (Census M3 proxy for SIA)

backtests/N4_sia_billings.py

Validation

High

Confidence

Strong (N=110)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

One of the statistically strongest signals: 110 events over 24 years with t-stat 2.76. The rule is simple (one threshold on a public FRED series) and captures cyclical inflection points in semis with genuine economic logic. SOXX is highly liquid so costs are negligible. MaxDD of -57% reflects staying long through crashes (2008, 2022) -- the signal captures upswings but doesn't exit in time. Excess CAGR vs SOXX B&H is slightly negative (-2.2%), meaning the value is in risk-adjusted timing, not absolute alpha.

What Breaks This

AI-driven structural demand replaces cyclical order patterns, making historical billings acceleration irrelevant as a regime indicator.

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; OOS Sharpe 0.00 ≤ 0.

M3

★ stETH Discount Extreme → Long ETH

crypto · 90-day hold · 2020–2026

0.88

Sharpe

33.1%

CAGR

+15.7%

excess vs ETH B&H

2.46

t-stat

6

events ⚠️

Mechanism

Lido's stETH normally trades near par with ETH, and since Shapella it can be redeemed 1:1. Before withdrawals existed, forced-liquidity events such as the UST/Luna collapse (May 2022) and Celsius's insolvency (June 2022) forced leveraged stakers to unwind via the Curve stETH/ETH pool, pushing stETH to a meaningful discount. Deep discounts mark capitulation lows; once the forced selling clears, the discount closes via arbitrage and ETH spot also rallies as the forced-deleveraging unwind completes.

Rule

Long ETH-USD when the Curve stETH/ETH pool price closes below 0.985 (i.e., stETH trades at >1.5% discount to ETH). Hold 90 days.

Why it shows up here

Highest CAGR among the M-batch hunt; +15.7pp excess vs ETH buy-and-hold over 5+ years. Mechanism is concrete (forced-seller capitulation marks bottoms). But conviction holds only if the next stETH-discount event resembles June 2022.

Caveats

N=6 events — heavily driven by June 2022. Lido withdrawals being live (since April 2023) significantly dampens future discounts because the arbitrage path is now fast and mechanical. The historical alpha may not repeat at the same magnitude. Treat as a "buy capitulation when it happens" tactical overlay, not a continuous signal.

Source: own research; Curve subgraph / DefiLlama stETH price · backtests/M3_steth_discount.py

Validation

Low

Confidence

Weak (N=6)

Sample Size

Low

Overfit Risk

High

Regime Risk

Low

Cost Impact

Validation Summary

Only 6 events, heavily driven by the June 2022 Celsius/3AC collapse. The mechanism (forced-seller capitulation at stETH discount marks ETH bottoms) is intuitive but post-Shapella (April 2023) withdrawals are live, mechanically dampening future discounts via fast arbitrage. The t-stat of 2.46 looks strong but comes from a sample dominated by a single cluster event. This is a "buy capitulation when you see it" heuristic, not a systematic strategy with forward reliability.

What Breaks This

Live stETH withdrawals (post-Shapella) mean the peg never breaks enough to trigger the signal again -- the structural condition that created the opportunity has been engineered away.

A14

★★ Bitcoin Halving Cycle Calendar

crypto · multi-year · 2014-09 → 2026-05

1.20

Sharpe

49.9%

CAGR

-56.4%

MaxDD

4.93

t-stat

+14pp

vs BTC B&H CAGR

Mechanism

Bitcoin block subsidy halves every 210,000 blocks (~4 years). Supply growth drops abruptly; if demand is flat or growing, price clears higher. Historically the bulk of cycle returns concentrate 12–18 months after each halving.

Rule

Long BTC-USD from 6 months before each halving (Jul 2016, May 2020, Apr 2024) through 18 months after; cash otherwise.
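
As a rough sketch of how the in-market windows could be computed: the exact halving days below are assumptions (the rule only names the months), and `shift_months` / `in_market` are hypothetical helper names, not functions from the backtest.

```python
from datetime import date

# Assumed exact halving dates (UTC); the rule itself gives only the months.
HALVINGS = [date(2016, 7, 9), date(2020, 5, 11), date(2024, 4, 20)]

def shift_months(d, months):
    """Shift a date by whole months, clamping the day to 28 so it stays valid."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, min(d.day, 28))

def in_market(day, pre_months=6, post_months=18):
    """True if `day` lies inside any [halving - 6m, halving + 18m] window."""
    return any(
        shift_months(h, -pre_months) <= day <= shift_months(h, post_months)
        for h in HALVINGS
    )
```

With these dates the 2020 window runs from 2019-11-11 to 2021-11-11, so a day in mid-2022 falls in the cash period.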

Why this is the highest-Sharpe signal in the catalog

Captures most of BTC's secular run while sitting out the bear-market sections of the 4-year cycle. Spends ~60% of time in the market and earns ~95% of its returns.

Caveats

N=3 halvings in sample (2016, 2020, 2024), and the 2024 halving's full post-window is not yet realized. Stock-to-flow scarcity narratives have failed before (PlanB). Most importantly: every 4-year cycle says "this time is different," and post-ETF (Jan 2024) the cycle may be dampened.

How to act

Position-sizing overlay, not a standalone strategy. Combine with on-chain confirmation (MVRV / Hash Ribbons in Tiers 1 & 4) for entry validation.

Source: halving narrative / Charles Edwards · backtests/A14_btc_halving.py

Validation

Low

Confidence

Very Weak (N=3 halvings)

Sample Size

Low

Overfit Risk

High

Regime Risk

Low

Cost Impact

Validation Summary

Headline metrics are spectacular (Sharpe 1.20, CAGR 49.9%, t-stat 4.93) but this is fundamentally a 3-observation strategy on the most trending asset of the decade. The t-stat benefits from BTC's secular appreciation, not from the timing rule's independent contribution. +14pp excess CAGR vs BTC B&H is real but derived from only N=3 halving cycles (2016, 2020, and a 2024 cycle whose post-window has not concluded). No amount of in-sample statistics can validate a rule with 3 independent observations.

What Breaks This

The ETF era (Jan 2024+) introduces continuous institutional demand flow that dampens the 4-year scarcity cycle, and the diminishing block-subsidy halving impact becomes economically irrelevant.

H06

★ MVRV Z-Score (Bitcoin)

crypto · 6–24 months

0.97

Sharpe

45.6%

CAGR

-76.6%

MaxDD

0.90

BTC B&H Sharpe

Mechanism

Market Value to Realized Value (MVRV) compares current market cap to the aggregated cost basis of every BTC, valued at the price when it last moved on-chain ("realized cap"). The z-score puts this on a comparable historical scale. Negative readings mean the average holder is at a loss, which has historically marked accumulation zones. Extreme positive readings mean euphoric paper gains, which have historically marked tops.

Rule

Long BTC when MVRV z < 0; flat when z > 5. Backtest uses a proxy: z-score of BTC price / 200-day MA (the precise realized-cap series is paywalled at scale).
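
The proxy described here might look roughly like the sketch below. Two things are assumptions: the z-score is taken against the full history of the price/MA ratio unless `z_window` is given, and the position is simply carried over when z sits between 0 and 5, since the rule does not say what happens there. Function names are illustrative.

```python
from statistics import mean, pstdev

def proxy_z(prices, ma_window=200, z_window=None):
    """z-score of the latest price/MA ratio against the history of that ratio."""
    ratios = [
        prices[i] / mean(prices[i - ma_window + 1:i + 1])
        for i in range(ma_window - 1, len(prices))
    ]
    hist = ratios if z_window is None else ratios[-z_window:]
    sd = pstdev(hist)
    return 0.0 if sd == 0 else (ratios[-1] - mean(hist)) / sd

def position(z, current):
    """Long (1) below z=0, flat (0) above z=5, else keep the current position.

    Holding the prior position in the 0..5 band is an assumption.
    """
    if z < 0:
        return 1
    if z > 5:
        return 0
    return current
```

Because the band between the two thresholds inherits the prior state, the sketch stays long from an accumulation reading until z exceeds 5.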

Why it works

Cleanly identifies cycle bottoms (2015, 2018, 2022) and tops (2013, 2017, 2021) within ~2 weeks. Modestly beats BTC buy-and-hold on Sharpe (0.97 vs 0.90), although the -76.6% MaxDD shows the backtested proxy still sat through a deep cycle drawdown.

Caveats

Proxy version is a price/MA z-score; the "true" MVRV requires realized cap. The ETF era (2024+) may dampen the cycle entirely because ETF custody flows don't refresh realized cap the same way as on-chain transfers. Sample only spans 2 full cycles.

Source: Glassnode / BitcoinMagazinePro · backtests/H06_mvrv_zscore.py

Validation

Medium

Confidence

Strong (3816 days, 10yr)

Sample Size

Low

Overfit Risk

High

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 3.79 is solid across 10 years. The z-score proxy (price/200d MA) is a reasonable approximation of the paywalled realized-cap MVRV. Sharpe 0.97 modestly exceeds BTC B&H's 0.90. However, -76.6% MaxDD means the signal doesn't meaningfully protect from drawdowns. Only ~2 full BTC cycles are observed. The signal adds marginal Sharpe improvement but with catastrophic drawdown, making it a sizing overlay rather than a standalone strategy.

What Breaks This

ETF custody flows don't refresh realized cap the way on-chain transfers do, making the MVRV denominator stale and the z-score unreliable as a valuation anchor.

H11

BTC–Nasdaq / BTC–Gold Correlation Regime

crypto/cross-asset · monthly

0.90

Sharpe

32.4%

CAGR

-72.6%

MaxDD

0.85

BTC B&H Sharpe

Mechanism

Bitcoin's correlation with Nasdaq vs. gold vs. dollar regime-shifts every few quarters. When BTC trades like high-beta tech (corr w/ NDX > +0.4), it's a risk asset — size half-weight to control beta. When it trades like digital gold (corr w/ GLD > +0.3 AND corr w/ DXY < -0.3), it's behaving as a macro hedge — size full-weight.

Rule

Compute 60-day rolling correlations of daily BTC returns vs ^NDX, GLD, DXY. Apply regime-conditional sizing: half-weight in "tech proxy" regime; full-weight in "digital gold" regime; the default 0.75x weight when neither regime applies.
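
A minimal sketch of the sizing logic, using the thresholds from the Mechanism text and the 0.75x default weight mentioned in this card's validation summary. Checking the "digital gold" condition before the "tech proxy" condition when both hold is an assumption, as are the function names.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation of two equal-length return lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def regime_weight(btc, ndx, gld, dxy, window=60, default=0.75):
    """BTC weight from the last `window` daily returns of each series."""
    c_ndx = pearson(btc[-window:], ndx[-window:])
    c_gld = pearson(btc[-window:], gld[-window:])
    c_dxy = pearson(btc[-window:], dxy[-window:])
    if c_gld > 0.3 and c_dxy < -0.3:
        return 1.0      # "digital gold" regime: full weight
    if c_ndx > 0.4:
        return 0.5      # "tech proxy" regime: half weight
    return default      # neither regime applies
```

Since the weight depends only on trailing correlations, it describes the regime that has just passed rather than the next one.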

Caveats

Regime classifications are post-hoc — they describe the current regime, not predict the next. Useful as a risk-budget overlay, less so as a standalone alpha source.

Source: cross-asset regime work (CME, JPM) · backtests/H11_btc_correlation_regime.py

Validation

Low

Confidence

Strong (4206 days, 11yr)

Sample Size

Medium

Overfit Risk

High

Regime Risk

Low

Cost Impact

Validation Summary

Excess CAGR is negative (-5%) vs BTC B&H -- the correlation-regime sizing actually hurts returns. The "digital gold" regime occurs only 6.5% of the time, making it statistically untestable. The default 0.75x weight dominates (75% of days), so the strategy is mostly just a diluted BTC hold. Regime classifications are inherently backward-looking (60-day rolling correlations describe the past, not the future), limiting predictive value.

What Breaks This

BTC's correlation structure shifts rapidly between regimes, and the 60-day lookback is always late to the transition, producing whipsaw sizing changes at the worst moments.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 8.4% ≤ 10%; no OOS Sharpe computed (rerun via shared harness).

C02

Utilities/SPY Ratio (Gayed)

equities · 1-month rebalance · 2000-2026

0.74

Sharpe

8.4%

CAGR

-31%

MaxDD

3.77

t-stat

0.51

SPY Sharpe

Mechanism

When defensive utilities outperform the broad market on a 4-week basis, professional flow is rotating defensively ahead of broader stress. Utilities-led leadership historically precedes higher equity vol and lower returns.

Rule

Each Friday close: compute 4-week return of XLU minus 4-week return of SPY. If XLU > SPY, hold cash next 4 weeks; otherwise long SPY.
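
The comparison in this rule reduces to a one-line test. The sketch below assumes weekly (Friday) closes as plain lists, oldest first; the function name is illustrative.

```python
def gayed_signal(xlu_closes, spy_closes, lookback_weeks=4):
    """Return 'CASH' if XLU beat SPY over the lookback, else 'SPY'.

    Inputs are weekly (Friday) closes, oldest first, with at least
    lookback_weeks + 1 points each.
    """
    xlu_ret = xlu_closes[-1] / xlu_closes[-1 - lookback_weeks] - 1.0
    spy_ret = spy_closes[-1] / spy_closes[-1 - lookback_weeks] - 1.0
    return "CASH" if xlu_ret > spy_ret else "SPY"
```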

Why it works

~Matches SPY on CAGR (8.4% vs 8.3%) with much better Sharpe (0.74 vs 0.51) and ~25% lower drawdown. Defensive overlay that doesn't sacrifice return.

Caveats

Crowded since publication (Gayed 2014); signal is noisier post-2015. Best when used as a filter alongside VIX term structure or credit spreads rather than standalone.

Source: Gayed & Atilgan 2014 SSRN 2517910 · backtests/C02_utilities_spy.py

Validation

High

Confidence

Strong (6632 days, 26yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

26-year sample, t-stat 3.77, Sharpe 0.74 vs SPY's 0.51 with lower drawdown (-31% vs benchmark). Rule is dead simple (one comparison: 4-week XLU vs SPY return). Published and well-known since Gayed 2014, yet the out-of-sample performance has been noisier post-publication. CAGR roughly matches SPY (8.4% vs 8.3%) -- the value is entirely in risk adjustment. Monthly rebalance on liquid ETFs means costs are negligible. One of the most defensible signals in the catalog.

What Breaks This

Utilities sector composition changes (AI power demand makes XLU a growth play rather than a defensive one), breaking the "utilities outperformance = risk-off" signal.

C07

Accelerating Dual Momentum

multi-asset · monthly rebalance

0.73

Sharpe

11.3%

CAGR

-30%

MaxDD

3.15

t-stat

Mechanism

Combines absolute and relative momentum across three assets: US large-cap (SPY), international small-cap (SCZ), and long Treasuries (TLT). Uses an average of 1/3/6-month returns instead of a single 12-month lookback — "accelerating" because the short lookbacks pick up regime changes faster.

Rule

Monthly: rank SPY, SCZ, TLT by average of (1m + 3m + 6m) total return. Hold the single top-ranked asset for next month.
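
The ranking step can be sketched as follows. It covers only the ranking stated in the Rule; any absolute-momentum filter implied by the Mechanism is not modelled, and the dict-of-lists input format is an assumption.

```python
def adm_pick(monthly_closes):
    """Pick the ticker with the highest average of 1-, 3- and 6-month returns.

    monthly_closes maps ticker -> month-end total-return prices,
    oldest first, with at least 7 points per ticker.
    """
    def score(p):
        return sum(p[-1] / p[-1 - k] - 1.0 for k in (1, 3, 6)) / 3.0
    return max(monthly_closes, key=lambda t: score(monthly_closes[t]))
```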

Caveats

Heavy reliance on TLT in a 30-year bond bull market. 2022 was brutal because TLT and SPY both collapsed; short lookbacks made whipsaws worse. Overfit-suspect — only 3 assets, lookback choices feel cherry-picked. Look for an asset-class extension before deploying.

Source: EngineeredPortfolio.com 2018 · backtests/C07_accel_dual_momentum.py

Validation

Medium

Confidence

Strong (4640 days, 18yr)

Sample Size

Medium

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 3.15 over 18 years with Sharpe 0.73. The "accelerating" multi-lookback (1/3/6-month average) picks up regime changes faster than the classic 12-month Antonacci version. However, only 3 assets (SPY/SCZ/TLT) means heavy concentration risk and the lookback choice feels cherry-picked. Excess CAGR over SPY is approximately zero (+0.1%). The 30-year bond bull market turbocharged the TLT bucket -- future performance in a rising-rate environment is questionable.

What Breaks This

A sustained period where all three assets decline simultaneously (as in 2022), leaving the model stuck in the worst performer with no defensive hedge.

F03

Margin Debt Year-over-Year

equities · 3-12 months · monthly signal

0.72

Sharpe

11.46%

CAGR

-45%

MaxDD

Mechanism

FINRA margin debt aggregates all retail+institutional leveraged equity positions. Extreme YoY contractions signal forced deleveraging; extreme expansions signal speculative excess. Coincident-to-leading recession indicator.

Rule

Monthly: compute YoY % change in FINRA margin debt (FRED BOGZ1FL663067003Q). When YoY < -20%, reduce equity exposure for 6 months. When YoY turns back through 0% from below, re-enter.

Why it works

Peaks preceded 2000, 2007, 2021 tops. Troughs near 2003, 2009, 2022 bottoms. Slow but reliable.

Caveats

Coincident more than leading — most of the move happens before the YoY threshold breaks. Useful for strategic tilts, not tactical timing. Quarterly publication lag also delays the signal.

Source: Jesse Felder / FINRA monthly · backtests/F03_margin_debt_yoy.py

Validation

High

Confidence

Strong (7899 days, 31yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

31-year sample with t-stat 4.00 and Sharpe 0.72 vs SPY's 0.66. Dead-simple rule (one YoY threshold on quarterly FRED data). Excess CAGR is near zero (+0.2%) -- this is a risk filter, not an alpha generator. Correctly identified peaks in 2000, 2007, 2021. The signal is coincident more than leading, which limits tactical value. Overlaps substantially with V3 (same data, different normalization). Publication lag on quarterly data delays the signal further.

What Breaks This

Same as V3: leverage migrates to instruments not captured by FINRA margin statistics (options, crypto margin, synthetic leverage).

Demoted: no longer passes the tightened winner gate. Failed: CAGR 5.5% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

C14

Time-Series Momentum (TSMOM)

multi-asset · 1-12 months

0.68

Sharpe

5.5%

CAGR

-13.5%

MaxDD

3.07

t-stat

Mechanism

Each asset's own past 12-month return predicts its next-month return. Long winners, flat losers. Applied across a diversified basket, exposure is dynamically allocated to whichever markets are trending positively.

Rule

Monthly across basket {SPY, TLT, GLD, DBC, EFA, EEM}: if 12-month return > 0, hold 1/N weight; else flat. Aggregate PnL.
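
A minimal sketch of the weighting rule, assuming month-end prices held as lists keyed by ticker (a hypothetical layout): each asset with a positive lookback return gets 1/N, the rest get zero, so total exposure falls when fewer markets are trending.

```python
def tsmom_weights(monthly_closes, lookback=12):
    """1/N weight for each asset whose trailing return is positive, else 0.

    monthly_closes maps ticker -> month-end prices, oldest first,
    with at least lookback + 1 points per ticker.
    """
    n = len(monthly_closes)
    return {
        t: (1.0 / n if p[-1] / p[-1 - lookback] - 1.0 > 0 else 0.0)
        for t, p in monthly_closes.items()
    }
```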

Why it works

Best risk-adjusted return in the cross-asset family with the lowest drawdown (-13.5%, vs -15% for Faber GTAA and -30% or worse for the other cross-asset signals). Moskowitz-Ooi-Pedersen 2012; AQR Century of Evidence (137-year sample).

Caveats

2010s were the weakest decade in 130 years for trend; 2022 revived it. Capacity-aware. Pure ETF basket lacks the breadth (~67 markets) of the original commodity-heavy futures TSMOM.

Source: Moskowitz-Ooi-Pedersen JFE 2012 · backtests/C14_tsmom.py

Validation

High

Confidence

Strong (5106 days, 20yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

Best risk-adjusted result in the catalog: MaxDD of only -13.5% with Sharpe 0.68 and t-stat 3.07. Rule is textbook (12-month return sign, 1/N weighting) with zero free parameters to overfit. Backed by Moskowitz-Ooi-Pedersen (JFE 2012) and AQR's 137-year evidence. CAGR of 5.5% is modest but the drawdown floor is exceptional. ETF basket (SPY/TLT/GLD/DBC/EFA/EEM) is highly liquid. The 2010s were the weakest decade for trend in 130 years, so recent underperformance is within historical expectations.

What Breaks This

A prolonged period of choppy, mean-reverting markets with no sustained trends across any asset class (essentially an extended version of 2011-2019).

Demoted: no longer passes the tightened winner gate. Failed: CAGR 8.4% ≤ 10%; no OOS Sharpe computed (rerun via shared harness).

F09

Golden Cross (50/200 SMA)

equities · months-to-years

0.69

Sharpe

8.36%

CAGR

-33.7%

MaxDD

3.55

t-stat

Mechanism

Simplest possible trend filter. 50-day SMA above 200-day SMA = uptrend regime. Avoids being long during major bear markets (sat out most of 2008, parts of 2022). The classic "obvious" rule that actually works on risk-adjusted basis.

Rule

Long SPY when SMA(50) > SMA(200); flat otherwise.
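
For illustration, the state check could be written as below. Treating a history shorter than 200 closes as "flat" is an assumption; the function names are not taken from the backtest.

```python
def sma(values, n):
    """Simple moving average of the last n values."""
    return sum(values[-n:]) / n

def golden_cross_long(closes, fast=50, slow=200):
    """True (long SPY) when SMA(fast) > SMA(slow); flat if history is too short."""
    if len(closes) < slow:
        return False
    return sma(closes, fast) > sma(closes, slow)
```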

Why it's promising

~Matches SPY on absolute CAGR (8.4% vs 8.3%) with much better Sharpe (0.69 vs 0.51). t=3.55 is robust. The death cross side (going flat) is the value-add — preserves capital in 2008, 2020, 2022.

Caveats

Whipsaws badly in choppy ranges (2015-16, 2018-19 produced multiple false crosses). Lagging by design — you miss the first ~15% off a bottom. Has been universally known for decades; if it has decayed, you wouldn't know yet because the regime shifts are too rare.

Source: Universal TA folklore · backtests/F09_golden_cross.py

Validation

High

Confidence

Strong (6636 days, 26yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

The most boring and most defensible signal: t-stat 3.55 over 26 years with a rule that has exactly zero free parameters (50 and 200 are fixed by convention). Sharpe 0.69 vs SPY's 0.51 with lower drawdown (-34% vs -55%). Universally known for decades, yet continues to work because the value comes from avoiding bear markets, which are rare but catastrophic. Only 27 signal flips in 26 years -- very low turnover. CAGR matches SPY almost exactly (+0.06% excess).

What Breaks This

Extended choppy sideways markets (2015-2016, 2018-2019) produce multiple false crosses that whipsaw the signal, eroding the Sharpe advantage accumulated during clean trends.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 9.9% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

A16

Heston-Sadka Same-Calendar-Month Seasonality

equities · monthly

0.65

Sharpe

9.89%

CAGR

-43%

MaxDD

3.14

t-stat

Mechanism

Calendar-month return seasonalities persist. If SPY has historically averaged a positive return in (say) April over the prior 10 years, that pattern tends to continue. Walk-forward, no peeking.

Rule

Each month, compute the trailing-10-year average daily return for that calendar month. If positive, hold SPY that month; if negative, cash.
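
The walk-forward decision might be sketched like this, assuming daily returns are grouped by (year, month) in a dict (a hypothetical layout). Only years strictly before `asof_year` are used, which is what keeps the test free of look-ahead.

```python
def seasonal_hold(history, month, asof_year, lookback_years=10):
    """True = hold SPY in `month` of `asof_year`.

    history maps (year, month) -> list of daily returns. Only the
    lookback_years before asof_year are used, never asof_year itself.
    """
    rets = [
        r
        for y in range(asof_year - lookback_years, asof_year)
        for r in history.get((y, month), [])
    ]
    return bool(rets) and sum(rets) / len(rets) > 0
```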

Caveats

Just barely matches buy-and-hold; the "edge" is in regime-style filtering of bad months. Keloharju et al. (2016) document this persists across countries and assets. With only ~10 instances of each calendar month per decade, statistical power is limited.

Source: Heston-Sadka JFE 2008; Keloharju et al 2016 · backtests/A16_seasonality_heston_sadka.py

Validation

Medium

Confidence

Strong (5885 days, 23yr)

Sample Size

Medium

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 3.14 with 23 years of data. Walk-forward (no peeking) using a 10-year lookback for each calendar month. However, that lookback contains only ~10 instances of each calendar month, which makes each monthly estimate noisy. Excess CAGR is slightly negative (-1.7%) vs SPY B&H -- the edge is in filtering bad months, not in outperformance. The academic backing (Heston-Sadka JFE 2008, Keloharju et al 2016) is strong. Low-frequency rebalancing (monthly) means minimal costs.

What Breaks This

Calendar-month seasonality is driven by institutional flow patterns (tax-loss selling, year-end window dressing) that shift as market microstructure evolves and passive investing dominates.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 4.4% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

C05

Faber GTAA 10-Month Moving Average

multi-asset · monthly

0.58

Sharpe

4.4%

CAGR

-15%

MaxDD

2.58

t-stat

Rule

Each month-end, for {SPY, EFA, VNQ, GSG, IEF}: if last close > 10-month SMA, hold that bucket; else cash. Equal-weight the active buckets.

Why it's promising

The drawdown floor of -15% is exceptional. Faber's original premise was "preserve capital in bear markets at the cost of muted bull-market returns." This sample (2000-2026) confirms exactly that — saves you 2008, 2020, 2022.

Caveats

Underperforms in low-vol bull markets (most of 2010s). The bond bucket (IEF) carries most of the drawdown protection; without the 30-year bond bull market as a tailwind, future performance is uncertain.

Source: Mebane Faber 2007 SSRN 962461 · backtests/C05_faber_gtaa.py

Validation

High

Confidence

Strong (4991 days, 20yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

The exceptional -15% MaxDD makes this one of the best capital-preservation strategies in the catalog. t-stat 2.58 over 20 years. Rule is trivially simple (price above 10-month MA = hold, else cash) applied to 5 diversified ETFs. CAGR of 4.4% is well below SPY B&H (11.5%) -- you pay a heavy absolute-return cost for drawdown protection. The IEF (bonds) bucket carried disproportionate protective value during the 30-year bond bull market. Published and well-known since 2007.

What Breaks This

In a rising-rate environment, the bond bucket (IEF) no longer provides crisis protection (as in 2022), removing the key diversification leg that made the drawdown floor so low.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 8.7% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

C06

Dual Momentum (Antonacci GEM)

equities/bonds · monthly

0.61

Sharpe

8.7%

CAGR

-34%

MaxDD

2.58

t-stat

Rule

Monthly: pick the higher 12-month return of SPY vs ACWX. If that winner's 12m return < T-bill return, hold AGG (bonds) instead.
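
The two-step decision (relative, then absolute momentum) fits in a few lines. In this sketch the inputs are trailing 12-month returns as decimals, and breaking a SPY/ACWX tie in favour of SPY is an assumption.

```python
def gem_pick(spy_12m, acwx_12m, tbill_12m):
    """Relative momentum picks SPY or ACWX; absolute momentum gates it vs T-bills."""
    if spy_12m >= acwx_12m:
        winner, ret = "SPY", spy_12m
    else:
        winner, ret = "ACWX", acwx_12m
    # Fall back to bonds when the winner fails to beat the T-bill return
    return winner if ret >= tbill_12m else "AGG"
```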

Caveats

2015 and 2018 produced whipsaws into bonds. Sensitive to lookback choice — some practitioners use a blended 3/6/12-month average.

Source: Antonacci 2013 · backtests/C06_dual_momentum.py

Validation

Medium

Confidence

Strong (4566 days, 18yr)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

Well-known published strategy (Antonacci 2013) with t-stat 2.58 over 18 years. Rule is clean (12-month relative + absolute momentum, 3 assets). Sharpe 0.61 is decent but underperforms SPY B&H on CAGR (8.7% vs 11.9%, excess -3.2%). The whipsaws in 2015 and 2018 (going to bonds then back) are characteristic of single-lookback momentum. Sensitive to lookback choice -- practitioners increasingly use blended 3/6/12-month averages (which is essentially C07).

What Breaks This

US equity exceptionalism continues indefinitely, making the international rotation leg (ACWX) a perpetual drag and the bond fallback (AGG) a trap during rate hikes.

C08 / C09

Yield Curve Inversion Equity Timer

equities · multi-year · 1976-2026

0.65

Sharpe

10.4%

CAGR

-55%

MaxDD

3.73

t-stat

Mechanism

10Y-2Y (and 10Y-3M) Treasury yield inversions have preceded every US recession since 1976 with no false positives in the 3m10y specification. The key insight is that the recession (and equity drawdown) typically arrives 6-18 months AFTER the curve uninverts, not at the inversion itself — equities can rally for ~a year after inversion begins.

Rule

When T10Y2Y (or T10Y3M) first crosses below 0 from above, set a 12-month timer. On month 12 post-inversion, reduce equity exposure to 50% (or cash) until the curve re-steepens above +50 bps.
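
One way to read this rule is as a small state machine over a monthly spread series. The sketch below is an interpretation, not the backtest's code: it assumes spreads in percentage points (so +50 bps is 0.50), starts the timer only on a cross from non-negative to negative, and ignores new inversions while exposure is already reduced.

```python
def curve_timer_exposure(spreads, delay=12, resteepen=0.50, reduced=0.5):
    """Monthly equity exposure from a monthly 10Y-2Y (or 10Y-3M) spread series."""
    exposure, inverted_at, derisked = [], None, False
    for i, s in enumerate(spreads):
        crossed = i > 0 and spreads[i - 1] >= 0 > s
        if inverted_at is None and not derisked and crossed:
            inverted_at = i                      # first cross below zero starts the timer
        if inverted_at is not None and i - inverted_at >= delay:
            derisked, inverted_at = True, None   # timer elapsed: cut exposure
        if derisked and s > resteepen:
            derisked = False                     # curve re-steepened: restore exposure
        exposure.append(reduced if derisked else 1.0)
    return exposure
```

With an inversion in month 1, exposure stays at 100% through month 12, drops to 50% in month 13, and is restored in the first month the spread prints above +0.50.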

Caveats

2022-2024 inversion was the deepest on record and didn't produce a recession (yet). Sample of ~8 events since 1976 is small. The "un-inversion" rule means you're flat through the actual top — useful for capital preservation but doesn't catch the highs.

Source: Estrella-Mishkin 1998; Chicago Fed 2018 · backtests/C08_yield_curve_10y2y.py

Demoted: no longer passes the tightened winner gate. Failed: CAGR 8.1% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

C01

Lumber/Gold Ratio (Gayed)

equities vs bonds · 13-week

0.57

Sharpe

8.1%

CAGR

-38%

MaxDD

2.64

t-stat

Mechanism

Lumber demand leads housing/cyclical activity; gold reflects safe-haven flows. Their 13-week relative performance is a risk-on/risk-off switch for cross-asset rotation.

Rule

Weekly: compute 13-week % change in lumber futures and gold. If lumber outperformed gold, long SPY next week; else TLT.

Caveats

Lumber futures contract was redesigned in 2022 (smaller LBR contract), creating a pre/post discontinuity. Out-of-sample performance post-2015 has been substantially weaker than the original paper. Best in clear macro regimes; whipsaws in chop.

Source: Gayed 2015 SSRN 2604248 · backtests/C01_lumber_gold.py

Validation

Medium

Confidence

Strong (5338 days, 21yr)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 2.64 over 21 years. The economic logic (lumber = cyclical demand, gold = safe haven) is sound. However, Sharpe 0.57 underperforms SPY's 0.69, and CAGR of 8.1% vs SPY's 11.8% is a significant sacrifice. Lumber futures contract was redesigned in 2022 (smaller LBR contract) creating a data discontinuity. Out-of-sample performance since original 2015 publication has been weaker. Best in clear macro regimes; whipsaws in choppy markets.

What Breaks This

The lumber market's structure changes (Canadian softwood tariffs, substitution to engineered wood, housing-start volatility) break the lumber-as-cyclical-barometer relationship.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 6.6% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

B01

VIX Term Structure (VIX/VIX3M)

equities · days-to-weeks

0.61

Sharpe

6.6%

CAGR

-35.5%

MaxDD

Mechanism

VIX/VIX3M below 1 = contango (normal). Spikes above 1 = backwardation (stress) — the 1-month vol is pricing more uncertainty than the 3-month, classic risk-off positioning.

Rule

When VIX/VIX3M closes < 0.92, long SPY. Cross above 1.0 → cash. Re-enter when ratio falls back below 0.95.
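
The three levels in this rule form a hysteresis band, which could be sketched as below. This is one reading of the rule, stated as an assumption: the first entry requires the stricter 0.92 level, and after any exit above 1.0 the looser 0.95 level is enough to re-enter.

```python
def vix_ts_positions(ratios, enter=0.92, exit_=1.0, reenter=0.95):
    """Map daily VIX/VIX3M closes to 1 (long SPY) or 0 (cash) per day."""
    positions, long, exited = [], False, False
    for r in ratios:
        if long and r > exit_:
            long, exited = False, True   # backwardation: move to cash
        elif not long:
            # First entry uses 0.92; after an exit, 0.95 is enough
            if r < (reenter if exited else enter):
                long = True
        positions.append(1 if long else 0)
    return positions
```

The gap between the exit level and the re-entry level is what keeps the position from flipping daily when the ratio hovers near 1.0.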

Caveats

Whipsawed in Feb 2018 vol-mageddon and March 2020 (the model couldn't handle term-structure violence). Often combined with VVIX/VIX for confirmation. Underperforms simple SPY B&H here because it sits in cash during recoveries.

Source: VIX and More blog (Bill Luby) · backtests/B01_vix_term_structure.py

Validation

Medium

Confidence

Strong (4646 days, 18yr)

Sample Size

Medium

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 2.60 over 18 years with Sharpe 0.61. The three-threshold system (enter below 0.92, exit above 1.0, re-enter below 0.95) introduces mild parameter-fitting risk. CAGR of 6.6% substantially underperforms SPY B&H (11.2%) because the signal spends long stretches in cash during recoveries. The backwardation regime-detection is well-understood but the signal was catastrophically late during Feb 2018 vol-mageddon and March 2020. Best used as a filter alongside other signals.

What Breaks This

0DTE options and intraday vol dynamics change the VIX term structure's information content, making the 1-month vs 3-month relationship less reliable as a stress indicator.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 8.6% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

C11

MOVE / VIX Ratio

cross-asset · 2-8 weeks

0.56

Sharpe

8.6%

CAGR

-47%

MaxDD

Mechanism

MOVE measures Treasury implied vol; VIX measures equity implied vol. When the ratio spikes (rates volatility dominates), it's a rates-driven regime — duration-sensitive equities suffer. When low, equity-vol dominates.

Rule

When MOVE/VIX > 6 sustained 5+ days, cash for 20 days. When MOVE/VIX < 4, long SPY for 20 days.

Caveats

Regime-dependent. Works when rates drive equity (most of 2022-2024). Breaks when growth scares dominate (2020 Q1, 2008). Default-long stance dominates so the "low-ratio long" branch contributes little.

Source: FinTwit (@Ksidial); SOA Sep 2025; CFA Institute Jul 2025 · backtests/C11_move_vix_ratio.py

Validation

Medium

Confidence

Strong (5782 days, 23yr)

Sample Size

Medium

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 2.67 over 23 years, but Sharpe 0.56 underperforms SPY's 0.68 and CAGR trails by 3pp. The default-long stance means the "high ratio = cash" branch does all the work, while the "low ratio = long" branch is dominated by simply being long. The two threshold parameters (6 and 4) introduce fitting risk. Works well when rates drive equity markets (2022-2024) but fails when growth scares dominate (2020, 2008). Regime-dependent more than robust.

What Breaks This

The MOVE index calculation methodology changes, or bond-equity correlation flips permanently positive, making rates volatility uninformative about equity risk.

L6

★ SEC NT-10K / NT-10Q Late-Filing Short

single-name equities · 30-day hold · 2015-2026

0.55

Sharpe

16.3%

CAGR

+2.45%

excess vs SPY

1.84

t-stat

3,983

events

Mechanism

SEC Form NT-10K (or NT-10Q) is filed when a company knows it cannot meet its annual (or quarterly) filing deadline. The reasons are almost always negative: auditor disputes, material weakness in internal controls, suspected accounting irregularities, or pending restatements. The filing is itself a public, mechanical, free-to-acquire disclosure of "we are in trouble."

Rule

For any company filing an NT-10K or NT-10Q on EDGAR, short the stock at filing date + 1 day. Cover after 30 trading days. Filter universe to closing price ≥ $5 (penny stock data corruption otherwise). Equal-weight basket.

Why this stands out

Among the ~30 new "simple+unique" signals we hunted and tested in this report's research extension, this is the one that survived. A sample of nearly 4,000 events (3,983), not a handful, makes the t-stat much harder to dismiss as luck. The data is free, the rule is mechanical, and the mechanism is nearly tautological: a company that can't file on time has something to hide.

Caveats

Some NT filings are benign (small subsidiary integrations, recent CFO transitions). Micro-cap and illiquid names dominate the universe, so borrow can be punitive or unavailable. yfinance penny-stock data corruption is a real issue (some delisted tickers report fake +18,000,000% single-day moves); the backtest handles it with the $5 price filter and a ±30%/day return clip. The backtest doesn't include shorting fees, which can be 10-50%+ annualized on these names.

How to act

Practical implementation: scrape EDGAR's full-text search daily (efts.sec.gov/LATEST/search-index?forms=NT-10-K,NT-10-Q). Filter to liquid-enough names. Pair with credit-spread / put-option proxies when borrow is constrained.

Source: own research; SEC EDGAR full-text search (free) · backtests/L6_sec_nt_10k.py

Validation

Medium

Confidence

Strong (N=3983)

Sample Size

Low

Overfit Risk

Low

Regime Risk

High

Cost Impact

Validation Summary

Largest event sample in the catalog (3,983 events) makes the t-stat of 1.84 harder to dismiss as luck. The rule is dead simple and the data is free (EDGAR). The mechanism is nearly tautological -- companies that cannot file on time have something wrong. However, micro-cap and illiquid names dominate, and borrow costs on these names can be 10-50%+ annualized (not in the backtest). MaxDD of -90% reflects penny-stock data corruption risk. The +2.45% excess CAGR would likely be consumed by shorting fees.

What Breaks This

Borrow costs and share availability on micro-cap short targets consume the entire edge, or SEC reforms the NT filing process to reduce the information content of late filings.

E09

IPO Lockup Expiry Short

single-name equities · ~10 days

2.10

t-stat

+3.15%

10d short excess

61%

hit rate

59

events

Mechanism

Standard IPO lockups end 180 days after IPO. Insiders + VC investors then face a "first chance to sell" — supply hits the market, pushing price down. Field-Hanka (2001) documented -1.5% mean CAR; -3% for VC-backed.

Rule

Short the stock 5 trading days before lockup expiry (T-5); cover 5 trading days after (T+5). Equal-weight basket of all eligible IPOs.

Why it's the cleanest event-driven hit

t-stat above 2, hit rate well above 50%, mechanism is mechanical (forced supply). Easy to implement at small scale.

Caveats

Crowded — borrow rates on recent IPOs can be punitively high (sometimes >50% annualized) and offset the price decline. Edge has shrunk to ~50bp in recent samples. Don't use on very thin or hard-to-borrow names.

Source: Field-Hanka JF 2001 · backtests/E09_ipo_lockup.py

Validation

Medium

Confidence

Strong (N=59)

Sample Size

Low

Overfit Risk

Low

Regime Risk

High

Cost Impact

Validation Summary

t-stat 2.10, 61% hit rate, +3.15% per-event excess across 59 events. Academic backing (Field-Hanka JF 2001). The mechanism is mechanical (forced insider supply at lockup +180d). Rule is dead simple with zero free parameters. However, the edge has shrunk to ~50bp in recent samples as the trade is now very crowded. Borrow rates on recent IPOs are often punitive (>50% annualized), which would eat the +3.15% per-event excess in a 10-day holding period.
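To put the borrow-cost concern in numbers, here is a back-of-envelope sketch. It assumes simple interest on a 360-day count and treats the ~10-trading-day hold as 14 calendar days; neither convention is taken from the backtest.

```python
def borrow_cost(annual_rate, calendar_days, day_count=360):
    """Simple-interest borrow fee as a fraction of the short notional."""
    return annual_rate * calendar_days / day_count


def breakeven_borrow_rate(gross_edge, calendar_days, day_count=360):
    """Annualised borrow rate at which the fee equals the gross edge."""
    return gross_edge * day_count / calendar_days


hold_days = 14                      # assumed: ~10 trading days ~ 14 calendar days
fee = borrow_cost(0.50, hold_days)  # 50% annualised borrow
print(f"fee at 50% borrow:          {fee:.2%}")
print(f"net of +3.15% edge:         {0.0315 - fee:.2%}")
print(f"net of ~0.50% recent edge:  {0.0050 - fee:.2%}")
print(f"breakeven borrow for 3.15%: {breakeven_borrow_rate(0.0315, hold_days):.0%}")
```

Under these assumptions a 50% borrow costs roughly 1.9% over the hold. That more than consumes the ~50bp recent-sample edge, while the full-sample +3.15% is only wiped out once borrow runs to around 80% annualized.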

What Breaks This

Crowded trade: borrow costs spike to punitively high levels around known lockup dates, converting the information edge into a borrow-cost transfer to the lending desk.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.49 ≤ 0.50; CAGR 1.7% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

A07

Pre-FOMC Drift (UNCONDITIONAL)

equities · 1-2 days · 8 events/year

0.49

Sharpe

1.70%

CAGR

-8.5%

MaxDD

2.49

t-stat

Mechanism

Lucca-Moench (2015) documented that the S&P 500 earns abnormally positive returns in the 24 hours before FOMC announcements. Theories: pre-announcement uncertainty resolution, dealer hedging, or pure pattern-mining.

Rule

Long SPY from close of T-1 (day before FOMC) to close of T (FOMC day). 8 meetings/year.
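The close-to-close rule can be sketched in a few lines. The function name and the toy prices below are illustrative assumptions; real inputs would be SPY closes and the FOMC announcement calendar.

```python
def pre_fomc_returns(closes, fomc_days):
    """Close(T-1) -> close(T) simple return for each FOMC day T.

    closes: dict mapping ISO date string -> closing price, inserted in
            date order (Python dicts preserve insertion order).
    fomc_days: iterable of ISO date strings for announcement days.
    """
    dates = list(closes)
    index = {d: i for i, d in enumerate(dates)}
    out = {}
    for day in fomc_days:
        i = index.get(day)
        if i is None or i == 0:
            continue  # not a trading day in the sample, or no prior close
        out[day] = closes[day] / closes[dates[i - 1]] - 1.0
    return out


# Toy prices, not real SPY data.
toy = {"2024-01-30": 100.0, "2024-01-31": 102.0, "2024-02-01": 101.0}
print(pre_fomc_returns(toy, ["2024-01-31", "2024-03-20"]))
```

FOMC days missing from the price history are skipped silently here; a production version would probably want to log them.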

The interesting finding

The UNCONDITIONAL pre-FOMC drift survives in our sample (2000-2026) with t=2.49. Adding the "dovish regime" filter (DGS2 1-month change below median) — which the post-2024 literature suggests — DESTROYS the edge: Sharpe collapses to 0.07. The clean Lucca-Moench rule beats the "smart" filtered version. See counter-finding G-14 below.

Caveats

Tiny exposure: only 8 days/year. A CAGR of 1.7% is meaningful only as an overlay on a base portfolio. The original authors' follow-up work suggests the drift has weakened since the 2015 publication.

Source: Lucca-Moench JF 2015 · backtests/A07_pre_fomc_drift.py

Validation

Medium

Confidence

Strong (N=211 events, 26yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 2.49 over 211 events across 26 years -- solid academic pedigree (Lucca-Moench JF 2015). Rule is trivially simple (long SPY close-to-close around FOMC). Zero overfitting risk. However, CAGR is only 1.7% because the signal is active only 8 days per year. Useful only as an overlay on a base long-SPY position. The original authors' follow-up work suggests the drift weakened post-2015 publication. Adding "dovish regime" filters actually destroys the edge -- the clean unconditional rule is better.

What Breaks This

The drift has already weakened post-publication as traders crowd into the pre-FOMC window, arbitraging away the anomaly; future FOMC surprises (hawkish shocks) add left-tail risk to a tiny-edge trade.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 6.7% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

A06

FOMC Even-Week Effect

equities · biweekly cycle

0.53

Sharpe

6.69%

CAGR

-50%

MaxDD

2.71

t-stat

Mechanism

Cieslak-Morse-Vissing-Jorgensen (2019) found stock returns concentrate in "even" weeks (0, 2, 4, 6) of the FOMC cycle and approach zero in odd weeks. Hypothesized to come from informal Fed communications cycling through the financial media on a biweekly cadence.

Rule

Long SPY only during even weeks of the FOMC cycle (weeks 0, 2, 4, 6 after the most recent FOMC). Cash in odd weeks.
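One way to turn "even weeks of the FOMC cycle" into code is shown here. The week boundaries are an assumption: this sketch uses 5-business-day blocks starting on the announcement day, whereas the paper and the backtest may place the start of week 0 slightly differently.

```python
def fomc_cycle_week(business_days_since_fomc):
    """Week index in FOMC-cycle time, using 5-business-day blocks.

    Day 0 is the FOMC announcement day, so days 0-4 are week 0,
    days 5-9 are week 1, and so on.
    """
    if business_days_since_fomc < 0:
        raise ValueError("count days from the most recent FOMC (>= 0)")
    return business_days_since_fomc // 5


def in_market(business_days_since_fomc):
    """True (long SPY) in even cycle weeks 0, 2, 4, 6; False (cash) otherwise."""
    week = fomc_cycle_week(business_days_since_fomc)
    return week % 2 == 0 and week <= 6


print([d for d in range(20) if in_market(d)])
```

Because meetings are roughly 6-8 weeks apart, the counter resets at each new FOMC before the week index gets much past 6.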

Caveats

Sits in cash half the time but earns close to SPY's full Sharpe, so it is capital-efficient. The paper's authors find the pattern weakened post-publication. Combine with another defensive overlay for the cash periods.

Source: Cieslak-Morse-Vissing-Jorgensen JF 2019 · backtests/A06_fomc_even_week.py

Validation

Medium

Confidence

Strong (N=212 events, 26yr)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 2.71 over 26 years with solid academic backing (Cieslak-Morse-Vissing-Jorgensen JF 2019). Sits in cash half the time but earns close to SPY's Sharpe (0.53 vs 0.52). CAGR of 6.7% trails SPY (8.5%), and MaxDD is still brutal at -50% -- the cash weeks didn't protect from crashes. The authors themselves found the pattern weakened post-publication. Simple rule with no fitting risk. The mechanism (biweekly Fed communication cycle) is plausible but unverifiable.

What Breaks This

The Fed changes its communications cadence (already happening with more frequent "unscheduled" remarks and SEP updates), scrambling the biweekly cycle structure.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.49 ≤ 0.50; CAGR 2.1% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

F08

RSI(2) Mean Reversion (Connors)

equities · days · ~98 trades

0.49

Sharpe

2.13%

CAGR

-14%

MaxDD

2.50

t-stat

Mechanism

Short-period RSI mean-reversion: buy oversold conditions only when the longer-term trend is up. The 200-day SMA filter keeps you out of bear markets where mean reversion fails.

Rule

If SPY close > 200-day SMA AND RSI(2) < 5, buy at close. Exit when close > 5-day SMA. (Typical hold: 3-7 days.)
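A minimal sketch of the entry condition, assuming Wilder-style smoothing for RSI (the backtest's exact RSI variant is not shown here, so treat that as an assumption). Function names are illustrative.

```python
def rsi(closes, period=2):
    """Wilder-smoothed RSI; list aligned with closes (None until warmed up)."""
    out = [None] * len(closes)
    if len(closes) <= period:
        return out
    changes = [b - a for a, b in zip(closes, closes[1:])]
    avg_gain = sum(max(c, 0.0) for c in changes[:period]) / period
    avg_loss = sum(max(-c, 0.0) for c in changes[:period]) / period
    for i in range(period, len(closes)):
        if i > period:
            c = changes[i - 1]  # close[i] - close[i-1]
            avg_gain = (avg_gain * (period - 1) + max(c, 0.0)) / period
            avg_loss = (avg_loss * (period - 1) + max(-c, 0.0)) / period
        if avg_loss == 0:
            out[i] = 100.0 if avg_gain > 0 else 50.0
        else:
            out[i] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return out


def sma(closes, window, i):
    """Simple moving average of the `window` closes ending at index i."""
    return sum(closes[i - window + 1 : i + 1]) / window


def entry_signal(closes, i, trend_window=200, rsi_threshold=5.0):
    """Entry condition from the rule: close > 200-day SMA and RSI(2) < 5."""
    if i < trend_window - 1:
        return False
    r = rsi(closes[: i + 1])[i]
    return r is not None and closes[i] > sma(closes, trend_window, i) and r < rsi_threshold


# Synthetic uptrend followed by four down days (not real SPY data).
closes = [100 + 0.5 * k for k in range(200)] + [198.5, 197.5, 196.5, 195.5]
print(entry_signal(closes, 202), entry_signal(closes, 203))
```

On this synthetic series three consecutive down days are not enough to push RSI(2) under 5 but the fourth is, which gives a feel for how rarely the rule fires in an uptrend.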

The honest take

Real per-trade edge (t=2.50). But strategy is in cash most of the time and earns less absolute return than buy-and-hold. Best used as an OVERLAY on a long SPY position — e.g., add leverage when the rule triggers, otherwise just hold base allocation.

Caveats

Single-name mean reversion has decayed substantially since 2010; persists in indices. Edge concentrated in fast-snapback rallies (2018 Q4, 2020 Q2).

Source: Connors 2009 · backtests/F08_rsi2_connors.py

Validation

Medium

Confidence

Strong (N=98 entries, 26yr)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 2.50 across 98 entry signals over 26 years. The 200-day SMA filter is key -- it keeps you out of bear markets where mean reversion fails. Rule is well-known (Connors 2009) and survives on indices though it has decayed on single names. CAGR of only 2.1% with 97% of the time in cash means this is purely an overlay strategy. Per-trade edge is real but the absolute contribution to a portfolio is small. -14% MaxDD is manageable.

What Breaks This

A slow, grinding decline (not a V-shaped crash) in which SPY drifts lower while still sitting above its 200-day SMA, so the RSI(2) entries keep catching falling knives.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 8.1% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

A04

First Five Days of January

equities · 12-month decision

0.59

Sharpe

8.11%

CAGR

-42.7%

MaxDD

3.02

t-stat

Mechanism

Folk-finance heuristic: the first five trading days of January predict the rest of the year. Mechanism unclear; possibly pension/401k inflow patterns or year-start risk-taking.

Rule

If SPY cumulative return Jan trading days 1-5 > 0, hold SPY for rest of year. If negative, stay in cash.

Why it shows up here

t=3.02, ~matches SPY CAGR. The decision is made on tiny information (5 days) yet produces statistically significant results across 26 years. Could be coincidence — 26 years is barely a generation of cycles.

Caveats

Failed dramatically in some years (e.g., 2024). Sample size of 26 binary decisions is statistically weak even with t=3.02.

Source: Yale Hirsch, Stock Trader's Almanac · backtests/A04_first_five_days.py

Validation

Low

Confidence

Moderate (N=26 years)

Sample Size

Medium

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 3.02 looks impressive but is built on only 26 binary decisions (positive or negative first five days). The information content of 5 trading days predicting the next 250 is implausibly high -- likely confounded by broader momentum. CAGR of 8.1% and Sharpe 0.59 roughly match SPY B&H. Failed dramatically in some years (2024). The mechanism is unclear (pension inflows? risk appetite?) and no credible economic theory supports it. Folk-finance heuristic that happens to have held in-sample.

What Breaks This

A single dramatic year where January starts strong but the rest of the year crashes (e.g., a 2008-sized decline) destroys multiple years of accumulated Sharpe improvement.

E10

Spin-Off Drift (Greenblatt)

single-name · 24 months

+11.7%

24m excess CAGR

0.96

t-stat

40

events

Mechanism

Newly spun-off companies are sold indiscriminately by parent shareholders (forced selling: index exclusion, mandate mismatch, "I didn't pick this stock"). Concentrated insider ownership in the new entity often produces post-spin alpha.

Rule

Buy each spin-off at first regular-way trade date; hold 24 months. Equal-weight basket across all spins.

Caveats

t-stat of 0.96 is not statistically significant — directionally consistent with Cusatis-Miles-Woolridge (1993) but our 40-event sample doesn't clear the noise. Recent CSI / GE Vernova / Kenvue spins underperformed initially. Tax-free spins outperform taxable.

Source: Greenblatt 1997; Cusatis-Miles-Woolridge JFE 1993 · backtests/E10_spinoff_drift.py

Validation

Low

Confidence

Strong (N=40)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Medium

Cost Impact

Validation Summary

t-stat of only 0.96 across 40 events -- not statistically significant. The +11.7% mean excess is driven by extreme outliers (GE Vernova +337%, Chemours +123%, Lamb Weston +148%) while the median excess is actually -5.4%. Hit rate of 42.5% is below 50%. The academic theory (forced selling by index funds) is sound but the modern implementation is noisy. The Greenblatt era (1990s) may have been structurally different from the current index-reconstitution speed.

What Breaks This

Index funds and ETFs now rebalance within days of a spin-off, compressing the forced-selling window from months to hours and eliminating the drift.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.41 ≤ 0.50; CAGR 5.1% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

D02

Cross-Sectional Momentum (UMD) — Clean

equities · 1-12 months · Ken French data

0.41

Sharpe

5.11%

CAGR

-63%

MaxDD

36 yrs

sample

Mechanism

Past 12-month winners (skip last month) continue to outperform; losers continue to underperform. Behavioral: underreaction to news. Risk-based: crash exposure.

Why this is the cleanest factor result

Survivorship-free: Ken French data uses the full historical CRSP universe with point-in-time accuracy. 36-year Sharpe of 0.41 is below buy-and-hold but matches the published literature.

Caveats

Catastrophic crashes: 2009 (-50% in months), 2020 Q2. Daniel-Moskowitz "Momentum Crashes" (2016) documents this is a regime-dependent feature, not a bug.

Source: Jegadeesh-Titman JF 1993; Ken French data library · backtests/D02_momentum_umd.py

Validation

Medium

Confidence

Strong (36 yrs, Ken French)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Medium

Cost Impact

Validation Summary

Cleanest survivorship-free dataset (Ken French CRSP universe). t-stat 2.46 over 36 years. Sharpe 0.41 is below buy-and-hold (0.50) -- momentum has been a below-market factor in the recent era. MaxDD of -63% reflects the catastrophic momentum crashes of 2009 and 2020 Q2 documented by Daniel-Moskowitz (2016). The long-short structure implies meaningful transaction costs (monthly rebalancing of individual stocks). CAGR of 5.1% trails the market by 2.6pp.
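The quoted t-stat lines up with the usual rule of thumb that t is roughly the annualized Sharpe times the square root of the sample length in years. That rule assumes roughly independent returns, and whether the backtest harness computes its t-stat this way is an assumption.

```python
import math


def t_stat_from_sharpe(annual_sharpe, years):
    """Approximate t-statistic of the mean return: Sharpe * sqrt(years)."""
    return annual_sharpe * math.sqrt(years)


# Momentum card: Sharpe 0.41 over 36 years.
print(round(t_stat_from_sharpe(0.41, 36), 2))
```

The same check is a quick sanity test for any card in the catalog: a high t-stat paired with a low Sharpe can only come from a long sample, and vice versa.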

What Breaks This

Another momentum crash (sharp reversal where prior losers violently outperform prior winners) -- these happen during bear-market recoveries and can erase years of accumulated return in weeks.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.46 ≤ 0.50; CAGR 2.7% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

K4

★ SLOOS "Willingness to Lend to Consumers" → Long XLY

consumer discretionary · quarterly · 1990-2026

0.46

Sharpe

+2.71%

CAGR

2.40

t-stat

9

trigger events

Mechanism

The Fed's Senior Loan Officer Opinion Survey (SLOOS) is widely watched for the "tightening standards" headline. But the buried question "willingness to make consumer installment loans" (FRED DRIWCIL) measures a slightly different thing — bank appetite to extend credit. A sharp positive jump in this subindex precedes credit-card and auto-loan origination expansion by ~1 quarter, which feeds consumer discretionary spending.

Rule

Long XLY for the next quarter (60 trading days) when FRED DRIWCIL QoQ change rises by +10 percentage points or more.
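The trigger is a plain quarter-over-quarter difference on the survey series. A minimal sketch, with a hypothetical function name and made-up readings; the exposure figures at the end are the ones quoted in this card's validation summary.

```python
def sloos_triggers(levels, threshold=10.0):
    """Indices of quarters where the series rose by >= threshold points QoQ.

    levels: list of quarterly readings in percentage points, oldest first.
    """
    return [
        i for i in range(1, len(levels))
        if levels[i] - levels[i - 1] >= threshold
    ]


# Made-up readings, not actual DRIWCIL data.
print(sloos_triggers([-5.0, 6.0, 8.0, 20.0, 15.0]))

# Exposure implied by the card's own figures: 9 triggers x 60 trading days.
n_triggers, hold_days, sample_days = 9, 60, 6894
print(f"time in market: {n_triggers * hold_days / sample_days:.1%}")
```

The survey is published with a lag after quarter end, so a live version would need to key the entry date off the release date rather than the quarter label.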

Why it works

Statistically significant t=2.40 across the available SLOOS history (1990+). The mechanism is direct: bank willingness → credit origination → consumer spending → XLY earnings.

Caveats

Only 9 trigger events: a small sample even though SLOOS history runs ~35 years (the backtest itself covers about 27 years, 6,894 trading days). In QE regimes banks may express willingness while consumers don't demand credit (transmission breaks). Use as an overlay to a base allocation rather than a standalone strategy.

Source: own research; FRED DRIWCIL (SLOOS) · backtests/K4_sloos_willingness.py

Validation

Low

Confidence

Weak (N=9 triggers)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

Only 9 trigger events across 27 years with just 8% time in market (540 days of 6,894). t-stat 2.40 looks significant but comes from tiny, non-overlapping windows. Portfolio CAGR is only 2.7% -- the signal fires so rarely that its contribution to any real portfolio is negligible. The mechanism (bank willingness leads credit origination) is economically sound. The DRIWCIL series is genuinely obscure, which is interesting, but the sample is too small to trust.

What Breaks This

In QE/ZIRP regimes, banks express willingness to lend but consumers don't demand credit, breaking the transmission mechanism from SLOOS to consumer spending.

Demoted: no longer passes the tightened winner gate. Failed: CAGR 3.3% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

K9

★ Treasury DTS DoD Outlays Surge → Long ITA

defense ETF · 10-day hold · daily data

0.50

Sharpe

+3.30%

CAGR

2.25

t-stat

67

trigger events

Mechanism

The Daily Treasury Statement (Table III-A) publishes US Treasury spending by category every business day. Large single-day Department of Defense outlays often correspond to milestone payments on major weapons contracts that haven't yet been reported in defense-contractor earnings or guidance. Daily data on quarterly-reporting companies = information asymmetry that decays in ~10 days.

Rule

Long ITA (iShares Aerospace & Defense ETF) for 10 trading days whenever Treasury DTS Department of Defense outlays exceed $5B in a single business day.
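A sketch of how a threshold trigger with a fixed hold turns into an in-position mask. Two assumptions are baked in: entry happens on the session after the trigger, and overlapping triggers extend the hold instead of stacking positions.

```python
def hold_mask(outlays, threshold=5.0, hold_days=10):
    """Boolean in-position flags for a threshold trigger with a fixed hold.

    outlays: daily DoD outlays in $ billions, oldest first.
    A trigger on day t puts the strategy long for days t+1 .. t+hold_days.
    """
    mask = [False] * len(outlays)
    for t, value in enumerate(outlays):
        if value > threshold:
            for d in range(t + 1, min(t + 1 + hold_days, len(outlays))):
                mask[d] = True
    return mask


# Made-up outlays with a 3-day hold to keep the output short.
mask = hold_mask([1, 6, 1, 1, 7, 1, 1, 1, 1, 1], hold_days=3)
print(mask, f"exposure: {sum(mask) / len(mask):.0%}")
```

The exposure rate falls out of the mask directly, which is why overlapping triggers make it lower than events times hold length would suggest.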

Why it works

t=2.25 across 67 trigger events. The mechanism is mechanical — when Treasury actually pays the prime contractors, money flows to LMT/RTX/NOC/GD before their next earnings call. ~12% exposure rate means it leaves room for buy-and-hold base allocation.

Caveats

Some large outlays are payroll/pension and have no contractor-revenue implication — would benefit from a "non-pay-period" filter. Treasury DTS structure changed around 2010; pre-2010 events may be miscategorized.

Source: own research; fiscaldata.treasury.gov DTS Table III-A · backtests/K9_dod_treasury_outlays.py

Validation

Medium

Confidence

Strong (N=67)

Sample Size

Low

Overfit Risk

Low

Regime Risk

Low

Cost Impact

Validation Summary

Solid sample (67 triggers) with t-stat 2.25 over 20 years. The mechanism (large DoD payments = contract milestones not yet in earnings) is logical and the data is free, daily, and machine-readable. ITA is liquid with low trading costs. However, portfolio CAGR is only 3.3% due to 12% exposure rate, and excess CAGR vs ITA B&H is -7.8% -- the signal underperforms simply holding the defense ETF. Some large outlays are payroll/pension, not contractor revenue.

What Breaks This

Treasury changes the DTS reporting structure (as it did around 2010) or DoD shifts to continuous small payments rather than lumpy milestones, eliminating the information content of single-day spikes.

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=141, n_events=9.

K5

★ ClinicalTrials.gov Phase-3 Completion Slip → Short Sponsor

single-name biotech · 20-day · CAUTION: N=9

2.75

Sharpe

+72.6%

CAGR

2.06

t-stat

9

events ⚠️

Mechanism

ClinicalTrials.gov requires sponsors to publish Primary Completion Dates for registered trials. The registry tracks every revision. A sudden 90+ day slip in a Phase-3 trial's Primary Completion Date almost always reflects enrollment problems, interim safety concerns, or efficacy issues that the sponsor has not yet disclosed via 8-K. The registry update is mechanical and often days ahead of formal investor communications.

Rule

When a registered Phase-3 trial's Primary Completion Date slips by >90 days in a single update, short the sponsor's stock for 20 trading days.
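Given a trial's revision history, the slip test itself is simple date arithmetic. This sketch assumes the history has already been pulled into a list of tuples; fetching it from the registry is out of scope, and the function name and sample dates are made up.

```python
from datetime import date


def completion_slips(revisions, min_slip_days=90):
    """Find single-update slips in a trial's Primary Completion Date.

    revisions: list of (update_date, primary_completion_date) tuples,
               oldest first, one per registry version.
    Returns a list of (update_date, slip_in_days) for slips > min_slip_days.
    """
    slips = []
    for (_, old_pcd), (updated, new_pcd) in zip(revisions, revisions[1:]):
        slip = (new_pcd - old_pcd).days
        if slip > min_slip_days:
            slips.append((updated, slip))
    return slips


# Hypothetical revision history for one trial.
history = [
    (date(2023, 1, 5), date(2024, 6, 30)),
    (date(2023, 6, 1), date(2024, 7, 31)),    # 31-day slip: ignored
    (date(2023, 11, 15), date(2025, 1, 31)),  # 184-day slip: flagged
]
print(completion_slips(history))
```

Comparing consecutive versions only means a series of small slips that add up to more than 90 days will not trigger, which matches the "in a single update" wording of the rule.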

Why it shows up despite N=9

The headline Sharpe of 2.75 is what the backtest produced, but it likely rides on luck: N=9 events across 5 sponsors (PFE/MRK/BIIB/MRNA/REGN) is statistically weak. The t-stat of 2.06 just clears the threshold for "interesting." We're keeping it here because the mechanism is clean and the data is free, so even a small effect is worth pursuing with a larger universe.

Caveats

Critical: N=9 is too few to trust the Sharpe number. The right way to deploy this is across a much larger biotech universe (full XBI constituents) and over more years. Some slips are administrative (CRO change, COVID disruption) and benign. Survivorship in winning trials means false negatives. The 72% CAGR figure is the in-sample fit, not a realistic forward estimate.

Source: own research; ClinicalTrials.gov StudyVersions API · backtests/K5_phase3_completion_slip.py

Validation

Low

Confidence

Very Weak (N=9)

Sample Size

Medium

Overfit Risk

Low

Regime Risk

High

Cost Impact

Validation Summary

N=9 is far too few to trust the spectacular Sharpe of 2.75 and 72.6% CAGR. The backtest spans only 141 active days across 5 large-cap sponsors (PFE/MRK/BIIB/MRNA/REGN). The signal is exploratory at best. The mechanism (Phase-3 date slips signal problems before disclosure) is genuinely clever and the data source (ClinicalTrials.gov) is free. But the per-event CAR of ~5.9% could easily reverse with a few more observations. Administrative slips (CRO changes, COVID) add noise. Borrow costs on biotech shorts can be severe.

What Breaks This

Expanding the universe beyond 5 mega-cap biotechs reveals that most Phase-3 slips are administrative and benign, diluting the edge to zero.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.23 ≤ 0.50; CAGR 1.4% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

J7

★ Fed RRP Drain → Short TLT (regime-specific)

bonds · 1-2 weeks · 2021-2024

+6.0%

excess CAGR vs B&H

0.23

Sharpe

210

events

Mechanism

When Fed overnight reverse-repo (RRP) facility usage drains rapidly, money-market funds are pulling cash out of the Fed to buy T-bills instead. That front-loads Treasury supply absorption at the short end, leaving the long end to clear at higher yields. The premise is that RRP usage is the most price-insensitive cash on the Fed's balance sheet.

Rule

Short TLT for 5 trading days when 5-day MA of FRED RRPONTSYD falls more than $50B week-over-week.
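The drain trigger can be sketched as a moving-average comparison. "Week-over-week" is read here as 5 trading days, which is an assumption, and the input series is expected in $ billions.

```python
def rrp_drain_signals(rrp, ma_window=5, lookback=5, drop=50.0):
    """Indices where the 5-day MA of RRP usage fell > `drop` vs 5 days earlier.

    rrp: daily RRP usage in $ billions, oldest first.
    """
    ma = [None] * len(rrp)
    for i in range(ma_window - 1, len(rrp)):
        ma[i] = sum(rrp[i - ma_window + 1 : i + 1]) / ma_window
    signals = []
    for i in range(ma_window - 1 + lookback, len(rrp)):
        if ma[i - lookback] - ma[i] > drop:
            signals.append(i)
    return signals


# Made-up series: a one-off $100B step down in usage.
print(rrp_drain_signals([2000.0] * 5 + [1900.0] * 5))
```

Because the signal is evaluated daily on a smoothed series, one sustained drain produces a run of consecutive triggers, which helps explain how a short sample can contain 210 events.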

Why it shows up here

Most of the alpha was captured during the 2022-2024 RRP wind-down + bond bear market. 210 trigger events. Regime-specific edge — when the RRP is near zero (as it is now), the signal won't fire much.

Caveats

Backtest only since the RRP facility became material (2021). Likely captures the bond-bear-market regime more than a persistent edge. Bill-supply driven drains (Treasury issuance shifts) can be false positives. Edge will dissipate as the RRP empties structurally.

Source: own research; FRED RRPONTSYD · backtests/J7_rrp_drain.py

Validation

Low

Confidence

Strong (N=210)

Sample Size

Low

Overfit Risk

High

Regime Risk

Low

Cost Impact

Validation Summary

Despite 210 trigger events, the t-stat is only 0.57 -- well below significance. The Sharpe of 0.23 is weak. The entire alpha came from riding the 2022-2024 bond bear market while the RRP was draining. Now that RRP is near zero, the signal rarely fires. The +6% excess CAGR over TLT B&H is real but entirely regime-specific to a unique period (post-COVID RRP wind-down). Not a persistent edge -- it's a coincident indicator of a bond bear market that has already ended.

What Breaks This

The RRP facility is permanently near zero (as it currently is), meaning the signal never fires again -- the one-time structural drain has already occurred.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe -0.53 ≤ 0.50; CAGR -1.3% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

L3-inv

★ Weak Treasury Auction Direct Bidders → LONG TLT (inverse)

bonds · 5-day hold · 1985-2026

-0.53

Sharpe (short)

-2.58

t-stat (original short rule)

28

events

Mechanism

The original hypothesis: when Treasury auction "direct bidder" share (real-money domestic accounts) collapses below 5%, primary dealers warehouse the paper and hedge by shorting Treasury futures → bonds fall.

The data says the opposite

Empirically, weak direct bidder demand precedes flight-to-quality bond RALLIES, not selloffs. The mechanism likely runs the other way: directs disappear when domestic accounts are de-risking and parking in bills, and the same risk-off impulse bids up duration globally. The t-stat of -2.58 on the original short-TLT rule means the inverse (LONG TLT after weak directs) is the real signal.

Rule (inverted)

Long TLT for 5 sessions after any 10Y or 30Y Treasury auction where Direct bidder allotment falls below 5% (vs ~15% trailing average).

Caveats

28 events — small sample. Direct share is noisy at small auctions. One large pension absence isn't a regime shift. Treat as a tactical overlay, not a standalone alpha source.

Source: own research; TreasuryDirect.gov auction results · backtests/L3_auction_direct_bidder.py

Validation

Medium

Confidence

Moderate (N=28)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Low

Cost Impact

Validation Summary

The original short-TLT hypothesis was wrong (t-stat -2.58), but the inverse (long TLT) is statistically significant. 28 events from 2003-2026. The mechanism makes sense in hindsight: directs disappear when domestic accounts de-risk into bills, and the same risk-off impulse bids up duration. This is a discovered counter-signal, which is intellectually honest but introduces a subtle form of data-snooping bias (you tested one direction, found the opposite worked, and flipped it), so the stated significance is likely overstated. TLT is liquid.

What Breaks This

The "weak directs = risk-off" relationship breaks if foreign central banks (not domestic accounts) become the marginal direct bidders, changing the information content of low direct share.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.33 ≤ 0.50; CAGR 4.9% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).

D04

Industry Momentum

equities · 6-month lookback · monthly rebalance

0.33

Sharpe

4.86%

CAGR

1.97

t-stat

36 yrs

sample

Mechanism

Moskowitz-Grinblatt showed that much of stock-level momentum is actually industry momentum. Trending industries continue to trend; you don't need to pick individual winners.

Rule

Monthly: rank Ken French 49 industries by trailing 6-month return. Long top 5, short bottom 5, equal-weight.
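The monthly ranking step reduces to a sort. A minimal sketch with a hypothetical function name; the input is a mapping of industry to trailing 6-month return, computed elsewhere.

```python
def momentum_weights(trailing_returns, n=5):
    """Equal-weight long top-n / short bottom-n portfolio weights.

    trailing_returns: dict of industry name -> trailing 6-month return.
    """
    if len(trailing_returns) < 2 * n:
        raise ValueError("need at least 2*n industries")
    ranked = sorted(trailing_returns, key=trailing_returns.get, reverse=True)
    weights = {name: 0.0 for name in trailing_returns}
    for name in ranked[:n]:
        weights[name] = 1.0 / n
    for name in ranked[-n:]:
        weights[name] = -1.0 / n
    return weights


# Twelve made-up industries with returns 0%..11%.
toy = {f"ind{i}": i / 100 for i in range(12)}
print({k: v for k, v in momentum_weights(toy).items() if v})
```

The weights sum to zero, so the portfolio is dollar-neutral; it is not necessarily beta-neutral, which matters for the concentration caveat.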

Why it's promising

Cleanest equity factor finding in the catalog because the universe (49 industries) has no survivorship bias and the rule is dead simple. t=1.97 over 36 years.

Caveats

Lower magnitude than single-stock momentum but more robust to crashes. Industry definitions are stable, but concentration risk is real (energy 1980s, tech 1999, financials 2008).

Source: Moskowitz-Grinblatt JF 1999 · backtests/D04_industry_momentum.py

Validation

Medium

Confidence

Strong (36 yrs, Ken French)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Medium

Cost Impact

Validation Summary

Cleanest factor dataset (Ken French 49 industries, no survivorship bias). t-stat 1.97 is borderline significant over 36 years. Sharpe 0.33 and CAGR 4.9% are both below the market (0.58 Sharpe, 7.9% CAGR). The long-short spread underperforms on absolute metrics but is more crash-robust than single-stock momentum. Industry portfolios have stable definitions, making the backtest clean. Transaction costs from monthly rebalancing of 10 industry baskets would further reduce the 4.9% CAGR.

What Breaks This

Industry concentration increases (as it has with tech dominance), making the top-5/bottom-5 industry spread less diversified and more subject to single-sector reversals.

H05

Hash Ribbons (Edwards)

crypto · 3-12 months

0.71

Sharpe

20.69%

CAGR

-71%

MaxDD

0.83

BTC B&H Sharpe

Mechanism

BTC mining hashrate decline = miner capitulation. The recovery (30d MA crosses back above 60d MA) historically marks accumulation phases — capitulation has cleared, weak hands gone.

Rule

Long BTC when 30-day MA of hashrate crosses back above 60-day MA, after having spent ≥14 days below it. Hold until next miner capitulation.
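A sketch of the recovery cross with the "at least 14 days below" condition. Simple moving averages are assumed, and days where the two averages are exactly equal are treated as neither above nor below; the published indicator may differ on both points.

```python
def hash_ribbon_entries(hashrate, fast=30, slow=60, min_days_below=14):
    """Indices where the fast MA crosses back above the slow MA
    after at least `min_days_below` consecutive days below it."""
    def ma(i, w):
        return sum(hashrate[i - w + 1 : i + 1]) / w

    entries, days_below = [], 0
    for i in range(slow - 1, len(hashrate)):
        f, s = ma(i, fast), ma(i, slow)
        if f < s:
            days_below += 1
        elif f > s:
            if days_below >= min_days_below:
                entries.append(i)
            days_below = 0
    return entries


# Tiny made-up series with short windows so the cross is easy to follow.
toy = [10, 10, 10, 10, 8, 7, 6, 10, 12, 15]
print(hash_ribbon_entries(toy, fast=2, slow=4, min_days_below=2))
```

The exit ("hold until next miner capitulation") would be the mirror condition, the first day the fast average drops back under the slow one.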

Caveats

Trails BTC buy-and-hold on raw Sharpe (0.71 vs 0.83), and the -71% MaxDD in this backtest shows little drawdown reduction. Signal fires rarely (~1-2/year). ETF era may decouple miner stress from price action.

Source: Charles Edwards (Capriole) · backtests/H05_hash_ribbons.py

Validation

Medium

Confidence

Strong (4264 days, 11yr)

Sample Size

Low

Overfit Risk

High

Regime Risk

Low

Cost Impact

Validation Summary

t-stat 2.90 across 11 years. The hashrate-capitulation mechanism is mechanical and well-understood. However, Sharpe 0.71 trails BTC B&H's 0.83 and CAGR 20.7% vs 35.4% -- the signal underperforms simply holding BTC. MaxDD of -71% shows it doesn't meaningfully reduce drawdown. Best viewed as a cycle indicator for entry timing, not a standalone alpha source. The signal fires rarely (1-2x/year), limiting its utility. ETF era may decouple miner stress from price.

What Breaks This

Hashrate becomes dominated by a few large publicly traded miners with treasury strategies (like MARA's BTC treasury), making miner capitulation events less frequent and less informative.

H07 / H08

Puell Multiple & NUPL Proxy

crypto · 6-24 months

0.87

Sharpe (each)

29-34%

CAGR

-62 to -77%

MaxDD

0.96

BTC B&H Sharpe

Mechanism

Puell = daily USD value of newly issued BTC divided by its own 365-day moving average; it captures miner profitability extremes. NUPL = (Market Cap - Realized Cap) / Market Cap, i.e. net unrealized profit or loss as a fraction of market cap. Both flag euphoria and capitulation.

Rule

Long BTC when Puell < 0.5 (miner capitulation); flat when Puell > 3.0. Mirror for NUPL: long < 0; reduce > 0.75.
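The two ratios and the Puell threshold rule can be written out directly. Holding the prior state between the two thresholds is an assumption (the rule as stated only defines the extremes), and the function names are illustrative.

```python
def puell_multiple(issuance_usd, window=365):
    """Latest daily issuance value divided by its trailing `window`-day mean."""
    if len(issuance_usd) < window:
        raise ValueError("not enough history for the moving average")
    recent = issuance_usd[-window:]
    return issuance_usd[-1] / (sum(recent) / window)


def nupl(market_cap, realized_cap):
    """Net unrealized profit/loss as a fraction of market cap."""
    return (market_cap - realized_cap) / market_cap


def puell_position(puell, long_below=0.5, flat_above=3.0, current=0):
    """1 = long, 0 = flat; between the thresholds the prior state is kept."""
    if puell < long_below:
        return 1
    if puell > flat_above:
        return 0
    return current


# Toy inputs with a 4-day window instead of 365.
print(puell_multiple([10.0, 10.0, 10.0, 2.0], window=4), nupl(100.0, 25.0))
```

A NUPL rule would follow the same pattern with thresholds 0 and 0.75, except that the upper threshold reduces rather than exits the position.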

Caveats

Both signals slightly trail BTC buy-and-hold on Sharpe in our sample. They're cycle indicators, not alpha generators — useful for sizing/risk management. ETF-era proxies may be less reliable than on-chain originals.

Source: D. Puell; Glassnode · backtests/H07_puell_multiple.py, H08_nupl.py

AE-2

★ Hyperscaler Capex Ratchet → Power-Infra Basket

industrials · 90 days (quarterly) · 2024-2025

1.23

Sharpe

57.2%

CAGR

-42.5%

MaxDD

+37.9%

excess vs SPY

5

events

Mechanism

When 3+ of {MSFT, GOOG, AMZN, META} guide capex UP >20% YoY in the same earnings season, it signals an impending wave of transformer, switchgear, and power-distribution orders. Supply-chain stocks lag by 1-2 quarters because (a) orders haven't hit backlog disclosures yet, and (b) industrial sell-side models update on their own earnings cycle, not hyperscaler cycles. The cross-sector information gap is the edge.

Rule

Each earnings season: if ≥3 of {MSFT, GOOG, AMZN, META} report capex >20% YoY in 10-Qs, long equal-weight {ETN, VRT, PWR, HUBB, AMSC, MOD} for 90 days from the last confirming earnings call.

Why it's promising

Fired every quarter since Q4-2023. Average per-event basket return +16.3%, average excess vs SPY +10.7%. Best events: Q4-2023 (+45.6%) and Q2-2024 (+41.9%) as the AI power buildout wave hit equipment backlogs. Q3-2024 was a loser (-22.2%) during the broad tech selloff — the basket carries high beta when sentiment reverses.

Caveats

Only 16 months of history (Feb 2024 – Jun 2025). MaxDD -42.5% is brutal — the basket is volatile mid-caps. The trigger has fired every single quarter, which means either it's a structural regime (plausible — AI capex isn't slowing) or the signal will become crowded fast. If hyperscalers guide down, the basket collapses. No transaction costs, no slippage on AMSC (thin).

Source: hyperscaler 10-Q capex filings; own research · backtests/AE2_hyperscaler_capex.py

Validation

Low

Confidence

Very Weak (N=5)

Sample Size

Medium

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

Only 5 quarterly events across 16 months of history. The 57.2% CAGR and 1.23 Sharpe are artifacts of catching the AI power buildout wave at its inception -- this is a momentum trade on a thematic narrative, not a systematic signal. t-stat of only 1.43 does not clear significance. One losing event (-22.2% in Q3-2024) shows the basket carries extreme drawdown risk (-42.5% MaxDD). AMSC is thinly traded and would suffer slippage. The trigger has fired every single quarter, which is suspicious.

What Breaks This

Hyperscalers guide capex down even once, or shift to on-site nuclear/microgrids, and the power-infra basket collapses -- the entire thesis is one earnings call away from reversal.

AG-2

Transformer Shortage → OEM vs Utility Pair

industrials · regime trade · 2022-2026

0.66

Sharpe

17.5%

CAGR

-40.3%

MaxDD

977d

in-position

Mechanism

US large power transformer lead times extended beyond 100 weeks (vs historical 30-50) in mid-2022, driven by AI data-center load + grid modernization + EV charging. Only ~4 domestic LPT factories exist. Transformer OEMs (Eaton, Hubbell) enjoy massive pricing power — ASPs up 30-50% with full backlogs. Meanwhile, utilities that promised 8-12% rate-base growth face project delays from physical transformer unavailability. The same bottleneck creates a winner (OEM) and a loser (growth utility) and no single analyst covers both sides.

Rule

When DOE reports transformer lead time >100 weeks: long equal-weight {ETN, HUBB} vs short {NEE, AEP}. Hold until lead times normalize below 60 weeks.

Why it's promising

ETN +230% and HUBB +181% cumulative since Jul 2022. The pair trade returned 17.5% CAGR / Sharpe 0.66 over nearly 3 years. Long-only OEM basket would be ~33% CAGR. The regime persists because transformer manufacturing capacity takes 3-5 years to build.

Caveats

SPY returned 20.8% CAGR over the same period, so the pair trade underperforms SPY by 3.3pp, though the long-only OEM leg beats it comfortably. MaxDD -40.3%. A Defense Production Act invocation to accelerate production or a hyperscaler shift to on-site nuclear/microgrids would end the regime. Only one regime window backtested (N=1 structural event).

Source: DOE Transformer Availability Report; own research

backtests/AG2_transformer_shortage.py

Validation

Low

Confidence

Very Weak (N=1 regime)

Sample Size

Medium

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

This is fundamentally a single-observation regime trade (N=1 structural event: transformer shortage beginning mid-2022). The t-stat of 1.29 is well below significance. The pair trade returned 17.5% CAGR but trailed SPY by 3.3 percentage points, so the index would have been the better holding. MaxDD of -40% is steep. The logic (OEMs benefit from pricing power while utilities face delays) is sound, but 977 days of one continuous trade is a case study, not a backtest. Not statistically validatable.

What Breaks This

A Defense Production Act invocation or hyperscaler shift to on-site generation normalizes transformer lead times below 60 weeks, ending the OEM pricing power regime overnight.

AC-6

★ Saudi Fiscal Breakeven → Brent

commodities · ~3 months (Q1 only) · 2018-2026

2.84

Sharpe

~16%

CAGR (portfolio)

+27.2%

mean / event

3.38

t-stat

6 / 9

events triggered

Mechanism

Saudi Arabia publishes its annual budget in late December assuming an implicit oil price — the "fiscal breakeven." When that figure sits materially above market Brent, the Saudi government has fiscal and political incentive to engineer supply cuts or extensions at the next OPEC+ JMMC. The Saudi budget document is the most actionable forward indicator of OPEC+ supply policy — printed weeks before the analyst chatter and months before the meeting itself.

Rule

On Jan 2 each year: if (Saudi budget fiscal breakeven − front-month Brent) > $12, long Brent (BZ=F) from Jan 2 through Mar 31. Otherwise stay in cash.

Why it's promising

5 of 6 triggered Q1 windows were positive: 2018 (+5.6%), 2019 (+24.5%), 2021 (+24.4%), 2024 (+15.3%), 2026 (+94.8%); only 2025 lost (−1.6%). Geometric portfolio CAGR holding cash in untriggered years is ~16%. Strip 2026 as an outlier and the mean per-window is still ~13%. The 3-month Saudi budget → JMMC reaction loop is fundamentally hard to arbitrage because the read requires patient calendar discipline, not infrastructure.
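The summary figures above can be rechecked from the six quoted window returns. This is a sketch under stated assumptions: a 9-year annualization window (2018 through 2026) and a 0% return on cash in untriggered time. The helper names are illustrative.

```python
import math

# Triggered Q1 window returns as quoted in the text, in percent.
q1_returns = {2018: 5.6, 2019: 24.5, 2021: 24.4, 2024: 15.3, 2025: -1.6, 2026: 94.8}


def mean_return(returns_pct):
    """Arithmetic mean of per-window returns, in percent."""
    return sum(returns_pct) / len(returns_pct)


def portfolio_cagr(returns_pct, n_years):
    """Compound the triggered windows, treat all other time as 0% cash, annualize."""
    growth = math.prod(1 + r / 100 for r in returns_pct)
    return (growth ** (1 / n_years) - 1) * 100


all_events = list(q1_returns.values())
ex_2026 = [r for year, r in q1_returns.items() if year != 2026]

print(round(mean_return(all_events), 1))                  # mean per triggered window
print(round(mean_return(ex_2026), 1))                     # same, with 2026 stripped
print(round(portfolio_cagr(all_events, n_years=9), 1))    # assumes a 9-year window
```

Under those assumptions the full-sample mean rounds to the quoted +27.2%, the ex-2026 mean lands between 13% and 14%, and the compounded figure comes out in the mid-teens. A shorter annualization window would push it higher.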

Caveats

N=6 triggered events — small sample. 2026's +94.8% return was amplified by simultaneous geopolitical shock to Brent ($60→$118); treat as a partial confound. Breakeven figures come from IMF Article IV reports (annual, published with ~6-month lag) and the Saudi MoF — these are estimates, not contracts. US shale supply growth can cap Brent even when Saudi cuts; gate down when US rig count is rising >5%/quarter. Brent futures roll costs not included.

Source: IMF Article IV MENA REO; Saudi MoF budget

backtests/AC6_saudi_breakeven_brent.py

Validation

Low

Confidence

Weak (N=6 triggered)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Medium

Cost Impact

Validation Summary

Headline Sharpe of 2.84 and t-stat 3.38 are heavily inflated by 2026's +94.8% return (a geopolitical outlier). Strip that and the mean per-window drops to ~13%. Only 6 triggered events across 9 years. The mechanism (Saudi fiscal pressure incentivizes OPEC+ cuts) is sound but the breakeven estimate is itself uncertain (IMF Article IV reports lag by 6 months). Brent futures roll costs not included. The $12 gap threshold is one more fitted parameter. A neat idea but statistically too thin.

What Breaks This

Saudi Arabia diversifies revenue enough (Vision 2030) that fiscal breakeven no longer dictates OPEC+ supply policy, or US shale output responds faster than OPEC+ can cut.

PL5

★ Housing Permits/Completions Ratio → Long Aggregates

equities · 6-month hold · 2012-2022

1.14

Sharpe

31.4%

CAGR (in-pos)

-27.3%

MaxDD

2.53

t-stat

92%

win rate (11/12)

Mechanism

Census Bureau publishes monthly housing permits and completions (free on FRED). When the permits-to-completions ratio crosses above 1.3, it means more homes are being approved than finished — a construction backlog is building. Each permitted home requires 100-400 tons of aggregate and 10-30 cubic yards of concrete. Aggregate/cement producers (VMC, MLM, EXP) have local monopoly pricing power because trucking radius limits competition. Rising volume + pricing power = operating leverage — margins expand 200-400bps over the next 2-3 quarters.

Rule

When FRED PERMIT/COMPUTSA ratio crosses above 1.3: long equal-weight {VMC, MLM, EXP} for 6 months (126 trading days).
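A minimal sketch of the crossing test, assuming the two monthly series have already been downloaded and aligned as plain lists. The function name and sample values are hypothetical. Only the month where the ratio moves from at-or-below 1.3 to above it counts as a trigger, which matters because the ratio can then sit above the threshold for a long stretch.

```python
def first_crossings(permits, completions, threshold=1.3):
    """Indices where permits/completions moves from <= threshold to > threshold."""
    ratios = [p / c for p, c in zip(permits, completions)]
    entries = []
    for i in range(1, len(ratios)):
        if ratios[i - 1] <= threshold < ratios[i]:
            entries.append(i)
    return entries


# Hypothetical monthly values in thousands of units (not FRED data).
permits = [1200, 1300, 1450, 1500, 1250, 1420]
completions = [1000, 1050, 1080, 1100, 1090, 1070]
print(first_crossings(permits, completions))
# [2, 5]
```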

Why it's promising

11 of 12 trigger events produced positive returns. Average 6-month basket return +18.1%, average excess vs SPY +14.0%. t-stat 2.53 across 12 events spanning 2012-2022. The only loser was Dec 2021 (entering into the 2022 rate shock). The mechanism is nearly structural: permits are a legal commitment to build, and each build requires physical aggregate that only local producers can supply.

Caveats

The ratio can stay above 1.3 for extended periods (not just a momentary cross), so signal timing requires watching the first crossing. In a severe rate-hiking cycle (2022), even guaranteed demand doesn't prevent multiple compression. VMC/MLM/EXP are cyclical industrials with 25-30% drawdowns in downturns. FRED data has a 1-month publication lag.

Source: FRED PERMIT + COMPUTSA; own research

backtests/PL5_housing_permits_completions_aggregate.py

AI-4

GLP-1 Mass Adoption → Short Premium Spirits

consumer / healthcare · regime trade · 2023-2026

0.60

Sharpe

13.1%

CAGR

-22.4%

MaxDD

747d

in-position

Mechanism

GLP-1 drugs (Ozempic, Wegovy, Zepbound) have reached mass adoption (~15M+ US patients). Clinical data shows a 29% reduction in drinking frequency among users. This creates a structural headwind for spirits volume that the market initially dismissed as "GLP-1 adherence is low." But adherence is improving and the prescriptions keep growing. Constellation Brands (STZ) faces a double whammy: GLP-1 volume drag PLUS tariff exposure on 85% Mexico-sourced revenue. Brown-Forman (BF-B) sees organic revenue stalling for the first time in decades.

Rule

Short equal-weight {BF-B, STZ} when GLP-1 cumulative prescriptions exceed 15M AND quarterly organic spirits volume declines >3%. Regime trade — hold until prescription growth reverses or volume recovers.

Caveats

STZ already -45% from peak — much of the decline may be priced in. Excess CAGR vs SPY is -10% (SPY returned 23% over this period). The short thesis worked but a simple long-SPY position outperformed. GLP-1 supply constraints or coverage limits could stall the adoption curve. Tariff rollback would specifically help STZ.

Source: STEP trial data; Nielsen spirits scanner; BF.B/STZ earnings

backtests/AI4_glp1_spirits_short.py

Validation

Low

Confidence

Very Weak (N=1 regime)

Sample Size

Medium

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

Single regime trade (N=1) with only 747 days of history. The 13.1% CAGR looks decent but excess CAGR vs SPY is -10% -- you'd have made much more just holding the index. STZ's decline is confounded by Mexico tariff exposure (not purely GLP-1). The t-stat of 1.03 is not significant. Much of the spirits decline may already be priced in (STZ -45% from peak). The mechanism (GLP-1 reduces drinking) has clinical support but translating that to stock alpha is a stretch with N=1 and multiple confounds.

What Breaks This

GLP-1 supply constraints or insurance coverage limits slow the adoption curve, spirits volume stabilizes, and tariff rollback produces a sharp STZ rebound.

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.49 ≤ 0.50; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness).
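The demotion notes cite a Benjamini-Hochberg correction at FDR=0.05. The gate's own implementation is not shown on this page, so the following is only a generic sketch of the step-up procedure, assuming each strategy's t-stat has already been converted to a p-value upstream. The function name and sample p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per hypothesis: True where it survives BH at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every hypothesis ranked at or below the cutoff is kept.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed


# Hypothetical p-values for eight candidate signals.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_vals))
```

With many candidate signals tested at once, a raw p-value under 0.05 is not enough: in the sample above only the two smallest p-values survive, even though five are below 0.05.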

AI-5

Office CMBS Stress → Regional Bank CRE Pair

financials · regime trade · 2023-2026

0.49

Sharpe

11.7%

CAGR

-34.1%

MaxDD

850d

in-position

Mechanism

Remote work has become structural: the Kastle back-to-work barometer is stuck below 50%. This feeds a 5-link chain: 1) structural vacancy → 2) office CMBS 60+ day delinquency at 12.3% (exceeding the 2009 GFC peak) → 3) $270B+ of office CMBS matures 2026-2028 at 40-60% LTV haircuts → 4) regional banks with CRE >300% of Tier 1 capital (OZK, NYCB/Flagstar) face write-downs → 5) pair spread widens vs diversified large banks (JPM, WFC).

Rule

When the Trepp office CMBS 60+ day delinquency rate exceeds 10% AND a regional bank's CRE concentration exceeds 300% of Tier 1 capital: long {JPM, WFC} vs short {OZK, NYCB/FLG}, equal-weight pair.

Caveats

Excess CAGR vs SPY is -12% — long JPM/WFC leg carried the return, the short leg underperformed. MaxDD -34% is steep. NYCB restructured to Flagstar Financial in 2024, muddying the backtest. A Fed rate-cutting cycle would reduce refinancing pain and narrow the spread. Office-to-residential conversion programs could stabilize valuations.

Source: Trepp CMBS data; FDIC call reports; Kastle barometer

backtests/AI5_office_cmbs_regional_bank.py

Validation

Low

Confidence

Very Weak (N=1 regime)

Sample Size

Medium

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

Another N=1 regime trade (850 days). t-stat 0.89 is nowhere near significance. Excess CAGR vs SPY is -11.7% -- the long JPM/WFC leg carried the returns while the short OZK/NYCB leg barely contributed. MaxDD -34% is steep. NYCB restructured to Flagstar Financial mid-trade, creating a structural break in the backtest. The mechanism (office vacancy hits CRE-exposed regionals) is directionally correct but the pair construction doesn't isolate the thesis from general bank-sector beta.

What Breaks This

Fed rate cuts reduce refinancing pain for office CMBS, narrowing the pair spread; or regional banks provision aggressively and the market rewards the clarity with a rerating.

AS-B

★ Inference Cost Reversal → Custom Silicon (MRVL + CRDO)

semiconductors · regime trade · 2024-2026

1.58

Sharpe

129%

CAGR

-60.1%

MaxDD

+107%

excess vs SPY

Mechanism

Frontier LLM token pricing collapsed 97% from 2023-2025, then GPT-5.5 doubled pricing — signaling the deflationary era for frontier inference is ending. Unprofitable inference on general-purpose NVIDIA GPUs forces hyperscalers to accelerate custom ASIC adoption (Google TPUs via Broadcom, Amazon Trainium, Meta chips via Marvell). Custom silicon delivers 2-5x better cost-per-token. Marvell's data center revenue +46% YoY, Credo's +85% YoY. The value is migrating from GPU rental to inference-optimized custom silicon and optical interconnect.

Rule

When frontier model providers raise token pricing (reversing the deflationary trend) AND hyperscaler custom ASIC orders accelerate: long equal-weight {MRVL, CRDO}. Regime trade — hold while inference cost pressure drives custom silicon adoption.

Caveats

Only 17 months of data (Jan 2024 – May 2026). MaxDD -60% is extreme — CRDO dropped from $80 to $30 during the Oct-Dec 2024 selloff before recovering to $100+. CRDO alone up 1022% from entry. If NVIDIA cuts GPU pricing to compete with custom silicon, the thesis weakens. Both names trade at elevated multiples.

Source: LLM pricing pages; hyperscaler ASIC announcements; own research

backtests/ASB_custom_silicon.py

Validation

Low

Confidence

Very Weak (N=1 regime)

Sample Size

High

Overfit Risk

High

Regime Risk

Medium

Cost Impact

Validation Summary

The 129% CAGR and 1.58 Sharpe come from a single 17-month trade (N=1). CRDO alone was up 1022% from entry -- this is a momentum trade on one stock, not a systematic signal. MaxDD of -60% (CRDO dropped from $80 to $30 in Q4-2024) is catastrophic. Both names trade at extreme multiples. The "trigger" (inference pricing reversal) is a narrative overlay on what amounts to "buy two high-beta semi names in a bull market." Not statistically validatable by any standard.

What Breaks This

NVIDIA cuts GPU pricing aggressively to compete with custom ASICs, or a single hyperscaler cancels its custom chip program, puncturing the narrative.

AQ-1

★ Copper Near Marginal Cost Floor → Long HG=F

commodities · event-driven · 2016-2023

1.79

Sharpe

45.8%

CAGR (in-pos)

-16.7%

MaxDD

3.10

t-stat

3 / 3

positive trades

Mechanism

Copper ore grades are declining structurally (Escondida: 1.6% → 1.0%; Grasberg: -20% in Q1 2025). This pushes all-in sustaining costs up every year. When spot copper approaches the 90th percentile of the cost curve (~$3.80-4.00/lb in 2026), high-cost mines curtail production, mechanically tightening supply. The floor mechanism limits downside while upside is uncapped. Each year Chilean desalination mandates push costs higher still (seawater share 36% → projected 66% by 2034). The floor only rises.

Rule

When spot copper (HG=F) trades within 15% of the estimated 90th-percentile AISC (from Cochilco + miner 10-Q cost disclosures), go long HG=F. Hold 12 months.
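A minimal sketch of the proximity test. "Within 15%" is read here as spot trading no more than 15% above the cost estimate, which is an assumption; the rule could also be read as a two-sided band. The function names and prices are illustrative.

```python
def premium_to_floor(spot, aisc_p90):
    """Fractional premium of spot over the 90th-percentile cost estimate."""
    return spot / aisc_p90 - 1


def near_cost_floor(spot, aisc_p90, band=0.15):
    """True when spot is at most `band` above the cost estimate (or below it)."""
    return spot <= aisc_p90 * (1 + band)


# Hypothetical prices in $/lb.
print(near_cost_floor(4.30, 3.90))   # inside the band
print(near_cost_floor(4.60, 3.90))   # outside the band
```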

Why it's promising

3 for 3 positive trades: Jan 2016 entry at $1.94/lb (+80% in 18mo), Mar 2020 at $2.10/lb (+67%), Jul 2022 at $3.30/lb (+21%). t-stat 3.10 across the three events. The mechanism is nearly tautological — producers below their own cost of production stop producing.

Caveats

N=3 trades, each ~12 months. Portfolio CAGR including cash periods is ~12%, because the signal only fires when copper is cheap. The floor is itself an estimate: Cochilco publishes annually with a lag. Recycling and substitution can dampen demand enough to keep copper at depressed levels longer than miners can sustain losses (the 2015 downturn lasted ~18 months).

Source: Cochilco cost reports; miner 10-Q ore grades; own research

backtests/AQ1_copper_marginal_cost.py

Validation

Low

Confidence

Very Weak (N=3)

Sample Size

Low

Overfit Risk

Medium

Regime Risk

Medium

Cost Impact

Validation Summary

3 for 3 positive trades with Sharpe 1.79 and t-stat 3.10, but N=3 is inherently untrustworthy regardless of in-sample metrics. The mechanism is nearly tautological (producers below cost stop producing, tightening supply), which is a genuine structural strength. However, the 90th-percentile AISC figure is itself an estimate that changes annually and is published with a lag. Portfolio CAGR including cash periods is ~12%. Futures roll costs not included. Each trade is a 12-month bet on a commodity.

What Breaks This

Copper demand destruction from substitution (aluminum for wiring, fiber for copper in telecoms) or a global recession keeps copper at depressed levels longer than miners can sustain losses.

PL83_continued_claims_decline_iwm

Continued Claims Decline → Long IWM

small-cap equities (IWM) · 3 events

2.03

Sharpe

48.4%

CAGR

-13.0%

MaxDD

2.48

t-stat

Mechanism

Signal based on Long IWM 126d when CCSA declines 8+ weeks from near-peak.

Rule

Long IWM 126d when CCSA declines 8+ weeks from near-peak

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence.
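The caveat that in-position CAGR overstates portfolio contribution can be made concrete. A minimal sketch, assuming idle time earns 0% and capital compounds at the in-position rate only while a trade is open. The 2,520-day sample length is a made-up figure for illustration, not the backtest's actual window.

```python
def diluted_cagr(in_position_cagr, days_in_position, total_days):
    """Portfolio-level CAGR when capital earns the in-position rate only part of the time."""
    fraction = days_in_position / total_days
    return (1 + in_position_cagr) ** fraction - 1


# 3 events x 126 days in position, over a hypothetical 2,520-day sample.
print(round(diluted_cagr(0.484, 3 * 126, 2520), 3))
```

The fewer days a signal spends in the market, the more the headline in-position figure shrinks at the portfolio level.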

Source: FRED CCSA; yfinance

backtests/PL83_continued_claims_decline_iwm.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=40, n_events=4.

PL3_china_pmi_expansion_metals

China PMI Expansion → Long Copper

copper futures · 10-day hold · 4 events

1.81

Sharpe

86.2%

CAGR

-17.8%

MaxDD

0.72

t-stat

Mechanism

China PMI expansion cross signals industrial demand recovery; copper reprices upward

Rule

Long HG=F for 10 trading days when China Manufacturing PMI crosses above 50 after 2+ months below
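A minimal sketch of the "cross above a level after N+ readings below" trigger, assuming one PMI print per month in a plain list. The function name and sample prints are hypothetical. A reading exactly at the level is treated as neither above nor below and resets the run, which is a design choice of this sketch.

```python
def cross_above_after_below(values, level=50.0, min_below=2):
    """Indices where a value rises above `level` after >= `min_below` consecutive readings below it."""
    signals = []
    run_below = 0
    for i, v in enumerate(values):
        if v < level:
            run_below += 1
        else:
            if v > level and run_below >= min_below:
                signals.append(i)
            run_below = 0   # any reading at or above the level ends the run
    return signals


# Hypothetical monthly PMI prints.
pmi = [50.4, 49.8, 49.5, 50.2, 49.9, 50.3, 49.0, 48.7, 49.2, 51.0]
print(cross_above_after_below(pmi))
# [3, 9]
```

The same shape covers the other "turns positive after 6+ months negative" rules on this page by setting `level=0` and `min_below=6`.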

Caveats

N=4 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 0.72 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-29.9pp) vs benchmark — alpha questionable.
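The cards do not say whether t-stats are computed on per-event or daily returns. The sketch below assumes per-event returns and uses made-up numbers, only to show why a handful of events rarely clears significance: with 4 events (3 degrees of freedom) the two-sided 5% critical value is about 3.18.

```python
import math
import statistics


def event_t_stat(event_returns):
    """One-sample t-statistic of the mean event return against zero."""
    n = len(event_returns)
    mean = statistics.mean(event_returns)
    std_err = statistics.stdev(event_returns) / math.sqrt(n)
    return mean / std_err


# Hypothetical per-event returns (not the backtest's trades).
sample = [0.04, -0.01, 0.03, -0.02]
print(round(event_t_stat(sample), 2))
```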

Source: China NBS PMI; yfinance

backtests/PL3_china_pmi_expansion_metals.py

PL73_commercial_paper_spike_gold

CP Spike → Long GLD

gold · ~8-week hold · 6 events

1.79

Sharpe

40.1%

CAGR

-15.2%

MaxDD

1.79

t-stat

Mechanism

CP market stress → backstage liquidity pressure → gold flight-to-quality

Rule

Long GLD 42 days when COMPOUT 4-week change > +10%

Caveats

N=6 events. In-position CAGR — portfolio contribution depends on trigger frequency.

Source: FRED COMPOUT; yfinance

backtests/PL73_commercial_paper_spike_gold.py

PL78_gdpnow_consensus_gap_spy

GDPNow > 3% → Long SPY

equities (SPY) · 10 events

1.70

Sharpe

29.9%

CAGR

-17.1%

MaxDD

2.18

t-stat

Mechanism

Signal based on Long SPY 42d when GDPNOW > 3.0%.

Rule

Long SPY 42d when GDPNOW > 3.0%

Caveats

N=10 events. In-position CAGR — portfolio contribution depends on trigger frequency.

Source: FRED GDPNOW; yfinance

backtests/PL78_gdpnow_consensus_gap_spy.py

PL97_core_capex_orders_capital_goods

Core Capex Orders Turn → Long ETN+ROK+AME

capital goods equities · 3-month hold · 5 events

1.60

Sharpe

40.4%

CAGR

-22.2%

MaxDD

2.53

t-stat

Mechanism

Signal based on Long ETN+ROK+AME 126d when NEWORDER 3mo MA YoY turns positive after 6+ months negative.

Rule

Long ETN+ROK+AME 126d when NEWORDER 3mo MA YoY turns positive after 6+ months negative

Caveats

N=5 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence.

Source: FRED NEWORDER; yfinance

backtests/PL97_core_capex_orders_capital_goods.py

PL41_existing_home_low_supply_builders

Existing Home Low Supply → Long Homebuilders

homebuilders · ~12-month hold · 1 event

1.54

Sharpe

61.5%

CAGR

-15.0%

MaxDD

1.54

t-stat

Mechanism

Extreme existing-home scarcity forces buyers to new construction → homebuilder pricing power

Rule

Long LEN+DHI+NVR for 252 days when MSACSR crosses below 3.0 months

Caveats

N=1 event. In-position CAGR: portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED MSACSR; yfinance

backtests/PL41_existing_home_low_supply_builders.py

PL95_savings_rate_decline_leisure

Savings Rate Decline → Long Leisure

leisure/travel equities · 12-month hold · 4 events

1.51

Sharpe

55.5%

CAGR

-25.1%

MaxDD

1.85

t-stat

Mechanism

Signal based on Long BKNG+MAR+H 126d when PSAVERT drops >4pp from 12mo peak.

Rule

Long BKNG+MAR+H 126d when PSAVERT drops >4pp from 12mo peak
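A minimal sketch of the "drop from trailing peak" trigger, assuming a monthly savings-rate series in percent as a plain list. The function name and data are hypothetical. The re-arm logic (one signal per episode, reset once the drop shrinks back under the threshold) is an assumption of this sketch, not something the rule states.

```python
def drop_from_peak_signals(series, window=12, min_drop=4.0):
    """Indices where the series first falls more than `min_drop` points below its trailing peak."""
    signals = []
    armed = True
    for i in range(len(series)):
        start = max(0, i - window + 1)
        peak = max(series[start:i + 1])       # trailing peak, current month included
        dropped = peak - series[i] > min_drop
        if dropped and armed:
            signals.append(i)
            armed = False                     # suppress repeats within the same episode
        elif not dropped:
            armed = True
    return signals


# Hypothetical monthly savings-rate readings, in percent.
savings = [7.0, 8.0, 13.0, 12.0, 10.0, 8.5, 8.0, 9.5, 8.8]
print(drop_from_peak_signals(savings))
# [5, 8]
```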

Caveats

N=4 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (25%) during adverse regimes. Low event count limits statistical confidence. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED PSAVERT; yfinance

backtests/PL95_savings_rate_decline_leisure.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=226, n_events=2.

PL62_m2_reacceleration_gold

M2 Reacceleration → Long Gold

gold · ~12-month hold · 2 events

1.38

Sharpe

40.6%

CAGR

-19.2%

MaxDD

1.30

t-stat

Mechanism

Monetary expansion drives gold with 6-12mo lag as liquidity flows to hard assets

Rule

Long GLD 252 days when M2 YoY crosses above +4% after 6+ months below +2%

Caveats

N=2 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 1.30 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED M2SL; yfinance

backtests/PL62_m2_reacceleration_gold.py

PL42_semi_b2b_crossing_one_smh

SEMI B2B Cross 1.0 -> Long SMH

semiconductors (SMH) · 6-month hold · 5 events

1.36

Sharpe

46.7%

CAGR

-33.6%

MaxDD

2.15

t-stat

Mechanism

B2B crossing 1.0 signals order book turning positive -> semiconductor sector recovery begins -> SMH rallies as earnings estimates inflect

Rule

Long SMH 6mo when SEMI B2B crosses 1.0 from below after 3+ months under

Caveats

N=5 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (34%) during adverse regimes. Low event count limits statistical confidence.

Source: SEMI B2B press releases (hand-coded crossing dates); yfinance SMH

backtests/PL42_semi_b2b_crossing_one_smh.py

PL48_cre_distressed_alt_managers

CRE Trough -> Long Alt Managers

alt managers · 12-month hold · 2 events

1.32

Sharpe

44.3%

CAGR

-21.3%

MaxDD

1.87

t-stat

Mechanism

CRE distress cycle trough -> alt managers deploy dry powder at discounted prices -> AUM/fee re-rating -> stock re-rates

Rule

Long BX+ARES+KKR equal-weight 12mo, entry 12mo after CRE volume trough

Caveats

N=2 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: MSCI RCA CRE volume (hand-coded trough dates); yfinance BX, ARES, KKR

backtests/PL48_cre_distressed_alt_managers.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=90, n_events=3.

PL49_plantings_corn_soy_spread

Plantings Surprise -> Long Soy

soybean futures · 3 events

1.29

Sharpe

23.0%

CAGR

-12.7%

MaxDD

0.77

t-stat

Mechanism

Corn acreage expansion at soy expense -> soy supply squeeze -> soy/corn spread widens -> soy rallies

Rule

Long ZS=F 30d on Mar 31 when corn acreage surprises high / soy low

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 0.77 below conventional significance threshold.

Source: USDA Prospective Plantings (hand-coded surprise years); yfinance ZS=F, ZC=F

backtests/PL49_plantings_corn_soy_spread.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=250, n_events=2.

PL59_durable_goods_surge_industrials

Durable Goods Surge → Long XLI

industrials (XLI) · ~6-month hold · 2 events

1.28

Sharpe

31.5%

CAGR

-18.6%

MaxDD

1.27

t-stat

Mechanism

Sustained durable goods strength → capex cycle inflection → industrials rally

Rule

Long XLI 126 days when DGORDER > 105% of trailing 6mo avg for 3 consecutive months
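A minimal sketch of the streak test, assuming a monthly order-level series as a plain list. The trailing 6-month average here excludes the current month, which is an assumption; the function name and data are hypothetical.

```python
def surge_signals(orders, window=6, ratio=1.05, consecutive=3):
    """Indices where orders exceed ratio x trailing average for the `consecutive`-th month in a row."""
    signals = []
    streak = 0
    for i in range(window, len(orders)):
        trailing_avg = sum(orders[i - window:i]) / window   # prior months only
        if orders[i] > ratio * trailing_avg:
            streak += 1
            if streak == consecutive:
                signals.append(i)    # fire once, when the streak first reaches the target
        else:
            streak = 0
    return signals


# Hypothetical monthly order levels (index units, not FRED data).
orders = [100, 100, 100, 100, 100, 100, 106, 108, 110, 112, 100]
print(surge_signals(orders))
# [8]
```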

Caveats

N=2 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 1.27 below conventional significance threshold.

Source: FRED DGORDER; yfinance

backtests/PL59_durable_goods_surge_industrials.py

PL76_yield_curve_uninversion_cyclicals

Yield Curve Un-Inversion → Long XLI+XLF

industrials (XLI) · 3 events

1.22

Sharpe

23.8%

CAGR

-16.1%

MaxDD

1.24

t-stat

Mechanism

Un-inversion signals recession troughing → cyclical recovery

Rule

Long XLI+XLF 252d when T10Y2Y crosses above 0 after 12+ months inverted

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 1.24 below conventional significance threshold.

Source: FRED T10Y2Y; yfinance

backtests/PL76_yield_curve_uninversion_cyclicals.py

PL75_durable_goods_ex_transport_xli

Durable Goods ex-Transport Turn → Long XLI

industrials (XLI) · ~6-month hold · 6 events

1.19

Sharpe

23.2%

CAGR

-18.6%

MaxDD

2.04

t-stat

Mechanism

Core capex orders inflection → manufacturing demand recovery

Rule

Long XLI 126 days when NEWORDER YoY turns positive after 6+ months negative

Caveats

N=6 events. In-position CAGR — portfolio contribution depends on trigger frequency.

Source: FRED NEWORDER; yfinance

backtests/PL75_durable_goods_ex_transport_xli.py

PL38_rrp_drain_equity_liquidity

RRP Drain → Long SPY

equities (SPY) · ~12-month hold · 6 events

1.04

Sharpe

18.9%

CAGR

-18.8%

MaxDD

1.21

t-stat

Mechanism

RRP drain → liquidity flowing back into banking system and risk assets

Rule

Long SPY for 252 days when ON RRP drains below $500B (after being >$1T), or when it is draining >$100B/month

Caveats

N=6 events. In-position CAGR — portfolio contribution depends on trigger frequency. t-stat 1.21 below conventional significance threshold.

Source: FRED RRPONTSYD; yfinance

backtests/PL38_rrp_drain_equity_liquidity.py

PL35_drought_outlook_water_infra

Drought Outlook -> Long Water Infra

cross-asset · 6-month hold · 4 events

0.99

Sharpe

19.5%

CAGR

-21.7%

MaxDD

1.40

t-stat

Mechanism

Drought -> municipal/industrial water infrastructure capex accelerates -> AWK/XYL revenue tailwind

Rule

Long AWK+XYL equal-weight 6mo when NOAA drought outlook flags 3+ Western states

Caveats

N=4 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 1.40 below conventional significance threshold.

Source: NOAA seasonal drought outlook (hand-coded); yfinance AWK, XYL

backtests/PL35_drought_outlook_water_infra.py

PL87_twd_decline_industrial_exporters

TWD Decline → Long Exporters

cross-asset · 6-month hold · 9 events

0.98

Sharpe

26.2%

CAGR

-42.1%

MaxDD

2.09

t-stat

Mechanism

Signal based on Long HON+CAT+DE 126d when DTWEXBGS drops >5% from 6mo peak.

Rule

Long HON+CAT+DE 126d when DTWEXBGS drops >5% from 6mo peak

Caveats

N=9 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (42%) requires position sizing discipline.

Source: FRED DTWEXBGS; yfinance

backtests/PL87_twd_decline_industrial_exporters.py

PL10_class8_truck_replacement

Class 8 Replacement Cycle

truck/transport equities · 12-month hold · 4 events

0.98

Sharpe

23.5%

CAGR

-21.4%

MaxDD

1.97

t-stat

Mechanism

Deferred truck replacement during downturn creates forced replacement cycle 18mo later

Rule

12 months after Class 8 order trough: long PCAR+CMI for 12 months

Caveats

N=4 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-0.0pp) vs benchmark — alpha questionable.

Source: ACT Research Class 8 order data; yfinance

backtests/PL10_class8_truck_replacement.py

PL742_core_capex_orders_automation

Core Capex A34SNO Turn → Long ROK+ETN+AME

capital goods equities · 3-month hold · 8 events

0.98

Sharpe

22.0%

CAGR

-28.3%

MaxDD

1.96

t-stat

Mechanism

Signal based on Long ROK+ETN+AME 126d when A34SNO 3mo MA YoY turns positive after 6+ months negative.

Rule

Long ROK+ETN+AME 126d when A34SNO 3mo MA YoY turns positive after 6+ months negative

Caveats

N=8 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (28%) during adverse regimes. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED A34SNO; yfinance

backtests/PL742_core_capex_orders_automation.py

PL82_mfg_weekly_hours_cyclicals

Mfg Hours → Long XLI

industrials (XLI) · 5 events

0.98

Sharpe

19.2%

CAGR

-18.6%

MaxDD

1.37

t-stat

Mechanism

Signal based on Long XLI 126d when AWHMAN crosses 41.0 after 4+ months below 40.5.

Rule

Long XLI 126d when AWHMAN crosses 41.0 after 4+ months below 40.5

Caveats

N=5 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 1.37 below conventional significance threshold.

Source: FRED AWHMAN; yfinance

backtests/PL82_mfg_weekly_hours_cyclicals.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=186, n_events=3.

PL46_dxy_spike_em_reversal

DXY Spike → Long EEM

emerging markets · ~12-week hold · 3 events

0.97

Sharpe

34.2%

CAGR

-26.4%

MaxDD

0.84

t-stat

Mechanism

EM equities mean-revert after forced-selling exhaustion from dollar strength

Rule

Long EEM for 63 days once the DXY 63d return, after spiking above +8%, rolls back below +4%
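A minimal sketch of the two-stage trigger, assuming a precomputed series of 63-day dollar-index returns as fractions. The function name and data are hypothetical. The signal fires only after a spike has been seen and the return has since fallen back under the lower threshold.

```python
def spike_then_rollback(returns_63d, spike=0.08, rollback=0.04):
    """Indices where the 63d return drops below `rollback` after having exceeded `spike`."""
    signals = []
    spiked = False
    for i, r in enumerate(returns_63d):
        if r > spike:
            spiked = True             # stage 1: dollar spike observed
        elif spiked and r < rollback:
            signals.append(i)         # stage 2: spike has unwound, enter
            spiked = False
    return signals


# Hypothetical 63-day dollar-index returns.
dxy_returns = [0.02, 0.05, 0.09, 0.10, 0.06, 0.03, 0.01, 0.085, 0.05, 0.039]
print(spike_then_rollback(dxy_returns))
# [5, 9]
```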

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (26%) during adverse regimes. Low event count limits statistical confidence. t-stat 0.84 below conventional significance threshold.

Source: FRED DTWEXBGS; yfinance

backtests/PL46_dxy_spike_em_reversal.py

PL57_housing_starts_lumber_lag

Housing Starts → Long WOOD (lagged)

cross-asset · ~6-month hold · 14 events

0.90

Sharpe

17.9%

CAGR

-18.5%

MaxDD

1.96

t-stat

Mechanism

Housing starts surge → lumber demand peaks during framing phase 2-4 months later

Rule

Long WOOD 126 days, entering 63 days after HOUST YoY > +15% for 2 consecutive months

Caveats

N=14 events. In-position CAGR — portfolio contribution depends on trigger frequency. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-2.5pp) vs benchmark — alpha questionable.

Source: FRED HOUST; yfinance

backtests/PL57_housing_starts_lumber_lag.py

PL14_soybean_export_season

Soybean Export Season (Oct-Jan)

soybean futures · 16 events

0.88

Sharpe

16.1%

CAGR

-20.5%

MaxDD

2.05

t-stat

Mechanism

Peak US soybean export season — China/global buyers take new-crop delivery

Rule

Long ZS=F Oct 1 through Jan 31 each year
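A minimal sketch of the calendar filter. The window wraps the year end, so a simple month-range comparison is not enough. The function name is illustrative.

```python
from datetime import date


def in_export_season(day):
    """True from Oct 1 through Jan 31 inclusive (the window spans the year end)."""
    return day.month >= 10 or day.month == 1


print(in_export_season(date(2023, 12, 15)))  # True
print(in_export_season(date(2024, 2, 1)))    # False
```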

Caveats

N=16 events. In-position CAGR — portfolio contribution depends on trigger frequency. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-5.9pp) vs benchmark — alpha questionable.

Source: yfinance ZS=F; USDA export calendar

backtests/PL14_soybean_export_season.py

PL60_used_car_cpi_disinflation_tlt

Used Car CPI Disinflation → Long TLT

bonds (TLT) · ~12-week hold · 10 events

0.86

Sharpe

12.9%

CAGR

-16.6%

MaxDD

1.29

t-stat

Mechanism

Used car deflation pulls headline CPI lower → supports disinflation expectations and bonds

Rule

Long TLT 63 days when CPI used cars YoY drops below -5%

Caveats

N=10 events. In-position CAGR — portfolio contribution depends on trigger frequency. t-stat 1.29 below conventional significance threshold.

Source: FRED CUSR0000SETA02; yfinance

backtests/PL60_used_car_cpi_disinflation_tlt.py

PL30_rig_count_production_lag

Rig Count Lag → Long XOP

oil & gas (XOP) · 6-month hold · 5 events

0.80

Sharpe

27.5%

CAGR

-41.1%

MaxDD

1.26

t-stat

Mechanism

Rig decline → DUC depletion over 6mo → production stalls → supply tightens → oil/E&P rallies

Rule

6mo after rig count falls 20% below its 12mo peak: long XOP for 6 months

Caveats

N=5 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (41%) requires position sizing discipline. Low event count limits statistical confidence. t-stat 1.26 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-4.5pp) vs benchmark — alpha questionable.

Source: Baker Hughes rig count (hand-coded crossing dates); yfinance XOP

backtests/PL30_rig_count_production_lag.py

PL69_capacity_utilization_recovery_xlb

CapUtil Recovery → Long XLB

materials (XLB) · 6-month hold · 4 events

0.79

Sharpe

17.4%

CAGR

-24.5%

MaxDD

1.12

t-stat

Mechanism

Factory restarts → raw material demand recovers → materials stocks benefit

Rule

FRED TCU crosses above 76% after 6+ months below → long XLB 6mo

Caveats

N=4 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 1.12 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-2.4pp) vs benchmark — alpha questionable.

Source: FRED TCU + yfinance

backtests/PL69_capacity_utilization_recovery_xlb.py

PL79_resi_construction_cement

Resi Construction → Long EXP+SUM

cross-asset · 3-month hold · 22 events

0.77

Sharpe

21.4%

CAGR

-38.0%

MaxDD

2.54

t-stat

Mechanism

Signal based on the rule: long EXP+SUM for 126 days when the PRRESCONS 3-month annualized change exceeds +15%.

Rule

Long EXP+SUM 126d when PRRESCONS 3mo ann. > +15%
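A sketch of the growth filter in this rule, assuming "3mo ann." means the 3-month change compounded to an annual rate, i.e. (level_t / level_{t-3})^4 - 1. The function names and the sample levels are hypothetical; real PRRESCONS data would replace them.

```python
def three_month_annualized(levels):
    """3-month change compounded to an annual rate; None until 3 months of history exist."""
    out = [None] * len(levels)
    for i in range(3, len(levels)):
        out[i] = (levels[i] / levels[i - 3]) ** 4 - 1.0
    return out


def signal_months(levels, threshold=0.15):
    """Indices where the annualized 3-month growth exceeds `threshold`."""
    growth = three_month_annualized(levels)
    return [i for i, g in enumerate(growth) if g is not None and g > threshold]


# Made-up monthly spending levels: one 4% step up, then flat.
levels = [100.0, 100.0, 100.0, 104.0, 104.0, 104.0, 104.0]
print(signal_months(levels))  # -> [3, 4, 5]
```

A single step up keeps the condition true for three straight months, so a backtest has to decide how overlapping triggers are counted; the page does not say how its script does this.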

Caveats

N=22 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (38%) during adverse regimes.

Source: FRED PRRESCONS; yfinance · backtests/PL79_resi_construction_cement.py

PL43_lei_inflection_long_spy

LEI Inflection → Long SPY

equities (SPY) · ~12-month hold · 13 events

0.75

Sharpe

12.2%

CAGR

-33.7%

MaxDD

2.70

t-stat

Mechanism

LEI trough signals recession bottom → SPY rallies in early recovery

Rule

Long SPY for 252 days when LEI 6mo RoC turns positive after 6+ negative months

Caveats

N=13 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (34%) during adverse regimes.

Source: FRED USSLIND; yfinance · backtests/PL43_lei_inflection_long_spy.py

PL15_wheat_drought_abandonment

Wheat Drought Abandonment (Mar-May)

wheat futures · 6 events

0.74

Sharpe

22.7%

CAGR

-22.2%

MaxDD

0.92

t-stat

Mechanism

Drought during dormancy → high abandonment (20-35% vs 10% normal) → WASDE cuts production → KC wheat rallies

Rule

Long KE=F Mar 1 - May 31 in years with USDA winter wheat poor/very-poor >35% by March

Caveats

N=6 events. In-position CAGR — portfolio contribution depends on trigger frequency. t-stat 0.92 below conventional significance threshold.

Source: USDA Crop Progress winter wheat conditions; yfinance KE=F · backtests/PL15_wheat_drought_abandonment.py

PL94_umich_sentiment_low_xrt

UMich Sentiment Trough → Long XRT

retail (XRT) · 3 events

0.74

Sharpe

17.9%

CAGR

-15.5%

MaxDD

0.75

t-stat

Mechanism

Signal based on the rule: long XRT for 126 days when UMCSENT falls below 55 and then rises for 2 consecutive months.

Rule

Long XRT 126d when UMCSENT below 55 then rises 2 consecutive months

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 0.75 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED UMCSENT; yfinance · backtests/PL94_umich_sentiment_low_xrt.py

PL34_ism_orders_inventories_spread

ISM Orders-Inventories → Long XLI

industrials (XLI) · ~6-month hold · 10 events

0.72

Sharpe

11.8%

CAGR

-31.3%

MaxDD

1.61

t-stat

Mechanism

Strong orders + lean inventories = restocking cycle imminent → industrials rally

Rule

Long XLI for 126 days when ISM New Orders - Inventories spread crosses above +10

Caveats

N=10 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (31%) during adverse regimes. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: ISM Manufacturing Report (hand-coded); yfinance · backtests/PL34_ism_orders_inventories_spread.py

PL66_sloos_easing_small_cap_iwm

SLOOS Easing → Long IWM

small-cap equities (IWM) · 12-month hold · 5 events

0.70

Sharpe

14.7%

CAGR

-41.1%

MaxDD

1.56

t-stat

Mechanism

Credit cycle trough → small caps recover fastest when banks start lending again

Rule

SLOOS (hand-coded) crosses from tightening to easing after 4+ quarters tight → long IWM 12mo

Caveats

N=5 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (41%) requires position sizing discipline. Low event count limits statistical confidence. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED SLOOS (hand-coded) + yfinance · backtests/PL66_sloos_easing_small_cap_iwm.py

PL23_semi_b2b_specialty_chem

SEMI B2B Inflection → Specialty Chem+Equip

cross-asset · 6-month hold · 4 events

0.69

Sharpe

23.0%

CAGR

-40.6%

MaxDD

0.98

t-stat

Mechanism

B2B inflection signals fab reactivation → wafer starts ramp → consumable chemical demand rises linearly → specialty chem re-rates

Rule

Long ENTG+AMAT+LRCX equal-weight 6mo when SEMI B2B crosses 1.05 from trough

Caveats

N=4 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (41%) requires position sizing discipline. Low event count limits statistical confidence. t-stat 0.98 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: SEMI B2B press releases (hand-coded inflection dates); yfinance · backtests/PL23_semi_b2b_specialty_chem.py

PL64_bank_credit_reacceleration_xlf

Bank Credit Reacceleration → Long XLF

financials (XLF) · 12-month hold · 3 events

0.60

Sharpe

11.3%

CAGR

-33.0%

MaxDD

1.03

t-stat

Mechanism

Credit expansion = banks lending again → NIM improves → loan growth drives net interest income → banks re-rate

Rule

FRED TOTBKCR YoY crosses above +2% after 6+ months below → long XLF 12 months

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (33%) during adverse regimes. Low event count limits statistical confidence. t-stat 1.03 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED TOTBKCR (H.8 bank credit) + yfinance · backtests/PL64_bank_credit_reacceleration_xlf.py

PL54_bankruptcy_surge_restructuring

Bankruptcy Surge → Long Restructuring

restructuring equities · ~12-month hold · 5 events

0.58

Sharpe

17.5%

CAGR

-50.9%

MaxDD

1.23

t-stat

Mechanism

Bankruptcy surge → restructuring advisory fee boom with 2-3 quarter lag

Rule

Long HLI+LAZ+PJT for 252 days during bankruptcy surge periods

Caveats

N=5 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (51%) requires position sizing discipline. Low event count limits statistical confidence. t-stat 1.23 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-3.9pp) vs benchmark — alpha questionable.

Source: ABI/US Courts data (hand-coded surge periods); yfinance · backtests/PL54_bankruptcy_surge_restructuring.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=96, n_events=3.
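The Benjamini-Hochberg step named in the winner gate can be sketched as follows. It assumes each strategy has a p-value available (the page reports t-stats, not p-values), and the function name and the sample p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per hypothesis: True if it survives BH at level `fdr`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= k * fdr / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            cutoff_rank = rank
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed


# Made-up p-values for five strategies.
print(benjamini_hochberg([0.001, 0.20, 0.012, 0.045, 0.5]))
# -> [True, False, True, False, False]
```

Because the threshold scales with rank over the number of tests, a p-value of 0.045 that would pass a single test at 0.05 fails here once five strategies are tested together.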

PL44_corn_planting_delay_long

Corn Planting Delay → Long ZC=F

corn futures · 3 events

0.56

Sharpe

13.3%

CAGR

-15.4%

MaxDD

0.35

t-stat

Mechanism

Late planting → reduced acreage + lower yields → USDA cuts production estimates → corn prices rally into WASDE

Rule

Long ZC=F May 15 - Jun 30 in years when planting >15pp behind avg by May 15

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Low event count limits statistical confidence. t-stat 0.35 below conventional significance threshold.

Source: USDA Crop Progress (hand-coded delay years); yfinance ZC=F · backtests/PL44_corn_planting_delay_long.py

PL70_new_home_supply_spike_delayed_long

New Home Supply Spike → Delayed Long Builders

homebuilders · 12-month hold · 3 events

0.55

Sharpe

16.1%

CAGR

-56.0%

MaxDD

0.95

t-stat

Mechanism

Housing glut → supply discipline → 12mo later inventory clears → builder pricing power returns

Rule

FRED MNEWSRSA crosses above 8.0 → wait 12 months → long XHB+LEN+DHI 12mo

Caveats

N=3 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (56%) requires position sizing discipline. Low event count limits statistical confidence. t-stat 0.95 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-5.4pp) vs benchmark — alpha questionable.

Source: FRED MNEWSRSA + yfinance · backtests/PL70_new_home_supply_spike_delayed_long.py

PL31_auto_inventory_trough_dealers

Auto Inventory Trough → Long Dealers

cross-asset · ~6-month hold · 15 events

0.54

Sharpe

14.1%

CAGR

-57.9%

MaxDD

1.48

t-stat

Mechanism

Extreme scarcity → maximum dealer pricing power → margin expansion not yet in earnings

Rule

Long AN+PAG+LAD equal-weight for 126 days when AISRSA hits local minimum below 0.40
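A sketch of the "local minimum below 0.40" test, assuming a local minimum means a reading strictly lower than both neighbours. The function name and the sample ratios are hypothetical. Note that a trough at month i is only confirmed once month i + 1 is published, so a live signal could not fire before then.

```python
def local_minima_below(values, ceiling):
    """Indices of readings below `ceiling` that are strictly lower than both neighbours."""
    hits = []
    for i in range(1, len(values) - 1):
        is_trough = values[i] < values[i - 1] and values[i] < values[i + 1]
        if is_trough and values[i] < ceiling:
            hits.append(i)
    return hits


# Made-up inventory-to-sales ratios, for illustration only.
ratio = [0.55, 0.45, 0.38, 0.42, 0.39, 0.35, 0.36, 0.50, 0.41, 0.44]
print(local_minima_below(ratio, ceiling=0.40))  # -> [2, 5]
```

The reading of 0.41 at index 8 is a trough but sits above the ceiling, so it is skipped.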

Caveats

N=15 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (58%) requires position sizing discipline. t-stat 1.48 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-1.3pp) vs benchmark — alpha questionable.

Source: FRED AISRSA; yfinance · backtests/PL31_auto_inventory_trough_dealers.py

PL90_henry_hub_below_cash_cost_producers

HH Below Cash Cost → Long NatGas Producers

natural gas · 6 events

0.54

Sharpe

15.8%

CAGR

-76.2%

MaxDD

1.20

t-stat

Mechanism

Signal based on the rule: long EQT+AR+RRC for 252 days when MHHNGSP is below $2.50.

Rule

Long EQT+AR+RRC 252d when MHHNGSP < $2.50

Caveats

N=6 events. In-position CAGR — portfolio contribution depends on trigger frequency. Deep drawdown (76%) requires position sizing discipline. t-stat 1.20 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window.

Source: FRED MHHNGSP; yfinance · backtests/PL90_henry_hub_below_cash_cost_producers.py

PL81_jolts_quits_staffing

JOLTS Quits → Long RHI

cross-asset · 6-month hold · 5 events

0.53

Sharpe

13.1%

CAGR

-39.7%

MaxDD

0.84

t-stat

Mechanism

Signal based on the rule: long RHI for 126 days when JTSQUR rises more than 0.3pp above its 6-month low.

Rule

Long RHI 126d when JTSQUR > 6mo low + 0.3pp

Caveats

N=5 events. In-position CAGR — portfolio contribution depends on trigger frequency. Significant drawdown (40%) during adverse regimes. Low event count limits statistical confidence. t-stat 0.84 below conventional significance threshold. Underperforms benchmark on risk-adjusted basis in this sample window. Negative excess CAGR (-19.5pp) vs benchmark — alpha questionable.

Source: FRED JTSQUR; yfinance · backtests/PL81_jolts_quits_staffing.py

PL107_semi_ppi_deflation_trough

Semiconductor PPI Deflation Trough → Long SMH

macro · ~6-month hold · 10 events

0.84

Sharpe

17.8%

CAGR

-23.7%

MaxDD

1.69

t-stat

Mechanism

Semiconductor PPI YoY turning positive after 6+ months of deflation signals the chip cycle trough. Chip designers see ASP recovery = margin inflection.

Rule

Long SMH 126d when PCU33443344 YoY turns positive after 6+ months negative
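The "YoY turns positive after 6+ months negative" pattern can be sketched on a monthly index as below. The function names are hypothetical, the sample levels are made up, and treating an exactly flat YoY reading as breaking the negative streak is an assumption.

```python
def yoy_change(levels, lag=12):
    """Year-over-year change for monthly levels; None until `lag` months of history exist."""
    out = []
    for i, level in enumerate(levels):
        out.append(None if i < lag else level / levels[i - lag] - 1.0)
    return out


def turns_positive_after_negative(yoy, min_negative=6):
    """Indices where YoY turns positive after at least `min_negative` negative months."""
    triggers = []
    negative_run = 0
    for i, change in enumerate(yoy):
        if change is None:
            continue
        if change > 0:
            if negative_run >= min_negative:
                triggers.append(i)
            negative_run = 0
        elif change < 0:
            negative_run += 1
        else:
            negative_run = 0  # flat reading breaks the streak (assumption)
    return triggers


# Made-up index: flat for a year, six months 5% lower, then a rebound.
levels = [100.0] * 12 + [95.0] * 6 + [106.0]
print(turns_positive_after_negative(yoy_change(levels)))  # -> [18]
```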

Caveats

N=10 events. In-position CAGR. t-stat 1.69 approaching but below 2.0. Cyclical signal tied to semiconductor capex cycle timing.

Source: FRED PCU33443344; yfinance · backtests/PL107_semi_ppi_deflation_trough.py

PL112_vehicle_age_auto_parts

Vehicle Sales Drought → Long Auto Parts (AZO+ORLY)

macro · ~12-month hold · 2 events

0.92

Sharpe

19.2%

CAGR

-16.4%

MaxDD

1.29

t-stat

Mechanism

Prolonged vehicle sales drought (SAAR < 14M for 6+ months) ages the fleet, driving replacement parts demand to AZO/ORLY. Fewer new cars = more repairs on existing fleet.

Rule

Long AZO+ORLY 252d when TOTALSA 12mo avg < 14M SAAR for 6 months

Caveats

N=2 events only (GFC, COVID chip shortage). Strong per-event returns but very low statistical power. Needs more out-of-sample confirmation.

Source: FRED TOTALSA; yfinance · backtests/PL112_vehicle_age_auto_parts.py

PL113_motor_vehicle_ip_suppliers

Motor Vehicle IP Recovery → Long OEM Suppliers (BWA+LEA)

macro · ~6-month hold · 8 events

1.07

Sharpe

30.9%

CAGR

-39.4%

MaxDD

2.10

t-stat

Mechanism

Motor vehicle industrial production YoY turning positive after 6+ months of decline signals production recovery. OEM suppliers (BorgWarner, Lear) see volume recovery with operating leverage.

Rule

Long BWA+LEA 126d when IPG3361T3S YoY turns positive after 6+ months negative

Caveats

N=8 events. t-stat 2.10 is statistically significant. 39% max drawdown requires position sizing. Cyclical signal fires at production troughs.

Source: FRED IPG3361T3S; yfinance · backtests/PL113_motor_vehicle_ip_suppliers.py

PL114_cre_delinquency_trough_reits

CRE Delinquency Trough → Long Office/Retail REITs (BXP+SPG)

macro · ~12-month hold · 11 events

0.77

Sharpe

17.8%

CAGR

-32.7%

MaxDD

2.42

t-stat

Mechanism

CRE loan delinquency rate peaking and declining for 2 consecutive quarters signals credit improvement. Office/retail REITs recover as refinancing risk ebbs.

Rule

Long BXP+SPG 252d when DRCLACBS peaks then declines 2 quarters

Caveats

N=11 events. t-stat 2.42 is statistically significant. Quarterly data = slow signal. CRE cycle specific — structural shifts (WFH) may impair future applicability for office.

Source: FRED DRCLACBS; yfinance · backtests/PL114_cre_delinquency_trough_reits.py

PL116_nonres_construction_industrial_reits

Nonres Construction Inflection → Long PLD

macro · ~12-month hold · 3 events

0.57

Sharpe

12.0%

CAGR

-37.3%

MaxDD

0.98

t-stat

Mechanism

Nonresidential construction spending YoY turning positive after prolonged decline signals warehouse/industrial demand recovery. PLD benefits as the largest industrial REIT.

Rule

Long PLD 252d when TLNRESCONS YoY turns positive after 6+ months negative

Caveats

N=3 events. Low statistical power. t-stat 0.98 below significance. Deep drawdown (37%) during adverse regimes.

Source: FRED TLNRESCONS; yfinance · backtests/PL116_nonres_construction_industrial_reits.py

PL117_michigan_expectations_luxury

Michigan Expectations Trough → Long Luxury Retail (RL+TPR)

macro · ~6-month hold · 63 events

0.58

Sharpe

15.9%

CAGR

-67.6%

MaxDD

2.66

t-stat

Mechanism

Michigan Consumer Expectations trough below 60 then bouncing for 3 months signals consumer confidence recovery. Luxury retail (Ralph Lauren, Tapestry) benefits from discretionary spending rebound.

Rule

Long RL+TPR 126d when MICH hits trough below 60 then bounces 3 months

Caveats

N=63 events (high statistical power). t-stat 2.66 is significant. However, deep drawdown (68%) and many overlapping signals suggest regime sensitivity. Position sizing critical.

Source: FRED MICH; yfinance · backtests/PL117_michigan_expectations_luxury.py

PL119_freight_rate_collapse_retailers

Import Price Collapse → Long COST+TGT

macro · ~6-month hold · 5 events

1.10

Sharpe

37.1%

CAGR

-32.8%

MaxDD

1.72

t-stat

Mechanism

Import price index declining >5% YoY for 3+ months signals falling input costs for import-dependent retailers. COST and TGT see margin expansion as COGS decline before shelf prices adjust.

Rule

Long COST+TGT 126d when IR (import prices) YoY < -5% for 3 months

Caveats

N=5 events. Strong Sharpe (1.10) and CAGR (37%). t-stat 1.72 approaching significance. Import price deflation episodes are infrequent but powerful for retailers.

Source: FRED IR; yfinance · backtests/PL119_freight_rate_collapse_retailers.py

PL122_hdd_spike_utilities

Natural Gas Price Spike → Long XLU

macro · ~6-month hold · 16 events

0.69

Sharpe

11.4%

CAGR

-23.9%

MaxDD

1.83

t-stat

Mechanism

Henry Hub crossing above $4 from below signals elevated energy costs. Regulated utilities pass through fuel costs to ratepayers with rate adjustments, benefiting from higher revenue base.

Rule

Long XLU 126d when MHHNGSP crosses above $4 from below

Caveats

N=16 events. t-stat 1.83 approaching significance. Moderate drawdown (24%). Utility rate adjustment lag means benefit is not immediate.

Source: FRED MHHNGSP; yfinance · backtests/PL122_hdd_spike_utilities.py

PL124_defense_spending_primes

Federal Defense Spending Acceleration → Long LMT+RTX+NOC

macro · ~12-month hold · 2 events

0.79

Sharpe

18.9%

CAGR

-28.3%

MaxDD

1.12

t-stat

Mechanism

Federal defense spending YoY >+5% for 2 consecutive quarters signals sustained budget growth. Defense primes (LMT, RTX, NOC) see backlog and revenue acceleration with multi-year contract visibility.

Rule

Long LMT+RTX+NOC 252d when FDEFX YoY > +5% for 2 quarters

Caveats

N=2 events (limited history of sustained acceleration). Strong per-event performance but very low statistical power. Government spending is lumpy and politically driven.

Source: FRED FDEFX; yfinance · backtests/PL124_defense_spending_primes.py

PL126_fed_nondefense_invest_it

Federal Nondefense Investment → Long ACN+LDOS

macro · ~12-month hold · 2 events

0.87

Sharpe

14.3%

CAGR

-24.2%

MaxDD

1.22

t-stat

Mechanism

Federal nondefense investment spending accelerating >+5% YoY signals IT modernization and consulting demand. ACN (Accenture) and LDOS (Leidos) capture federal IT/consulting contract flow.

Rule

Long ACN+LDOS 252d when A782RX1Q020SBEA YoY > +5% for 2 quarters

Caveats

N=2 events only. Limited statistical power despite strong risk-adjusted returns. Quarterly data means slow signal generation.

Source: FRED A782RX1Q020SBEA; yfinance · backtests/PL126_fed_nondefense_invest_it.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=249, n_events=2.

PL127_nfci_easing_small_cap_value

NFCI Easing From Tight → Long Small-Cap Value (IWN)

macro · ~12-month hold · 2 events

0.93

Sharpe

23.2%

CAGR

-22.0%

MaxDD

0.93

t-stat

Mechanism

Chicago Fed NFCI dropping below 0 after sustained tightening (>+0.5 for 3+ months) signals credit conditions easing. Small-cap value (IWN) is the most credit-sensitive equity segment and recovers aggressively.

Rule

Long IWN 252d when NFCI drops below 0 after 3+ months above +0.5

Caveats

N=2 events (GFC recovery, post-COVID). Excellent risk-adjusted returns but minimal statistical confidence. Rare signal by design.

Source: FRED NFCI; yfinance · backtests/PL127_nfci_easing_small_cap_value.py

PL128_m2_velocity_commodities

M2 Velocity Inflection → Long Commodities (DJP)

macro · ~12-month hold · 5 events

1.34

Sharpe

24.9%

CAGR

-16.6%

MaxDD

2.30

t-stat

Mechanism

M2 velocity (quarterly) YoY turning positive after sustained decline signals money flowing back into the real economy. Commodity demand accelerates as velocity recovers from monetary overhang.

Rule

Long DJP 252d when M2V YoY turns positive after 4+ quarters decline

Caveats

N=5 events. t-stat 2.30 is statistically significant. Strong Sharpe (1.34) with contained drawdown (17%). Quarterly data = slow signal but high quality. Velocity regime changes are rare and powerful.

Source: FRED M2V; yfinance · backtests/PL128_m2_velocity_commodities.py

PL129_term_premium_positive_xlf

Term Premium Turning Positive → Long Financials (XLF)

macro · ~12-month hold · 12 events

1.06

Sharpe

18.2%

CAGR

-25.8%

MaxDD

2.62

t-stat

Mechanism

10Y term premium (ACM model) crossing positive after 6+ months below zero signals the yield curve is steepening via higher long-end compensation. Banks (XLF) benefit from wider NIM as the term structure normalizes.

Rule

Long XLF 252d when THREEFYTP10 crosses above 0 after 6+ months below

Caveats

N=12 events. t-stat 2.62 is statistically significant. Sharpe 1.06 with contained drawdown. Daily FRED data from 1961 provides long history. Strongest macro signal in this batch.

Source: FRED THREEFYTP10; yfinance · backtests/PL129_term_premium_positive_xlf.py

PL132_bank_reserves_surge_qqq

Bank Reserves Surge → Long QQQ

macro · ~6-month hold · 7 events

0.79

Sharpe

21.5%

CAGR

-46.3%

MaxDD

1.47

t-stat

Mechanism

Fed reserve injections (TOTRESNS surging >15% in 3 months) signal liquidity flooding the banking system. Excess reserves flow into risk assets, with QQQ as the highest-beta large-cap beneficiary.

Rule

Long QQQ 126d when TOTRESNS 3-month rolling change > +15%

Caveats

N=7 events. t-stat 1.47 below conventional significance. Deep drawdown (46%) during volatile liquidity-injection episodes. Post-GFC/COVID phenomenon — limited pre-2008 history at this scale.

Source: FRED TOTRESNS; yfinance · backtests/PL132_bank_reserves_surge_qqq.py

PL135_personal_income_acceleration_xly

Personal Income YoY Reaccelerates > +5% → Long XLY

macro · ~6-month hold · 7 events

1.36

Sharpe

25.6%

CAGR

-22.1%

MaxDD

2.55

t-stat

Mechanism

Personal income YoY crossing above +5% after 6+ months below signals income acceleration. Rising income leads consumer spending by 1–2 months, boosting consumer discretionary stocks (XLY). 6/7 events positive, avg +12.7% per trade.

Rule

Long XLY 126d when FRED PI YoY crosses above +5% after 6+ months below +5%

Caveats

N=7 events (1999–2020). t-stat 2.55 is statistically significant. Excess vs SPY only +1.8% avg — XLY moves directionally with SPY. 2020 COVID stimulus event (PI YoY +13.6%) is an outlier that inflates returns. Remove that event and check robustness.

Source: FRED PI; yfinance · backtests/PL135_personal_income_acceleration_xly.py

PL136_corporate_profits_recovery_spy

Corporate Profits YoY Turns Positive After 2Q Negative → Long SPY

macro · ~12-month hold · 7 events

0.75

Sharpe

11.3%

CAGR

-22.9%

MaxDD

1.99

t-stat

Mechanism

Corporate profit recovery (FRED CP YoY turning positive after 2+ quarters negative) signals cyclical expansion. Earnings growth drives equity re-rating. 6/7 events positive, avg +12.4% per trade.

Rule

Long SPY 252d when FRED CP YoY turns positive after 2+ quarters of negative YoY growth

Caveats

N=7 events (1999–2020). t-stat 1.99 borderline significant. CP data has ~2-month publication lag. 2020 event (COVID recovery, +32.6%) is an outlier. No excess return metric since benchmark IS SPY — the edge is timing vs buy-and-hold.

Source: FRED CP; yfinance · backtests/PL136_corporate_profits_recovery_spy.py

PL138_real_disposable_income_negative_xlp

Real Disposable Income YoY Turns Negative → Long XLP

macro · ~12-month hold · 4 events

1.37

Sharpe

16.0%

CAGR

-8.0%

MaxDD

2.74

t-stat

Mechanism

Real disposable income turning negative (FRED DSPIC96 YoY < 0 after 12+ months positive) signals an income squeeze. Consumers trade down to staples, making XLP a relative outperformer. 4/4 events positive, avg +16.0%.

Rule

Long XLP 252d when FRED DSPIC96 YoY turns negative after 12+ months of positive YoY

Caveats

N=4 events only (2005, 2009, 2013, 2021). Perfect win rate but very small sample. Excess vs SPY only +0.7% avg — XLP rose but SPY rose nearly as much in most events. 2021 event (post-COVID stimulus distortion) may not be representative.

Source: FRED DSPIC96; yfinance · backtests/PL138_real_disposable_income_negative_xlp.py

PL140_mortgage_application_surge_homebuilders

Mortgage Rate Drop > 50bps → Long Homebuilders (1mo lag)

macro · ~6-month hold · 19 events

0.89

Sharpe

33.2%

CAGR

-60.2%

MaxDD

2.73

t-stat

Mechanism

Mortgage rate dropping >50bps from 13-week high proxies an application surge. With a 1-month entry lag (applications-to-closings pipeline), long LEN+DHI+NVR captures the homebuilder revenue acceleration. 13/19 events positive, avg +18.1%, avg excess +12.6% vs SPY.

Rule

Long LEN+DHI+NVR equal-weight 126d, entered 21 trading days after MORTGAGE30US drops >50bps from 13-week high
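A sketch of the "drops more than 50bps from the 13-week high" trigger on weekly rates. It assumes the trailing window includes the current week and that a new signal requires the condition to have lapsed first; both choices, the function name, and the sample rates are hypothetical. The 21-trading-day entry lag would be applied afterwards on the daily price calendar and is not shown.

```python
def drop_from_high_signals(weekly_rates, window=13, drop_pct_points=0.50):
    """Indices where the rate first falls more than `drop_pct_points` below
    the highest reading of the trailing `window` weeks (current week included)."""
    signals = []
    armed = True  # re-arms only after the condition stops holding
    for i in range(window - 1, len(weekly_rates)):
        high = max(weekly_rates[i - window + 1 : i + 1])
        triggered = high - weekly_rates[i] > drop_pct_points
        if triggered and armed:
            signals.append(i)
        armed = not triggered
    return signals


# Made-up weekly 30-year mortgage rates in percent.
rates = [7.0] * 13 + [6.8, 6.6, 6.4, 6.3, 6.9, 6.2]
print(drop_from_high_signals(rates))  # -> [15, 18]
```

Week 16 is still more than 50bps under the high but does not fire again, because the condition never lapsed after week 15.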

Caveats

N=19 events (1997–2025). t-stat 2.73 is significant. Deep max drawdown (-60.2%) driven by 2008 GFC event. Excluding GFC, metrics improve substantially. Recent events (2023–2025) show losses, suggesting post-COVID regime change. The 1-month entry lag is essential to the thesis.

Source: FRED MORTGAGE30US; yfinance · backtests/PL140_mortgage_application_surge_homebuilders.py

PL142_sloos_credit_card_willingness_xly

SLOOS Credit Card Tightening Crosses Zero → Long XLY

macro · ~3-month hold · 6 events

0.65

Sharpe

13.0%

CAGR

-16.9%

MaxDD

0.80

t-stat

Mechanism

SLOOS credit card tightening (FRED DRTSCLCC) crossing from negative to zero signals banks easing consumer credit standards. More available credit fuels discretionary spending, benefiting XLY. 5/6 events positive, avg +3.4%.

Rule

Long XLY 63d when FRED DRTSCLCC crosses from negative (tightening) to zero or positive (easing)

Caveats

N=6 events only (2000–2022). t-stat 0.80 is below statistical significance. Quarterly SLOOS data means very few events. Avg excess vs SPY +1.9% — modest edge. 2007 event was a loss during early GFC. Small sample makes this suggestive rather than conclusive.

Source: FRED DRTSCLCC; yfinance · backtests/PL142_sloos_credit_card_willingness_xly.py

PL145_dgorder_acceleration_semi_equipment

Durable Goods 3mo MoM Acceleration → Long AMAT+LRCX+KLAC

macro · ~3-month hold · 5 events

1.43

Sharpe

66.5%

CAGR

-35.7%

MaxDD

1.60

t-stat

Mechanism

Three consecutive months of positive AND increasing durable goods MoM (FRED DGORDER) signals capex cycle inflection. Semi equipment makers (AMAT, LRCX, KLAC) are highest-beta beneficiaries. 3/5 events positive, avg +16.9%, avg excess +10.1% vs SPY.

Rule

Long AMAT+LRCX+KLAC equal-weight 63d when FRED DGORDER MoM is positive and increasing for 3 consecutive months
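One reading of "positive and increasing for 3 consecutive months" is that all three MoM changes are above zero and each is larger than the one before it. The sketch below uses that interpretation, which is an assumption, as are the function names and the sample values.

```python
def mom_changes(levels):
    """Month-over-month change; the first entry is None (no prior month)."""
    return [None] + [levels[i] / levels[i - 1] - 1.0 for i in range(1, len(levels))]


def accelerating_streak(mom, months=3):
    """Indices ending a run of `months` readings that are all positive
    and strictly increasing from one month to the next."""
    hits = []
    for i in range(months - 1, len(mom)):
        window = mom[i - months + 1 : i + 1]
        if any(m is None for m in window):
            continue
        positive = all(m > 0 for m in window)
        increasing = all(b > a for a, b in zip(window, window[1:]))
        if positive and increasing:
            hits.append(i)
    return hits


# Made-up MoM changes in durable goods orders.
mom = [None, 0.002, 0.005, 0.009, 0.012, 0.004, -0.001, 0.003]
print(accelerating_streak(mom))  # -> [3, 4]
```

Requiring both conditions is strict, consistent with the caveat that this filter fires rarely.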

Caveats

N=5 events only (2005–2025). t-stat 1.60 below conventional significance. CAGR heavily inflated by 2020 event (+59.5%). DGORDER includes volatile Boeing orders — consider ex-transport (NEWORDER) for cleaner signal. The strict 3-month acceleration filter fires rarely.

Source: FRED DGORDER; yfinance · backtests/PL145_dgorder_acceleration_semi_equipment.py

PL147_permit1_acceleration_supplier_stocks

Single-Family Permits YoY Improving 3mo → Long SHW+VMC+MLM

macro · ~3-month hold · 26 events

0.68

Sharpe

15.3%

CAGR

-41.8%

MaxDD

1.63

t-stat

Mechanism

Single-family permits (FRED PERMIT1) second derivative turning positive for 3 months signals housing trough. Building material suppliers (SHW paint, VMC/MLM aggregates) benefit first as materials are ordered before construction starts. 15/26 events positive, avg +3.3%.

Rule

Long SHW+VMC+MLM equal-weight 63d when FRED PERMIT1 YoY improves for 3 consecutive months (second derivative positive)

Caveats

N=26 events (1991–2023) — good sample size. t-stat 1.63 borderline. Deep drawdown (-41.8%) from 2019 event hitting COVID crash. Second-derivative signal is noisy and fires frequently. Early events (pre-1994) used limited ticker universe. Consider adding BLDR from 2016 onward.

Source: FRED PERMIT1; yfinance · backtests/PL147_permit1_acceleration_supplier_stocks.py

PL172_nfci_easing_streak_high_beta

NFCI Easing Streak 8+ Weeks → Long High-Beta Equities

macro · ~3-month hold · 15 events

1.71

Sharpe

36.3%

CAGR

-13.6%

MaxDD

1.91

t-stat

Mechanism

Chicago Fed NFCI staying negative (easing) for 8+ consecutive weeks after tightening signals financial conditions are loosening. High-beta equities outperform as credit/liquidity conditions improve. 5/15 events positive with strong avg returns.

Rule

Long high-beta equity basket 63d when FRED NFCI stays negative for 8 consecutive weeks following a positive reading
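A literal reading of this rule can be sketched as a streak counter: fire in the week the index completes its eighth consecutive negative reading, provided the streak began right after a positive reading. This is only an illustration of the rule as written; the caveats note the backtest itself used a generic z-score implementation. The function name and the sample weekly values are hypothetical.

```python
def easing_streak_signals(nfci, weeks=8):
    """Indices where a run of negative readings reaches `weeks` in length,
    counting only runs that started immediately after a positive reading."""
    signals = []
    run = 0
    started_after_positive = False
    for i, value in enumerate(nfci):
        if value < 0:
            if run == 0:
                started_after_positive = i > 0 and nfci[i - 1] > 0
            run += 1
            if run == weeks and started_after_positive:
                signals.append(i)  # fires once per streak, in the completing week
        else:
            run = 0
    return signals


# Made-up weekly NFCI readings.
nfci = [0.3] + [-0.1] * 8 + [-0.2] + [0.1] + [-0.1] * 5 + [0.2]
print(easing_streak_signals(nfci))  # -> [8]
```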

Caveats

N=15 events. t-stat 1.91 borderline significant. Generic z-score implementation — a bespoke version matching exact rule may differ. NFCI weekly data from 1971 provides long history. Contained drawdown (-13.6%) is notable.

Source: FRED NFCI; yfinance · backtests/PL172_nfci_easing_streak_high_beta.py

PL212_rsxfs_surge_payment_processors

Retail Sales Control Group Surge → Long Payment Processors

macro · ~2-month hold · 24 events

1.52

Sharpe

44.3%

CAGR

-25.8%

MaxDD

3.05

t-stat

Mechanism

FRED RSXFS (retail sales control group) surging >0.5% MoM for 2 consecutive months signals consumer spending acceleration. Payment processors (V, MA) capture transaction volume growth with operating leverage. t-stat 3.05 is highly significant.

Rule

Long V+MA equal-weight 42d when FRED RSXFS MoM > +0.5% for 2 consecutive months

Caveats

N=24 events with t-stat 3.05 — strongest statistical result in this batch. SQ data spotty/delisted. V and MA are mega-cap, highly liquid. MaxDD -25.8% during COVID. Generic backtester implementation; bespoke version may differ slightly.

Source: FRED RSXFS; yfinance · backtests/PL212_rsxfs_surge_payment_processors.py

PL205_cass_freight_inflection_ltl_truckers

Cass Freight Index YoY Inflection → Long LTL Truckers

freight · ~2-month hold · 20 events

1.24

Sharpe

42.2%

CAGR

-35.5%

MaxDD

2.78

t-stat

Mechanism

Cass Freight Index shipments YoY rate of change improving for 3 consecutive months signals freight cycle trough. LTL truckers (ODFL, SAIA, XPO) have highest operating leverage to volume recovery. 16/20 events positive.

Rule

Long ODFL+SAIA+XPO equal-weight 42d when FRED freight shipments YoY improves 3 consecutive months

Caveats

N=20 events, t-stat 2.78 is significant. Strong win rate (16/20). MaxDD -35.5% during deep downturns. Uses TSIFRGHT proxy on FRED. LTL stocks are high-beta industrials — part of return may be general market beta rather than alpha.

Source: FRED TSIFRGHT; yfinance · backtests/PL205_cass_freight_inflection_ltl_truckers.py

PL209_ev_penetration_new_high_lithium_miners

EV Adoption Proxy New High → Long Lithium Miners

thematic · ~3-month hold · 14 events

0.92

Sharpe

29.6%

CAGR

-57.0%

MaxDD

3.43

t-stat

Mechanism

Using FRED TOTALSA (total vehicle sales) as EV adoption proxy. When the metric crosses a new high, long TSLA+ALB+SQM. t-stat 3.43 is the highest in this batch. 13/14 events positive, avg +29.6%.

Rule

Long TSLA+ALB+SQM 60d when FRED TOTALSA proxy signals new quarterly high in EV adoption

Caveats

N=14 events, t-stat 3.43 highly significant. Deep max drawdown (-57.0%) from TSLA/lithium volatility. The TOTALSA proxy is crude — actual EV penetration data would sharpen signal. Strong TSLA weighting dominates basket returns. Reverse causation risk.

Source: FRED TOTALSA; yfinance · backtests/PL209_ev_penetration_new_high_lithium_miners.py

PL193_altsales_17m_surge_dealer_groups

Light Vehicle Sales > 17M SAAR → Long Auto Dealers

macro · ~6-month hold · 29 events

0.80

Sharpe

30.0%

CAGR

-77.5%

MaxDD

2.15

t-stat

Mechanism

FRED ALTSALES crossing above 17M SAAR after 3+ months sub-16M signals auto demand recovery. Dealer groups (AN, PAG, LAD) benefit from volume and margin expansion. 18/29 events positive, avg +30.0%.

Rule

Long AN+PAG+LAD equal-weight 126d when FRED ALTSALES crosses above 17M SAAR after 3+ months below 16M

Caveats

N=29 events, t-stat 2.15 significant. Very deep max drawdown (-77.5%) from GFC/COVID events. Generic backtester z-score implementation rather than bespoke 17M threshold. Auto dealer stocks are high-beta cyclicals. Recent EV transition may change dynamics.

Source: FRED ALTSALES; yfinance · backtests/PL193_altsales_17m_surge_dealer_groups.py

PL160_revolsl_deceleration_xlp_defensive

Revolving Credit Deceleration → Long XLP Defensive

macro · ~6-month hold · 23 events

0.86

Sharpe

16.7%

CAGR

-28.9%

MaxDD

2.03

t-stat

Mechanism

Revolving consumer credit (FRED REVOLSL) YoY decelerating by >5pp over 2 months signals consumer credit stress ahead. Defensive rotation into XLP (consumer staples) outperforms. 17/23 events positive.

Rule

Long XLP 126d when FRED REVOLSL YoY decelerates by >5pp over 2 months

Caveats

N=23 events, t-stat 2.03 significant. Good sample size with 74% win rate. MaxDD -28.9% manageable for a defensive strategy. XLP is a low-beta defensive play — excess return vs SPY is the key metric.

Source: FRED REVOLSL; yfinance · backtests/PL160_revolsl_deceleration_xlp_defensive.py

PL194_indpro_yoy_cross_zero_xlb_long

Industrial Production YoY Crosses Zero → Long Materials (XLB)

macro · ~3-month hold · 25 events

0.76

Sharpe

17.8%

CAGR

-34.4%

MaxDD

1.86

t-stat

Mechanism

Industrial production (FRED INDPRO) YoY crossing above zero after 6+ months of contraction signals manufacturing recovery. Materials sector (XLB) benefits from volume recovery and restocking. 17/25 events positive.

Rule

Long XLB 63d when FRED INDPRO YoY crosses above 0% after 6+ months negative

Caveats

N=25 events, t-stat 1.86 borderline significant. Large sample with data back to 1919. MaxDD -34.4% during recession events where recovery was false start. Simple, well-known macro signal — low originality but robust.

Source: FRED INDPRO; yfinance · backtests/PL194_indpro_yoy_cross_zero_xlb_long.py

PL226_reserves_scarcity_dealer_repo_profit

Bank Reserves Decline to Scarcity → Long Dealer Banks

monetary · ~2-month hold · 17 events

1.10

Sharpe

28.6%

CAGR

-24.7%

MaxDD

2.27

t-stat

Mechanism

FRED WRESBAL (reserve balances) declining toward scarcity threshold signals repo market stress. Primary dealer banks (GS, MS, JPM) profit from wider repo spreads in tight reserve environments. 14/17 events positive.

Rule

Long GS+MS+JPM equal-weight 42d when FRED WRESBAL reserve balances approach scarcity threshold (~$3T)

Caveats

N=17 events, t-stat 2.27 significant. Generic z-score implementation of reserve scarcity. Dealer bank profitability from repo is only one driver — other macro factors may dominate. Post-2008 regime with large balance sheets makes historical comparison difficult.

Source: FRED WRESBAL; yfinance · backtests/PL226_reserves_scarcity_dealer_repo_profit.py

PL233_business_applications_surge_small_cap

High-Propensity Business Applications Surge → Long Small-Cap

macro · ~3-month hold · 7 events

0.97

Sharpe

20.0%

CAGR

-28.7%

MaxDD

2.57

t-stat

Mechanism

FRED high-propensity business applications surging >15% YoY for 3 consecutive months signals entrepreneurial confidence and new business formation. Small-cap stocks (IWM) benefit from the same economic optimism. 6/7 events positive.

Rule

Long IWM 63d when FRED BABATOTALSAUS (high-propensity business applications) YoY >15% for 3 consecutive months
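
The "N consecutive months above a threshold" pattern recurs in several rules in this batch. One way to sketch it, assuming a monthly YoY series in percent and firing only on the month the streak first reaches the required length (the function name and data are illustrative, not taken from the backtest script):

```python
def streak_signals(yoy, threshold=15.0, months=3):
    """Indices where the reading has been above threshold for exactly `months` months in a row."""
    hits = []
    streak = 0
    for i, value in enumerate(yoy):
        streak = streak + 1 if value > threshold else 0
        if streak == months:  # fire once per streak, not on every later month
            hits.append(i)
    return hits


# Made-up YoY readings in percent (not business-applications data).
example_yoy = [10, 16, 18, 20, 22, 9, 17, 16, 14, 16, 17, 18]
example_hits = streak_signals(example_yoy)
```

Firing once per streak keeps a long surge from producing a new entry every month.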

Caveats

N=7 events only, but t-stat 2.57 is significant. Business applications data starts 2004. The surge may coincide with general economic expansion — disentangling from broad market beta is important. 2020-2021 COVID-era application surge was unprecedented.

Source: FRED BABATOTALSAUS; yfinance · backtests/PL233_business_applications_surge_small_cap.py

PL228_multifamily_permit_collapse_reit_supply_cliff

Multi-Family Permits Collapse → Long Apartment REITs

housing · ~2-month hold · 31 events

0.83

Sharpe

16.9%

CAGR

-28.0%

MaxDD

2.32

t-stat

Mechanism

FRED PERMIT5 (5+ unit building permits) collapsing >30% YoY signals incoming supply cliff for apartments. Apartment REITs (EQR, AVB, MAA) benefit from tighter supply 12-18 months later. 22/31 events positive.

Rule

Long EQR+AVB+MAA equal-weight 42d when FRED PERMIT5 YoY collapses >30%

Caveats

N=31 events, t-stat 2.32 significant. Largest sample in this batch. Supply cliff thesis plays out over 12-18 months — 42-day hold may be too short. Rate sensitivity of REITs can overwhelm supply fundamentals. MaxDD -28.0% during rate shock periods.

Source: FRED PERMIT5; yfinance · backtests/PL228_multifamily_permit_collapse_reit_supply_cliff.py

PL245_manemp_acceleration_industrial_gas

Manufacturing Employment Acceleration → Long Industrial Gas

macro · ~2-month hold · 9 events

1.39

Sharpe

21.8%

CAGR

-12.5%

MaxDD

2.09

t-stat

Mechanism

FRED MANEMP showing 3 consecutive months of manufacturing employment acceleration signals industrial recovery. Industrial gas companies (APD, LIN, ECL) have high fixed-cost structures and benefit from factory utilization increases. 8/9 events positive.

Rule

Long APD+LIN+ECL equal-weight 42d when FRED MANEMP shows 3-month employment acceleration

Caveats

N=9 events, t-stat 2.09 significant. Notably low max drawdown (-12.5%) — industrial gas companies are defensive industrials. Generic z-score implementation. Employment data has 1-month lag. APD/LIN/ECL are not pure-play manufacturing but benefit from industrial volume.

Source: FRED MANEMP; yfinance · backtests/PL245_manemp_acceleration_industrial_gas.py

PL243_nonrescons_surge_aggregates_cement

Nonresidential Construction Spending Surge → Long Aggregates/Cement

construction · ~2-month hold · 11 events

1.11

Sharpe

26.2%

CAGR

-23.7%

MaxDD

1.85

t-stat

Mechanism

FRED TLNRESCONS surging >8% YoY for 3 months signals sustained nonresidential construction boom (data centers, infrastructure, reshoring). Aggregates and cement companies (VMC, MLM, EXP) are direct beneficiaries. 10/11 events positive.

Rule

Long VMC+MLM+EXP equal-weight 42d when FRED TLNRESCONS YoY >8% for 3 consecutive months

Caveats

N=11 events, t-stat 1.85 borderline. Excellent win rate (10/11). Construction spending data has 2-month publication lag. Related to PL137 (private nonres) and PL079 (resi construction cement) but uses total nonresidential including government.

Source: FRED TLNRESCONS; yfinance · backtests/PL243_nonrescons_surge_aggregates_cement.py

PL258_wasde_wheat_low_stocks_fertilizer_demand

WASDE Global Wheat Stocks-to-Use < 30% → Long Fertilizers

commodities · ~2-month hold · 10 events

1.46

Sharpe

73.5%

CAGR

-26.0%

MaxDD

1.88

t-stat

Mechanism

When USDA WASDE reports global wheat stocks-to-use ratio below 30%, tight supply incentivizes maximum-yield planting worldwide. Fertilizer companies (NTR, MOS, CF) benefit from both volume and pricing surges as farmers aggressively apply inputs. 6/10 events positive with avg +10.4% basket return and +7.4% excess vs SPY.

Rule

Long NTR+MOS+CF equal-weight 42d when WASDE global wheat stocks-to-use < 30%

Caveats

N=10 events, t-stat 1.88 borderline significant. Highly concentrated in 2007-2012 commodity supercycle (3 of top 4 returns). Win rate only 60% — 4 losing events including -5% in 2021. Fertilizer stocks are volatile (ann vol 44.6%). WASDE data is monthly with same-day release — signal may already be priced by open. Related to PL015 (wheat drought) and PL044 (corn planting delay).

Source: USDA WASDE; yfinance · backtests/PL258_wasde_wheat_low_stocks_fertilizer_demand.py

PL263_census_advance_import_surge_drayage_warehouse

Census Advance Trade Goods Import Surge → Long Drayage & Warehouse

macro · ~2-week hold · 22 events

0.76

Sharpe

17.2%

CAGR

-27.5%

MaxDD

0.69

t-stat

Mechanism

FRED IMPGS (real goods imports) surging >3% MoM signals imminent drayage and warehouse demand spike. JBHT benefits from intermodal container volumes; PLD benefits from warehouse occupancy and rental rate growth. 13/22 events positive. Uses quarterly FRED data with automatic signal detection.

Rule

Long JBHT+PLD equal-weight 10d when FRED IMPGS (real goods imports) surges >3% MoM

Caveats

N=22 events but t-stat only 0.69 — not statistically significant. Excess CAGR vs SPY is negative (-9.7%) meaning SPY outperformed in the same windows. Large sample (22 events) but inconsistent: several sharp losses in 2008-2009. Import data has 1-month lag. Strongest in 2020-2021 supply chain crunch era. Related to PL121 (import surge warehouse REITs) and PL088 (trade deficit logistics).

Source: FRED IMPGS; yfinance · backtests/PL263_census_advance_import_surge_drayage_warehouse.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=80, n_events=8.
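
For readers unfamiliar with the Benjamini-Hochberg gate cited in these demotion notes, the step-up procedure can be sketched as follows. The p-values are hypothetical and the function is an illustration of the standard procedure, not the shared harness itself.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a pass/fail flag per hypothesis, in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= fdr * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank
    # Everything ranked at or below that cutoff passes.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed


# Hypothetical p-values for five strategies (not taken from this page).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
example_pass = benjamini_hochberg(example_p)
```

In this made-up set, 0.039 and 0.041 would each clear an uncorrected 5% test but fail the corrected gate, which is the situation the demotion notes describe.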

PL272_usda_fsis_slaughter_recovery_packer_margins

USDA FSIS Slaughter Recovery → Long Protein Packers

commodities · ~2-week hold · 8 events

2.53

Sharpe

108.6%

CAGR

-11.6%

MaxDD

1.43

t-stat

Mechanism

When USDA FSIS weekly cattle slaughter volume recovers above 650K head after a period of depressed throughput, packer utilization snaps back. TSN and PPC benefit from improved capacity utilization and margin expansion. 6/8 events positive with avg +3.1% basket return in just 10 days. Excess vs SPY +2.2% per event.

Rule

Long TSN+PPC equal-weight 10d when USDA FSIS weekly cattle slaughter recovers to >650K head

Caveats

N=8 events, t-stat 1.43 not significant at 5%. Extremely high Sharpe (2.53) and CAGR (108.6%) are inflated by short holding periods and the March 2009 recovery event (+12.6%). Two losing events: 2020 COVID packing-plant shutdowns (-7.1%) and 2023 (-2.8%). Slaughter data is weekly with same-week release. Related to PL004 (recall-based protein short, failed) but opposite direction.
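
A quick worked calculation shows why a short hold inflates annualized figures. It assumes 252 trading days per year and back-to-back compounding of the per-event return. The page does not state the backtester's actual CAGR convention, so this only illustrates the effect and does not reproduce the 108.6% figure.

```python
def annualized(per_event_return, hold_days, trading_days=252):
    """Compound a per-event return as if it repeated back to back for a full year."""
    return (1.0 + per_event_return) ** (trading_days / hold_days) - 1.0


# The +3.1% average basket return quoted above, earned over a 10-day hold...
short_hold = annualized(0.031, 10)
# ...versus the same +3.1% earned over a hypothetical 126-day hold.
long_hold = annualized(0.031, 126)
```

The same 3.1% per event annualizes to more than 100% at a 10-day hold but only about 6.3% at a 126-day hold, even though capital is deployed for only a handful of 10-day windows in the actual sample.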

Source: USDA FSIS; yfinance · backtests/PL272_usda_fsis_slaughter_recovery_packer_margins.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=126, n_events=3.

PL267_fhwa_vmt_surge_auto_insurer_premium_growth

FHWA Traffic Volume Surge → Long Auto Insurers

macro · ~2-month hold · 3 events

1.46

Sharpe

28.4%

CAGR

-11.1%

MaxDD

1.03

t-stat

Mechanism

FRED TRFVOLUSM227NFWA (Vehicle Miles Traveled) surging >3% YoY for 3+ months signals sustained driving increase. More miles = more accidents = more claims = insurers raise premiums. PGR (Progressive), ALL (Allstate), TRV (Travelers) benefit. 2/3 events positive with +4.4% avg basket return.

Rule

Long PGR+ALL+TRV equal-weight 42d when FHWA VMT >3% YoY for 3+ consecutive months

Caveats

Only N=3 events — extremely low statistical power, t-stat 1.03. High Sharpe driven by small sample. VMT data has 2-month publication lag. The 3% threshold may be too strict (hence few signals). Insurance pricing has long lag from claims experience. Needs more events to validate. Related to PL065 (VMT surge auto parts) and PL197 (CPI transport auto insurers).

Source: FRED TRFVOLUSM227NFWA; yfinance · backtests/PL267_fhwa_vmt_surge_auto_insurer_premium_growth.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=180, n_events=6.

PL275_doe_spr_refill_procurement_crude_floor

DOE SPR Refill Procurement → Long Crude/E&P

energy · ~6-week hold · 6 events

0.58

Sharpe

12.2%

CAGR

-14.7%

MaxDD

0.49

t-stat

Mechanism

When DOE announces SPR refill procurement at a target price range, it establishes a perceived crude price floor. USO and XOP benefit from reduced downside risk. 3/6 events positive. Best event: Jun 2023 +12.5% basket return after DOE awarded 3M barrels.

Rule

Long USO+XOP equal-weight 30d when DOE announces SPR refill procurement solicitation

Caveats

Only N=6 events, all from 2023-2024 (SPR refill program started after 2022 release). t-stat 0.49 not significant. Excess CAGR vs SPY is negative (-20.7%). Win rate only 50%. Crude prices driven by many macro factors beyond SPR. Signal is novel (post-2022 regime) and needs out-of-sample validation. Very marginal pass on CAGR threshold.

Source: DOE SPR announcements; yfinance · backtests/PL275_doe_spr_refill_procurement_crude_floor.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=168, n_events=8.

PL293_sec_13f_hedge_fund_short_unwinding_squeeze

Hedge Fund Short Unwinding → Long Small Caps (IWM)

sentiment · ~3-week hold · 8 events

3.82

Sharpe

294.2%

CAGR

-10.4%

MaxDD

3.12

t-stat

Mechanism

SEC 13F data shows hedge fund aggregate short ratio declining, signaling consensus short positions being unwound. Small caps (IWM) are disproportionately affected by short squeezes. 8/8 events positive — every identified unwinding episode produced gains.

Rule

Long IWM 21d when SEC 13F data shows hedge fund aggregate short ratio declining

Caveats

Extreme metrics (Sharpe 3.82, CAGR 294%) are inflated by short 21-day hold periods and market recovery timing (Mar 2009, Oct 2022 troughs). t-stat 3.12 is significant but N=8 events. Several event dates coincide with major market bottoms, introducing look-ahead bias risk. 13F data has 45-day lag. Hand-coded events may overfit to known bottoms.

Source: SEC 13F; yfinance · backtests/_run_batch_pl.py

PL282_cftc_bank_participation_net_long_flip_commodity

CFTC Bank Participation Net Long Flip → Commodity Trend

commodities · ~2-month hold · 8 events

1.99

Sharpe

65.2%

CAGR

-15.4%

MaxDD

2.30

t-stat

Mechanism

When CFTC Bank Participation Report shows large banks flipping from net short to net long in commodity futures, it confirms an underlying commodity trend. DBC and GSG broad commodity ETFs benefit. 7/8 events positive, t-stat 2.30 significant.

Rule

Long DBC+GSG equal-weight 42d when CFTC Bank Participation Report shows net long flip

Caveats

N=8 events, t-stat 2.30 significant. Bank Participation Report is monthly with ~1 week lag. The 2007 and 2021-2022 commodity booms drive much of the performance. DBC and GSG have roll yield drag. Bank positioning may be hedging-related rather than directional. Related to C17 (commodity backwardation).

Source: CFTC Bank Participation; yfinance · backtests/_run_batch_pl.py

PL281_bts_ontime_deterioration_mro_demand_surge

BTS Airline On-Time Deterioration → Long MRO/Aerospace

transport · ~2-month hold · 8 events

1.46

Sharpe

31.8%

CAGR

-19.3%

MaxDD

1.68

t-stat

Mechanism

BTS airline on-time performance dropping below 75% for 3+ months signals fleet aging and utilization stress. MRO providers (TDG, HEI) and engine makers (GE) benefit from increased maintenance demand. 7/8 events positive.

Rule

Long TDG+HEI+GE equal-weight 42d when BTS on-time performance drops below 75% for 3+ months

Caveats

N=8 events, t-stat 1.68 borderline. TDG and HEI are secular growers (aftermarket monopoly business models) which may inflate returns regardless of signal. BTS data has 2-month lag. GE is a conglomerate, not pure-play MRO. Related to AK2 (FAA hub capacity) and PL204 (airline load factor).

Source: BTS; yfinance · backtests/_run_batch_pl.py

PL286_fdic_unrealized_loss_regional_bank

FDIC Unrealized Loss Improvement → Long Regional Banks (KRE)

financials · ~2-month hold · 6 events

1.47

Sharpe

42.3%

CAGR

-14.8%

MaxDD

1.47

t-stat

Mechanism

FDIC quarterly banking profile showing unrealized loss improvement >20% signals reduced mark-to-market risk for regional banks. KRE benefits from improved sentiment and reduced capital adequacy concerns. Perfect 6/6 win rate.

Rule

Long KRE 42d when FDIC quarterly profile shows unrealized loss improvement >20%

Caveats

Only N=6 events, t-stat 1.47 not significant at 5%. FDIC data is quarterly with ~2 month lag. The 6/6 win rate is promising but small sample. Hand-coded events may coincide with broader rate-cycle recoveries. Related to PL084 (TED spread normalization KRE) and PL115 (CMBS spread regional banks).
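
A perfect win rate and a weak t-stat are not contradictory. The sketch below assumes the t-stat is a one-sample t on per-event returns (the page does not say how the backtester computes it) and uses made-up returns, not the actual KRE events.

```python
import math
import statistics


def t_stat(returns):
    """One-sample t-statistic of the mean event return against zero."""
    n = len(returns)
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation (n - 1)
    return mean / (stdev / math.sqrt(n))


# Hypothetical per-event returns: every event wins, but one outlier dominates.
example_returns = [0.01, 0.02, 0.01, 0.30, 0.02, 0.01]
example_t = t_stat(example_returns)
example_win_rate = sum(r > 0 for r in example_returns) / len(example_returns)
```

With these numbers the win rate is 100% while the t-stat is only about 1.3, well under the two-sided 5% critical value of roughly 2.57 for five degrees of freedom, because one large return drives both the mean and the dispersion.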

Source: FDIC Quarterly Profile; yfinance · backtests/_run_batch_pl.py

PL296_usitc_affirmative_injury_domestic_producer_long

USITC Affirmative Injury Determination → Long Steel Producers

trade · ~2-month hold · 7 events

1.38

Sharpe

48.4%

CAGR

-29.5%

MaxDD

1.49

t-stat

Mechanism

When USITC issues an affirmative injury determination in steel/metals trade cases, it leads to tariffs and duties that protect domestic producers. NUE+STLD benefit from improved pricing power. 4/7 events positive.

Rule

Long NUE+STLD 42d when USITC issues affirmative injury determination in steel/metals

Caveats

N=7 events, t-stat 1.49. X ticker failed to load (possibly delisted), using NUE+STLD only. Trade cases are public and well-telegraphed — market may price in before determination. MaxDD -29.5% suggests significant drawdown risk. Steel producers are cyclical beyond trade policy. Related to PL009 (scrap steel HRC spread).

Source: USITC; yfinance · backtests/_run_batch_pl.py

PL295_strips_deep_inversion_defensive

Deep Yield Curve Inversion → Long Utilities + Staples

macro · ~3-month hold · 5 events

1.32

Sharpe

19.8%

CAGR

-8.9%

MaxDD

1.44

t-stat

Mechanism

When Treasury 2s10s spread falls below -50bp (deep inversion), recession risk is elevated. Defensive sectors (XLU utilities, XLP staples) outperform. Low max drawdown (-8.9%) reflects defensive nature. 4/5 events positive.

Rule

Long XLU+XLP equal-weight 60d when Treasury 2s10s spread falls below -50bp

Caveats

Only N=5 events, t-stat 1.44 not significant. Three of five events from 2022-2023 (same inversion cycle). Low max drawdown is attractive but small sample. Related to C08 (yield curve 10y2y) and PL076 (yield curve uninversion cyclicals). The -50bp threshold makes this a rare signal.

Source: FRED T10Y2Y; yfinance · backtests/_run_batch_pl.py

PL291_nhtsa_recall_volume_spike_aftermarket_parts

NHTSA Recall Volume Spike → Long Aftermarket Auto Parts

autos · ~2-month hold · 8 events

0.64

Sharpe

12.4%

CAGR

-23.5%

MaxDD

0.74

t-stat

Mechanism

NHTSA recall volume spikes signal vehicle safety issues requiring repair. Aftermarket parts retailers (ORLY, AZO, AAP) benefit from increased demand for replacement parts and repair services. 6/8 events positive.

Rule

Long ORLY+AZO+AAP equal-weight 42d when NHTSA monthly recall volume >2x trailing 12mo average
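
The spike test can be sketched as below, assuming the trailing 12-month average excludes the current month (the rule does not say either way). The function name and counts are illustrative only.

```python
def spike_signals(monthly, multiple=2.0, window=12):
    """Indices where the month exceeds `multiple` times the average of the prior `window` months."""
    hits = []
    for i in range(window, len(monthly)):
        trailing_avg = sum(monthly[i - window:i]) / window  # excludes month i
        if monthly[i] > multiple * trailing_avg:
            hits.append(i)
    return hits


# Made-up monthly recall counts (not NHTSA data).
example_counts = [10] * 12 + [15, 25, 12]
example_hits = spike_signals(example_counts)
```

Excluding the current month from the average keeps a spike from raising its own hurdle.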

Caveats

N=8 events, t-stat 0.74 weak. ORLY and AZO are secular growth stories that may outperform regardless. AAP has been a notable underperformer. NHTSA recall data is public and immediate but aftermarket impact has unclear timing. Related to PL112 (vehicle age auto parts).

Source: NHTSA; yfinance · backtests/_run_batch_pl.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=60, n_events=6.

PL297_noaa_hurricane_gulf_shutin_inland_refiner

Gulf Hurricane Crude Shut-In → Long Inland Refiners

energy · ~2-week hold · 6 events

1.99

Sharpe

206.3%

CAGR

-30.3%

MaxDD

0.97

t-stat

Mechanism

When NOAA hurricane track forces >50% Gulf offshore crude production shut-in, inland/East Coast refiners (PBF, VLO) benefit from crack spread widening as Gulf Coast refinery operations are disrupted. 4/6 events positive.

Rule

Long PBF+VLO equal-weight 10d when NOAA hurricane forces >50% Gulf offshore crude shut-in

Caveats

Only N=6 events, t-stat 0.97 weak. Extreme CAGR (206.3%) inflated by short 10-day holding period. MaxDD -30.3% is significant. PBF may not have been available for early events. Hurricane timing is inherently unpredictable. Related to PL017 (hurricane season refiners).

Source: NOAA; yfinance · backtests/_run_batch_pl.py

PL298_eia_ethanol_collapse_corn_surplus_poultry_feed

EIA Ethanol Production Collapse → Long Protein Packers

commodities · ~2-month hold · 6 events

1.31

Sharpe

110.6%

CAGR

-37.2%

MaxDD

1.31

t-stat

Mechanism

EIA weekly ethanol production dropping >10% from peak signals corn demand destruction (ethanol consumes ~40% of US corn). Cheaper feed costs benefit protein packers (PPC, TSN) whose #1 input cost is corn-based feed. 5/6 events positive.

Rule

Long PPC+TSN equal-weight 42d when EIA weekly ethanol production drops >10% from peak
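
"Drops >10% from peak" needs a definition of the peak. The sketch below assumes a trailing 52-week high and fires only on the first week the series moves below the threshold. Both choices are assumptions, since the rule does not spell them out, and the data are made up.

```python
def drop_from_peak_signals(weekly, drop=0.10, window=52):
    """Indices where the series first falls more than `drop` below its trailing-window high."""
    hits = []
    was_below = False
    for i, value in enumerate(weekly):
        peak = max(weekly[max(0, i - window + 1): i + 1])
        is_below = value < (1.0 - drop) * peak
        if is_below and not was_below:  # fire on the transition only
            hits.append(i)
        was_below = is_below
    return hits


# Made-up weekly production figures (not EIA data); the trailing high is 1050.
example_weekly = [1000, 1020, 1050, 1000, 940, 930, 1010, 1040]
example_hits = drop_from_peak_signals(example_weekly)
```

Week 4 is the first reading more than 10% under the 1050 high. Week 5 is still below but does not fire again.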

Caveats

N=6 events, t-stat 1.31. MaxDD -37.2% driven by 2008 financial crisis event. Ethanol production collapse can signal broader economic weakness that hurts meat demand. Feed cost is only one input — cattle prices matter more. Related to PL272 (slaughter recovery packers) and PL012 (cattle on feed).

Source: EIA; yfinance · backtests/_run_batch_pl.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=120, n_events=6.

PL292_eia_coal_stockpile_low_days_burn_producer_long

EIA Coal Stockpiles Low Days-of-Burn → Long Coal Producer

energy · ~6-week hold · 6 events

1.48

Sharpe

145.9%

CAGR

-44.4%

MaxDD

1.02

t-stat

Mechanism

When EIA power plant coal stockpiles fall below 60 days-of-burn, utilities must procure urgently, giving coal producers (BTU) pricing power. 3/6 events positive but winners were outsized. Extreme CAGR inflated by 2021-2022 energy crisis events.

Rule

Long BTU 30d when EIA coal stockpiles fall below 60 days-of-burn

Caveats

Only N=6 events, t-stat 1.02 weak. MaxDD -44.4% is severe (single-stock risk). Win rate only 50% — BTU is extremely volatile. Coal is a declining sector with structural headwinds. ARCH and CEIX were intended but unavailable on yfinance. Related to PL090 (Henry Hub below cash cost).

Source: EIA; yfinance · backtests/_run_batch_pl.py

PL316_census_defense_orders_spike_primes

Census Defense New Orders Surge → Long Defense Primes

defense · ~2-month hold · 7 events

2.28

Sharpe

45.9%

CAGR

-10.3%

MaxDD

2.46

t-stat

Mechanism

Census defense new orders surging >15% YoY signals upcoming contract awards and DoD procurement acceleration. Defense primes (LMT, RTX, NOC, GD) benefit directly. Perfect 7/7 win rate across events spanning 2007-2023.

Rule

Long LMT+RTX+NOC+GD equal-weight 42d when Census defense new orders surge >15% YoY

Caveats

N=7, t-stat 2.46 significant. Perfect win rate is encouraging but defense primes are secular growers. 2022 Ukraine event may have inflated recent returns. Census data has ~1 month lag. Related to PL124 (defense spending primes) and G10 (NDAA defense).

Source: Census; yfinance · backtests/_run_batch_pl2.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=126, n_events=6.

PL315_hpai_outbreak_egg_producer_long

HPAI Avian Flu Outbreak → Long Egg Producers (CALM)

agriculture · ~3-week hold · 6 events

3.18

Sharpe

146.2%

CAGR

-7.8%

MaxDD

2.52

t-stat

Mechanism

When USDA/APHIS confirms HPAI outbreak affecting >1M birds, egg supply contracts sharply. CALM (Cal-Maine Foods) as largest US egg producer benefits from extreme price spikes. 5/6 events positive. Low max drawdown (-7.8%).

Rule

Long CALM 21d when USDA/APHIS confirms HPAI outbreak affecting >1M birds

Caveats

Single stock risk (CALM only). N=6, t-stat 2.52 significant. Extreme CAGR inflated by short 21-day hold. 2022 outbreak was historically extreme and may not repeat. HPAI outbreaks are unpredictable. Egg prices spike but can normalize quickly.

Source: USDA APHIS; yfinance · backtests/_run_batch_pl2.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=168, n_events=8.

PL327_gdp_advance_beat_gdpnow_cyclical

BEA GDP Advance Beats GDPNow → Cyclical Sector Rotation

macro · ~3-week hold · 8 events

1.37

Sharpe

22.9%

CAGR

-9.1%

MaxDD

1.58

t-stat

Mechanism

When BEA GDP advance estimate beats Atlanta Fed GDPNow forecast by >0.5pp, it signals positive growth surprise. Cyclical sectors (XLI industrials, XLB materials, XLF financials) rotate higher. 5/8 events positive. Low max drawdown (-9.1%).

Rule

Long XLI+XLB+XLF equal-weight 21d when BEA GDP advance beats GDPNow by >0.5pp

Caveats

N=8, t-stat 1.58 borderline. GDP advance released quarterly (4 events/year max). GDPNow only available from 2011; earlier events estimated. GDP beats well-telegraphed by GDPNow tracking. Related to PL078 (GDPNow consensus gap SPY).

Source: BEA, Atlanta Fed; yfinance · backtests/_run_batch_pl2.py

PL326_census_mfg_construction_record_industrial_gas

Census Mfg Construction Record → Long Industrial Gas

construction · ~2-month hold · 7 events

1.16

Sharpe

21.4%

CAGR

-14.2%

MaxDD

1.25

t-stat

Mechanism

Census manufacturing construction spending hitting new records signals factory building boom (semiconductors, EV, reshoring). Industrial gas companies (APD, LIN) supply process gases to every new factory. 4/7 events positive.

Rule

Long APD+LIN equal-weight 42d when Census mfg construction spending hits new record

Caveats

N=7, t-stat 1.25 weak. APD and LIN are defensive industrials with secular tailwinds. Construction spending data has 2-month lag. Win rate only 57%. Related to PL245 (mfg employment industrial gas) which uses a different signal with better stats.

Source: Census; yfinance · backtests/_run_batch_pl2.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=126, n_events=6.

PL322_usda_export_unknown_soybean_china

USDA Export Sales Unknown Destination → Long Soybeans

agriculture · ~3-week hold · 6 events

1.40

Sharpe

25.7%

CAGR

-7.4%

MaxDD

1.11

t-stat

Mechanism

USDA weekly export sales showing >1MT to "unknown destination" is a well-known proxy for Chinese flash demand. SOYB rallies as market prices in bulk purchasing. 6/6 events positive with very low max drawdown (-7.4%).

Rule

Long SOYB 21d when USDA weekly export sales show >1MT to unknown destinations

Caveats

N=6, t-stat 1.11 weak. Perfect win rate but small sample. "Unknown destination" trick is well-known to grain traders. SOYB ETF has tracking error vs futures. US-China trade tensions can disrupt pattern. Related to PL014 (soybean export season).

Source: USDA Export Sales; yfinance · backtests/_run_batch_pl2.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=210, n_events=6.

PL329_census_rare_earth_import_disruption_domestic_mp

Rare Earth Import Disruption → Long MP Materials

supply chain · ~2-month hold · 6 events

1.69

Sharpe

184.7%

CAGR

-33.8%

MaxDD

1.34

t-stat

Mechanism

When Census trade data shows rare earth imports from China dropping >20% QoQ, domestic rare earth processor MP Materials benefits from supply scarcity premium. 4/6 events positive. Extreme CAGR driven by volatile single stock.

Rule

Long MP 42d when Census rare earth imports from China drop >20% QoQ

Caveats

Single stock risk (MP only). N=6, t-stat 1.34 not significant. MP IPO was 2020 so limited history. MaxDD -33.8% is severe. Rare earth supply disruptions are geopolitical and hard to time. Census trade data has 2-month lag.

Source: Census Trade; yfinance · backtests/_run_batch_pl2.py

PL338

★ BLS Healthcare Wage Acceleration + JOLTS Openings → Long Travel Nurse Agencies

equities (AMN, CCRN) · ~3-month hold · 8 events

0.77

Sharpe

29.2%

CAGR

-39%

MaxDD

1.09

t-stat

+25.2%

excess vs SPY

Mechanism

Travel nurse staffing agencies (AMN Healthcare, Cross Country Healthcare) operate on a bill-pay spread model: they charge hospitals a markup over permanent nurse wages. When BLS healthcare average hourly earnings accelerate past 5% YoY and JOLTS healthcare job openings remain elevated (above 24-month median), the bill-pay spread widens because agencies can charge premium rates for scarce travel nurses. The dual-signal requirement (wages AND openings) filters out normal wage drift and isolates acute labor shortage regimes where agencies capture outsized margins.

Rule

When FRED CES6500000008 (healthcare avg hourly earnings) YoY growth > 5% AND JTS6200JOL (JOLTS healthcare openings) above 24-month rolling median, long equal-weight AMN+CCRN for 63 trading days. Minimum 90 calendar day gap between signals.
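
The dual-signal rule with its 90-calendar-day spacing can be sketched as follows. It assumes the 24-month rolling median is computed over the months before the current one and that wage growth is already expressed as YoY percent. The function name and all sample values are hypothetical.

```python
from datetime import date
from statistics import median


def dual_signal_dates(dates, wage_yoy, openings, wage_min=5.0,
                      median_window=24, min_gap_days=90):
    """Dates where both conditions hold, keeping at least min_gap_days between entries."""
    hits = []
    for i in range(median_window, len(dates)):
        rolling_median = median(openings[i - median_window:i])  # prior 24 months
        if wage_yoy[i] > wage_min and openings[i] > rolling_median:
            if not hits or (dates[i] - hits[-1]).days >= min_gap_days:
                hits.append(dates[i])
    return hits


# Made-up monthly data starting January 2019 (not BLS or JOLTS figures).
example_dates = [date(2019 + m // 12, m % 12 + 1, 1) for m in range(30)]
example_openings = [1000] * 24 + [1200] * 6
example_wage_yoy = [3.0] * 24 + [4.0, 5.5, 6.0, 6.2, 6.5, 6.1]
example_hits = dual_signal_dates(example_dates, example_wage_yoy, example_openings)
```

Both conditions hold from February 2021 onward in this example, but the gap filter keeps only February 1 and June 1. May 1 falls 89 calendar days after the first entry and is skipped.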

Caveats

Heavy caveats: N=8 events, all clustered in 2021-2022 (COVID-driven nurse shortage). The signal has not fired outside of a pandemic regime, so there is no evidence of generality. MaxDD -39% is severe. AMN and CCRN have both collapsed 70%+ from 2022 peaks as travel nurse demand normalized. If healthcare wage growth re-accelerates (aging population, nurse retirements), the mechanism could re-activate, but the historical sample is a single episode. Position-size accordingly.

Source: FRED CES6500000008 + JTS6200JOL + yfinance · backtests/PL338_bls_nurse_wage_accel_travel_nurse_long.py

PL367

★ BEA International Travel Receipts Surge → Long Hotel REITs

equities (HST, PK, MAR) · ~42-day hold · 6 events

1.40

Sharpe

44.6%

CAGR

-17%

MaxDD

1.40

t-stat

+25.6%

excess vs SPY

Mechanism

International travel receipts (BEA balance of payments) measure total foreign visitor spending in the US. When this surges > 20% YoY, gateway city hotels experience elevated RevPAR through higher occupancy and ADR pricing power. The basket combines hotel REITs (HST, PK) with Marriott (MAR), a hotel operator and franchisor rather than a REIT; all three have properties in gateway cities (NYC, Miami, San Francisco, Los Angeles) that disproportionately benefit from international tourism demand.

Rule

When BEA international travel receipts (quarterly, via FRED) show YoY growth > 20%, long equal-weight HST+PK+MAR for 42 trading days. PK available from 2017; pre-2017 use HST+MAR. Minimum 90-day gap between signals.
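
The ticker substitution (drop PK before it was listed) amounts to equal-weighting whatever names have data for the event window. A minimal sketch, with hypothetical returns and a helper name that does not come from the backtest script:

```python
def basket_return(returns_by_ticker, preferred=("HST", "PK", "MAR")):
    """Equal-weight average over the preferred tickers that have a return for this event."""
    available = [
        returns_by_ticker[t] for t in preferred
        if returns_by_ticker.get(t) is not None
    ]
    if not available:
        raise ValueError("no tickers available for this event")
    return sum(available) / len(available)


# Hypothetical event returns: PK missing (pre-listing) versus all three present.
early_event = basket_return({"HST": 0.06, "PK": None, "MAR": 0.02})
late_event = basket_return({"HST": 0.03, "PK": 0.06, "MAR": 0.00})
```

The early event averages two names and the late event averages three, so per-name weights differ across the sample.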

Caveats

N=6 events, win rate 6/6. Events span 2010-2023 covering post-GFC and post-COVID tourism recoveries. BEA data releases with ~75-day lag. MaxDD only -17%. Excess CAGR +25.6% vs SPY is strong but needs more events for statistical confidence.

Source: FRED BEA travel receipts + yfinance · backtests/PL751_bea_travel_receipts_hotel_long.py

PL396

★ Broker-Dealer Momentum → Long M&A Boutique Banks

equities (EVR, PJT) · ~42-day hold · 36 events

1.13

Sharpe

38.8%

CAGR

-20%

MaxDD

1.13

t-stat

+4.3%

avg excess

Mechanism

When the broker-dealer sector (IAI ETF) outperforms SPY by >5% over 20 trading days, it signals an inflecting M&A cycle. Boutique advisory firms (Evercore, PJT Partners) have 3-5x higher revenue sensitivity to deal volume than diversified banks because 50-70% of their revenue comes from advisory fees. The IAI momentum signal captures early-cycle M&A acceleration that has not yet fully priced into pure-play advisory stocks.

Rule

When IAI (broker-dealer ETF) 20-day cumulative return exceeds SPY by >5%, long equal-weight EVR+PJT for 42 trading days. Pre-2015 use EVR+LAZ. Minimum 90-day gap between signals.
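
The relative-momentum trigger can be sketched on two aligned daily price lists. The rule does not say whether the 90-day gap is in calendar or trading days; this sketch assumes trading days (list positions), and the prices are made up.

```python
def relative_strength_signals(asset, benchmark, lookback=20, spread=0.05, min_gap=90):
    """Indices where the asset's lookback return beats the benchmark's by more than `spread`."""
    hits = []
    for i in range(lookback, len(asset)):
        asset_ret = asset[i] / asset[i - lookback] - 1.0
        bench_ret = benchmark[i] / benchmark[i - lookback] - 1.0
        if asset_ret - bench_ret > spread and (not hits or i - hits[-1] >= min_gap):
            hits.append(i)
    return hits


# Made-up prices: the benchmark is flat, the asset steps up on day 20 and day 115.
example_iai = [100.0] * 20 + [106.0] * 95 + [113.0] * 15
example_spy = [100.0] * 130
example_hits = relative_strength_signals(example_iai, example_spy)
```

The first step-up stays above the 5% spread for 20 days but yields a single entry. The second fires because it arrives 95 trading days later.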

Caveats

36 events over 2006-2025 is a decent sample. Win rate 24/36 (67%). The signal fires during both genuine M&A recoveries and false starts (2007 pre-GFC, 2022 rate shock). MaxDD -20% is moderate. The 2025 event (+27.2%) is partly from a tariff-driven rebound. Excess returns vary significantly by regime. The proxy (IAI momentum) is an imperfect substitute for FTC HSR early termination data, which would be a cleaner signal.

Source: FTC HSR (proxy via IAI momentum) + yfinance · backtests/PL396_ftc_hsr_early_term_ma_ibank_long.py

PL398

★ NOAA H1 Severe Weather Season → Long P&C Reinsurers

equities (RNR, ACGL) · ~63-day hold (Jul-Sep) · 12 events

0.68

Sharpe

14.0%

CAGR

-12%

MaxDD

0.68

t-stat

+2.9%

avg excess

Mechanism

When NOAA confirms 5+ billion-dollar weather disasters in the first half of the year, it signals above-average catastrophe losses that drive reinsurance rate hardening at the critical June/July renewal season. Reinsurers (RenaissanceRe, Arch Capital) benefit from forward premium adequacy improvements as they can reprice risk at higher rates. The July entry timing captures the post-renewal period when hardened rates are locked in.

Rule

When NOAA NCEI confirms 5+ billion-dollar weather disasters in Jan-Jun, long equal-weight RNR+ACGL from July 1 for 63 trading days. One signal per year maximum.
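
The once-a-year count can be sketched as below. Event dates are placeholders rather than NOAA records, and the July 1 entry is returned as a calendar date; a real backtest would still need to roll it to the next trading day.

```python
from datetime import date


def reinsurer_entry_dates(disaster_dates, min_events=5):
    """July 1 entry dates for years with at least min_events disasters in January through June."""
    h1_counts = {}
    for d in disaster_dates:
        if d.month <= 6:
            h1_counts[d.year] = h1_counts.get(d.year, 0) + 1
    return [date(year, 7, 1) for year in sorted(h1_counts) if h1_counts[year] >= min_events]


# Placeholder event dates (not real NOAA NCEI records).
example_events = (
    [date(2001, m, 15) for m in (1, 2, 3, 5, 6)]   # five first-half events
    + [date(2001, 8, 20)]                          # second-half event, ignored
    + [date(2002, m, 10) for m in (2, 3, 4, 6)]    # only four first-half events
)
example_entries = reinsurer_entry_dates(example_events)
```

Only the first placeholder year reaches five first-half events, so it is the only year that produces an entry.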

Caveats

12 events over 2008-2025. Win rate 8/12 (67%). MaxDD only -12% which is attractive for an equity strategy. The signal-year performance is skewed by 2024 (+16.6%) when reinsurer stocks had a secular re-rating. Some signal years (2020, 2021, 2022) saw poor returns despite elevated H1 disasters because the broader market regime dominated. H1 disaster counts are curated from NOAA NCEI public data but involve judgment on timing boundaries.

Source: NOAA NCEI Billion-Dollar Disasters + yfinance · backtests/PL398_noaa_severe_weather_reinsurer_hardening_long.py

PL408

★ PNW Grain Season UNP Outperformance → Long Union Pacific

equities (UNP) · ~42-day hold · 35 events

1.07

Sharpe

32.9%

CAGR

-15%

MaxDD

1.07

t-stat

+3.3%

avg excess

Mechanism

Union Pacific dominates the rail franchise serving PNW grain export terminals (Portland, Longview, Tacoma). When UNP outperforms CSX (which serves Eastern/Gulf corridors) by >5% over 20 days during grain export season (Sep-Feb), it signals that PNW grain flows are surging relative to Gulf. This captures periods of strong Asian demand for US grains routed through Pacific Northwest ports, which directly benefits UNP's grain franchise revenue.

Rule

When UNP outperforms CSX by >5% over trailing 20 trading days during September-February, long UNP for 42 trading days. Minimum 90-day gap between signals.

Caveats

35 events over 2000-2025 is an excellent sample. Win rate 25/35 (71%). MaxDD -15% is low. Avg excess +3.3% vs SPY per trade. However, the proxy (UNP/CSX relative strength) may capture momentum effects beyond grain-specific fundamentals. Some events cluster (2000-2002, 2006-2008) suggesting regime dependency. The signal fires in ~70% of grain seasons, raising questions about its selectivity. Still, the combination of high Sharpe, low MaxDD, and large N makes this one of the stronger pipeline signals.

Source: USDA PNW grain (proxy via UNP/CSX relative strength) + yfinance · backtests/PL408_usda_pnw_grain_export_unp_long.py

PL409

★ Census M3 Core Capex Inflection → Long Industrial Distributors

equities (FAST, GWW) · ~42-day hold · 31 events

0.55

Sharpe

12.9%

CAGR

-20%

MaxDD

0.55

t-stat

+1.8%

avg excess

Mechanism

Census M3 nondefense capital goods shipments (ex-aircraft) is the gold standard monthly indicator of business investment. When this series inflects from negative to positive YoY growth (after 3+ months of contraction), it signals a capex recovery cycle that drives downstream MRO consumable demand at industrial distributors. Fastenal and Grainger report daily sales growth that correlates with capex equipment deliveries with a 1-2 month lag.

Rule

When FRED A34SNO YoY growth turns positive after 3+ months negative, OR accelerates >3pp from trough, long FAST+GWW equal-weight for 42 trading days. Minimum 90-day gap.
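
A hedged sketch of the first leg of this rule (YoY turning positive after 3+ negative months). The second leg, acceleration of more than 3pp from trough, is left out because the card does not define how the trough is measured. Function names are hypothetical.

```python
def yoy(series, periods=12):
    """Year-over-year growth for a monthly series; None where undefined."""
    out = [None] * min(periods, len(series))
    for i in range(periods, len(series)):
        out.append(series[i] / series[i - periods] - 1.0)
    return out


def inflection_indices(yoy_values, min_negative_run=3):
    """Indices where YoY is positive right after a run of negative months."""
    hits = []
    run = 0  # consecutive negative months seen immediately before index i
    for i, growth in enumerate(yoy_values):
        if growth is None:
            run = 0
            continue
        if growth > 0 and run >= min_negative_run:
            hits.append(i)
        run = run + 1 if growth < 0 else 0
    return hits
```

The 90-day minimum gap and the publication lag mentioned in the caveats would be applied on top of these raw indices.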

Caveats

31 events over 1998-2023 with 18/31 win rate (58%). The signal fires frequently during prolonged downturns (multiple triggers during 2001-2003, 2014-2015) with mixed early results. Strongest returns come at true cycle bottoms (2009: +26.7%, 2012: +15.8%). FRED A34SNO has a 35-day publication lag, reducing timeliness. MaxDD -20% is moderate. Excess +1.8% per trade is modest but consistent over 25 years of data.

Source: FRED A34SNO + yfinance · backtests/PL409_census_m3_core_capex_industrial_dist_long.py

PL418

★ Treasury Bill Issuance Surge → Long Money Market Asset Managers

equities (STT, BLK) · ~42-day hold · 51 events

0.57

Sharpe

13.1%

CAGR

-39%

MaxDD

1.61

t-stat

51

events

Mechanism

When Treasury bill outstanding grows >5% QoQ, it signals heavy front-end issuance that drives money market fund AUM growth. State Street (STT) and BlackRock (BLK) earn management fees on money market fund assets, and T-bill supply expansion creates a virtuous cycle of higher yields attracting more inflows. The signal captures periods of fiscal expansion or debt ceiling aftermath when Treasury rebuilds the TGA via bill issuance.

Rule

When FRED WTREGEN (the Treasury General Account balance, used here as the T-bill supply proxy) shows QoQ growth >5%, long equal-weight STT+BLK for 42 trading days. Minimum 90-day gap between signals.

Caveats

51 events over 2005-2025 is a strong sample with t-stat 1.61. MaxDD -39% is severe (driven by GFC). The signal fires frequently during fiscal expansion periods. STT and BLK are diversified asset managers so T-bill issuance is one of many revenue drivers. The strongest returns come during post-crisis periods when bill supply normalizes.

Source: FRED WTREGEN + yfinance · backtests/PL418_treasury_bill_shift_mmf_asset_mgr_long.py

PL425

★ Census E-Commerce Share Sequential Decline → Long Mall REITs

equities (SPG, MAC) · ~30-day hold · 22 events

0.84

Sharpe

32.4%

CAGR

-36%

MaxDD

1.36

t-stat

22

events

Mechanism

When Census quarterly e-commerce as a share of total retail sales declines sequentially (QoQ), it signals that brick-and-mortar retail is gaining ground relative to online. This benefits mall REITs (Simon Property Group, Macerich) through improved tenant health, higher occupancy, and re-leasing spreads. The e-commerce share typically declines in Q4 (holiday in-store shopping) and during physical retail recovery phases.

Rule

When FRED ECOMPCTNSA (e-commerce share of retail) declines >0.3 percentage points QoQ, long equal-weight SPG+MAC for 30 trading days.

Caveats

22 events over 2005-2025. MaxDD -36% is significant. The signal captures both seasonal patterns (Q4 in-store shopping boost) and structural shifts (post-COVID return to physical retail). MAC is a higher-beta play on mall recovery with more volatile returns. The e-commerce share has been on a secular uptrend, so sequential declines are typically temporary reversions rather than trend changes.

Source: FRED ECOMPCTNSA + yfinance · backtests/PL425_census_ecommerce_share_decline_mall_reit_long.py

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.05 ≤ 0.50; CAGR -3.4% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; OOS Sharpe -0.16 ≤ 0.

PL436

★ FDA Novel Drug Approval Cluster → Long XBI (Biotech M&A Wave)

equities (XBI) · ~42-day hold · 8 events

1.72

Sharpe

53.7%

CAGR

-21%

MaxDD

1.99

t-stat

+30.3%

excess vs SPY

Mechanism

When FDA CDER approves >5 novel molecular entities (NMEs) in a single calendar month, it signals a cluster of commercially validated drug programs that triggers big pharma M&A interest. Large pharma companies facing patent cliffs use NME approvals as a shopping list for bolt-on acquisitions. The cluster effect creates a brief window where biotech sector sentiment lifts as multiple validated targets emerge simultaneously, driving XBI higher.

Rule

When FDA CDER approves >5 NMEs in a calendar month (vs baseline of ~3-4/month), long XBI for 42 trading days starting first trading day of following month.
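
One way to turn the monthly count into entry dates, as a sketch only. It assumes a list of individual approval dates and a sorted list of trading days; both inputs and the function name are hypothetical.

```python
from collections import Counter
from datetime import date


def cluster_entry_dates(approval_dates, trading_days, threshold=5):
    """First trading day of the month after each month with > threshold NMEs.

    approval_dates: one datetime.date per novel approval (assumed input).
    trading_days: sorted list of datetime.date on which the market is open.
    """
    counts = Counter((d.year, d.month) for d in approval_dates)
    entries = []
    for (year, month), n in sorted(counts.items()):
        if n <= threshold:
            continue
        following = (year + 1, 1) if month == 12 else (year, month + 1)
        first = next(
            (t for t in trading_days if (t.year, t.month) == following), None
        )
        if first is not None:
            entries.append(first)
    return entries
```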

Caveats

N=8 events over 2014-2024. t-stat 1.99 is borderline significant. The NME counts are curated from FDA CDER annual reports and involve judgment on what constitutes a "cluster" month. MaxDD -21% is moderate. The mechanism is plausible but the sample is small and events are hand-picked. The strongest returns came during the 2020-2021 biotech bull market. In bear biotech markets (2022), even NME clusters may not lift XBI.

Source: FDA CDER Novel Drug Approvals + yfinance · backtests/PL436_fda_nme_cluster_biotech_ma_wave.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; sample too small: n_days=210, n_events=5.

PL435

★ Columbia River Low Flow → Long Thermal IPPs (NRG)

equities (NRG) · ~42-day hold (Jul-Aug) · 5 events

1.00

Sharpe

31.5%

CAGR

-18%

MaxDD

0.92

t-stat

+8.2%

excess vs SPY

Mechanism

The Columbia River system provides ~40% of US hydroelectric generation. When USGS streamflow at The Dalles, OR drops below the 25th percentile of historical norms, hydro generation declines and thermal power plants (natural gas, coal) must fill the gap at higher marginal costs. This drives up wholesale power prices in the Western Interconnection, benefiting thermal IPPs like NRG Energy through higher spark-spread margins on their gas fleet.

Rule

When USGS streamflow at The Dalles, OR shows 30-day average below 25th percentile of historical distribution during summer, long NRG from July 1 for 42 trading days. Curated drought years: 2015, 2018, 2021, 2022, 2024.
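
The card's backtest uses curated drought years, so the following is only a sketch of what the stated threshold test could look like. It assumes the historical distribution is a list of past 30-day mean flows for the same season; the function name and inputs are hypothetical.

```python
from statistics import mean, quantiles


def low_flow_signal(history_30d_means, recent_daily_flows, window=30):
    """Compare the latest 30-day mean flow with the historical 25th percentile.

    history_30d_means: past 30-day mean flows for the same season (assumption).
    recent_daily_flows: daily flow readings, most recent last.
    Returns (signal, p25).
    """
    # quantiles(..., n=4) returns the three quartile cut points; [0] is Q1.
    p25 = quantiles(history_30d_means, n=4, method="inclusive")[0]
    current = mean(recent_daily_flows[-window:])
    return current < p25, p25
```

Computing the percentile only from years before the signal year would avoid the hindsight bias flagged in the caveats.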

Caveats

Heavy caveats: N=5 only. Curated drought years introduce hindsight bias. NRG is a diversified power company, so Columbia River hydro is just one driver. The 2023-2024 NRG rally was driven primarily by the AI/data center power demand narrative, not hydro shortfall. t-stat 0.92 is not statistically significant. MaxDD -18% is moderate.

Source: USGS NWIS (proxy via curated drought years) + yfinance · backtests/PL759_usgs_columbia_low_flow_thermal_ipp_long.py

PL439

★ ASCE Water Infrastructure Grade Decline → Long AWK (Water Utility Capex)

equities (AWK) · ~40-day hold · 10 events

0.87

Sharpe

17.0%

CAGR

-22%

MaxDD

1.10

t-stat

70%

win rate

Mechanism

When ASCE publishes its quadrennial Infrastructure Report Card showing poor drinking water grades (D-range), or when major water infrastructure catalysts occur (crises such as Flint, MI and the Toledo algal bloom, EPA PFAS rules, the IIJA signing), public attention to water infrastructure investment surges. This drives regulatory and legislative action that expands water utility capex programs and rate base growth. American Water Works (AWK), as the largest US water utility, directly benefits from increased infrastructure investment mandates and rate case approvals.

Rule

Long AWK for 40 trading days after ASCE Infrastructure Report Card publication or major water infrastructure catalyst events (water crises, EPA drinking water rules, federal water infrastructure legislation).

Caveats

10 events over 2009-2023 combining ASCE reports (4) and supplementary catalysts (6). 70% win rate with avg 40d return of +2.85%. The strongest returns came from crisis events (Flint +14%, ASCE 2021 +16.9%). Negative excess vs SPY (-0.91%) suggests AWK's defensive nature means it underperforms in strong markets. The ASCE report fires only every 4 years, making this infrequent; supplementary events improve testability but introduce selection bias. MaxDD -22% driven by 2009 ASCE event during GFC.

Source: ASCE Infrastructure Report Card + EPA SDWIS + yfinance · backtests/PL439_asce_water_grade_decline_utility_capex.py

PL451

★ IRS Partnership Return Surge → Long Thomson Reuters (Legal Demand)

equities (TRI) · ~42-day hold · 7 events

0.87

Sharpe

19.1%

CAGR

-12%

MaxDD

0.87

t-stat

5/7

win rate

Mechanism

IRS Statistics of Income data showing >8% YoY growth in partnership return filings signals accelerating business formation complexity. More partnerships mean more K-1 filings, more compliance requirements, and more legal/tax advisory demand. Thomson Reuters (TRI) owns Westlaw, Practical Law, and tax compliance platforms that directly benefit from increased legal workflow volume. The signal fires in January following the IRS data year.

Rule

When IRS SOI partnership return filings grow >8% YoY, long TRI for 42 trading days starting mid-January following the data year. Curated strong-formation years: 2014-2016, 2018-2019, 2021-2022.

Caveats

N=7 events. IRS SOI data has a 1-2 year publication lag, so a live signal would need a timelier series such as Census Business Formation Statistics as a leading indicator. TRI is a large diversified information company; legal/tax is ~50% of revenue. MaxDD of only -12% is attractive. Win rate 5/7 (71%).

Source: IRS SOI + yfinance · backtests/PL451_irs_soi_partnership_returns_legal_demand.py

Demoted: no longer passes the tightened winner gate. Failed: Sharpe 0.45 ≤ 0.50; CAGR 9.5% ≤ 10%; fails Benjamini-Hochberg multiple-testing correction at FDR=0.05.

PL460

★ Census Housing Completions Surge → Long Residential HVAC

equities (CARR, LII) · ~42-day hold · 40 events

0.85

Sharpe

20.3%

CAGR

-43%

MaxDD ⚠

2.13

t-stat

+11.1%

excess vs SPY

Mechanism

Census housing completions (COMPUTSA) measure actual new home deliveries requiring HVAC system installation. When completions surge >10% YoY, residential HVAC manufacturers (Carrier Global, Lennox International) see direct order-book acceleration. Each new home requires a $5-15K HVAC system, making completions a near-mechanical demand driver for these companies. The signal has a natural 1-2 month lag as completions data is published ~5 weeks after the reference month.

Rule

When FRED COMPUTSA (housing completions, SA annual rate) YoY growth >10%, long equal-weight CARR+LII for 42 trading days. Minimum 90-day gap between signals. CARR available from April 2020 (Carrier spin-off); pre-2020 uses LII only.
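
A small sketch of the two moving parts in this rule: the YoY threshold and the basket switch around the Carrier spin-off. `CARR_START` is a placeholder; the card says only "April 2020", so the exact date is an assumption, as are the function names.

```python
from datetime import date

# Assumption: first date CARR prices are usable; the card gives only the month.
CARR_START = date(2020, 4, 1)


def completions_signal(completions, i, threshold=0.10, periods=12):
    """True if completions at month i are up more than threshold YoY."""
    if i < periods:
        return False
    return completions[i] / completions[i - periods] - 1.0 > threshold


def basket_weights(entry_date):
    """Equal-weight CARR+LII once CARR trades; LII only before that."""
    if entry_date >= CARR_START:
        return {"CARR": 0.5, "LII": 0.5}
    return {"LII": 1.0}
```

The LII-only fallback means pre-2020 results reflect a single stock, which is the concentration the caveats point out.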

Caveats

40 events over 2005-2025 with t-stat 2.13 (statistically significant at 5%). MaxDD -43% is severe and occurred during the GFC housing collapse. The signal fires during housing booms which can reverse sharply. CARR only has data from 2020 (spun from UTX), so pre-2020 the backtest is LII-only. Excess CAGR +11.1% vs SPY is strong across a large sample.

Source: FRED COMPUTSA + yfinance · backtests/PL460_census_housing_completions_hvac_long.py

PL456

★ USGS Titanium Sponge Tightening → Long Specialty Metals (CRS+ATI)

equities (CRS, ATI) · ~40-day hold · 7 events

0.68

Sharpe

19.0%

CAGR

-30%

MaxDD

0.71

t-stat

57%

win rate

Mechanism

When USGS Mineral Commodity Summaries show titanium sponge supply tightening (declining stockpiles, rising imports, constrained production) while aerospace production rates are rising (Boeing/Airbus delivery ramps), specialty metals companies Carpenter Technology (CRS) and ATI benefit from pricing power and volume growth. Titanium is critical for jet engine and airframe components, and supply constraints during aerospace upcycles create margin expansion for specialty alloy producers. The signal is most powerful when geopolitical factors (Russia/VSMPO-AVISMA sanctions) compound supply tightness.

Rule

When USGS MCS titanium chapter shows supply tightening (stockpile drawdown or import dependence rising) concurrent with aerospace production rate increases, long equal-weight CRS+ATI for 40 trading days post-MCS publication (late January). Annual frequency.

Caveats

7 events over 2011-2025. Annual USGS publication frequency limits sample size. Win rate 57% (4/7). The 2014 and 2024 events produced outsized returns (+17.3% and +20.7%) while the 2018 event was badly timed (-12.9%, coinciding with late-cycle volatility). MaxDD -30% is significant. ATI went through a major restructuring in 2020, complicating pre-2020 analysis. The mechanism is sound but the small sample and event selection introduce uncertainty. t-stat 0.71 is not statistically significant.

Source: USGS MCS Titanium + yfinance · backtests/PL456_usgs_titanium_sponge_stockpile_aerospace.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; sample too small: n_days=210, n_events=5.

PL473

★ CMS MA Star Rating Upgrade Cycle → Long Managed Care

equities (UNH, HUM, ELV) · ~42-day hold · 5 events

2.64

Sharpe

75.4%

CAGR

-8%

MaxDD

2.41

t-stat

5

events ⚠

Mechanism

CMS publishes Medicare Advantage star ratings in October each year. When the upgrade cycle favors the major MCOs (UnitedHealth, Humana, Elevance), it signals higher quality bonus payments (5% premium increase for 4+ star plans) improving forward-year economics. Star rating upgrades give a 14-month visibility window into revenue uplift.

Rule

When CMS October MA star rating publication shows favorable upgrade cycle, long equal-weight UNH+HUM+ELV for 42 trading days. Curated upgrade years: 2014, 2016, 2018, 2021, 2023.

Caveats

Heavy caveats: N=5 only, 5/5 win rate. MaxDD -8% exceptional but tiny sample. Curated years introduce hindsight bias. MCO sector in secular uptrend from Medicare enrollment growth. t-stat 2.41 unreliable with 5 observations.

Source: CMS MA Star Ratings + yfinance · backtests/PL473_cms_ma_star_upgrade_mco_long.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; sample too small: n_days=172, n_events=6.

PL472

★ Baltic FFA Contango Flip → Long Dry Bulk Shippers (SBLK+GNK)

equities (SBLK, GNK) · ~30-day hold · 6 events

1.11

Sharpe

88.5%

CAGR

-34%

MaxDD

0.93

t-stat

50%

win rate

Mechanism

When the Baltic Dry Index (BDI) hits an extreme trough (below 10th percentile historically) and begins recovering, the FFA forward curve typically flips from backwardation to contango, signaling market expectations of improving freight rates. Dry bulk shipping equities (Star Bulk Carriers, Genco Shipping) are highly leveraged to freight rates and rerate aggressively during recoveries. The 2016 recovery from BDI 290 (all-time low) produced +53% in 30 days, and the 2020 COVID recovery produced +44%.

Rule

When BDI recovers from an extreme trough (bottom-decile reading followed by 50%+ recovery from the trough level), long equal-weight SBLK+GNK for 30 trading days.
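
A sketch of the trough-then-recovery logic, assuming the bottom-decile level is supplied by the caller (computed elsewhere from history). The function name and the choice to reset after each signal are assumptions, not taken from the backtest.

```python
def trough_recovery_signals(bdi, decile_level, recovery=0.5):
    """Indices where BDI has rebounded >= `recovery` from a bottom-decile trough.

    bdi: list of index levels in time order.
    decile_level: the 10th-percentile level (assumed to be precomputed).
    """
    signals = []
    trough = None  # lowest bottom-decile reading since the last signal
    for i, level in enumerate(bdi):
        if level <= decile_level:
            trough = level if trough is None else min(trough, level)
        elif trough is not None and level >= trough * (1.0 + recovery):
            signals.append(i)
            trough = None  # wait for a fresh bottom-decile reading
    return signals
```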

Caveats

Heavy caveats: Only 6 events over 2009-2023. 50% win rate -- the 2009 and 2012 events produced large losses (-17% and -17%). The winners were enormous (2016: +53%, 2020: +44%) creating positive expected value despite low hit rate. Baltic FFA data is not freely available on FRED/yfinance, so we use BDI trough events as a proxy. Dry bulk equities are extremely volatile (30%+ drawdowns common). The signal is essentially a contrarian bet on cyclical recovery. t-stat 0.93 is not statistically significant.

Source: Baltic Exchange BDI (proxy via trough events) + yfinance · backtests/PL472_baltic_ffa_contango_drybulk_recovery.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; sample too small: n_days=251, n_events=6.

PL478

★ JOLTS Industry Openings Divergence → Sector Rotation (XHB/XLI)

equities (XHB, XLI) · ~42-day hold · 6 events

2.31

Sharpe

80.2%

CAGR

-3%

MaxDD

2.15

t-stat

83%

win rate

Mechanism

When BLS JOLTS monthly data shows a specific industry's job openings surging >20% YoY while total nonfarm openings are flat or declining, it signals sector-specific demand strength that has not yet been fully reflected in equity prices. The labor market leads revenue by 1-2 quarters: when an industry is aggressively hiring while others contract, it points to upcoming earnings beats. Construction openings (mapped to XHB) dominated the signal, firing during housing recovery inflections (2009, 2016, 2022-2023).

Rule

When JOLTS industry openings surge >20% YoY while total nonfarm JOLTS openings are flat or declining YoY, long the corresponding sector ETF (Healthcare→XLV, Construction→XHB, Manufacturing→XLI) for 42 trading days. 180-day minimum gap per industry.
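
The divergence test and ETF mapping can be sketched as below. "Flat or declining" is read here as total YoY growth at or below zero, which is an assumption; the function name is hypothetical, while the industry-to-ETF mapping is the one given in the rule.

```python
SECTOR_ETF = {"Healthcare": "XLV", "Construction": "XHB", "Manufacturing": "XLI"}


def divergence_trades(industry_yoy, total_yoy, surge=0.20):
    """ETFs to go long for one month of JOLTS data.

    industry_yoy: {industry name: YoY growth in openings} (assumed layout).
    total_yoy: YoY growth in total nonfarm openings.
    """
    if total_yoy > 0:  # total openings are growing, so no divergence
        return []
    return sorted(
        SECTOR_ETF[name]
        for name, growth in industry_yoy.items()
        if name in SECTOR_ETF and growth > surge
    )
```

The 180-day minimum gap per industry would be enforced by the caller across months.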

Caveats

6 events over 2009-2025 -- small sample. 83% win rate (5/6) with only one loss (-2.6%). Construction/XHB dominated the signal (5 of 6 events). Avg excess vs SPY of +6.4% per event is strong. However, the signal may simply be capturing housing cycle recoveries via a roundabout route. The JOLTS data has ~2 month publication lag. t-stat 2.15 is promising but sample size is tiny. Would benefit from more industry-ETF mappings to increase event count.

Source: FRED JOLTS + yfinance · backtests/PL478_jolts_industry_divergence_sector_rotation_long.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; sample too small: n_days=240, n_events=6.

PL489

★ DOE Industrial Assessment Center Efficiency Surge → Long ROK+EMR

equities (ROK, EMR) · ~40-day hold · 6 events

0.80

Sharpe

17.5%

CAGR

-26%

MaxDD

0.77

t-stat

83%

win rate

Mechanism

When DOE Industrial Assessment Center (IAC) annual data shows implementation recommendation rates increasing (>10pp YoY) and average projected savings per assessment rising (>20%), it signals a surge in industrial energy efficiency investment. This drives demand for automation and process control equipment from Rockwell Automation (ROK) and Emerson Electric (EMR). The IAC program conducts free energy assessments for mid-size manufacturers; rising adoption rates are a leading indicator of broader industrial efficiency capex. Energy price spikes and IRA incentives have been the strongest catalysts.

Rule

When DOE IAC annual data shows rising efficiency recommendation adoption and savings, long equal-weight ROK+EMR for 40 trading days post-publication.

Caveats

6 events over 2008-2023. 83% win rate (5/6) with only the 2008 GFC event producing a loss (-20.6%). The 2022-2023 events showed strong returns (+9.5%, +17.5%) driven by IRA incentives and energy price spikes. Small sample size limits statistical confidence (t-stat 0.77). DOE IAC data is annual with significant lag. ROK and EMR are diversified industrials, so IAC adoption is one of many revenue drivers. The signal may simply capture industrial capex cycle timing.

Source: DOE Industrial Assessment Center + yfinance · backtests/PL489_doe_iac_energy_efficiency_equipment.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; no OOS Sharpe computed (rerun via shared harness); sample too small: n_days=45, n_events=0.

PL509

★ IEA Oil Market Report Demand Up-Revision → Long USO + XOP

commodities/energy (USO, XOP) · ~15-day hold · 3 events

1.63

Sharpe

68.6%

CAGR

-13%

MaxDD

0.69

t-stat

33%

win rate

Mechanism

The IEA's monthly Oil Market Report (OMR) is the most widely tracked global oil demand forecast in the industry. When the IEA revises its current-year headline demand forecast UPWARD by ≥ +0.3 mb/d versus the prior month, physical and financial crude participants read it as a bullish signal on the supply/demand balance. WTI typically rallies as traders re-price the call on OPEC+ supply, and producer equities (XOP) respond with operational leverage as cash-flow forecasts are marked up. The 15-day window captures the post-release drift through subsequent OPEC+ commentary.

Rule

On an IEA OMR release date where current-year global oil demand forecast was revised UPWARD by ≥ +0.3 mb/d vs the prior month (same calendar-year baseline), enter long equal-weight USO (50%) + XOP (50%) at the next session's open; hold 15 trading days; exit at the close. SPY benchmark.
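
A minimal sketch of the revision trigger, assuming forecasts are stored as (release label, forecast year, demand in mb/d) tuples in release order. The tuple layout and function name are hypothetical. The difference is rounded before comparison because values with ~0.1 mb/d precision can otherwise miss the threshold through floating-point error.

```python
def uprevision_dates(forecasts, threshold=0.3):
    """Release labels where the same-year demand forecast rose >= threshold.

    forecasts: list of (release, forecast_year, demand_mbd) in release order.
    """
    hits = []
    for prev, cur in zip(forecasts, forecasts[1:]):
        same_year = prev[1] == cur[1]  # same calendar-year baseline only
        # Round so that e.g. 100.3 - 100.0 counts as exactly 0.3.
        revision = round(cur[2] - prev[2], 2)
        if same_year and revision >= threshold:
            hits.append(cur[0])
    return hits
```

Passing `threshold=0.2` would reproduce the relaxed variant mentioned in the caveats.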

Caveats

Heavy caveats: Only 3 strict events fired in 2021-2026 (all in 2022) and t-stat is just 0.69 -- result is dominated by a single +17.7% event in Feb-2022 (Russia/Ukraine invasion shock). Median event return is actually negative (-3.4%). The relaxed ≥ +0.2 mb/d threshold gives 9 events with win-rate 44% and a much weaker average. USO carries known contango roll yield drag, mechanically depressing long-WTI exposure outside backwardation regimes. OMR demand values are hard-coded from press releases (~0.1 mb/d precision); IEA historical revisions could re-class trigger dates. Strong tail-driven Sharpe -- treat as small-sample event-study, not as a continuous strategy.

Source: IEA Oil Market Report press releases + yfinance · backtests/PL509_iea_omr_demand_uprevision_crude.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; OOS Sharpe -0.76 ≤ 0.

PL511

★ CDC C. auris Surge Announcements → Long Antifungal/Diagnostics vs XLV

equities/healthcare (SCYX, DHR, TMO vs XLV) · 60-day hold · 5 events

0.61

Sharpe

21.9%

CAGR

-33.6%

MaxDD

0.66

t-stat

40%

win rate

Mechanism

CDC escalation announcements about Candida auris (HAN advisories, MMWR surveillance reports, AR Threats tier-1 designation) trigger hospital and public-health demand for novel antifungal therapeutics (SCYX is the pure-play antifungal developer with ibrexafungerp) and for clinical-diagnostics platforms able to rapidly identify resistant fungal isolates (DHR/Cepheid, TMO). Shorting equal-notional XLV neutralizes broad healthcare sector beta so the spread isolates the C. auris-specific demand shock. The 60-day window captures news → guidance → contract-flow propagation.

Rule

On each known CDC C. auris escalation announcement date (CDC HAN advisory, MMWR report on C. auris case acceleration, or annual CDC AR Threats update flagging C. auris as Tier 1 urgent threat), enter at next session's open a long basket (50% SCYX + 25% DHR + 25% TMO) and short an equal-notional XLV. Hold 60 trading days; exit at the close.
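
A sketch of the per-event return for a long basket against an equal-notional short. It assumes the same one-way slippage on every leg, which the card states only for SCYX, and expresses the result per unit of long notional. The function name and input layout are hypothetical.

```python
def pair_return(long_legs, short_leg, slippage_bps=25.0):
    """Net return of a long basket vs an equal-notional short.

    long_legs: {ticker: (weight, entry_price, exit_price)}, weights sum to 1.
    short_leg: (entry_price, exit_price) for the hedge instrument.
    slippage_bps: one-way cost, applied to all legs (assumption).
    """
    long_ret = sum(
        weight * (exit_px / entry_px - 1.0)
        for weight, entry_px, exit_px in long_legs.values()
    )
    short_ret = -(short_leg[1] / short_leg[0] - 1.0)
    # Entry and exit on both the long notional and the short notional.
    cost = 2 * 2 * slippage_bps / 1e4
    return long_ret + short_ret - cost
```

For example, `pair_return({"SCYX": (0.5, 2.0, 3.0), "DHR": (0.25, 200.0, 200.0), "TMO": (0.25, 500.0, 550.0)}, (130.0, 136.5))` uses made-up prices to show the 50/25/25 weighting against XLV.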

Caveats

Heavy caveats: Small event sample (n=5 curated dates, 2023-2025). Event dates are hand-curated from the CDC HAN/MMWR archive; different curators might select different dates, and hand-curation adds look-ahead risk. Win rate is just 40%; the headline CAGR is dominated by the +58% Mar-2023 event (first big public escalation around MMWR). Out-of-sample half is negative (OOS Sharpe -0.76); in-sample Sharpe was 1.38, a classic regime-decay pattern. SCYX is small-cap and illiquid (25 bps one-way slippage applied; realized cost could be higher). MNK was dropped (delisted). Overlapping events are not double-stacked. t-stat 0.66 is not significant, so treat this as a small-sample event study, not a continuous strategy.

Source: CDC HAN/MMWR archive + AR Threats Report; yfinance · backtests/PL511_candida_auris_antifungal_event.py

Demoted: no longer passes the tightened winner gate. Failed: fails Benjamini-Hochberg multiple-testing correction at FDR=0.05; sample too small: n_days=210, n_events=5.

PL524

★ ADF&G Bristol Bay Sockeye Mid-Run Shortfall → Long Norwegian Salmon Farmers (MHGVY+BKFKF) vs KR

equities/consumer-staples (MHGVY, BKFKF vs KR) · 42-day hold · 5 events

0.87

Sharpe

23.9%

CAGR

-20.5%

MaxDD

0.79

t-stat

60%

win rate (net)

Mechanism

Bristol Bay supplies ~40% of the world's wild sockeye salmon. When ADF&G's in-season management indicates cumulative mid-July escapement is running materially below preseason forecast, the global wild-sockeye supply gap gets filled by farmed Atlantic salmon from Norway, supporting NOK-denominated spot prices and producer margins at Mowi (MHGVY) and Bakkafrost (BKFKF). US grocers like KR carry inventory and markdown risk on their perishable seafood category as wholesale prices rise. The long Norwegian-farmer / short KR spread captures the relative margin tailwind to producers vs the squeezed retail-perishable margins. 42 trading days (~8 weeks) is the typical spot-price → producer-margin → equity propagation window.

Rule

On each curated ADF&G Bristol Bay mid-run sockeye-escapement-shortfall confirmation date (cumulative escapement vs preseason forecast running ≥ 25% below by July 10-20), enter at the next session's open a dollar-neutral pair: long basket 50% MHGVY + 50% BKFKF, short equal-notional KR. Hold 42 trading days, exit at close. SPY benchmark.

Caveats

Heavy caveats: Very small curated event sample (n=5: 2016, 2018, 2020, 2023, 2024). Event dates are approximate to ADF&G mid-July in-season management windows -- alternate curators could pick different dates, raising selection-bias and look-ahead concerns. MHGVY (Mowi) and BKFKF (Bakkafrost) are pink-sheet ADRs with thin US volume and material bid-ask spreads; backtest applies 25-30 bps one-way slippage but realized impact at scale would be higher. Bristol Bay shortfalls and Atlantic-salmon price dynamics are also driven by Chile/Norway disease cycles, biomass restrictions, and NOK/USD FX moves that this single-factor spread does not isolate. KR is exposed to broader US-grocery margin pressure independent of seafood mix. t-stat 0.79 is not statistically significant -- treat as a small-sample event-study trade, not a continuous strategy. OOS sub-period (post-2020) Sharpe is 1.72; IS half is negative, so the signal works largely from 2020-onward (Mowi/Bakkafrost ADR liquidity also grew over that period).

Source: ADF&G Bristol Bay annual season summaries, NOAA NMFS landings, ASMI market reports; yfinance · backtests/PL524_adfg_bristol_bay_sockeye_shortfall_salmon.py

PL661

FRED CPI Communication Subindex YoY Trough → Long XLY+XLP

macro · monthly trigger · 1810 days · 29 events

1.58

Sharpe

22.6%

CAGR (in-pos)

-14.7%

MaxDD

4.22

t-stat

2.39

OOS Sharpe

Mechanism

CPI Communication subindex (FRED CUSR0000SEEB) prints monthly. A YoY trough after 2+ months of deflation marks a deflationary inflection in a granular consumer category, historically followed by a discretionary + staples rotation as real wages firm and household budgets reallocate. This is a family-enumerated variant of the broader "CPI subindex YoY trough → consumer basket long" pattern; it survives BH correction at FDR=0.05 across the ~800-test catalog and shows a positive OOS Sharpe (2.39) on the held-out second half.

Rule

Each month: pull FRED CUSR0000SEEB (Communication CPI). When YoY change troughs (current month > prior month, after 2+ consecutive months of negative YoY), go long 50% XLY + 50% XLP for 42 trading days. Benchmark vs SPY.
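
One possible reading of the trough test, as a sketch. Beyond the stated conditions (current YoY above the prior month, preceded by 2+ negative months) it also requires that the prior month was not itself an uptick, so that only the turn is flagged; that extra condition and the function name are assumptions.

```python
def yoy_trough_indices(yoy, min_negative_months=2):
    """Indices where YoY turns up after a run of negative readings.

    yoy: list of monthly YoY changes in time order (no missing values).
    """
    hits = []
    start = max(2, min_negative_months)
    for i in range(start, len(yoy)):
        prior = yoy[i - min_negative_months:i]
        all_negative = all(g < 0 for g in prior)
        was_falling = yoy[i - 1] <= yoy[i - 2]  # assumption: flag the turn only
        if all_negative and was_falling and yoy[i] > yoy[i - 1]:
            hits.append(i)
    return hits
```

Each hit would open the 50% XLY + 50% XLP position for 42 trading days.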

Why it's promising

29 events across 1810 in-position days, t-stat 4.22, OOS Sharpe 2.39 on the held-out half, so the edge persists out of sample rather than being purely an in-sample fit. One of only 6 PL-prefixed strategies to pass the canonical winner gate (Sharpe>0.5 + CAGR>10% + BH-significant + OOS>0 + sample-size floor).
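
For reference, a sketch of how such a gate could be evaluated, including the standard Benjamini-Hochberg step-up procedure. The thresholds for Sharpe, CAGR and OOS follow the card; `min_events` is a placeholder because the catalog does not state the sample-size floor, and both function names are hypothetical.

```python
def bh_reject(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up: True where the null is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected


def passes_gate(sharpe, cagr, bh_significant, oos_sharpe, n_events,
                min_events=10):
    """Winner gate as listed on the card; min_events is an assumed floor."""
    return (
        sharpe > 0.5
        and cagr > 0.10
        and bh_significant
        and oos_sharpe is not None
        and oos_sharpe > 0
        and n_events >= min_events
    )
```

Because the procedure is step-up, a p-value that misses its own rank threshold can still be rejected when a larger p-value passes at a higher rank.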

Caveats

This is a family-enumeration variant of the CPI-subindex pattern — many sibling CPI subseries were tested. BH correction passed, but the underlying mechanism is auto-generated rather than hand-derived, so the causal story for "Communication CPI" specifically is weaker than for the broader pattern. Sample is 29 events over ~7 years — meaningful but not deep. Strategy is fully in / fully out; portfolio-level CAGR including cash periods is lower than the in-position figure.

Source: FRED CUSR0000SEEB; family enumeration of CPI subindex variants · backtests/PL661_cpi_subindex_cusr0000seeb.py

PL651

FRED CPI Shelter Subindex YoY Trough → Long XLY+XLP

macro · monthly trigger · 1260 days · 20 events

1.48

Sharpe

26.6%

CAGR (in-pos)

-16.9%

MaxDD

3.30

t-stat

1.74

OOS Sharpe

Mechanism

CPI Shelter subindex (FRED CUSR0000SAH1) is the largest single component of headline CPI (~33% weight) and famously lags rent-market reality by 6-18 months. A YoY trough after 2+ consecutive months of decline in this lagging series tends to coincide with the end of the disinflation phase that real wages and consumer budgets respond to. The discretionary + staples basket captures both the recovering middle-class spend (XLY) and the defensive bid that lingers from the prior tight-budget regime (XLP).

Rule

Each month: pull FRED CUSR0000SAH1 (Shelter CPI). When YoY change troughs (current month > prior month after 2+ consecutive months of decline), go long 50% XLY + 50% XLP for 42 trading days. Benchmark vs SPY.

Why it's promising

20 events across 1260 in-position days, t-stat 3.30, OOS Sharpe 1.74, so the edge survives the holdout. Shelter is structurally the slowest-moving CPI subindex, so its trough signal tends to be less noisy than those of faster-moving subindices. Passes the canonical winner gate (Sharpe>0.5 + CAGR>10% + BH-significant + OOS>0 + sample-size).

Caveats

Same family-enumeration caveat as PL661 — the Shelter subindex variant survived BH but the broader mechanism is auto-generated rather than hand-derived. Shelter CPI has a known 6-18 month lag vs real rent prints; the YoY-trough signal can therefore mis-time relative to the actual rent regime change. Sample is 20 events over ~5 years.

Source: FRED CUSR0000SAH1; family enumeration of CPI subindex variants · backtests/PL651_cpi_subindex_cusr0000sah1.py