
What predicts US residential real-estate returns?

~19 min read · generated 2026-05-23 · all data from free / public sources

Executive summary

  • Headline decile spread. On the 3-year horizon (post-2012 split), the model's top-decile ZIPs averaged +10.24% realized returns vs. the bottom decile at +2.82% — a +7.42-point out-of-sample spread.
  • Backtest hit-rate. Replaying month-by-month from 2018-01 through 2021-12 (48 basket-formation months × 3y hold), the top-decile basket beat the equal-weight universe in 91.7% of months at roughly +2.23 pp/yr alpha.
  • Timing verdict. The model's current (2026-04) universe-mean predicted fwd_3y annualized return is -1.35% — the 25th percentile of post-2012 history. The model is currently bearish vs. its historical baseline.
  • Hedge comparison. US residential RE comes last in an 8-asset cohort on median 3y real return (+3.5%); BTC and gold lead, REITs/equities mid-pack. Caveat: ZHVI is unlevered and ignores rent.
  • Non-obvious finding. Drawdown vs. the trailing-60-month ZHVI peak is the #1 mean-abs SHAP feature at fwd_1y — cross-sectional mean reversion is the dominant short-horizon signal, not demographics or schools.

How to read this report (a 60-second primer)

This report uses some specialized terms repeatedly. The shortest possible definitions:

ZIP / metro (CBSA)
A ZIP code is a USPS postal area (~22,000 across the US). A "metro" (the Census Bureau's Core-Based Statistical Area, or CBSA) is a group of adjacent counties, and therefore the ZIPs inside them, that share a labor market. We restrict to the top 100 metros by population.
Forward return (fwd_1y, fwd_3y, fwd_5y)
"fwd_3y" is the price change of a typical home in a ZIP over the next 3 years, annualized. The model uses only information available at the start of the holding period — no peeking ahead.
ZHVI / ZORI
Zillow's monthly Home Value Index (the price level we predict) and Observed Rent Index (the rent series behind the yield-aware variant of the model).
LightGBM
A gradient-boosted decision-tree model — the standard ML workhorse for tabular data. The headline model in this report; the appendix compares it to ElasticNet and a demographics-only hedonic baseline.
SHAP
For each prediction, SHAP attributes a contribution to every input feature. Top-SHAP features tell us what the model relies on most. See the methodology page.
Decile spread
The mean realized return of the 10% of ZIPs the model ranked highest, minus that of the 10% it ranked lowest. This is the bottom-line number for "would picking this model's top ZIPs have helped?"
Backtest
Replay history. At each past month, we run the model on data available then, take its top-decile picks, and check what those ZIPs actually returned 3 years later.
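To make the forward-return, decile-spread, and backtest definitions concrete, here is a minimal Python sketch. The function names and toy inputs are illustrative assumptions, not the report's actual pipeline; the sketch only shows the arithmetic behind the three terms.

```python
from statistics import mean


def annualized_forward_return(price_now, price_later, years):
    """Annualized price change between two index levels `years` apart."""
    return (price_later / price_now) ** (1.0 / years) - 1.0


def decile_spread(predicted, realized):
    """Mean realized return of the top-10% predicted ZIPs minus the bottom 10%."""
    if len(predicted) != len(realized) or len(predicted) < 10:
        raise ValueError("need at least 10 aligned observations")
    # Rank ZIPs by predicted return, lowest first.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    k = len(order) // 10  # number of ZIPs in one decile
    bottom = [realized[i] for i in order[:k]]
    top = [realized[i] for i in order[-k:]]
    return mean(top) - mean(bottom)


def hit_rate(basket_returns, universe_returns):
    """Share of formation months in which the basket beat the universe."""
    wins = sum(b > u for b, u in zip(basket_returns, universe_returns))
    return wins / len(basket_returns)


# Toy data: 20 hypothetical ZIPs whose realized return rises with the prediction.
predicted = [i / 100 for i in range(20)]
realized = [p / 2 for p in predicted]
print(round(decile_spread(predicted, realized), 4))
print(round(annualized_forward_return(100.0, 133.1, 3), 4))
```

For scale, a 91.7% hit rate over 48 formation months corresponds to 44 winning months (44 / 48 ≈ 0.917).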

1. Where might the model want to buy today?

Bottom line: Kansas City, Memphis, Detroit and St. Louis-area ZIPs dominate the top picks; predictions cluster around +5–7% annualized over 3 years, with a median 80% prediction band about 19 pp wide per ZIP. Bet sizing should reflect the band, not the point estimate.

[Interactive map: each ZIP shows its ZIP code, metro, ZHVI, predicted fwd_3y return, and decile rank.]

Using the post-2012 LightGBM trained on 3-year forward returns, we scored every ZIP in the top-100 US metros at the most recent month with enough feature coverage to make a confident prediction. The table below is rolled up to one row per metro (CBSA), showing the metro's principal (largest-population) city, the best-scoring ZIP inside that metro, the mean predicted return across the metro's top-5 ZIPs, and how many of its ZIPs land in the top 10% of all predictions nationally.

Metro (principal city) | CBSA | Best ZIP | Best ZIP price | Mean pred (top 5 zips) | Pred for best ZIP | p10 (80% PI lo) | p90 (80% PI hi) | Zips in top 10% | Top SHAP drivers for best ZIP (SHAP value)
Kansas City, MO | 28140 | 64127 | $87,604 | +6.23% | +7.38% | -3.65% | +19.90% | 14 | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.017), Rep vote share (county) (+0.012)
Chicago, IL | 16980 | 46406 | $77,661 | +5.03% | +6.97% | -2.02% | +17.20% | 6 | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.016), Rep vote share (county) (+0.012)
New Orleans, LA | 35380 | 70117 | $197,104 | +4.83% | +6.79% | -2.60% | +12.26% | 17 | 10Y real rate (TIPS) (-0.048), FEMA other disasters (+0.023), NRI wildfire risk (+0.013)
Buffalo, NY | 15380 | 14211 | $105,043 | +4.25% | +6.44% | -1.86% | +16.96% | 13 | 10Y real rate (TIPS) (-0.052), FEMA other disasters (+0.015), ZHVI at time t (+0.011)
Memphis, TN | 32820 | 38109 | $89,220 | +5.42% | +6.37% | -1.31% | +16.81% | 15 | 10Y real rate (TIPS) (-0.048), ZHVI at time t (+0.011), Rep vote share (county) (+0.011)
Milwaukee, WI | 33340 | 53206 | $68,826 | +2.88% | +6.22% | -2.84% | +18.82% | 4 | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.018), Median HH income (+0.012)
Tulsa, OK | 46140 | 74126 | $90,854 | +3.82% | +5.33% | -1.03% | +14.43% | 10 | 10Y real rate (TIPS) (-0.045), ZHVI at time t (+0.017), Rep vote share (county) (+0.012)
Jacksonville, FL | 27260 | 32209 | $117,406 | +4.27% | +5.06% | -2.09% | +13.77% | 19 | 10Y real rate (TIPS) (-0.048), Rep vote share (county) (+0.012), ZHVI at time t (+0.008)
Dayton, OH | 19430 | 45404 | $94,232 | +4.37% | +4.92% | -3.22% | +15.89% | 10 | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.012), Rep vote share (county) (+0.012)
Omaha, NE | 36540 | 51545 | $80,280 | +4.76% | +4.76% | -1.76% | +12.77% | 1 | 10Y real rate (TIPS) (-0.042), ZHVI at time t (+0.016), Rep vote share (county) (+0.012)
Akron, OH | 10420 | 44307 | $72,092 | +4.09% | +4.68% | -4.31% | +16.59% | 10 | 10Y real rate (TIPS) (-0.054), Rep vote share (county) (+0.012), ZHVI at time t (+0.011)
Oklahoma City, OK | 36420 | 74026 | $106,941 | +4.65% | +4.65% | -1.74% | +12.92% | 1 | 10Y real rate (TIPS) (-0.041), ZHVI at time t (+0.013), Rep vote share (county) (+0.011)
Toledo, OH | 45780 | 43607 | $71,388 | +3.32% | +4.63% | -3.61% | +15.96% | 7 | 10Y real rate (TIPS) (-0.054), ZHVI at time t (+0.012), Rep vote share (county) (+0.011)
Louisville/Jefferson County, KY | 31140 | 40211 | $85,011 | +2.64% | +4.60% | -3.18% | +16.41% | 3 | 10Y real rate (TIPS) (-0.052), ZHVI at time t (+0.015), Rep vote share (county) (+0.010)
Tampa, FL | 45300 | 34661 | $158,723 | +4.55% | +4.55% | -2.97% | +10.45% | 1 | 10Y real rate (TIPS) (-0.045), Rep vote share (county) (+0.010), Mean hospital star rating (-0.010)
Indianapolis, IN | 26900 | 46016 | $80,043 | +2.25% | +4.51% | -3.42% | +16.30% | 3 | 10Y real rate (TIPS) (-0.053), ZHVI at time t (+0.012), Rep vote share (county) (+0.010)
Baltimore, MD | 12580 | 21223 | $76,194 | +2.50% | +4.40% | -4.61% | +15.76% | 5 | 10Y real rate (TIPS) (-0.053), ZHVI at time t (+0.014), Rep vote share (county) (+0.009)
St. Louis, MO | 41180 | 63133 | $61,113 | +3.45% | +4.37% | -3.68% | +16.66% | 12 | 10Y real rate (TIPS) (-0.054), ZHVI at time t (+0.013), Rep vote share (county) (+0.012)
Pittsburgh, PA | 38300 | 15640 | $86,013 | +4.37% | +4.37% | -1.50% | +12.28% | 1 | 10Y real rate (TIPS) (-0.040), ZHVI at time t (+0.015), Rep vote share (county) (+0.010)
Cincinnati, OH | 17140 | 45214 | $123,142 | +3.45% | +4.29% | -3.34% | +15.33% | 5 | 10Y real rate (TIPS) (-0.055), Median HH income (+0.013), Rep vote share (county) (+0.009)
Cleveland, OH | 17410 | 44003 | $60,286 | +4.15% | +4.15% | -1.82% | +11.20% | 1 | 10Y real rate (TIPS) (-0.042), ZHVI at time t (+0.014), Rep vote share (county) (+0.011)
Baton Rouge, LA | 12940 | 70805 | $86,721 | +2.99% | +4.11% | -3.29% | +14.95% | 10 | 10Y real rate (TIPS) (-0.045), ZHVI at time t (+0.014), Rep vote share (county) (+0.012)
Wichita, KS | 48620 | 67214 | $82,065 | +2.62% | +4.05% | -3.42% | +16.55% | 5 | 10Y real rate (TIPS) (-0.053), ZHVI at time t (+0.012), Rep vote share (county) (+0.012)
Philadelphia, PA | 37980 | 19140 | $86,893 | +2.58% | +4.03% | -2.58% | +17.74% | 3 | 10Y real rate (TIPS) (-0.054), ZHVI at time t (+0.016), Median HH income (+0.009)
Atlanta, GA | 12060 | 31816 | $129,303 | +4.03% | +4.03% | -3.04% | +10.21% | 1 | 10Y real rate (TIPS) (-0.038), Rep vote share (county) (+0.012), ZHVI at time t (+0.011)
Knoxville, TN | 28940 | 37729 | $106,337 | +3.99% | +3.99% | -2.59% | +12.97% | 1 | 10Y real rate (TIPS) (-0.041), ZHVI at time t (+0.014), Rep vote share (county) (+0.013)
Detroit, MI | 19820 | 48238 | $55,536 | +3.66% | +3.97% | -4.37% | +19.09% | 13 | 10Y real rate (TIPS) (-0.063), ZHVI at time t (+0.014), M2 money supply (+0.012)
Riverside, CA | 40140 | 93562 | $69,621 | +3.95% | +3.95% | -5.50% | +12.99% | 1 | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.015), Rep vote share (county) (+0.011)
Augusta, GA | 12260 | 30901 | $71,236 | +1.83% | +3.89% | -4.01% | +15.47% | 2 | 10Y real rate (TIPS) (-0.047), ZHVI at time t (+0.014), Rep vote share (county) (+0.011)

Feature descriptions:
  • 10Y real rate (TIPS): Yield on the 10-year Treasury Inflation-Protected Security (FRED: DFII10). Real cost of capital.
  • ZHVI at time t: Current Zillow Home Value Index for the ZIP, i.e. the price level we're predicting the return on.
  • Rep vote share (county): County Republican vote share at the prior presidential election (MIT Election Lab).
  • FEMA other disasters: Other-category FEMA disasters in the trailing 10 years.
  • NRI wildfire risk: FEMA NRI wildfire risk score (0-100). Tract-level.
  • Median HH income: Median household income (ACS, $ per year).
  • Mean hospital star rating: Mean overall 1-5 CMS star rating across rated hospitals in the county.
  • M2 money supply: Seasonally-adjusted M2 money supply in billions of $ (FRED: M2SL). Liquidity proxy.
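The metro roll-up described before the table (best ZIP per metro, mean over the metro's top-5 ZIPs, count of ZIPs in the national top 10%) can be sketched as below. This is a minimal illustration, not the report's code: the dictionary keys `zip`, `cbsa`, and `pred` are hypothetical names, and ties at the national cutoff are counted as top-decile. One caution: every metro in the table with a single top-10% ZIP shows a mean equal to its best ZIP's prediction, which suggests the report may average only ZIPs that clear the national cutoff; the sketch follows the prose description instead.

```python
from collections import defaultdict
from statistics import mean


def metro_rollup(rows, top_n=5, top_share=0.10):
    """Roll ZIP-level predictions up to one summary row per metro.

    rows: list of dicts with hypothetical keys 'zip', 'cbsa', 'pred'.
    """
    # National cutoff: the prediction of the last ZIP inside the best tenth.
    ranked = sorted((r["pred"] for r in rows), reverse=True)
    n_top = max(1, int(len(ranked) * top_share))
    cutoff = ranked[n_top - 1]

    by_metro = defaultdict(list)
    for r in rows:
        by_metro[r["cbsa"]].append(r)

    out = []
    for cbsa, zips in by_metro.items():
        zips.sort(key=lambda r: r["pred"], reverse=True)
        out.append({
            "cbsa": cbsa,
            "best_zip": zips[0]["zip"],
            "best_pred": zips[0]["pred"],
            "mean_pred_top_n": mean(r["pred"] for r in zips[:top_n]),
            "zips_in_top_decile": sum(r["pred"] >= cutoff for r in zips),
        })
    # Order metros by their best ZIP, as the table does.
    out.sort(key=lambda m: m["best_pred"], reverse=True)
    return out


# Toy universe: two hypothetical metros with ten ZIPs each.
rows = [{"zip": f"A{i}", "cbsa": "A", "pred": 0.01 * i} for i in range(10)]
rows += [{"zip": f"B{i}", "cbsa": "B", "pred": 0.01 * i + 0.005} for i in range(10)]
summary = metro_rollup(rows)
print(summary[0]["cbsa"], summary[0]["best_zip"])
```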

1a. ZIP-level detail (top 20)

The same model expanded back to per-ZIP rows for users who want to drill into specific neighborhoods rather than cities.

ZIP | City, State | CBSA | ZHVI | ZORI | Pred fwd_3y ann. | p10 (80% PI lo) | p90 (80% PI hi) | Top SHAP drivers (SHAP value)
64127 | Kansas City, MO | 28140 | $87,604 | $987 | +7.38% | -3.65% | +19.90% | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.017), Rep vote share (county) (+0.012)
64128 | Kansas City, MO | 28140 | $89,801 | $1,165 | +7.27% | -3.33% | +19.10% | 10Y real rate (TIPS) (-0.057), ZHVI at time t (+0.015), Rep vote share (county) (+0.012)
46406 | Gary, IN | 16980 | $77,661 |  | +6.97% | -2.02% | +17.20% | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.015), Rep vote share (county) (+0.012)
70117 | New Orleans, LA | 35380 | $197,104 | $1,475 | +6.79% | -2.60% | +12.26% | 10Y real rate (TIPS) (-0.048), FEMA other disasters (+0.023), NRI wildfire risk (+0.013)
14211 | Buffalo, NY | 15380 | $105,043 |  | +6.44% | -1.86% | +16.96% | 10Y real rate (TIPS) (-0.052), FEMA other disasters (+0.015), ZHVI at time t (+0.011)
38109 | Memphis, TN | 32820 | $89,220 | $1,027 | +6.37% | -1.31% | +16.81% | 10Y real rate (TIPS) (-0.048), ZHVI at time t (+0.011), Rep vote share (county) (+0.011)
53206 | Milwaukee, WI | 33340 | $68,826 | $766 | +6.22% | -2.84% | +18.82% | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.018), Median HH income (+0.012)
64126 | Kansas City, MO | 28140 | $85,881 |  | +6.09% | -3.79% | +18.31% | 10Y real rate (TIPS) (-0.055), ZHVI at time t (+0.016), Rep vote share (county) (+0.012)
38106 | Memphis, TN | 32820 | $55,228 | $865 | +5.87% | -2.08% | +18.31% | 10Y real rate (TIPS) (-0.052), ZHVI at time t (+0.013), Wikipedia pageviews (+0.011)
64130 | Kansas City, MO | 28140 | $104,120 | $1,229 | +5.64% | -3.28% | +17.01% | 10Y real rate (TIPS) (-0.052), Rep vote share (county) (+0.012), ZHVI at time t (+0.010)
74126 | Tulsa, OK | 46140 | $90,854 |  | +5.33% | -1.03% | +14.43% | 10Y real rate (TIPS) (-0.044), ZHVI at time t (+0.017), Rep vote share (county) (+0.012)
74130 | Tulsa, OK | 46140 | $91,695 |  | +5.13% | -1.03% | +13.39% | 10Y real rate (TIPS) (-0.044), ZHVI at time t (+0.018), Rep vote share (county) (+0.012)
32209 | Jacksonville, FL | 27260 | $117,406 | $1,079 | +5.06% | -2.09% | +13.77% | 10Y real rate (TIPS) (-0.048), Rep vote share (county) (+0.012), ZHVI at time t (+0.008)
38114 | Memphis, TN | 32820 | $82,035 | $865 | +5.05% | -2.90% | +17.90% | 10Y real rate (TIPS) (-0.053), Rep vote share (county) (+0.011), Median HH income (+0.008)
46407 | Gary, IN | 16980 | $56,193 |  | +5.05% | -3.73% | +18.32% | 10Y real rate (TIPS) (-0.058), Median HH income (+0.011), Rep vote share (county) (+0.011)
38127 | Memphis, TN | 32820 | $88,799 | $990 | +4.93% | -2.31% | +17.89% | 10Y real rate (TIPS) (-0.051), ZHVI at time t (+0.011), Rep vote share (county) (+0.011)
45404 | Dayton, OH | 19430 | $94,232 |  | +4.92% | -3.22% | +15.89% | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.012), Rep vote share (county) (+0.012)
38126 | Memphis, TN | 32820 | $56,088 |  | +4.86% | -3.26% | +18.06% | 10Y real rate (TIPS) (-0.052), ZHVI at time t (+0.012), Rep vote share (county) (+0.010)
74650 | Ralston, OK | 46140 | $124,185 |  | +4.85% | -1.84% | +11.95% | 10Y real rate (TIPS) (-0.045), NRI wildfire risk (+0.011), ZHVI at time t (+0.011)
70041 | Buras, LA | 35380 | $140,424 |  | +4.81% | -3.59% | +11.59% | 10Y real rate (TIPS) (-0.048), FEMA other disasters (+0.016), Mean hospital star rating (-0.009)

Feature descriptions:
  • 10Y real rate (TIPS): Yield on the 10-year Treasury Inflation-Protected Security (FRED: DFII10). Real cost of capital.
  • ZHVI at time t: Current Zillow Home Value Index for the ZIP, i.e. the price level we're predicting the return on.
  • Rep vote share (county): County Republican vote share at the prior presidential election (MIT Election Lab).
  • FEMA other disasters: Other-category FEMA disasters in the trailing 10 years.
  • NRI wildfire risk: FEMA NRI wildfire risk score (0-100). Tract-level.
  • Median HH income: Median household income (ACS, $ per year).
  • Wikipedia pageviews: Monthly pageviews of the metro's primary Wikipedia article, a search-interest proxy.
  • Mean hospital star rating: Mean overall 1-5 CMS star rating across rated hospitals in the county.

Important caveats on the leaderboard:
  • The p10/p90 bands tell the real story. Top ZIPs show point predictions of +5% to +7%, but the 80% prediction interval per ZIP spans roughly [-2.7%, +15.8%], with a median band width of 18.7 points.
  • The negative SHAP contribution from fred.real_rate_10y appears for every zip — it's not a zip-specific signal, just "rates are high right now."
  • This is a model output, not investment advice. The model is unaware of zoning, school districts, walkability, specific listings, or your personal constraints.

2. Personal-overlay shortlists (Phase 4)

Bottom line: Two pre-filtered shortlists. The owner-occupier picks land mostly in DC-metro suburbs (22 of 30 zips fall in the Washington, DC metro); the investor picks spread across ~15 mid-tier metros at a median 13.7% gross rent yield.

This section presents two parallel ZIP-level shortlists, both nationwide and both built on the latest snapshot (2023-12): one tuned for an owner-occupier and one for a cash-flow investor. Filter thresholds use the median across top-100-metro ZIPs on the snapshot. Before the medians are taken, slow-moving structural features such as school student-teacher ratio and ACS demographics are forward-filled from the most recent non-null panel month; these features lag the price target by 1–2 years and don't change month to month. The underlying parquets keep all 30 rows; the display tables below dedupe by metro (CBSA) so you don't see the same city ten times.

2a. Owner-occupier (price-return focus)

Scored on the price-only fwd_3y model. Filters keep zips with positive expected price return AND low climate risk (wildfire + flood ≤ median), low recent disaster frequency (≤ 5 FEMA declarations in 10 years), strong schools (student-teacher ratio ≤ median), good health outcomes (premature death rate ≤ median, as a life-expectancy proxy), and majority owner-occupied (acs__owner_occupied_pct ≥ 0.5).
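As a rough illustration of how these owner-occupier filters combine, the sketch below applies them to a list of ZIP records. Only `acs__owner_occupied_pct` is a column name taken from the report; every other key is a hypothetical stand-in. It also assumes one reading of "wildfire + flood ≤ median", namely that the two NRI scores are summed and compared with the median of that sum.

```python
from statistics import median


def climate_risk(z):
    """Assumed combined climate score: NRI wildfire plus NRI flood."""
    return z["nri_wildfire"] + z["nri_flood"]


def owner_occupier_shortlist(zips, max_fema_10y=5, min_owner_share=0.5, limit=30):
    """Filter and rank ZIP records for the owner-occupier profile.

    All keys except 'acs__owner_occupied_pct' are hypothetical names.
    """
    # Medians come from the whole snapshot universe, before any filtering.
    med_climate = median(climate_risk(z) for z in zips)
    med_ratio = median(z["student_teacher_ratio"] for z in zips)
    med_death = median(z["premature_death_rate"] for z in zips)

    keep = [
        z for z in zips
        if z["pred_fwd_3y"] > 0                        # positive expected price return
        and climate_risk(z) <= med_climate             # low climate risk
        and z["fema_declarations_10y"] <= max_fema_10y # few recent disasters
        and z["student_teacher_ratio"] <= med_ratio    # strong schools
        and z["premature_death_rate"] <= med_death     # good health outcomes
        and z["acs__owner_occupied_pct"] >= min_owner_share
    ]
    keep.sort(key=lambda z: z["pred_fwd_3y"], reverse=True)
    return keep[:limit]
```

Because every condition is joined with AND, a ZIP with a high predicted return is still dropped if it misses a single threshold, which is one way the shortlist can end up concentrated in a few metros.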

30 zips; 22 of 30 zips fall in Washington, DC. Dominant metros: Washington, DC (22), Toledo, OH (7), Madison, WI (1). Median ZHVI $722,904; median predicted fwd_3y price return +1.59%.

ZIP | City, State | Metro | ZHVI | Pred fwd_3y | p10 | p90 | NRI wildfire | NRI flood | FEMA 10y | Stud/Tchr | Prem. death (yppl) | ACS own%
43406 | Bradner, OH | Toledo, OH | $152,730 | +2.29% | -1.18% | +10.62% | 45.5 | 53.7 | 2 | 14.2 | 6423 | 0.79
20152 | Chantilly, VA | Washington, DC | $832,524 | +2.09% | -3.82% | +7.72% | 33.0 | 35.7 | 4 | 12.5 | 3243 | 0.85
53566 | Monroe, WI | Madison, WI | $245,767 | +1.64% | -1.38% | +9.47% | 52.5 | 42.4 | 2 | 12.5 | 6196 | 0.69

Owner-occupier filters skipped: crime (the FBI NIBRS county-year aggregate is not yet loaded; see the caveats in 2c).

2b. Investor (total-return + rent-yield focus)

Scored on the total-return fwd_3y model (price appreciation + ZORI rent carry, trained in the Phase 2 total-return run). Filters keep zips in the top quartile of predicted total return AND with a gross rent yield ≥ 6%, a ZHVI in the cash-flow-friendly band ($80K–$400K), and ACS owner-occupancy share in [0.4, 0.7] so there's an active rental market without signalling deep distress.
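The investor filters can be sketched the same way. The gross rent yield here is taken to be twelve months of ZORI rent divided by ZHVI, which appears to match the rent-yield column in the table that follows (for ZIP 19133, 12 × $1,278 / $81,289 ≈ 18.87%). The record keys other than `acs__owner_occupied_pct` are hypothetical names, and "top quartile" is implemented by rank, which is one of several reasonable conventions.

```python
def gross_rent_yield(zhvi, zori):
    """Annual rent as a share of price: 12 monthly rents over the home value."""
    return 12.0 * zori / zhvi


def investor_shortlist(zips, min_yield=0.06,
                       price_band=(80_000, 400_000), own_band=(0.4, 0.7)):
    """Filter ZIP records for the cash-flow investor profile.

    Keys 'pred_total_fwd_3y', 'zhvi', and 'zori' are hypothetical names.
    """
    # Top quartile by predicted total return, taken by rank.
    ranked = sorted(zips, key=lambda z: z["pred_total_fwd_3y"], reverse=True)
    top_quartile = ranked[: max(1, len(ranked) // 4)]
    return [
        z for z in top_quartile
        if gross_rent_yield(z["zhvi"], z["zori"]) >= min_yield
        and price_band[0] <= z["zhvi"] <= price_band[1]
        and own_band[0] <= z["acs__owner_occupied_pct"] <= own_band[1]
    ]


# Reproduces the first row of the investor table from its ZHVI and ZORI.
print(round(100 * gross_rent_yield(81_289, 1_278), 2))
```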

30 zips; spread across 15 metros: Kansas City, MO (4), St. Louis, MO (4), Detroit, MI (3). Median ZHVI $89,010; median rent yield 13.74%; median predicted fwd_3y total return +27.90%.

ZIP | City, State | Metro | ZHVI | ZORI | Rent yield | Pred fwd_3y total | Pred fwd_3y (price) | ACS own%
19133 | Philadelphia, PA | Philadelphia, PA | $81,289 | $1,278 | +18.87% | +32.60% | +3.28% | 0.41
64128 | Kansas City, MO | Kansas City, MO | $89,801 | $1,165 | +15.57% | +32.12% | +7.27% | 0.45
63137 | Saint Louis, MO | St. Louis, MO | $86,974 | $1,100 | +15.18% | +31.21% | +3.56% | 0.51
35228 | Birmingham, AL | Birmingham, AL | $81,166 | $1,024 | +15.14% | +31.06% | +2.66% | 0.61
48219 | Detroit, MI | Detroit, MI | $80,777 | $1,085 | +16.12% | +29.74% | +2.14% | 0.55
38114 | Memphis, TN | Memphis, TN | $82,035 | $865 | +12.65% | +29.50% | +5.05% | 0.45
45406 | Dayton, OH | Dayton, OH | $85,512 | $1,006 | +14.12% | +28.72% | +4.13% | 0.48
44128 | Cleveland, OH | Cleveland, OH | $94,931 | $1,114 | +14.08% | +28.47% | +2.78% | 0.48
44306 | Akron, OH | Akron, OH | $80,899 | $849 | +12.59% | +28.27% | +3.58% | 0.45
46016 | Anderson, IN | Indianapolis, IN | $80,043 | $778 | +11.66% | +27.90% | +4.51% | 0.42
39212 | Jackson, MS | Jackson, MS | $85,231 | $1,016 | +14.31% | +26.12% | +2.79% | 0.64
32209 | Jacksonville, FL | Jacksonville, FL | $117,406 | $1,079 | +11.03% | +26.11% | +5.06% | 0.47
15210 | Pittsburgh, PA | Pittsburgh, PA | $106,779 | $1,173 | +13.18% | +25.50% | +2.37% | 0.55
21213 | Baltimore, MD | Baltimore, MD | $119,160 | $1,477 | +14.88% | +25.18% | +2.24% | 0.55
78207 | San Antonio, TX | San Antonio, TX | $120,432 | $1,188 | +11.84% | +24.93% | +2.05% | 0.44

2c. Overlap and threshold transparency

The two shortlists do not overlap on any ZIP. The owner-occupier profile up-weights low-risk, school-strong, low-disaster areas while the investor profile favors lower-priced, higher-yield areas; they end up pointing at different zips by construction.

Median thresholds used (over top-100-metro snapshot zips):

Caveats specific to this overlay.
  • NIBRS crime not yet in the feature store. The owner-occupier spec called for a crime filter but the FBI NIBRS county-year crime aggregate is gated on CDE_API_KEY and not yet loaded. The filter is skipped, documented above.
  • Total-return model has no quantile boosters. The investor table shows only point predictions for total return — the Phase 2 total-return run was point-only by design (runtime budget).
  • Slow-moving features are forward-filled. School ratios, ACS demographics, NRI scores, FEMA disaster counts, and premature mortality use each ZIP's most-recent non-null value across the panel (not the scoring month, which is often unrefreshed). For stable structural features this is a fair approximation; we're not claiming a real-time read on these.
  • This is not investment advice. The model is unaware of zoning, specific listings, walkability, your personal commute or family constraints — it's a quantitative sort, not a buy recommendation.
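The forward-fill caveat above is easy to picture with a small sketch. It assumes each slow-moving feature is held as a monthly list per ZIP with `None` for unrefreshed months; that layout and the function names are illustrative assumptions, not the report's storage format.

```python
def forward_fill(values):
    """Replace each None with the most recent earlier non-null value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def latest_non_null(values):
    """Most recent non-null value in a monthly series, or None if all are missing."""
    for v in reversed(values):
        if v is not None:
            return v
    return None


# Hypothetical student-teacher ratio series, refreshed only twice.
ratio = [None, 14.2, None, None, 13.9, None]
print(forward_fill(ratio))
print(latest_non_null(ratio))
```

Months before the first observation stay `None`; the fill never looks ahead, so it does not leak future values into earlier scoring months.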

3. Macro context: is now a good time? And how does RE stack up against other hedges?

Bottom line: The model predicts the broad RE universe will return -1.35%/yr over the next 3 years — the 25th percentile of post-2012 history. Across 25 years of monthly rolling windows, residential RE has trailed every other inflation hedge in our cohort (BTC, gold, equities, REITs) before leverage and rent are added back.

Everything above is about where to buy. The two questions below are about when and what else: does the model think the broad RE universe is cheap or expensive right now, and how has private RE actually fared against the other classic inflation hedges (stocks, gold, REITs, BTC) when you put them on the same axis?

3a. Is now a good time to buy?

For each month from 2014 through the latest in the panel, we score every top-100-metro ZIP with the headline post-2012 LightGBM and take the universe mean of the predicted fwd_3y annualized return. The model's prediction at month t can be compared to the realized universe-mean fwd_3y return three years later (dashed line) for all months where t+3y is observable.

The model's current (2026-04) universe-mean predicted fwd_3y annualized return is -1.35% (median of the per-ZIP distribution that month; 10,396 ZIPs scored, 80% prediction interval roughly [-2.61%, -0.14%]). The historical median across all scored months in the post-2012 panel is +5.95% with an interquartile range of [-1.35%, +6.35%]. The current prediction sits at the 25th percentile of history — the model is currently bearish vs the historical baseline (n = 160 months scored).
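Placing the current prediction within its own history is a percentile-rank calculation. The sketch below uses the "share of historical values at or below the current one" convention, which is an assumption; the report does not say which convention it uses, and the toy history is made up.

```python
def percentile_rank(history, current):
    """Percent of historical values that are at or below `current`."""
    at_or_below = sum(v <= current for v in history)
    return 100.0 * at_or_below / len(history)


# Toy history of 160 monthly universe-level predictions (hypothetical values).
history = list(range(1, 161))
print(percentile_rank(history, 40))
```

With 160 scored months, a 25th-percentile reading means roughly 40 months had a prediction at or below the current one.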

Read the timing chart carefully. The model was trained on data ≤ 2017-12 with a test window of 2018-01 onward, so anything from 2014-2017 is in-sample. From 2018 forward, the gap between blue (predicted) and green (realized) is honest out-of-sample error. The model under-predicted the 2020-2022 COVID-era surge — a regime the training set didn't cover — and is now predicting near-zero / mildly negative returns because trailing real-rate prints are at decade-plus highs.

How wide is the model's uncertainty over time?

Three separate LightGBM models trained at the 10th, 50th, and 90th conditional quantiles produce an 80% prediction band around the median. When the band is narrow, the model is confident; when it widens, it is hedging.
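Quantile models of this kind are fit by minimizing the pinball (quantile) loss, and a band is usually checked by how often realized values land inside it. The sketch below shows both in plain Python. In LightGBM the usual route is `objective="quantile"` with `alpha` set to the quantile level; whether the report's models were configured exactly that way is not stated, so treat that as an assumption.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball loss at quantile level q, with 0 < q < 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_coverage(y_true, lo, hi):
    """Share of realized values that fall inside the [lo, hi] band."""
    inside = sum(l <= y <= h for y, l, h in zip(y_true, lo, hi))
    return inside / len(y_true)


# At q = 0.9, missing low costs nine times as much as missing high.
print(pinball_loss([1.0], [0.0], 0.9))
print(round(pinball_loss([0.0], [1.0], 0.9), 3))
```

A well-calibrated 80% band built from the 10th and 90th quantile models should show coverage near 0.8 on held-out data.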

3b. Real estate vs other inflation hedges

Real estate is rarely held in isolation in a real portfolio: the question "is now a good time to buy a house" is really "is real estate the best use of the marginal dollar right now". We put residential RE on the same chart as the S&P 500, broad equities (VTI), publicly-traded REITs (IYR / VNQ), gold, and Bitcoin, all in real terms (net of CPI). Residential RE is measured unlevered and with no adjustments: rent, maintenance, property tax, and transaction costs are all ignored. We treat this as conservative for RE, since it leaves out the leverage and rental income a typical owner earns, though it also leaves out their carrying costs.
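"In real terms" and "rolling" can both be made concrete with a short sketch. It assumes monthly, date-aligned price and CPI series held as plain lists and deflates by the ratio of CPI levels; the function names and toy numbers are illustrative, and the report's exact deflation and date alignment are not shown.

```python
def annualized_real_return(price_start, price_end, cpi_start, cpi_end, years):
    """Annualized return after deflating nominal price growth by CPI growth."""
    real_growth = (price_end / price_start) / (cpi_end / cpi_start)
    return real_growth ** (1.0 / years) - 1.0


def rolling_real_returns(prices, cpi, window_months=36):
    """Annualized real return for every full window in two aligned monthly series."""
    years = window_months / 12.0
    return [
        annualized_real_return(prices[i], prices[i + window_months],
                               cpi[i], cpi[i + window_months], years)
        for i in range(len(prices) - window_months)
    ]


# Toy check: 21% nominal growth with 10% inflation over one year is 10% real.
print(round(annualized_real_return(100.0, 121.0, 100.0, 110.0, 1), 4))

# Toy series: 40 months of 1% monthly growth and flat CPI give four 3-year windows.
prices = [100.0 * 1.01 ** i for i in range(40)]
windows = rolling_real_returns(prices, [100.0] * 40)
print(len(windows))
```

The median and the share of positive windows in the tables below would then be summary statistics over a list like `windows`.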

Historical 3y / 5y annualized real returns (rolling, since 2000)

| asset | history starts | median 3y real return | std 3y real | 3y windows positive | median 5y real return | 5y windows positive |
|---|---|---|---|---|---|---|
| US residential RE (ZHVI top-100 mean) | 2000 | +3.53% | 5.13% | 71% | +3.14% | 73% |
| S&P 500 (price) | 2000 | +6.74% | 8.49% | 77% | +6.75% | 72% |
| S&P 500 (total return) | 2000 | +8.87% | 8.66% | 81% | +8.67% | 77% |
| VTI (broad US equity) | 2001 | +9.26% | 7.58% | 88% | +8.82% | 85% |
| Gold (front-month futures) | 2000 | +10.60% | 11.29% | 77% | +7.62% | 81% |
| REITs (IYR ETF) | 2000 | +6.28% | 10.70% | 75% | +5.09% | 75% |
| REITs (VNQ ETF) | 2004 | +4.58% | 10.19% | 71% | +4.11% | 81% |
| Bitcoin | 2014 | +65.59% | 63.72% | 99% | +56.02% | 98% |

Trailing 3-year real returns ending the most recent month

| asset | window ends | trailing 3y real return | trailing 3y nominal | trailing-10y median 3y real | spread vs 10y median |
|---|---|---|---|---|---|
| Bitcoin | 2026-05-01 | +36.99% | +41.23% | +65.59% | -28.60 pp |
| Gold (front-month futures) | 2026-05-01 | +35.33% | +40.35% | +3.79% | +31.54 pp |
| S&P 500 (total return) | 2026-05-01 | +19.42% | +23.12% | +9.51% | +9.91 pp |
| VTI (broad US equity) | 2026-05-01 | +19.01% | +22.70% | +9.22% | +9.79 pp |
| S&P 500 (price) | 2026-05-01 | +17.82% | +21.47% | +7.33% | +10.49 pp |
| REITs (VNQ ETF) | 2026-05-01 | +7.61% | +10.94% | +3.12% | +4.49 pp |
| REITs (IYR ETF) | 2026-05-01 | +7.16% | +10.48% | +3.45% | +3.72 pp |
| US residential RE (ZHVI top-100 mean) | 2026-04-01 | -0.10% | +3.05% | +3.75% | -3.85 pp |

Cumulative real return per asset, 2000-now (log scale)

What beat what. Across the full 2000-now history of monthly rolling windows, the median 3-year annualized real return ranking is dominated by Bitcoin (+65.6%, on a much shorter post-2014 history and with extreme tail risk) and then by gold and broad-equity / S&P 500 total return. REITs sit in the +4–6% range. The Zillow ZHVI top-100 mean — our private-RE benchmark, before leverage and before rent — comes in at +3.53% median 3y real, ranking #8/8. The trailing-3y leader as of the latest month is Bitcoin at +36.99% real.

Honest caveats on the hedge comparison:
  • BTC selection bias. Bitcoin's history starts in 2014, a strictly post-GFC, mostly-bull-market regime. Other assets include 2000-2002 and 2008-2009.
  • REITs are not private RE. IYR/VNQ are leveraged, mark-to-market, and earn public-market liquidity premia. Best free TR proxy, but different animal.
  • ZHVI ignores rent, maintenance, and property tax. The yield-aware fwd_3y_total model in Section 4 partially addresses this per ZIP.
  • No leverage assumed. Real-estate investors typically run 4–5× leverage; the equity assets here are unlevered.

4. Alternative ranking: yield-aware total return

Bottom line: A separate model trained on price + rent carry produces a substantially different shortlist — only ~22% overlap with the price-only top-50 — tilted toward rust-belt high-yield ZIPs (Detroit, Cleveland, Dayton) at a median 15.8% gross rent yield. Useful for cash-flow investors; smaller universe than the price-only run.

Everything in Section 1 ranks zips by predicted price appreciation. An investor who cares about yield (rent flowing in, not just paper appreciation) wants a different ranking: a separate LightGBM trained on fwd_3y_total = price_return + ZORI rent yield carry. The leaderboard below shows the top 30 zips by that target. Caveat: ZORI covers only ~11% of zip-months, so this ranking's universe is much smaller (~2K zips vs ~10K for the price-only model).
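A minimal sketch of the yield-aware target, using the gross-yield definition given in the feature list (12 × ZORI / ZHVI). How the report turns that yield into "carry" over the 3-year hold is not spelled out, so the `total_return_target` function below is an assumption: it simply adds the yield observed at time t to the annualized price return.

```python
def gross_rent_yield(zori_monthly, zhvi):
    """Gross rental yield = 12 x monthly rent index / home value index."""
    return 12.0 * zori_monthly / zhvi


def total_return_target(price_return_ann, zori_monthly, zhvi):
    """ASSUMPTION: carry is added as the gross yield observed at time t."""
    return price_return_ann + gross_rent_yield(zori_monthly, zhvi)


# First leaderboard row in this section: ZIP 30901, ZHVI $71,236, ZORI $1,229.
y = gross_rent_yield(1229, 71236)
print(f"{y:.1%}")  # 20.7%, matching the Gross yield column
```

Because the yield is gross (no vacancy, maintenance, tax or financing costs), very cheap ZIPs with ordinary rents score extremely high on it, which is consistent with the leaderboard's tilt.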

| ZIP | CBSA | ZHVI | ZORI | Gross yield | Pred fwd_3y total ann. | Top SHAP drivers |
|---|---|---|---|---|---|---|
| 30901 | 12260 | $71,236 | $1,229 | 20.7% | +34.05% | Median home value (+0.086), rent_yield_t (+0.064), ZHVI at time t (+0.025) |
| 38106 | 32820 | $55,228 | $865 | 18.8% | +33.93% | Median home value (+0.087), rent_yield_t (+0.066), ZHVI at time t (+0.028) |
| 48238 | 19820 | $55,536 | $1,041 | 22.5% | +33.38% | Median home value (+0.088), rent_yield_t (+0.059), ZHVI at time t (+0.029) |
| 19133 | 37980 | $81,289 | $1,278 | 18.9% | +32.60% | Median home value (+0.086), rent_yield_t (+0.062), ZHVI at time t (+0.024) |
| 43605 | 45780 | $59,051 | $953 | 19.4% | +32.48% | Median home value (+0.086), rent_yield_t (+0.059), ZHVI at time t (+0.026) |
| 19132 | 37980 | $78,112 | $1,258 | 19.3% | +32.31% | Median home value (+0.085), rent_yield_t (+0.063), ZHVI at time t (+0.023) |
| 21223 | 12580 | $76,194 | $1,355 | 21.3% | +32.24% | Median home value (+0.084), rent_yield_t (+0.058), ZHVI at time t (+0.023) |
| 14611 | 40380 | $94,557 | $1,351 | 17.1% | +32.18% | Median home value (+0.086), rent_yield_t (+0.058), ZHVI at time t (+0.018) |
| 64128 | 28140 | $89,801 | $1,165 | 15.6% | +32.12% | Median home value (+0.084), rent_yield_t (+0.058), ZHVI at time t (+0.022) |
| 43609 | 45780 | $61,337 | $823 | 16.1% | +31.94% | Median home value (+0.085), rent_yield_t (+0.057), ZHVI at time t (+0.026) |
| 45417 | 19430 | $66,696 | $1,000 | 18.0% | +31.81% | Median home value (+0.084), rent_yield_t (+0.059), ZHVI at time t (+0.023) |
| 43607 | 45780 | $71,388 | $1,225 | 20.6% | +31.54% | Median home value (+0.083), rent_yield_t (+0.057), ZHVI at time t (+0.022) |
| 40212 | 31140 | $69,586 | $915 | 15.8% | +31.44% | Median home value (+0.084), rent_yield_t (+0.061), ZHVI at time t (+0.022) |
| 44110 | 17410 | $79,486 | $906 | 13.7% | +31.30% | Median home value (+0.082), rent_yield_t (+0.045), ZHVI at time t (+0.024) |
| 40210 | 31140 | $71,852 | $998 | 16.7% | +31.30% | Median home value (+0.084), rent_yield_t (+0.061), ZHVI at time t (+0.022) |
| 35211 | 13820 | $75,450 | $1,098 | 17.5% | +31.24% | Median home value (+0.083), rent_yield_t (+0.060), ZHVI at time t (+0.023) |
| 44105 | 17410 | $69,008 | $960 | 16.7% | +31.23% | Median home value (+0.084), rent_yield_t (+0.057), ZHVI at time t (+0.021) |
| 63137 | 41180 | $86,974 | $1,100 | 15.2% | +31.21% | Median home value (+0.085), rent_yield_t (+0.060), ZHVI at time t (+0.022) |
| 48228 | 19820 | $58,895 | $1,052 | 21.4% | +31.18% | Median home value (+0.082), rent_yield_t (+0.059), ZHVI at time t (+0.025) |
| 44108 | 17410 | $69,151 | $1,035 | 18.0% | +31.18% | Median home value (+0.086), rent_yield_t (+0.059), ZHVI at time t (+0.023) |
| 48227 | 19820 | $62,223 | $1,137 | 21.9% | +31.07% | Median home value (+0.081), rent_yield_t (+0.059), ZHVI at time t (+0.025) |
| 35228 | 13820 | $81,166 | $1,024 | 15.1% | +31.06% | Median home value (+0.083), rent_yield_t (+0.060), ZHVI at time t (+0.023) |
| 53206 | 33340 | $68,826 | $766 | 13.3% | +30.97% | Median home value (+0.078), rent_yield_t (+0.042), ZHVI at time t (+0.023) |
| 40211 | 31140 | $85,011 | $1,076 | 15.2% | +30.87% | Median home value (+0.075), rent_yield_t (+0.063), ZHVI at time t (+0.020) |
| 48205 | 19820 | $53,492 | $1,158 | 26.0% | +30.80% | Median home value (+0.081), rent_yield_t (+0.055), ZHVI at time t (+0.026) |
| 19140 | 37980 | $86,893 | $1,207 | 16.7% | +30.71% | Median home value (+0.082), rent_yield_t (+0.059), ZHVI at time t (+0.022) |
| 63136 | 41180 | $74,951 | $1,082 | 17.3% | +30.29% | Median home value (+0.081), rent_yield_t (+0.059), ZHVI at time t (+0.021) |
| 44112 | 17410 | $82,598 | $756 | 11.0% | +29.93% | Median home value (+0.077), rent_yield_t (+0.038), ZHVI at time t (+0.022) |
| 63121 | 41180 | $83,882 | $1,024 | 14.7% | +29.90% | Median home value (+0.073), rent_yield_t (+0.061), ZHVI at time t (+0.020) |
| 48224 | 19820 | $78,053 | $1,195 | 18.4% | +29.79% | Median home value (+0.080), rent_yield_t (+0.058), ZHVI at time t (+0.021) |

SHAP driver descriptions: Median home value is the ACS-reported median owner-occupied home value (self-reported survey). rent_yield_t has no detailed label registered. ZHVI at time t is the current Zillow Home Value Index for the zip, the price level we're predicting the return on.

The decision-relevant difference: only 11 of 50 zips appear in both this top 50 and the price-only top 50, a 22% overlap. Price-only tilts toward sunbelt fringe (Tulsa, Kansas City, Memphis, New Orleans) at median rent yield ~12.5%. Total-return is dominated by rust-belt high-yield zips (Detroit ×7, Cleveland ×5, Dayton, Akron, Toledo, Birmingham) at median rent yield ~15.8%. The model architecture is the same; making the target yield-aware is what produces the very different shortlist.

5. Headline results

Bottom line: The best-performing model for each horizon, on both temporal splits. The post-2012 block is the main result; the pre-2008 block is a stress test through the housing crisis. A signal that shows up in both is much harder to dismiss as overfitting to one regime.

The table below reports, for each horizon and each temporal split, the model with the best result. The post-2012 block has the larger, more recent test window; the pre-2008 block runs through the 2008 housing crisis. The bar for a robust signal is the same sign and the same order-of-magnitude spread in both blocks.

| split | horizon | best model | Spearman ρ | decile spread | top decile | bottom decile |
|---|---|---|---|---|---|---|
| post-2012 (main) | short (1y) | elasticnet | +0.311 | +9.26% | +13.01% | +3.75% |
| post-2012 (main) | medium (3y) | lightgbm | +0.327 | +7.42% | +10.24% | +2.82% |
| post-2012 (main) | long (5y) | lightgbm | +0.432 | +3.65% | +10.09% | +6.44% |
| pre-2008 (GFC stress) | short (1y) | elasticnet | +0.490 | +14.18% | +6.05% | -8.13% |
| pre-2008 (GFC stress) | medium (3y) | elasticnet | +0.346 | +10.68% | +3.68% | -6.99% |
| pre-2008 (GFC stress) | long (5y) | lightgbm | +0.070 | +2.60% | +5.34% | +2.74% |

Why R² is not reported as the primary metric: traditional R² is negative on both test windows for nearly every model. This is not a model failure; it is a property of the test windows. Both contain regime shifts (housing crash; rate shock + COVID) where the mean realized return is very different from the training period. Spatial ranking (Spearman + decile spread) is intact and is what matters for a buy-this-zip-not-that-zip decision.
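A toy example makes the R² point concrete. In the sketch below the predictions order five hypothetical ZIPs perfectly, but the whole market runs 6 points hotter than predicted: Spearman ρ is 1.0 while R² is strongly negative. The numbers are illustrative, and the rank helper assumes no ties.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken around the test-set mean."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def ranks(values):
    """Ranks 1..n; assumes no ties, which holds for the toy data below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def spearman(a, b):
    """Spearman rho = Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical returns in percent: perfect ordering, 6-point level shift.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
realized = [7.0, 8.0, 9.0, 10.0, 11.0]

print(spearman(predicted, realized), r_squared(realized, predicted))  # 1.0 -17.0
```

The level error dominates the squared-error metric, while the rank metric ignores it entirely.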

Why this study exists

Bottom line: Most "where should I buy?" advice is qualitative or single-metric. This study casts a wide net (140+ candidate predictors) and lets out-of-sample model performance + SHAP attribution decide what actually moves forward returns.

Residential real estate is the largest asset class most people will ever own, and the conventional wisdom for picking where and when to buy is dominated by qualitative heuristics (school districts, "up-and-coming neighborhoods", proximity to Whole Foods) or single-metric proxies (population growth, median income). This study asks: which of these signals actually predict forward returns, and by how much, separated by time horizon — and are there underweighted predictors that the textbook approach misses?

The framing is deliberately quantitative. We built a 3.73-million-row panel of every zip code in the top 100 US metros, monthly, 2000-2026, joined with 140+ candidate predictors organized into 14 thematic classes. We trained gradient-boosted models to predict 1-, 3-, and 5-year forward home-price returns on two backtest windows: pre-2008 (GFC stress test) and post-2012 (rate-shock + COVID).

The setup in one minute

Bottom line: Predict 1y / 3y / 5y forward home-price returns per ZIP, using only information available at the start of the holding period. Two temporal splits (pre-2008, post-2012) plus walk-forward and spatial CV.

What we predict

For each zip code at each month, we compute the forward home-price return (Zillow ZHVI index) over three horizons — 1 year, 3 years, 5 years — annualized for the latter two. These are the targets our models try to learn from upstream features.
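The target definition can be sketched in a few lines. This is an illustration under the stated definition (price change over the next h months, annualized), using a hypothetical monthly index; the function name and the choice to return `None` for unobservable windows are assumptions, not the report's code.

```python
def forward_return(prices, t, horizon_months):
    """Annualized forward return from month index t over horizon_months.

    Returns None when the window runs past the end of the series, which is
    why the most recent months have no realized target yet.
    """
    end = t + horizon_months
    if end >= len(prices):
        return None
    growth = prices[end] / prices[t]
    # For a 12-month horizon the exponent is 1, so no annualization occurs.
    return growth ** (12.0 / horizon_months) - 1.0


# Hypothetical monthly index growing 0.5% per month for 6 years.
zhvi = [200_000 * 1.005 ** m for m in range(73)]
targets = {name: forward_return(zhvi, 0, h)
           for name, h in [("fwd_1y", 12), ("fwd_3y", 36), ("fwd_5y", 60)]}
print(targets)
```

On a constant-growth series all three horizons annualize to the same rate, which is a quick sanity check on the exponent.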

The chart below illustrates what those forward windows look like for a single real ZIP. The diamond marks an example "prediction date"; the three shaded windows are the 1-, 3-, and 5-year forward periods we score the model against. Predict-only-on-what-you-knew-at-time-t, score-on-what-actually-happened-by-time-t+h is the discipline that makes the backtest honest.

How we predict it

For each (zip, month) row, we join a wide set of features that were knowable at that time, with an explicit publication-lag adjustment per source (e.g., HMDA mortgage data isn't available until 9 months after the year it describes). We then train two families of models per horizon.
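The publication-lag adjustment amounts to an as-of lookup: a value is usable at month t only if its period end plus the source's lag is on or before t. The sketch below is a minimal stdlib illustration; the function names, the date convention, and the sample series are hypothetical, with the 9-month lag taken from the HMDA example.

```python
from bisect import bisect_right
from datetime import date


def add_months(d, months):
    """Shift a first-of-month date by a whole number of months."""
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)


def as_of_value(observations, lag_months, as_of):
    """Latest observation that was already published on `as_of`.

    observations: list of (period_end, value) sorted by period_end, where
    period_end is the first day of the month after the period closes.
    """
    available = [add_months(period_end, lag_months) for period_end, _ in observations]
    pos = bisect_right(available, as_of)
    return observations[pos - 1][1] if pos else None


# Hypothetical annual series with a 9-month publication lag.
annual = [(date(2016, 1, 1), "FY2015"), (date(2017, 1, 1), "FY2016")]
print(as_of_value(annual, 9, date(2017, 6, 1)))   # FY2015
print(as_of_value(annual, 9, date(2017, 10, 1)))  # FY2016
```

A row dated June 2017 therefore still sees the 2015 figures, even though 2016 has ended, because the 2016 release has not been published yet.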

How we validate

Two temporal splits, both reported in this run: pre-2008 (GFC stress test) and post-2012 (rate shock + COVID).

This run additionally reports spatial cross-validation (hold out entire CBSAs) and walk-forward CV (rolling 12-month test windows) on top of the two static splits; see Section 6 below.

14 thematic feature classes

The current feature store covers 22 data sources organized into 14 thematic classes, producing 140+ modeling features after coverage filtering.

  • Macro / rates: 30Y mortgage, 10Y TIPS, M2, Case-Shiller, lumber, US unemployment + their 1/3/12-month deltas (FRED)
  • Inventory: months-of-supply, days-on-market, price cuts, active/new/pending listings per zip (Realtor.com)
  • Demographics: population by age cohort, household income, education, home value, rent, owner-occupancy at ZCTA (Census ACS, 10 vintages 2013-2022); per-capita personal income at county (BEA)
  • Migration: county-to-county AGI flow (IRS SOI)
  • Supply: building permits at MSA (Census BPS); business establishments + employment per zip (Census ZBP)
  • Jobs: wages + YoY growth at county (BLS QCEW); monthly unemployment rate at county (BLS LAUS)
  • Affordability / cost of living: effective property-tax rate per ZCTA (ACS B25103 / B25077); HUD Small-Area Fair Market Rent per ZIP; residential electricity price per state (EIA)
  • Healthcare: CMS Hospital Compare star ratings + bed counts by county; County Health Rankings composite + headline measures (life expectancy, premature death, smoking, obesity)
  • Schools: NCES district-year enrollment, pupil-teacher ratio, free-lunch share, per-pupil spending — county-broadcast
  • Politics: county presidential vote share + winning margin (MIT Election Lab, 2000-2024)
  • Climate: trailing-10-year disaster declarations by category (OpenFEMA); flood, wildfire, hurricane, heatwave risk scores at tract (FEMA NRI); annual PM2.5 + ozone air quality (EPA AQS)
  • Amenities + infrastructure: EV charging stations per zip (DOE AFDC)
  • Behavioral: monthly Wikipedia pageviews per metro article (search-interest proxy)
  • Derived (no new data): gross rental yield = 12 × ZORI / ZHVI; rolling 24-mo ZHVI volatility; ZHVI drawdown vs trailing-60-mo peak

Pending modules with code ready but waiting on credentials

HMDA (tract-level mortgage records — currently only state-level aggregations), FCC BDC broadband, NOAA VIIRS satellite nightlights, Foursquare POIs (Whole Foods / coffee / breweries), FBI NIBRS county-year crime rates (needs CDE_API_KEY), USAspending federal $ flows.

6. Confidence: walk-forward and spatial robustness

Bottom line: The headline fwd_3y Spearman survives both stronger CV regimes. Walk-forward over 5 rolling 12-month windows gives ρ = +0.447 ± 0.233; spatial-CV (whole-CBSA holdout) on the within-metro residual gives ρ = +0.287 — the genuine cross-sectional alpha after stripping out the "expensive zips stay expensive" level effect.

A single train/test split can flatter or punish a model by accident. Two stronger CV regimes re-run the post-2012 LightGBM to bound how much of the headline number is structural vs. noise.

1a. Walk-forward CV (rolling-origin)

Train through year T−1, test on year T; roll the cutoff from 2018-12 through 2022-12. Five test years per horizon (fewer for fwd_5y, which would need observability beyond the panel's end). The schematic below shows the sliding window: each row is one fold, blue is the training range, green is the held-out test year.
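The fold schedule can be generated mechanically. The sketch below assumes an expanding training window that starts in 2012 (the report states the cutoffs and test years but not the training start, so `train_start_year` is an assumption) and produces one fold per December cutoff from 2018 through 2022.

```python
def walk_forward_folds(first_cutoff_year, last_cutoff_year, train_start_year=2012):
    """Folds that train through December of year T-1 and test on year T.

    ASSUMPTION: expanding window with a fixed train_start_year.
    """
    folds = []
    for cutoff in range(first_cutoff_year, last_cutoff_year + 1):
        folds.append({
            "train": (f"{train_start_year}-01", f"{cutoff}-12"),
            "test": (f"{cutoff + 1}-01", f"{cutoff + 1}-12"),
        })
    return folds


folds = walk_forward_folds(2018, 2022)
for fold in folds:
    print(fold["train"], "->", fold["test"])
```

Each test year sits strictly after its training cutoff, so no fold sees its own test period during training.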

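| horizon | n folds | mean Spearman ρ | std ρ | 95% interval ρ | mean decile spread | std decile spread |
|---|---|---|---|---|---|---|
| short (1y) | 5 | +0.377 | 0.179 | [+0.161, +0.536] | +8.03% | 3.23% |
| medium (3y) | 5 | +0.447 | 0.233 | [+0.123, +0.698] | +5.86% | 2.70% |
| long (5y) | 3 | +0.476 | 0.108 | [+0.372, +0.577] | +4.96% | 0.78% |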

For the headline fwd_3y horizon, mean Spearman ρ = +0.447 with std 0.233 across 5 folds — the +0.327 figure from the single static post-2012 split survives in expectation, but with meaningful year-to-year noise. 2020 is the variance driver, as expected.

1b. Spatial CV (whole-CBSA holdout)

5-fold CBSA holdout, with train+test both restricted to ≤ 2017-12 so the only stressor is the spatial axis (no temporal regime shift). Two variants: the raw model predicts the price return directly; the demeaned model predicts the within-metro residual — i.e., the model is given each ZIP's return after subtracting the average return of every other ZIP in the same metro that month. That isolates "can the model rank ZIPs inside a metro?" from the easier question "can the model tell expensive metros from cheap ones?".
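The demeaned target described above can be sketched as a leave-one-out residual within each metro-month. This is an illustration of the stated definition (subtract the mean of every other ZIP in the same metro that month), not the report's code; the row layout, labels, and handling of single-ZIP groups are assumptions.

```python
from collections import defaultdict


def demean_within_metro(rows):
    """Subtract the mean return of the *other* ZIPs in the same metro-month.

    rows: list of dicts with keys 'zip', 'cbsa', 'month', 'ret'.
    Groups with a single ZIP have no peers and get a residual of None.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["cbsa"], row["month"])].append(row)

    out = []
    for members in groups.values():
        total = sum(m["ret"] for m in members)
        n = len(members)
        for m in members:
            peer_mean = (total - m["ret"]) / (n - 1) if n > 1 else None
            resid = None if peer_mean is None else m["ret"] - peer_mean
            out.append({**m, "resid": resid})
    return out


# Hypothetical panel: metro B runs hotter than metro A across the board.
panel = [
    {"zip": "A1", "cbsa": "A", "month": "2016-01", "ret": 0.02},
    {"zip": "A2", "cbsa": "A", "month": "2016-01", "ret": 0.05},
    {"zip": "A3", "cbsa": "A", "month": "2016-01", "ret": 0.08},
    {"zip": "B1", "cbsa": "B", "month": "2016-01", "ret": 0.10},
    {"zip": "B2", "cbsa": "B", "month": "2016-01", "ret": 0.12},
]
resid = {r["zip"]: r["resid"] for r in demean_within_metro(panel)}
print(resid)
```

After demeaning, B1 has a negative residual despite a higher raw return than any metro-A ZIP, so a model can no longer score well just by recognising which metro is hot.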

The plot below shows the actual partition: every top-100 metro is assigned to exactly one of five folds (color). Bubble size = metro population.

Raw fwd_3y

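| horizon | n folds | mean Spearman ρ | std ρ | mean decile spread | std decile spread |
|---|---|---|---|---|---|
| short (1y) | 5 | +0.786 | 0.044 | +25.06% | 5.14% |
| medium (3y) | 5 | +0.817 | 0.032 | +20.54% | 3.34% |
| long (5y) | 5 | +0.845 | 0.027 | +16.92% | 2.19% |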

Demeaned (within-metro residual) fwd_3y

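| horizon | n folds | mean Spearman ρ | std ρ | mean decile spread | std decile spread |
|---|---|---|---|---|---|
| short (1y) | 5 | +0.230 | 0.016 | +3.72% | 0.30% |
| medium (3y) | 5 | +0.287 | 0.028 | +3.36% | 0.37% |
| long (5y) | 5 | +0.309 | 0.035 | +3.34% | 0.37% |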

The headline lesson. The raw model's Spearman ρ = +0.817 with decile spread +20.54% looks spectacular — but most of it is a level effect ("expensive zips stay expensive"). The demeaned model's Spearman ρ = +0.287 with decile spread +3.36% is the genuine cross-sectional alpha.

7. Did this actually work? — portfolio backtest

Bottom line: Replaying the strategy month-by-month from 2018-01 through 2021-12, the top-decile basket beat the equal-weight universe in 91.7% of months at +2.23 pp/yr alpha. The yield-aware variant hits 100.0% at +6.52 pp/yr on a smaller yield-observable universe.

The most decision-relevant question. We replay history: at each month from 2018-01 through 2021-12, we run the model with only the data it could have seen at that time, take its top-decile picks (a "basket" of about 1,000 ZIPs), and look up what those ZIPs actually returned 3 years later. Compare to the universe mean (equal-weight all ZIPs) and to the bottom-decile basket. Alpha is the basket's excess annualized return over the universe mean.

| model | months | universe / mo | top-10% basket | top-10% realized | universe realized | bottom-10% realized | top excess | top − bot spread | hit rate (top > univ) |
|---|---|---|---|---|---|---|---|---|---|
| Price-only (fwd_3y) | 48 | 9,932 | ~992 | +11.37% | +9.14% | +7.41% | +2.23 pp | +3.95 pp | 91.7% |
| Total-return (fwd_3y_total) | 48 | 2,084 | ~208 | +20.37% | +13.85% | +7.58% | +6.52 pp | +12.78 pp | 100.0% |

The model adds skill. The price-only fwd_3y headline model's top decile beat the universe mean in 91.7% of months (44/48), earning roughly +2.23 percentage points per year of alpha and a +3.95 pp/yr top-vs-bottom spread. The yield-aware fwd_3y_total model does better still (100.0% hit rate, +6.52 pp/yr alpha), but its scoring universe is much smaller (only ZIPs where Zillow publishes rent data).
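The summary statistics in the backtest table reduce to a few averages over basket-formation months. The sketch below assumes alpha is the simple mean of monthly excess returns (the report does not say whether it averages or compounds) and uses a hypothetical 48-month history in which the basket wins 44 times, which reproduces a 91.7% hit rate.

```python
from statistics import mean


def backtest_summary(months):
    """Summarise a basket backtest.

    months: list of (top_realized, universe_realized, bottom_realized)
    annualized returns, one tuple per basket-formation month.
    """
    excess = [top - univ for top, univ, _ in months]
    spread = [top - bot for top, _, bot in months]
    return {
        "hit_rate": sum(e > 0 for e in excess) / len(months),
        "alpha": mean(excess),  # ASSUMPTION: simple average of monthly excess
        "top_minus_bottom": mean(spread),
    }


# Hypothetical 48-month history: the basket wins in 44 months and loses in 4.
history = [(0.12, 0.09, 0.07)] * 44 + [(0.05, 0.06, 0.04)] * 4
summary = backtest_summary(history)
print(f"hit rate {summary['hit_rate']:.1%}")  # hit rate 91.7%
```

Because consecutive formation months share most of their 3-year holding window, the 48 observations are heavily overlapping and should not be read as 48 independent trials.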

2a. Per-cohort wealth: $1 in, $? out after 3 years

The same backtest expressed as terminal wealth. For each entry month, $1 invested in the top-decile basket, the universe, or the bottom-decile basket is compounded forward for the full 3-year hold.

2b. Calibration — is the model a good ranker, a good forecaster, or both?

A model can rank ZIPs correctly yet still be mis-calibrated — predicting +3% when realized averages +7%. We bin every test-set prediction into 20 quantile buckets and plot mean predicted vs mean realized in each bucket. Points on the dashed line are perfectly calibrated.

The cloud's slope is the rank quality (steeper = stronger ranking); the cloud's shift relative to the diagonal is the calibration error. A model can have a steep slope but sit far below the diagonal — useful for picking but biased in level. That pattern is exactly what we'd expect in a regime-shift test window like post-2012, and is why R² is not the headline metric.
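The bucketing step can be sketched as follows: sort by prediction, cut into 20 equal-count buckets, and compare mean predicted with mean realized in each. The data are hypothetical and constructed so that ranking is perfect but every realized return sits 4 points above its prediction, the "good ranker, biased forecaster" case.

```python
from statistics import mean


def calibration_table(predicted, realized, n_buckets=20):
    """Mean predicted vs mean realized within equal-count prediction buckets."""
    pairs = sorted(zip(predicted, realized))
    n = len(pairs)
    table = []
    for b in range(n_buckets):
        chunk = pairs[b * n // n_buckets:(b + 1) * n // n_buckets]
        if chunk:
            table.append((mean([p for p, _ in chunk]), mean([r for _, r in chunk])))
    return table


# Hypothetical test set with a pure level shift of +0.04.
predicted = [i / 1000 for i in range(200)]
realized = [p + 0.04 for p in predicted]

table = calibration_table(predicted, realized)
bias = mean([r - p for p, r in table])
print(len(table), round(bias, 4))
```

Here the points would form a line parallel to the diagonal and 0.04 above it: the slope carries the ranking information and the vertical offset is the calibration error.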

8. Decile spread, every model × horizon × split

Bottom line: LightGBM wins decisively on the medium and long horizons in the post-2012 split; the demographics-only hedonic baseline edges it at fwd_1y — a useful sanity check that the signal lives in the data, not the algorithm.

Each bar is one model on one horizon on one test window. Positive values mean the model's top-decile picks beat its bottom-decile picks; negative values mean it anti-ranked. The left panel is the pre-2008 split (smaller and noisier); the right is post-2012 (the main result).

3a. Realized return by predicted decile (headline horizon × split)

The chart above collapses each (model, horizon, split) into a single "top minus bottom" number. The one below zooms into the headline (post-2012 LightGBM, fwd_3y) and shows the realized return of every decile, not just the extremes. A monotone climb from decile 1 → decile 10 is the strong-form claim: the model is ordering ZIPs, not just separating the very best from the very worst.
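Both the single-number decile spread and the per-decile profile come from the same grouping. The sketch below, on hypothetical data, sorts rows by predicted return, averages realized return within each decile, and then checks the strong-form claim (a strictly increasing profile) alongside the top-minus-bottom spread.

```python
from statistics import mean


def realized_by_decile(predicted, realized, n_bins=10):
    """Mean realized return within each predicted-return decile (1 = lowest)."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    means = []
    for d in range(n_bins):
        idx = order[d * n // n_bins:(d + 1) * n // n_bins]
        means.append(mean([realized[i] for i in idx]))
    return means


def decile_spread(decile_means):
    """Top-decile mean minus bottom-decile mean."""
    return decile_means[-1] - decile_means[0]


def is_monotone(decile_means):
    """True when realized return rises at every step from decile 1 to 10."""
    return all(a < b for a, b in zip(decile_means, decile_means[1:]))


# Hypothetical rows: realized return rises with the prediction, plus
# alternating noise that averages out inside each decile.
predicted = list(range(100))
realized = [0.02 + 0.0008 * i + (0.002 if i % 2 else -0.002) for i in range(100)]

profile = realized_by_decile(predicted, realized)
print(round(decile_spread(profile), 4), is_monotone(profile))
```

A model can post a large spread with a non-monotone profile (for example, if only the extreme deciles separate), which is why the per-decile view is the stricter test.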

LightGBM wins decisively on the medium and long horizons in the post-2012 split. On the 1-year horizon, the demographics-only hedonic baseline (a simple linear regression using only Census ACS variables) actually edges LightGBM — a useful sanity check that when a simple baseline ties a sophisticated model, the signal probably comes from the data more than the algorithm.

9. What did the model learn? — SHAP feature importance

Bottom line: Real rates dominate medium/long-horizon SHAP (10Y TIPS yield is the biggest mean-abs feature at fwd_3y and fwd_5y); drawdown vs trailing peak dominates at fwd_1y (mean reversion). County Republican vote share and Wikipedia pageviews show up in the top 10 at multiple horizons — identity / attention proxies the textbook real-estate model doesn't include.

For each horizon, we compute mean absolute SHAP value per feature on the training sample. SHAP decomposes each prediction into per-feature contributions. The bars below show the 15 features that the LightGBM model used most heavily.
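Given a matrix of per-row SHAP contributions, the ranking statistic is a column-wise mean of absolute values. The sketch below is a library-free illustration; the feature names and contribution values are hypothetical, and in practice the matrix would come from a SHAP explainer run on the fitted model.

```python
def mean_abs_shap(shap_rows, feature_names, top_k=15):
    """Rank features by mean absolute SHAP value across rows.

    shap_rows: one list of per-feature contributions per scored row.
    """
    n = len(shap_rows)
    scores = [sum(abs(row[j]) for row in shap_rows) / n
              for j in range(len(feature_names))]
    ranked = sorted(zip(feature_names, scores), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


# Hypothetical contributions for three rows; feature names are illustrative.
names = ["drawdown_60m", "tips_10y", "median_income"]
rows = [
    [+0.030, -0.010, +0.002],
    [-0.020, -0.012, -0.001],
    [+0.025, -0.011, +0.003],
]
ranking = mean_abs_shap(rows, names)
for name, score in ranking:
    print(f"{name:15s} {score:.4f}")
```

Taking absolute values first matters: a feature that pushes some ZIPs up and others down would look unimportant under a signed mean, yet it is exactly the kind of feature that drives cross-sectional ranking.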

Short horizon (1 year)

Medium horizon (3 years)

Long horizon (5 years)

4a. Per-ZIP attribution — why this ZIP, not that one?

The aggregate SHAP charts above answer "which features does the model rely on, on average?" The two waterfalls below answer the more decision-relevant version: for this specific ZIP, which features pushed its predicted return up or down, and by how much? Green bars push the prediction higher than the model's base rate; red bars push it lower.

Top-decile example

Bottom-decile example

The pattern that recurs across nearly every top-decile ZIP: a large positive contribution from the drawdown feature combined with a uniform negative contribution from the 10Y real rate. Bottom-decile ZIPs typically show the opposite.

Four findings worth flagging:

10. Does the signal survive price stratification?

Bottom line: Decile spreads of +6.35% / +6.57% / +3.34% across <$200K / $200K–$500K / $500K+ tiers — the model isn't merely buying cheap homes.

A common worry with any model that ranks zip codes is that it has secretly learned a price-level proxy: cheap zips mean-revert upward, expensive zips compound more slowly, and the "alpha" is just a roundabout way of buying low. To check that, we split the post-2012 test set into three ZHVI tiers and re-compute the LightGBM fwd_3y decile spread within each tier.

| price tier (ZHVI) | test rows | median ZHVI | top decile realized | bottom decile realized | decile spread |
|---|---|---|---|---|---|
| <$200K | 158,318 | $152,181 | +11.98% | +5.63% | +6.35% |
| $200K–$500K | 339,898 | $307,788 | +9.23% | +2.66% | +6.57% |
| $500K+ | 151,656 | $705,620 | +6.25% | +2.90% | +3.34% |

Interpretation. The decile spread is +6.35% for sub-$200K homes, +6.57% in the $200K–$500K middle, and +3.34% for $500K+ homes — all positive, all out-of-sample. The two cheaper tiers carry essentially the same spread, and the $500K+ tier roughly halves but stays positive. This is the pattern you'd expect if the model has real cross-sectional signal at every price point and luxury markets simply have flatter forward returns — not the pattern you'd expect if the model were secretly a price-mean-reversion proxy.

Caveats and what's not in this run

arbok · model-first, personal-overlay layered on top · all data free / public sources · code at src/arbok/