What predicts US residential real-estate returns?
Executive summary
- Headline decile spread. On the 3-year horizon (post-2012 split), the model's top-decile ZIPs averaged +10.24% realized returns vs. the bottom decile at +2.82% — a +7.42-point out-of-sample spread.
- Backtest hit-rate. Replaying month-by-month from 2018-01 through 2021-12 (48 basket-formation months × 3y hold), the top-decile basket beat the equal-weight universe in 91.7% of months at roughly +2.23 pp/yr alpha.
- Timing verdict. The model's current (2026-04) universe-mean predicted fwd_3y annualized return is -1.35% — the 25th percentile of post-2012 history. The model is currently bearish vs. its historical baseline.
- Hedge comparison. US residential RE comes last in an 8-asset cohort on median 3y real return (+3.5%); BTC and gold lead, REITs/equities mid-pack. Caveat: ZHVI is unlevered and ignores rent.
- Non-obvious finding. Drawdown vs. the trailing-60-month ZHVI peak is the #1 mean-abs SHAP feature at fwd_1y — cross-sectional mean reversion is the dominant short-horizon signal, not demographics or schools.
How to read this report (a 60-second primer)
This report uses some specialized terms repeatedly. The shortest possible definitions:
- ZIP / metro (CBSA): A ZIP code is a USPS postal area (~22,000 across the US). A "metro" (the Census Bureau's Core-Based Statistical Area, or CBSA) is a grouping of nearby ZIPs that share a labor market. We restrict to the top 100 metros by population.
- Forward return (fwd_1y, fwd_3y, fwd_5y): "fwd_3y" is the price change of a typical home in a ZIP over the next 3 years, annualized. The model uses only information available at the start of the holding period, with no peeking ahead.
- ZHVI / ZORI: Zillow's monthly Home Value Index (the price level we predict) and Observed Rent Index (used in the yield-aware variant of the model).
- LightGBM: A gradient-boosted decision-tree model, the standard ML workhorse for tabular data. It is the headline model in this report; the appendix compares it to ElasticNet and a demographics-only hedonic baseline.
- SHAP: For each prediction, SHAP attributes a contribution to every input feature. Top-SHAP features tell us what the model relies on most. See the methodology page.
- Decile spread: The realized return of the top 10% of ZIPs the model ranked highest, minus the bottom 10%. The bottom-line "would picking this model's top ZIPs have helped?" number.
- Backtest: Replay history. At each past month, we run the model on data available then, take its top-decile picks, and check what those ZIPs actually returned 3 years later.
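The forward-return and decile-spread definitions above can be sketched in a few lines. This is a minimal illustration, not the report's pipeline: the function names are hypothetical, annualization is assumed to be compound, and the toy numbers are invented. For reference, the headline +7.42-point spread is simply +10.24% minus +2.82%.

```python
from statistics import mean


def annualized_forward_return(price_now, price_later, years):
    """Compound annualized price change between two index levels (assumed convention)."""
    return (price_later / price_now) ** (1.0 / years) - 1.0


def decile_spread(predicted, realized):
    """Mean realized return of the top-decile picks minus the bottom decile."""
    if len(predicted) != len(realized) or len(predicted) < 10:
        raise ValueError("need two parallel lists with at least 10 ZIPs")
    # Rank ZIP positions from lowest to highest predicted return.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    k = len(order) // 10  # number of ZIPs in each decile
    bottom = [realized[i] for i in order[:k]]
    top = [realized[i] for i in order[-k:]]
    return mean(top) - mean(bottom)


# Toy universe of 20 ZIPs (invented numbers, not from the report).
toy_pred = [i / 100 for i in range(20)]
toy_real = [p / 2 for p in toy_pred]
toy_spread = decile_spread(toy_pred, toy_real)
```

In a backtest, `predicted` would come from the model at basket-formation month t and `realized` from the index level three years later, so the two lists never share information from the holding period.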
1. Where might the model want to buy today?
[Interactive map of ZIP-level predictions: each ZIP carries its ZIP code, metro, ZHVI, predicted fwd_3y return, and decile rank.]
Using the post-2012 LightGBM trained on 3-year forward returns, we scored every ZIP in the top-100 US metros at the most recent month with enough feature coverage to make a confident prediction. The table below is rolled up to one row per metro (CBSA), showing the metro's principal (largest-population) city, the best-scoring ZIP inside that metro, the mean predicted return across the metro's top-5 ZIPs, and how many of its ZIPs land in the top 10% of all predictions nationally.
| Metro (principal city) | CBSA | Best ZIP | Best ZIP price | Mean pred (top 5 ZIPs) | Pred for best ZIP | p10 (80% PI lo) | p90 (80% PI hi) | ZIPs in top 10% | Top SHAP drivers for best ZIP (SHAP value) |
|---|---|---|---|---|---|---|---|---|---|
| Kansas City, MO | 28140 | 64127 | $87,604 | +6.23% | +7.38% | -3.65% | +19.90% | 14 | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.017), Rep vote share (county) (+0.012) |
| Chicago, IL | 16980 | 46406 | $77,661 | +5.03% | +6.97% | -2.02% | +17.20% | 6 | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.016), Rep vote share (county) (+0.012) |
| New Orleans, LA | 35380 | 70117 | $197,104 | +4.83% | +6.79% | -2.60% | +12.26% | 17 | 10Y real rate (TIPS) (-0.048), FEMA other disasters (+0.023), NRI wildfire risk (+0.013) |
| Buffalo, NY | 15380 | 14211 | $105,043 | +4.25% | +6.44% | -1.86% | +16.96% | 13 | 10Y real rate (TIPS) (-0.052), FEMA other disasters (+0.015), ZHVI at time t (+0.011) |
| Memphis, TN | 32820 | 38109 | $89,220 | +5.42% | +6.37% | -1.31% | +16.81% | 15 | 10Y real rate (TIPS) (-0.048), ZHVI at time t (+0.011), Rep vote share (county) (+0.011) |
| Milwaukee, WI | 33340 | 53206 | $68,826 | +2.88% | +6.22% | -2.84% | +18.82% | 4 | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.018), Median HH income (+0.012) |
| Tulsa, OK | 46140 | 74126 | $90,854 | +3.82% | +5.33% | -1.03% | +14.43% | 10 | 10Y real rate (TIPS) (-0.045), ZHVI at time t (+0.017), Rep vote share (county) (+0.012) |
| Jacksonville, FL | 27260 | 32209 | $117,406 | +4.27% | +5.06% | -2.09% | +13.77% | 19 | 10Y real rate (TIPS) (-0.048), Rep vote share (county) (+0.012), ZHVI at time t (+0.008) |
| Dayton, OH | 19430 | 45404 | $94,232 | +4.37% | +4.92% | -3.22% | +15.89% | 10 | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.012), Rep vote share (county) (+0.012) |
| Omaha, NE | 36540 | 51545 | $80,280 | +4.76% | +4.76% | -1.76% | +12.77% | 1 | 10Y real rate (TIPS) (-0.042), ZHVI at time t (+0.016), Rep vote share (county) (+0.012) |
| Akron, OH | 10420 | 44307 | $72,092 | +4.09% | +4.68% | -4.31% | +16.59% | 10 | 10Y real rate (TIPS) (-0.054), Rep vote share (county) (+0.012), ZHVI at time t (+0.011) |
| Oklahoma City, OK | 36420 | 74026 | $106,941 | +4.65% | +4.65% | -1.74% | +12.92% | 1 | 10Y real rate (TIPS) (-0.041), ZHVI at time t (+0.013), Rep vote share (county) (+0.011) |
| Toledo, OH | 45780 | 43607 | $71,388 | +3.32% | +4.63% | -3.61% | +15.96% | 7 | 10Y real rate (TIPS) (-0.054), ZHVI at time t (+0.012), Rep vote share (county) (+0.011) |
| Louisville/Jefferson County, KY | 31140 | 40211 | $85,011 | +2.64% | +4.60% | -3.18% | +16.41% | 3 | 10Y real rate (TIPS) (-0.052), ZHVI at time t (+0.015), Rep vote share (county) (+0.010) |
| Tampa, FL | 45300 | 34661 | $158,723 | +4.55% | +4.55% | -2.97% | +10.45% | 1 | 10Y real rate (TIPS) (-0.045), Rep vote share (county) (+0.010), Mean hospital star rating (-0.010) |
| Indianapolis, IN | 26900 | 46016 | $80,043 | +2.25% | +4.51% | -3.42% | +16.30% | 3 | 10Y real rate (TIPS) (-0.053), ZHVI at time t (+0.012), Rep vote share (county) (+0.010) |
| Baltimore, MD | 12580 | 21223 | $76,194 | +2.50% | +4.40% | -4.61% | +15.76% | 5 | 10Y real rate (TIPS) (-0.053), ZHVI at time t (+0.014), Rep vote share (county) (+0.009) |
| St. Louis, MO | 41180 | 63133 | $61,113 | +3.45% | +4.37% | -3.68% | +16.66% | 12 | 10Y real rate (TIPS) (-0.054), ZHVI at time t (+0.013), Rep vote share (county) (+0.012) |
| Pittsburgh, PA | 38300 | 15640 | $86,013 | +4.37% | +4.37% | -1.50% | +12.28% | 1 | 10Y real rate (TIPS) (-0.040), ZHVI at time t (+0.015), Rep vote share (county) (+0.010) |
| Cincinnati, OH | 17140 | 45214 | $123,142 | +3.45% | +4.29% | -3.34% | +15.33% | 5 | 10Y real rate (TIPS) (-0.055), Median HH income (+0.013), Rep vote share (county) (+0.009) |
| Cleveland, OH | 17410 | 44003 | $60,286 | +4.15% | +4.15% | -1.82% | +11.20% | 1 | 10Y real rate (TIPS) (-0.042), ZHVI at time t (+0.014), Rep vote share (county) (+0.011) |
| Baton Rouge, LA | 12940 | 70805 | $86,721 | +2.99% | +4.11% | -3.29% | +14.95% | 10 | 10Y real rate (TIPS) (-0.045), ZHVI at time t (+0.014), Rep vote share (county) (+0.012) |
| Wichita, KS | 48620 | 67214 | $82,065 | +2.62% | +4.05% | -3.42% | +16.55% | 5 | 10Y real rate (TIPS) (-0.053), ZHVI at time t (+0.012), Rep vote share (county) (+0.012) |
| Philadelphia, PA | 37980 | 19140 | $86,893 | +2.58% | +4.03% | -2.58% | +17.74% | 3 | 10Y real rate (TIPS) (-0.054), ZHVI at time t (+0.016), Median HH income (+0.009) |
| Atlanta, GA | 12060 | 31816 | $129,303 | +4.03% | +4.03% | -3.04% | +10.21% | 1 | 10Y real rate (TIPS) (-0.038), Rep vote share (county) (+0.012), ZHVI at time t (+0.011) |
| Knoxville, TN | 28940 | 37729 | $106,337 | +3.99% | +3.99% | -2.59% | +12.97% | 1 | 10Y real rate (TIPS) (-0.041), ZHVI at time t (+0.014), Rep vote share (county) (+0.013) |
| Detroit, MI | 19820 | 48238 | $55,536 | +3.66% | +3.97% | -4.37% | +19.09% | 13 | 10Y real rate (TIPS) (-0.063), ZHVI at time t (+0.014), M2 money supply (+0.012) |
| Riverside, CA | 40140 | 93562 | $69,621 | +3.95% | +3.95% | -5.50% | +12.99% | 1 | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.015), Rep vote share (county) (+0.011) |
| Augusta, GA | 12260 | 30901 | $71,236 | +1.83% | +3.89% | -4.01% | +15.47% | 2 | 10Y real rate (TIPS) (-0.047), ZHVI at time t (+0.014), Rep vote share (county) (+0.011) |

Feature descriptions. 10Y real rate (TIPS): yield on the 10-year Treasury Inflation-Protected Security (FRED: DFII10), the real cost of capital. ZHVI at time t: current Zillow Home Value Index for the ZIP, the price level whose return we are predicting. Rep vote share (county): county Republican vote share at the prior presidential election (MIT Election Lab). FEMA other disasters: other-category FEMA disasters in the trailing 10 years. NRI wildfire risk: FEMA NRI wildfire risk score (0-100), tract-level. Median HH income: median household income (ACS, $ per year). Mean hospital star rating: mean overall 1-5 CMS star rating across rated hospitals in the county. M2 money supply: seasonally adjusted M2 money supply in billions of $ (FRED: M2SL), a liquidity proxy.
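The metro roll-up described above (best ZIP, mean of the top-5 ZIPs, count of ZIPs in the national top 10%) could be computed roughly as follows. This is a sketch under stated assumptions: the row keys `zip`, `cbsa`, and `pred` and the function name are hypothetical, the toy rows are invented, and the report does not say whether its top-5 mean is taken over all of a metro's ZIPs or only those that pass some earlier filter.

```python
from collections import defaultdict


def rollup_by_metro(rows, top_n=5, top_share=0.10):
    """Collapse per-ZIP predictions to one summary row per metro (CBSA)."""
    ranked = sorted(rows, key=lambda r: r["pred"], reverse=True)
    n_top = max(1, int(len(ranked) * top_share))
    national_top = {r["zip"] for r in ranked[:n_top]}

    by_metro = defaultdict(list)
    for r in ranked:  # already sorted, so each metro's list is best-first
        by_metro[r["cbsa"]].append(r)

    out = []
    for cbsa, zips in by_metro.items():
        head = zips[:top_n]
        out.append({
            "cbsa": cbsa,
            "best_zip": zips[0]["zip"],
            "best_pred": zips[0]["pred"],
            "mean_top_n": sum(z["pred"] for z in head) / len(head),
            "zips_in_top": sum(z["zip"] in national_top for z in zips),
        })
    out.sort(key=lambda m: m["best_pred"], reverse=True)
    return out


# Invented toy data: metro "A" has six ZIPs, metro "B" has four.
toy_rows = (
    [{"zip": f"a{i}", "cbsa": "A", "pred": p}
     for i, p in enumerate([0.07, 0.06, 0.05, 0.04, 0.03, 0.02])]
    + [{"zip": f"b{i}", "cbsa": "B", "pred": p}
       for i, p in enumerate([0.065, 0.01, 0.0, -0.01])]
)
toy_metros = rollup_by_metro(toy_rows)
```

With ten toy ZIPs, the national top 10% is a single ZIP, so metro "A" reports one ZIP in the top decile and metro "B" reports none.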
1a. ZIP-level detail (top 20)
The same model expanded back to per-ZIP rows for users who want to drill into specific neighborhoods rather than cities.
| ZIP | City, State | CBSA | ZHVI | ZORI | Pred fwd_3y ann. | p10 (80% PI lo) | p90 (80% PI hi) | Top SHAP drivers (SHAP value) |
|---|---|---|---|---|---|---|---|---|
| 64127 | Kansas City, MO | 28140 | $87,604 | $987 | +7.38% | -3.65% | +19.90% | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.017), Rep vote share (county) (+0.012) |
| 64128 | Kansas City, MO | 28140 | $89,801 | $1,165 | +7.27% | -3.33% | +19.10% | 10Y real rate (TIPS) (-0.057), ZHVI at time t (+0.015), Rep vote share (county) (+0.012) |
| 46406 | Gary, IN | 16980 | $77,661 | n/a | +6.97% | -2.02% | +17.20% | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.015), Rep vote share (county) (+0.012) |
| 70117 | New Orleans, LA | 35380 | $197,104 | $1,475 | +6.79% | -2.60% | +12.26% | 10Y real rate (TIPS) (-0.048), FEMA other disasters (+0.023), NRI wildfire risk (+0.013) |
| 14211 | Buffalo, NY | 15380 | $105,043 | n/a | +6.44% | -1.86% | +16.96% | 10Y real rate (TIPS) (-0.052), FEMA other disasters (+0.015), ZHVI at time t (+0.011) |
| 38109 | Memphis, TN | 32820 | $89,220 | $1,027 | +6.37% | -1.31% | +16.81% | 10Y real rate (TIPS) (-0.048), ZHVI at time t (+0.011), Rep vote share (county) (+0.011) |
| 53206 | Milwaukee, WI | 33340 | $68,826 | $766 | +6.22% | -2.84% | +18.82% | 10Y real rate (TIPS) (-0.058), ZHVI at time t (+0.018), Median HH income (+0.012) |
| 64126 | Kansas City, MO | 28140 | $85,881 | n/a | +6.09% | -3.79% | +18.31% | 10Y real rate (TIPS) (-0.055), ZHVI at time t (+0.016), Rep vote share (county) (+0.012) |
| 38106 | Memphis, TN | 32820 | $55,228 | $865 | +5.87% | -2.08% | +18.31% | 10Y real rate (TIPS) (-0.052), ZHVI at time t (+0.013), Wikipedia pageviews (+0.011) |
| 64130 | Kansas City, MO | 28140 | $104,120 | $1,229 | +5.64% | -3.28% | +17.01% | 10Y real rate (TIPS) (-0.052), Rep vote share (county) (+0.012), ZHVI at time t (+0.010) |
| 74126 | Tulsa, OK | 46140 | $90,854 | n/a | +5.33% | -1.03% | +14.43% | 10Y real rate (TIPS) (-0.044), ZHVI at time t (+0.017), Rep vote share (county) (+0.012) |
| 74130 | Tulsa, OK | 46140 | $91,695 | n/a | +5.13% | -1.03% | +13.39% | 10Y real rate (TIPS) (-0.044), ZHVI at time t (+0.018), Rep vote share (county) (+0.012) |
| 32209 | Jacksonville, FL | 27260 | $117,406 | $1,079 | +5.06% | -2.09% | +13.77% | 10Y real rate (TIPS) (-0.048), Rep vote share (county) (+0.012), ZHVI at time t (+0.008) |
| 38114 | Memphis, TN | 32820 | $82,035 | $865 | +5.05% | -2.90% | +17.90% | 10Y real rate (TIPS) (-0.053), Rep vote share (county) (+0.011), Median HH income (+0.008) |
| 46407 | Gary, IN | 16980 | $56,193 | n/a | +5.05% | -3.73% | +18.32% | 10Y real rate (TIPS) (-0.058), Median HH income (+0.011), Rep vote share (county) (+0.011) |
| 38127 | Memphis, TN | 32820 | $88,799 | $990 | +4.93% | -2.31% | +17.89% | 10Y real rate (TIPS) (-0.051), ZHVI at time t (+0.011), Rep vote share (county) (+0.011) |
| 45404 | Dayton, OH | 19430 | $94,232 | n/a | +4.92% | -3.22% | +15.89% | 10Y real rate (TIPS) (-0.049), ZHVI at time t (+0.012), Rep vote share (county) (+0.012) |
| 38126 | Memphis, TN | 32820 | $56,088 | n/a | +4.86% | -3.26% | +18.06% | 10Y real rate (TIPS) (-0.052), ZHVI at time t (+0.012), Rep vote share (county) (+0.010) |
| 74650 | Ralston, OK | 46140 | $124,185 | n/a | +4.85% | -1.84% | +11.95% | 10Y real rate (TIPS) (-0.045), NRI wildfire risk (+0.011), ZHVI at time t (+0.011) |
| 70041 | Buras, LA | 35380 | $140,424 | n/a | +4.81% | -3.59% | +11.59% | 10Y real rate (TIPS) (-0.048), FEMA other disasters (+0.016), Mean hospital star rating (-0.009) |

Feature descriptions. 10Y real rate (TIPS): yield on the 10-year Treasury Inflation-Protected Security (FRED: DFII10), the real cost of capital. ZHVI at time t: current Zillow Home Value Index for the ZIP, the price level whose return we are predicting. Rep vote share (county): county Republican vote share at the prior presidential election (MIT Election Lab). FEMA other disasters: other-category FEMA disasters in the trailing 10 years. NRI wildfire risk: FEMA NRI wildfire risk score (0-100), tract-level. Median HH income: median household income (ACS, $ per year). Wikipedia pageviews: monthly pageviews of the metro's primary Wikipedia article, a search-interest proxy. Mean hospital star rating: mean overall 1-5 CMS star rating across rated hospitals in the county.
- The p10/p90 bands tell the real story. Top ZIPs show point predictions of +5% to +7%, but the 80% prediction interval per ZIP spans roughly [-2.7%, +15.8%], an 18.7-point band at the median.
- The negative SHAP contribution from fred.real_rate_10y appears for every ZIP: it's not a ZIP-specific signal, just "rates are high right now."
- This is a model output, not investment advice. The model is unaware of zoning, school districts, walkability, specific listings, or your personal constraints.
2. Personal-overlay shortlists (Phase 4)
This section presents two parallel ZIP-level shortlists, both nationwide and both built on the latest snapshot (2023-12): one tuned for an owner-occupier and one for a cash-flow investor. Filter thresholds use the median across top-100-metro ZIPs on the snapshot, after forward-filling slow-moving structural features (such as school student-teacher ratio and ACS demographics) from the most recent non-null panel month. These features lag the price target by 1–2 years and don't change month-to-month. The underlying parquets keep all 30 rows; the display tables below dedupe by metro (CBSA) so you don't see the same city ten times.
2a. Owner-occupier (price-return focus)
Scored on the price-only fwd_3y model. Filters keep zips with positive expected price return AND low climate risk (wildfire + flood ≤ median), low recent disaster frequency (≤ 5 FEMA declarations in 10 years), strong schools (student-teacher ratio ≤ median), good health outcomes (premature death rate ≤ median, as a life-expectancy proxy), and majority owner-occupied (acs__owner_occupied_pct ≥ 0.5).
30 zips; 22 of 30 zips fall in Washington, DC. Dominant metros: Washington, DC (22), Toledo, OH (7), Madison, WI (1). Median ZHVI $722,904; median predicted fwd_3y price return +1.59%.
| ZIP | City, State | Metro | ZHVI | Pred fwd_3y | p10 | p90 | NRI wildfire | NRI flood | FEMA 10y | Stud/Tchr | Prem. death (yppl) | ACS own% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 43406 | Bradner, OH | Toledo, OH | $152,730 | +2.29% | -1.18% | +10.62% | 45.5 | 53.7 | 2 | 14.2 | 6423 | 0.79 |
| 20152 | Chantilly, VA | Washington, DC | $832,524 | +2.09% | -3.82% | +7.72% | 33.0 | 35.7 | 4 | 12.5 | 3243 | 0.85 |
| 53566 | Monroe, WI | Madison, WI | $245,767 | +1.64% | -1.38% | +9.47% | 52.5 | 42.4 | 2 | 12.5 | 6196 | 0.69 |
Owner-occupier filters skipped:
- crime__*_per_100k (NIBRS not in feature store — needs FBI CDE_API_KEY)
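As a rough illustration of the owner-occupier filters, the sketch below derives median thresholds from the snapshot and keeps ZIPs that pass every test. The row keys and function name are hypothetical, the toy rows are invented, and "wildfire + flood ≤ median" is read here as each score being at or below its own median (the report lists separate wildfire and flood medians, but does not spell this out).

```python
from statistics import median

MEDIAN_FILTERED = ("wildfire", "flood", "stud_tchr", "prem_death")


def owner_shortlist(rows, max_fema=5, min_own=0.5):
    """Return (ZIPs that pass all owner-occupier filters, median thresholds)."""
    med = {k: median(r[k] for r in rows) for k in MEDIAN_FILTERED}
    keep = []
    for r in rows:
        passes = (
            r["pred"] > 0                                   # positive expected price return
            and all(r[k] <= med[k] for k in MEDIAN_FILTERED)  # at or below each median
            and r["fema_10y"] <= max_fema                   # few recent FEMA declarations
            and r["own_share"] >= min_own                   # majority owner-occupied
        )
        if passes:
            keep.append(r["zip"])
    return keep, med


# Invented toy snapshot of three ZIPs.
toy_snapshot = [
    {"zip": "Z1", "pred": 0.02, "wildfire": 40, "flood": 50, "stud_tchr": 14,
     "prem_death": 6000, "fema_10y": 2, "own_share": 0.8},
    {"zip": "Z2", "pred": 0.01, "wildfire": 60, "flood": 70, "stud_tchr": 16,
     "prem_death": 8000, "fema_10y": 3, "own_share": 0.6},
    {"zip": "Z3", "pred": -0.01, "wildfire": 50, "flood": 60, "stud_tchr": 15,
     "prem_death": 7000, "fema_10y": 1, "own_share": 0.7},
]
toy_owner_zips, toy_medians = owner_shortlist(toy_snapshot)
```

Here Z2 is dropped for sitting above the medians and Z3 for its negative predicted return, leaving only Z1.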
2b. Investor (total-return + rent-yield focus)
Scored on the total-return fwd_3y model (price appreciation + ZORI rent carry, trained in the Phase 2 total-return run). Filters keep zips in the top quartile of predicted total return AND with a gross rent yield ≥ 6%, a ZHVI in the cash-flow-friendly band ($80K–$400K), and ACS owner-occupancy share in [0.4, 0.7] so there's an active rental market without signalling deep distress.
30 zips; spread across 15 metros: Kansas City, MO (4), St. Louis, MO (4), Detroit, MI (3). Median ZHVI $89,010; median rent yield 13.74%; median predicted fwd_3y total return +27.90%.
| ZIP | City, State | Metro | ZHVI | ZORI | Rent yield | Pred fwd_3y total | Pred fwd_3y (price) | ACS own% |
|---|---|---|---|---|---|---|---|---|
| 19133 | Philadelphia, PA | Philadelphia, PA | $81,289 | $1,278 | +18.87% | +32.60% | +3.28% | 0.41 |
| 64128 | Kansas City, MO | Kansas City, MO | $89,801 | $1,165 | +15.57% | +32.12% | +7.27% | 0.45 |
| 63137 | Saint Louis, MO | St. Louis, MO | $86,974 | $1,100 | +15.18% | +31.21% | +3.56% | 0.51 |
| 35228 | Birmingham, AL | Birmingham, AL | $81,166 | $1,024 | +15.14% | +31.06% | +2.66% | 0.61 |
| 48219 | Detroit, MI | Detroit, MI | $80,777 | $1,085 | +16.12% | +29.74% | +2.14% | 0.55 |
| 38114 | Memphis, TN | Memphis, TN | $82,035 | $865 | +12.65% | +29.50% | +5.05% | 0.45 |
| 45406 | Dayton, OH | Dayton, OH | $85,512 | $1,006 | +14.12% | +28.72% | +4.13% | 0.48 |
| 44128 | Cleveland, OH | Cleveland, OH | $94,931 | $1,114 | +14.08% | +28.47% | +2.78% | 0.48 |
| 44306 | Akron, OH | Akron, OH | $80,899 | $849 | +12.59% | +28.27% | +3.58% | 0.45 |
| 46016 | Anderson, IN | Indianapolis, IN | $80,043 | $778 | +11.66% | +27.90% | +4.51% | 0.42 |
| 39212 | Jackson, MS | Jackson, MS | $85,231 | $1,016 | +14.31% | +26.12% | +2.79% | 0.64 |
| 32209 | Jacksonville, FL | Jacksonville, FL | $117,406 | $1,079 | +11.03% | +26.11% | +5.06% | 0.47 |
| 15210 | Pittsburgh, PA | Pittsburgh, PA | $106,779 | $1,173 | +13.18% | +25.50% | +2.37% | 0.55 |
| 21213 | Baltimore, MD | Baltimore, MD | $119,160 | $1,477 | +14.88% | +25.18% | +2.24% | 0.55 |
| 78207 | San Antonio, TX | San Antonio, TX | $120,432 | $1,188 | +11.84% | +24.93% | +2.05% | 0.44 |
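The rent-yield column above is consistent with gross yield = 12 × ZORI / ZHVI; for example, ZIP 19133 gives 12 × $1,278 / $81,289 ≈ 18.87%. A minimal sketch of that formula and the four investor filters follows. The row keys and function names are hypothetical, the toy rows are invented, and "top quartile" is assumed to mean the best quarter of ZIPs by rank.

```python
def gross_rent_yield(zori_monthly, zhvi):
    """Annual rent over price: 12 months of ZORI divided by ZHVI."""
    return 12.0 * zori_monthly / zhvi


def investor_shortlist(rows, min_yield=0.06, price_band=(80_000, 400_000),
                       own_band=(0.4, 0.7)):
    """Return ZIPs that pass the investor filters, best prediction first."""
    ranked = sorted(rows, key=lambda r: r["pred_total"], reverse=True)
    top_quartile = ranked[: max(1, len(ranked) // 4)]  # rank-based quartile (assumption)
    keep = []
    for r in top_quartile:
        ok_yield = gross_rent_yield(r["zori"], r["zhvi"]) >= min_yield
        ok_price = price_band[0] <= r["zhvi"] <= price_band[1]
        ok_own = own_band[0] <= r["own_share"] <= own_band[1]
        if ok_yield and ok_price and ok_own:
            keep.append(r["zip"])
    return keep


# Invented toy rows (not from the report).
toy_investor_rows = [
    {"zip": "A", "zhvi": 90_000, "zori": 1_000, "own_share": 0.45, "pred_total": 0.30},
    {"zip": "B", "zhvi": 500_000, "zori": 2_500, "own_share": 0.50, "pred_total": 0.28},
    {"zip": "C", "zhvi": 100_000, "zori": 1_100, "own_share": 0.50, "pred_total": 0.20},
    {"zip": "D", "zhvi": 120_000, "zori": 500, "own_share": 0.60, "pred_total": 0.15},
    {"zip": "E", "zhvi": 150_000, "zori": 900, "own_share": 0.55, "pred_total": 0.10},
    {"zip": "F", "zhvi": 200_000, "zori": 1_200, "own_share": 0.65, "pred_total": 0.08},
    {"zip": "G", "zhvi": 250_000, "zori": 1_400, "own_share": 0.70, "pred_total": 0.05},
    {"zip": "H", "zhvi": 300_000, "zori": 1_500, "own_share": 0.75, "pred_total": 0.02},
]
toy_investor_zips = investor_shortlist(toy_investor_rows)
```

Of the two toy ZIPs in the top quartile, "B" falls outside the $80K–$400K price band, so only "A" survives.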
2c. Overlap and threshold transparency
The two shortlists do not overlap on any ZIP. The owner-occupier profile up-weights low-risk, school-strong, low-disaster areas while the investor profile favors lower-priced, higher-yield areas; they end up pointing at different zips by construction.
Median thresholds used (over top-100-metro snapshot zips):
wildfire: median = 52.592; flood: median = 62.692; school: median = 15.093; premature_death: median = 7500.495
- NIBRS crime not yet in the feature store. The owner-occupier spec called for a crime filter, but the FBI NIBRS county-year crime aggregate is gated on CDE_API_KEY and not yet loaded. The filter is skipped, as documented above.
- Total-return model has no quantile boosters. The investor table shows only point predictions for total return; the Phase 2 total-return run was point-only by design (runtime budget).
- Slow-moving features are forward-filled. School ratios, ACS demographics, NRI scores, FEMA disaster counts, and premature mortality use each ZIP's most-recent non-null value across the panel (not the scoring month, which is often unrefreshed). For stable structural features this is a fair approximation; we're not claiming a real-time read on these.
- This is not investment advice. The model is unaware of zoning, specific listings, walkability, your personal commute or family constraints — it's a quantitative sort, not a buy recommendation.
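The forward-fill caveat above amounts to taking, for each ZIP, the latest non-null observation of a slow-moving feature. A minimal sketch, assuming a long-format panel with hypothetical keys `zip`, `month` (a sortable "YYYY-MM" string), and `value`; the toy observations are invented:

```python
def latest_non_null(rows, as_of=None):
    """Map each ZIP to its most recent non-null value, optionally capped at `as_of`."""
    latest = {}
    for r in sorted(rows, key=lambda r: r["month"]):
        if r["value"] is None:
            continue  # skip unrefreshed months
        if as_of is not None and r["month"] > as_of:
            continue  # never read past the scoring month
        latest[r["zip"]] = r["value"]  # later months overwrite earlier ones
    return latest


# Invented toy panel for a student-teacher ratio feature.
toy_panel = [
    {"zip": "43406", "month": "2021-12", "value": 14.5},
    {"zip": "43406", "month": "2022-12", "value": 14.2},
    {"zip": "43406", "month": "2023-12", "value": None},
    {"zip": "20152", "month": "2022-12", "value": None},
    {"zip": "20152", "month": "2023-12", "value": None},
]
toy_filled = latest_non_null(toy_panel, as_of="2023-12")
```

A ZIP with no non-null history, like the second toy ZIP, is simply absent from the result, so it cannot silently pass a median filter.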
3. Macro context: is now a good time? And how does RE stack up against other hedges?
Everything above is about where to buy. The two questions below are about when and what else: does the model think the broad RE universe is cheap or expensive right now, and how has private RE actually fared against the other classic inflation hedges (stocks, gold, REITs, BTC) when you put them on the same axis?
3a. Is now a good time to buy?
For each month from 2014 through the latest in the panel, we score every top-100-metro ZIP with the headline post-2012 LightGBM and take the universe mean of the predicted fwd_3y annualized return. The model's prediction at month t can be compared to the realized universe-mean fwd_3y return three years later (dashed line) for all months where t+3y is observable.
The model's current (2026-04) universe-mean predicted fwd_3y annualized return is -1.35% (median of the per-ZIP distribution that month; 10,396 ZIPs scored, 80% prediction interval roughly [-2.61%, -0.14%]). The historical median across all scored months in the post-2012 panel is +5.95% with an interquartile range of [-1.35%, +6.35%]. The current prediction sits at the 25th percentile of history — the model is currently bearish vs the historical baseline (n = 160 months scored).
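The "percentile of history" comparison above can be sketched in a few lines. This is a minimal illustration, assuming the percentile is the share of historical monthly values at or below the current one; the report does not state its exact convention, and the `history` values below are hypothetical, not taken from the panel.

```python
from bisect import bisect_right


def percentile_rank(history, current):
    """Share of historical values at or below `current`, in percent."""
    ordered = sorted(history)
    return 100.0 * bisect_right(ordered, current) / len(ordered)


# Hypothetical monthly universe-level predictions (annualized, in %).
history = [-1.35, 2.10, 5.95, 6.35, 7.20, 5.10, 6.00, -2.40]
current_rank = percentile_rank(history, -1.35)
print(f"current prediction sits at percentile {current_rank:.0f}")
```

A low percentile means the model is less optimistic than it has usually been, which is the sense in which the text calls it bearish.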
How wide is the model's uncertainty over time?
Three separate LightGBM models trained at the 10th, 50th, and 90th conditional quantiles produce an 80% prediction band around the median. When the band is narrow, the model is confident; when it widens, it is hedging.
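Quantile models of this kind are typically fit by minimizing the pinball (quantile) loss. The sketch below shows that loss and a simple coverage check for a 10th-to-90th percentile band. The function names are hypothetical helpers for illustration, not part of the report's code.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss at quantile level q, with 0 < q < 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-predictions are weighted by q, over-predictions by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def band_coverage(y_true, lower, upper):
    """Fraction of realized values that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)
```

If the 10th and 90th percentile models are well calibrated, `band_coverage` on held-out data should land near 0.80.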
3b. Real estate vs other inflation hedges
Real estate is rarely held in isolation in a real portfolio — the question "is now a good time to buy a house" is really "is real estate the best use of the marginal dollar right now". We put residential RE on the same chart as the S&P 500, broad equities (VTI), publicly-traded REITs (IYR / VNQ), gold, and Bitcoin, all in real terms (net of CPI). Residential RE is measured unlevered and net of nothing (rent, maintenance, property tax, transaction costs all ignored) — conservative for RE since the typical homeowner has both leverage and carrying costs.
Historical 3y / 5y annualized real returns (rolling, since 2000)
| asset | history starts | median 3y real return | std 3y real | 3y windows positive | median 5y real return | 5y windows positive |
|---|---|---|---|---|---|---|
| US residential RE (ZHVI top-100 mean) | 2000 | +3.53% | 5.13% | 71% | +3.14% | 73% |
| S&P 500 (price) | 2000 | +6.74% | 8.49% | 77% | +6.75% | 72% |
| S&P 500 (total return) | 2000 | +8.87% | 8.66% | 81% | +8.67% | 77% |
| VTI (broad US equity) | 2001 | +9.26% | 7.58% | 88% | +8.82% | 85% |
| Gold (front-month futures) | 2000 | +10.60% | 11.29% | 77% | +7.62% | 81% |
| REITs (IYR ETF) | 2000 | +6.28% | 10.70% | 75% | +5.09% | 75% |
| REITs (VNQ ETF) | 2004 | +4.58% | 10.19% | 71% | +4.11% | 81% |
| Bitcoin | 2014 | +65.59% | 63.72% | 99% | +56.02% | 98% |
Trailing 3-year real returns ending the most recent month
| asset | window ends | trailing 3y real return | trailing 3y nominal | trailing-10y median 3y real | spread vs 10y median |
|---|---|---|---|---|---|
| Bitcoin | 2026-05-01 | +36.99% | +41.23% | +65.59% | -28.60 pp |
| Gold (front-month futures) | 2026-05-01 | +35.33% | +40.35% | +3.79% | +31.54 pp |
| S&P 500 (total return) | 2026-05-01 | +19.42% | +23.12% | +9.51% | +9.91 pp |
| VTI (broad US equity) | 2026-05-01 | +19.01% | +22.70% | +9.22% | +9.79 pp |
| S&P 500 (price) | 2026-05-01 | +17.82% | +21.47% | +7.33% | +10.49 pp |
| REITs (VNQ ETF) | 2026-05-01 | +7.61% | +10.94% | +3.12% | +4.49 pp |
| REITs (IYR ETF) | 2026-05-01 | +7.16% | +10.48% | +3.45% | +3.72 pp |
| US residential RE (ZHVI top-100 mean) | 2026-04-01 | -0.10% | +3.05% | +3.75% | -3.85 pp |
Cumulative real return per asset, 2000-now (log scale)
What beat what. Across the full 2000-now history of monthly rolling windows, the median 3-year annualized real return ranking is dominated by Bitcoin (+65.6%, on a much shorter post-2014 history and with extreme tail risk) and then by gold and broad-equity / S&P 500 total return. REITs sit in the +4–6% range. The Zillow ZHVI top-100 mean — our private-RE benchmark, before leverage and before rent — comes in at +3.53% median 3y real, ranking #8/8. The trailing-3y leader as of the latest month is Bitcoin at +36.99% real.
- BTC selection bias. Bitcoin's history starts in 2014, a strictly post-GFC, mostly-bull-market regime. Other assets include 2000-2002 and 2008-2009.
- REITs are not private RE. IYR/VNQ are leveraged, mark-to-market, and earn public-market liquidity premia. Best free TR proxy, but different animal.
- ZHVI ignores rent, maintenance, and property tax. The yield-aware fwd_3y_total model in Section 4 partially addresses this per ZIP.
- No leverage assumed. Real-estate investors typically run 4–5× leverage; the equity assets here are unlevered.
4. Alternative ranking: yield-aware total return
Everything in Section 1 ranks zips by predicted price appreciation. An investor
who cares about yield (rent flowing in, not just paper appreciation) wants a different
ranking: a separate LightGBM trained on
fwd_3y_total = price_return + ZORI rent yield carry. The leaderboard below
shows the top 30 zips by that target. Caveat: ZORI covers only ~11% of zip-months, so
this ranking's universe is much smaller (~2K zips vs ~10K for the price-only model).
| ZIP | CBSA | ZHVI | ZORI | Gross yield | Pred fwd_3y total ann. | Top SHAP drivers |
|---|---|---|---|---|---|---|
| 30901 | 12260 | $71,236 | $1,229 | 20.7% | +34.05% | Median home value (+0.086), rent_yield_t (+0.064), ZHVI at time t (+0.025) |
| 38106 | 32820 | $55,228 | $865 | 18.8% | +33.93% | Median home value (+0.087), rent_yield_t (+0.066), ZHVI at time t (+0.028) |
| 48238 | 19820 | $55,536 | $1,041 | 22.5% | +33.38% | Median home value (+0.088), rent_yield_t (+0.059), ZHVI at time t (+0.029) |
| 19133 | 37980 | $81,289 | $1,278 | 18.9% | +32.60% | Median home value (+0.086), rent_yield_t (+0.062), ZHVI at time t (+0.024) |
| 43605 | 45780 | $59,051 | $953 | 19.4% | +32.48% | Median home value (+0.086), rent_yield_t (+0.059), ZHVI at time t (+0.026) |
| 19132 | 37980 | $78,112 | $1,258 | 19.3% | +32.31% | Median home value (+0.085), rent_yield_t (+0.063), ZHVI at time t (+0.023) |
| 21223 | 12580 | $76,194 | $1,355 | 21.3% | +32.24% | Median home value (+0.084), rent_yield_t (+0.058), ZHVI at time t (+0.023) |
| 14611 | 40380 | $94,557 | $1,351 | 17.1% | +32.18% | Median home value (+0.086), rent_yield_t (+0.058), ZHVI at time t (+0.018) |
| 64128 | 28140 | $89,801 | $1,165 | 15.6% | +32.12% | Median home value (+0.084), rent_yield_t (+0.058), ZHVI at time t (+0.022) |
| 43609 | 45780 | $61,337 | $823 | 16.1% | +31.94% | Median home value (+0.085), rent_yield_t (+0.057), ZHVI at time t (+0.026) |
| 45417 | 19430 | $66,696 | $1,000 | 18.0% | +31.81% | Median home value (+0.084), rent_yield_t (+0.059), ZHVI at time t (+0.023) |
| 43607 | 45780 | $71,388 | $1,225 | 20.6% | +31.54% | Median home value (+0.083), rent_yield_t (+0.057), ZHVI at time t (+0.022) |
| 40212 | 31140 | $69,586 | $915 | 15.8% | +31.44% | Median home value (+0.084), rent_yield_t (+0.061), ZHVI at time t (+0.022) |
| 44110 | 17410 | $79,486 | $906 | 13.7% | +31.30% | Median home value (+0.082), rent_yield_t (+0.045), ZHVI at time t (+0.024) |
| 40210 | 31140 | $71,852 | $998 | 16.7% | +31.30% | Median home value (+0.084), rent_yield_t (+0.061), ZHVI at time t (+0.022) |
| 35211 | 13820 | $75,450 | $1,098 | 17.5% | +31.24% | Median home value (+0.083), rent_yield_t (+0.060), ZHVI at time t (+0.023) |
| 44105 | 17410 | $69,008 | $960 | 16.7% | +31.23% | Median home value (+0.084), rent_yield_t (+0.057), ZHVI at time t (+0.021) |
| 63137 | 41180 | $86,974 | $1,100 | 15.2% | +31.21% | Median home value (+0.085), rent_yield_t (+0.060), ZHVI at time t (+0.022) |
| 48228 | 19820 | $58,895 | $1,052 | 21.4% | +31.18% | Median home value (+0.082), rent_yield_t (+0.059), ZHVI at time t (+0.025) |
| 44108 | 17410 | $69,151 | $1,035 | 18.0% | +31.18% | Median home value (+0.086), rent_yield_t (+0.059), ZHVI at time t (+0.023) |
| 48227 | 19820 | $62,223 | $1,137 | 21.9% | +31.07% | Median home value (+0.081), rent_yield_t (+0.059), ZHVI at time t (+0.025) |
| 35228 | 13820 | $81,166 | $1,024 | 15.1% | +31.06% | Median home value (+0.083), rent_yield_t (+0.060), ZHVI at time t (+0.023) |
| 53206 | 33340 | $68,826 | $766 | 13.3% | +30.97% | Median home value (+0.078), rent_yield_t (+0.042), ZHVI at time t (+0.023) |
| 40211 | 31140 | $85,011 | $1,076 | 15.2% | +30.87% | Median home value (+0.075), rent_yield_t (+0.063), ZHVI at time t (+0.020) |
| 48205 | 19820 | $53,492 | $1,158 | 26.0% | +30.80% | Median home value (+0.081), rent_yield_t (+0.055), ZHVI at time t (+0.026) |
| 19140 | 37980 | $86,893 | $1,207 | 16.7% | +30.71% | Median home value (+0.082), rent_yield_t (+0.059), ZHVI at time t (+0.022) |
| 63136 | 41180 | $74,951 | $1,082 | 17.3% | +30.29% | Median home value (+0.081), rent_yield_t (+0.059), ZHVI at time t (+0.021) |
| 44112 | 17410 | $82,598 | $756 | 11.0% | +29.93% | Median home value (+0.077), rent_yield_t (+0.038), ZHVI at time t (+0.022) |
| 63121 | 41180 | $83,882 | $1,024 | 14.7% | +29.90% | Median home value (+0.073), rent_yield_t (+0.061), ZHVI at time t (+0.020) |
| 48224 | 19820 | $78,053 | $1,195 | 18.4% | +29.79% | Median home value (+0.080), rent_yield_t (+0.058), ZHVI at time t (+0.021) |
SHAP driver key: Median home value = ACS-reported median owner-occupied home value (self-reported survey); rent_yield_t = no detailed label registered; ZHVI at time t = current Zillow Home Value Index for the zip, the price level we're predicting the return on.
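The gross yield column can be checked against the report's own derived-feature definition, gross rental yield = 12 × ZORI / ZHVI. The sketch below recomputes it for the first three rows of the table. The function name is a hypothetical helper, and the sketch covers only the yield column, not how the fwd_3y_total target combines yield with price return.

```python
def gross_rental_yield(zori_monthly_rent, zhvi_home_value):
    """Annual rent as a fraction of home value: 12 * ZORI / ZHVI."""
    return 12.0 * zori_monthly_rent / zhvi_home_value


# First three rows of the table: (ZIP, ZHVI in $, ZORI in $/month).
rows = [("30901", 71236, 1229), ("38106", 55228, 865), ("48238", 55536, 1041)]
yields = {
    zip_code: round(100 * gross_rental_yield(zori, zhvi), 1)
    for zip_code, zhvi, zori in rows
}
print(yields)
```

The results match the table's 20.7%, 18.8%, and 22.5%. These are gross figures: vacancy, maintenance, taxes, and financing are not deducted.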
The decision-different finding: only 11 of 50 zips appear in both this and the price-only top 50 — a 22% overlap. Price-only tilts toward sunbelt fringe (Tulsa, Kansas City, Memphis, New Orleans) at median rent yield ~12.5%. Total-return is dominated by rust-belt high-yield zips (Detroit ×7, Cleveland ×5, Dayton, Akron, Toledo, Birmingham) at median rent yield ~15.8%. Same model architecture, just yield-aware → very different shortlist.
5. Headline results
The table below picks the best-performing model for each horizon, reported on both temporal splits. The post-2012 block is the main result (larger, more recent test window); the pre-2008 block is a stress test through the 2008 housing crisis. A signal that shows up in both — same sign, same order-of-magnitude spread — is much harder to dismiss as overfitting to one regime.
| horizon | best model | Spearman ρ | decile spread | top decile | bottom decile |
|---|---|---|---|---|---|
| post-2012 (main) | | | | | |
| short (1y) | elasticnet | +0.311 | +9.26% | +13.01% | +3.75% |
| medium (3y) | lightgbm | +0.327 | +7.42% | +10.24% | +2.82% |
| long (5y) | lightgbm | +0.432 | +3.65% | +10.09% | +6.44% |
| pre-2008 (GFC stress) | | | | | |
| short (1y) | elasticnet | +0.490 | +14.18% | +6.05% | -8.13% |
| medium (3y) | elasticnet | +0.346 | +10.68% | +3.68% | -6.99% |
| long (5y) | lightgbm | +0.070 | +2.60% | +5.34% | +2.74% |
Why this study exists
Residential real estate is the largest asset class most people will ever own, and the conventional wisdom for picking where and when to buy is dominated by qualitative heuristics (school districts, "up-and-coming neighborhoods", proximity to Whole Foods) or single-metric proxies (population growth, median income). This study asks: which of these signals actually predict forward returns, and by how much, separated by time horizon — and are there underweighted predictors that the textbook approach misses?
The framing is deliberately quantitative. We built a 3.73-million-row panel of every zip code in the top 100 US metros, monthly, 2000-2026, joined with 140+ candidate predictors organized into 14 thematic classes. We trained gradient-boosted models to predict 1-, 3-, and 5-year forward home-price returns on two backtest windows: pre-2008 (GFC stress test) and post-2012 (rate-shock + COVID).
The setup in one minute
What we predict
For each zip code at each month, we compute the forward home-price return (Zillow ZHVI index) over three horizons — 1 year, 3 years, 5 years — annualized for the latter two. These are the targets our models try to learn from upstream features.
The chart below illustrates what those forward windows look like for a single real ZIP. The diamond marks an example "prediction date"; the three shaded windows are the 1-, 3-, and 5-year forward periods we score the model against. Predict only on what you knew at time t, score on what actually happened by time t+h: that is the discipline that makes the backtest honest.
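The forward targets can be sketched as follows. This assumes a monthly price series and geometric annualization, which the report does not spell out. The function name is hypothetical.

```python
def forward_annualized_return(prices, t, horizon_months):
    """Annualized return from month index t to t + horizon_months.

    Returns None when the forward window runs past the end of the series,
    so rows without an observable label are never scored.
    """
    end = t + horizon_months
    if end >= len(prices):
        return None
    growth = prices[end] / prices[t]
    # For a 12-month horizon the exponent is 1, so no annualization occurs.
    return growth ** (12.0 / horizon_months) - 1.0


# Hypothetical index growing 1% per month for five years.
prices = [100.0 * 1.01 ** i for i in range(61)]
fwd_1y = forward_annualized_return(prices, 0, 12)
fwd_3y = forward_annualized_return(prices, 0, 36)
fwd_5y = forward_annualized_return(prices, 0, 60)
```

With constant monthly growth, all three horizons annualize to the same figure, which is a quick sanity check on the exponent.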
How we predict it
For each (zip, month) row, we join a wide set of features that were knowable at that time (with explicit publication-lag adjustment per source — e.g., HMDA mortgage data isn't available until 9 months after the year it describes). We then train two families of models per horizon:
- Baselines: predict the global mean; predict the within-metro mean; OLS regression on Census demographics only ("hedonic"). These are what a real model has to beat.
- Real models: ElasticNet (regularized linear, interpretable) and LightGBM (gradient-boosted trees, captures interactions). Both report Spearman rank correlation and decile spread on out-of-sample test data; LightGBM additionally produces SHAP feature attributions.
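The two headline metrics can be sketched in plain Python. This assumes the decile spread is the mean realized return of the top predicted decile minus that of the bottom predicted decile, which matches the headline table's columns. Whether the report forms deciles per month or on the pooled test set is not stated, so this sketch pools. All function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(pred, realized):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(pred), average_ranks(realized))


def decile_spread(pred, realized):
    """Mean realized return of the top predicted decile minus the bottom."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    k = max(1, len(pred) // 10)
    bottom = [realized[i] for i in order[:k]]
    top = [realized[i] for i in order[-k:]]
    return sum(top) / k - sum(bottom) / k
```

Both metrics depend only on how predictions order the ZIPs, so a model with biased levels but correct ordering still scores well on them.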
How we validate
Two temporal splits, both reported in this run:
- pre_2008 split: train on data through 2007-12, test on 2008-2013. Stress-tests whether the model breaks across the 2008 housing crisis (the "GFC" — Global Financial Crisis).
- post_2012 split: train through 2017-12, test on 2018-2024. Stress-tests across the Fed's 2022 rate hike cycle (the "rate shock") and the COVID-era demand surge.
This run additionally reports spatial cross-validation (hold out entire CBSAs) and walk-forward CV (rolling 12-month test windows) on top of the two static splits; see Section 6 below.
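The two static splits can be written as a small date filter. This sketch assumes rows carry a 'month' key formatted 'YYYY-MM' and that the test ranges cover whole calendar years; both are illustrative assumptions. It does not handle rows near the cutoff whose forward window reaches into the test period.

```python
# (train_end, test_start, test_end) per split, taken from the dates above.
SPLITS = {
    "pre_2008": ("2007-12", "2008-01", "2013-12"),
    "post_2012": ("2017-12", "2018-01", "2024-12"),
}


def temporal_split(rows, train_end, test_start, test_end):
    """Split rows by month; 'YYYY-MM' strings sort chronologically."""
    train = [r for r in rows if r["month"] <= train_end]
    test = [r for r in rows if test_start <= r["month"] <= test_end]
    return train, test


# Hypothetical rows, one per month of interest.
rows = [{"month": m} for m in
        ["2007-12", "2008-01", "2013-12", "2014-01", "2017-12", "2018-06", "2025-01"]]
train_pre, test_pre = temporal_split(rows, *SPLITS["pre_2008"])
train_post, test_post = temporal_split(rows, *SPLITS["post_2012"])
```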
14 thematic feature classes
The current feature store covers 22 data sources organized into 14 thematic classes, producing 140+ modeling features after coverage filtering. Hover over any feature name in the SHAP charts below for a one-line description.
- Macro / rates: 30Y mortgage, 10Y TIPS, M2, Case-Shiller, lumber, US unemployment + their 1/3/12-month deltas (FRED)
- Inventory: months-of-supply, days-on-market, price cuts, active/new/pending listings per zip (Realtor.com)
- Demographics: population by age cohort, household income, education, home value, rent, owner-occupancy at ZCTA (Census ACS, 10 vintages 2013-2022); per-capita personal income at county (BEA)
- Migration: county-to-county AGI flow (IRS SOI)
- Supply: building permits at MSA (Census BPS); business establishments + employment per zip (Census ZBP)
- Jobs: wages + YoY growth at county (BLS QCEW); monthly unemployment rate at county (BLS LAUS)
- Affordability / cost of living: effective property-tax rate per ZCTA (ACS B25103 / B25077); HUD Small-Area Fair Market Rent per ZIP; residential electricity price per state (EIA)
- Healthcare: CMS Hospital Compare star ratings + bed counts by county; County Health Rankings composite + headline measures (life expectancy, premature death, smoking, obesity)
- Schools: NCES district-year enrollment, pupil-teacher ratio, free-lunch share, per-pupil spending — county-broadcast
- Politics: county presidential vote share + winning margin (MIT Election Lab, 2000-2024)
- Climate: trailing-10-year disaster declarations by category (OpenFEMA); flood, wildfire, hurricane, heatwave risk scores at tract (FEMA NRI); annual PM2.5 + ozone air quality (EPA AQS)
- Amenities + infrastructure: EV charging stations per zip (DOE AFDC)
- Behavioral: monthly Wikipedia pageviews per metro article (search-interest proxy)
- Derived (no new data): gross rental yield = 12 × ZORI / ZHVI; rolling 24-mo ZHVI volatility; ZHVI drawdown vs trailing-60-mo peak
Pending modules with code ready but waiting on credentials
HMDA (tract-level mortgage records; currently only state-level aggregations), FCC BDC broadband, NOAA VIIRS satellite nightlights, Foursquare POIs (Whole Foods / coffee / breweries), FBI NIBRS county-year crime rates (needs CDE_API_KEY), USAspending federal $ flows.
6. Confidence: walk-forward and spatial robustness
A single train/test split can flatter or punish a model by accident. Two stronger CV regimes re-run the post-2012 LightGBM to bound how much of the headline number is structural vs. noise.
6a. Walk-forward CV (rolling-origin)
Train through year T−1, test on year T; roll the cutoff from 2018-12 through 2022-12. Five test years per horizon (fewer for fwd_5y, which would need observability beyond the panel's end). The schematic below shows the sliding window: each row is one fold, blue is the training range, green is the held-out test year.
| horizon | n folds | mean Spearman ρ | std ρ | 95% interval ρ | mean decile spread | std decile spread |
|---|---|---|---|---|---|---|
| short (1y) | 5 | +0.377 | 0.179 | [+0.161, +0.536] | +8.03% | 3.23% |
| medium (3y) | 5 | +0.447 | 0.233 | [+0.123, +0.698] | +5.86% | 2.70% |
| long (5y) | 3 | +0.476 | 0.108 | [+0.372, +0.577] | +4.96% | 0.78% |
For the headline fwd_3y horizon, mean Spearman ρ = +0.447 with std 0.233 across 5 folds — the +0.327 figure from the single static post-2012 split survives in expectation, but with meaningful year-to-year noise. 2020 is the variance driver, as expected.
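The fold counts in the table (5, 5, 3) can be reproduced with a small fold generator. This is a sketch under stated assumptions: each cutoff from 2018-12 through 2022-12 is treated as the last month of its test year, which is what the fwd_5y caveat at the end of the report implies, and a fold is kept only when its last label month falls within a panel ending 2026-04. The report's actual fold construction may differ.

```python
def walk_forward_folds(test_years, horizon_years, panel_end):
    """Return (train_end, test_start, test_end) month strings per usable fold.

    panel_end is a (year, month) tuple. A fold is dropped when the label for
    its last test month would only be observable after the panel ends.
    """
    folds = []
    for year in test_years:
        last_label_month = (year + horizon_years, 12)
        if last_label_month > panel_end:
            continue
        folds.append((f"{year - 1}-12", f"{year}-01", f"{year}-12"))
    return folds


PANEL_END = (2026, 4)          # assumption: panel ends 2026-04
TEST_YEARS = range(2018, 2023)  # assumption: test years 2018 through 2022

fold_counts = {h: len(walk_forward_folds(TEST_YEARS, h, PANEL_END)) for h in (1, 3, 5)}
print(fold_counts)
```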
6b. Spatial CV (whole-CBSA holdout)
5-fold CBSA holdout, with train+test both restricted to ≤ 2017-12 so the only stressor is the spatial axis (no temporal regime shift). Two variants: the raw model predicts the price return directly; the demeaned model predicts the within-metro residual — i.e., the model is given each ZIP's return after subtracting the average return of every other ZIP in the same metro that month. That isolates "can the model rank ZIPs inside a metro?" from the easier question "can the model tell expensive metros from cheap ones?".
The plot below shows the actual partition: every top-100 metro is assigned to exactly one of five folds (color). Bubble size = metro population.
Raw fwd_3y
| horizon | n folds | mean Spearman ρ | std ρ | mean decile spread | std decile spread |
|---|---|---|---|---|---|
| short (1y) | 5 | +0.786 | 0.044 | +25.06% | 5.14% |
| medium (3y) | 5 | +0.817 | 0.032 | +20.54% | 3.34% |
| long (5y) | 5 | +0.845 | 0.027 | +16.92% | 2.19% |
Demeaned (within-metro residual) fwd_3y
| horizon | n folds | mean Spearman ρ | std ρ | mean decile spread | std decile spread |
|---|---|---|---|---|---|
| short (1y) | 5 | +0.230 | 0.016 | +3.72% | 0.30% |
| medium (3y) | 5 | +0.287 | 0.028 | +3.36% | 0.37% |
| long (5y) | 5 | +0.309 | 0.035 | +3.34% | 0.37% |
The headline lesson. The raw model's Spearman ρ = +0.817 with decile spread +20.54% looks spectacular — but most of it is a level effect ("expensive zips stay expensive"). The demeaned model's Spearman ρ = +0.287 with decile spread +3.36% is the genuine cross-sectional alpha.
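The within-metro residual target described above can be sketched as a leave-one-out demeaning within each (metro, month) group. The row layout and function name are hypothetical. ZIPs that are alone in their group get no residual here, an edge case the report does not describe.

```python
from collections import defaultdict


def demean_within_metro(rows):
    """Return one residual per row: the row's return minus the mean return
    of the other ZIPs in the same (cbsa, month) group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        key = (r["cbsa"], r["month"])
        sums[key] += r["ret"]
        counts[key] += 1

    residuals = []
    for r in rows:
        key = (r["cbsa"], r["month"])
        n = counts[key]
        if n < 2:
            residuals.append(None)  # no peers to compare against
        else:
            peer_mean = (sums[key] - r["ret"]) / (n - 1)
            residuals.append(r["ret"] - peer_mean)
    return residuals


# Hypothetical rows: three ZIPs in one metro, one ZIP alone in another.
rows = [
    {"cbsa": "A", "month": "2015-01", "ret": 0.02},
    {"cbsa": "A", "month": "2015-01", "ret": 0.04},
    {"cbsa": "A", "month": "2015-01", "ret": 0.06},
    {"cbsa": "B", "month": "2015-01", "ret": 0.10},
]
residuals = demean_within_metro(rows)
```

Demeaning removes the metro-level component, so a model trained on the residuals can only score well by ranking ZIPs against their own neighbors.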
7. Did this actually work? — portfolio backtest
The most decision-relevant question. We replay history: at each month from 2018-01 through 2021-12, we run the model with only the data it could have seen at that time, take its top-decile picks (a "basket" of about 1,000 ZIPs), and look up what those ZIPs actually returned 3 years later. Compare to the universe mean (equal-weight all ZIPs) and to the bottom-decile basket. Alpha is the basket's excess annualized return over the universe mean.
| model | months | universe / mo | top-10% basket | top-10% realized | universe realized | bottom-10% realized | top excess | top − bot spread | hit rate (top > univ) |
|---|---|---|---|---|---|---|---|---|---|
| Price-only (fwd_3y) | 48 | 9,932 | ~992 | +11.37% | +9.14% | +7.41% | +2.23 pp | +3.95 pp | 91.7% |
| Total-return (fwd_3y_total) | 48 | 2,084 | ~208 | +20.37% | +13.85% | +7.58% | +6.52 pp | +12.78 pp | 100.0% |
The model adds skill. The price-only fwd_3y headline model's top decile beat the universe mean in 91.7% of months (44/48), earning roughly +2.23 percentage points per year of alpha and a +3.95 pp/yr top-vs-bottom spread. The yield-aware fwd_3y_total model is in another class, with a 100.0% hit rate (48/48) and +6.52 pp/yr alpha, but its scoring universe is much smaller (only ZIPs where Zillow publishes rent data).
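Hit rate and alpha reduce to simple arithmetic over the per-month results. The sketch below assumes one realized annualized return per formation month for the top-decile basket and for the universe. The input series are hypothetical, built so that 44 of 48 months are wins.

```python
def backtest_summary(top_returns, universe_returns):
    """Return (hit_rate, mean_alpha) over paired per-month realized returns."""
    excess = [t - u for t, u in zip(top_returns, universe_returns)]
    hit_rate = sum(1 for e in excess if e > 0) / len(excess)
    mean_alpha = sum(excess) / len(excess)
    return hit_rate, mean_alpha


# Hypothetical series: the basket beats the universe in 44 of 48 months.
top = [0.10] * 44 + [0.05] * 4
universe = [0.08] * 48
hit_rate, mean_alpha = backtest_summary(top, universe)
print(f"hit rate {100 * hit_rate:.1f}%, alpha {100 * mean_alpha:.2f} pp/yr")
```

Note that 44 winning months out of 48 is 91.7%. Because the 3-year holding windows of adjacent formation months overlap heavily, the 48 months are far from independent observations.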
7a. Per-cohort wealth: $1 in, $? out after 3 years
The same backtest expressed as terminal wealth. For each entry month, $1 invested in the top-decile basket, the universe, or the bottom-decile basket is compounded forward for the full 3-year hold.
7b. Calibration: is the model a good ranker, a good forecaster, or both?
A model can rank ZIPs correctly yet still be mis-calibrated — predicting +3% when realized averages +7%. We bin every test-set prediction into 20 quantile buckets and plot mean predicted vs mean realized in each bucket. Points on the dashed line are perfectly calibrated.
The cloud's slope is the rank quality (steeper = stronger ranking); the cloud's shift relative to the diagonal is the calibration error. A model can have a steep slope but sit far below the diagonal — useful for picking but biased in level. That pattern is exactly what we'd expect in a regime-shift test window like post-2012, and is why R² is not the headline metric.
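The binning behind the calibration plot can be sketched as below. This assumes near-equal-count buckets formed by sorting on the prediction; the report's exact quantile edges are not given. The function name is hypothetical.

```python
def calibration_table(pred, realized, n_bins=20):
    """Sort by prediction, cut into n_bins near-equal buckets, and return
    a list of (mean_predicted, mean_realized) pairs, one per bucket."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    n = len(order)
    table = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue  # fewer rows than bins
        mean_pred = sum(pred[i] for i in idx) / len(idx)
        mean_real = sum(realized[i] for i in idx) / len(idx)
        table.append((mean_pred, mean_real))
    return table


# Hypothetical model that ranks perfectly but under-predicts by 4 points.
pred = [i / 100 for i in range(200)]
realized = [p + 0.04 for p in pred]
table = calibration_table(pred, realized)
```

In this toy case every bucket sits exactly 0.04 off the diagonal: the ranking is perfect but the level is biased.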
8. Decile spread, every model × horizon × split
Each bar is one model on one horizon on one test window. Positive values mean the model's top-decile picks beat its bottom-decile picks; negative values mean it anti-ranked. The left panel is the pre-2008 split (smaller and noisier); the right is post-2012 (the main result).
8a. Realized return by predicted decile (headline horizon × split)
The chart above collapses each (model, horizon, split) into a single "top minus bottom" number. The one below zooms into the headline (post-2012 LightGBM, fwd_3y) and shows the realized return of every decile, not just the extremes. A monotone climb from decile 1 → decile 10 is the strong-form claim: the model is ordering ZIPs, not just separating the very best from the very worst.
LightGBM wins decisively on the medium and long horizons in the post-2012 split. On the 1-year horizon, the demographics-only hedonic baseline (a simple linear regression using only Census ACS variables) actually edges LightGBM — a useful sanity check that when a simple baseline ties a sophisticated model, the signal probably comes from the data more than the algorithm.
9. What did the model learn? — SHAP feature importance
For each horizon, we compute mean absolute SHAP value per feature on the training sample. SHAP decomposes each prediction into per-feature contributions. The bars below show the 15 features that the LightGBM model used most heavily.
Short horizon (1 year)
Medium horizon (3 years)
Long horizon (5 years)
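Mean absolute SHAP per feature is a column-wise average of absolute attributions. The sketch below assumes the SHAP values are already available as a rows-by-features matrix. It does not compute SHAP values itself, and the feature names and numbers are hypothetical.

```python
def top_features_by_mean_abs_shap(shap_rows, feature_names, k=15):
    """shap_rows: list of per-row SHAP vectors, one value per feature.
    Returns the k (feature, mean |SHAP|) pairs with the largest scores."""
    n = len(shap_rows)
    scores = [
        sum(abs(row[j]) for row in shap_rows) / n
        for j in range(len(feature_names))
    ]
    ranked = sorted(zip(feature_names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical attributions for three features over four rows.
names = ["drawdown_60m", "tips_10y", "median_home_value"]
shap_rows = [
    [0.50, -0.10, 0.02],
    [-0.30, 0.20, 0.01],
    [0.40, -0.15, -0.03],
    [-0.40, 0.15, 0.02],
]
ranking = top_features_by_mean_abs_shap(shap_rows, names, k=2)
```

Taking absolute values before averaging matters: a feature that pushes predictions up for some ZIPs and down for others would otherwise average out to near zero.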
9a. Per-ZIP attribution: why this ZIP, not that one?
The aggregate SHAP charts above answer "which features does the model rely on, on average?" The two waterfalls below answer the more decision-relevant version: for this specific ZIP, which features pushed its predicted return up or down, and by how much? Green bars push the prediction higher than the model's base rate; red bars push it lower.
Top-decile example
Bottom-decile example
The pattern that recurs across nearly every top-decile ZIP: a large positive contribution from the drawdown feature combined with a uniform negative contribution from the 10Y real rate. Bottom-decile ZIPs typically show the opposite.
Four findings worth flagging:
- Drawdown is the #1 short-horizon predictor. The derived "ZHVI drawdown vs trailing 60-month peak" feature carries the largest mean-abs SHAP at fwd_1y. Zips that have fallen most relative to their recent peak tend to snap back hardest — a clean cross-sectional mean-reversion signal.
- Real rates dominate medium/long by a wide margin. At fwd_3y, 10Y TIPS yield's mean-abs SHAP is more than 4× the next-largest feature; at fwd_5y, more than 3×. Medium and long-horizon ranking is mostly about macro conditions, not zip-specific properties.
- New political + behavioral features are earning their keep. County Republican vote share (MIT Election Lab) appears in the top 10 for every horizon — and at fwd_5y, both party vote shares appear, so the model is reading the political-identity gradient, not taking sides. Wikipedia metro-article pageviews ranks #3 at fwd_3y.
- Climate has receded to a short-horizon signal. FEMA disaster counts and NRI wildfire risk drop out of the top 10 for fwd_3y and fwd_5y under the expanded feature pool, while staying in fwd_1y — most likely capturing short-term storm-driven price dynamics.
10. Does the signal survive price stratification?
A common worry with any model that ranks zip codes is that it has secretly learned a
price-level proxy: cheap zips mean-revert upward, expensive zips compound more slowly,
and the "alpha" is just a roundabout way of buying low. To check that, we split the
post-2012 test set into three ZHVI tiers and re-compute the LightGBM
fwd_3y decile spread within each tier.
| price tier (ZHVI) | test rows | median ZHVI | top decile realized | bottom decile realized | decile spread |
|---|---|---|---|---|---|
| <$200K | 158,318 | $152,181 | +11.98% | +5.63% | +6.35% |
| $200K–$500K | 339,898 | $307,788 | +9.23% | +2.66% | +6.57% |
| $500K+ | 151,656 | $705,620 | +6.25% | +2.90% | +3.34% |
Interpretation. The decile spread is +6.35% for sub-$200K homes, +6.57% in the $200K–$500K middle, and +3.34% for $500K+ homes — all positive, all out-of-sample. The two cheaper tiers carry essentially the same spread, and the $500K+ tier roughly halves but stays positive. This is the pattern you'd expect if the model has real cross-sectional signal at every price point and luxury markets simply have flatter forward returns — not the pattern you'd expect if the model were secretly a price-mean-reversion proxy.
Caveats and what's not in this run
- HMDA mortgage records are in the feature store but didn't reach the trained model. Tract-year origination counts + institutional-share are loaded at ~12% panel coverage, but the data only spans 2020 / 2022 / 2023. The post-2012 split's train window ends at 2017-12 — entirely pre-HMDA — so prep's coverage filter drops every HMDA column from X.
- Four data sources still pending. FCC broadband, NOAA VIIRS satellite nightlights, Foursquare POIs, and FBI NIBRS crime aggregates (needs CDE_API_KEY) are wired into the assembler but blocked on credentials.
- Owner-occupier overlay is geographically narrow. 22 of 30 zips fall in the Washington, DC metro because the filter chain collapses to wealthy suburbs of one dominant metro.
- Total-return modeling now done but on a smaller universe. ZORI rent index covers only ~11% of zip-months, so fwd_3y_total trains on ~215K rows vs ~1.98M for price-only.
- ElasticNet regressed under the new QuantileTransformer scaling. The Wave-1 fix solved an RMSE blowup but flipped fwd_3y Spearman to negative. LightGBM is unaffected and remains the headline model.
- Pre-2008 coverage is thin. Realtor inventory (2016+), QCEW (2010+), ACS (2013+), BPS (2011+), IRS-SOI (2011+) all start post-GFC. The pre_2008 results in the headline table train on 27 features vs 41 post-2012.
- fwd_5y walk-forward CV has 3 folds, not 5 — the 2021-12 and 2022-12 cutoffs would need fwd_5y observability through 2026-12 / 2027-12, beyond the panel's end.
arbok · model-first, personal-overlay layered on top · all data free / public sources · code at src/arbok/