Market Homogeneity from Correlation Networks: An Empirical US Equity Study
Abstract
This study constructs a monthly Homogeneity Index (HI) for US large-cap equities from the topology of thresholded return-correlation networks. Using a sector-balanced sample of approximately fifty S&P 500 constituents and daily adjusted closes from Yahoo Finance (2019 onward), we estimate pairwise correlations over rolling twenty-day windows, retain statistically strong links, and summarize network structure into a scalar index combining linkage rate, dominant-cluster share, density, and concentration. High-homogeneity months—when stocks move as a tightly linked block—are associated with constrained diversification and elevated systemic co-movement risk; low-homogeneity months favour cross-sectional differentiation. We overlay SPY benchmark returns, classify regimes with reference thresholds, compare contemporaneous and forward returns across regimes, and visualize correlation networks at peak and trough homogeneity. Results are descriptive and hypothesis-generating; they do not constitute investment advice.
Introduction and Research Context
During stress episodes, equity correlations tend to rise toward unity—a phenomenon documented in contagion and systemic-risk literatures (Longin and Solnik, 2001; Ang and Chen, 2002). Portfolio managers observe that diversification benefits shrink precisely when they are most needed. A natural research question is whether network structure in the cross-section of stock returns can be distilled into a tractable market-state indicator.
This research addresses three core questions:
- Does a correlation-network homogeneity measure vary meaningfully over time in US large caps?
- Are high- and low-homogeneity regimes associated with different contemporaneous and forward SPY return profiles?
- How does the visual topology of stock linkages differ between dense and sparse correlation regimes?
We adopt a descriptive empirical framework—monthly index construction, regime counts, conditional return tables, and network exhibits—rather than claiming out-of-sample predictability or risk-adjusted alpha.
Theoretical Foundations: Correlation Networks and Market States
Several economic channels plausibly link network homogeneity to market state:
- Risk-on / risk-off rotation — macro shocks synchronize sector returns, increasing pairwise correlation and shrinking effective independent bets.
- Liquidity commonality — funding stress propagates through correlated deleveraging, thickening the correlation graph.
- Factor crowding — when a small set of systematic factors dominates, stocks cluster into few equivalence classes in correlation space.
- Idiosyncratic dispersion — calm, stock-picking regimes produce sparser networks with many small structural classes.
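Formally, let $G_t = (V, E_t)$ denote the graph at month $t$ with vertices $V$ (stocks) and edges $E_t = \{(i,j) : |\rho_{ij,t}| \ge \tau\}$, where $\rho_{ij,t}$ is the estimated return correlation between stocks $i$ and $j$ and $\tau$ is the retention threshold. Network statistics (density, largest connected cluster share, and class fragmentation) summarize how uniformly returns co-move. The Homogeneity Index $HI_t$ aggregates these into a single scalar, with higher values indicating tighter, more uniform linkage.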
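As a hedged illustration, two of the network statistics named above can be written with their standard graph-theoretic definitions. The notation $G_t = (V, E_t)$ for the month-$t$ graph, $N = |V|$, and the symbols $D_t$ and $S_t$ are introduced here for illustration only; the study's exact formulas are not stated in the text and may differ.

```latex
\[
D_t = \frac{2\,|E_t|}{N(N-1)}, \qquad
S_t = \frac{\max_{c \in \mathcal{C}(G_t)} |c|}{N}
\]
```

Here $\mathcal{C}(G_t)$ is the set of connected components of $G_t$, so $D_t$ is the share of possible stock pairs that are linked and $S_t$ is the share of stocks in the largest connected cluster.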
Research Hypotheses
We evaluate four testable propositions:
- H₁ (temporal variation): $HI_t$ exhibits persistent but time-varying structure across the 2019–2026 sample, with identifiable high- and low-homogeneity episodes.
- H₂ (return association): Contemporaneous SPY monthly returns differ on average between high- and low-homogeneity classifications.
- H₃ (forward linkage): Forward SPY returns over 20- and 60-trading-day horizons show regime-dependent averages, though effect sizes may be modest given small event counts.
- H₄ (network topology): Peak-homogeneity months display visibly denser correlation networks and fewer, larger structural equivalence classes than low-homogeneity months.
Regime thresholds (HI > 0.105 high; HI < 0.090 low) follow published reference values from prior international studies; US applications should treat them as research defaults and validate on local samples.
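The three-way classification implied by these thresholds can be sketched as below. The function and constant names are illustrative assumptions; only the two cut-offs come from the text, and the handling of values exactly on a boundary (treated here as normal) is also an assumption.

```python
# Reference thresholds quoted in the text; treat them as research defaults.
HIGH_THRESHOLD = 0.105
LOW_THRESHOLD = 0.090


def classify_regime(hi: float) -> str:
    """Label one monthly HI value as 'high', 'normal', or 'low'.

    Strict inequalities are used, so values exactly on a threshold
    fall into 'normal' (an assumption, not stated in the source).
    """
    if hi > HIGH_THRESHOLD:
        return "high"
    if hi < LOW_THRESHOLD:
        return "low"
    return "normal"


# HI values taken from the reported sample range and latest estimate.
for value in (0.044, 0.100, 0.288):
    print(value, classify_regime(value))
```

Recalibrating on a local sample would then only require changing the two constants.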
Data Sources and Empirical Methodology
The universe comprises a sector-balanced sample of ~50 S&P 500 constituents (five names per GICS sector where available). Daily adjusted closes from Yahoo Finance, starting 2019-01-01, are converted to returns and grouped into rolling twenty-day windows for correlation estimation. SPY serves as the broad-market benchmark for overlay and conditional-return analysis.
The empirical pipeline implements four analytical layers:
- Layer 1 (correlation estimation): pairwise Pearson correlations over a twenty-day lookback; retain edges where $|\rho_{ij,t}| \ge \tau$, with $\rho_{ij,t}$ the window correlation between stocks $i$ and $j$ and $\tau$ the study threshold.
- Layer 2 (network summary): compute density, structural-class count, dominant-class share, and concentration; form the weighted Homogeneity Index $HI_t$.
- Layer 3 (regime analysis): classify each month as high, normal, or low homogeneity; tabulate frequency, histogram, and annual summaries.
- Layer 4 (benchmark conditioning): contemporaneous and forward SPY returns by regime; scatter of HI vs next-month SPY; event tables for extreme episodes.
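The first two layers can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's code: the threshold value `tau=0.7`, the use of absolute correlation, the function names, and the toy return window are all hypothetical, and the largest connected component is used as a stand-in for the dominant cluster.

```python
import numpy as np


def thresholded_adjacency(returns: np.ndarray, tau: float) -> np.ndarray:
    """Boolean adjacency matrix from a (days, stocks) window of returns."""
    corr = np.corrcoef(returns, rowvar=False)   # columns are stocks
    adj = np.abs(corr) >= tau                   # keep strong links only
    np.fill_diagonal(adj, False)                # no self-loops
    return adj


def density(adj: np.ndarray) -> float:
    """Share of possible pairs that are linked."""
    n = adj.shape[0]
    # The symmetric matrix counts every edge twice, matching n * (n - 1).
    return float(adj.sum() / (n * (n - 1)))


def largest_component_share(adj: np.ndarray) -> float:
    """Size of the largest connected component divided by node count."""
    n = adj.shape[0]
    seen = np.zeros(n, dtype=bool)
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, size = [start], 0
        while stack:                             # depth-first traversal
            node = stack.pop()
            size += 1
            for nb in np.flatnonzero(adj[node]):
                if not seen[nb]:
                    seen[nb] = True
                    stack.append(nb)
        best = max(best, size)
    return best / n


# Toy window: three perfectly (anti-)correlated series plus two unrelated ones.
x = np.array([1.0, -1.0, 1.0, -1.0]) * 0.01
y = np.array([1.0, 1.0, -1.0, -1.0]) * 0.01
z = np.array([1.0, -1.0, -1.0, 1.0]) * 0.01
window = np.column_stack([x, 2 * x, -x, y, z])

adj = thresholded_adjacency(window, tau=0.7)     # tau is an assumed value
print(density(adj), largest_component_share(adj))
```

In the toy window three of the five series form one fully linked cluster, so three of the ten possible pairs are connected and the largest component covers three of five nodes.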
Interactive network graphs export node positions (stocks), edge weights (correlation strength), and structural-class colouring for the latest month and for representative high- vs low-homogeneity comparisons.
Empirical Results: Regime Structure and Benchmark Linkage
Analysis over the sample reveals the following patterns (exact figures update with each data refresh):
- Temporal variation (H₁): HI spans a wide range from roughly 0.04 to above 0.50, with a sample mean near 0.12; high-homogeneity months make up just under half of observations in the current refresh and cluster around macro stress windows.
- Return association (H₂): High-homogeneity months show distinct average contemporaneous SPY returns versus low-homogeneity months; magnitudes and signs should be read with small-sample caution.
- Forward linkage (H₃): Average SPY returns 20 and 60 trading days after high- vs low-HI months are reported in the forward-return exhibit; statistical power is limited by event frequency.
- Network topology (H₄): Side-by-side network graphs at peak and trough HI visually confirm denser linkage and fewer structural classes during high-homogeneity episodes.
Discussion: Interpretation, Limitations, and Practical Considerations
The Homogeneity Index is a market-state descriptor, not a standalone trading rule. High HI signals reduced diversification benefit and elevated systemic co-movement; low HI supports stock-level differentiation strategies—but causal inference from correlational regime splits is fragile.
Limitations: (1) sector-balanced subsample of ~50 names may not fully represent the S&P 500; (2) correlation thresholds and lookback windows are research choices; (3) regime thresholds imported from non-US markets require local validation; (4) Yahoo Finance adjusted closes may differ from CRSP/Compustat; (5) forward-return statistics use overlapping windows and modest event counts.
Natural extensions: out-of-sample threshold calibration, sector-residual networks, dynamic conditional correlation models, and linkage to VIX or credit-spread regimes.
Interactive Empirical Exhibits (below)
The Results section reproduces the full homogeneity dataset. Each panel serves a distinct diagnostic role:
- Summary cards — current HI, structural-class count, network connectivity, regime frequency, sample percentiles.
- Correlation networks — latest observation and high- vs low-homogeneity regime comparison with interactive node/edge detail.
- HI time series — monthly index with reference bands and SPY overlay.
- Component decomposition — linkage rate, largest cluster, density, concentration over time.
- Structural metrics — class count and dominant-class size.
- Distribution & regime frequency — histogram and high/normal/low shares.
- SPY conditioning — contemporaneous returns by regime, HI vs forward-return scatter, forward-return comparison bars.
- Annual summary — mean/min/max HI by year.
- Correlation matrix — pairwise heatmap for highly connected names.
- Structural class table — membership of equivalence classes at latest date.
- Episode tables — high- and low-homogeneity dates with forward SPY returns.
Results
S&P 500 sector-balanced (49 stocks). Sample: 2019-01-01 through 2026-06-05. Benchmark: SPY.
Key empirical findings
- As of 2026-06-30, the homogeneity index stands at 0.288, classifying the market as high homogeneity relative to reference thresholds.
- High-homogeneity months account for 47.8% of the sample; these are periods when return correlations cluster and diversification across single names is most constrained.
- Low-homogeneity months represent 35.6% of observations; these are periods when cross-sectional dispersion in correlation structure is widest and stock-level differentiation dominates.
- Contemporaneous SPY monthly returns average +1.65% in high-homogeneity months versus +1.32% in low-homogeneity months (n = 43 and 32 months, respectively).
- Sample mean HI is 0.117 with range [0.044, 0.516].
Summary cards
- Current HI estimate: 0.2880 (high)
- Structural classes: 28 (universe: 49 stocks)
- Network connectivity: 5.00% (59 significant pairs)
- High-homogeneity frequency: 47.80% (sample mean HI: 0.1170)
- Sample median HI: 0.1030 (75th percentile: 0.1330)
- 90th percentile HI: 0.1640 (reference bands: 0.105 / 0.09)
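The network connectivity figure can be cross-checked against the pair count, assuming connectivity means linked pairs divided by all possible pairs (the text does not define it explicitly). The variable names below are illustrative.

```python
from math import comb

n_stocks = 49   # universe size reported in the summary
n_links = 59    # significant pairs reported in the summary

n_pairs = comb(n_stocks, 2)        # number of distinct stock pairs
connectivity = n_links / n_pairs   # assumed definition: links / possible pairs

print(f"{n_pairs} pairs, connectivity = {connectivity:.2%}")
# prints: 1176 pairs, connectivity = 5.02%
```

Under that assumption the result agrees with the reported 5.00% up to rounding.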
Correlation network — latest observation
Cross-sectional map of statistically linked return co-movements. Node size reflects connectivity; colour distinguishes structurally equivalent positions in the network.
Observation date 2026-06-30
Nodes are stocks; edges connect pairs with return correlation above the study threshold. Line weight scales with correlation strength; node colour marks structural equivalence class.
Regime comparison: dense versus sparse correlation structure
Side-by-side networks at the sample maximum HI (left) and a representative low-homogeneity month (right), illustrating how market linkage tightens and loosens through time.
Peak homogeneity month
Low-homogeneity reference month
Homogeneity index time series
Monthly composite index with reference bands (high > 0.105, low < 0.09). Purple trace = HI; grey trace = normalized SPY benchmark.
Index component decomposition
Four weighted drivers of the homogeneity index: linkage rate (40%), largest-cluster share (30%), network density (20%), and concentration (10%).
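One way to read these weights is as a linear combination of the four components. The sketch below assumes each component is already scaled to the 0-1 range and that the index is a plain weighted sum; the source states only the weights, so the function name, dictionary keys, and example component values are hypothetical.

```python
from typing import Mapping

# Weights quoted in the text; keys are illustrative names.
WEIGHTS = {
    "linkage_rate": 0.40,
    "largest_cluster_share": 0.30,
    "density": 0.20,
    "concentration": 0.10,
}


def homogeneity_index(components: Mapping[str, float]) -> float:
    """Weighted sum of the four components (assumed linear aggregation)."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(weight * components[name] for name, weight in WEIGHTS.items())


# Hypothetical component values for a single month, not taken from the study.
example = {
    "linkage_rate": 0.30,
    "largest_cluster_share": 0.20,
    "density": 0.05,
    "concentration": 0.10,
}
hi = homogeneity_index(example)
print(round(hi, 4))
```

With these made-up inputs the four weighted terms are 0.12, 0.06, 0.01, and 0.01, which sum to 0.20.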
Sample-average component contribution (% scale)
Network structural metrics over time
Count of distinct structural classes (bars) and size of the dominant class (line). Fewer, larger classes indicate tighter market-wide co-movement.
HI distribution
Empirical frequency of monthly homogeneity index values across the study sample.
Regime frequency
Proportion of monthly observations classified as high, normal, or low homogeneity.
SPY monthly return by homogeneity regime
Average contemporaneous SPY return when the index signals high, normal, or low market homogeneity.
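A conditional average of this kind could be computed as in the sketch below. The function name and the sample observations are illustrative assumptions; the returns shown are made-up values, not study results.

```python
from collections import defaultdict
from statistics import mean


def mean_return_by_regime(observations):
    """Group (regime, monthly_return) pairs and return {regime: (mean, count)}."""
    buckets = defaultdict(list)
    for regime, monthly_return in observations:
        buckets[regime].append(monthly_return)
    return {
        regime: (mean(values), len(values))
        for regime, values in buckets.items()
    }


# Hypothetical observations for illustration only.
sample = [
    ("high", 0.020),
    ("high", -0.010),
    ("normal", 0.000),
    ("low", 0.030),
    ("low", 0.010),
]
summary = mean_return_by_regime(sample)
for regime, (avg, n) in summary.items():
    print(f"{regime}: mean={avg:+.2%}, n={n}")
```

Reporting the count next to each mean keeps the small-sample caveat visible alongside the averages.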
Homogeneity index vs forward SPY return
Each point is a monthly observation: horizontal axis = HI, vertical axis = next-month SPY return. Colour encodes regime classification.
Average forward SPY returns: high vs low homogeneity
Mean SPY return following months classified as high- or low-homogeneity (20- and 60-trading-day horizons).
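A forward return over a fixed number of trading days can be sketched as a simple price ratio. The helper below is an assumption about how such a figure might be computed (simple rather than log returns, indexed by trading-day position); the price series is synthetic.

```python
from typing import Optional, Sequence


def forward_return(prices: Sequence[float], start: int, horizon: int) -> Optional[float]:
    """Simple return from prices[start] to prices[start + horizon].

    Returns None when the horizon runs past the end of the series,
    which happens for the most recent observations in a sample.
    """
    end = start + horizon
    if start < 0 or end >= len(prices):
        return None
    return prices[end] / prices[start] - 1.0


# Synthetic daily closes: 100, 101, ..., 169 (not market data).
prices = [100.0 + i for i in range(70)]
print(forward_return(prices, 0, 20))   # 20-trading-day horizon
print(forward_return(prices, 0, 60))   # 60-trading-day horizon
print(forward_return(prices, 10, 60))  # horizon exceeds the data
```

Because consecutive month-end start dates are roughly 21 trading days apart, 60-day windows from adjacent months overlap, which is the overlapping-window caveat listed among the limitations.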
Annual HI summary
Mean, minimum, and maximum homogeneity index by calendar year.
Pairwise correlation matrix (2026-06-30)
Return correlations among the most connected names in the latest estimation window. Strong positive or negative co-movement defines network links.
| | DLR | IP | FITB | DUK | FANG | HD | KKR | INVH | APA | CEG | DOW | XEL | AMAT | KMI | HST | CPT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DLR | 1 | 0.6 | 0.8 | 0.3 | -0.5 | 0.4 | 0.6 | 0.3 | -0.5 | 0.6 | -0.3 | 0.4 | 0.3 | -0.0 | 0.4 | 0.3 |
| IP | 0.6 | 1 | 0.5 | -0.0 | -0.4 | 0.6 | 0.5 | 0.3 | -0.6 | 0.5 | -0.3 | -0.1 | 0.2 | -0.6 | 0.5 | 0.2 |
| FITB | 0.8 | 0.5 | 1 | 0.4 | -0.6 | 0.5 | 0.6 | 0.4 | -0.3 | 0.6 | -0.4 | 0.4 | 0.1 | 0.0 | 0.5 | 0.3 |
| DUK | 0.3 | -0.0 | 0.4 | 1 | -0.3 | 0.4 | -0.1 | 0.5 | -0.2 | 0.2 | -0.2 | 0.9 | -0.4 | 0.4 | 0.1 | 0.5 |
| FANG | -0.5 | -0.4 | -0.6 | -0.3 | 1 | -0.3 | -0.6 | -0.4 | 0.8 | -0.3 | 0.8 | -0.3 | 0.3 | 0.4 | -0.4 | -0.0 |
| HD | 0.4 | 0.6 | 0.5 | 0.4 | -0.3 | 1 | 0.1 | 0.4 | -0.5 | 0.5 | -0.4 | 0.3 | -0.1 | -0.3 | 0.3 | 0.3 |
| KKR | 0.6 | 0.5 | 0.6 | -0.1 | -0.6 | 0.1 | 1 | 0.1 | -0.4 | 0.3 | -0.4 | -0.0 | 0.3 | -0.3 | 0.5 | -0.0 |
| INVH | 0.3 | 0.3 | 0.4 | 0.5 | -0.4 | 0.4 | 0.1 | 1 | -0.3 | 0.2 | -0.2 | 0.4 | -0.3 | -0.0 | 0.3 | 0.7 |
| APA | -0.5 | -0.6 | -0.3 | -0.2 | 0.8 | -0.5 | -0.4 | -0.3 | 1 | -0.4 | 0.7 | -0.3 | -0.1 | 0.5 | -0.5 | -0.0 |
| CEG | 0.6 | 0.5 | 0.6 | 0.2 | -0.3 | 0.5 | 0.3 | 0.2 | -0.4 | 1 | -0.3 | 0.4 | 0.4 | -0.2 | 0.2 | 0.1 |
| DOW | -0.3 | -0.3 | -0.4 | -0.2 | 0.8 | -0.4 | -0.4 | -0.2 | 0.7 | -0.3 | 1 | -0.3 | 0.1 | 0.5 | -0.5 | 0.1 |
| XEL | 0.4 | -0.1 | 0.4 | 0.9 | -0.3 | 0.3 | -0.0 | 0.4 | -0.3 | 0.4 | -0.3 | 1 | -0.1 | 0.3 | -0.0 | 0.3 |
| AMAT | 0.3 | 0.2 | 0.1 | -0.4 | 0.3 | -0.1 | 0.3 | -0.3 | -0.1 | 0.4 | 0.1 | -0.1 | 1 | -0.1 | 0.3 | -0.1 |
| KMI | -0.0 | -0.6 | 0.0 | 0.4 | 0.4 | -0.3 | -0.3 | -0.0 | 0.5 | -0.2 | 0.5 | 0.3 | -0.1 | 1 | -0.4 | 0.1 |
| HST | 0.4 | 0.5 | 0.5 | 0.1 | -0.4 | 0.3 | 0.5 | 0.3 | -0.5 | 0.2 | -0.5 | -0.0 | 0.3 | -0.4 | 1 | 0.4 |
| CPT | 0.3 | 0.2 | 0.3 | 0.5 | -0.0 | 0.3 | -0.0 | 0.7 | -0.0 | 0.1 | 0.1 | 0.3 | -0.1 | 0.1 | 0.4 | 1 |
Structural class size distribution
Ranked sizes of correlation-network equivalence classes as of 2026-06-30.
Correlation-network structure (2026-06-30)
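Stocks grouped by symmetric positions in the thresholded correlation network. The table lists 12 of the 28 classes, covering 33 of the 49 stocks; the remaining 16 classes are therefore single-stock classes.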
| Class | Size | Members |
|---|---|---|
| 1 | 15 | GOOGL, TTD, CMCSA, WMT, TGT, CPAY, NTRS, UNH, BSX, CAH, JBHT, CHRW, MMM, AAPL, ANET |
| 2 | 3 | FANG, MPC, DOW |
| 3 | 2 | TKO, AXON |
| 4 | 2 | META, URI |
| 5 | 2 | KMI, XEL |
| 6 | 2 | KKR, CEG |
| 7 | 2 | AMAT, KLAC |
| 8 | 1 | BBY |
| 9 | 1 | HD |
| 10 | 1 | MAR |
| 11 | 1 | TPR |
| 12 | 1 | CAG |
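The text does not say how the equivalence classes are computed. As one possible reading, the sketch below uses the standard notion of structural equivalence, where two nodes belong to the same class when they have identical ties to every other node. The function name and the toy graph are hypothetical, and the study's actual procedure may differ.

```python
def structural_classes(neighbours):
    """Group nodes whose ties to all other nodes are identical.

    neighbours: dict mapping each node to the set of nodes it is linked to
    (undirected graph, no self-loops). Two nodes i and j are treated as
    equivalent when neighbours[i] - {j} == neighbours[j] - {i}.
    """
    classes = []
    for node in neighbours:
        for cls in classes:
            rep = cls[0]  # this relation is transitive, so one representative suffices
            if neighbours[node] - {rep} == neighbours[rep] - {node}:
                cls.append(node)
                break
        else:
            classes.append([node])
    return sorted(classes, key=len, reverse=True)


# Toy graph: a triangle A-B-C, two isolated nodes D and E, and a linked pair F-G.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": set(),
    "E": set(),
    "F": {"G"},
    "G": {"F"},
}
for rank, members in enumerate(structural_classes(toy), start=1):
    print(rank, len(members), ", ".join(members))
```

Under this definition all isolated stocks fall into a single class, so a large first class can coexist with a sparse network.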
High-homogeneity episodes
Months with HI above 0.105 and subsequent SPY returns over 20 and 60 trading days. The table shows 20 of the 43 high-homogeneity months in the sample.
| Date | HI | +20d | +60d |
|---|---|---|---|
| 2019-02-28 | 0.1350 | 1.20% | 1.90% |
| 2019-04-30 | 0.2180 | -5.40% | 2.50% |
| 2019-07-31 | 0.1630 | -2.90% | 1.50% |
| 2019-08-31 | 0.1530 | 1.30% | 8.50% |
| 2019-11-30 | 0.2080 | 3.80% | -4.50% |
| 2020-01-31 | 0.1300 | -3.90% | -10.70% |
| 2020-02-29 | 0.1420 | -14.90% | -1.20% |
| 2020-03-31 | 0.2710 | 13.80% | 19.80% |
| 2020-04-30 | 0.1390 | 4.80% | 11.80% |
| 2020-06-30 | 0.1580 | 5.40% | 5.30% |
| 2020-08-31 | 0.1070 | -4.50% | 4.40% |
| 2020-10-31 | 0.1070 | 10.80% | 12.60% |
| 2020-12-31 | 0.1230 | 0.60% | 5.90% |
| 2021-07-31 | 0.1140 | 3.30% | 4.50% |
| 2022-05-31 | 0.1170 | -7.50% | 2.00% |
| 2022-06-30 | 0.1130 | 9.20% | -3.00% |
| 2022-08-31 | 0.1220 | -7.80% | 2.20% |
| 2022-09-30 | 0.1500 | 8.90% | 7.30% |
| 2022-10-31 | 0.1240 | 2.30% | 5.50% |
| 2022-11-30 | 0.1220 | -5.50% | -2.40% |
Low-homogeneity episodes
Months with HI below 0.09 and subsequent SPY returns over 20 and 60 trading days. The table shows 20 of the 32 low-homogeneity months in the sample.
| Date | HI | +20d | +60d |
|---|---|---|---|
| 2019-01-31 | 0.0880 | 3.90% | 9.30% |
| 2019-03-31 | 0.0760 | 2.90% | 2.10% |
| 2019-05-31 | 0.0700 | 7.00% | 5.10% |
| 2019-10-31 | 0.0600 | 3.60% | 8.20% |
| 2020-07-31 | 0.0710 | 7.40% | 4.40% |
| 2020-11-30 | 0.0530 | 3.00% | 5.50% |
| 2021-01-31 | 0.0640 | 2.70% | 11.30% |
| 2021-02-28 | 0.0870 | 1.90% | 7.70% |
| 2021-03-31 | 0.0610 | 6.00% | 8.00% |
| 2021-05-31 | 0.0530 | 2.20% | 7.30% |
| 2021-06-30 | 0.0610 | 2.90% | 4.00% |
| 2021-10-31 | 0.0440 | -1.00% | -5.90% |
| 2021-11-30 | 0.0740 | 5.20% | -3.60% |
| 2021-12-31 | 0.0770 | -5.30% | -2.50% |
| 2022-01-31 | 0.0800 | -4.40% | -7.00% |
| 2022-02-28 | 0.0550 | 4.70% | -9.50% |
| 2022-03-31 | 0.0750 | -8.80% | -15.40% |
| 2022-04-30 | 0.0860 | -0.40% | -1.60% |
| 2022-07-31 | 0.0840 | -2.00% | -5.90% |
| 2023-01-31 | 0.0750 | -2.90% | 1.80% |