US Multi-Factor Risk & Return Model
What this study asks
When markets rise or fall as a block, individual stocks can still diverge on valuation, financial strength, and recent price behaviour. This page documents how those differences behave in a fixed US large-cap panel scored on three independent themes.
The analysis is descriptive and comparative: we measure whether each theme lines up with the next month of returns, whether squared or interaction terms add real explanatory power beyond a linear model, and how much of an equal-weight basket return can be traced to market beta, style premia, or an unexplained residual.
All inputs are public market data. Results refresh when the underlying price and fundamental snapshot is updated.
Universe and data
The investable set contains six stocks from each GICS sector in the S&P 500, selected with a fixed random seed so the panel stays stable across rebuilds. That yields sixty-six names (six in each of the eleven sectors) with balanced sector representation rather than cap-weight concentration.
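The selection rule can be sketched as below. This is a minimal illustration, not the study's code: the function name `sample_universe`, the seed value, and the placeholder constituent list are all assumptions, since the text states only that six names per sector are drawn with a fixed seed.

```python
import random
from collections import defaultdict

def sample_universe(constituents, per_sector=6, seed=42):
    """Draw `per_sector` tickers from each sector with a fixed seed.

    `constituents` is an iterable of (ticker, sector) pairs. The seed value
    is an assumption; any fixed value keeps the panel stable across rebuilds.
    """
    by_sector = defaultdict(list)
    for ticker, sector in constituents:
        by_sector[sector].append(ticker)

    rng = random.Random(seed)
    universe = {}
    for sector in sorted(by_sector):         # fixed sector order, so draws are reproducible
        names = sorted(by_sector[sector])    # sorted, so input order does not matter
        universe[sector] = rng.sample(names, per_sector)
    return universe

# Toy input: 11 placeholder sectors with 10 placeholder tickers each.
toy = [(f"S{s:02d}_T{t}", f"Sector{s:02d}") for s in range(11) for t in range(10)]
panel = sample_universe(toy)
n_names = sum(len(v) for v in panel.values())  # 11 sectors x 6 names = 66
```

Sorting both the sectors and the tickers before sampling means the same seed reproduces the same panel even if the constituent list arrives in a different order.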
Daily prices come from Yahoo Finance. Fundamental inputs (profitability, leverage, earnings, cash flow, and book proxies) come from the latest available issuer snapshot, applied only after a three-month reporting delay so that scores on a given date never use information that would not yet have been public.
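A reporting delay of this kind can be sketched as follows. The sketch assumes a history of dated snapshots and uses hypothetical helper names (`add_months`, `usable_snapshot`); the source describes only a single latest snapshot with a three-month lag, so the multi-snapshot form is an illustration.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Shift a date forward by whole months, clamping to the month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def usable_snapshot(snapshots, as_of, lag_months=3):
    """Return the latest snapshot whose period end is at least `lag_months` old.

    `snapshots` is a list of (period_end, fundamentals_dict) pairs.
    Returns None when no snapshot has cleared the delay yet.
    """
    eligible = [s for s in snapshots if add_months(s[0], lag_months) <= as_of]
    return max(eligible, key=lambda s: s[0]) if eligible else None

# Hypothetical fundamentals for one issuer.
snaps = [
    (date(2023, 9, 30), {"roe": 0.14}),
    (date(2023, 12, 31), {"roe": 0.16}),
]
early = usable_snapshot(snaps, date(2024, 3, 15))  # December data not yet usable
late = usable_snapshot(snaps, date(2024, 4, 1))    # December data usable from 31 March
```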
Momentum is computed entirely from price history; value and quality lean on the fundamental snapshot and therefore change mainly when prices move or when the snapshot is refreshed.
How the three composites are built
Each stock receives sub-scores for valuation (earnings yield, free-cash-flow yield, book-to-market proxy), quality (return on equity, inverse leverage, cash-flow stability), and momentum (twelve-month trend excluding the most recent month, plus six-month trend).
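The two momentum inputs can be sketched from a daily close series as below. The function name `momentum_inputs` is hypothetical, and two details are assumptions: a month is treated as 21 trading days, and the six-month trend is measured up to the latest close (the text says only that the twelve-month trend skips the most recent month).

```python
def momentum_inputs(prices, days_per_month=21):
    """Return (12-1 momentum, 6-month momentum) from daily closes.

    `prices` is ordered oldest to newest. Returns None when the history
    is shorter than twelve months plus one day.
    """
    need = 12 * days_per_month + 1
    if len(prices) < need:
        return None
    last = prices[-1]
    one_month_ago = prices[-1 - days_per_month]
    six_months_ago = prices[-1 - 6 * days_per_month]
    twelve_months_ago = prices[-1 - 12 * days_per_month]
    mom_12_1 = one_month_ago / twelve_months_ago - 1.0  # excludes the latest month
    mom_6 = last / six_months_ago - 1.0
    return mom_12_1, mom_6
```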
Sub-scores are standardized within each calendar date and sector, so a “high value” reading means cheap relative to same-sector peers, not cheap in absolute terms. The three composites are simple averages of their z-scored ingredients.
Extreme valuation readings are winsorized before scoring so a handful of distressed names does not dominate sector ranks.
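The scoring steps above can be sketched for a single date as follows. The percentile cutoffs (5th and 95th), the choice to winsorize across the whole date rather than within sector, and the function names are assumptions for illustration; the source does not state them.

```python
import numpy as np

def winsorize(x, lower_pct=5.0, upper_pct=95.0):
    """Clip values to the given percentiles (cutoffs are an assumption)."""
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)

def sector_zscore(x, sectors):
    """Z-score one date's values within each sector."""
    x = np.asarray(x, dtype=float)
    sectors = np.asarray(sectors)
    z = np.zeros_like(x)
    for s in np.unique(sectors):
        mask = sectors == s
        sd = x[mask].std()
        # A sector with no dispersion gets a neutral score of zero.
        z[mask] = (x[mask] - x[mask].mean()) / sd if sd > 0 else 0.0
    return z

def composite(sub_scores, sectors):
    """Simple average of the sector-neutral z-scores of each ingredient."""
    return np.mean([sector_zscore(col, sectors) for col in sub_scores], axis=0)

# Toy cross-section: two sectors, three names each, one extreme reading per input.
sectors = ["Tech"] * 3 + ["Energy"] * 3
earnings_yield = [0.02, 0.04, 0.06, 0.08, 0.10, 0.90]
fcf_yield = [0.01, 0.03, 0.05, 0.02, 0.06, 0.50]
value = composite([winsorize(earnings_yield), winsorize(fcf_yield)], sectors)
```

Because every ingredient is centered within its sector, the composite also averages to zero within each sector, which is what makes a reading comparable only to same-sector peers.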
Are higher-order terms worth it?
A linear model assumes each factor’s effect is constant regardless of level. In practice, crowding, regime shifts, or diminishing returns could make cheap stocks respond differently when they are already extreme.
We therefore compare four pooled specifications on the same panel: linear factors only; linear plus squared terms; linear plus pairwise interactions; and the full quadratic specification. Models are judged on adjusted R², AIC, BIC, and a nested F-test of linear versus the full nonlinear form.
The results section states clearly whether nonlinear terms are warranted for this sample. When they are not, the linear model is preferred because extra parameters do not buy enough explanatory power.
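The comparison of nested specifications can be sketched with ordinary least squares on synthetic data. This is an illustration under stated assumptions: the data are simulated, the helper names (`fit_ols`, `nested_f`, `design`) are hypothetical, any sector or other controls used in the study are omitted, and AIC/BIC are written in their Gaussian form up to a constant shared by all models.

```python
import numpy as np

def fit_ols(X, y):
    """OLS with intercept; returns RSS, parameter count, adj. R2, AIC, BIC."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    k = A.shape[1]
    adj_r2 = 1.0 - (rss / tss) * (n - 1) / (n - k)
    aic = n * np.log(rss / n) + 2 * k          # lower is better
    bic = n * np.log(rss / n) + k * np.log(n)  # lower is better
    return {"rss": rss, "k": k, "adj_r2": adj_r2, "aic": aic, "bic": bic}

def nested_f(restricted, full, n):
    """F-statistic for the extra parameters in `full` versus `restricted`."""
    q = full["k"] - restricted["k"]
    return ((restricted["rss"] - full["rss"]) / q) / (full["rss"] / (n - full["k"]))

def design(F, squares=False, interactions=False):
    """Build the regressor matrix for one of the four specifications."""
    cols = [F]
    if squares:
        cols.append(F ** 2)
    if interactions:
        pairs = [(0, 1), (0, 2), (1, 2)]
        cols.append(np.column_stack([F[:, i] * F[:, j] for i, j in pairs]))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
n = 5000
F = rng.standard_normal((n, 3))                     # value, quality, momentum z-scores
y = 0.01 * F[:, 0] + 0.05 * rng.standard_normal(n)  # purely linear toy returns

specs = {
    "linear": design(F),
    "squares": design(F, squares=True),
    "interactions": design(F, interactions=True),
    "full": design(F, squares=True, interactions=True),
}
fits = {name: fit_ols(X, y) for name, X in specs.items()}
f_stat = nested_f(fits["linear"], fits["full"], n)
# A p-value would follow from the F distribution with (q, n - k_full) degrees
# of freedom, for example scipy.stats.f.sf(f_stat, 6, n - 10) if SciPy is available.
```

A large F-statistic corresponds to a small p-value, and lower AIC or BIC indicates the preferred model, so the three criteria can be read side by side.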
Additional diagnostics
Information coefficients measure rank correlation between each composite and subsequent returns across the panel each day, summarized by mean IC, volatility, and hit rate.
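A per-date Spearman IC and its summary statistics can be sketched as below. The function names are hypothetical, ties are broken by position rather than averaged (adequate for continuous scores), and IC IR is taken as mean IC divided by IC volatility, which is the usual definition and an assumption about the study's own.

```python
import numpy as np

def rank(x):
    """Ordinal ranks starting at 0; ties are broken by position."""
    order = np.argsort(x, kind="stable")
    r = np.empty(len(x))
    r[order] = np.arange(len(x))
    return r

def spearman_ic(scores, fwd_returns):
    """Rank correlation between scores and forward returns on one date."""
    return float(np.corrcoef(rank(scores), rank(fwd_returns))[0, 1])

def ic_summary(ics):
    """Mean IC, IC volatility, IC IR, and share of dates with positive IC."""
    ics = np.asarray(ics, dtype=float)
    mean_ic = float(ics.mean())
    ic_vol = float(ics.std(ddof=1))
    return {
        "mean_ic": mean_ic,
        "ic_vol": ic_vol,
        "ic_ir": mean_ic / ic_vol,
        "hit_rate": float((ics > 0).mean()),
    }

# Toy example: scores that order the names exactly as their returns do.
perfect = spearman_ic([1.0, 2.0, 3.0, 4.0], [0.01, 0.02, 0.03, 0.04])
summary = ic_summary([0.10, -0.05, 0.20, 0.15])
```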
Quintile spreads compare average forward returns of the top-scored names versus the bottom-scored names, giving an intuitive sense of economic magnitude separate from regression coefficients.
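A top-minus-bottom spread for one date can be sketched as follows. The function name is hypothetical, and forming quintiles across the whole panel (rather than within sector) is an assumption.

```python
import numpy as np

def quintile_spread(scores, fwd_returns):
    """Mean forward return of the top fifth minus the bottom fifth by score."""
    scores = np.asarray(scores, dtype=float)
    fwd_returns = np.asarray(fwd_returns, dtype=float)
    order = np.argsort(scores, kind="stable")   # lowest score first
    n_leg = max(1, len(scores) // 5)            # names per quintile
    bottom = fwd_returns[order[:n_leg]].mean()
    top = fwd_returns[order[-n_leg:]].mean()
    return float(top - bottom)

# Toy cross-section of ten names whose returns rise with the score.
scores = np.arange(10, dtype=float)
fwd = 0.001 * scores
spread = quintile_spread(scores, fwd)  # top two names minus bottom two names
```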
Factor correlation shows how independently the three themes move in the cross-section; high correlation would mean tilts are hard to separate in practice.
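One way to summarize cross-sectional factor correlation is to compute the correlation matrix on each date and average the results. The sketch below uses simulated scores and a hypothetical function name; whether the study averages per-date matrices or pools all observations is not stated.

```python
import numpy as np

def mean_cross_sectional_corr(panel):
    """Average the per-date factor correlation matrices.

    `panel` has shape (dates, stocks, factors).
    """
    mats = [np.corrcoef(day, rowvar=False) for day in panel]
    return np.mean(mats, axis=0)

rng = np.random.default_rng(1)
panel = rng.standard_normal((50, 66, 3))   # simulated value, quality, momentum scores
corr = mean_cross_sectional_corr(panel)    # 3 x 3 matrix with a unit diagonal
```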
Attribution decomposes an equal-weight portfolio return into market, factor, and residual components through time.
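A decomposition of this kind is commonly written as portfolio return = beta × market return + sum over factors of (exposure × factor premium) + residual. The sketch below implements that form with made-up numbers; the exact formula, the beta estimate, and the exposure definitions used in the study are not given in the text, so everything here is an assumption.

```python
import numpy as np

def attribute(port_ret, mkt_ret, beta, exposures, premia):
    """Split portfolio returns into market, factor, and residual parts.

    port_ret, mkt_ret: arrays of length T.
    exposures, premia: arrays of shape (T, K) for K factors.
    """
    market = beta * np.asarray(mkt_ret, dtype=float)
    factor = np.asarray(exposures, dtype=float) * np.asarray(premia, dtype=float)
    residual = np.asarray(port_ret, dtype=float) - market - factor.sum(axis=1)
    return market, factor, residual

# Two periods, three factors (value, quality, momentum); all numbers are illustrative.
port_ret = [0.010, -0.004]
mkt_ret = [0.008, -0.005]
exposures = [[0.1, 0.0, -0.1], [0.1, 0.0, -0.1]]
premia = [[0.010, 0.002, 0.005], [0.0, 0.0, 0.0]]
market, factor, residual = attribute(port_ret, mkt_ret, 1.0, exposures, premia)
```

By construction the three parts add back to the portfolio return in every period, so the residual absorbs whatever the market and factor legs do not explain.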
How to interpret the output
Because the twenty-one-day forward returns overlap from one day to the next, regression residuals are serially correlated, so t-statistics and p-values are indicative rather than definitive hypothesis tests.
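To see how strong the overlap effect is, consider a simplified case in which daily returns are independent with constant variance (an assumption made only for illustration). Two h-day return windows whose start dates are k days apart share h - k days, so their correlation is (h - k) / h. The helper name `overlap_autocorr` is hypothetical.

```python
def overlap_autocorr(horizon, lag):
    """Correlation between two `horizon`-day return sums whose start dates
    are `lag` days apart, assuming i.i.d. daily returns."""
    shared_days = max(0, horizon - lag)
    return shared_days / horizon

adjacent = overlap_autocorr(21, 1)    # consecutive daily observations: 20/21
disjoint = overlap_autocorr(21, 21)   # non-overlapping windows: 0.0
```

Consecutive observations are therefore far from independent, and standard errors that ignore this are too small. Autocorrelation-robust standard errors or non-overlapping windows are common remedies.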
Sector-level regressions use smaller samples; a strong coefficient in one sector may not replicate out of sample.
This is research documentation for learning and monitoring, not a product recommendation or live strategy.
Interactive results
Below you will find narrative findings, a model comparison table, correlation and IC diagnostics, quintile spreads, factor premium charts, attribution, and sector regression tables. The preferred specification highlighted in the model table is the one used for attribution when nonlinear terms are not warranted.
Results
Summary
US large-cap panel (66 stocks, balanced across 11 sectors)
Key findings
- Higher-order terms are judged not worth the complexity here: the adjusted R² gain versus linear is only 0.15 pp and the nested F-test reports p=1.0000, although AIC is lower for the full specification (ΔAIC=-157.5 relative to linear).
- In the pooled linear model, Value has the largest coefficient magnitude (0.943% per one cross-sectional z-score on 21-day forward returns) and is statistically distinguishable from zero at the 5% level.
- Rank–return alignment (Spearman IC) is strongest for Value (mean IC 0.059, hit rate 69%).
- Quintile sorts show top-scored names tend to outperform bottom-scored names by 1.97% per 21-day rebalance on Value (naive annualization ≈ 23.6%).
- Utilities shows the largest estimated momentum loading by magnitude in sector-specific regressions (-1.30% per z-score, a negative loading), though sector samples are small.
- Attribution noise-adjusted contribution is largest for Quality (information ratio 125.35 on daily factor legs).
- Cumulative unattributed return (alpha) over the study window is negative at -4248.3% when factors and market are netted out.
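The naive annualization quoted for the Value quintile spread in the findings above is a simple multiple of the 21-day figure. The sketch below reproduces it, assuming twelve 21-day periods per year (252 trading days), and computes a compounded alternative for comparison; the compounded figure is not reported in the source.

```python
spread_per_period = 0.0197   # 21-day Value spread from the findings
periods_per_year = 12        # assumption: 252 trading days / 21 days

naive = spread_per_period * periods_per_year                    # 0.2364, i.e. about 23.6%
compounded = (1 + spread_per_period) ** periods_per_year - 1    # geometric alternative
```

Both figures treat the spread as if it could be earned in twelve independent periods, which overlapping daily observations do not guarantee.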
Linear vs higher-order fit
Higher-order terms are judged not worth the complexity here: the adjusted R² gain versus linear is only 0.15 pp and the nested F-test reports p=1.0000, although AIC is lower for the full specification (ΔAIC=-157.5 relative to linear).
| Specification | Params | R² | Adj. R² | AIC | BIC |
|---|---|---|---|---|---|
| Linear (baseline, preferred) | 14 | 0.9% | 0.9% | -175027 | -174893 |
| Linear + squared terms | 17 | 1.0% | 1.0% | -175146 | -174983 |
| Linear + pairwise interactions | 17 | 1.0% | 1.0% | -175082 | -174919 |
| Linear + squares + interactions | 20 | 1.1% | 1.1% | -175185 | -174993 |
Nested F-test (linear vs full nonlinear): F = 28.22, p = 1.0000
Factor correlation (daily cross-section)
| | Value | Quality | Momentum |
|---|---|---|---|
| Value | 1.00 | -0.10 | -0.10 |
| Quality | -0.10 | 1.00 | -0.01 |
| Momentum | -0.10 | -0.01 | 1.00 |
Information coefficient (rank predictive power)
| Factor | Mean IC | IC vol | IC IR | Hit rate |
|---|---|---|---|---|
| Value | 0.059 | 0.118 | 0.51 | 69% |
| Quality | -0.008 | 0.129 | -0.06 | 48% |
| Momentum | 0.021 | 0.178 | 0.12 | 57% |
Quintile spread (top minus bottom)
Average 21-day forward return spread between highest- and lowest-scored names each day.
Factor premiums (weekly)
Cumulative factor contribution
Return attribution (monthly cumulative)
Pooled linear regression
| Factor | Coef | t-stat | p-value |
|---|---|---|---|
| Value | 0.9% | 18.32 | 0.0000 |
| Quality | 0.3% | 3.89 | 0.0000 |
| Momentum | 0.3% | 6.88 | 0.0000 |
Sector regressions
| Sector | N (observations) | R² | Value p-value | Quality p-value | Momentum p-value |
|---|---|---|---|---|---|
| Information Technology | 9666 | 1.7% | 0.000 | 0.008 | 0.000 |
| Energy | 9666 | 0.4% | 0.000 | 0.795 | 0.263 |
| Industrials | 9666 | 1.1% | 0.000 | 0.000 | 0.553 |
| Consumer Discretionary | 9666 | 0.4% | 0.000 | 0.000 | 0.873 |
| Health Care | 9666 | 2.7% | 0.000 | 0.008 | 0.000 |
| Real Estate | 9666 | 0.4% | 0.000 | 0.116 | 0.167 |
| Consumer Staples | 9666 | 1.3% | 0.000 | 0.000 | 0.000 |
| Utilities | 9002 | 2.4% | 0.007 | 0.000 | 0.000 |
| Communication Services | 9666 | 0.6% | 0.006 | 0.000 | 0.000 |
| Financials | 9666 | 0.0% | 0.710 | 0.542 | 0.203 |
| Materials | 9666 | 0.6% | 0.004 | 0.006 | 0.953 |