CAPM vs Fama–French vs Carhart
Abstract
This project reproduces the empirical design of Goncharov (2023), who compared the capital asset pricing model (CAPM), the Fama–French three-factor model (FF3), and the Carhart four-factor model on a GICS-balanced sample of 30 US equities over 2018–2021 — a period spanning trade tensions and the COVID-19 market shock.
For each stock we estimate three time-series regressions on daily log excess returns: CAPM uses market excess return only; FF3 adds SMB and HML; Carhart adds momentum (MOM). Model quality is judged by adjusted R² and residual standard error (RSE), with paired one-tailed t-tests on cross-sectional RSE vectors as in the thesis.
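Consistent with the original findings, multifactor extensions raise average adjusted R² materially relative to CAPM, while momentum contributes only marginally beyond FF3. In the thesis, formal RSE tests typically fail to reject equal precision at conventional significance levels; in the 24-stock sample estimated here, the paired RSE tests do reject it (see the residual standard error tests in the results).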
Methodology
Universe: Thirty US stocks from Goncharov’s Table 1, sampled across GICS sectors with large/small cap pairs where possible. Delisted tickers may be omitted if return history is unavailable.
Returns: Daily log returns r_t = ln(P_t / P_{t-1}); excess return subtracts the Ken French daily risk-free rate.
Factors: Mkt-RF, SMB, HML, and momentum from the Dartmouth data library (same sources as the thesis).
Estimation: OLS with intercept for each model per stock. Adjusted R² penalizes extra factors; RSE uses n - k degrees of freedom, with k = intercept plus factor count.
Inference: Paired one-tailed t-tests on RSE (CAPM vs FF3, FF3 vs Carhart, CAPM vs Carhart) after Shapiro–Wilk checks on the RSE vectors.
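The Returns step can be sketched as follows. This is a minimal illustration, not the project's code: the function name `log_excess_returns` and the toy prices are hypothetical, and the sketch assumes the risk-free rate has already been converted to a per-day decimal and aligned to the return dates (the Ken French files report rates in percent, so a division by 100 is assumed to have happened upstream).

```python
import math

def log_excess_returns(prices, rf_daily):
    """Daily log returns ln(P_t / P_{t-1}) minus the daily risk-free rate.

    prices   : closing prices in date order (length n + 1)
    rf_daily : risk-free rate per day as a decimal (length n), assumed to be
               aligned with the dates of prices[1:]
    """
    if len(rf_daily) != len(prices) - 1:
        raise ValueError("need exactly one risk-free rate per return date")
    log_ret = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return [r - rf for r, rf in zip(log_ret, rf_daily)]

# Toy inputs (hypothetical values, for illustration only)
prices = [100.0, 101.0, 99.5, 100.2]
rf = [0.00005, 0.00005, 0.00005]
excess = log_excess_returns(prices, rf)
```

Log returns add up across days, so the sum of `excess` plus the sum of `rf` equals the log return over the whole toy window.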
Model specifications
CAPM (one factor): Each stock's excess return is regressed on the market excess return only. Tests whether a single beta explains daily variation during 2018–2021.
Fama–French three-factor (FF3): Adds SMB (size) and HML (value). Captures well-documented size and book-to-market effects in US equities.
Carhart four-factor: Adds momentum (MOM) to FF3. Momentum is often significant in US samples but may add little incremental adjusted R² once size and value are already included.
The results section reports cross-sectional means, medians, and dispersion of adjusted R² and RSE for each specification, with estimating equations shown below the tables.
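The three nested specifications can be illustrated with a short sketch on synthetic data. It uses NumPy least squares rather than statsmodels purely to stay self-contained; the helper `fit_ols`, the simulated factor series, and the "true" loadings are assumptions made for illustration and do not come from the thesis or the dashboard.

```python
import numpy as np

def fit_ols(y, X):
    """OLS with intercept; returns coefficients, adjusted R^2, and RSE."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])    # intercept column + factors
    k = design.shape[1]                          # k = intercept plus factor count
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ssr = float(resid @ resid)                   # sum of squared residuals
    sst = float(((y - y.mean()) ** 2).sum())     # total sum of squares
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    rse = (ssr / (n - k)) ** 0.5
    return {"coef": coef, "adj_r2": adj_r2, "rse": rse}

# Synthetic daily data (hypothetical): columns are MKT, SMB, HML, MOM
rng = np.random.default_rng(0)
n = 500
factors = rng.normal(0.0, 0.01, size=(n, 4))
true_betas = np.array([1.0, 0.5, 0.4, 0.0])      # momentum has no true loading
y = factors @ true_betas + rng.normal(0.0, 0.01, size=n)

# Each specification selects a nested subset of the factor columns
specs = {"CAPM": [0], "FF3": [0, 1, 2], "Carhart": [0, 1, 2, 3]}
results = {name: fit_ols(y, factors[:, cols]) for name, cols in specs.items()}

for name, res in results.items():
    print(f"{name}: adj. R2 = {res['adj_r2']:.3f}, RSE = {res['rse']:.4f}")
```

Because the simulated stock loads on SMB and HML but not on MOM, the FF3 fit should improve clearly on CAPM while Carhart changes little, which mirrors the qualitative pattern described above without reproducing any real estimate.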
What to look for in the results
Key insights (auto-generated from the sample) summarize mean R² gains, sector patterns, and RSE tests — read these before the tables.
The sector bar chart shows where FF3 adds the most explanatory power over CAPM; in this sample Real Estate, Energy, and Financials show the largest lifts, while Utilities and Consumer Staples show the smallest.
Per-stock grid sorting by FF3 adjusted R² highlights which names are well explained by factors vs dominated by idiosyncratic noise.
Contrast adjusted R² (variance explained) with RSE tests (residual precision): models can improve fit substantially even when formal RSE comparisons are borderline, as in Goncharov’s thesis.
Goodness-of-fit: adjusted R² and RSE
Adjusted R² measures how much daily excess return variance is explained after penalizing additional factors. Goncharov reports large gains from CAPM to FF3 and only modest further gains from adding momentum — the bar chart on this page reproduces that pattern on the estimated sample.
Residual standard error (RSE) is the regression residual volatility; lower RSE implies tighter fit. Paired one-tailed t-tests on cross-sectional RSE vectors test whether a more complex model significantly reduces residual noise versus a simpler one (CAPM vs FF3, FF3 vs Carhart, CAPM vs Carhart).
When RSE differences are not statistically significant at conventional levels, a simpler model may be preferred on parsimony grounds even if adjusted R² rises slightly with extra factors.
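For reference, the two fit measures can be written out as below. These are the standard textbook definitions, with k counting the intercept plus the factors as in the Methodology; n (number of daily observations), SSR (sum of squared residuals), and SST (total sum of squares) are notation introduced for this sketch.

```latex
R^2 = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}},
\qquad
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k},
\qquad
\mathrm{RSE} = \sqrt{\frac{\mathrm{SSR}}{n - k}}
```

Both adjusted R² and RSE divide by n - k, so an extra factor improves them only if it reduces SSR by more than the lost degree of freedom costs.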
Sample period and economic context (2018–2021)
The thesis window includes the 2018 trade-policy volatility, the sharp Q4 2018 equity drawdown, the 2020 COVID-19 crash and policy-driven rebound, and the 2021 reopening / inflation narrative. Factor models must fit both calm and crisis days — a stringent test for fixed linear betas.
A GICS-balanced panel reduces sector concentration: financials, technology, health care, and energy names enter with large/small pairs where possible so results are not driven by a single industry.
Stocks with incomplete return history are dropped; the dashboard reports how many of the requested thirty stocks were successfully estimated.
Empirical results
Summary cards report mean adjusted R² and percentage improvements (FF3 vs CAPM, Carhart vs FF3). The bar chart visualizes average fit by model; the RSE table documents paired test statistics; the per-stock grid shows individual α, β, and fit metrics.
Compare headline statistics against Goncharov (2023) thesis tables when validating the replication.
Limitations
Public price data are not CRSP; corporate actions and delistings can differ from the thesis sample. Some 2018–2021 tickers are no longer listed.
The thesis used Excel OLS; this implementation uses statsmodels. Small numerical differences are expected.
Time-series fit on 2018–2021 does not prove that factors price the cross-section out of sample; it only measures explanatory power in-sample.
Research and education only — not investment advice.
Empirical results
Key analytical insights
- On average, FF3 explains 39.2% of daily excess return variance versus 29.4% under CAPM (+33.4% relative gain).
- Adding momentum (Carhart) lifts mean adjusted R² only marginally to 39.5% (+0.83% vs FF3), consistent with Goncharov (2023): size and value factors capture most of the incremental fit in 2018–2021.
- 24 of 24 stocks (100%) have higher adjusted R² under FF3 than CAPM; momentum improves FF3 on only 12 names.
- Largest CAPM→FF3 gain: INN (+27.2 pp adjusted R²). Smallest gain: BGS (+0.1 pp).
- Real Estate shows the largest average FF3 uplift over CAPM (+23.4 pp adjusted R²), suggesting sector composition matters for multifactor fit.
- Mean FF3 market beta is 0.95 (cross-section of 24 stocks); values above 1.0 indicate above-market sensitivity in the COVID-era window.
- Under FF3, no stock has |t| ≥ 1.96 on the daily alpha intercept; all 24 are consistent with pricing by MKT, SMB, and HML alone in-sample.
- Mean residual standard error falls from 2.40% (CAPM) to 2.20% (FF3) per day — about 8.3% lower residual volatility despite more parameters.
- Paired RSE test (CAPM vs FF3): t = 5.17, p < 0.001 (one-tailed). FF3 residuals are statistically tighter than CAPM in this sample, even though Goncharov’s thesis often reports insignificant RSE differences at 5%.
- 8 stocks exceed 50% FF3 adjusted R² (returns largely explained by the systematic factors); low-fit names are often smaller caps or stocks driven mainly by stock-specific news.
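The insights above mix percentage-point (pp) differences with relative percentage gains. A small sketch of the arithmetic, using only the rounded headline values quoted in the list; the helper name `relative_gain_pct` is hypothetical.

```python
def relative_gain_pct(base, new):
    """Relative change from base to new, in percent of base."""
    return (new - base) / base * 100.0

# Rounded headline values quoted in the key insights
capm_r2, ff3_r2, carhart_r2 = 29.4, 39.2, 39.5   # mean adjusted R^2, in %
capm_rse, ff3_rse = 2.40, 2.20                   # mean RSE, in % per day

ff3_vs_capm_pp = ff3_r2 - capm_r2                            # percentage points
ff3_vs_capm_rel = relative_gain_pct(capm_r2, ff3_r2)         # relative gain, %
carhart_vs_ff3_rel = relative_gain_pct(ff3_r2, carhart_r2)   # relative gain, %
rse_reduction = -relative_gain_pct(capm_rse, ff3_rse)        # relative drop, %

print(f"FF3 vs CAPM: +{ff3_vs_capm_pp:.1f} pp, +{ff3_vs_capm_rel:.1f}% relative")
print(f"Carhart vs FF3: +{carhart_vs_ff3_rel:.2f}% relative")
print(f"RSE reduction CAPM -> FF3: {rse_reduction:.1f}%")
```

With the rounded inputs this gives about 9.8 pp, 33.3%, 0.77%, and 8.3%. The small gaps to the reported +33.4% and +0.83% are what one would expect if the dashboard computes these ratios from unrounded means, which is an assumption here.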
Sample
24 / 30 stocks
2018-01-01 → 2021-12-31
Mean adj. R² (CAPM)
29.4%
Mean adj. R² (FF3)
39.2%
+33.4% vs CAPM
Mean adj. R² (Carhart)
39.5%
+0.83% vs FF3
Average explanatory power
Sector average adjusted R²
Mean in-sample fit by GICS sector (sectors with ≥2 stocks). Highlights where multifactor models add the most over CAPM.
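The sector aggregation can be sketched from the CAPM and FF3 adjusted R² columns of the per-stock table on this page. The function name `sector_uplift` and the `min_names` parameter are hypothetical, and the inputs are the rounded table values, so results may differ slightly from a chart built on unrounded estimates.

```python
from collections import defaultdict
from statistics import mean

# (ticker, sector, CAPM adj. R^2 %, FF3 adj. R^2 %) copied from the per-stock table
ROWS = [
    ("HON", "Industrials", 61.5, 71.8), ("SYF", "Financials", 49.6, 69.0),
    ("AIG", "Financials", 45.7, 68.4), ("MPC", "Energy", 42.1, 56.7),
    ("HST", "Real Estate", 34.7, 54.3), ("IBM", "Information Technology", 48.4, 52.6),
    ("PPG", "Materials", 44.7, 52.2), ("INN", "Real Estate", 24.6, 51.8),
    ("AROC", "Energy", 24.9, 47.7), ("DRI", "Consumer Discretionary", 37.3, 46.3),
    ("LUV", "Industrials", 31.6, 45.3), ("JNJ", "Health Care", 36.9, 42.6),
    ("TMUS", "Communication Services", 39.2, 39.7),
    ("LEN", "Consumer Discretionary", 33.1, 35.9),
    ("BZH", "Consumer Discretionary", 29.2, 34.3), ("MSEX", "Utilities", 24.0, 24.8),
    ("ECPG", "Financials", 14.5, 23.9), ("TWIN", "Industrials", 18.4, 23.2),
    ("ED", "Utilities", 16.0, 23.2), ("LXU", "Materials", 11.7, 22.9),
    ("IRWD", "Health Care", 15.8, 22.6), ("LOCO", "Consumer Discretionary", 11.7, 19.6),
    ("CPB", "Consumer Staples", 5.6, 7.9), ("BGS", "Consumer Staples", 4.0, 4.1),
]

def sector_uplift(rows, min_names=2):
    """Mean FF3-minus-CAPM adjusted R^2 (in pp) for sectors with enough stocks."""
    by_sector = defaultdict(list)
    for _ticker, sector, capm, ff3 in rows:
        by_sector[sector].append(ff3 - capm)
    return {s: mean(gains) for s, gains in by_sector.items() if len(gains) >= min_names}

uplift = sector_uplift(ROWS)
top_sector = max(uplift, key=uplift.get)
print(top_sector, round(uplift[top_sector], 1))
```

Sectors with a single estimated stock (here Information Technology and Communication Services) are dropped by the `min_names` filter, matching the chart's stated rule.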
Residual standard error tests
One-tailed paired t-tests (H₀: the simpler model's RSE is no higher than the richer model's). In each comparison the first-named model is the simpler one; CAR denotes Carhart. The thesis found no significant improvement at 5%.
| Comparison | Mean RSE (simpler model) | Mean RSE (richer model) | t | p (one-tailed) | Result |
|---|---|---|---|---|---|
| CAPM vs FF3 | 0.0240 | 0.0220 | 5.174 | <0.001 | FF3 significantly lower RSE than CAPM |
| FF3 vs CAR | 0.0220 | 0.0220 | 2.821 | 0.005 | CAR significantly lower RSE than FF3 |
| CAPM vs CAR | 0.0240 | 0.0220 | 5.251 | <0.001 | CAR significantly lower RSE than CAPM |
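A paired one-tailed comparison of this kind can be sketched with the standard library alone. The function `paired_t` is a hypothetical helper, and the inputs are the CAPM and FF3 RSE columns of the per-stock table on this page, which are rounded to four decimals, so the statistic will not match the reported 5.174 exactly. Instead of computing a p-value, the sketch compares t with the tabulated one-tailed 5% critical value for 23 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t(simple, rich):
    """t statistic and degrees of freedom for the paired differences simple - rich."""
    if len(simple) != len(rich):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(simple, rich)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))   # stdev uses n - 1
    return t, n - 1

# Rounded per-stock RSE values, in table order (HON ... BGS)
CAPM_RSE = [0.0110, 0.0210, 0.0200, 0.0240, 0.0210, 0.0130, 0.0140, 0.0310,
            0.0340, 0.0240, 0.0210, 0.0110, 0.0140, 0.0230, 0.0320, 0.0190,
            0.0340, 0.0360, 0.0140, 0.0510, 0.0270, 0.0280, 0.0170, 0.0280]
FF3_RSE = [0.0090, 0.0160, 0.0150, 0.0210, 0.0170, 0.0120, 0.0130, 0.0250,
           0.0290, 0.0220, 0.0190, 0.0100, 0.0140, 0.0220, 0.0310, 0.0190,
           0.0320, 0.0350, 0.0140, 0.0470, 0.0260, 0.0260, 0.0170, 0.0280]

t_stat, df = paired_t(CAPM_RSE, FF3_RSE)

# One-tailed 5% critical value of Student's t with 23 degrees of freedom
T_CRIT_5PCT_DF23 = 1.714
reject_h0 = t_stat > T_CRIT_5PCT_DF23   # H1: CAPM RSE exceeds FF3 RSE
print(f"t = {t_stat:.2f}, df = {df}, reject H0 at 5%: {reject_h0}")
```

Where SciPy is available, `scipy.stats.ttest_rel` with `alternative="greater"` returns the same statistic together with a one-tailed p-value.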
Per-stock regressions
| Ticker | Sector | CAPM adj. R² | FF3 adj. R² | Carhart adj. R² | CAPM RSE | FF3 RSE | Carhart RSE |
|---|---|---|---|---|---|---|---|
| HON | Industrials | 61.5% | 71.8% | 72.1% | 0.0110 | 0.0090 | 0.0090 |
| SYF | Financials | 49.6% | 69.0% | 69.0% | 0.0210 | 0.0160 | 0.0160 |
| AIG | Financials | 45.7% | 68.4% | 68.4% | 0.0200 | 0.0150 | 0.0150 |
| MPC | Energy | 42.1% | 56.7% | 57.3% | 0.0240 | 0.0210 | 0.0210 |
| HST | Real Estate | 34.7% | 54.3% | 56.4% | 0.0210 | 0.0170 | 0.0170 |
| IBM | Information Technology | 48.4% | 52.6% | 53.4% | 0.0130 | 0.0120 | 0.0120 |
| PPG | Materials | 44.7% | 52.2% | 52.3% | 0.0140 | 0.0130 | 0.0130 |
| INN | Real Estate | 24.6% | 51.8% | 52.6% | 0.0310 | 0.0250 | 0.0250 |
| AROC | Energy | 24.9% | 47.7% | 47.6% | 0.0340 | 0.0290 | 0.0290 |
| DRI | Consumer Discretionary | 37.3% | 46.3% | 46.3% | 0.0240 | 0.0220 | 0.0220 |
| LUV | Industrials | 31.6% | 45.3% | 46.5% | 0.0210 | 0.0190 | 0.0180 |
| JNJ | Health Care | 36.9% | 42.6% | 42.8% | 0.0110 | 0.0100 | 0.0100 |
| TMUS | Communication Services | 39.2% | 39.7% | 39.7% | 0.0140 | 0.0140 | 0.0140 |
| LEN | Consumer Discretionary | 33.1% | 35.9% | 36.6% | 0.0230 | 0.0220 | 0.0220 |
| BZH | Consumer Discretionary | 29.2% | 34.3% | 35.1% | 0.0320 | 0.0310 | 0.0310 |
| MSEX | Utilities | 24.0% | 24.8% | 24.8% | 0.0190 | 0.0190 | 0.0190 |
| ECPG | Financials | 14.5% | 23.9% | 23.8% | 0.0340 | 0.0320 | 0.0320 |
| TWIN | Industrials | 18.4% | 23.2% | 23.2% | 0.0360 | 0.0350 | 0.0350 |
| ED | Utilities | 16.0% | 23.2% | 23.2% | 0.0140 | 0.0140 | 0.0140 |
| LXU | Materials | 11.7% | 22.9% | 22.9% | 0.0510 | 0.0470 | 0.0470 |
| IRWD | Health Care | 15.8% | 22.6% | 22.7% | 0.0270 | 0.0260 | 0.0260 |
| LOCO | Consumer Discretionary | 11.7% | 19.6% | 19.6% | 0.0280 | 0.0260 | 0.0260 |
| CPB | Consumer Staples | 5.6% | 7.9% | 8.1% | 0.0170 | 0.0170 | 0.0170 |
| BGS | Consumer Staples | 4.0% | 4.1% | 4.1% | 0.0280 | 0.0280 | 0.0280 |
Missing prices: HA, AMBC, WOW, CTXS, EBIX, LLNW
Regression equations
CAPM: R_i,t - R_f,t = α_i + β_i,MKT MKT_t + ε_i,t, where MKT_t = R_m,t - R_f,t (the Mkt-RF factor)
FF3: R_i,t - R_f,t = α_i + β_i,MKT MKT_t + β_i,SMB SMB_t + β_i,HML HML_t + ε_i,t
CAR: R_i,t - R_f,t = α_i + β_i,MKT MKT_t + β_i,SMB SMB_t + β_i,HML HML_t + β_i,MOM MOM_t + ε_i,t