CAPM vs Fama–French vs Carhart

Abstract

This project reproduces the empirical design of Goncharov (2023), who compared the capital asset pricing model (CAPM), the Fama–French three-factor model (FF3), and the Carhart four-factor model on a GICS-balanced sample of 30 US equities over 2018–2021 — a period spanning trade tensions and the COVID-19 market shock.

For each stock we estimate three time-series regressions on daily log excess returns: CAPM uses market excess return only; FF3 adds SMB and HML; Carhart adds momentum (MOM). Model quality is judged by adjusted R² and residual standard error (RSE), with paired one-tailed t-tests on cross-sectional RSE vectors as in the thesis.

Consistent with the original findings, multifactor extensions raise average adjusted R² materially relative to CAPM, while momentum contributes only marginally beyond FF3. The formal RSE tests differ, however: the thesis often fails to reject equal precision at the 5% level, whereas all three paired comparisons in this replication reject it (one-tailed p ≤ 0.005).

Mean adjusted R² rises from 29.4% (CAPM) to 39.2% (FF3), a 33.4% relative improvement.
Carhart four-factor mean fit is 39.5% (+0.83% relative to FF3, roughly 0.3 pp); momentum adds little on average.
FF3 beats CAPM on adjusted R² for 24 of 24 stocks.
Paired RSE test favors FF3 (t = 5.17, one-tailed p < 0.001).
On average, FF3 explains 39.2% of daily excess return variance versus 29.4% under CAPM (+33.4% relative gain).
Adding momentum (Carhart) lifts mean adjusted R² only marginally, to 39.5% (a 0.83% relative gain over FF3), consistent with Goncharov (2023): size and value factors capture most of the incremental fit in 2018–2021.
24 of 24 stocks (100%) have higher adjusted R² under FF3 than CAPM; momentum improves FF3 on only 12 names.
Largest CAPM→FF3 gain: INN (+27.2 pp adjusted R²). Smallest gain: BGS (+0.1 pp).
Real Estate shows the largest average FF3 uplift over CAPM (+23.4 pp adjusted R²), suggesting sector composition matters for multifactor fit.
Mean FF3 market beta is 0.95 (cross-section of 24 stocks); values above 1.0 indicate above-market sensitivity in the COVID-era window.
Under FF3, no stock has |t| ≥ 1.96 on the daily alpha intercept; all 24 are consistent with pricing by MKT, SMB, and HML alone in-sample.
Mean residual standard error falls from 2.40% (CAPM) to 2.20% (FF3) per day — about 8.3% lower residual volatility despite more parameters.
Paired RSE test (CAPM vs FF3): t = 5.17, one-tailed p < 0.001. FF3 residuals are statistically tighter than CAPM residuals in this sample, even though Goncharov’s thesis often reports RSE differences that are insignificant at the 5% level.
8 stocks exceed 50% FF3 adjusted R², meaning the three factors explain more than half of their daily excess-return variance; low-fit names are often smaller caps or single-factor-dominated stories.

Methodology

Universe: Thirty US stocks from Goncharov’s Table 1, sampled across GICS sectors with large/small cap pairs where possible. Delisted tickers may be omitted if return history is unavailable.

Returns: Daily log returns $R_{t} = \ln(P_{t} / P_{t-1})$; excess return subtracts the Ken French daily risk-free rate.
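The return construction can be sketched in a few lines. This is a minimal illustration and not the project's actual code: the function name `log_excess_returns`, the price series, and the risk-free values are all hypothetical. It assumes the risk-free rate has already been converted to decimal form; the French data library files are generally quoted in percent, so a division by 100 would be needed first (worth confirming against the file header).

```python
import numpy as np

def log_excess_returns(prices, rf_daily):
    """Daily log returns minus the daily risk-free rate.

    prices   : 1-D sequence of adjusted closing prices, length T + 1
    rf_daily : 1-D sequence of daily risk-free rates in decimal form, length T
    """
    prices = np.asarray(prices, dtype=float)
    rf_daily = np.asarray(rf_daily, dtype=float)
    # R_t = ln(P_t / P_{t-1}); the first price has no predecessor, so T + 1 prices give T returns.
    log_ret = np.log(prices[1:] / prices[:-1])
    return log_ret - rf_daily

# Hypothetical inputs, for illustration only.
prices = [100.0, 101.0, 99.5, 100.2]
rf = [0.00005, 0.00005, 0.00005]
excess = log_excess_returns(prices, rf)
```

One convenient property of log returns is that they sum across days: adding the three log returns here recovers the log of the total price ratio over the window.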

Factors: Mkt-RF, SMB, HML, and momentum from the Dartmouth data library (same sources as the thesis).

Estimation: OLS with intercept for each model per stock. Adjusted R² penalizes extra factors; RSE uses $T - k$ degrees of freedom with $k$ = intercept plus factor count.
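For reference, the two fit measures can be written out using the $T$ and $k$ defined above. These are the standard textbook forms, and it is an assumption that the implementation follows them exactly; $\hat{\varepsilon}_{i,t}$ (the OLS residual for stock $i$ on day $t$) and $R_i^{2}$ (the unadjusted coefficient of determination) are notation introduced here.

$$
\mathrm{RSE}_i = \sqrt{\frac{\sum_{t=1}^{T} \hat{\varepsilon}_{i,t}^{2}}{T - k}},
\qquad
\bar{R}_i^{2} = 1 - \left(1 - R_i^{2}\right)\frac{T - 1}{T - k}.
$$

Both measures divide by $T - k$, so an added factor has to reduce the residual sum of squares by enough to offset the lost degree of freedom before either measure improves.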

Inference: Paired one-tailed t-tests on RSE (CAPM vs FF3, FF3 vs Carhart, CAPM vs Carhart) after Shapiro–Wilk checks on the RSE vectors.
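The inference step can be sketched with SciPy for the CAPM vs FF3 comparison. This is an illustration under stated assumptions, not the project's code: the RSE inputs are the rounded values printed in the per-stock table of the goodness-of-fit section, so the statistic will be close to, but not identical with, the reported t = 5.174, which was presumably computed from unrounded values. The `alternative="greater"` argument requires SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

# Per-stock RSE as printed (rounded) in the results table, in table order.
capm_rse = np.array([
    0.011, 0.021, 0.020, 0.024, 0.021, 0.013, 0.014, 0.031,
    0.034, 0.024, 0.021, 0.011, 0.014, 0.023, 0.032, 0.019,
    0.034, 0.036, 0.014, 0.051, 0.027, 0.028, 0.017, 0.028,
])
ff3_rse = np.array([
    0.009, 0.016, 0.015, 0.021, 0.017, 0.012, 0.013, 0.025,
    0.029, 0.022, 0.019, 0.010, 0.014, 0.022, 0.031, 0.019,
    0.032, 0.035, 0.014, 0.047, 0.026, 0.026, 0.017, 0.028,
])

# Shapiro-Wilk normality checks on each RSE vector, as described in the text.
sw_capm = stats.shapiro(capm_rse)
sw_ff3 = stats.shapiro(ff3_rse)

# Paired one-tailed test. H0: mean(CAPM RSE - FF3 RSE) = 0; H1: the mean difference is > 0.
result = stats.ttest_rel(capm_rse, ff3_rse, alternative="greater")
t_stat, p_one_tailed = result.statistic, result.pvalue
```

The test is paired because both RSE values come from the same stock, so the comparison runs on the 24 within-stock differences and not on two independent samples.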

Model specifications

CAPM (one factor): Excess return on stock $i$ regressed on market excess return only. Tests whether a single beta explains daily variation during 2018–2021.

Fama–French three-factor (FF3): Adds SMB (size) and HML (value). Captures well-documented size and book-to-market effects in US equities.

Carhart four-factor: Adds momentum (MOM) to FF3. Momentum is often significant in US samples but may add little incremental adjusted R² once size and value are already included.

The exhibits in the goodness-of-fit section report cross-sectional means, medians, and dispersion of adjusted R² and RSE for each specification, with estimating equations shown alongside the tables.

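CAPM: $R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,\mathrm{MKT}}\,(R_{m,t} - R_{f,t}) + \varepsilon_{i,t}$

FF3: $R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,\mathrm{MKT}}\,(R_{m,t} - R_{f,t}) + \beta_{i,\mathrm{SMB}}\,\mathrm{SMB}_t + \beta_{i,\mathrm{HML}}\,\mathrm{HML}_t + \varepsilon_{i,t}$

Carhart (CAR): $R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,\mathrm{MKT}}\,(R_{m,t} - R_{f,t}) + \beta_{i,\mathrm{SMB}}\,\mathrm{SMB}_t + \beta_{i,\mathrm{HML}}\,\mathrm{HML}_t + \beta_{i,\mathrm{MOM}}\,\mathrm{MOM}_t + \varepsilon_{i,t}$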
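The three specifications are nested, so they can be estimated with one helper and three design matrices. The sketch below uses plain NumPy least squares in place of statsmodels (which the project itself uses) to stay dependency-light. The helper name `fit_ols`, the synthetic factor series, and the loadings 0.9, 0.4, and 0.3 are assumptions made for illustration and are not taken from the thesis sample.

```python
import numpy as np

def fit_ols(y, factors):
    """OLS with intercept; returns (coefficients, adjusted R^2, RSE)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), factors])  # intercept column first
    T, k = X.shape                                   # k = intercept + factor count
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = float(resid @ resid)                       # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())         # total sum of squares
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (1.0 - r2) * (T - 1) / (T - k)
    rse = float(np.sqrt(ssr / (T - k)))
    return beta, adj_r2, rse

# Synthetic daily data for illustration only (not the thesis sample).
rng = np.random.default_rng(0)
T = 1000
mkt, smb, hml, mom = rng.normal(0.0, 0.01, size=(4, T))
# The simulated stock loads on MKT, SMB, and HML but not on MOM.
y = 0.9 * mkt + 0.4 * smb + 0.3 * hml + rng.normal(0.0, 0.01, size=T)

specs = {
    "CAPM": np.column_stack([mkt]),
    "FF3": np.column_stack([mkt, smb, hml]),
    "Carhart": np.column_stack([mkt, smb, hml, mom]),
}
results = {name: fit_ols(y, F) for name, F in specs.items()}
```

Because the simulated stock has no momentum exposure, the Carhart fit should land very close to the FF3 fit, which mirrors the qualitative pattern reported here without reproducing any of its numbers.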

What to look for in the results

Key insights (auto-generated from the sample) summarize mean R² gains, sector patterns, and RSE tests — read these before the tables.

The sector bar chart shows where FF3 adds the most explanatory power over CAPM; cyclical and consumer names often show larger lifts than utilities.

Sorting the per-stock grid by FF3 adjusted R² highlights which names are well explained by the factors and which are dominated by idiosyncratic noise.

Contrast adjusted R² (variance explained) with RSE tests (residual precision): models can improve fit substantially even when formal RSE comparisons are borderline, as in Goncharov’s thesis.

Goodness-of-fit: adjusted R² and RSE

Adjusted R² measures how much daily excess return variance is explained after penalizing additional factors. Goncharov reports large gains from CAPM to FF3 and only modest further gains from adding momentum — the bar chart on this page reproduces that pattern on the estimated sample.

Residual standard error (RSE) is the regression residual volatility; lower RSE implies tighter fit. Paired one-tailed t-tests on cross-sectional RSE vectors test whether a more complex model significantly reduces residual noise versus a simpler one (CAPM vs FF3, FF3 vs Carhart, CAPM vs Carhart).

When RSE differences are not statistically significant at conventional levels, a simpler model may be preferred on parsimony grounds even if adjusted R² rises slightly with extra factors.

Sample: 24 / 30 stocks, 2018-01-01 → 2021-12-31

Mean adj. R² (CAPM): 29.4%

Mean adj. R² (FF3): 39.2% (+33.4% vs CAPM)

Mean adj. R² (Carhart): 39.5% (+0.83% vs FF3)

Comparison	Mean RSE (A)	Mean RSE (B)	t	p (1-tail)	Result
CAPM vs FF3	0.0240	0.0220	5.174	0.000	FF3 significantly lower RSE than CAPM
FF3 vs CAR	0.0220	0.0220	2.821	0.005	CAR significantly lower RSE than FF3
CAPM vs CAR	0.0240	0.0220	5.251	0.000	CAR significantly lower RSE than CAPM

Ticker	Sector	CAPM adj. R²	FF3 adj. R²	Carhart adj. R²	CAPM RSE	FF3 RSE	Carhart RSE
HON	Industrials	61.5%	71.8%	72.1%	0.0110	0.0090	0.0090
SYF	Financials	49.6%	69.0%	69.0%	0.0210	0.0160	0.0160
AIG	Financials	45.7%	68.4%	68.4%	0.0200	0.0150	0.0150
MPC	Energy	42.1%	56.7%	57.3%	0.0240	0.0210	0.0210
HST	Real Estate	34.7%	54.3%	56.4%	0.0210	0.0170	0.0170
IBM	Information Technology	48.4%	52.6%	53.4%	0.0130	0.0120	0.0120
PPG	Materials	44.7%	52.2%	52.3%	0.0140	0.0130	0.0130
INN	Real Estate	24.6%	51.8%	52.6%	0.0310	0.0250	0.0250
AROC	Energy	24.9%	47.7%	47.6%	0.0340	0.0290	0.0290
DRI	Consumer Discretionary	37.3%	46.3%	46.3%	0.0240	0.0220	0.0220
LUV	Industrials	31.6%	45.3%	46.5%	0.0210	0.0190	0.0180
JNJ	Health Care	36.9%	42.6%	42.8%	0.0110	0.0100	0.0100
TMUS	Communication Services	39.2%	39.7%	39.7%	0.0140	0.0140	0.0140
LEN	Consumer Discretionary	33.1%	35.9%	36.6%	0.0230	0.0220	0.0220
BZH	Consumer Discretionary	29.2%	34.3%	35.1%	0.0320	0.0310	0.0310
MSEX	Utilities	24.0%	24.8%	24.8%	0.0190	0.0190	0.0190
ECPG	Financials	14.5%	23.9%	23.8%	0.0340	0.0320	0.0320
TWIN	Industrials	18.4%	23.2%	23.2%	0.0360	0.0350	0.0350
ED	Utilities	16.0%	23.2%	23.2%	0.0140	0.0140	0.0140
LXU	Materials	11.7%	22.9%	22.9%	0.0510	0.0470	0.0470
IRWD	Health Care	15.8%	22.6%	22.7%	0.0270	0.0260	0.0260
LOCO	Consumer Discretionary	11.7%	19.6%	19.6%	0.0280	0.0260	0.0260
CPB	Consumer Staples	5.6%	7.9%	8.1%	0.0170	0.0170	0.0170
BGS	Consumer Staples	4.0%	4.1%	4.1%	0.0280	0.0280	0.0280

Missing prices: HA, AMBC, WOW, CTXS, EBIX, LLNW
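Several of the headline figures can be cross-checked directly from the per-stock table. The sketch below copies the CAPM and FF3 adjusted R² columns as printed and recomputes the means, the relative gain, the largest and smallest per-stock gains, the count of stocks above 50%, and the Real Estate uplift. Variable names are illustrative, and because the inputs are rounded to one decimal, agreement is only expected at that precision.

```python
from statistics import mean

# (ticker, sector, CAPM adj. R2 in %, FF3 adj. R2 in %), copied from the per-stock table.
rows = [
    ("HON", "Industrials", 61.5, 71.8),
    ("SYF", "Financials", 49.6, 69.0),
    ("AIG", "Financials", 45.7, 68.4),
    ("MPC", "Energy", 42.1, 56.7),
    ("HST", "Real Estate", 34.7, 54.3),
    ("IBM", "Information Technology", 48.4, 52.6),
    ("PPG", "Materials", 44.7, 52.2),
    ("INN", "Real Estate", 24.6, 51.8),
    ("AROC", "Energy", 24.9, 47.7),
    ("DRI", "Consumer Discretionary", 37.3, 46.3),
    ("LUV", "Industrials", 31.6, 45.3),
    ("JNJ", "Health Care", 36.9, 42.6),
    ("TMUS", "Communication Services", 39.2, 39.7),
    ("LEN", "Consumer Discretionary", 33.1, 35.9),
    ("BZH", "Consumer Discretionary", 29.2, 34.3),
    ("MSEX", "Utilities", 24.0, 24.8),
    ("ECPG", "Financials", 14.5, 23.9),
    ("TWIN", "Industrials", 18.4, 23.2),
    ("ED", "Utilities", 16.0, 23.2),
    ("LXU", "Materials", 11.7, 22.9),
    ("IRWD", "Health Care", 15.8, 22.6),
    ("LOCO", "Consumer Discretionary", 11.7, 19.6),
    ("CPB", "Consumer Staples", 5.6, 7.9),
    ("BGS", "Consumer Staples", 4.0, 4.1),
]

mean_capm = mean(r[2] for r in rows)
mean_ff3 = mean(r[3] for r in rows)
# Relative gain in percent, as opposed to the gap in percentage points.
relative_gain = (mean_ff3 - mean_capm) / mean_capm * 100

# Per-stock gain in percentage points (FF3 minus CAPM).
gains = {ticker: ff3 - capm for ticker, _, capm, ff3 in rows}
largest = max(gains, key=gains.get)
smallest = min(gains, key=gains.get)

over_50 = sum(1 for r in rows if r[3] > 50.0)
real_estate_uplift = mean(ff3 - capm for _, sector, capm, ff3 in rows if sector == "Real Estate")
```

The distinction between the relative gain (33.4%) and the gap in percentage points (about 9.8 pp) matters when comparing against the thesis tables, since the two are easy to conflate.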

Sample period and economic context (2018–2021)

The thesis window includes the 2018 trade-policy volatility, the sharp Q4 2018 equity drawdown, the 2020 COVID-19 crash and policy-driven rebound, and the 2021 reopening / inflation narrative. Factor models must fit both calm and crisis days — a stringent test for fixed linear betas.

A GICS-balanced panel reduces sector concentration: financials, technology, health care, and energy names enter with large/small pairs where possible so results are not driven by a single industry.

Stocks with incomplete return history are dropped; the dashboard reports how many of the requested thirty stocks were successfully estimated.

Conclusion

CAPM, Fama–French three-factor, and Carhart four-factor specifications explain a moderate share of daily excess-return variation for the 24 US stocks estimated in the 2018–2021 sample: mean adjusted R² is 29.4% under CAPM, 39.2% under FF3, and 39.5% under Carhart. Under FF3, no alpha is significant at |t| ≥ 1.96 once systematic exposures are controlled for.

Compare headline statistics against the Goncharov (2023) thesis tables when validating the replication. Survivorship and vendor data differences may prevent an exact numerical match.

Limitations

Public price data are not CRSP; corporate actions and delistings can differ from the thesis sample. Some 2018–2021 tickers are no longer listed.

The thesis used Excel OLS; this implementation uses statsmodels. Small numerical differences are expected.

Time-series fit on 2018–2021 does not prove that factors price the cross-section out of sample; it only measures explanatory power in-sample.

Research and education only — not investment advice.