Difference-in-Differences (DiD) — главный causal inference метод когда нельзя сделать A/B-тест (geo rollout, marketing campaign, regulatory change). Использовался в Card & Krueger (1994) для оценки эффекта минимальной зарплаты — Нобелевская премия по экономике 2021. В этом гайде — математика + SQL + Python примеры на реальных аналитических кейсах.
Когда использовать DiD
✅ Идеальные кейсы
- Geo rollout — фича запущена в Москве, не в регионах
- Marketing campaign — promo для одного segment, не для другого
- Policy change — изменение pricing для одной категории
- Natural experiment — внешний shock (закрытие конкурента в одном регионе)
- Phased rollout — фича запускается постепенно по сегментам
❌ Когда DiD не работает
- Spillover effects — treatment влияет и на control (через social network)
- Selection bias — treatment group systematically different from control
- Parallel trends violation — pre-period trends не параллельны
- Anticipation effects — control group меняет behavior зная про treatment
Математика DiD
Базовая формула
$$\hat{ATE} = (\bar{Y}_{T, post} - \bar{Y}_{T, pre}) - (\bar{Y}_{C, post} - \bar{Y}_{C, pre})$$
где:
- $T$ — treatment group, $C$ — control
- $pre$ — до изменения, $post$ — после
- $\hat{ATE}$ — Average Treatment Effect estimate
Regression формулировка
$$Y_{it} = \alpha + \beta \cdot Treatment_i + \gamma \cdot Post_t + \delta \cdot (Treatment_i \times Post_t) + \epsilon_{it}$$
$\delta$ — это наш DiD effect. Coefficient на interaction term Treatment × Post.
Parallel trends assumption
Критично: pre-treatment trends в treatment и control groups должны быть параллельными.
Тест:
- Plot pre-period trends в обоих groups
- Run placebo test (pretend pre-period split was actual treatment)
- Эта значимость = violation, DiD assumption нарушена
SQL пример: geo rollout DataLens dashboard
Кейс: Запустили новый DataLens dashboard для analysts в Москве. Регионы — control group. Период: pre (3 месяца до), post (3 месяца после).
\\\sql
WITH user_periods AS (
SELECT
user_id,
CASE WHEN region = 'Moscow' THEN 1 ELSE 0 END AS treatment,
CASE WHEN event_date >= '2026-03-01' THEN 1 ELSE 0 END AS post,
sessions_per_week
FROM user_metrics
WHERE event_date BETWEEN '2025-12-01' AND '2026-05-31'
),
aggregated AS (
SELECT treatment, post,
AVG(sessions_per_week) AS avg_sessions
FROM user_periods
GROUP BY treatment, post
)
SELECT
-- DiD estimate
(
(SELECT avg_sessions FROM aggregated WHERE treatment = 1 AND post = 1) -
(SELECT avg_sessions FROM aggregated WHERE treatment = 1 AND post = 0)
) -
(
(SELECT avg_sessions FROM aggregated WHERE treatment = 0 AND post = 1) -
(SELECT avg_sessions FROM aggregated WHERE treatment = 0 AND post = 0)
) AS did_effect;
\\\
Python: full DiD analysis with regression
\\\python
import pandas as pd
import statsmodels.formula.api as smf
# Load data: user_id, region, event_date, sessions_per_week
df['treatment'] = (df['region'] == 'Moscow').astype(int)
df['post'] = (df['event_date'] >= '2026-03-01').astype(int)
df['treatment_post'] = df['treatment'] * df['post']
# Regression DiD
model = smf.ols('sessions_per_week ~ treatment + post + treatment_post', data=df).fit()
print(model.summary())
# DiD effect = coefficient on treatment_post
did_effect = model.params['treatment_post']
p_value = model.pvalues['treatment_post']
print(f"DiD effect: {did_effect:.3f} (p={p_value:.4f})")
\\\
Parallel trends test
Перед DiD анализом обязательно проверить parallel trends:
\\\python
import matplotlib.pyplot as plt
# Plot pre-period trends
pre_period = df[df['event_date'] < '2026-03-01']
trends = pre_period.groupby(['event_date', 'treatment'])['sessions_per_week'].mean().unstack()
trends.plot(figsize=(10, 6))
plt.title('Pre-treatment trends: should be parallel')
plt.show()
# Statistical test: interaction of treatment × continuous time
pre_period['time_idx'] = (pre_period['event_date'] - pre_period['event_date'].min()).dt.days
pretest_model = smf.ols('sessions_per_week ~ treatment + time_idx + treatment:time_idx', data=pre_period).fit()
print(f"Pretrend coefficient: {pretest_model.params['treatment:time_idx']:.5f}")
print(f"p-value: {pretest_model.pvalues['treatment:time_idx']:.4f}")
# If p-value < 0.05 — parallel trends violated, DiD assumption fails
\\\
Кейс: marketing campaign A/B impossible
Ситуация: Marketing запустил promo для users из Москвы в апреле 2026. Получили +15% sessions в Москве. Это эффект promo или просто сезонность?
DiD approach:
\\\python
# Sessions in Moscow vs regions (regions never get promo)
moscow_pre = 100 # baseline March
moscow_post = 115 # April with promo
region_pre = 95 # baseline March
region_post = 100 # April without promo (general trend)
did_effect = (moscow_post - moscow_pre) - (region_post - region_pre)
print(f"DiD: {did_effect}") # 15 - 5 = 10
# Pure promo effect = 10, not 15. 5 — общий тренд.
\\\
Insight: Reported +15% effect был частично из-за seasonality (+5%). Чистый promo effect = +10%. Если бы взяли naive «до vs после» — overstate effect в 1.5x.
Кейс: pricing change для категории
Ситуация: Подняли цену на category A на 10%. Category B без изменений.
\\\python
# Volume в обеих categories до и после
cat_a_pre = 1000
cat_a_post = 850 # -15% volume
cat_b_pre = 1200
cat_b_post = 1150 # -4% (общий market trend)
did = (cat_a_post - cat_a_pre) - (cat_b_post - cat_b_pre)
print(f"DiD volume effect: {did}") # -150 - (-50) = -100
# Pure pricing impact: -100 units (10% volume loss tied to price)
# vs naive: -150 units (overstate)
\\\
Advanced: Synthetic Control (DiD++)
Когда нет good control group, synthetic control method создаёт weighted combination of donor pool to mimic treatment group's pre-period.
Используется в:
- Country-level policy evaluation (single country treatment)
- Brand expansion analysis
- Market entry studies
Library: \pip install pysyncon\ (Synthetic Control package).
Common mistakes
❌ Не проверять parallel trends
Самый частый mistake. Без parallel trends DiD biased.
❌ Selection bias в treatment assignment
Если treatment group systematically different (richer, more active), DiD biased.
❌ Spillover effects
Treatment может влиять на control через network effects (users in treatment area tell friends в control area). Биас.
❌ Anticipation effects
Если control group знает что treatment скоро коснётся их → они меняют behavior preemptively → biased DiD.
❌ Too short post-period
Effect может проявиться позже. Minimum 4 weeks, ideal 8-12.
FAQ
DiD vs PSM (Propensity Score Matching)?
- DiD: panel data (один subject во времени)
- PSM: cross-sectional (matching similar subjects)
- Можно combine: PSM-DiD для double-robust estimation.
Сколько pre-period нужно?
Минимум 4 недели, ideally 12+. Для parallel trends test нужны достаточные observations.
DiD на binary outcomes?
Используется logistic regression аналог. Coefficient on interaction term — DiD effect (log-odds scale).
Сколько control groups нужно?
1 minimum, 3+ better. Multiple controls — more robust estimates.
Что дальше
- CUPED variance reduction
- Multi-armed bandit
- A/B-тесты на Python
- Sample size калькулятор
- 453 кейса для практики causal inference
Источники
- Angrist & Pischke «Mostly Harmless Econometrics» — Chapter 5
- Cunningham «Causal Inference: The Mixtape» (open source)
- Card & Krueger (1994) — original DiD application