DiDdifference-in-differencescausal inferenceA/B-тестыSQLPython

Difference-in-Differences (DiD) для аналитика: SQL + Python примеры

2026-06-04 16 мин

Difference-in-Differences (DiD) — главный causal inference метод когда нельзя сделать A/B-тест (geo rollout, marketing campaign, regulatory change). Использовался в Card & Krueger (1994) для оценки эффекта минимальной зарплаты — Нобелевская премия по экономике 2021. В этом гайде — математика + SQL + Python примеры на реальных аналитических кейсах.

Главная идея DiD
**До-после** в treatment group **минус** до-после в control group. Это **очищает** treatment effect от seasonal trends, общих macro changes, и других confounders.

Когда использовать DiD

✅ Идеальные кейсы

❌ Когда DiD не работает

Математика DiD

Базовая формула

$$\hat{ATE} = (\bar{Y}_{T, post} - \bar{Y}_{T, pre}) - (\bar{Y}_{C, post} - \bar{Y}_{C, pre})$$

где:

Regression формулировка

$$Y_{it} = \alpha + \beta \cdot Treatment_i + \gamma \cdot Post_t + \delta \cdot (Treatment_i \times Post_t) + \epsilon_{it}$$

$\delta$ — это наш DiD effect. Coefficient на interaction term Treatment × Post.

Parallel trends assumption

Критично: pre-treatment trends в treatment и control groups должны быть параллельными.

Тест:

SQL пример: geo rollout DataLens dashboard

Кейс: Запустили новый DataLens dashboard для analysts в Москве. Регионы — control group. Период: pre (3 месяца до), post (3 месяца после).

\\\sql

WITH user_periods AS (

SELECT

user_id,

CASE WHEN region = 'Moscow' THEN 1 ELSE 0 END AS treatment,

CASE WHEN event_date >= '2026-03-01' THEN 1 ELSE 0 END AS post,

sessions_per_week

FROM user_metrics

WHERE event_date BETWEEN '2025-12-01' AND '2026-05-31'

),

aggregated AS (

SELECT treatment, post,

AVG(sessions_per_week) AS avg_sessions

FROM user_periods

GROUP BY treatment, post

)

SELECT

-- DiD estimate

(

(SELECT avg_sessions FROM aggregated WHERE treatment = 1 AND post = 1) -

(SELECT avg_sessions FROM aggregated WHERE treatment = 1 AND post = 0)

) -

(

(SELECT avg_sessions FROM aggregated WHERE treatment = 0 AND post = 1) -

(SELECT avg_sessions FROM aggregated WHERE treatment = 0 AND post = 0)

) AS did_effect;

\\\

Python: full DiD analysis with regression

\\\python

import pandas as pd

import statsmodels.formula.api as smf

# Load data: user_id, region, event_date, sessions_per_week

df['treatment'] = (df['region'] == 'Moscow').astype(int)

df['post'] = (df['event_date'] >= '2026-03-01').astype(int)

df['treatment_post'] = df['treatment'] * df['post']

# Regression DiD

model = smf.ols('sessions_per_week ~ treatment + post + treatment_post', data=df).fit()

print(model.summary())

# DiD effect = coefficient on treatment_post

did_effect = model.params['treatment_post']

p_value = model.pvalues['treatment_post']

print(f"DiD effect: {did_effect:.3f} (p={p_value:.4f})")

\\\

Перед DiD анализом обязательно проверить parallel trends:

\\\python

import matplotlib.pyplot as plt

# Plot pre-period trends

pre_period = df[df['event_date'] < '2026-03-01']

trends = pre_period.groupby(['event_date', 'treatment'])['sessions_per_week'].mean().unstack()

trends.plot(figsize=(10, 6))

plt.title('Pre-treatment trends: should be parallel')

plt.show()

# Statistical test: interaction of treatment × continuous time

pre_period['time_idx'] = (pre_period['event_date'] - pre_period['event_date'].min()).dt.days

pretest_model = smf.ols('sessions_per_week ~ treatment + time_idx + treatment:time_idx', data=pre_period).fit()

print(f"Pretrend coefficient: {pretest_model.params['treatment:time_idx']:.5f}")

print(f"p-value: {pretest_model.pvalues['treatment:time_idx']:.4f}")

# If p-value < 0.05 — parallel trends violated, DiD assumption fails

\\\

Кейс: marketing campaign A/B impossible

Ситуация: Marketing запустил promo для users из Москвы в апреле 2026. Получили +15% sessions в Москве. Это эффект promo или просто сезонность?

DiD approach:

\\\python

# Sessions in Moscow vs regions (regions never get promo)

moscow_pre = 100 # baseline March

moscow_post = 115 # April with promo

region_pre = 95 # baseline March

region_post = 100 # April without promo (general trend)

did_effect = (moscow_post - moscow_pre) - (region_post - region_pre)

print(f"DiD: {did_effect}") # 15 - 5 = 10

# Pure promo effect = 10, not 15. 5 — общий тренд.

\\\

Insight: Reported +15% effect был частично из-за seasonality (+5%). Чистый promo effect = +10%. Если бы взяли naive «до vs после» — overstate effect в 1.5x.

Кейс: pricing change для категории

Ситуация: Подняли цену на category A на 10%. Category B без изменений.

\\\python

# Volume в обеих categories до и после

cat_a_pre = 1000

cat_a_post = 850 # -15% volume

cat_b_pre = 1200

cat_b_post = 1150 # -4% (общий market trend)

did = (cat_a_post - cat_a_pre) - (cat_b_post - cat_b_pre)

print(f"DiD volume effect: {did}") # -150 - (-50) = -100

# Pure pricing impact: -100 units (10% volume loss tied to price)

# vs naive: -150 units (overstate)

\\\

Advanced: Synthetic Control (DiD++)

Когда нет good control group, synthetic control method создаёт weighted combination of donor pool to mimic treatment group's pre-period.

Используется в:

Library: \pip install pysyncon\ (Synthetic Control package).

Common mistakes

❌ Не проверять parallel trends

Самый частый mistake. Без parallel trends DiD biased.

❌ Selection bias в treatment assignment

Если treatment group systematically different (richer, more active), DiD biased.

❌ Spillover effects

Treatment может влиять на control через network effects (users in treatment area tell friends в control area). Биас.

❌ Anticipation effects

Если control group знает что treatment скоро коснётся их → они меняют behavior preemptively → biased DiD.

❌ Too short post-period

Effect может проявиться позже. Minimum 4 weeks, ideal 8-12.

FAQ

DiD vs PSM (Propensity Score Matching)?

Сколько pre-period нужно?

Минимум 4 недели, ideally 12+. Для parallel trends test нужны достаточные observations.

DiD на binary outcomes?

Используется logistic regression аналог. Coefficient on interaction term — DiD effect (log-odds scale).

Сколько control groups нужно?

1 minimum, 3+ better. Multiple controls — more robust estimates.

Что дальше

Источники

Тренируй на реальных кейсах
453 кейса + AI разбор. Включая causal inference задачи.
Открыть кейсы →