Quantile treatment effects: estimands, identification, and a panel-data pitfall

A quantile treatment effect at level $τ$ is not the treatment effect for units at the $τ$ -th percentile of $Y$ . It is the difference between the $τ$ -th quantile of the treated potential-outcome distribution and the $τ$ -th quantile of the untreated potential-outcome distribution. Firpo, Fortin & Lemieux (2009) formalized this distinction and introduced the Recentered Influence Function (RIF) regression that lets you estimate unconditional QTEs as a coefficient in an OLS regression. In panel settings, especially combined with two-way fixed effects, the conditional/unconditional distinction is often missed, with consequences for policy that are both predictable and avoidable.

1. Two QTEs, not one

The unconditional QTE at level $τ$ is

QTE_{τ} = Q_{Y (1)} (τ) - Q_{Y (0)} (τ) . (1)

It compares quantiles of the marginal distributions of the two potential outcomes. If $Y$ is wages, $QTE_{0.5}$ is the difference in median wages between the treated and untreated worlds.

The conditional QTE at level $τ$ and covariate $x$ is

QTE_{τ} (x) = Q_{Y (1) ∣ X = x} (τ) - Q_{Y (0) ∣ X = x} (τ) . (2)

It compares quantiles of the potential-outcome distributions conditional on covariates.

The two are not related by simple averaging. The quantile of a marginal distribution is not the average of conditional quantiles. Formally, even if $QTE_{τ} (x) = Δ$ for every $x$ at a given $τ$ , the unconditional $QTE_{τ}$ need not equal $Δ$ . Equality is guaranteed when treatment is a pure location shift, that is, when it moves every conditional distribution by the same $Δ$ without changing its shape.

1.1 The distinction visualized

Conditional vs. unconditional QTEs

The left panel shows two values of $X$ with different conditional QTEs ( $+ 0.5$ and $+ 1.2$ ). The right panel shows that, once we average over $X$ to obtain the marginal distributions, the unconditional median shift is neither $0.5$ nor $1.2$ nor their average: it depends on the full mixture structure.
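A small numerical check makes the same point. The sketch below assumes a hypothetical two-cell population (the mixture weights and unit-variance normal cells are illustrative choices, not the ones behind the figure) in which treatment shifts the conditional median by $0.5$ in one cell and by $1.2$ in the other. It then computes the unconditional median shift from the exact mixture CDFs.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mixture_quantile(tau, weights, means, lo=-20.0, hi=20.0):
    """tau-quantile of a mixture of unit-variance normals, found by bisection."""
    def cdf(y):
        return sum(w * norm_cdf(y - m) for w, m in zip(weights, means))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed population: P(X = 0) = 0.7, P(X = 1) = 0.3.
weights = [0.7, 0.3]
means_untreated = [0.0, 1.5]   # medians of Y(0) given X = 0 and X = 1
cond_qte = [0.5, 1.2]          # conditional median shifts in each cell
means_treated = [m + d for m, d in zip(means_untreated, cond_qte)]

tau = 0.5
uncond_qte = (mixture_quantile(tau, weights, means_treated)
              - mixture_quantile(tau, weights, means_untreated))
avg_cond_qte = sum(w * d for w, d in zip(weights, cond_qte))

print(f"unconditional median QTE:       {uncond_qte:.3f}")
print(f"weighted average of cond. QTEs: {avg_cond_qte:.3f}")
```

Under these assumptions the unconditional median shift comes out at roughly $0.61$, while the population-weighted average of the conditional shifts is $0.71$. Changing the weights or the spacing of the cells moves the unconditional number without touching either conditional QTE.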

Consequences for practice:

Quantile regression à la Koenker & Bassett (1978) estimates conditional quantiles. Reading its treatment coefficient as an unconditional quantile treatment effect conflates (1) and (2).
RIF regression (§3) is the remedy: it targets (1) directly by replacing the outcome with its recentered influence function, whose mean is the unconditional quantile.

2. Conditional quantile regression, briefly

The Koenker-Bassett (1978) estimator minimizes the asymmetric check function

ρ_{τ} (u) = u (τ - 1 [u < 0]) .

For a linear model $Q_{Y ∣ X} (τ) = X^{⊤} β (τ)$ , the estimator is

\hat{β} (τ) = \arg\min_{β} \sum_{i} ρ_{τ} (Y_{i} - X_{i}^{⊤} β) .

$\hat{β} (τ)$ is consistent for $β (τ)$ and $\sqrt{n}$ -asymptotically normal under standard regularity conditions. The coefficient on a treatment indicator $D$ in this regression estimates the conditional $QTE_{τ} (x)$ , not the unconditional $QTE_{τ}$ . If you want (1), you need something else.
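To see what the check function does, the following minimal sketch confirms that minimizing the summed check loss over a constant returns the sample $τ$ -quantile. It is an illustration only: the data are simulated, the model is intercept-only, and a brute-force search stands in for the linear-programming solvers used in practice. All names are hypothetical.

```python
import numpy as np

def check_loss(u, tau):
    """Koenker-Bassett check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (u < 0))

rng = np.random.default_rng(0)
y = rng.exponential(size=2001)   # skewed simulated outcome, no covariates
tau = 0.75

# The objective is piecewise linear and convex in b, so a minimizer
# is attained at one of the observed data points.
candidates = np.sort(y)
objective = np.array([check_loss(y - b, tau).sum() for b in candidates])
b_hat = candidates[objective.argmin()]

print("check-loss minimizer:", b_hat)
print("sample 0.75 quantile:", np.quantile(y, tau))
```

With covariates the same objective is minimized over a coefficient vector, and the fitted $X^{⊤} \hat{β} (τ)$ tracks the conditional quantile rather than the marginal one.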

3. Firpo-Fortin-Lemieux: the RIF regression

The elegant insight: the $τ$ -th quantile is a functional of the outcome distribution $F_{Y}$ , and its influence function has a closed form.

Definition. The influence function of the functional $ν (F) = Q_{F} (τ) = F^{- 1} (τ)$ at the distribution $F_{Y}$ is

IF (y; q_{τ}, F_{Y}) = \frac{τ - 1 [ y \leq q _{τ} ]}{f _{Y} ( q _{τ} )},

where $q_{τ} = Q_{F_{Y}} (τ)$ and $f_{Y}$ is the density.

The Recentered Influence Function (RIF) is

RIF (y; q_{τ}) = q_{τ} + IF (y; q_{τ}, F_{Y}) = q_{τ} + \frac{τ - 1 [ y \leq q _{τ} ]}{f _{Y} ( q _{τ} )} . (3)

Two properties make RIF useful.

Property 1: centering. $E [RIF (Y; q_{τ})] = q_{τ}$ . This follows from $E [1 [Y \leq q_{τ}]] = τ$ .

Property 2: linearity under mixture. For any decomposition $F_{Y} = \int F_{Y ∣ X = x} d P (x)$ into conditional distributions,

q_{τ} = E [RIF (Y; q_{τ})] = \int E [RIF (Y; q_{τ}) ∣ X = x] d P (x) .

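So the unconditional $τ$ -th quantile decomposes into an integral of the conditional expectation of the RIF, which is a regression problem. Fit a linear model $E [RIF (Y; q_{τ}) ∣ X, D] = X^{⊤} γ (τ) + δ (τ) D$ by OLS; the coefficient $\hat{δ} (τ)$ estimates the effect of $D$ on the unconditional $τ$ -th quantile. Strictly, this is what Firpo, Fortin & Lemieux (2009) call an unconditional quantile partial effect, the first-order effect on $q_{τ}$ of a small shift in the distribution of $D$ , so it approximates the unconditional QTE in (1) rather than equaling it exactly, and it has a causal interpretation only under unconfoundedness.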

3.1 The RIF visualized

RIF for the 0.75 quantile, density and RIF function

The RIF is a step function that equals $q_{τ} - (1 - τ) / f_{Y} (q_{τ})$ at or below the quantile and $q_{τ} + τ / f_{Y} (q_{τ})$ above it. Its expectation under $F_{Y}$ is exactly $q_{τ}$ . When you regress the RIF on covariates, you are decomposing the movement of the quantile under a perturbation in the covariate distribution.
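The two-valued shape and the centering property are easy to verify numerically. The sketch below assumes a standard normal outcome so that the true quantile and density are available in closed form (via the standard library's `NormalDist`). The helper `rif_quantile` is an illustrative name, not part of any package.

```python
import numpy as np
from statistics import NormalDist

def rif_quantile(y, q_tau, f_q, tau):
    """RIF(y; q_tau) = q_tau + (tau - 1[y <= q_tau]) / f_Y(q_tau)."""
    return q_tau + (tau - (y <= q_tau)) / f_q

tau = 0.75
std_normal = NormalDist()          # assumed outcome distribution: N(0, 1)
q_tau = std_normal.inv_cdf(tau)    # true 0.75 quantile
f_q = std_normal.pdf(q_tau)        # true density at that quantile

rng = np.random.default_rng(1)
y = rng.standard_normal(200_000)
rif = rif_quantile(y, q_tau, f_q, tau)

low = q_tau - (1 - tau) / f_q      # value taken when y <= q_tau
high = q_tau + tau / f_q           # value taken when y >  q_tau
print("distinct RIF values:", np.unique(rif), "expected:", (low, high))
print("mean of RIF:", rif.mean(), "q_tau:", q_tau)
```

The RIF takes only the two values `low` and `high`, and its sample mean sits close to `q_tau`, as Property 1 requires.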

3.2 The RIF regression procedure

Estimate $\hat{q}_{τ}$ as the sample $τ$ -quantile.
Estimate $\hat{f}_{Y} (\hat{q}_{τ})$ by a kernel density.
Compute $RIF_{i} = \hat{q}_{τ} + (τ - 1 [Y_{i} \leq \hat{q}_{τ}]) / \hat{f}_{Y} (\hat{q}_{τ})$ for each observation.
Regress $RIF_{i}$ on $X_{i}$ and $D_{i}$ by OLS.
The coefficient on $D$ is the unconditional QTE at $τ$ .

The standard error reported by the OLS regression is not directly valid because it ignores the sampling uncertainty in $\hat{q}_{τ}$ and $\hat{f}_{Y} (\hat{q}_{τ})$ . Valid alternatives include bootstrapping the entire procedure (re-running every step on each resample) or using the analytical correction in Firpo, Fortin & Lemieux (2009).
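The procedure can be sketched in a few lines of NumPy. Everything specific here is an assumption made for illustration: the simulated data (randomized binary treatment, a pure location shift of $0.5$), the Gaussian kernel with a Silverman rule-of-thumb bandwidth, and a plain nonparametric bootstrap as one way of re-running all steps. It is not the analytical correction of Firpo, Fortin & Lemieux (2009), and the function names are hypothetical.

```python
import numpy as np

def kde_at(y, point):
    """Gaussian kernel density estimate of f_Y at `point` (Silverman bandwidth)."""
    n = y.size
    q75, q25 = np.quantile(y, [0.75, 0.25])
    h = 0.9 * min(y.std(ddof=1), (q75 - q25) / 1.34) * n ** (-0.2)
    z = (y - point) / h
    return np.exp(-0.5 * z**2).sum() / (n * h * np.sqrt(2 * np.pi))

def rif_ols(y, X, d, tau):
    """Run the steps in order and return the OLS coefficient on d."""
    q_hat = np.quantile(y, tau)                          # sample quantile
    f_hat = kde_at(y, q_hat)                             # density at the quantile
    rif = q_hat + (tau - (y <= q_hat)) / f_hat           # RIF per observation
    design = np.column_stack([np.ones_like(y), X, d])    # intercept, X, D
    coef, *_ = np.linalg.lstsq(design, rif, rcond=None)  # OLS fit
    return coef[-1]                                      # coefficient on D

rng = np.random.default_rng(2)
n, tau, true_shift = 20_000, 0.5, 0.5
X = rng.standard_normal(n)
d = rng.integers(0, 2, size=n).astype(float)   # randomized treatment
y = X + true_shift * d + rng.standard_normal(n)

delta_hat = rif_ols(y, X, d, tau)

# Bootstrap every step: q_hat, f_hat and the OLS fit are redone per resample.
boot = np.empty(200)
for b in range(boot.size):
    idx = rng.integers(0, n, size=n)
    boot[b] = rif_ols(y[idx], X[idx], d[idx], tau)

print(f"delta_hat = {delta_hat:.3f}, bootstrap SE = {boot.std(ddof=1):.3f}")
```

In this design the true unconditional QTE is $0.5$ at every $τ$ , so `delta_hat` should land near it. With larger effects or treatments that change the shape of the distribution, the RIF coefficient is a local approximation and can drift from the QTE in (1).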

4. QTE identification under unconfoundedness

Assume $(Y (0), Y (1)) ⊥ D ∣ X$ (unconfoundedness) and overlap $0 < e (X) = P (D = 1 ∣ X) < 1$ . Then

F_{Y (d)} (y) = \int F_{Y ∣ D = d, X = x} (y) d P (x) for d \in {0, 1},

i.e., the potential-outcome CDF is identified by averaging conditional CDFs. Invert to get $Q_{Y (d)} (τ)$ . The unconditional QTE (1) is then identified.

Practical implementation follows one of two routes. Either estimate the conditional CDFs by distribution regression or kernel smoothing, average them over the empirical distribution of $X$ , and invert; or reweight the observed outcomes by the inverse propensity score, $1/ \hat{e} (X)$ for treated units and $1/ (1 - \hat{e} (X))$ for controls, and invert the weighted empirical CDFs. Chernozhukov, Fernández-Val & Melly (2013) give a general framework. Chernozhukov & Hansen (2005) provide an IV extension.
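The inverse-propensity-weighting step can be sketched as follows. This is a minimal illustration under strong simplifying assumptions: a single binary confounder (so the propensity score is estimated by cell means), a simulated location-shift effect of $1$, and a hypothetical helper `weighted_quantile` that inverts the weighted empirical CDF.

```python
import numpy as np

def weighted_quantile(y, w, tau):
    """Invert the weighted empirical CDF: smallest y with CDF(y) >= tau."""
    order = np.argsort(y)
    cdf = np.cumsum(w[order]) / w.sum()
    idx = min(np.searchsorted(cdf, tau), y.size - 1)
    return y[order][idx]

rng = np.random.default_rng(3)
n, tau = 50_000, 0.5
x = rng.integers(0, 2, size=n)            # binary confounder
e_true = np.where(x == 1, 0.8, 0.2)       # assumed propensity P(D = 1 | X)
d = rng.random(n) < e_true                # treatment depends on X
y0 = 2.0 * x + rng.standard_normal(n)     # Y(0)
y1 = y0 + 1.0                             # Y(1): true QTE = 1 at every tau
y = np.where(d, y1, y0)                   # observed outcome

# Estimate the propensity score by cell means (X is discrete here).
e_hat = np.where(x == 1, d[x == 1].mean(), d[x == 0].mean())

q1 = weighted_quantile(y[d], 1.0 / e_hat[d], tau)
q0 = weighted_quantile(y[~d], 1.0 / (1.0 - e_hat[~d]), tau)
qte_ipw = q1 - q0
qte_naive = np.quantile(y[d], tau) - np.quantile(y[~d], tau)

print(f"IPW QTE = {qte_ipw:.3f}, naive difference = {qte_naive:.3f}")
```

Because treated units are mostly drawn from the high- $X$ cell, the naive difference in medians is badly inflated, while the reweighted quantiles recover a value close to the true shift. With continuous covariates the cell means would be replaced by a fitted propensity model, and overlap would need to be checked before trusting the weights.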

5. The panel-data pitfall

In panel settings with unit fixed effects, researchers often run conditional quantile regressions with unit dummies and interpret the coefficient as the effect on units at the $τ$ -th percentile. This is a common mistake with a specific structure.

Consider a simple panel $Y_{i t} = α_{i} + γ_{t} + β D_{i t} + ϵ_{i t}$ . The within-transformed outcome $\tilde{Y}_{i t} = Y_{i t} - \bar{Y}_{i} - \bar{Y}_{t} + \bar{Y}$ has a very different distribution from the raw $Y_{i t}$ . Running quantile regression on $\tilde{Y}_{i t}$ and interpreting the coefficient as “the effect on the top decile of $Y$ ” is a category error: the top decile of $\tilde{Y}$ is not the top decile of $Y$ .

5.1 The two subpopulations are different

Panel pitfall: top-quartile group differs under raw vs. within-transformed outcomes

The left panel highlights the top-quartile group defined by raw $Y$ ; the right panel highlights the top-quartile group defined by the within-transformed $\tilde{Y} = Y - \bar{Y}_{i}$ . The two groups barely overlap. A policy “target the top quartile” means something different in each case, and a quantile treatment effect estimated on within-transformed data tells you about the second group, not the first.
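A short simulation shows how weak the overlap can be. The sketch assumes a hypothetical panel with no treatment at all, unit effects with standard deviation $5$, and idiosyncratic noise with standard deviation $1$. These magnitudes are illustrative and are not taken from the figure.

```python
import numpy as np

rng = np.random.default_rng(4)
n_units, n_periods = 2_000, 10
alpha = 5.0 * rng.standard_normal((n_units, 1))    # unit effects (assumed sd 5)
eps = rng.standard_normal((n_units, n_periods))    # idiosyncratic noise (sd 1)
y = alpha + eps                                    # raw outcome

y_within = y - y.mean(axis=1, keepdims=True)       # unit-demeaned outcome

top_raw = y >= np.quantile(y, 0.75)
top_within = y_within >= np.quantile(y_within, 0.75)

# Share of raw top-quartile observations that are also in the
# top quartile of the within-transformed outcome.
overlap = (top_raw & top_within).sum() / top_raw.sum()
print(f"overlap = {overlap:.2f} (0.25 would be pure chance)")
```

When unit effects dominate, membership in the raw top quartile is driven by $α_{i}$ while membership in the within-transformed top quartile is driven by $ϵ_{i t}$ , so the overlap sits only modestly above the 25% that independent group membership would produce.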

The pitfall is asymmetric: if effects are monotone in baseline $Y$ , the naive approach can assign the largest estimated QTE to the wrong end of the distribution. Policy recommendations built on that reading can then systematically target the wrong subpopulation.

5.2 The right tool for panel QTEs

Callaway & Li (2019) extend the Callaway-Sant’Anna DiD framework to quantile treatment effects on the treated. The estimand is

QTT (g, t; τ) = Q_{Y_{t} (g) ∣ G = g} (τ) - Q_{Y_{t} (0) ∣ G = g} (τ),

i.e., the difference between the $τ$ -th quantiles of the treated and untreated outcome distributions for cohort $g$ at period $t$ . This is a QTT, not the $τ$ -th quantile of the individual treatment effect; the latter would require rank invariance, and treating the two as the same is the conflation the opening paragraph warns against. Identification pairs a distributional analogue of parallel trends with a copula stability assumption, which requires the dependence between the change in untreated outcomes and their initial level to be stable over time. The approach is well suited to settings where the entire distribution matters, such as inequality-oriented policy evaluation. The method recovers the dynamic distributional effect of treatment on the treated without conflating conditional and unconditional estimands.

6. Three real-life applications

Wage decomposition and the gender wage gap. Fortin, Lemieux & Firpo (2011) apply RIF regression to Current Population Survey data to decompose the change in the U.S. gender wage gap across the distribution. The bottom-decile gap is driven by different factors than the top-decile gap, a finding invisible to conditional quantile regression.

Inequality effects of education. Lemieux (2006) uses RIF-OLS to decompose increases in wage inequality into components from rising returns to education vs. rising residual inequality. The RIF framework is essential because the object of interest, the $90/10$ wage ratio, is a functional of the marginal distribution.

Microcredit distributional effects. Banerjee et al. (2015) and subsequent work use quantile treatment effects on microfinance RCT data to show that the average effect of microcredit is near zero but there are meaningful positive effects in the upper tail and negligible effects in the lower tail. The unconditional QTE framework is the right tool for exactly this question.

7. Open questions

Continuous-treatment quantile heterogeneity. RIF regression is designed for binary or discrete treatment. Continuous-treatment QTEs require different machinery (quantile partial effects, Chernozhukov, Fernández-Val & Kowalski 2015).

DML for quantile targets. Semenova & Chernozhukov (2021) cover CATE, but the unconditional QTE target has not been integrated cleanly with the DML framework. This remains an open methodological direction.

Finite-sample inference. RIF regression is sensitive to the density-at-quantile estimate $\hat{f}_{Y} (\hat{q}_{τ})$ . Robust alternatives (distribution regression, M-quantile regression) are available but trade off interpretability.

Multivariate QTEs. When the outcome is multivariate (e.g., wage and hours), the notion of “the $τ$ -th quantile” becomes multivariate. Depth-based and copula-based extensions exist but lack a unified framework.

8. References

Firpo, S., Fortin, N. M., & Lemieux, T. (2009). Unconditional quantile regressions. Econometrica, 77(3), 953–973.
Fortin, N. M., Lemieux, T., & Firpo, S. (2011). Decomposition methods in economics. Handbook of Labor Economics, 4A, 1–102.
Koenker, R., & Bassett, G. (1978). Regression quantiles. Econometrica, 46(1), 33–50.
Koenker, R. (2005). Quantile regression. Cambridge University Press.
Chernozhukov, V., & Hansen, C. (2005). An IV model of quantile treatment effects. Econometrica, 73(1), 245–261.
Chernozhukov, V., Fernández-Val, I., & Melly, B. (2013). Inference on counterfactual distributions. Econometrica, 81(6), 2205–2268.
Chernozhukov, V., Fernández-Val, I., & Kowalski, A. E. (2015). Quantile regression with censoring and endogeneity. Journal of Econometrics, 186(1), 201–221.
Callaway, B., & Li, T. (2019). Quantile treatment effects in difference in differences models with panel data. Quantitative Economics, 10(4), 1579–1618.
Lemieux, T. (2006). Increasing residual wage inequality: composition effects, noisy data, or rising demand for skill? American Economic Review, 96(3), 461–498.
Banerjee, A., Duflo, E., Glennerster, R., & Kinnan, C. (2015). The miracle of microfinance? Evidence from a randomized evaluation. AEJ: Applied Economics, 7(1), 22–53.

Figures produced by reproducible Python scripts in the accompanying code. All illustrations are analytical / small-simulation; no benchmark datasets are used, for both pedagogical clarity and reproducibility.

Hovhannes Grigoryan
