ldecomp: Stata module decomposing the total effects in a logistic regression into direct and indirect effects

Author: Maarten L. Buis

ldecomp decomposes the total effect of a categorical variable in a logistic regression into direct and indirect effects, using a method proposed by Erikson et al. (2005) and a generalization of this method by Buis (2010). Consider an example where social class has an indirect effect on attending college through academic performance in high school. The indirect effect is obtained by comparing the proportion of lower class students that attend college with the counterfactual proportion of lower class students that would attend college if they had the performance distribution of the higher class students. This captures the association between class and attending college that is due to differences in performance. The direct effect of class is obtained by comparing the proportion of higher class students that attend college with that same counterfactual proportion, that is, the proportion of lower class students that would attend college if they had the performance distribution of the higher class students. In this comparison the distribution of performance is held constant. If these comparisons are expressed as log odds ratios, then the total effect equals the sum of the direct and indirect effects. In its original form the method assumes that the variable through which the indirect effect occurs is normally distributed. Buis (2010) generalizes the method by allowing this variable to have any distribution, which has the added advantage of simplifying the method.
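The comparison described above can be written compactly. The notation follows the legend that ldecomp prints under its results: in the symbol O_{ij}, which is introduced here only for illustration, the first subscript gives the group whose distribution of the indirect variable is used and the second gives the group whose conditional probabilities are used. With i the higher class and j the lower class, a sketch of the decomposition described in the text (Method 1 in the output) is:

```latex
\begin{align*}
\text{indirect}_{i/j} &= \ln\frac{O_{ij}}{O_{jj}}, \qquad
\text{direct}_{i/j} = \ln\frac{O_{ii}}{O_{ij}}, \\
\text{indirect}_{i/j} + \text{direct}_{i/j}
  &= \ln\frac{O_{ij}}{O_{jj}} + \ln\frac{O_{ii}}{O_{ij}}
   = \ln\frac{O_{ii}}{O_{jj}}
   = \text{total}_{i/j}.
\end{align*}
```

The counterfactual odds O_{ij} cancel in the sum, which is why the two effects add up to the total effect on the log odds scale. Method 2 in the output uses the other counterfactual, O_{ji}, in the same way.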

This package can be installed by typing in Stata: ssc install ldecomp

References

Buis, M.L. (2010). Direct and indirect effects in a logit model. The Stata Journal, 10(1): 11-29.

Erikson, R., J.H. Goldthorpe, M. Jackson, M. Yaish, and D.R. Cox (2005). On class differentials in educational attainment. Proceedings of the National Academy of Sciences, 102(27): 9730-9732.

Jackson, M., R. Erikson, J.H. Goldthorpe, and M. Yaish (2007). Primary and secondary effects in class differentials in educational attainment: The transition to A-level courses in England and Wales. Acta Sociologica, 50(3): 211-229.

Example

. use http://fmwww.bc.edu/repec/bocode/w/wisconsin.dta, clear
(Wisconsin Longitudinal Study)

. recode ocf57 2=1 3=2 4=3 5=3
(ocf57: 6196 changes made)

. label define ocf57 1 "lower" 2 "middle" 3 "higher", modify

. label value ocf57 ocf57

. ldecomp college, direct(ocf57) indirect(hsrankq)
(running _ldecomp on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50

Bootstrap results                               Number of obs      =      8923
                                                Replications       =        50

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
2/1          |
       total |   .4367997   .0910963     4.79   0.000     .2582542    .6153452
   indirect1 |   .0593679   .0260807     2.28   0.023     .0082507     .110485
     direct1 |   .3774319   .0863059     4.37   0.000     .2082754    .5465883
   indirect2 |   .0586611   .0260593     2.25   0.024     .0075858    .1097364
     direct2 |   .3781386   .0865485     4.37   0.000     .2085066    .5477706
-------------+----------------------------------------------------------------
3/1          |
       total |   1.410718   .0574691    24.55   0.000      1.29808    1.523355
   indirect1 |   .2058881   .0188002    10.95   0.000     .1690405    .2427358
     direct1 |   1.204829   .0524854    22.96   0.000      1.10196    1.307699
   indirect2 |   .2012494   .0188692    10.67   0.000     .1642666    .2382323
     direct2 |   1.209468   .0524411    23.06   0.000     1.106686    1.312251
-------------+----------------------------------------------------------------
3/2          |
       total |   .9739179   .0801047    12.16   0.000     .8169155     1.13092
   indirect1 |   .1461109   .0291997     5.00   0.000     .0888806    .2033413
     direct1 |   .8278069    .079138    10.46   0.000     .6726993    .9829146
   indirect2 |   .1432144   .0298827     4.79   0.000     .0846454    .2017834
     direct2 |   .8307035   .0799325    10.39   0.000     .6740388    .9873682
------------------------------------------------------------------------------
in equation i/j (comparing groups i and j)
let the first subscript of Odds be the distribution of the indirect variable
let the second subscript of Odds be the conditional probabilities
Method 1: Indirect effect = ln(Odds_ij/Odds_jj)
          Direct effect   = ln(Odds_ii/Odds_ij)
Method 2: Indirect effect = ln(Odds_ii/Odds_ji)
          Direct effect   = ln(Odds_ji/Odds_jj)

value labels
1 lower
2 middle
3 higher
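
The counterfactual odds in the legend can also be approximated by hand, which may help in seeing what they measure. The following is a minimal sketch of Method 1 for the 3/1 comparison, not ldecomp's own implementation. It assumes one logit per class (so class and hsrankq are fully interacted) and averages the predicted probabilities over the observed values of hsrankq, in the spirit of the generalization by Buis (2010). The helper variable touse and the names pr_low, pr_high, odds_jj, odds_ij and odds_ii are hypothetical and chosen here for illustration. Because ldecomp's model specification may differ, the results need not reproduce the table above.

```stata
* Sketch only: Method 1 for the 3/1 (higher versus lower) comparison by hand.
* Assumption: one logit per class, i.e. class and hsrankq fully interacted.
use http://fmwww.bc.edu/repec/bocode/w/wisconsin.dta, clear
recode ocf57 2=1 3=2 4=3 5=3

* touse is a hypothetical helper that marks complete cases
generate byte touse = !missing(college, ocf57, hsrankq)

* Conditional probabilities of the lower class (j = 1), predicted for everyone
logit college hsrankq if ocf57 == 1 & touse
predict double pr_low, pr

* Conditional probabilities of the higher class (i = 3), predicted for everyone
logit college hsrankq if ocf57 == 3 & touse
predict double pr_high, pr

* Odds_jj: lower class distribution, lower class probabilities
summarize pr_low if ocf57 == 1 & touse, meanonly
scalar odds_jj = r(mean) / (1 - r(mean))

* Odds_ij: higher class distribution, lower class probabilities (counterfactual)
summarize pr_low if ocf57 == 3 & touse, meanonly
scalar odds_ij = r(mean) / (1 - r(mean))

* Odds_ii: higher class distribution, higher class probabilities
summarize pr_high if ocf57 == 3 & touse, meanonly
scalar odds_ii = r(mean) / (1 - r(mean))

display "indirect = " ln(odds_ij / odds_jj)
display "direct   = " ln(odds_ii / odds_ij)
display "total    = " ln(odds_ii / odds_jj)
```

By construction the displayed indirect and direct effects sum to the displayed total, since the counterfactual odds cancel. This sketch gives point estimates only; ldecomp additionally reports bootstrap standard errors.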

The example below shows how to add a control variable, here south. Rather than fixing south at its mean, the at() option is used to fix it at 0. Note that the direct and indirect effects have opposite signs: college graduates are more likely to be union members, but this effect is dampened because they tend to end up in higher occupations, in which union membership is less likely.

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. gen byte high = occupation < 3 if occupation < .
(9 missing values generated)

. gen byte middle = occupation >= 3 & occupation < 7 if occupation < .
(9 missing values generated)

. ldecomp union south, direct(collgrad) indirect(high middle) at(south 0)
(running _ldecomp on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50

Bootstrap results                               Number of obs      =      1869
                                                Replications       =        50

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1/0          |
       total |   .5053109   .1393997     3.62   0.000     .2320924    .7785293
   indirect1 |  -.1099997   .0535487    -2.05   0.040    -.2149533   -.0050461
     direct1 |   .6153106   .1257786     4.89   0.000     .3687891     .861832
   indirect2 |  -.1196662   .0532876    -2.25   0.025    -.2241079   -.0152244
     direct2 |    .624977   .1282919     4.87   0.000     .3735295    .8764246
------------------------------------------------------------------------------
in equation i/j (comparing groups i and j)
let the first subscript of Odds be the distribution of the indirect variable
let the second subscript of Odds be the conditional probabilities
Method 1: Indirect effect = ln(Odds_ij/Odds_jj)
          Direct effect   = ln(Odds_ii/Odds_ij)
Method 2: Indirect effect = ln(Odds_ii/Odds_ji)
          Direct effect   = ln(Odds_ji/Odds_jj)

value labels
0 not college grad
1 college grad
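
As a check on how the opposite signs combine, the coefficients reported above can be added up directly. The figures are taken from the output; the odds ratios in the last line are rounded approximations computed here, not part of the ldecomp output.

```latex
\begin{align*}
\text{Method 1:}\quad & -0.1099997 + 0.6153106 = 0.5053109 \\
\text{Method 2:}\quad & -0.1196662 + 0.6249770 = 0.5053108 \approx 0.5053109 \\
\text{Method 1 as odds ratios:}\quad & e^{0.6153106} \times e^{-0.1099997}
  \approx 1.850 \times 0.896 \approx 1.66 \approx e^{0.5053109}
\end{align*}
```

On the odds ratio scale the effects multiply rather than add: the direct effect raises the odds of union membership by a factor of roughly 1.85, and the indirect effect through occupation scales them back by a factor of roughly 0.90.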