seqlogit: Stata module to fit a sequential logit model

Author: Maarten L. Buis

seqlogit fits a sequential logit model. This model is known under a variety of names: sequential logit model (Tutz 1991), sequential response model (Maddala 1983), continuation ratio logit (Agresti 2002), model for nested dichotomies (Fox 1997), and the Mare model (Shavit and Blossfeld 1993), after Mare (1981).

A sequential logit model can be estimated quite simply by estimating a number of logit models. The seqlogit package serves three additional purposes: First, it makes it easier to test hypotheses across transitions since the entire model is estimated simultaneously. Second, it implements the decomposition proposed by Buis (2010a) of the effect of an explanatory variable on the outcome of the process described by the sequential logit into the contributions of each of the transitions. Third, it implements and extends the strategy proposed by Buis (2010b) of doing a sensitivity analysis to investigate the potential influence of unobserved variables.
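As a minimal sketch of the "number of logit models" route, the transitions used in the examples on this page could be fitted one at a time as shown here. The variable names pass1, pass2, pass3 and white_X_byr are made up for this illustration; the data preparation is copied from the examples. With this setup the odds ratios are expected to match those of the regular seqlogit call, but the three separate fits do not allow tests across transitions.

```stata
* Sketch only: pass1-pass3 and white_X_byr are hypothetical names
sysuse nlsw88, clear
gen ed = cond(grade < 12, 1, cond(grade == 12, 2, cond(grade < 16, 3, 4))) if grade < .
gen byr = (1988 - age - 1950)/10
gen white = race == 1 if race < .
gen white_X_byr = white*byr

* one pass/fail indicator per transition;
* missing for respondents who are not at risk of making that transition
gen pass1 = ed >= 2 if ed < .
gen pass2 = ed >= 3 if ed >= 2 & ed < .
gen pass3 = ed == 4 if ed >= 3 & ed < .

* one ordinary logit per transition, reported as odds ratios
logit pass1 byr south white white_X_byr, or
logit pass2 byr south white white_X_byr, or
logit pass3 byr south white white_X_byr, or
```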

For this last purpose, the seqlogit package allows one to estimate a sequential logit given a scenario concerning the unobserved variables. Such a scenario is estimated only when the sd() option is specified. If the sd() option is not specified, a regular sequential logit model is estimated, which assumes that there is no unobserved heterogeneity. The scenarios assume that the unobserved variables add up either to a standardized normally (Gaussian) distributed variable (when the pr() option is not specified) or to a standardized discrete variable (when the pr() option is specified). The effects of this aggregate unobserved variable during each transition are specified in the sd() option. The correlation during the first transition between this unobserved variable and the variable specified in the ofinterest() option is specified in the rho() option. The scenarios are estimated using maximum simulated likelihood, while the regular sequential logit model is estimated using regular maximum likelihood.
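One way to write down such a scenario is sketched here. The notation is an assumption introduced for this illustration and is not taken from the package documentation: k indexes the transitions, x_i holds the explanatory variables, u_i is the standardized aggregate unobserved variable, and sigma_k is the value given for transition k in the sd() option.

```latex
\Pr(\text{pass}_{ik} = 1 \mid \text{at risk},\, x_i,\, u_i)
  = \frac{\exp(x_i'\beta_k + \sigma_k u_i)}{1 + \exp(x_i'\beta_k + \sigma_k u_i)},
\qquad \mathrm{E}(u_i) = 0, \quad \mathrm{Var}(u_i) = 1
```

In this reading, the sigma_k are fixed by the user rather than estimated, and setting every sigma_k to zero gives back the regular sequential logit without unobserved heterogeneity.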

Supporting materials

References

Agresti, Alan 2002, Categorical Data Analysis, 2nd edition. Hoboken, NJ: Wiley-Interscience.

Buis, Maarten L. 2009a, Not all transitions are equal: The relationship between inequality of educational opportunities and inequality of educational outcomes. In: Maarten L. Buis, Inequality of Educational Outcome and Inequality of Educational Opportunity in the Netherlands during the 20th Century.

Buis, Maarten L. 2009b, The consequences of unobserved heterogeneity in a sequential logit model. In: Maarten L. Buis, Inequality of Educational Outcome and Inequality of Educational Opportunity in the Netherlands during the 20th Century.

Fox, John 1997, Applied Regression Analysis, Linear Models, and Related Methods. Thousand Oaks: Sage.

Maddala, G.S. 1983, Limited Dependent and Qualitative Variables in Econometrics. Cambridge: Cambridge University Press.

Mare, Robert D. 1981, Change and Stability in Educational Stratification, American Sociological Review, 46(1): 72-87.

Shavit, Yossi and Hans-Peter Blossfeld 1993, Persistent Inequality: Changing Educational Attainment in Thirteen Countries. Boulder: Westview Press.

Tutz, Gerhard 1991, Sequential models in categorical regression, Computational Statistics & Data Analysis, 11(3): 275-295.

Examples

An example showing the decomposition.

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. gen ed = cond(grade< 12, 1, ///
>          cond(grade==12, 2, ///
>          cond(grade<16,3,4))) if grade < .
(2 missing values generated)

. gen byr = (1988-age-1950)/10

. gen white = race == 1 if race < .

. seqlogit ed byr south,              ///
>     ofinterest(white) over(byr)     ///
>     tree(1 : 2 3 4, 2 : 3 4, 3 : 4) ///
>     levels(1=6, 2=12, 3=14, 4= 16)  ///
>     or

Transition tree:

Transition 1: 1 : 2 3 4
Transition 2: 2 : 3 4
Transition 3: 3 : 4

Computing starting values for:

Transition 1
Transition 2
Transition 3

Iteration 0:   log likelihood = -2881.2013
Iteration 1:   log likelihood = -2881.2013

                                                  Number of obs   =       2244
                                                  LR chi2(12)     =     110.38
Log likelihood = -2881.2013                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
          ed | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_2_3_4v1     |
         byr |   3.377124   1.062584     3.87   0.000     1.822735    6.257061
       south |   .6440004   .0807557    -3.51   0.000     .5036723    .8234254
       white |    2.17841   .3029219     5.60   0.000     1.658726    2.860913
_white_X_byr |   .3330505   .1351488    -2.71   0.007     .1503489    .7377681
-------------+----------------------------------------------------------------
_3_4v2       |
         byr |   1.139388   .3722391     0.40   0.690     .6005969    2.161523
       south |   .8258418   .0793651    -1.99   0.046     .6840607     .997009
       white |   1.090765   .1244936     0.76   0.447     .8721274    1.364214
_white_X_byr |   .9148277   .3372712    -0.24   0.809     .4441455    1.884314
-------------+----------------------------------------------------------------
_4v3         |
         byr |   1.217693   .5757529     0.42   0.677     .4820255    3.076134
       south |   1.501026   .2063442     2.95   0.003     1.146501    1.965178
       white |   1.340784   .2183215     1.80   0.072     .9744438     1.84485
_white_X_byr |   .7585029   .4037806    -0.52   0.604     .2671958    2.153203
------------------------------------------------------------------------------

. seqlogitdecomp,                              ///
>     overat(byr -.5, byr 0, byr .4)           ///
>     subtitle("1945" "1950" "1954")           ///
>     eqlabel(`""finish" "high school""'       ///
>             `""high school v" "some college""' ///
>             `""some college v" "college""')  ///
>     xline(0) yline(0)

[do-file]

[Figure: second example graph]
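Because all transitions are estimated in one model, hypotheses across transitions can be tested with Stata's test command. The lines here are a sketch that assumes the seqlogit estimates from the decomposition example are still the active estimation results, and that the equation names are the ones shown in its output table (_2_3_4v1, _3_4v2 and _4v3).

```stata
* Sketch: run after the seqlogit command of the decomposition example
* H0: the effect of white (at byr = 0) is equal in transitions 1 and 2
test [_2_3_4v1]white = [_3_4v2]white

* H0: the effect of white (at byr = 0) is equal in all three transitions
test [_2_3_4v1]white = [_3_4v2]white = [_4v3]white
```

The tests are carried out on the coefficients, that is, on the log odds ratios, even though the or option displays odds ratios.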

An example showing how to estimate one scenario.

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. gen ed = cond(grade< 12, 1, ///
>          cond(grade==12, 2, ///
>          cond(grade<16,3,4))) if grade < .
(2 missing values generated)

. gen byr = (1988-age-1950)/10

. gen white = race == 1 if race < .

. seqlogit ed byr south,              ///
>     ofinterest(white) over(byr)     ///
>     tree(1 : 2 3 4, 2 : 3 4, 3 : 4) ///
>     levels(1=6, 2=12, 3=14, 4= 16)  ///
>     or pr(.25 .25 .25 .25) sd(1 1.5 2)

Transition tree:

Transition 1: 1 : 2 3 4
Transition 2: 2 : 3 4
Transition 3: 3 : 4

Computing starting values for:

Transition 1
Transition 2
Transition 3

Iteration 0:   log likelihood = -3002.0013
Iteration 1:   log likelihood = -2887.2733
Iteration 2:   log likelihood = -2881.5685
Iteration 3:   log likelihood = -2881.5522
Iteration 4:   log likelihood = -2881.5522

                                                  Number of obs   =       2244
                                                  LR chi2(12)     =     109.68
Log likelihood = -2881.5522                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
          ed | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_2_3_4v1     |
         byr |   4.149592   1.533882     3.85   0.000     2.010767    8.563455
       south |   .6183716   .0867931    -3.42   0.001     .4696529    .8141831
       white |   2.400269   .3757075     5.59   0.000     1.766134    3.262092
_white_X_byr |   .2723963   .1258161    -2.82   0.005     .1101649    .6735338
-------------+----------------------------------------------------------------
_3_4v2       |
         byr |   1.678533   .8038204     1.08   0.279     .6566054    4.290971
       south |   .6921869   .1001355    -2.54   0.011     .5212955    .9191001
       white |   1.344001   .2299586     1.73   0.084     .9610787    1.879491
_white_X_byr |   .6449065   .3511892    -0.81   0.421     .2218033    1.875105
-------------+----------------------------------------------------------------
_4v3         |
         byr |   1.769152   1.151783     0.88   0.381     .4938572    6.337656
       south |   1.508165   .3110586     1.99   0.046     1.006674    2.259482
       white |   1.826855    .432249     2.55   0.011     1.148954    2.904727
_white_X_byr |   .5226782   .3929925    -0.86   0.388     .1197377    2.281591
------------------------------------------------------------------------------

Distribution of the standardized unobserved variable is:

             |    point 1    point 2    point 3    point 4
-------------+--------------------------------------------
  mass point |  -1.341641  -.4472136   .4472136   1.341641
  proportion |        .25        .25        .25        .25

The effect of the standardized unobserved variable is fixed at:
--------------------
   equation |     sd
------------+-------
   _2_3_4v1 |      1
     _3_4v2 |    1.5
       _4v3 |      2
--------------------

. uhdesc

             |  p(atrisk)    mean(e)      sd(e)  corr(e,x)
-------------+--------------------------------------------
     2 3 4v1 |      1.000      0.000      1.000      0.000
       3 4v2 |      0.861      0.179      1.469     -0.049
         4v3 |      0.432      1.358      1.517     -0.061

.

[do-file]
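The mass points reported in the scenario output above are consistent with standardizing four equally spaced points that each have probability .25. This is an inference from the printed numbers, not a statement about how seqlogit constructs the points internally. Taking the points 1, 2, 3 and 4 as an assumed starting scale, the arithmetic is:

```latex
\mu = \tfrac{1}{4}(1 + 2 + 3 + 4) = 2.5,
\qquad
\sigma = \sqrt{\tfrac{1}{4}\left(1.5^2 + 0.5^2 + 0.5^2 + 1.5^2\right)}
       = \sqrt{1.25} \approx 1.118034

\frac{1 - 2.5}{1.118034} \approx -1.341641, \quad
\frac{2 - 2.5}{1.118034} \approx -0.4472136, \quad
\frac{3 - 2.5}{1.118034} \approx 0.4472136, \quad
\frac{4 - 2.5}{1.118034} \approx 1.341641
```

The resulting variable has mean 0 and standard deviation 1, which is what "standardized" refers to in the output.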