Recodes, dummy variables, and product terms can be generated temporarily within the program itself, so that the user will not have to create such variables before running a regression.

One numeric variable is specified as the dependent variable or the variable to be predicted. In order for this variable to be used as a dependent variable in logit or probit regression, it must be coded to have exactly two categories: 0 and 1. If the variable you want to use as a dependent variable is not already coded as a simple 0/1 variable, you can create a dummy variable, or you can recode the variable temporarily. If the dependent variable is left as anything other than a simple 0/1 variable, the program will recode the dependent variable automatically. The lowest valid score will be recoded to the value '0', and all other scores will be recoded to the value '1'.
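The automatic recoding rule described above can be illustrated with a short Python sketch. It is not the program's own code: the function name `recode_dependent` and the use of `None` for missing scores are assumptions made for this illustration, and missing scores are simply passed through unchanged.

```python
def recode_dependent(values, missing=(None,)):
    """Recode scores to 0/1: lowest valid score -> 0, every other valid score -> 1."""
    valid = [v for v in values if v not in missing]
    if not valid:
        raise ValueError("no valid scores to recode")
    lowest = min(valid)
    # Assumption of this sketch: missing scores are passed through unchanged.
    return [v if v in missing else (0 if v == lowest else 1) for v in values]


scores = [2, 1, 3, None, 1, 2]
print(recode_dependent(scores))  # [1, 0, 1, None, 0, 1]
```

A variable that is already coded 0/1 comes out of this rule unchanged, since 0 is then the lowest valid score.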

Ordinarily this program is invoked by the Web interface for the SDA programs, and the user does not have to deal with the keywords given in this document. Output from the program is generally in HTML, which can be viewed with a Web browser.

It is also possible to run the program directly by preparing a command file, which specifies the variables to be analyzed and the options to use. This document explains how to prepare such a file. The name of this batch command file is specified to the program after the `-b' option flag.
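As a rough illustration, a command file could be written and the argument list assembled from Python as sketched below. Several things here are assumptions: the one-specification-per-line `keyword = value` layout, the helper names `write_command_file` and `build_invocation`, the file name `mylogit.cmd`, and above all the executable name `logit`, which is a placeholder for whatever the program is called in a given installation. Only the `-b` flag followed by the command file name comes from this document.

```python
import os
import tempfile


def write_command_file(path, settings):
    """Write (keyword, value) pairs as 'keyword = value' lines (assumed layout)."""
    with open(path, "w", encoding="utf-8") as handle:
        for keyword, value in settings:
            handle.write(f"{keyword} = {value}\n")


def build_invocation(program, command_file):
    """Return the argument list: program, then -b, then the command file name."""
    return [program, "-b", command_file]


settings = [
    ("dep", "spend(d:1-2)"),
    ("indep", "age, educ gender"),
    ("savefile", "mylogit.htm"),
]
command_path = os.path.join(tempfile.mkdtemp(), "mylogit.cmd")
write_command_file(command_path, settings)

# "logit" is a placeholder name; substitute the path of the installed program.
argv = build_invocation("logit", command_path)
print(argv)
```

The resulting list could be handed to `subprocess.run` once the real program path is known; the sketch stops short of running anything.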

Keyword          Possible Specification            Default (if no keyword)
_____________________________________________________________________
COefficients=    PROBIT                            Calculate LOGIT regression
                                                   coefficients and results
STUdy=           path(s) of dataset(s)             Look for variables in
                                                   current directory only
SAvefile=        filename to receive output        Output sent to screen
                 (overwrites existing file)        (standard output)
DEP=             name of dependent variable        REQUIRED
INDep=           names of independent vars         REQUIRED
                 (separated by spaces/commas)
Weight=          name of weight variable           No weighting
Filter=          name(s) and codes of filter       No filter
                 variable(s)
STRatum=         name of variable giving           No stratification for
                 sample stratum                    computing standard errors
                 $1: Force one stratum
CLuster=         name of variable giving           No cluster variable for
                 sample cluster                    computing standard errors
GVARCase=        LOWER or UPPER                    No force to lower/upper case
DUMMYgenmax=     A number between 1 and 100        Max of 25 dummy vars can be
                 (max dummy vars)                  generated by the "m:" syntax
                                                   for a single categorical var
NDEcimals=       number of decimals for main       3 decimal places
                 results (coefficients, SE's)

Keyword          Possible Specification            Default (if no keyword)
_____________________________________________________________________
COLORcoding=     Yes                               No color coding of
                                                   coefficients or headings
LAnguagefile=    Name of file with non-English     English labels on
                 labels and messages               output
RUNtitle=        Title or comments for run         No title or comments
SHORTlist=       Yes (omit list of                 Output list of all
                 indep vars at top)                independent variables
TExt=            Yes                               No text for variables

You can specify the desired number of decimal places in parentheses for univariate statistics and 'BPRODuct' if the default, listed below, is not satisfactory. Note, however, that the number of decimals specified for 'BPRODuct' will override the number specified for 'UNIvariate'.

Keyword          Possible Specification            Default (if no keyword)
_____________________________________________________________________
OTHERstats=      TTests (ndec)                     No T-tests
                 EXPB                              No exp(B) for logit
                 FTest (ndec)                      No Global F-test
                 UNIvariate (ndec)                 No univariate statistics
                 BPRODuct (ndec)                   No B*Mean statistics
                 COEFF (ndec)                      No covar of coefficients matrix
                 CONF (90, 95, or 99)              No confidence intervals
                 ('CONF' alone gives 95% CI)

To change the number of decimals for the other (optional) statistics, put the desired number of decimals in parentheses after specifying the statistic. Note that requesting the BPRODUCT statistics will force the output of the univariate statistics as well. And the specification of decimal places for the BPRODUCT statistics will override any specification of decimal places for the univariate statistics.
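The interaction between BPRODUCT and the univariate statistics can be sketched in Python as follows. This illustrates the rule as stated here, not the program's implementation: the function name `resolve_otherstats` and the convention that `None` means "no decimals given, use the program's default" are assumptions of the sketch.

```python
def resolve_otherstats(requested):
    """Map statistic name -> decimals (None = no decimals given, program default)."""
    resolved = {name.lower(): ndec for name, ndec in requested.items()}
    if "bproduct" in resolved:
        if resolved["bproduct"] is not None:
            # A decimals setting on BPRODUCT overrides the one for UNIVARIATE.
            resolved["univariate"] = resolved["bproduct"]
        else:
            # BPRODUCT still forces the univariate statistics to be output.
            resolved.setdefault("univariate", None)
    return resolved


print(resolve_otherstats({"univariate": 2, "bproduct": 4}))
# {'univariate': 4, 'bproduct': 4}
```

Requesting `bproduct` alone still yields a `univariate` entry, which reflects the forced output of the univariate statistics.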

The appending feature, in which a repeated keyword adds to the earlier specification instead of replacing it, applies to the keywords for specifying the independent variables, the filter variables, and the 'otherstats=' keyword. It also applies to the 'study=' keyword, for specifying the locations of the SDA dataset directories. If any other keyword is repeated, the program will print an error message and stop.
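The rule for repeated keywords can be sketched as a small Python reader for command-file lines. This is an illustration under stated assumptions, not the program's parser: it assumes one `keyword = value` specification per line and `#` comment lines, the names `APPENDABLE` and `parse_command_lines` are hypothetical, and keyword abbreviations or alternate spellings are not handled (only the spellings used in this document's examples are recognized).

```python
# Keywords whose repeated specifications are appended (spellings from the examples).
APPENDABLE = {"indep", "filters", "otherstats", "study"}


def parse_command_lines(lines):
    """Collect 'keyword = value' lines, appending or rejecting repeated keywords."""
    settings = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        keyword, _, value = line.partition("=")
        keyword, value = keyword.strip().lower(), value.strip()
        if keyword not in settings:
            settings[keyword] = value
        elif keyword in APPENDABLE:
            settings[keyword] += " " + value  # append to the earlier specification
        else:
            raise ValueError(f"keyword repeated: {keyword}")
    return settings


parsed = parse_command_lines([
    "study = /sa/testdata",
    "study = /sa/testdata/newvars",
    "dep = spend(d:1-2)",
    "otherstats = conf(90)",
    "otherstats = ttests ftest(4)",
])
print(parsed["study"])       # /sa/testdata /sa/testdata/newvars
print(parsed["otherstats"])  # conf(90) ttests ftest(4)
```

Repeating a keyword outside the appendable set, such as `dep`, raises an error in this sketch, mirroring the program's behavior of printing an error message and stopping.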

# as a dummy variable

study = /sa/testdata
dep = spend(d:1-2)
indep = age, educ gender
savefile = mylogit.htm
-----------------------------------

# Redefine some ranges; use weight and filter variables;

# and request descriptive text for the variables.

dep = spend(d:1-2)
indep = age(18-30) educ gender
coefficients = probit
otherstats = ttests
otherstats = univariate
weight = wtvar
filters = var21(1-3) var30(1)
text = yes
savefile = mylogit.htm
-----------------------------------

# for calculating complex standard errors.

dep = spend(d:1-2)
indep = age, educ gender
stratum = stratvar
cluster = psuvar
savefile = mylogit.htm
-----------------------------------

# (necessary if not the current directory).

# Get 90% confidence intervals.

# Also request some optional statistics, most with a specified number of decimals.

study = /sa/testdata
study = /sa/testdata/newvars
dep = spend(d:1-2)
indep = age educ gender recodedvar
otherstats = conf(90)
otherstats = ttests ftest(4) coeff(8) bproduct(2)
savefile = mylogit.htm

CSM, UC Berkeley/ISA

January 30, 2017