Several optional statistics such as medians, percentiles, and standard errors can also be calculated and displayed in each cell of the output table. Note that the standard error option refers only to the mean, not to the median or percentile. Each statistic can be displayed with a specified number of decimal places.
Ordinarily this program is invoked by the Web interface for the
SDA programs, and the user does not have to deal with the
keywords given in this document. Output from the program is
usually in HTML, which is sent to the user's Web browser.
However, output can also be produced as a CSV file so that the
user can feed the results into other procedures, either for
special formatting or for other purposes.
CSV output is produced if 'TYPE = CSV' is specified.
It is also possible to run the program in batch mode by preparing a command file, which specifies the variables to be analyzed and the options to use. This document explains how to prepare such a file. The name of this batch command file is specified to the program after the `-b' option flag.
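As a rough sketch of what such a command file looks like, the fragment below names one dependent variable and one row variable and leaves every other option at its default. The dataset path and the variable names (/archive/mystudy, income, region) are hypothetical placeholders, not names from a real SDA dataset.

```
study = /archive/mystudy
dep = income
row = region
savefile = mymeans.htm
```

As in the examples at the end of this document, the keywords are written here in lower case and in shortened form (for instance 'dep' for DEPendent=).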
Keyword          Possible Specification           Default (if no keyword)
_____________________________________________________________________
STUdy=           path of dataset directory        Look for variables in
                                                  current directory only
SAvefile=        filename to receive output       Output sent to screen
                 (overwrite existing file)        (standard output)

Variable Specifications

DEPendent=       variable name(s)                 REQUIRED
                 (separated by spaces/commas)
ROWvar=          variable name(s)                 REQUIRED
                 (separated by spaces/commas)
COLUMNvar=       variable name(s)                 No column variable
CONtrolvar=      variable name(s)                 No control variable
Weight=          name of weight variable          No weighting
Filter=          name(s) and codes of filter      No filter
                 variable(s)
GVARCase=        LOWER or UPPER                   No force to lower/upper case
STRatum=         name of variable giving          No stratification for
                 sample stratum                   computing standard errors
                 $1: Force one stratum
CLuster=         name of variable giving          No cluster variable for
                 sample cluster                   computing standard errors

General Options

COLORcoding=     Yes                              No color coding of cells
                                                  or colored headings
LAnguagefile=    pathname of file with            English labels on
                 non-English labels               output
NOTABle=         Yes (to suppress tables of       Display the tables
                 means, confidence intervals,
                 and diagnostic information
                 but still get other info)
TExt=            Yes                              No text for variables
RUNtitle=        title or comments for run        No title or comments
Instead of displaying the main statistic directly, it is possible to display the DIFFERENCE from something else, by adding the `difference=' keyword. The difference for each cell can be the difference between the cell mean and either the overall mean, the mean in the same column of a specified row, or the mean in the same row of a specified column. If a row or column difference is requested, you must also specify the BASE CATEGORY to use for the comparison.
For differences from a specified row or column, it is possible to obtain the average of the differences, instead of the difference in the marginal column or row. This option is set in the Global Specifications section for the dataset in the SDA Manager (or in the general section of the HARC file by setting XMEANS=YES).
For each statistic the user can specify the number of desired decimal places (in parentheses, after the name of the statistic). See below for the default number of decimals for each statistic.
Keyword          Possible Specification           Default (if no keyword)
_____________________________________________________________________
MAINstat=        MEANs (ndec)                     Display means, with
                 TOTALs (ndec)                    two decimal places
                 LOgit (ndec)
                 PRobit (ndec)
                 LP (ndec)
DIFference=      Overall (ndec)                   Display main statistic
                 Row (ndec)
                 Column (ndec)
BASEcat=         code for comparison row/column   REQUIRED for row/column
                                                  differences
AVGDiffs=        Yes                              No average differences
                                                  from a row or column
                                                  are displayed
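For instance, a run that displays each cell as a difference from a base row might look like the sketch below. The dataset path and variable names are hypothetical, and the fragment assumes that code 1 of the row variable is the comparison category. The number in parentheses after each statistic is the number of decimal places.

```
study = /archive/mystudy
dep = income
row = region
column = sex
mainstat = means(2)
difference = row(1)
basecat = 1
savefile = mymeans.htm
```

With 'difference = overall(1)' in place of the row difference, no 'basecat=' line would be needed, since a base category is required only for row or column differences.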
If confidence intervals are requested, the upper and lower bounds of the confidence interval for the mean (or total or difference) in each cell are shown. (Confidence intervals for medians and percentiles are not available.) The default level of confidence is the 95 percent level, but the 90 or 99 percent levels can also be specified (in parentheses). The number of decimal places displayed will be the same as requested for the means. If both complex and SRS standard errors have been requested, only the complex standard errors are used for the confidence intervals.
Keyword          Possible Specification           Default (if no keyword)
_____________________________________________________________________
OTHERSTats=
                 CONFidence (level)               No confidence intervals
                 (level can be 90, 95, or 99)

                 (EITHER medians OR percentiles
                 can be specified, but not both)
                 MEDIAN (ndec)                    No Median of dep variable
                 PERCENTile (nth, ndec)           No nth percentile

                 MINimum (ndec)                   No minimum value
                 MAXimum (ndec)                   No maximum value
                 Ncases                           No unweighted N's
                 WNcases (ndec)                   No weighted N's

                 (statistics for means only)
                 SER (ndec)                       No standard errors for
                                                  simple random sample
                 ZSTATistic (ndec)                No Z- or T-statistics
                 P (ndec)                         No p-value
                                                  (only for differences
                                                  from a row or a col)
                 SD (ndec)                        No standard deviations

                 (for complex samples only)
                 SEC (ndec)                       No standard errors for
                                                  complex sample design
                 DEFT (ndec)                      No design effect

                 (for cluster samples only)
                 RHO (ndec)                       No cluster coefficient

REMEDIAN=        ASNEEDED or ALWAYS               NEVER: No remedian estimates
                                                  for medians or percentiles
                 (see section with additional
                 information below)
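A minimal sketch of a run that combines several of these optional statistics follows. The dataset path and variable names are hypothetical. The fragment requests 90 percent confidence intervals, medians with no decimal places, minimum and maximum values, and unweighted N's. Only the median is requested, because medians and percentiles cannot be specified together.

```
study = /archive/mystudy
dep = income
row = region
column = sex
mainstat = means(1)
otherstats = confidence(90) median(0) min(0) max(0) ncases
savefile = mymeans.htm
```

The confidence bounds here would be shown with one decimal place, because they follow the number of decimals requested for the means. Going by the 'PERCENTile (nth, ndec)' entry in the table, replacing 'median(0)' with 'percentile(75, 0)' should request the 75th percentile instead.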
An ANOVA table can be produced. For simple random samples the ANOVA table and an F-test are produced. For complex samples the F-test is omitted and the only output is the eta-squared statistics, which show descriptively the proportion of the variance of the dependent variable that is explained by the row and column variables and their interaction.
For complex samples, a table with diagnostic information in each cell can also be produced.
A multiple classification analysis (MCA) can be carried out. The default number of decimals is 3, but another number of decimal places can be specified.
Keyword          Possible Specification           Default (if no keyword)
_____________________________________________________________________
ANova=           Yes                              No anova table
OTHERTABles=
                 DIAGnostics                      No table with diagnostics
                 MCA (ndec)                       No Multiple Classification
                                                  Analysis
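The fragment below sketches a run that requests both an ANOVA table and an MCA with two decimal places instead of the default three. The dataset path and variable names are hypothetical. No stratum or cluster variable is given, so the run is treated as a simple random sample and the F-test would be included. The diagnostics table is left out because it applies to complex samples.

```
study = /archive/mystudy
dep = income
row = region
column = sex
anova = yes
othertables = mca(2)
savefile = mymeans.htm
```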
The statistic charted is the statistic specified with the 'MAINSTAT=' keyword (default is MEANS). However, if MEDIANS or PERCENTILES are specified with the 'OTHERSTats=' keyword, the chart can be based on the median or percentile (only one can be specified), by specifying 'PERCENTile' as the chart type.
Keyword          Possible Specification           Default (if no keyword)
_____________________________________________________________________
CHARTtype=       PERCENTile                       Chart the 'MAINSTAT' instead
                                                  of medians or percentiles
TBLProperties=   PATHNAME for chart properties    REQUIRED for charts
                 file
                 Required location for SDA 4 is:
                   SDAROOT/tmpdir/xxx.cht
                 where 'SDAROOT' is the pathname
                 of the SDA installation on
                 your server, and
                 where 'xxx' is any name.
                 (See the last example below)
                 (This is a temporary filename,
                 to be passed on to the charting
                 servlet. The MEANS program
                 will generate multiple files
                 from the given filename, if
                 multiple charts are generated
                 because a control variable
                 was specified or because
                 multiple dependent or row or
                 column variables were
                 specified.)
CH_URL=          URL of chart-generation          REQUIRED for charts
                 servlet on the server.
                 Required URL for SDA 4 is:
                   http://SDAURL/sdaweb/charts
                 where 'SDAURL' is the
                 hostname of the SDAWEB
                 application on your server.
                 (See the last example below)
CH_MAXCHarts=    Maximum number of charts to      25
                 create on this run (1-100)
CH_TYPe=         Type of chart to create          bar
                 (bar or line)
CH_ORientation=  Orientation of BAR charts        vertical
                 (vertical or horizontal)
CH_EFfects=      Visual effects for BAR charts    use2D
                 (use2D - 2 dimensional;
                 use3D - 3 dimensional)
CH_SHOWMeans=    Yes                              No means or
                 Put means (or the specified      other stats
                 statistic) on the chart          on the chart
CH_FONT=         Font to use in charts            SansSerif
CH_COLor=        Yes (create charts in color)     Greyscale charts
CH_BARcolors=    Path for custom palette file     Standard colors
                 for bar charts
                 (See additional info below)
CH_LINEcolors=   Path for custom palette file     Standard colors
                 for line charts
                 (See additional info below)
CH_WIdth=        Width of chart in pixels         600
CH_HEight=       Height of chart in pixels        400
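As an illustration of charting the median rather than the main statistic, the sketch below requests medians with 'otherstats=' and then sets the chart type to percentile. The dataset path, the variable names, and the file name 'medchart' are hypothetical. SDAROOT and SDAURL are the placeholders used in the table above and must be replaced with the values for your own server.

```
study = /archive/mystudy
dep = income
row = region
otherstats = median(0)
charttype = percentile
tblproperties = SDAROOT/tmpdir/medchart.cht
ch_url = http://SDAURL/sdaweb/charts
ch_type = bar
ch_orientation = horizontal
ch_color = yes
ch_width = 800
ch_height = 500
savefile = mymeans.htm
```

Any chart option left out of the file keeps the default listed in the table, so a run that omitted the last five 'ch_' lines would produce a vertical greyscale bar chart of 600 by 400 pixels.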
By default the various statistics generated for each cell of a table (such as the mean and the number of cases) are output in separate sections (separate series of rows) in the CSV file.
If CSVCOMBine = yes, all of the statistics in each cell are output in the same section of the CSV file.
Keyword          Possible Specification           Default (if no keyword)
_____________________________________________________________________
TYPE=            csv (produce CSV output)         Standard HTML output
CSVCOMBine=      yes (combine statistics)         Not combined
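A command file that writes combined CSV output to a file might look like the following sketch. The dataset path, variable names, and output file name are hypothetical.

```
study = /archive/mystudy
dep = income
row = region
column = sex
otherstats = ncases
type = csv
csvcombine = yes
savefile = mymeans.csv
```

With the 'csvcombine' line removed, the means and the N's would be written in separate sections of the file.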
Briefly, the variables will cycle in the following order:
control, column, row, dependent. All of the tables will be
produced using the same weight, filters, and other options.
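The sketch below lists two dependent variables and two row variables with a single column variable. The dataset path and variable names are hypothetical. Assuming that one table is produced for each combination of the listed variables, which this document implies but does not state outright, the run would yield four tables (2 dependent variables by 2 row variables by 1 column variable), all using the same weight.

```
study = /archive/mystudy
dep = income hours
row = region educ
column = sex
weight = wtvar
savefile = mymeans.htm
```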
For further information on this method of estimating the median
or percentile, see Peter J. Rousseeuw and Gilbert W. Bassett,
Jr., "The Remedian: A Robust Averaging Method for Large Data
Sets." Journal of the American Statistical Association,
March 1990, vol. 85, pp. 97-104. Note that SDA uses a base
of 101 to calculate the remedian.
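A request for medians with remedian estimation enabled could be sketched as follows. The dataset path and variable names are hypothetical. The 'asneeded' setting is one of the two values listed for the REMEDIAN= keyword. The conditions under which the program then switches to the remedian estimate are not spelled out in this section, so treat the fragment only as an illustration of the syntax.

```
study = /archive/mystudy
dep = income
row = region
otherstats = median(0) ncases
remedian = asneeded
savefile = mymeans.htm
```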
Confidence intervals were formerly specified as an OTHERTABle and were output as a separate table. In SDA 4.1.2 and later, they are specified as an OTHERSTat and are shown as a row in the main table of results. Batch files using the older syntax will still run, but the confidence intervals will be displayed in the main table.
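Under the current syntax, a request for 95 percent confidence intervals would be written roughly as in the sketch below. The dataset path and variable names are hypothetical.

```
study = /archive/mystudy
dep = income
row = region
otherstats = confidence(95) ncases
savefile = mymeans.htm
```

An older batch file would carry the same request as 'othertables = confidence', as one of the examples in this document still does. Either form places the intervals in the main table.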
study = /archive/nes84
dep = vardep
row = var1
column = var3
otherstats = ncases
anova = yes
savefile = mymeans.htm

study = /archive/nes84
dep = vardep1 vardep2
row = var1(1-9) var2 var3(0-9)
column = var3, var4
weight= wtvar
filters= var21(1-3) var30(1)
otherstats = se, ncases
anova
savefile = mymeans.htm

study = /archive/nes94
dep = vote
row = party
column = sex
diffs = col(3)
basecat = 1
otherstats = se p ncases
anova
text
runtitle= Test run to demonstrate batch mode
savefile= mymeans.htm

study = /archive/nes94
dep = vote
row = party
column = sex
stratum = stratvar
cluster = psuvar
otherstats = sec ser deft rho ncases
othertables = confidence diagnostics
savefile= mymeans.htm

study = /sa/sdatest
dep = vardep
row = var1
column = var3
savefile = mymeans.htm
tblproperties = /var/www/sda/tmpdir/testing.cht
ch_url=http://sda.berkeley.edu/sdaweb/charts
ch_color = yes
ch_showmeans= yes