SDA 4.1 Documentation for TABLES


NAME

tables - run tables in batch mode

USAGE

tables -b filename [-t language_file]

DESCRIPTION

TABLES is a crosstabulation program. Ordinarily this program is invoked by the Web interface for the SDA programs, and the user does not have to deal with the keywords given in this document.

Output from the program is usually in HTML, which is sent to the user's Web browser. However, output can also be produced in XML or as a CSV file so that the user can feed the results into other procedures, either for special formatting or for other purposes.
XML output is produced if 'TYPE = XML' is specified.
CSV output is produced if 'TYPE = CSV' is specified.

Meaning of the option flags:

-b filename
It is possible to run this program in batch mode by preparing a command file, which specifies the variables to be analyzed and the options to use. This document explains how to prepare such a file. The name of this batch command file is specified to the program after the `-b' option flag.

-t language_file
Write the strings used by all of the analysis programs into the specified file, so that they can be replaced with strings in another language.
See the internationalization document for details.

TABLES and the other analysis programs can then use the non-English strings to output the analysis results, if the name of the file containing the modified strings is passed to the program. (See the 'LAnguagefile=' option below.)

Ordinarily the name of the file with non-English strings will be specified within the SDA Manager, which will then arrange for this filename to be passed to the analysis programs when they are run interactively.


KEYWORDS


The batch command file contains specifications for the analysis. These specifications are given in the form "keyword = something" with one keyword per line. Keywords may be given in any order, either in upper or in lower case. The valid keywords are as follows (with significant characters shown in capital letters):

Basic Specifications for the Tables


Keyword       Possible Specification          Default (if no keyword)
_____________________________________________________________________

STUdy=        path(s) of dataset(s)           Look for variables in
                                                current directory only
ROWvar=       variable name(s)                REQUIRED
               (separated by spaces/commas)

COLUMNvar=    variable name(s)                No column variable

CONtrolvar=   variable name(s)                No control variable

Weight=       name of weight variable         No weighting

Filter=       name(s) and codes of filter     No filter
                variable(s)

GVARCase=     LOWER or UPPER                  Do not convert all variable
                                                names to lower/upper case

STRatum=      name of variable giving         No stratification for
                sample stratum                  computing standard errors
              $1: Force one stratum

CLuster=      name of variable giving         No cluster variable for
                sample cluster                  computing standard errors

SAvefile=     filename to receive output      Output sent to screen
                (overwrite existing file)       (standard output)

TExt=         Yes                             No text for variables


Statistics in Each Cell

The user can specify one or more types of percentaging. For each percentage, the user can request standard errors and/or confidence intervals.

If a stratum and/or cluster variable has been specified in the command file, standard errors are calculated that take into account the complex design. The confidence intervals created from those standard errors will also reflect the complex design.

By default the percentages, confidence intervals, and weighted N's are displayed with one decimal place. The user can specify a different number of decimal places by putting that number in parentheses after the desired type of percent. Both the percent and the confidence interval will have the same number of decimal places. For standard errors, Z-statistics, and summary statistics the default is two decimal places. For design effects the default is three decimal places. A different number of decimal places can be specified in parentheses.

The Z-statistic for each frequency is the statistic that controls the color for each cell. Note that this statistic is based on expected values, in the chi-square sense, and it is not based on the standard error (SRS or complex) of the cell percents. By default the Z-statistics are output with 2 decimal places.


Keyword       Possible Specification          Default (if no keyword)
_____________________________________________________________________


Percents=     Row (ndec)                      No percentaging
              Column (ndec)
              Total (ndec)

OTHERSTats=
              CONFidence (level)              No confidence intervals
               (level can be 90, 95, or 99)

              SE (ndec)                       No standard errors

              DEFT (ndec)                     No design effect (deft)

              ZSTAtistic (ndec)               No Z-statistics

              N [of cases]                    No unweighted N

              WN [of cases] (ndec)            No weighted N
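
As an illustration of the cell statistics described above, the following sketch requests column percentages with two decimal places, 90% confidence intervals, standard errors with three decimal places, design effects, and unweighted N's. The variable names (spend, ideo, strat, psu) are hypothetical, and the sketch assumes that the dataset contains stratum and cluster variables:

     row = spend
     column = ideo
     stratum = strat
     cluster = psu

     percents = column(2)
     otherstats = confidence(90), se(3), deft, N

Because the confidence interval takes its number of decimal places from the percent, it would also be shown with two decimal places here.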


Other Table Options


Keyword       Possible Specification          Default (if no keyword)
_____________________________________________________________________

COLORcoding=  Yes                             No color coding of cells
                                                or colored headings

MISSing=      Valid                           Exclude MD or out-of-
                                                range codes on the
                                                row, column, and
                                                control variables

NOTABle=      Yes (to suppress entire table   Display the table
                but still get other info)

RUNtitle=     title or comments for run       No title or comments

STAtistics=   Yes                             No summary statistics

NDECimals=    number of decimals in stats     2


LAnguagefile= Name of file with non-English   English labels on
                labels and messages             output
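
For instance, a command file along the following lines would request color coding of the cells and summary statistics reported with three decimal places. This is only a sketch, and the variable names are hypothetical:

     row = spend
     column = ideo
     percents = column

     colorcoding = yes
     statistics = yes
     ndecimals = 3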

Chart Options

Several chart options are available, provided that the chart generation servlet is running on the server computer. Two of the specifications are required in order to produce charts.

Keyword       Possible Specification          Default (if no keyword)
_____________________________________________________________________

TBLProperties= PATHNAME for chart properties   REQUIRED for charts
                file
               Required location for SDA 4 is:
               SDAROOT/tmpdir/xxx.cht
               where 'SDAROOT' is the pathname
                of the SDA installation on
                your server, and
               where 'xxx' is any name.
                (See the last example below)

               (This is a temporary filename,
                to be passed on to the charting
                servlet. The TABLES program
                will generate multiple files
                from the given filename, if
                multiple charts are generated
                because a control variable
                was specified or because
                multiple row or column
                variables were specified.)

CH_URL=         URL of chart-generation        REQUIRED for charts
                 servlet on the server.
                Required URL for SDA 4 is:
                http://SDAURL/sdaweb/charts
                 where 'SDAURL' is the
                 hostname of the SDAWEB
                 application on your server.
                 (See the last example below)

CH_MAXCHarts=   Maximum number of charts to     25
                 create on this run (1-100)

CH_TYPe=        Type of chart to create         stackedbar
                (stackedbar, bar, pie,
                 or line)

CH_ORientation= Orientation of BAR charts       vertical
                (vertical or horizontal)

CH_EFfects=     Visual effects for BAR charts   use2D
                (use2D - 2 dimensional;
                 use3D - 3 dimensional)

CH_SHOWPcts=    Yes (put percentages on the     No percentages
                  chart)

CH_FONT=        Font to use in charts           SansSerif

CH_COLor=       Yes (create charts in color)    Grayscale charts

CH_BARcolors=   Path for custom palette file    Standard colors
                  for bar charts
                 (See additional info below)

CH_LINEcolors=  Path for custom palette file    Standard colors
                  for line charts
                 (See additional info below)

CH_WIdth=       Width of chart in pixels        600

CH_HEight=      Height of chart in pixels       400
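
A sketch of a chart request that overrides several of the defaults is shown here. The placeholders SDAROOT and SDAURL must be replaced as described above. The name 'mychart' and the variable names are hypothetical:

     row = spend
     column = ideo
     percents = column

     tblproperties = SDAROOT/tmpdir/mychart.cht
     ch_url = http://SDAURL/sdaweb/charts
     ch_type = bar
     ch_orientation = horizontal
     ch_effects = use3D
     ch_width = 800
     ch_height = 500
     ch_maxcharts = 10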


XML Output

The output from the TABLES program can be produced in XML instead of the default HTML. If 'TYPE = xml' is included in the batch command file, the output will be produced in XML.

Keyword       Possible Specification          Default (if no keyword)
_____________________________________________________________________

TYPE=          xml (produce XML output)       Standard HTML output
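
A minimal sketch of a command file requesting XML output follows (the variable names are hypothetical). Since no 'savefile=' keyword is given, the output should go to standard output, as for any other run:

     row = spend
     column = ideo
     percents = column
     type = xml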


CSV Output

Instead of regular HTML output, the TABLES program can produce output as a CSV file (with commas separating the values output). If 'TYPE = csv' is included in the batch command file, the output will be produced as a CSV file. The name of the file is specified with the 'SAvefile=' keyword. The file name should ordinarily have a '.csv' suffix.

By default the various statistics generated for each cell of a table (such as percentages and number of cases) are output in separate sections (separate series of rows) in the CSV file.

If CSVCOMBine = yes, all of the statistics in each cell are output in the same section of the CSV file.


Keyword       Possible Specification          Default (if no keyword)
_____________________________________________________________________

TYPE=          csv (produce CSV output)       Standard HTML output

CSVCOMBine=    yes (combine statistics)       Not combined
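
As a sketch, the following command file would write column percentages and unweighted N's to a CSV file, with both statistics for each cell combined in the same section of the file. The variable names and the file name are hypothetical:

     row = spend
     column = ideo
     percents = column
     otherstats = N

     type = csv
     csvcombine = yes
     savefile = mytables.csv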


ADDITIONAL INFORMATION


ABBREVIATIONS FOR KEYWORDS

Keywords can usually be abbreviated down to the number of characters required to differentiate them from other keywords. Sometimes only one character is required. The keyword for the weight variable, for instance, can be given as "weight=" or "wei=" or even "w=". Either upper or lower case may be used. In the list of keywords above, the minimum string of characters required for each specification is shown in capital letters.

MENTION OF KEYWORD SUFFICIENT

The form `keyword=yes' may be shortened to `keyword'. For example, `statistics=yes' can be shortened to `statistics'.
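
Taking both conventions together, and assuming the minimum abbreviations shown in capital letters in the keyword lists above, the two sketches below should be treated as equivalent. The variable names are hypothetical:

     # Full form
     rowvar = spend
     columnvar = ideo
     weight = casewt
     statistics = yes
     text = yes

     # Abbreviated form
     row = spend
     column = ideo
     w = casewt
     sta
     te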

COLORS FOR CHARTS

Each type of chart has a default set of colors that are used for successive bars or lines in the chart. To change the default set of colors, specify the full pathname of a file that specifies, on each line, the three RGB color codes for each successive color to use. This pathname is given after the 'CH_BARcolors=' keyword and/or the 'CH_LINEcolors=' keyword.

COMMENTS

Anything on a line beginning with "#" is ignored by the batch processor and can therefore be used for comments. Blank lines are also ignored.
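
For example, comments and blank lines can be used to document a command file and group related keywords (the variable names are hypothetical):

     # Spending by ideology, with column percentages
     row = spend
     column = ideo

     # Use the case weight for all statistics
     weight = casewt
     percents = column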

EFFECT OF A WEIGHT VARIABLE

If a weight variable is specified, the weighted number of cases in each cell is used to calculate the percentages. Furthermore, all of the other statistics are based on the weights. Note that the weight variable is used for these purposes even if the weighted number of cases is not displayed in each cell.

ORDER OF PROCESSING LISTS

When more than one variable is given for the row, column, or control variable specifications, the tables are produced in the following order: Tables for EACH of the control variables are produced with the FIRST column variable and the FIRST row variable. Then the whole list of control variables is processed again for the SECOND column variable and the FIRST row variable; and so on until the whole set of column variables has been processed. Then the whole series is repeated for the SECOND row variable; and so on until all the row variables have been used.

Briefly, the variables will cycle in the following order: control, column, row. All of the tables will be produced using the same weight, filters, and other options.
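
To make the cycling order concrete, consider a sketch with two row, two column, and two control variables (all names hypothetical). Following the rule described above, the eight sets of tables would be produced in the order listed in the comments:

     row = spend spend2
     column = ideo educ
     control = gender region

     # Order in which the tables are produced:
     #   1. spend  by ideo, controlling for gender
     #   2. spend  by ideo, controlling for region
     #   3. spend  by educ, controlling for gender
     #   4. spend  by educ, controlling for region
     #   5. spend2 by ideo, controlling for gender
     #   6. spend2 by ideo, controlling for region
     #   7. spend2 by educ, controlling for gender
     #   8. spend2 by educ, controlling for region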

REPETITION OF KEYWORDS

If there is not enough room on a line to list all of the desired variables, the keyword can be repeated on a new line, and more variables can be listed. In such a case the second list is appended to the first list, for purposes of generating tables. This appending feature only applies to the keywords for specifying the row, column, control, or filter variables. If other keywords are repeated, the program will print an error message and stop.
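
For example, the following sketch (with hypothetical variable names) repeats the row keyword. It has the same effect as listing all five row variables on a single line:

     row = spend spend2 spend3
     row = spend4 spend5
     column = ideo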

BACKWARD COMPATIBILITY

Versions prior to SDA 1.2b used 'vertical' and 'horizontal' to specify the 'rowvar' and 'columnvar' variables in the batch command files. Although the older terminology has been superseded, those keywords are still recognized for now as synonymous with the newer 'rowvar' and 'columnvar' specifications.
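
As a sketch, an older command file might therefore contain the lines below, which correspond to 'rowvar = spend' and 'columnvar = ideo'. The variable names are hypothetical, and the older keywords are spelled out in full because no abbreviations are documented for them:

     vertical = spend
     horizontal = ideo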


EXAMPLES OF BATCH FILES


Basic example

The SDA dataset is assumed to be in the current directory. The results will overwrite the file named after the `savefile=' keyword, if the file already exists.
     row = spend spend2 spend3
     column = ideo, gender

     percents = column
     otherstats = N
     statistics = yes
     savefile = mytables.htm

Using more options

Specify multiple row and column variables, which will generate a table for each combination of the variables. Also redefine some ranges, use weight and filter variables, and output 95% confidence intervals for the percentages.
     study = /sa/sdatest
     row = spend(1-9) spend2(1-8)
     column = ideo educ
     control= gender

     weight= casewt
     filters= age(18-50) party(1-3)

     percents=row column
     otherstats = confidence(95), WN
     statistics
     text

     savefile= mytables.htm

Specifying some chart options

In addition to the two required chart specifications, request charts in color (instead of grayscale) with percentages printed on or next to each bar.
     study = /sa/sdatest
     row = spend
     column = ideo

     percents = column
     statistics = yes
     savefile = mytables.htm

     tblproperties = /var/www/sda/tmpdir/testing.cht
     ch_url=http://sda.berkeley.edu/sdaweb/charts
     ch_color = yes
     ch_showpcts= yes


CSM, UC Berkeley/ISA
September 10, 2020