SDA 3.5 Documentation for PRECISION


NAME

precision - Precision specifications for the output of results

DESCRIPTION

Some statistics are so imprecise that they can be misleading. SDA provides a mechanism to suppress the output of certain statistics (like percentages and means) unless they meet certain specifiable criteria of precision. The specifications for precision are described here.

Precision specifications are enforced by adding them to the disclosure file. All of the analysis programs check whether there is a file named 'disclosure.txt' in the STUDYINF directory of the main SDA dataset. If they find such a file, they enforce the disclosure specifications it contains, along with any precision specifications that have been added to it. Currently only the TABLES and MEANS programs can enforce specified levels of precision.

When a statistic is suppressed, the rest of the table is not affected. Furthermore, summary statistics based on the suppressed statistic(s) are still generated and displayed.


KEYWORDS

The 'disclosure.txt' file contains specifications for the analysis. Each specification is given in the form "keyword = something", with one keyword per line. Keywords may be given in any order, in either upper or lower case. See the Disclosure document for information on how to set up the disclosure file. This document explains only the extra keywords for precision-based suppression that can be added to a 'disclosure.txt' file.

The valid keywords are as follows (all are optional):


Keyword             Meaning
_____________________________________________________________________


TABLES PROGRAM


TABLES=       tail(a) ratio(b) efn(c) mincelln(d) mincellwn(e) min(f) minwt(g)

              Suppress the percentage in a cell if the following specifications
              are not met.  Note that 'p' is the estimated cell percent
              converted to a proportion.

              tail:      a = p must be greater than or equal to a and less than (1-a)

              ratio:     b = The ratio of two quantities must be less than or equal to b,
                             where

                             EITHER (when p is less than or equal to .5)

                             the numerator of the ratio is the standard error of p
                             divided by p; and the denominator is the negative of
                             the natural logarithm of p,
                             formula: (se(p)/p) / (-ln(p)) <= b

                             OR (when p is greater than .5)

                             the numerator of the ratio is the standard error of p
                             divided by (1 minus p), and the denominator is the negative
                             of the natural logarithm of the quantity (1 minus p).
                             formula: (se(p)/(1-p)) / (-ln(1-p)) <= b

              efn:       c = Minimum EFFECTIVE number of cases (n/deff) in the
                             DENOMINATOR of the percentage.

              mincelln:  d = Minimum n of cases in the cell (NUMERATOR of the percent)

              mincellwn: e = Minimum weighted n of cases in the cell (NUMERATOR)

              min:       f = Minimum n of cases in the DENOMINATOR of the percent

              minwt:     g = Minimum weighted n of cases in the DENOMINATOR of the percent
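
The TABLES criteria above can be sketched in Python as follows. This is only an illustration of the rules as worded in this document, not SDA's actual code. The function names, argument names, and the dictionary layout of the specification are hypothetical. The handling of p equal to 0 or 1 (where the logarithm is undefined) is also an assumption: the sketch treats the ratio as infinite, so the ratio criterion fails.

```python
import math


def tables_ratio(p, se):
    """Ratio used by the 'ratio' criterion for a cell proportion p.

    For p <= .5 this is (se/p) / (-ln(p)); for p > .5 the same formula
    is applied to (1 - p).
    """
    q = p if p <= 0.5 else 1.0 - p
    if q <= 0.0:
        # Assumption: ln(0) is undefined, so treat the ratio as infinite.
        return math.inf
    return (se / q) / -math.log(q)


def percent_is_reportable(p, se, spec, denom_n, denom_wn,
                          cell_n, cell_wn, deff=1.0):
    """Return True if the cell percentage meets every criterion in spec.

    spec maps criterion names (tail, ratio, efn, ...) to their values.
    All criteria are optional, so only the ones present are checked.
    """
    checks = {
        "tail": lambda a: a <= p < 1.0 - a,
        "ratio": lambda b: tables_ratio(p, se) <= b,
        "efn": lambda c: denom_n / deff >= c,        # effective n = n/deff
        "mincelln": lambda d: cell_n >= d,           # numerator, unweighted
        "mincellwn": lambda e: cell_wn >= e,         # numerator, weighted
        "min": lambda f: denom_n >= f,               # denominator, unweighted
        "minwt": lambda g: denom_wn >= g,            # denominator, weighted
    }
    return all(checks[name](value) for name, value in spec.items())
```

For instance, with p = .2 and se(p) = .02 the ratio is (.02/.2) / (-ln(.2)), roughly .062, which would satisfy ratio(.175). The same value results for p = .8, because the formula is applied to (1 - p) in that case.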


MEANS PROGRAM


MEANS=        min(a) minwt(b) ratio(c) efn(d)

              Suppress the mean in a cell if the following specifications are not met.

              min:   a = Minimum n of cases in the DENOMINATOR of the mean

              minwt: b = Minimum weighted n of cases in the DENOMINATOR of the mean

              ratio: c = Minimum ratio of (mean / SE)

              efn:   d = Minimum effective number of cases (n/deff) in the
                         DENOMINATOR of the mean
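
The MEANS criteria can be sketched the same way. Again, this is an illustration of the rules as described here rather than SDA's implementation, and the function and argument names are hypothetical. The ratio is computed literally as mean / SE. How SDA treats a zero standard error or a negative mean is not stated in this document, so the sketch rejects a non-positive standard error outright and makes no special provision for negative means.

```python
def mean_is_reportable(mean, se, spec, denom_n, denom_wn, deff=1.0):
    """Return True if the cell mean meets every criterion present in spec.

    spec maps criterion names (min, minwt, ratio, efn) to their values.
    All criteria are optional, so only the ones present are checked.
    """
    if "min" in spec and denom_n < spec["min"]:
        return False
    if "minwt" in spec and denom_wn < spec["minwt"]:
        return False
    if "efn" in spec and denom_n / deff < spec["efn"]:
        return False  # effective n = n/deff
    if "ratio" in spec:
        if se <= 0:
            # Assumption: this case is outside the scope of the sketch.
            raise ValueError("sketch assumes a positive standard error")
        if mean / se < spec["ratio"]:
            return False
    return True
```

With min(10) minwt(10) ratio(2) efn(8), a mean of 50 with a standard error of 5, based on 40 cases with a design effect of 2, has a ratio of 10 and an effective n of 20, so it would be reported under this sketch.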



MESSAGE TO DISPLAY IF A STATISTIC IS SUPPRESSED

If a statistic is suppressed, an asterisk is placed in the cell, and a note indicates the reason for the suppression. The default message is given below, but it can be modified by inserting a revised message in a language file. It is possible to insert an HTML link in the message, if you want the user to be able to link to some document that explains in more detail what the precision-based suppression rules are and why they have been implemented.

The default message, following the keyword that would be used in a language file, is as follows.

DIS_LOWPRECISION = The calculated statistic has very low precision and is not reported.


EXAMPLE OF PRECISION SPECIFICATIONS ADDED TO A DISCLOSURE FILE

In the following example, note that blank lines and comment lines (those beginning with '#') are ignored by the SDA programs.

# PRECISION SPECIFICATIONS ADDED TO THE 'disclosure.txt' FILE

# Precision specifications for percents produced by the TABLES program

TABLES=  tail(.00005) ratio(.175) efn(68) mincelln(5) mincellwn(.5) min(100) minwt(10)


# Precision specifications for means produced by the MEANS program

MEANS=   min(10) minwt(10) ratio(2) efn(8)
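
To make the line format concrete, here is a minimal Python sketch of how such lines could be read into a dictionary. It is not how SDA itself parses the file. The function name and the regular expression are hypothetical, and it assumes that each criterion is written as name(number). Lower-casing the criterion names is likewise an assumption, since this document states only that keywords may be in upper or lower case.

```python
import re

# One criterion looks like name(value), e.g. ratio(.175)
CRITERION = re.compile(r"([A-Za-z]+)\(\s*([^)\s]+)\s*\)")


def parse_precision(text):
    """Collect the TABLES= and MEANS= precision criteria from file text."""
    specs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        keyword, sep, value = line.partition("=")
        keyword = keyword.strip().upper()
        if not sep or keyword not in ("TABLES", "MEANS"):
            continue  # other disclosure keywords are out of scope here
        specs[keyword] = {
            name.lower(): float(number)
            for name, number in CRITERION.findall(value)
        }
    return specs
```

Applied to the MEANS line in the example, the sketch would yield the criteria min, minwt, ratio and efn, with the values 10, 10, 2 and 8 stored as floating-point numbers.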


EXAMPLE OF A LANGUAGE FILE ('langan') WITH EMBEDDED LINKS

In the following example, note that blank lines and comment lines (those beginning with '#') are ignored by the SDA programs.

The words "very low precision" are set up to link to a file that could explain further the precision rules and the reasons for setting them up.

# Alternate error message for low precision, with link to a file

DIS_LOWPRECISION = The calculated statistic has very low precision and is not reported.

SEE ALSO

disclosure    Disclosure specifications
language      Language file specifications


CSM, UC Berkeley
April 12, 2011