If multiple directory pathnames are specified for this dataset (in the SDA Manager or in the ’SDADATA=’ specifications in the HARC file), only one of them (usually the main dataset directory) should have a ’disclosure.txt’ file. The other SDA dataset directories (usually created to hold recoded and computed variables) will have the same disclosure rules applied to them automatically in SDA version 4.

This document describes the possible disclosure rules that may be specified. Additional specifications can be added to the ’disclosure.txt’ file, in order to suppress results from TABLES and MEANS that are considered too imprecise to display. Those extra parameters are discussed in a separate document on precision.

Note that Quick Tables and SDA version 3.5 require subsidiary datasets (e.g., for recoded and computed variables) to have a file named ’disc-id.txt’ in their STUDYINF subdirectory. This ’disc-id.txt’ file contains a single ID keyword with the format ’ID=abc’, where ’abc’ is the same ID or name used for this study in the ’disclosure.txt’ file in the main SDA dataset. That file is no longer necessary in SDA version 4. However, if a dataset created using the SDA Manager is later set up to be accessed by Quick Tables or by SDA 3.5 analysis programs, you should put that ’disc-id.txt’ file (manually) in the STUDYINF subdirectory so that Quick Tables and/or the 3.5 programs will know about the ’disclosure.txt’ file in the main study directory. Otherwise the disclosure rules will not be applied to analysis runs that do not use variables in the main study dataset. See the version 3.5 manual page for details on the ’disc-id.txt’ file.
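As an illustration of the manual step described above, the following Python sketch writes a ’disc-id.txt’ file into the STUDYINF subdirectory of a subsidiary dataset. The function name and the directory passed to it are hypothetical and are not part of SDA; only the file name, its location, and the ’ID=abc’ format come from this document.

```python
from pathlib import Path


def write_disc_id(dataset_dir: str, study_id: str) -> Path:
    """Write STUDYINF/disc-id.txt containing a single 'ID=...' line.

    dataset_dir is the subsidiary SDA dataset directory (hypothetical here);
    study_id must match the ID= value in the main dataset's disclosure.txt.
    """
    # The ID is one word made up only of letters or numbers.
    if not (study_id.isascii() and study_id.isalnum()):
        raise ValueError("ID must be one word of letters or numbers only")

    studyinf = Path(dataset_dir) / "STUDYINF"
    # Assumption: STUDYINF normally exists already; create it only if missing.
    studyinf.mkdir(parents=True, exist_ok=True)

    target = studyinf / "disc-id.txt"
    target.write_text(f"ID={study_id}\n", encoding="ascii")
    return target
```

Calling `write_disc_id("/path/to/recodes", "survey25")` would produce a one-line file reading `ID=survey25`, where the path is a placeholder for your own subsidiary dataset directory.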

The valid keywords are as follows:

Keyword          Possible Specification                  Default (if no keyword)
________________________________________________________________________________

DISCLOSURE ID FOR THE STUDY

ID=              a unique identifier for the             REQUIRED
                 dataset with disclosure rules
                 (one word, only letters or numbers)

PREVENT AN ANALYSIS FROM BEING RUN

VAREXCLUDE=      name(s) of variables that               All variables allowed
                 cannot be used in analysis

COMBEXCLUDE=     pairs of variables that                 All combinations allowed
                 cannot be used together in
                 the same analysis run and
                 cannot be used at all to
                 recode or compute new
                 variables (see notes below)

MAXFILTERS=      maximum number of selection             Any number of filters OK
                 filter variables that can
                 be used in a single run

CONTROLVAR=      no, if control variables                A control variable is OK
                 cannot be used in tables

LISTCASE=        no, if the ’listcase’ program           Listcase run is OK
                 is not allowed to run

SUBSET=          no, if the ’subset’ program             Subset run is OK
                 is not allowed to run

SUPPRESS THE OUTPUT AFTER RUNNING AN ANALYSIS

MINCELLN=        minimum number of cases in a            No required minimum cell N
                 table cell to allow a table
                 to be displayed (see notes below)

MINCELLWN=       minimum number of WEIGHTED              No required weighted minimum
                 cases in a table cell to                cell N
                 allow a table to be displayed

AVGCELLMIN=      minimum average cell size to            No required average cell N
                 allow a table to be displayed
                 (checks both the mean and the
                 median cell size, excluding
                 cells with no cases)

AVGCELLWMIN=     minimum WEIGHTED average cell           No required weighted average
                 size                                    cell N

MINCASEBYIVAR=   for regressions, minimum ratio          No limit on the number of
                 of valid observations to the            independent vars
                 number of independent vars

MONITORVAR=      varname, (min_values)                   No special monitored vars
                 (see notes below)

SUPPRESS UNWEIGHTED NUMBER OF CASES IN OUTPUT

UNWEIGHTEDN=     no                                      Show unweighted N’s
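To make two of the output-suppression rules concrete, here is a minimal Python sketch of the checks that AVGCELLMIN and MINCASEBYIVAR describe. It is not SDA’s implementation. The function names are hypothetical, and the handling of a table with no cases at all is an assumption of this sketch.

```python
from statistics import mean, median


def passes_avgcellmin(cell_counts, avgcellmin):
    """True if both the mean and the median cell size reach avgcellmin.

    Cells with no cases are excluded before averaging, as the
    AVGCELLMIN description states.
    """
    nonempty = [n for n in cell_counts if n > 0]
    if not nonempty:
        # Assumption of this sketch: a table with no cases is not displayed.
        return False
    return mean(nonempty) >= avgcellmin and median(nonempty) >= avgcellmin


def passes_mincasebyivar(valid_cases, n_independent_vars, min_ratio):
    """True if valid observations per independent variable reach min_ratio."""
    if n_independent_vars < 1:
        raise ValueError("a regression needs at least one independent variable")
    return valid_cases / n_independent_vars >= min_ratio
```

With AVGCELLMIN = 10, a table with cell counts 40, 3, 2, 0 and 5 has a mean non-empty cell size of 12.5 but a median of 4, so this sketch would suppress it. Checking the median as well as the mean catches tables in which one large cell masks several very small ones.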

The MONITORVAR keyword guards against results that are based on too few distinct values of a sensitive variable. For example, you may not want to release analysis results based on cases that are all from the same institution (such as from the same prison). Assuming that there is a variable named ’prison’, you could specify that variable as one to be monitored.

By default the cases must come from at least two distinct categories of the monitored variable(s). However, you can specify a higher required number of categories by giving the desired number of categories in parentheses after the variable name. See the example below.
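The monitoring rule just described can be sketched in Python as follows. This is only an illustration under stated assumptions, not SDA’s own code: the function names are hypothetical, cases are represented as plain dictionaries, and missing values (represented here as None) are assumed not to count as valid values.

```python
import re


def parse_monitorvar(spec):
    """Parse a MONITORVAR value such as 'INSTITUTION CBSA(3)'.

    Returns {variable name: required number of distinct values}.
    A name with no number in parentheses gets the default of 2.
    """
    rules = {}
    for name, minimum in re.findall(r"(\w+)\s*(?:\((\d+)\))?", spec):
        rules[name] = int(minimum) if minimum else 2
    return rules


def passes_monitorvar(cases, rules):
    """True if the cases cover enough distinct values of every monitored variable.

    cases is a list of dicts mapping variable name -> value.
    """
    for name, minimum in rules.items():
        # Assumption: None marks a missing value and is not counted.
        distinct = {case[name] for case in cases if case.get(name) is not None}
        if len(distinct) < minimum:
            return False
    return True
```

For instance, if every case in an analysis has the same value on ’prison’, `passes_monitorvar` returns False under the default minimum of two distinct values, and the results would be suppressed.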

The default messages are listed below, each preceded by the keyword that identifies it in a language file. Notice that one or more variable names or a number will sometimes be output after the given message. Those names or numbers are the values specified with the keywords described above.

DIS_VAREXCLUDE = To preserve confidentiality, analyses are not permitted using the following variable(s):

DIS_COMBEXCLUDE = To preserve confidentiality, analyses are not permitted using the following combination(s) of variables:

DIS_VAREXCLUDE_RECODE = To preserve confidentiality, RECODE and COMPUTE are not permitted using the following variable(s):

DIS_MAXFILTERS = To preserve confidentiality, the number of filter variables cannot be greater than:

DIS_CONTROLVAR = To preserve confidentiality, tables cannot be run with control variables.

DIS_LISTCASE = To preserve confidentiality, the LISTCASE program cannot be used with this dataset.

DIS_SUBSET = To preserve confidentiality, the SUBSET program cannot be used with this dataset.

DIS_AVGCELLMIN = To preserve confidentiality, tables cannot be displayed unless the average number of observations in each cell is at least:

DIS_AVGCELLWMIN = To preserve confidentiality, tables cannot be displayed unless the average weighted number of observations in each cell is at least:

DIS_MINCELLN = To preserve confidentiality, tables cannot be displayed unless the number of observations in each cell is at least:

DIS_MINCELLWN = To preserve confidentiality, tables cannot be displayed unless the weighted number of observations in each cell is at least:

DIS_MINCASEBYIVAR = To preserve confidentiality, regression analyses cannot be shown unless the ratio of valid observations to the number of independent variables is at least:

DIS_MONITORVAR = To preserve confidentiality, analysis results cannot be displayed for any set of observations that has only a very small number of values on certain sensitive variables. In this case the sensitive variable(s) (and the minimum required number of valid values) was:

DIS_UNWEIGHTEDN = To preserve confidentiality, only weighted N’s can be shown.

    # DISCLOSURE SPECIFICATIONS FOR DATA FILE

    # ID FOR THIS DATASET
    ID = survey25

    # A. PREVENTS AN ANALYSIS FROM BEING RUN

    # Completely exclude these vars from analysis and recoding/computing
    VAREXCLUDE = CASEID, LOCATIONID

    # Exclude these combinations of vars (separated by ’;’) from analysis
    # Also exclude the individual vars from being used by the ’recode’
    # and ’compute’ programs
    COMBEXCLUDE = RACE, GENDER; AGE, RACE

    # Maximum number of selection filters allowed in an analysis run
    MAXFILTERS = 2

    # No tables with a control variable if set equal to ’no’
    CONTROLVAR = no

    # The LISTCASE program cannot be run if set equal to ’no’
    LISTCASE = no

    # The SUBSET program cannot be run if set equal to ’no’
    SUBSET = no

    # B. SUPPRESS ANALYSIS OUTPUT AFTER RUNNING A PROGRAM

    # Required average (mean and median) cell sizes - unweighted and weighted
    AVGCELLMIN = 10
    AVGCELLWMIN = 200

    # Required size of smallest cell - unweighted and weighted
    MINCELLN = 5
    MINCELLWN = 100

    # Ratio of cases to number of independent vars in regression
    MINCASEBYIVAR = 100

    # Check for at least 2 distinct values on the variable ’INSTITUTION’
    # and at least 3 distinct values on ’CBSA’.
    MONITORVAR = INSTITUTION CBSA(3)

    # Suppress all unweighted N’s if set equal to ’no’
    UNWEIGHTEDN = no
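If you want to sanity-check a ’disclosure.txt’ file before installing it, a small reader along the following lines may help. This is a rough Python sketch, not the parser SDA uses: the function names are hypothetical, and it assumes only what the example shows, namely ’KEYWORD = value’ lines, comment lines beginning with ’#’, and COMBEXCLUDE groups separated by ’;’.

```python
def parse_disclosure(text):
    """Return {keyword: value} from 'KEYWORD = value' lines.

    Blank lines and lines starting with '#' are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        keyword, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'KEYWORD = value' line; ignored in this sketch
        settings[keyword.strip()] = value.strip()
    # ID= is the one keyword listed as REQUIRED.
    if "ID" not in settings:
        raise ValueError("disclosure specifications must include ID=")
    return settings


def excluded_pairs(settings):
    """Split the COMBEXCLUDE value on ';' into sets of variable names."""
    pairs = []
    for group in settings.get("COMBEXCLUDE", "").split(";"):
        names = frozenset(n.strip() for n in group.split(",") if n.strip())
        if names:
            pairs.append(names)
    return pairs
```

Applied to the COMBEXCLUDE line from the example file, `excluded_pairs` yields two sets, one holding RACE and GENDER and the other AGE and RACE, which a checker could compare against the variables requested in a run.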

internationalization: Using non-English languages in SDA

precision: Precision specifications

QuickTables: Quick Tables documentation

CSM, UC Berkeley/ISA

September 22, 2015