1972-1994 General Social Survey Cumulative File

Introduction

The National Data Program for the Social Sciences is designed as a data
diffusion project and a program of social indicator research.  The data
come from the General Social Surveys, interviews administered to NORC
national samples using a standard questionnaire.  Toward the major goal
of functioning as a social indicator program, items which have appeared
on previous national surveys between 1937 and 1978 have been replicated
here.  The search for trend items led us to published reports from
Gallup, Harris, the Detroit Area Study, SRC (Michigan) studies, NORC
files, and Federal Commissions such as those on Violence and
Pornography.

By retaining the exact wording, we hope to facilitate time trend studies as
well as replications of earlier findings.  For the base line items in the
initial 1972 survey, some 105 sociologists and social scientists reviewed
drafts of the questionnaire, suggested revisions and additions, and expressed
their question preference by vote.  Their serious assistance was extremely
helpful in putting together a final version of the questionnaire which would
represent the varied interests of social scientists.  Topic and question
selection is monitored by a Board of Overseers: Karen Campbell, Paul DiMaggio,
Glenn Firebaugh, Robert Hauser, Michael Hout, James Kluegel, Peter Marsden
(Chair), Bernice Pescosolido, Stanley Presser, David Sears, David Williams,
and James Wright.

The items appearing on the surveys are of three types: permanent questions
that occur on each survey, rotating questions that appear on two out of every
three surveys (1973, 1974, and 1976, or 1973, 1975, and 1976), and a few
occasional questions such as split ballot experiments that occur in a single
survey.  Starting in 1988, items no longer rotate across years but appear on
two-thirds of the cases every year.  This design is discussed in Appendix Q.
A detailed layout of the appearance of questions can be found right before the
index to this codebook.

A second objective is the prompt distribution of fresh, interesting, and
high-quality data to a variety of users who are not affiliated with large
research centers.  Pursuant to this end, the Roper Public Opinion Research
Center has agreed to reproduce and distribute the data and codebook.  The
initial survey, 1972, was supported by grants from the Russell Sage Foundation
and the National Science Foundation.  NSF has provided support for the 1973
through 1978, 1980, and 1982 through 1987 surveys.  NSF will continue to
support the project through 1997.  Supplemental funding for 1984-1997 comes
from Andrew M. Greeley.  We welcome your participation in this program.  While
it is not necessary to request permission from NORC before publishing analyses
of these data, we do ask that NORC be cited as the source of your data.  We
also request that copies of reports which utilize the data be sent to the
General Social Survey, NORC, 1155 East 60th Street, Chicago, IL 60637.

DATA

The General Social Surveys have been conducted during February, March, and
April of 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1980, 1982, 1983, 1984,
1985, 1986, 1987, 1988, 1989, 1990, 1991, 1993 and 1994.  There are a total
of 32,380 completed interviews (1,613 in 1972, 1,504 in 1973, 1,484 in 1974,
1,490 in 1975, 1,499 in 1976, 1,530 in 1977, 1,532 in 1978, 1,468 in 1980,
1,506 in 1982, 354 in 1982 black oversample, 1,599 in 1983, 1,473 in 1984,
1,534 in 1985, 1,470 in 1986, 1,466 in 1987, 353 in 1987 black oversample,
1,481 in 1988, 1,537 in 1989, 1,372 in 1990, 1,517 in 1991, 1,606 in 1993, and
2,992 in 1994).  The median length of the interview has been about one and a
half hours.  Each survey is an independently drawn sample of English-speaking
persons 18 years of age or over, living in non-institutional arrangements
within the United States.  Block quota sampling was used in 1972, 1973, and
1974 surveys and for half of the 1975 and 1976 surveys.  Full probability
sampling was employed in half of the 1975 and 1976 surveys and the 1977, 1978,
1980, 1982-1991, 1993-1994 surveys (see Appendix A for a detailed description
of the sample design).
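
As a quick arithmetic check, the per-survey counts listed above can be summed
to confirm the stated total of 32,380.  The following is a minimal Python
sketch; the dictionary name and the "1982B"/"1987B" labels for the black
oversamples are illustrative choices, not part of the data file.

```python
# Completed interviews per survey, copied from the paragraph above.
# Keys ending in "B" denote the black oversamples (labels are illustrative).
interviews = {
    "1972": 1613, "1973": 1504, "1974": 1484, "1975": 1490,
    "1976": 1499, "1977": 1530, "1978": 1532, "1980": 1468,
    "1982": 1506, "1982B": 354, "1983": 1599, "1984": 1473,
    "1985": 1534, "1986": 1470, "1987": 1466, "1987B": 353,
    "1988": 1481, "1989": 1537, "1990": 1372, "1991": 1517,
    "1993": 1606, "1994": 2992,
}

# Total completed interviews across all surveys and oversamples.
total = sum(interviews.values())

# Cases contributed by the two black oversamples alone.
oversample = sum(n for label, n in interviews.items() if label.endswith("B"))

print(total)       # 32380
print(oversample)  # 707
```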

The data from the interviews were processed according to standard NORC
procedures.  Cleaning procedures--utilizing a combination of the coding
specifications and the interviewer instructions--were used to check for
inconsistent or illegitimate codes (see Appendix B for interviewer
instructions and Appendix C for general coding instructions).  Some
variables--age, occupation, and occupational prestige--are coded so that
the first digit of the two- or three-digit codes may be used separately.
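
To illustrate what using the first digit separately might look like, here is
a small Python sketch.  The helper name leading_digit and the sample codes are
hypothetical; the sketch assumes codes are held as non-negative integers with
a known fixed width, so that leading zeros are respected.

```python
def leading_digit(code, width):
    """Return the first digit of a fixed-width numeric code.

    `width` is the number of digits the code occupies (2 or 3 here), so a
    three-digit code stored as the integer 45 is read as "045".
    """
    text = str(code).zfill(width)
    if len(text) != width or not text.isdigit():
        raise ValueError(f"{code!r} does not fit in {width} digits")
    return int(text[0])


# Hypothetical examples: a two-digit age and two three-digit codes.
print(leading_digit(47, 2))   # 4  (an age in the forties)
print(leading_digit(245, 3))  # 2
print(leading_digit(45, 3))   # 0
```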

This cumulative data set merges all 20 surveys into a single file with each
year or survey acting as a subfile.  This greatly simplifies the use of the
General Social Surveys for both trend analysis and pooling.  In addition,
this cumulative data set contains newly created variables (e.g., a poverty
line code).  Finally, the cumulative file contains certain items never before
available (e.g., the 1987 module on the impact on the family of the changing
labor force participation of women).

To facilitate the use of the codebook, several terms must be explained.  The
abbreviation "R," which appears throughout the text and appendices, stands
for "respondent."  The format which we have used in the text of the codebook
is as follows:

95.      Do you think the use of marijuana should be made legal or not?

[VAR: GRASS]

RESPONSE    PUNCH                        YEAR                         col. 377
--------    -----                        ----                         --------
                   1972-82  1982B  1983-87  1987B  1988-91  1993  1994     ALL
                   -------  -----  -------  -----  -------  ----  ----     ---
Should.......   1     1803      0     1156     63      668   234   457    4381
Should not...   2     5413      0     4654    277     3124   770  1450   15688
Don't know...   8      242      0      181     12      136    52    93     716
No answer....   9       35      0       17      1       20     1    11      85
DNA..........  BK     6133    354     1534      0     1959   549   981   11510

The format includes the question exactly as it appeared in the questionnaire.
For those few questions that were recoded, the symbol [RECODE] appears
immediately after the question.  For the original question wording, the user
must turn to Appendix D: Recodes.  Question numbering as it appeared on the
actual questionnaire is given in Appendix B.

"[VAR: GRASS]" refers to the variable name.  A mnemonic was assigned to each
question to promote standardization in the use of General Social Survey
variable names and also to meet the eight character limitation imposed by some
computer software systems (e.g., SPSS).

Under the heading "RESPONSE," all possible answers to the questions are
listed.  The questionnaire contains three alternate forms of response
as follows: (1) the answers were read to the respondent (if they were
included in the question); (2) answers were presented to the respondent
on a card (indicated by interviewer instructions); or (3) answers were
marked by the interviewer to best correspond to the answer of the
respondent (also indicated by interviewer instructions).

The term "PUNCH" represents the code or numerical value which was
assigned to each response.  These are the numbers that the user will
find punched in the columns.  The frequency of occurrence of each of
the punch values appears in the next seven columns.  The combined
marginals across the surveys are in the last column headed "ALL."

In most cases, the marginal distributions for all punches are given in
the text.  For a small number of variables -- the two-or-more-column
variables -- frequencies or marginal distributions appear in the
appendices.  Responses are mutually exclusive (i.e., only one code can
appear for each respondent for each question).

The first column under "YEAR," 1972-1982, gives the combined totals for
the 1972-1982 cross-sections.  In the second column, 1982B, the counts
for the 1982 black oversample appear.  Blacks who were part of the
regular 1982 sample are not part of these figures.  The third column,
1983-1987, gives the combined totals for 1983-1987.  The fourth column,
1987B, contains the counts for the 1987 black oversample.  The fifth
column, 1988-1991, gives the combined totals for 1988-1991.  The sixth
column, 1993, contains the counts for the 1993 survey.  The seventh
column, 1994, contains the counts for the 1994 survey.  The eighth
column, ALL, contains the total for the preceding seven columns.  For a
discussion of the use of the black oversample see Appendix A.  For the
individual yearly totals for 1972-1982 consult the General Social
Surveys, 1972-1982: Cumulative Codebook; for 1983-1987 consult the
General Social Surveys, 1972-1987: Cumulative Codebook; and for
1988-1991, consult General Social Surveys, 1972-1991: Cumulative
Codebook.  To determine what years or surveys a variable appeared in
see Appendix U.
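
The relationship between the YEAR columns and the ALL column can be checked
directly against the GRASS example shown earlier.  The Python sketch below
copies the frequencies from that table; the variable names are illustrative
only.  It confirms that each ALL entry is the sum of its seven YEAR entries
and that the column totals add up to the full file.

```python
# Frequencies for GRASS, copied from the example table in this introduction.
# Each entry: punch code -> (counts for the seven YEAR columns, ALL).
year_columns = ["1972-82", "1982B", "1983-87", "1987B",
                "1988-91", "1993", "1994"]
grass = {
    "1":  ([1803, 0, 1156, 63, 668, 234, 457], 4381),      # Should
    "2":  ([5413, 0, 4654, 277, 3124, 770, 1450], 15688),  # Should not
    "8":  ([242, 0, 181, 12, 136, 52, 93], 716),           # Don't know
    "9":  ([35, 0, 17, 1, 20, 1, 11], 85),                 # No answer
    "BK": ([6133, 354, 1534, 0, 1959, 549, 981], 11510),   # DNA (blank)
}

# Each ALL entry should equal the sum of its seven YEAR entries.
row_ok = all(sum(counts) == all_total for counts, all_total in grass.values())

# Summing down each column gives the number of cases in that column's surveys.
column_totals = [sum(counts[i] for counts, _ in grass.values())
                 for i in range(len(year_columns))]

# Summing the ALL column gives the number of cases in the cumulative file.
grand_total = sum(all_total for _, all_total in grass.values())

print(row_ok)         # True
print(column_totals)  # [13626, 354, 7542, 353, 5907, 1606, 2992]
print(grand_total)    # 32380
```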

NEW DEVELOPMENTS

With NSF's renewal of the GSS for 1993-1997, major changes in design are
occurring.  The 1993 GSS was the last survey conducted under the old
design.  In 1994 two major innovations were introduced to the GSS.

First, the traditional core is substantially reduced to allow for the
creation of mini-modules (i.e., blocks of about 15 minutes devoted to
some combination of small- to medium-sized supplements).  The
mini-modules space gives us greater flexibility to incorporate
innovations and to include important items proposed by the social
science community.

Second, a new biennial, split-sample design is used.  The sample
consists of two parallel sub-samples of approximately 1,500 cases
each.  The two sub-samples both contain the identical core.  The A
sample also contains a standard, topical module, the mini-modules, and
an ISSP module (on women, work, and the family).  The B sample has a
second topical module, mini-modules, and an ISSP module (on the
environment).  In effect, one can think of the A sample as representing
a traditional GSS for 1994 and the B sample as representing a
traditional GSS for 1995.  Rather than being fielded separately in two
different years they are fielded together.

While we will generally field separate topical, mini-, and ISSP
modules on the A and B samples, we have the option of including some
items on both samples if a larger sample size is needed.  This would
most likely be utilized in the case of the mini-modules.

In 1996 and in subsequent even numbered years the same design described
for 1994 would be repeated.  In addition, in 1994 only there is a
transitional design to calibrate any impact of deletions from the
core.  On Sample A, the old core was administered to respondents
receiving Version 1 (X) and the new reduced core was given on Version 2
(Y).  See Appendix U for further information about specific items.

Abbreviations:

The following abbreviations are used throughout the text and appendices:

   AIPO             American Institute of Public Opinion (Gallup Poll)
   BK               Blank
   Col(s).          Column(s)
   IISR             International Institute for Social Research
   ISSP             International Social Survey Program
   GO               Gallup Organization
   N                Number
   NAP              Not applicable
   NORC/SRS         National Opinion Research Center/Survey Research Service
   n.e.c.           Not elsewhere classified
   ORCO             Opinion Research Corporation
   POS              Public Opinion Survey (Gallup)
   PSU              Primary Sampling Unit
   Q(s).            Question(s)
   R                Respondent, except in Appendix C: General Coding
                    Instructions, where R stands for blank.
   Roper            Roper Public Opinion Research Center, University of
                    Connecticut
   ICPSR            Inter-University Consortium for Political and Social
                    Research, University of Michigan
   SRC              Survey Research Center, University of Michigan
   Var.             Variable
   Vol.             Volunteered
   ZUMA             Zentrum fuer Umfragen, Methoden, und Analysen, Germany

Data Identification Numbers

Identification numbers and locations are as follows:

           N = 32,380
           3436 columns per respondent
           -- Year appears in columns 1-2
           -- Respondent identification number in cols. 3-6
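
The column locations above can be read with ordinary string slicing.  The
following Python sketch is illustrative only: it assumes each respondent is
stored as a single line of 3,436 characters, that the two-digit year refers
to the 1900s, and that a blank in a column corresponds to the BK punch.  The
GRASS location (col. 377) is taken from the example table in this
introduction; the function names and the sample record are hypothetical.

```python
RECORD_WIDTH = 3436  # columns per respondent, as stated above


def field(record, first_col, last_col):
    """Return columns first_col..last_col (1-indexed, inclusive)."""
    return record[first_col - 1:last_col]


def parse_record(record):
    """Extract year, respondent ID, and GRASS from one fixed-width record."""
    if len(record) != RECORD_WIDTH:
        raise ValueError(f"expected {RECORD_WIDTH} columns, got {len(record)}")
    grass = field(record, 377, 377)
    return {
        "year": 1900 + int(field(record, 1, 2)),      # assumes 19xx years
        "id": int(field(record, 3, 6)),
        "grass": None if grass == " " else int(grass),  # blank treated as BK
    }


# Build a synthetic record (not real data): year 94, ID 0012, GRASS punch 2.
chars = [" "] * RECORD_WIDTH
chars[0:2] = "94"
chars[2:6] = "0012"
chars[376] = "2"
sample = "".join(chars)

parsed = parse_record(sample)
print(parsed)  # {'year': 1994, 'id': 12, 'grass': 2}
```

Converting the 1-indexed, inclusive column numbers used in this codebook to
Python's 0-indexed, half-open slices in one helper keeps the off-by-one
adjustment in a single place.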