How to Set Up an SDA Data Archive on the Web


This document briefly summarizes the steps and programs involved in setting up an SDA data archive.

The SDA Archive Developer's Guide (in PDF format) describes this material in more detail. Refer to that guide for the technical details of installing and configuring an SDA Web archive.


  1. Install the SDA software on your Web server.

  2. Prepare the necessary files for each dataset.

      For each dataset you will need:

    • An ASCII (text) data file (with each variable in a fixed set of columns)

    • A data description file (DDL file).
      The DDL file includes location information for each variable, labels, and missing-data specifications.

      There are various ways of creating a DDL file:

      • Convert an SPSS system file into DDL and data files by using the MAKEDDL.SPS script in SPSS.
        Note that if you use the DDL file generated by this script, you must also use the data file generated at the same time. The DDL file and data file must be matched so that each variable can be found in the correct columns of the data file.
        Be aware also that the data definitions for variables in SPSS do not include the text of the questions asked in a survey, but you can add that material later into the DDL file.

      • Convert SAS, SPSS, or Stata data definitions into DDL by using the XCONVERT program.
        Note that the data definitions for those systems do not include the text of the questions asked in a survey, but you can add that material later into the DDL file.

      • Convert a DDI (version 2) file into DDL by using the DDI to DDL Conversion Service.
        A DDI file contains XML data definitions following the conventions of the Data Documentation Initiative.

      • Extract the information from a CASES computer-assisted interviewing instrument.
        The new CASES 5.3j release includes a utility that produces DDL directly. If you're using the CASES 4-series, you can use the SDA Q4TODDL program to extract DDL information. (See the CASES to DDL documentation for more information about using Q4TODDL.)

      • Or use a text editor to enter the specifications directly into a file.
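The requirement that the data file and its description stay matched can be illustrated with a minimal Python sketch. It assumes a hypothetical three-variable layout; the variable names and column positions are invented for illustration and are not DDL syntax. The sketch only shows what "each variable in a fixed set of columns" means in practice and how a mismatch between the description and the data would surface.

```python
# Hypothetical layout: variable name -> (start column, end column),
# 1-based and inclusive. Illustrative only; this is NOT DDL syntax.
LAYOUT = {
    "caseid": (1, 4),
    "age": (5, 6),
    "sex": (7, 7),
}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-column record into a dict of raw string values."""
    return {name: line[start - 1:end] for name, (start, end) in layout.items()}


def check_record_lengths(lines, layout=LAYOUT):
    """Return the 1-based numbers of records too short to hold every variable."""
    needed = max(end for _, end in layout.values())
    return [
        number
        for number, line in enumerate(lines, start=1)
        if len(line.rstrip("\n")) < needed
    ]


# Three sample records; the last one is one column short.
records = ["0001342", "0002571", "000329"]
print(parse_record(records[0]))
print(check_record_lengths(records))
```

A check like this is only a sanity test on record length. It cannot tell whether the columns named in the description actually hold the intended variable, which is why a data file and DDL file generated together should be kept together.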

  3. Create the SDA Files and Set Them Up on the Web

    The new SDA Archiver procedure (available with SDA version 3.1 and later) was developed to simplify this process. However, the SDA programs can still be run separately in command-line mode if desired.

    1. Use the New SDA Archiver Procedure

      This interactive procedure walks you through the various steps involved in uploading the data and DDL files to your Web server, creating the SDA system files, and setting them up for Web access.

      The data archive at each SDA archive site will provide information on how to access the SDA Archiver.

      The online help file explains the various steps of the process.


    2. Run the SDA Programs in Command-Line Mode

      The various SDA programs can be run as separate procedures. The main steps are as follows:
      • Generate an SDA dataset for each data file.

        • Once you have an ASCII data file and a DDL file that describes it, you convert the data into an SDA dataset by running the MAKESDA program. Other SDA programs can then work on that dataset.

      • Create the HTML codebook files for each dataset using the SDA XCODEBK program.

          For each SDA dataset you will need:

        • A list of the variables, in the order you want them to appear in the online codebook.
          This list can also contain headings for groups of variables. Headings are very helpful to the user if you have more than a few variables. These headings are used for the codebook and for various other SDA procedures.

        • Any introduction or appendix files you want to include in the online codebook.

      • Create the HTML Archive Definition File (HARC file).

        • This is a text file that contains information about which datasets and analysis programs are available, where they are located on your computer system, and a few other items.

      • Add a link on your data archive's HTML page that executes the HSDA program.

        • When a user clicks on that link, SDA starts up, reads the specified HARC file, looks for the requested dataset, and displays the option screens.

        • Based on the specifications and selections made by the user (such as the names of variables to use in a cross-tabulation), the HSDA program sends the appropriate commands to the server, which executes the commands and returns the results to the user's browser. The user can then make new selections and repeat the process.
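The request flow described in the last step (read the archive definition, look up the requested dataset, show the option screen) can be modeled with a toy Python sketch. Everything in it is hypothetical: the dictionary stands in for the information an archive definition file holds and does not reproduce the real HARC format, and the function name, dataset name, and path are invented for illustration. It is not how HSDA is implemented.

```python
# Hypothetical stand-in for an archive definition: dataset name -> location
# on the server. This does NOT reproduce the real HARC file format.
ARCHIVE = {
    "survey2000": "/data/sda/survey2000",
}


def handle_request(dataset, archive=ARCHIVE):
    """Look up the requested dataset and describe the next screen to show."""
    location = archive.get(dataset)
    if location is None:
        # The requested dataset is not listed in the archive definition.
        return {"status": "error", "message": f"Unknown dataset: {dataset}"}
    return {
        "status": "ok",
        "screen": "options",
        "dataset": dataset,
        "location": location,
    }


print(handle_request("survey2000"))
print(handle_request("missing"))
```

The point of the model is the lookup step: a dataset becomes reachable from the Web page only once it is listed in the archive definition.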

For more information:

See the SDA summary page for a list of current SDA programs that users can run.

See also the online help files for more detailed information on the various programs.

If you have any questions, please e-mail us at: sda@csm.berkeley.edu