Configuring Quick Tables for a Dataset


In this section we discuss the steps necessary to configure Quick Tables for a particular dataset. This discussion assumes that the system administrator has already installed and configured some servlet "container" (for example, Apache's "Tomcat") and installed the Quick Tables "webapp" in that servlet container. Here we'll discuss the steps that come next.

To set up Quick Tables for a dataset, you need to do three things:

  1. Add a simple Quick Tables "start-up" link to an HTML page. Here's an example:

    http://www.icpsr.umich.edu/quicktables/quickconfig.do?dawn97

  2. Add a "[datasetkey] = [config_file_location]" line to the "location properties" file. Here's an example:

    dawn97 = /icpsr/samhda/dawn/dawn97.xml

  3. Create an XML configuration file that specifies the Quick Tables options for the dataset. Here's a simple example:


<sdaconfig>

     <progspath>/csm/7502docs/qrtesting/bin</progspath> 

     <pagetiles 
          head="/tiles/headSamhda.jsp" header="/tiles/headerSamhda.jsp" 
          menu="/tiles/menuSamhda.jsp" footer="/tiles/footerSamhda.jsp" 
          preresults="/tiles/preresultsIcpsrGeneric.jsp" 
          postresults="/tiles/postresultsIcpsrGeneric.jsp" /> 

     <study 
          studytitle="1997 Drug Abuse Warning Network (DAWN) Survey" 
          studypath="/csm/7502docs/qrtesting/SDAtest" studynumber="2834" 
          studycodebook="http://www.icpsr.umich.edu/SDA12/SAMHDA/dawn97/codebook/2834.htm" 
          studydescription="http://www.icpsr.umich.edu:8080/ABSTRACTS/02834.xml?format=SAMHDA" /> 
 
     <quicktables>

          <table title="Attitudes about Military or Welfare Spending by Party, Race or Age Groups">

               <rowvars label="Select the type of spending you want to analyze">
                    <var label="Military spending">spend</var>
                    <var label="Welfare spending">spend4</var>
               </rowvars>

               <columnvars label="Select the breakdown that you want - by:">
                    <var label="Political party">party</var>
                    <var label="Race groups">race</var>
               </columnvars>

               <filter label="Limit table to one gender (optional):">
                    <var label="(Both genders)">##none</var>
                    <var label="Men">gender(1)</var>
                    <var label="Women">gender(2)</var>
               </filter>

          </table>

     </quicktables>

</sdaconfig>

You can see that the first two tasks are trivial. The last task is also quite simple from a technical point of view, but it takes some thought to decide which Quick Tables options are meaningful for a given dataset's variables.


Now let's take a closer look at each of these tasks in turn.

  1. Add a Quick Tables start-up link

  2. Add a line to the "Location Properties" file

  3. Create an XML configuration file for the dataset


1. Add a Quick Tables start-up link

The hyperlink that starts up Quick Tables for a particular dataset can be embedded in any HTML page. However, the link would probably be placed in a list of datasets -- and accompanying options -- something like the following:

Datasets                                          Abstract    Quick Tables    Data Analysis

Drug Abuse Warning Network (DAWN) Survey, 1997    Abstract    Quick Tables    Analysis

(Another dataset ...)                             Abstract    Quick Tables    Analysis

In this example the "Abstract" link just goes to a static HTML page. The "Analysis" link is a typical start-up link to the SDA Analysis suite; it doesn't go to a static HTML page, but instead invokes an SDA CGI program -- "hsda". The "Analysis" link looks like this:

http://www.icpsr.umich.edu/cgi-bin/SDA12/hsda?samhda+dawn97

Note that the "Analysis" link not only invokes the "hsda" CGI program, but also passes along two pieces of additional information to that program: "samhda+dawn97". The "samhda" part names a particular "HARC" configuration file; the "dawn97" part specifies a particular dataset within that "HARC" file.

Now let's look at the "Quick Tables" start-up link. The "Quick Tables" link looks like this:

http://sdatest.berkeley.edu:8080/quicktables/quickconfig.do?dawn97

(Most obviously, this link goes to a server at Berkeley instead of ICPSR. That's just temporary, of course: in production, Quick Tables will run on an ICPSR server. The difference in this part of the URL is insignificant for the discussion below.)

Like the "Analysis" link, the "Quick Tables" link doesn't go to a static HTML page but invokes a program. In this case, it isn't a CGI program but a Java "servlet" Web application called -- what else? -- "quicktables". Since this link is the entry point to the quicktables Web application, we need to invoke a specific "action" from the application -- the "quickconfig.do" (Quick Tables configuration) action.

Also, the link needs to pass along one extra piece of information -- the dataset "key" that tells the application which XML configuration file it should read to handle this request. Note that, unlike the "Analysis" start-up link, which names its "HARC" configuration file directly, the "Quick Tables" link doesn't name the XML configuration file. Instead, "dawn97" is just a lookup-key; the actual name and location of the XML configuration file are contained in the "location properties" file we'll examine in step #2 below.

You'll always invoke the same servlet "action" when you start up Quick Tables, so the only difference among your various Quick Tables start-up links will be the dataset key you add at the end of the URL. On the ICPSR server, various start-up links might look something like this:

http://www.icpsr.umich.edu/quicktables/quickconfig.do?dawn97
http://www.icpsr.umich.edu/quicktables/quickconfig.do?ncrp9398p1
http://www.icpsr.umich.edu/quicktables/quickconfig.do?lsa8490

The lookup-key names are completely arbitrary (as long as they conform to a few simple rules we discuss below); they can be anything you find meaningful and convenient. The keys just have to match the keys in the "location properties" file we'll examine next.
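
Since only the dataset key varies, the start-up links can be generated mechanically. Here's a minimal Python sketch of that idea. The function name "startup_link" is made up for illustration, and the base URL is simply the ICPSR example used above; adjust both to your own setup.

```python
# Base URL taken from the ICPSR examples above (an assumption for your site).
BASE_URL = "http://www.icpsr.umich.edu/quicktables/quickconfig.do"


def startup_link(dataset_key, base_url=BASE_URL):
    """Return the Quick Tables start-up URL for one dataset key."""
    # Keys may not contain whitespace or the delimiter characters '=' and ':'.
    if not dataset_key or any(ch.isspace() or ch in "=:" for ch in dataset_key):
        raise ValueError(f"invalid dataset key: {dataset_key!r}")
    return f"{base_url}?{dataset_key}"


links = [startup_link(key) for key in ("dawn97", "ncrp9398p1", "lsa8490")]
for link in links:
    print(link)
```

The key check mirrors the rules for dataset keys described in the next section, so a bad key is caught before it ever appears in a link.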


2. Add a line to the "Location Properties" file

There is only one "location properties" file for the quicktables application and its format is very simple. Each line (other than blank or comment lines) is of the form: "[datasetkey] = [config_file_location]". So, for example, the "dawn97" dataset key might be paired with its location as follows:

dawn97 = /icpsr/samhda/dawn/dawn97.xml

The "location properties" file just provides the connection between a dataset key used in a Quick Tables start-up link and the actual location of the XML configuration file for that dataset. That's really all there is to it -- although we'll discuss the rationale for this design and some technical points below.

Rationale for the design: You might wonder why the Quick Tables application uses a location properties file to maintain the location of individual configuration files since the SDA Analysis suite doesn't use a similar scheme for storing the location of its "HARC" files. There are a number of reasons.

First, although SDA allows the use of multiple HARC files, the number actually used by a particular organization on its Web server is usually fairly small. In fact, many organizations use a single HARC file since they can put configuration information for all their datasets together without creating an unduly large file. However, the verbosity of the Quick Tables configuration files makes it impractical to put the configuration information for multiple datasets together in a single, mammoth file. Instead, the Quick Tables application uses a different configuration file for each dataset's group of tables. So there will be as many XML configuration files as there are datasets set up for Quick Tables.

Also, although the SDA Analysis "HARC" file(s) are typically all kept in the same "cgi-bin" directory as the CGI programs, the Quick Tables configuration files should probably be stored in different directory locations (and perhaps different accounts) for each topical archive. That way each archive can have easy access to its own configuration files without any danger of unintentionally altering other sensitive files. (Creating Quick Tables is a somewhat "iterative" process where you try something out and see how it looks, then tweak things a bit and try again.)

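Taking all this into account, the location properties file gives a number of advantages: each dataset can have its own configuration file, each archive can keep its configuration files in its own directory (or account), and the start-up links only ever contain a short dataset key.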

Technical note on specification of the location properties file: The location properties file can have any name and location that the system administrator desires. This information is registered in the servlet container's configuration file ("server.xml" for Apache's Tomcat) and is thereby made available to the Quick Tables application. The following snippet from an example "server.xml" file shows how it's done. The specification of the "location.properties" file is done in the <Environment> element by setting the environment variable "locproperties" to the path/name of the file on your particular system:

    <Context path="/quicktables" docBase="/csm/7507webapps/quicktables.war"
             debug="0" reloadable="true" crossContext="true">

        <Environment name="locproperties" type="java.lang.String"
                     value="/csm/7507webapps/location.properties"/>
    </Context>

Technical note on file format: the "location properties" file is, in fact, a Java "properties" file and must follow the simple conventions of that file format. You can read the Javadocs for the "Properties" class if you really want to know the details, but here's the story in a nutshell:

Blank lines are ignored. Also, any line whose first non-whitespace character is a '#' or '!' is considered a comment line and ignored. For all other lines, the "key" starts with the first non-whitespace character and continues up to the first whitespace, '=', or ':' character. Technically, you don't need to insert a '=' (or ':') character between the dataset key and the location specification; you can just separate them with some whitespace. However, we recommend, for readability, that you separate the dataset key from the location with a '='. Here's an example location properties file with some comments on formatting:

# Any line, like this one, that starts with a '#' is considered a comment.
# The following lines are both OK, although the second one doesn't follow the 
# recommendation to use a '=' between the dataset key and the location.

dawn94 = /icpsr/dawn/dawn94.xml
lsa8490 /icpsr/lsa/lsa8490.xml

# The following lines are NOT OK.

# Internal whitespace in a dataset key is illegal.
dawn 97 = /icpsr/dawn/dawn97.xml

# Also, DON'T use the delimiter characters '=' or ':' in the dataset key.
dawn:97 = /icpsr/dawn/dawn97.xml
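
To see why the last two lines are a problem, it helps to apply the key rule by hand. The following Python sketch re-implements just the simplified rules described above (it is not Java's "Properties" class, and it ignores escapes and line continuations); the function name "parse_location_line" is hypothetical.

```python
def parse_location_line(line):
    """Split one properties-file line into (key, value).

    Returns None for blank and comment lines. Simplified sketch:
    backslash escapes and continuation lines are not handled.
    """
    stripped = line.lstrip()
    if not stripped or stripped[0] in "#!":
        return None
    # The key runs up to the first whitespace, '=' or ':' character.
    i = 0
    while i < len(stripped) and not (stripped[i].isspace() or stripped[i] in "=:"):
        i += 1
    key, rest = stripped[:i], stripped[i:].lstrip()
    # At most one '=' or ':' separator is consumed, then more whitespace.
    if rest[:1] in ("=", ":"):
        rest = rest[1:].lstrip()
    return key, rest.rstrip("\r\n")


for text in ("dawn94 = /icpsr/dawn/dawn94.xml",
             "lsa8490 /icpsr/lsa/lsa8490.xml",
             "dawn 97 = /icpsr/dawn/dawn97.xml",
             "dawn:97 = /icpsr/dawn/dawn97.xml"):
    print(parse_location_line(text))
```

Under these rules both "NOT OK" lines are read as the key "dawn" with a value that begins with "97 =", so the intended dataset key is never registered.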


3. Create an XML configuration file for the dataset

DTD and Example

Now that you've specified the location of the dataset configuration file (in the "location properties" file) you just need to create the dataset configuration file itself. It is an XML file and must follow the conventions of XML.

The DTD for the configuration file -- which spells out exactly what the file must contain and can be used to validate your XML configuration files -- is available at http://sda.berkeley.edu/info/sdaconfig.dtd. (If your browser cannot display the DTD directly, here's a version formatted as HTML.) Important: you should make a local copy of the sdaconfig DTD and refer to that local copy in your own XML configuration files. If the DTD is on a remote server that is currently inaccessible, parsing will fail and Quick Tables will not run correctly. Even if the remote server is accessible, accessing the DTD over the network may hurt performance. Ideally, the DTD should be located on the same server as the Quick Tables application.

Here's a link to an example XML configuration file. The file is heavily commented and a little more complicated than the example at the top of this page. It should give you a good idea of what's involved in creating a Quick Tables XML configuration file.
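
Before pointing Quick Tables at a new configuration file, it can be handy to list what the file actually defines. The Python sketch below is one way to do that with the standard library. It only checks that the XML is well-formed (it does not validate against the DTD), the function name "summarize_tables" is made up, and the embedded configuration is a shortened copy of the example at the top of this page.

```python
import xml.etree.ElementTree as ET

CONFIG = """\
<sdaconfig>
  <quicktables>
    <table title="Attitudes about Military or Welfare Spending">
      <rowvars label="Select the type of spending you want to analyze">
        <var label="Military spending">spend</var>
        <var label="Welfare spending">spend4</var>
      </rowvars>
      <columnvars label="Select the breakdown that you want - by:">
        <var label="Political party">party</var>
        <var label="Race groups">race</var>
      </columnvars>
    </table>
  </quicktables>
</sdaconfig>
"""


def summarize_tables(xml_text):
    """Return {table title: {group tag: [(label, variable spec), ...]}}."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    summary = {}
    for table in root.iter("table"):
        groups = {}
        for group in table:  # rowvars, columnvars, filter, ...
            groups[group.tag] = [
                (var.get("label"), (var.text or "").strip())
                for var in group.findall("var")
            ]
        summary[table.get("title")] = groups
    return summary


summary = summarize_tables(CONFIG)
for title, groups in summary.items():
    print(title, groups)
```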

The <pagetiles> Element

One specification that needs a little more explanation is the "pagetiles" element. Each Quick Tables HTML page is made up of a set of "tiles" -- pieces of the page such as the header, footer, menu, etc. These tiles surround the "core" part of the page and together define a certain "look and feel". Each ICPSR special archive (or other group) can define its own page tile set and refer to those tiles in its configuration files. We've already created sets of tiles for the four archives that currently use SDA: SAMHDA, NACJD, NACDA and IAED. You can see the effect of these pagetile sets on the Quick Tables live demo page. It's these "pagetiles" that control the "style" for each archive.

Here's the "pagetiles" specification you'd use for a SAMHDA dataset:

<pagetiles 
    head="/tiles/headSamhda.jsp" 
    header="/tiles/headerSamhda.jsp" 
    menu="/tiles/menuSamhda.jsp" 
    footer="/tiles/footerSamhda.jsp" 
    preresults="/tiles/preresultsIcpsrGeneric.jsp" 
    postresults="/tiles/postresultsIcpsrGeneric.jsp" /> 

And here's the "pagetiles" specification you'd use for a NACJD dataset:

<pagetiles 
    head = "/tiles/headNacjd.jsp"
    header = "/tiles/headerNacjd.jsp"
    menu = "/tiles/menuNacjd.jsp"
    footer = "/tiles/footerNacjd.jsp"
    preresults = "/tiles/preresultsIcpsrGeneric.jsp"
    postresults = "/tiles/postresultsIcpsrGeneric.jsp" /> 

First, note that all the tiles are in the application's "/tiles" subdirectory and all the files have a ".jsp" extension. Second, note that the "head", "header", "menu" and "footer" attributes all use the same naming scheme -- ending with the name of the archive. The NACDA and IAED tile sets use the same naming scheme, ending with "Nacda" and "Iaed" respectively. The "preresults" and "postresults" attributes are special cases. These tiles are used only on the Quick Tables "results" page to explain the output. These tiles are the same for all four archives and therefore should always be specified as they are in the two examples above.
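
Because the naming scheme is so regular, the "pagetiles" element for any of the four archives can be generated from the archive suffix alone. Here's a small Python sketch of that; the function name "pagetiles_element" is hypothetical, and the sketch assumes the tile files follow exactly the scheme described above.

```python
from xml.sax.saxutils import quoteattr


def pagetiles_element(archive):
    """Build a <pagetiles> element for an archive suffix such as 'Samhda'."""
    # head, header, menu and footer all end with the archive name.
    attrs = {name: f"/tiles/{name}{archive}.jsp"
             for name in ("head", "header", "menu", "footer")}
    # The results-page tiles are the same for all four archives.
    attrs["preresults"] = "/tiles/preresultsIcpsrGeneric.jsp"
    attrs["postresults"] = "/tiles/postresultsIcpsrGeneric.jsp"
    body = "\n".join(f"    {name}={quoteattr(value)}"
                     for name, value in attrs.items())
    return f"<pagetiles\n{body} />"


print(pagetiles_element("Nacda"))
```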

Using Temporary Recodes

You can use SDA's temporary recode syntax when you specify variables in a dataset's configuration file. For example:

   <var label = "Race groups">
      race( r:1 [White]; 2 [Black]; 3 [Hispanic]; 5 [Asian]; 4,6-* [All Others] )
   </var>

The only difference from the usual recode syntax is the specification of the (optional) category labels. In dataset configuration files you must surround the category labels with square brackets rather than double-quotes. Note that in the above example we use [Hispanic] rather than "Hispanic".

You'd typically use temporary recodes to alter a variable's coding scheme -- perhaps to collapse an unwieldy number of categories into a smaller number. But there's another use for recodes that isn't obvious at first but can be very useful: the temporary recode mechanism allows you to change or improve a variable's category labels without re-creating the variable by altering the DDL file and re-running MAKESDA.

We'll illustrate this with a real example that came up in the TEDS 2000 dataset. One of the category labels in the "MARSTAT" (Marital Status) variable is problematic: "DIVORCED/WIDOWED". When the variable is used as a column variable, this label produces a very wide and unattractive column heading in the resulting table, because Internet Explorer won't use a '/' as a "break-point" to split up a heading. It will break at a blank -- like that in "NEVER MARRIED" or "NOW MARRIED" -- or at a hyphen but, for whatever reason, not at a slash. Therefore, we'd like to replace the slash with a hyphen so the label will "break" between the words and produce a narrower column heading. However, we don't want to re-create the variable with MAKESDA to do it.

We can use the SDA temporary recode mechanism to do the trick. We just re-specify the labels, not the codes, by using the following entry in the configuration file:

 <var label="Marital status">
   MARSTAT(r:1 [NEVER MARRIED]; 2 [NOW MARRIED]; 3 [SEPARATED]; 4[DIVORCED-WIDOWED])
 </var>        

We haven't changed the original codes but we've changed the unattractive label. Now when the Quick Table is displayed the "DIVORCED-WIDOWED" column heading is split up between the words like the other labels.

You can use this same technique to shorten or just improve any category labels in Quick Tables without altering the SDA variables themselves -- a very handy tool.
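
If you relabel many variables, typing the recode strings by hand gets tedious. The Python sketch below assembles a recode specification from a list of (codes, label) pairs, using the square-bracket label form required in configuration files. The helper name "temp_recode" is made up, and the sketch only builds the string; it knows nothing about which codes actually exist in the dataset.

```python
def temp_recode(variable, categories):
    """Build a temporary recode spec from (codes, label) pairs.

    Labels are wrapped in square brackets, as configuration files require.
    """
    parts = [f"{codes} [{label}]" for codes, label in categories]
    joined = "; ".join(parts)
    return f"{variable}(r:{joined})"


marstat = temp_recode("MARSTAT", [
    ("1", "NEVER MARRIED"),
    ("2", "NOW MARRIED"),
    ("3", "SEPARATED"),
    ("4", "DIVORCED-WIDOWED"),
])
print(marstat)
```

The resulting string goes between the <var> and </var> tags, just like the hand-written MARSTAT entry above.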

Using XML's "External General Entities" for Repeated Information

When creating a Quick Tables configuration file, you often need to repeat certain passages of XML. It's time-consuming and error-prone to copy the same "clump" of XML from one place to another. Suppose, for example, you've just made ten copies of a certain convoluted passage of XML and you suddenly discover you need to change something in each copy. Ugh -- not a pretty picture. You'd like to keep a single copy of a passage like this and simply "include" it using a short-hand reference wherever it's needed in a file. Luckily the XML standard provides exactly that feature. It's called an "External General Entity".

Here's an example usage of an "External General Entity" taken from a Quick Tables configuration file created for the GSS Cumulative Dataset. For GSS Quick Tables we wanted to provide a "year of interview" filter for every table. You can view the XML text for this filter here. The text resides in an "external" file called "filtyear.xml" -- hence the name "external general entity". We want to "include" the contents of this external file in several places in our main configuration file. Here's how we do it:

  1. Declare the external entity as part of the DOCTYPE declaration near the top of the main configuration file:
    <!DOCTYPE sdaconfig SYSTEM "http://sda.berkeley.edu/info/sdaconfig.dtd" [
    <!ENTITY FilterYear SYSTEM "filtyear.xml">
    ]>
      
  2. Now, whenever you want to include this external entity/file in your main configuration file, you just refer to it like this:
     &FilterYear; 
    The contents of the "filtyear.xml" file will be included wherever the "&FilterYear;" reference is made.

Here's a more complicated example that uses a number of external general entities.

For more information on using external general entities you can see this brief tutorial from IBM's developerWorks or refer to almost any XML reference.
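
To make the mechanism concrete, here's a toy Python sketch that mimics, purely textually, what the XML parser does with an external general entity: it finds the <!ENTITY ... SYSTEM "..."> declarations and replaces each reference with the content of the named file. This is an illustration only (a real XML parser does the work for you), the function name "expand_external_entities" is made up, and the "filtyear.xml" content shown is a hypothetical stand-in, not the actual GSS filter.

```python
import re

ENTITY_DECL = re.compile(r'<!ENTITY\s+(\w+)\s+SYSTEM\s+"([^"]+)"\s*>')


def expand_external_entities(main_text, read_file):
    """Replace each &Name; with the content of its declared external file."""
    entities = dict(ENTITY_DECL.findall(main_text))  # name -> file name

    def replace(match):
        name = match.group(1)
        # Leave undeclared references (such as &amp;) untouched.
        return read_file(entities[name]) if name in entities else match.group(0)

    return re.sub(r"&(\w+);", replace, main_text)


# Hypothetical file contents, held in memory so the sketch runs anywhere.
FILES = {
    "filtyear.xml": '<filter label="Year of interview">'
                    '<var label="(All years)">##none</var></filter>',
}

MAIN = """<!DOCTYPE sdaconfig SYSTEM "sdaconfig.dtd" [
<!ENTITY FilterYear SYSTEM "filtyear.xml">
]>
<sdaconfig><quicktables>
<table title="First table">&FilterYear;</table>
<table title="Second &amp; last table">&FilterYear;</table>
</quicktables></sdaconfig>"""

expanded = expand_external_entities(MAIN, FILES.__getitem__)
print(expanded)
```

Both tables end up with their own copy of the filter, while the single source of truth stays in the one external file.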


Last modified: 7/1/03