SDA 4.1 Documentation for SDALOG


NAME

sdalog - Generate a report of SDA usage

USAGE

sdalog -g filename [options]

DESCRIPTION

SDALOG reads the SDA logfile and generates a report on SDA usage. Note that the logfile read by SDALOG is the special file written by SDA -- not the access log maintained by the web server software. The name of the logfile is specified in the SDA Manager as a Global Specification that applies to a group of SDA datasets.

OUTPUT FORMATS

The default output format reports the following information:

An optional format (used if a '-c' option is specified) breaks down usage by client address (rather than by dataset).

If the '-c all' option is used, this style of report will include the full addresses (the hostname, if available, or else the numeric IP).

If the '-c 1', '-c 2', or '-c 3' option is used, then only the last one, two, or three segments of the hostnames (if available) are displayed. The last segment of a hostname is the top level domain name, such as 'COM' or 'EDU', so if the '-c 1' option is specified, a summary of usage by top level domain will be generated. With these options, the numeric IP addresses are not displayed separately; rather, they are combined and reported as a group.

Note that the ability to display all or part of a hostname assumes that Tomcat has been configured (when SDA was installed) to resolve IP addresses (usually possible only for a subset of client addresses). See the SDA Installation Guide for more information on configuring Tomcat to show hostnames in the SDA log file.


OPTIONS

The following command-line options are recognized. Some options affect the logfile -- the pathname of the file, and which records in the logfile should be included in the report. Other options affect the output format -- whether to produce the default output or the optional client address output. The only required option is the specification of the name of the logfile.

Log File Options

-g filename
The specified filename is the pathname of a logfile maintained by SDA. (REQUIRED)

-r range_of_dates_filter
The report can be limited to a range of dates (or a single date). A date must be in the format MM/DD/YYYY. For example: 12/31/2023. The year must be the full four digits. However, single digit months or days do not need to be filled with a leading zero. For example: 1/5/2024. A range of dates must be separated by a hyphen. For example: 6/1/2023-12/31/2023. A date range specification cannot contain any spaces. This option cannot be repeated.
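
The date rules above can be sketched in Python as follows. This is only an illustration of the documented format, not SDALOG's own parsing code; the function name parse_date_range is hypothetical, and treating a single date as a one-day range is an assumption.

```python
import re
from datetime import date

# One date: 1-2 digit month, 1-2 digit day, full four-digit year.
_DATE = r"(\d{1,2})/(\d{1,2})/(\d{4})"
# A single date, or two dates joined by a hyphen, with no spaces anywhere.
_RANGE = re.compile(_DATE + r"(?:-" + _DATE + r")?")


def parse_date_range(spec):
    """Return (start, end) for a -r style specification (hypothetical helper)."""
    match = _RANGE.fullmatch(spec)
    if match is None:
        raise ValueError("badly formed date range: %r" % spec)
    m1, d1, y1, m2, d2, y2 = match.groups()
    start = date(int(y1), int(m1), int(d1))
    # Assumption: a single date stands for a range of exactly one day.
    end = date(int(y2), int(m2), int(d2)) if y2 else start
    return start, end
```

Under these assumptions, parse_date_range("6/1/2023-12/31/2023") yields June 1 through December 31, 2023, while a two-digit year such as "1/5/24" or a specification containing spaces is rejected with ValueError.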

-s study_name_filter
The report can be limited to specified study name(s). The match is case-insensitive. Multiple study names can be matched in one specification by using an asterisk (*) as a wildcard. The asterisk will match any characters (of any number). Also, multiple asterisks can be used within a study name. For example, a study name specification of 'anes*' will match 'anes', 'anes2020', 'anes-current', etc. A study name specification of '*nes*' will match 'nes', 'nes2000', 'anes', 'anes2004', etc. If wildcards are not used, then the specification must match the full name of the dataset in the SDA log file. Note that only one study specification can be used with each -s option. But the -s option can be repeated.
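
The matching rule described above (case-insensitive, full-name match, asterisk matches any run of characters) can be sketched in Python as follows. This illustrates the documented behavior only and is not SDALOG's implementation; the function name wildcard_match is hypothetical.

```python
import re


def wildcard_match(spec, name):
    """Return True if name matches spec under the documented wildcard rule."""
    # Escape everything except the asterisks, then let each asterisk
    # stand for any characters (of any number, including none).
    pattern = ".*".join(re.escape(part) for part in spec.split("*"))
    # fullmatch: without wildcards the specification must match the full name.
    return re.fullmatch(pattern, name, flags=re.IGNORECASE) is not None


# The examples from the text above:
print(wildcard_match("anes*", "anes-current"))  # True
print(wildcard_match("*nes*", "anes2004"))      # True
print(wildcard_match("anes", "anes2020"))       # False (no wildcard, not the full name)
```

The same rule is documented for the -a address filter, so a specification such as '*berkeley.edu' would be tested against each client address in the same way.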

-a address_filter
The report can be limited to one or more client addresses (hostname or numeric IP). The match is case-insensitive. An asterisk (*) can be used as a wildcard to match any characters (of any number). Multiple asterisks can be used within a specified address. For example, a specification of '*berkeley.edu' will match 'airbears.berkeley.edu', 'reshall.berkeley.edu', etc. (Note that due to the specificity of client addresses it is often necessary to use wildcards to get useful results.) Note that only one address specification can be used with each -a option. But the -a option can be repeated.

Output Options

-o filename
Output from SDALOG will be written to this file. If this option is not specified, output will be routed to the user's screen (standard output).

-c all
The report will list the full client addresses (hostnames, if available, or numeric IP addresses if not) of the computers used by the SDA users (instead of the default output format). The number of procedures executed by each client will also be reported. This is the only option that lists numeric IP addresses individually. The other client-based options combine all the numeric IP addresses into a single group.

-c 3
The last 3 segments of hostnames will be reported. Example: airbears.berkeley.edu

-c 2
The last 2 segments of hostnames will be reported. Example: berkeley.edu

-c 1
Only the last segment (top level domain) of hostnames will be reported. Examples: 'edu' or 'com'
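
The grouping performed by the -c options can be sketched in Python as follows. This is a minimal illustration of the documented behavior, not SDALOG's code; the names client_key and tally and the group label "(numeric IP)" are hypothetical, and the actual report layout may differ.

```python
import ipaddress
from collections import Counter


def client_key(address, segments):
    """Map a client address to the group it is counted under."""
    try:
        ipaddress.ip_address(address)
        is_numeric = True
    except ValueError:
        is_numeric = False
    if segments == "all":
        # '-c all': every address, numeric or not, is listed individually.
        return address
    if is_numeric:
        # '-c 1/2/3': numeric IPs are combined into one group (label is hypothetical).
        return "(numeric IP)"
    # Keep only the last N dot-separated segments of the hostname.
    return ".".join(address.split(".")[-int(segments):])


def tally(addresses, segments):
    """Count log entries per client group."""
    return Counter(client_key(a, segments) for a in addresses)
```

For instance, with segments set to 2, the hostnames 'airbears.berkeley.edu' and 'reshall.berkeley.edu' are both counted under 'berkeley.edu'; with segments set to 1, both are counted under 'edu'.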

Miscellaneous Options

-x filename
Write lines with badly formed log entries (if any) into this file. This option is for diagnostic purposes.

-u
Print out a list of options (but do not execute the program).

Deprecated Options

The addition and/or enhancement of various options has made some older options obsolete. The -f, -F, and -e options have been deprecated and will be removed entirely from a later version of SDALOG.

EXAMPLES

Basic example
sdalog -g SDAlog -o logreport.txt

Filter for a specific dataset (GSS2020)
sdalog -g SDAlog -s gss2020 -o logreport.txt

Get the top level domains of users
sdalog -g SDAlog -c 1 -o logreport.txt
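
When SDALOG is run from a script, the options above can be assembled programmatically. The following Python sketch builds an argument list using only the -g, -r, -s, -a, and -o options documented here; the function name build_sdalog_args is hypothetical, and the sketch assumes an 'sdalog' executable is on the search path.

```python
def build_sdalog_args(logfile, dates=None, studies=(), addresses=(), output=None):
    """Build an sdalog argument list (hypothetical helper)."""
    args = ["sdalog", "-g", logfile]   # -g is the only required option
    if dates is not None:
        args += ["-r", dates]          # -r cannot be repeated
    for study in studies:
        args += ["-s", study]          # one specification per -s; -s may repeat
    for address in addresses:
        args += ["-a", address]        # one specification per -a; -a may repeat
    if output is not None:
        args += ["-o", output]
    return args


args = build_sdalog_args(
    "SDAlog",
    dates="6/1/2023-12/31/2023",
    studies=["gss*", "anes*"],
    output="logreport.txt",
)
# To run it: import subprocess; subprocess.run(args, check=True)
```

Passing the arguments as a list, rather than as a single shell command string, keeps the shell from expanding the asterisks as filename patterns. When typing such a command directly in a shell, quoting the specification (for example 'gss*') has the same effect.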


CSM, UC Berkeley/ISA
June 6, 2024