Techniques for Accessing ICPSR Holdings on a UNIX System

Accessing ICPSR holdings on a UNIX system entails the following steps:

  1. Locate the study's Data Information Sheet, which you obtained from the Duke search mechanism or from the notification you received about a new order. This sheet gives the specifics of the files that make up the study, including file content, the physical file name, the record length (particularly important for data files), and a record count (often equal to the number of observations).
     
  2. Log in to the UNIX system you will be using. If you are using the ACPUB system, you must log in to godzilla.acpub.duke.edu to access the archive directory. The archive directory is accessible from each of the central UNIX compute servers and from those used by Sociology and Economics.
     
  3. Confirm the availability of the data by changing to its directory location:
      cd /opt/archive/icpsr/s####
     
    where: #### = the ICPSR study number

  4. List the files in the directory and compare them against your Data Information Sheet:
      ls -l
     
  5. Review the documentation and determine which portions of the study are of interest. Depending on the complexity of a study, data management issues can become very involved, so think them through carefully.
     
  6. For each raw data file you need, the program that reads it must include a file reference statement that instructs the UNIX operating system to decompress the data on the fly and pass the decompressed data to the statistical package used to extract a workfile. The techniques for SAS are illustrated below. [Details for performing the same operations in SPSS and Stata will be added at a later date.] The procedure differs among UNIX operating systems, which complicates matters.
     
    For systems running Tru64 UNIX (Sociology), one simple format applies in all instances:
      filename form1 pipe 'zcat /opt/archive/icpsr/s2939/da2939.form1.gz' lrecl=1237;
    The pipe option instructs SAS t
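
Before writing the SAS program, it can help to confirm that a compressed file matches the record length and record count on the Data Information Sheet. The following is a minimal sketch, not part of the original procedure. The function name check_records is hypothetical, and the sketch assumes one newline-terminated record per line. It uses gzip -dc, which decompresses to standard output as zcat does for .gz files.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the archive tooling):
# count the records in a gzip-compressed data file and report how many
# differ from the expected record length.
# usage: check_records FILE.gz LRECL
check_records() {
    # Decompress to standard output without creating an uncompressed copy.
    gzip -dc "$1" | LC_ALL=C awk -v want="$2" '
        { n++; if (length($0) != want) bad++ }
        END { printf "records=%d mismatched=%d\n", n, bad }'
}

# Example call using the path and record length from the SAS filename
# statement above; it runs only if the file is readable on this system.
f=/opt/archive/icpsr/s2939/da2939.form1.gz
if [ -r "$f" ]; then
    check_records "$f" 1237
fi
```

The records value can be compared with the record count on the sheet, and a nonzero mismatched value suggests that the lrecl setting or the file deserves a closer look. Files with carriage returns at the end of each line would show every record as one character too long.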

