
Organization of ICPSR Holdings

The organization of local data holdings addresses several needs:

  • It provides a central storage location with distributed access. Social scientists typically access these data from UNIX systems in Arts and Sciences. However, anyone at Duke can access the holdings through their ACPUB account by logging in to the gateway host godzilla.acpub.duke.edu.

  • As local holdings accumulate through user requests, a growing body of data becomes available, and its content is driven by user demand.

  • The use of ICPSR data on UNIX systems does not require the user to copy the often large archival files into personal directory space.

  • The archives are accessed directly on UNIX systems, so the researcher need only be concerned about space for the workfile extracts.

The machine-readable files distributed by ICPSR fall into three broad categories:

  • Raw Data - alphanumeric text files stored in compressed format. Compression keeps disk usage to a minimum: most files shrink by 80% or more. On a UNIX system the data can be decompressed and piped on the fly to several different statistical packages. Data are rarely archived in the system file format of a statistical package because such files are much larger than compressed raw data.
     
  • Statistical Package Control Statements - many studies now include sets of SAS or SPSS control statements for reading in the raw data and defining variable names, value labels, and missing values. When these are available, the user should copy them to a personal directory and adapt them into a program that extracts the desired subset from the archival data.

  • Documentation - most commonly in codebook format. For many years ASCII text codebooks were distributed, but the standard has recently shifted to the Portable Document Format (PDF), which requires a viewer such as Adobe Acrobat Reader. PDF files are larger, but they allow for better quality documents (including nicely formatted survey instruments, schematics, diagrams, and the like) and for selective printing. In general, a much wider variety of documentation is becoming available with new study releases, but the cost of printing is shifting to the user.

One or more data files are associated with each study. Control statement and documentation files are optional and are less likely to be found with older studies.
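
The idea of reading compressed raw data on the fly and keeping only a small workfile extract can be sketched in Python. This is a minimal illustration, not the procedure used with the local holdings. It assumes the archival file is gzip-compressed (the text above does not name the compression format) and that records are fixed-width. The function name extract_columns, the file names, and the column positions are all hypothetical.

```python
import gzip
import os
import tempfile


def extract_columns(archive_path, extract_path, fields):
    """Stream a gzip-compressed fixed-width file and write selected columns.

    fields maps a variable name to a (start, end) pair of 0-based,
    end-exclusive column positions (hypothetical layout).
    Returns the number of records written to the extract.
    """
    count = 0
    # The archive is decompressed line by line in memory; no full
    # uncompressed copy is ever written to disk.
    with gzip.open(archive_path, "rt", encoding="ascii") as src, \
            open(extract_path, "w", encoding="ascii") as dst:
        dst.write(",".join(fields) + "\n")  # header row of variable names
        for line in src:
            record = line.rstrip("\n")
            values = [record[start:end].strip()
                      for start, end in fields.values()]
            dst.write(",".join(values) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Build a tiny stand-in for an archival file (invented data).
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "study.txt.gz")
    with gzip.open(archive, "wt", encoding="ascii") as f:
        f.write("0001 34 2\n0002 51 1\n")

    extract = os.path.join(workdir, "extract.csv")
    n = extract_columns(archive, extract, {"caseid": (0, 4), "age": (5, 7)})
    print(n, "records written to", extract)
```

Only the extract takes up space in the user's directory; the archive itself is read in place. In practice the column positions would come from the study's codebook or control statements.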

 

Webmaster: socsciweb@aas.duke.edu