Approaches to the Management and Analysis of ICPSR Data

Working with ICPSR data requires familiarity with a statistical package. The most widely used packages in the social sciences are SAS, Stata, and SPSS. Once you have chosen a package for data management and analysis, you typically use it to extract the variables and cases you need from the original data into a "workfile" in your user space. Several approaches are possible.

  • Do data management and analysis on a UNIX system: workfiles are extracted from the archival data files and then analyzed in place.
     
  • Do data management on a UNIX system, then download your workfiles to a PC and complete the analysis there: when moving the extracted workfiles, take care that they remain usable in the PC environment. FTP downloads of system files must typically be done in binary mode. In some cases, UNIX system files are not usable on a PC, and you must convert them to a "portable" format before downloading them.
     
  • Do all data management and analysis on a PC: archival data files are stored in compressed form, so uncompressed copies must be made available on the PC. Two types of compression are in use. Files with .Z extensions must be decompressed before they are downloaded. This is done with the UNIX uncompress command ("man uncompress" provides an online usage reference) and requires you to write the uncompressed version to your UNIX file space. Files with .gz extensions can be downloaded directly with binary-mode FTP and decompressed on the PC with the WinZip utility. All new data orders are being processed with gzip compression to facilitate PC downloads, and existing holdings in .Z format will be converted to gzip compression upon request.
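The binary download and the decompression of a .gz file described in the last approach can also be scripted. The following is only a minimal sketch using Python's standard ftplib and gzip modules, offered as an alternative to an interactive FTP client and WinZip. The function names fetch_binary and gunzip, and any host, login, or file names passed to them, are hypothetical and not part of the procedure described above. The gzip module reads .gz files only; .Z files still have to be uncompressed on the UNIX system first.

```python
import gzip
import shutil
from ftplib import FTP
from pathlib import Path


def fetch_binary(host, user, password, remote_name, local_path):
    """Download one file over FTP in binary mode.

    All arguments are placeholders supplied by the caller; no real
    server name or account is assumed here.
    """
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        with open(local_path, "wb") as out:
            # retrbinary transfers the bytes unchanged (binary/image mode),
            # which compressed files and system files require.
            ftp.retrbinary(f"RETR {remote_name}", out.write)


def gunzip(src, dest=None):
    """Decompress a .gz file and return the path of the uncompressed copy.

    If dest is omitted, the .gz suffix is dropped from the source name.
    """
    src = Path(src)
    if dest is None:
        if src.suffix != ".gz":
            raise ValueError("expected a file name ending in .gz")
        dest = src.with_suffix("")
    dest = Path(dest)
    # Stream the data so large archival files need not fit in memory.
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

A call such as gunzip("study.dat.gz") (a made-up file name) would write study.dat next to the compressed copy, which can then be read by the statistical package on the PC.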

 


Webmaster: socsciweb@aas.duke.edu