Approaches to the Management and Analysis of ICPSR Data
Use of ICPSR data requires facility with a statistical package. The most
widely used packages in the social sciences are SAS, Stata and SPSS. After
selecting your data management and analysis tool, you typically use it
to subset the relevant variables and cases from the original data into
a "workfile" in your user space. Several approaches are possible.
- Do data management and analysis on a UNIX system - workfiles
are extracted from the archival data files and then analyzed
in place.
- Do data management on a UNIX system, download your workfiles
to a PC and complete the analysis on a PC - in moving the extracted
workfiles, care must be exercised to ensure that they will be usable
in the PC environment. FTP downloads of system files must typically
be done in binary mode. In some cases, UNIX system files are not usable
in a PC environment and you must put them in a "portable" format before
downloading them.
- Do all data management and analysis on a PC - archival
data files are stored in compressed format, so copies of the compressed
files must be made available on the PC in an uncompressed format. Two
types of compression are in use. One type results in files with .Z
extensions. These files must be decompressed before they are downloaded.
This is done with the UNIX uncompress command (see "man uncompress"
for an online usage reference) and requires you to
write the uncompressed version to your UNIX file space. The other type
results in files with .gz extensions. These files can
be downloaded directly with binary FTP operations and decompressed on
a PC with the WinZip utility. All new data orders are being processed
with gzip compression to facilitate PC downloads, and any existing holdings
in .Z format will be transformed to gzip compression upon request.
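
For readers who would rather script the PC-side steps than use an FTP client and WinZip, the following is a minimal sketch in Python, not a procedure documented by ICPSR. It shows a binary-mode FTP transfer followed by decompression of a .gz file. The host name, file paths, and the function names download_binary and gunzip are hypothetical placeholders chosen for illustration. The sketch does not handle .Z files, which Python's gzip module cannot read.

```python
import gzip
import shutil
from ftplib import FTP
from pathlib import Path


def download_binary(host, remote_path, local_path, user="anonymous", password=""):
    """Fetch one file over FTP in binary mode.

    host, remote_path, and local_path are placeholders supplied by the caller.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(local_path, "wb") as out:
            # retrbinary transfers the file byte for byte, which is what
            # compressed files and system files require.
            ftp.retrbinary(f"RETR {remote_path}", out.write)


def gunzip(src, dest=None):
    """Decompress a .gz file and return the path of the uncompressed copy.

    If dest is omitted, the output name is src with its last suffix removed.
    """
    src = Path(src)
    dest = Path(dest) if dest else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        # Copy in chunks so large data files need not fit in memory.
        shutil.copyfileobj(fin, fout)
    return dest
```

A call such as gunzip("study.txt.gz") would write study.txt next to the compressed file and leave the .gz copy in place.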