SAS (Statistical Analysis System)
Cute Girl • onInformation 11 years ago • 4 min read


SAS (Statistical Analysis System) is a software package for data manipulation and statistical analysis. The user writes a SAS program to perform the desired tasks. A SAS program is composed of 2 fundamental components:

DATA step(s) -- the part of the program in which a structure for the data to be analyzed is created. This structure exists while the program is running. Variables corresponding to the various elements of the data set are defined, and the data are assigned to the variables. Data may be input manually in the body of the program, or they may be read in from a file. We will not get very fancy in this course with data input and manipulation, but be aware that it is possible to perform very complex data manipulation and to analyze very large data sets using SAS. It is also possible to create permanent data sets and files for use with later SAS runs.
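
As a minimal sketch of a DATA step, the program below enters data in the body of the program and, in a commented-out variant, reads the same layout from a file. The data set names, variable names, and the file path are hypothetical and chosen only for illustration.

```sas
/* Data entered manually in the body of the program */
data plots;
    input plot trt $ grain;   /* trt is a character variable */
    datalines;
1 A 12.3
2 B 14.1
3 A 11.8
4 B 15.0
;
run;

/* Same structure read from an external file.
   The path is a hypothetical placeholder, so the step is commented out.
data plots2;
    infile '/home/user/plots.dat';
    input plot trt $ grain;
run;
*/
```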

PROCs (PROCedures) -- the SAS language is organized into a series of procedures, or PROCs, each of which is dedicated to a particular form of data manipulation or statistical analysis to be performed on data sets created in the DATA step. In the example programs below, we will consider several different PROCs:

PROC MEANS: computes means, standard deviations and other summary statistics for some or all of the variables in a data set.

PROC TTEST: computes the 2-sample t-test for comparing the means of 2 treatments.

PROC REG: performs regression analysis using the method of least squares.

PROC GLM: constructs the analysis of variance desired by the user, with associated F statistics, contrasts, etc.

PROC PLOT: constructs plots of the data as specified by the user.

PROC PRINT: prints the contents of a data set.
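
To give a feel for the syntax, here is a short sketch that calls three of the PROCs listed above. It assumes the sample data set SASHELP.CLASS that normally ships with SAS is available; the choice of variables is illustrative only and is not taken from the course programs.

```sas
/* 2-sample t-test: compare mean height between the two sexes */
proc ttest data=sashelp.class;
    class sex;
    var height;
run;

/* Least squares regression of weight on height */
proc reg data=sashelp.class;
    model weight = height;
run;
quit;

/* Line-printer scatter plot of the same two variables */
proc plot data=sashelp.class;
    plot weight*height;
run;
quit;
```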

There are many other PROCs to perform data manipulation and other kinds of analyses; the SAS documentation describes these. We will discuss features of the PROCs above and some additional PROCs in lecture, demonstration labs, and in homework assignments.

A SAS program consists of one or more DATA steps to get the data into a format that SAS can understand and one or more calls to PROCs to perform various analyses on the data.
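
Put together, a complete program might look like the sketch below: one DATA step followed by two PROC calls. The data set GROWTH and its values are made up for illustration.

```sas
/* DATA step: get the data into a SAS data set */
data growth;
    input diet $ gain;
    datalines;
A 4.1
A 3.8
A 4.6
B 5.2
B 5.9
B 5.5
;
run;

/* PROC step 1: list the contents of the data set */
proc print data=growth;
run;

/* PROC step 2: summary statistics for gain within each diet */
proc means data=growth n mean std min max;
    class diet;
    var gain;
run;
```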

Several example SAS programs using each of these PROCs are available. The programs are meant to illustrate the features of data input using the DATA step that will be useful to us in the course as well as the syntax used in SAS PROCs. The programs are discussed below and can be viewed and run on SICL. They all reside in the directory


The example programs are mnemonically named:

analysis of variance of a randomized complete block design using PROC GLM

simple linear regression analysis using PROC REG and PROC PLOT

summary statistics using PROC MEANS

2-sample t-test using PROC TTEST

paired comparison t-test using PROC MEANS
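
The course programs themselves are not reproduced here. As a rough sketch of two of the analyses named above, the code below runs a randomized complete block analysis with PROC GLM and a paired comparison t-test with PROC MEANS. All data values, data set names, and the contrast are invented for illustration.

```sas
/* Randomized complete block design: 3 blocks, 3 treatments */
data rcbd;
    input block trt $ y;
    datalines;
1 A 20
1 B 24
1 C 27
2 A 18
2 B 23
2 C 25
3 A 22
3 B 26
3 C 30
;
run;

proc glm data=rcbd;
    class block trt;
    model y = block trt;            /* F tests for blocks and treatments */
    contrast 'A vs B' trt 1 -1 0;   /* one example contrast */
run;
quit;

/* Paired comparison: analyze the within-pair differences */
data pairs;
    input before after;
    diff = after - before;
    datalines;
10 12
14 15
 9 13
11 11
13 16
;
run;

proc means data=pairs n mean stderr t prt;
    var diff;   /* T and PRT test whether the mean difference is 0 */
run;
```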

The programs are annotated using comment statements (see below) with descriptions of the functions being executed by each line. The first two programs are discussed in detail below.
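
For reference, SAS accepts two comment forms, sketched below on a trivial step that assumes the SASHELP.CLASS sample data set is available.

```sas
* A comment statement starts with an asterisk and ends at the next semicolon;

/* A block comment can span several lines
   and may contain semicolons; like this one. */

proc print data=sashelp.class(obs=3);   /* a comment may follow a statement */
run;
```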

To view each program, get into SAS and use the SAS command

include '/pub/st512/md/nameoffile'

on the command line of the PROGRAM window. Here, nameoffile refers to one of the 5 names above. The program will appear in the window. You may scroll through the program on the screen or print a hard copy to look at according to the instructions in the manual. You may then submit (execute) the program and view the results in the OUTPUT window. A hard copy of the results may also be printed out or written to a file as described in the manual. You may also get a hard copy of the program.

If you wish to practice editing, you may copy the program file to a file in your own directory by using the command

file 'nameyouchoose'

on the command line of the PROGRAM window. For example, you may wish to save '' to your own directory under the name ''. To do so, simply type

file ''

No directory information is needed if the file is to be in your own directory; it will automatically be placed there. When the include command is used later to retrieve it, no directory information is required.

This will save a copy of the file to the name you choose. Then, clear the window and include the renamed file in the PROGRAM window so that you will be working with it rather than the original. Now, you may practice editing and may augment the program if you wish.



  • Guest 10 years ago

    How to determine the executing program name and path programmatically:

    I was often asked to put the name and path of the executing program in the FOOTNOTE of the generated tables or listings. I have always created a macro variable using the %LET statement and then referenced that macro variable in the FOOTNOTE statement to get the name of the program. Even though it is simple, it may not be suitable when we write applications that need to self-document...
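
    A minimal sketch of the manual %LET approach described above. The program path is a hypothetical placeholder, and SASHELP.CLASS is assumed to be available.

    ```sas
    /* Store the program name and path by hand (hypothetical path) */
    %let pgmname = /myproject/programs/t_demog.sas;

    /* Double quotes are required so that &pgmname resolves */
    footnote1 "Program: &pgmname";

    proc print data=sashelp.class(obs=5);
    run;

    footnote1;   /* clear the footnote afterwards */
    ```

    For a self-documenting alternative, batch runs can usually read the program name with %sysfunc(getoption(sysin)), and the Windows enhanced editor usually exposes it through %sysget(SAS_EXECFILEPATH). Which one works depends on how SAS is started.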

    How to check if a variable exists or not:

    In SAS, we sometimes need to check whether a variable exists in a data set. We usually run PROC CONTENTS and visually check whether the variable is there. If we want to check it programmatically, then use the following code....
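
    The linked code is not shown here. One common way to do the check, sketched under the assumption that SASHELP.CLASS is available, uses the OPEN, VARNUM, and CLOSE functions. The macro name VAREXIST is hypothetical.

    ```sas
    %macro varexist(ds, var);
        %local dsid vnum rc;
        %let dsid = %sysfunc(open(&ds));
        %if &dsid %then %do;
            /* VARNUM returns the variable's position, or 0 if it is absent */
            %let vnum = %sysfunc(varnum(&dsid, &var));
            %let rc   = %sysfunc(close(&dsid));
            %if &vnum > 0 %then %put NOTE: Variable &var exists in &ds..;
            %else %put NOTE: Variable &var does not exist in &ds..;
        %end;
        %else %put ERROR: Could not open &ds..;
    %mend varexist;

    %varexist(sashelp.class, age)
    %varexist(sashelp.class, salary)
    ```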

    How to check if a file exists or not in SAS:

    Let me try explaining how to check whether a file exists in a library or directory using SAS. Here I am writing a macro to check whether the file exists in the directory or not. Here is the way to check it…
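
    The author's macro is not shown here. A minimal sketch of such a check can be built on the FILEEXIST function (external files) and the EXIST function (SAS data sets). The macro name CHKFILE and the path are hypothetical.

    ```sas
    %macro chkfile(path);
        /* FILEEXIST returns 1 when the external file exists, 0 otherwise */
        %if %sysfunc(fileexist(&path)) %then %put NOTE: &path exists.;
        %else %put NOTE: &path does not exist.;
    %mend chkfile;

    %chkfile(/hypothetical/dir/mydata.csv)

    /* For a SAS data set in a library, EXIST does the same job */
    %put Data set exists flag: %sysfunc(exist(sashelp.class));
    ```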

    SAS programming errors we make... can be deadly sometimes

    The errors I will list here are very few in number. They are errors that you will likely make at some time if you do not remain alert. These errors could have serious consequences, which is why I have described them as "deadly errors".

    Proc Sort NODUP vs NODUPKEY

    So many times, people from my Orkut community have asked me what the real difference is between PROC SORT NODUP and PROC SORT NODUPKEY. I always wanted to answer the question in a better way, and here is the answer. A common interview question for SAS jobs is "What is the difference between proc sort nodup and proc sort nodupkey?". The answer the interviewer is expecting is usually "proc sort nodup gets rid of duplicate records with the same sort key but proc sort nodupkey gets rid of other records with the same sort key". However, this is not correct.
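
    A small sketch, with made-up data, that lets you compare the two options side by side. It is an illustration of documented PROC SORT behavior, not the linked article's code.

    ```sas
    data visits;
        input id visit $ score;
        datalines;
    1 A 10
    1 A 10
    1 B 20
    2 A 30
    2 A 35
    ;
    run;

    /* NODUPKEY keeps only the first observation for each BY value (id) */
    proc sort data=visits out=by_key nodupkey;
        by id;
    run;

    /* NODUP (alias NODUPRECS) drops an observation only when every variable
       matches the previous observation written to the output data set */
    proc sort data=visits out=by_rec nodup;
        by id;
    run;

    proc print data=by_key; run;
    proc print data=by_rec; run;
    ```

    Because NODUP only compares adjacent observations, identical records that are not next to each other after sorting can survive. Sorting by all variables avoids that.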

    PROC TRANSPOSE: How to Convert Variables(columns) into Observations(ROWS) and Observations(ROWS) into Variables(Columns)

    During my early days as a SAS programmer, I got confused very easily by PROC TRANSPOSE. I mean, I got confused about which variables I needed to include in the BY statement, the ID statement, and the VAR statement.
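
    A minimal sketch of the three statements, using an invented data set. BY names the variable that identifies an output row, ID supplies the new variable names, and VAR holds the values being moved.

    ```sas
    data long;
        input id test $ score;
        datalines;
    1 math 90
    1 read 85
    2 math 70
    2 read 95
    ;
    run;

    proc sort data=long;
        by id;
    run;

    proc transpose data=long out=wide(drop=_name_);
        by id;       /* one output observation per id */
        id test;     /* values of TEST become variable names */
        var score;   /* values of SCORE fill the new variables */
    run;

    proc print data=wide;
    run;
    ```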

    Displaying graphs (bar charts) using PROC GCHART in SAS

    Just a day ago, I received a question on graphs in my Orkut community, which prompted me to create the following examples. Below are 5 different types of graphs (including 3D graphs) produced using PROC GCHART.
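
    The five graphs from the linked post are not reproduced here. As a sketch of the general syntax, assuming SAS/GRAPH is licensed and SASHELP.CLASS is available:

    ```sas
    proc gchart data=sashelp.class;
        /* vertical bars: mean height for each sex */
        vbar sex / sumvar=height type=mean;
        /* 3D vertical bars: number of students at each distinct age */
        vbar3d age / discrete;
    run;
    quit;
    ```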

    Change all missing values of all variables into zeros/putting zeros in place of missing values for variables

    I always wondered how to convert missing values for all the variables into zeros, and in this example I have used an array to do it. The variable list includes ID and score1 to score6. Using a simple array method, we can change all the missing values for the variables score1 to score6 to 0.
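
    A sketch of the array method described above, with invented values for ID and score1 to score6.

    ```sas
    data scores;
        input id score1-score6;
        datalines;
    1 5 . 7 . 9 10
    2 . 4 . 6 . 8
    ;
    run;

    data scores_nomiss;
        set scores;
        array sc{*} score1-score6;
        do i = 1 to dim(sc);
            if missing(sc{i}) then sc{i} = 0;   /* replace missing with 0 */
        end;
        drop i;
    run;

    proc print data=scores_nomiss;
    run;
    ```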

    How to Save LOG file in the required location:

    Here is simple code that lets us save the log file in the required location. Use the PRINTTO procedure to save or print the log file. filename dsn 'C:\Documents and Settings\zzzzzzzzzzzz\Desktop\LOGfile.lst'
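
    A minimal sketch of routing the log with PROC PRINTTO. To stay runnable anywhere, it writes to the WORK library directory rather than the path quoted above; the fileref LOGOUT and the file name are hypothetical.

    ```sas
    /* Point a fileref at a log file inside the WORK directory */
    filename logout "%sysfunc(pathname(work))/demo.log";

    /* Send the log to that file; NEW overwrites any earlier contents */
    proc printto log=logout new;
    run;

    proc means data=sashelp.class;
    run;

    /* Restore the default log destination */
    proc printto;
    run;
    ```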

    Calculating group totals and the counts within each group

    Sample 25217: This example uses the SUM() function to sum the AMOUNT column, creating a new column named GRPTOTAL with a COMMA10. format. The COUNT() function counts the number of occurrences of STATE within each group. The GROUP BY clause collapses multiple rows for each group into one row per group, containing STATE, GRPTOTAL and the COUNT.
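
    The sample's own data are not shown here. The sketch below applies the query it describes to an invented SALES table.

    ```sas
    data sales;
        input state $ amount;
        datalines;
    NC 1200
    NC 800
    VA 500
    VA 2500
    VA 1000
    ;
    run;

    proc sql;
        select state,
               sum(amount)  as grptotal format=comma10.,
               count(state) as count
        from sales
        group by state;   /* one output row per state */
    quit;
    ```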

    How to customize page numbers in RTF output

    Usage Note 24439: In SAS 9.1, are there easier ways to customize page numbers in RTF output? Yes, beginning with SAS 9.1, page numbers can be customized in the RTF destination by using an escape character and the {thispage} function, {lastpage} function, {pageof} function, or all three:
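
    A sketch of how these functions are typically used. The escape character, the title text, and the output file name are choices made for this example, not taken from the usage note.

    ```sas
    ods escapechar = '^';
    options nonumber;   /* suppress the default page number */

    ods rtf file="%sysfunc(pathname(work))/pages.rtf";

    /* Right-justified "Page x of y" built from two functions */
    title j=r "Page ^{thispage} of ^{lastpage}";
    /* Alternative with the combined function:
       title j=r "^{pageof}"; */

    proc print data=sashelp.class;
    run;

    ods rtf close;
    title;
    ```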

    How to calculate the number of years and number of days between 2 dates

    Exploring the YRDIF and DATDIF functions in SAS as well as the INTCK function: there are several ways to calculate the number of years between two dates, and of all the methods, the YRDIF function gives the most accurate value.
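
    A small sketch that calls the three functions on the same pair of dates, which are arbitrary. The results are written to the log.

    ```sas
    data _null_;
        date1 = '15MAR2000'd;
        date2 = '20JUL2010'd;

        years_exact = yrdif(date1, date2, 'ACT/ACT');   /* fractional years */
        days        = datdif(date1, date2, 'ACT/ACT');  /* actual days */
        year_bounds = intck('year', date1, date2);      /* 1 January boundaries crossed */

        put years_exact= days= year_bounds=;
    run;
    ```

    Note that INTCK counts interval boundaries rather than elapsed time, so 31DEC2009 to 01JAN2010 counts as 1 year with the 'year' interval.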

    How to create a comma-separated file (.csv) from a SAS dataset?

    In SAS programming, we often need to output a dataset in different formats such as Excel and CSV. Here are five different ways to export a SAS dataset to a .csv file.
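
    The five methods from the linked post are not listed here. Two commonly used ones are sketched below; the output file names are hypothetical and are placed in the WORK directory so the code runs without setup.

    ```sas
    /* Method 1: PROC EXPORT */
    proc export data=sashelp.class
        outfile="%sysfunc(pathname(work))/class_export.csv"
        dbms=csv
        replace;
    run;

    /* Method 2: the ODS CSV destination around PROC PRINT */
    ods csv file="%sysfunc(pathname(work))/class_ods.csv";

    proc print data=sashelp.class noobs;
    run;

    ods csv close;
    ```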

    How to Import Excel files into SAS

    Reading from Excel spreadsheets: Microsoft Excel spreadsheets can be read from SAS in several ways. Two of these will be demonstrated here. First, PROC IMPORT allows direct access to Excel files through SAS/ACCESS to PC File Formats, or access to comma-separated (CSV) files through Base SAS. The second method uses the Excel LIBNAME engine.
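
    A rough sketch of the PROC IMPORT route. To be self-contained it first writes a workbook and then reads it back, and it assumes SAS/ACCESS Interface to PC Files is licensed and that DBMS=XLSX is supported by the installed release. File and data set names are hypothetical.

    ```sas
    %let xlfile = %sysfunc(pathname(work))/class.xlsx;

    /* Create a workbook to read (stands in for an existing spreadsheet) */
    proc export data=sashelp.class
        outfile="&xlfile"
        dbms=xlsx
        replace;
        sheet="class";
    run;

    /* Read the sheet back into a SAS data set */
    proc import datafile="&xlfile"
        out=work.class_in
        dbms=xlsx
        replace;
        sheet="class";
        getnames=yes;   /* first row holds the variable names */
    run;

    proc print data=work.class_in(obs=5);
    run;
    ```

    The LIBNAME approach mentioned above assigns a libref to the workbook so that each sheet can be referenced like a data set; its availability also depends on the SAS/ACCESS products installed.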

    How to store a number of more than 8 digits in a numeric variable

    Q&A: numeric variable length more than 8? We all know that the default length of numeric variables in SAS is 8. Suppose I want to store a number such as 12345678910 (which has 11 digits) in the numeric variable total. What should I do?
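
    A short sketch related to the question: the length of 8 refers to bytes of storage, not digits, so an 11-digit integer fits in a default numeric variable and only the display format needs attention. This illustrates standard SAS behavior and is not the linked post's answer.

    ```sas
    data _null_;
        length total 8;          /* 8 bytes of storage, not 8 digits */
        total = 12345678910;

        put total=;              /* written with the default BEST12. format */
        put total= 11.;          /* explicit width of 11 */
        put total= comma14.;     /* with thousands separators */
    run;
    ```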

    Options VALIDVARNAME=UPCASE

    VALIDVARNAME= V7 | UPCASE | ANY: the VALIDVARNAME= option is generally used in SAS whenever we want to control the SAS variable names in a dataset.
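
    A minimal sketch of the three settings. The data set and variable names are invented.

    ```sas
    /* UPCASE: variable names are stored in uppercase */
    options validvarname=upcase;
    data up;
        MixedCase = 1;           /* stored as MIXEDCASE */
    run;

    /* ANY: names may contain blanks and special characters,
       written as name literals */
    options validvarname=any;
    data anyname;
        'Total Score'n = 95;
    run;

    /* V7: the usual default naming rules */
    options validvarname=v7;

    proc contents data=up;      run;
    proc contents data=anyname; run;
    ```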

    How to merge data sets with a common variable?

    Here is a simple way of merging data sets by a common variable when the data sets share the same prefix name, for example col1-col10, dsn1-dsn7, or data1 to data10 with the common variable ID. Suppose we have 10 data sets, all of them having the same prefix data;

    Merging the data sets with a common variable if the datasets has the same prefix name?

    For example: col1-col10, dsn1-dsn7, or data1 to data6 with a common variable. Here is the example: I have 7 datasets that I need to merge, each of them has the common variable (usubjid) to merge by, and all the datasets have the same prefix dsn (dsn1 to dsn7).
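
    The linked solution is not shown here. One way to do this, assuming SAS 9.2 or later where data set lists are accepted in the MERGE statement, is sketched below with three small invented data sets instead of seven.

    ```sas
    data dsn1; usubjid = 101; age = 34;    output; usubjid = 102; age = 51;    output; run;
    data dsn2; usubjid = 101; weight = 70; output; usubjid = 102; weight = 82; output; run;
    data dsn3; usubjid = 101; height = 168; output; usubjid = 102; height = 175; output; run;

    /* Each input must already be sorted by usubjid (true for the data above) */
    data merged;
        merge dsn1-dsn3;   /* numbered range list; dsn: would select by prefix */
        by usubjid;
    run;

    proc print data=merged;
    run;
    ```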

    When to use &, &&, and &&&: how do we distinguish multiple ampersands, and what is the difference between a single dot and double dots in macros?

    Here are two important questions that always come to mind (& vs. && vs. &&&, and single dot vs. double dots) when we deal with macros for the first time, and here are the answers. I found a very good explanation of these topics in one of the SAS forums, where Ian Whitlock explained it very clearly.
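
    A compact sketch of the resolution rules, with invented macro variable names. The macro processor resolves && to & and rescans until no references remain, and a dot directly after a macro variable name ends the name.

    ```sas
    %let n    = 1;
    %let var1 = age;
    %let name = var1;
    %let lib  = sashelp;

    /* Single ampersand: direct reference, writes 1 */
    %put &n;

    /* Double ampersand: &&var&n becomes &var1 on the first pass, then age */
    %put &&var&n;

    /* Triple ampersand: &&&name becomes &var1 on the first pass, then age */
    %put &&&name;

    /* Single dot: the dot only ends the name, writes sashelpclass */
    %put &lib.class;

    /* Double dot: first dot ends the name, second is kept, writes sashelp.class */
    %put &lib..class;
    ```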

    How can I count number of observations per subject in a data set?

    We always have this question in mind while we do SAS programming, and here is the simple answer: we just need to use the SUM statement together with FIRST.variable (available when a BY statement follows the SET statement) and the RETAIN statement to calculate the count of observations per subject.
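
    A sketch of that technique on invented data. Here the sum statement retains the counter automatically, so no separate RETAIN statement is needed.

    ```sas
    data visits;
        input usubjid visit;
        datalines;
    101 1
    101 2
    101 3
    102 1
    103 1
    103 2
    ;
    run;

    proc sort data=visits;
        by usubjid;
    run;

    data counts;
        set visits;
        by usubjid;
        if first.usubjid then nobs = 0;   /* reset at the start of each subject */
        nobs + 1;                         /* sum statement: nobs is retained */
        if last.usubjid;                  /* keep one row per subject */
        keep usubjid nobs;
    run;

    proc print data=counts;
    run;
    ```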

    How to remove duplicate observations from a dataset using the DATA step, PROC SQL, or PROC SORT?

    Before using a particular step to remove duplicate observations, we should understand whether the observations are duplicates with respect to key variables only (such as usubjid, treatment, or patientno) or exact duplicates (2 or more observations that match on all the variables in the dataset). If the observations are exact duplicates with respect to all the variables in the dataset, we can remove the exact duplicates by:
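
    The linked list of methods is not shown here. The sketch below illustrates three common options on invented data: two that remove exact duplicates and one that removes duplicates of the key variables only.

    ```sas
    data have;
        input usubjid trt $ val;
        datalines;
    101 A 5
    101 A 5
    101 A 7
    102 B 3
    102 B 3
    ;
    run;

    /* Exact duplicates: sort by every variable so duplicates are adjacent */
    proc sort data=have out=nodup_sort noduprecs;
        by _all_;
    run;

    /* Exact duplicates: SELECT DISTINCT in PROC SQL */
    proc sql;
        create table nodup_sql as
        select distinct *
        from have;
    quit;

    /* Key duplicates: one observation per usubjid and trt */
    proc sort data=have out=nodup_key nodupkey;
        by usubjid trt;
    run;
    ```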

    How to scan more than 20 records to determine variable attributes in EFI:

    In Versions 7 and 8 of the SAS System, by default the Import Wizard, PROC IMPORT and the External File Interface (EFI) scan 20 records to determine variable attributes when reading delimited text files. Changing the default setting can only be done for EFI in Version 7, Release 8 and Release 8.1. Beginning in Release 8.2, changing the default setting is applicable to the Import Wizard, PROC IMPORT and EFI.
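
    In PROC IMPORT, the number of scanned records is controlled by the GUESSINGROWS statement for delimited files. The sketch below first writes a small CSV file whose long values only appear after row 20, then imports it with a larger scan. The file and data set names are hypothetical.

    ```sas
    %let csvfile = %sysfunc(pathname(work))/demo.csv;

    /* Write a test file: short codes first, a longer code from row 25 on */
    data _null_;
        file "&csvfile";
        put "id,code";
        do id = 1 to 30;
            if id < 25 then put id +(-1) ",A";
            else put id +(-1) ",LONGER_CODE";
        end;
    run;

    proc import datafile="&csvfile"
        out=work.demo
        dbms=csv
        replace;
        getnames=yes;
        guessingrows=100;   /* scan up to 100 rows before fixing lengths */
    run;

    proc contents data=work.demo;
    run;
    ```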