SAS Tutorial: A Beginner’s Guide

OVERVIEW

SAS is one of the leading analytics software suites, used across business domains such as insurance, healthcare, pharmaceuticals and telecom. Professionals and students with little or no programming experience can pick it up, because the language is easy to learn. Over the years SAS has also added solutions for big data analytics, predictive analytics, fraud management and health science analytics, which makes SAS skills an asset in many job markets.

This article walks through the flow of SAS programming and its key concepts, so that new users, beginners and working professionals can quickly get to grips with the basic steps of Base SAS programming.

INTRODUCTION

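SAS (originally short for Statistical Analysis System) is a software suite whose development began in the 1960s at North Carolina State University; it has been developed by SAS Institute since the company was founded in 1976. It is one of the most popular software packages in the field of analytics and is used for the following purposes: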

  • Data management, Data mining 
  • Statistical analysis, business modelling.
  • Report writing and Graphics.
  • Data extraction
  • Data transformation
  • Data update and modification
  • Business Intelligence
  • Operations Research and Project Management 

The SAS software suite has more than 200 components. Some important, in-demand components and their uses are listed below:

| SAS component | Usage |
| --- | --- |
| Base SAS | Data management, data analysis and reporting |
| SAS/STAT | Statistical analysis and modelling |
| SAS/GRAPH | Graphs, charts and plots |
| SAS/ETS | Econometrics and time series analysis |
| SAS/PH | Clinical trial analysis |
| Enterprise Miner | Data mining |
| Enterprise Guide | GUI-based code editor and project manager |

SAS INSTALLATION

SAS Software is available for free both offline and online to learn Base SAS programming.

You can access SAS software in the following two ways.


SAS University Edition

  • Installation required.
  • Runs on your local machine (laptop or desktop).
  • No internet connection required.
  • Link: https://www.sas.com/en_in/software/university-edition/download-software.html
  • Please read the requirement details before you start downloading.

SAS OnDemand for Academics

  • No installation required; set up your account by registering.
  • Runs in the cloud, so you can access it from anywhere.
  • Internet connection required.
  • Available to everyone, not just college students.
  • Link: https://www.sas.com/en_in/software/on-demand-for-academics.html

SAS runs on several operating systems, including Windows and Linux. A SAS programmer drives it by applying sequences of operations to SAS datasets to produce reports for data analysis. Most organizations and institutes run SAS on Windows. Some prefer Linux, where SAS is typically used without the graphical user interface, so you have to write code for every query. SAS on Windows offers many utilities that help programmers write code and save time.

SAS INTERFACE AND LIBRARY

In this article, I use the SAS windowing environment to explain SAS program structure and syntax. If you install SAS University Edition or use the cloud-based SAS OnDemand for Academics, you will get the SAS Studio interface instead. The two interfaces differ slightly, but the syntax and functionality are the same, so you can run all the programs below in SAS Studio unchanged.

After installing SAS, you get a set of working windows in which you write programs and view their execution and results.

Let’s look at the working windows that SAS provides.

3.1 The SAS interface has five important windows:

  1. Code/Editor window

The Editor window is where we write all our code.

  2. Log window

The Log window shows how the SAS program executed, including any errors and warnings. Check the log every time you run a program so that you understand exactly how it executed.

  3. Output window

The Output window is where we see the output of our program.

  4. Results window

The Results window is an index to all the outputs. Every program run in the current SAS session is listed there, and you can open an output by clicking its entry. The list covers one session only: if you close SAS and open it again, the Results window will be empty.

  5. Explorer window

All the libraries are listed here. You can also browse SAS-supported files on your system from this window.

3.2 SAS Library:

A SAS library is a storage location for SAS files. Libraries are of two types: 1- the temporary library (the Work library, which is the default) and 2- permanent libraries.

Let’s look at SAS libraries in more detail:

Temporary or Work Library

  • Work is the default library in SAS. Every dataset is stored in the Work library if we do not specify a library (libref) name.
  • Datasets stored in the Work library are temporary, i.e. they are deleted when the SAS session ends.
  • You can view the Work library in the Explorer window.
  • A dataset in the Work library can be referenced by a one-level or a two-level name. Example: ABCD [one-level name] or Work.ABCD [two-level name]. Here Work is the library name and ABCD is the dataset name. If we leave out Work, SAS automatically stores the ABCD dataset in the Work library.
  • A period (.) must separate the library name and the dataset name.

Permanent Library

  • A permanent library is assigned with the keyword LIBNAME.
  • The user has to create the library, either with SAS utilities or by writing code in the Editor window.
  • After you end your SAS session, the contents of a permanent library still exist in the physical location.
  • A dataset in a permanent library always has a two-level name. Example: NewLib.ABCD [two-level name]. Here NewLib is the permanent library (libref) in which the ABCD dataset is stored.
  • A period (.) must separate the library name and the dataset name.

SAS Library naming rules:

A library is assigned with a statement of the form LIBNAME libref 'SAS-library'; where

  • LIBNAME = the keyword for assigning a library.
  • libref = a valid SAS name that begins with a letter or an underscore and is no more than 8 characters long.
  • SAS-library = the path and name of the directory, enclosed in quotes.
  • End the statement with a semicolon (;).
  • RUN is optional: LIBNAME is a global statement, so it executes without RUN.
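As a minimal sketch of these rules, the statements below assign a permanent library and then write one permanent and one temporary dataset. The folder path and the variable x are placeholders chosen for illustration; the libref NewLib and dataset name ABCD follow the names used in the text.

```sas
/* Assign the libref NEWLIB to an existing folder.              */
/* The path is a placeholder: point it at a folder you can use. */
libname newlib 'E:\PRACTICAL_EXAMDATA\LIB';

/* Two-level name: stored permanently in the NEWLIB folder. */
data newlib.abcd;
  x = 1;
run;

/* One-level name: stored in WORK and deleted at end of session. */
data abcd;
  x = 1;
run;
```

After this runs, both NEWLIB and WORK should appear in the Explorer window, each containing a dataset named ABCD.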

SAS TERMINOLOGY AND DATA STRUCTURE 

  • DATASET: A dataset (or table) is a collection of observations arranged in rows and columns.
  • OBSERVATION: An observation (or row) is the set of data values associated with one particular record.
  • VARIABLE: A variable (or column) is the set of data values that describes a given attribute. SAS has two main types of variable: numeric and character.
  • Use the $ sign to identify a character variable.
  • DATA VALUE: The basic unit of information.
  • A missing numeric value is shown as a period (.).
  • A missing character value is shown as a blank space ( ).

Sample Dataset
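The terms above can be tried out on a small dataset. The sketch below is only an illustration: the dataset name, the variable names and every value are hypothetical and are not taken from the article's sample.

```sas
/* Hypothetical sample: three observations, three variables.     */
/* The $ after Name and Country marks them as character.         */
data work.students;
  input Name $ Age Country $;
  datalines;
Asha 21 India
Ravi . India
Mele 23 .
;
run;

/* In list input a lone period stands for a missing value:       */
/* Ravi's Age prints as a period, Mele's Country as a blank.     */
proc print data=work.students;
run;
```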

SAS NAMING RULES AND SYNTAX

Like other programming languages, SAS has its own syntax rules for writing programs. The three components of any SAS program (variables, datasets and statements) all follow these rules.

Let’s go through the naming rules for SAS variables and datasets and the basics of the SAS language before we write our first SAS code.

SAS Variables Naming: 

SAS variables represent the columns of a SAS dataset. Follow these rules when naming variables:

  • Must begin with a letter (A-Z, a-z) or an underscore (_).
  • Characters after the first may be letters (A-Z, a-z), digits (0-9) or underscores (_).
  • Can be 1 to 32 characters long.
  • Examples: NAME, _NAME_, FILE1, _NULL_ etc.
  • Variable names are case-insensitive.

SAS Dataset Naming: 

Keep the following rules in mind when naming a dataset:

  • Must begin with a letter (A-Z, a-z) or an underscore (_).
  • Characters after the first may be letters (A-Z, a-z), digits (0-9) or underscores (_).
  • Can be 1 to 32 characters long.
  • Examples: _File1234, Newlib._File1234, PG_Rated_Movies.
  • In these examples, a name without a library and period (.) refers to a temporary dataset, whereas a name with a library (Newlib) and period (.) refers to a permanent dataset.

SAS Language: 

We must understand the basics of the SAS language before we start writing SAS statements.

  • SAS statements are free-format: a statement can begin and end anywhere, on one line or across several lines.
  • Every SAS statement ends with a semicolon (;).
  • Most SAS statements begin with a keyword such as DATA, PROC, LIBNAME or TITLE.
  • SAS keywords are case-insensitive.
  • SAS statements are of two types: 1- statements that are used inside a DATA step or PROC step, and 2- global statements, which can be used anywhere in a SAS program, such as LIBNAME, TITLE, OPTIONS and FOOTNOTE.
  • End each DATA step or PROC step with a RUN statement.
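A short sketch of these rules in action follows. The dataset name demo, its variables and the title text are made up for illustration.

```sas
/* Global statement: takes effect without a RUN statement. */
title 'Free-format demo';

/* Several statements on one line, keywords in upper case. */
DATA demo; x = 1; y = x + 1; RUN;

/* One statement spread over two lines, keywords in lower case. */
proc print
    data = demo;
run;
```

Both steps run the same way regardless of letter case or line breaks, because SAS only looks for the semicolon that ends each statement.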

SAS File Extensions

SAS programs, data files and program results are saved with different file extensions.

  • *.sas - a SAS code file, which can be edited in the SAS editor.
  • *.log - a SAS log file, which contains information such as errors, warnings and dataset details for a submitted SAS program.
  • *.sas7bdat - a SAS data file, which contains a SAS dataset including its variables, observations and labels.

SAS Comments:

A comment in SAS is a note in which the programmer briefly explains what the code does. Comments in SAS code can be written in the following two formats.

  • *comment here;   A comment of the form *comment; can span multiple lines and be of any length, but it ends at the first semicolon.

Example- *My First Programming;

  • /*comment here*/ – A comment of the form /*comment here*/ is used more often. It can also span multiple lines and be of any length.

Example- /*This is a comment for the SAS program*/

                /*This is my first SAS program */

Comments in SAS

SAS PROGRAMMING FLOW

A SAS program is a sequence of steps, which can be DATA steps, PROC steps or a combination of both, that produces the desired output. A typical SAS program runs through the following stages.

  • The program first reads raw data (.xlsx, .csv, .txt, Access files) or an existing dataset (.sas7bdat).
  • Once the data has been imported, the DATA step creates new SAS datasets in which the data is manipulated, evaluated and computed.
  • After validation and manipulation, the newly created dataset contains only the data values we need.
  • The PROC (procedure) step is then used to generate list and summary reports.

The flowchart below shows the stages of SAS program execution. A typical SAS program goes through all of them: reading the input data file, analyzing the data and producing the required output.

SAS programming Flow 

Importing raw data files, managing data with the DATA step and generating reports with the PROC step are broad topics. In this article I discuss the most useful syntax of the IMPORT procedure, the DATA step and the PROC step through programming examples.

So, let’s get started by writing SAS programs!

READING RAW DATA

Importing raw data with the IMPORT procedure:

The basics of “PROC IMPORT”–

  • The IMPORT procedure reads data from an external data source and writes it to a SAS data set.
  • It can import delimited files (blank, comma or tab) and Excel files.
  • It creates the specified output SAS data set and writes information about the import to the SAS log.

SYNTAX-

PROC IMPORT DATAFILE='filename' OUT=SAS-dataset DBMS=identifier REPLACE;
GETNAMES=YES | NO;
RUN;

  • 'filename' specifies the complete path of the input file.
  • OUT=SAS-dataset names the output data set; specify a one-level or two-level (library and data set) SAS name.
  • DBMS= specifies the type of data to import, for example DBMS=XLSX, DBMS=CSV or DBMS=DLM.
  • The REPLACE option overwrites an existing SAS data set of the same name in the library.
  • GETNAMES=YES | NO is an optional statement that controls whether the first row of the input file is used for the variable names of the output data set.

Example 1: Importing an Excel file

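/* Import an Excel workbook into the permanent library NEWLIB */
proc import datafile='E:\PRACTICAL_EXAMDATA\INPUT\boot.xlsx'
    out=newlib.bootData
    dbms=excel replace;
  getnames=yes;   /* take variable names from the first row */
run;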

Importing Excel file data in SAS dataset

In Example-1, the IMPORT procedure imports the Excel file “boot.xlsx” into the dataset bootData in the permanent library Newlib, using DBMS=EXCEL with the REPLACE option and GETNAMES=YES. After the program runs successfully, details of the import are written to the Log window.

Example 2: Importing a CSV file 

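/* Import a comma-separated file into the permanent library NEWLIB */
proc import datafile='E:\PRACTICAL_EXAMDATA\INPUT\new_hires.csv'
    out=newlib.hireData
    dbms=csv replace;
  getnames=yes;   /* take variable names from the first row */
run;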

Importing CSV file data in SAS dataset

In Example-2, the IMPORT procedure imports the file “new_hires.csv” into the dataset Newlib.hireData in the permanent library Newlib, using DBMS=CSV with the REPLACE option and GETNAMES=YES. After the program runs successfully, details of the import are written to the Log window.

DATA STEP STATEMENT

The basics of the “DATA STEP”–

  • DATA is the keyword that begins a DATA step, which creates a SAS dataset.
  • Reads and modifies data.
  • Defines variables.
  • Reads input files.
  • Creates new variables.
  • Assigns values to variables.

8.1- Creating new Dataset

SYNTAX –

DATA dataset-name;
SET dataset-name;
sas statement (optional)
RUN;

Where

  • The dataset name in the DATA statement is the name of the SAS dataset to be created (output data set).
  • The dataset name in the SET statement is the name of the SAS dataset to be read (input data set / source data set).
  • Optional SAS statements can be added to create, modify or update variables and observations.

Example 3: Creating new data set using DATA step and SET statement 

Data New_Hires;       

Set  Newlib.hiredata; 

run;

In Example-3, the DATA statement names the new data set New_Hires and stores it in the temporary library Work. The SET statement specifies the input data set Newlib.hiredata (two-level name), and the result is stored in New_Hires in the Work library.

8.2 Using Subsetting IF statement

The subsetting IF statement makes the DATA step continue processing only those observations that meet the condition specified in the IF statement. The resulting SAS dataset contains a subset of the original external file or SAS dataset.

SYNTAX-

DATA dataset-name;
SET dataset-name;
IF expression;
RUN;

  • When the IF expression is true, the DATA step continues to process that observation.
  • When the IF expression is false, no further statements are processed for that observation, and control returns to the top of the DATA step.
  • The IF statement works with both character and numeric variables.
  • You can use comparison operators in the IF expression: = (equal to), ^= (not equal to), > (greater than), < (less than), >= (greater than or equal to), <= (less than or equal to).
  • Enclose character values in quotation marks, e.g. IF name='John';
  • Numeric values need no quotation marks, e.g. IF income>20000;
  • Character values must be written exactly as they appear in the data set, because comparisons are case-sensitive.
  • To combine conditions, use the logical operators AND (&) and OR (|).
  • With OR, the combined expression is true if either expression is true.
  • With AND, the combined expression is true only if both expressions are true.

Example 4: IF statement 

/* Keep only the observations where country is India */
data India_Hires;
  set Newlib.hiredata;
  if country = 'India';
run;

In Example-4, the DATA statement names the new data set India_Hires and stores it in the temporary library Work. The IF statement selects only the observations where country is 'India' from the data set NewLib.Hiredata and stores them in India_Hires.

Example 5: Selecting multiple observations using IF with the OR operator in the DATA step.

In Example-5, the DATA statement names the new data set India_Fiji and stores it in the temporary library Work. The IF statement selects only the observations where country is 'India' or country is 'Fiji' from the data set NewLib.Hiredata and stores them in India_Fiji. With the OR operator, an observation is kept when either condition is true, so observations from both countries are selected.
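The program for this example is not reproduced in the text. A minimal sketch consistent with the description might look like this, assuming the Newlib library is assigned and Newlib.hiredata has a character variable named country, as in Example 4.

```sas
/* Keep observations from either India or Fiji. */
data India_Fiji;
  set Newlib.hiredata;
  if country = 'India' or country = 'Fiji';
run;
```

The same condition can also be written with the IN operator: if country in ('India', 'Fiji');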

Example 6: Selecting multiple observations using IF with the AND operator in the DATA step.

In Example-6, the DATA statement names the new data set Fiji and stores it in the temporary library Work. The IF statement selects only the observations where country is 'Fiji' and the company name is 'Magna Ut Corporation' from the data set NewLib.Hiredata and stores them in Fiji. Observations for other companies in 'Fiji' are not written to the output data set. Remember that with the AND operator both conditions must be true for an observation to be kept; if no observation satisfies both, the output data set will contain 0 observations.
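The program for this example is not reproduced in the text either. The sketch below follows the description; the variable name company is an assumption, since the text does not give the name of the column that holds the company name.

```sas
/* Keep observations that satisfy both conditions.            */
/* "company" is a hypothetical variable name: replace it with */
/* the actual company-name column in Newlib.hiredata.         */
data Fiji;
  set Newlib.hiredata;
  if country = 'Fiji' and company = 'Magna Ut Corporation';
run;
```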

8.3 Creating Variables 

Variables are created, assigned, calculated and transformed in a DATA step with an assignment statement. An assignment statement consists of a variable and an expression and does not begin with a keyword.

SYNTAX-

Variable=expression;

Where,

  • Variable is the name of a new or existing variable.
  • Expression is any valid SAS expression.
  • Expressions can use the comparison and logical operators discussed above, and arithmetic operators: * multiplication, / division, + addition, - subtraction, ** exponentiation.
  • If a value used with an arithmetic operator is missing, the result of the expression is missing.
  • You can use parentheses to control the order of operations.

8.4   LENGTH statement 

Before assigning values to variables, let’s see how to specify variable lengths so that values are not truncated.

SYNTAX-

LENGTH variable(s) <$> length;

Where,

  • LENGTH is the keyword.
  • variable(s) is the name of the variable or variables to be assigned a length.
  • $ (dollar sign) is required when the variable is a character variable.
  • length is an integer that specifies the length of the variable in bytes.
  • Place the LENGTH statement before the assignment statement in the DATA step.
  • By default, the length of a numeric variable is 8 bytes, and the length of a character variable is set at its first occurrence in the DATA step.

Example 7: Creating a new Variable 

In Example-7, the DATA statement names the new data set India_visit and stores it in the temporary library Work. The LENGTH statement declares the character variable Country with a length of 6 bytes. The assignment statement then creates the variable Country and assigns it the value 'INDIA', which is five characters long and so fits without truncation. The value appears in the output data set Work.India_visit.
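The program for this example is not reproduced in the text. A minimal sketch that matches the description is shown below; it assumes there is no input data set, so the step writes a single observation.

```sas
data India_visit;
  length Country $ 6;   /* character variable, 6 bytes       */
  Country = 'INDIA';    /* 5 characters, so nothing is cut   */
run;
```

If the LENGTH statement were left out, SAS would set the length of Country from its first use, which here would be the 5 characters of 'INDIA'.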

Example 8: Calculations using an Assignment statement

In Example-8, the DATA statement names the new data set TotalMarks and stores it in the temporary library Work. The LENGTH statement declares the numeric variables Total and Average with a length of 5 (the default length is 8). Next, an assignment statement creates the variable Total as the sum of all scores. A second assignment statement creates the variable Average as the mean score for each student, using arithmetic operators. The totals and averages appear in the output data set Work.totalmarks.
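The program and input data for this example are not reproduced in the text. The sketch below shows one way it could be written. The score variables (Maths, Science, English) and all the marks are hypothetical stand-ins; only the names TotalMarks, Total, Average and the student names come from the article.

```sas
data TotalMarks;
  /* Numeric length 5 as in the description. A shorter numeric   */
  /* length saves space but can lose precision for non-integers. */
  length Total Average 5;
  input Name $ Maths Science English;   /* hypothetical scores */
  Total   = Maths + Science + English;  /* sum of all scores   */
  Average = Total / 3;                  /* mean score          */
  datalines;
KATHY 55 48 62
MICHAEL 40 51 47
JOHN 78 85 90
;
run;
```

Because the arithmetic operators return a missing value when any operand is missing, a student with one missing score would get a missing Total and Average in this sketch.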

PROC STEP STATEMENT

The basics of the “PROC STEP”–

  • PROC is the keyword that begins a PROC step, which calls a SAS procedure, for example to create a list report.
  • Creates summary reports of the analyzed data.
  • Generates output.
  • Produces plots and charts.
  • By default, variables and observations appear in the report in the order in which they occur in the data set.
  • You can control the order of the variables in the report output.

SYNTAX –

PROC PRINT DATA=dataset-name;
RUN;

Where,

  • PROC PRINT is the keyword used to generate a list report.
  • dataset-name is the name of the data set to be printed as a report.
  • The RUN statement ends the PROC step.

Example 9: PROC PRINT step

In Example-9, the PROC PRINT statement calls the PRINT procedure and specifies the dataset Totalmarks to be printed as a report in the SAS Output window. All observations and variables are printed in the order in which they appear in the data set.
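A minimal sketch of this step, assuming the data set Work.TotalMarks described in Example 8 already exists in the current session:

```sas
/* Print every observation and variable of Work.TotalMarks. */
proc print data=TotalMarks;
run;
```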

Example 10: Selecting Variables in PROC PRINT statement

In Example-10, the PROC PRINT step prints only the variables TOTAL and AVERAGE.
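The program for this example is not reproduced in the text. One common way to select variables in PROC PRINT is the VAR statement, sketched here on the assumption that Work.TotalMarks exists with the variables Total and Average.

```sas
/* Print only the listed variables, in the listed order. */
proc print data=TotalMarks;
  var Total Average;
run;
```

The order of the names in the VAR statement also controls the order of the columns in the report.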

9.1 WHERE statement in PROC PRINT

SYNTAX –

PROC PRINT DATA=dataset-name;
WHERE where-expression;
RUN;

Where,

  • The WHERE statement selects the observations that meet the specified conditions.
  • The WHERE statement works with both character and numeric variables.
  • The WHERE statement can be used with comparison operators and logical operators.
  • Enclose character values in quotation marks, e.g. where name='Caston';
  • Numeric values need no quotation marks, e.g. where income>10000;
  • Character values must be written exactly as they appear in the data set.

Example 11: Selecting Observations in PROC PRINT statement with WHERE statement

In Example-11, the PROC PRINT step prints the data set Totalmarks as a report in the SAS Output window, showing only the observations where name is 'KATHY', with all the variables in the data set.
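A sketch of this step, assuming Work.TotalMarks exists and stores student names in upper case in a variable called Name:

```sas
/* Print only the observations for KATHY. */
proc print data=TotalMarks;
  where Name = 'KATHY';
run;
```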

Example 12: Selecting multiple Observations by WHERE statement

In Example-12, the PROC PRINT step prints the data set Totalmarks as a report in the SAS Output window, showing the observations where name is 'KATHY' or 'MICHAEL'. Two observations are printed, because the OR operator selects an observation when either of the specified conditions is true.
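A sketch of the WHERE statement with OR, under the same assumption that Work.TotalMarks exists with a variable called Name:

```sas
/* Print the observations for either student. */
proc print data=TotalMarks;
  where Name = 'KATHY' or Name = 'MICHAEL';
run;
```

An equivalent form uses the IN operator: where Name in ('KATHY', 'MICHAEL');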

Example 13: Selecting Observations by WHERE statement and Comparison Operator

In Example-13, the PROC PRINT step prints the data set Totalmarks as a report in the SAS Output window, showing the observations where Average is less than 60. As a result, two observations are printed.
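A sketch of the WHERE statement with a comparison operator, again assuming Work.TotalMarks exists with the numeric variable Average:

```sas
/* Print the observations whose average is below 60. */
proc print data=TotalMarks;
  where Average < 60;
run;
```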

9.2 – Sorting Data

The basics of “PROC SORT”–

  • PROC SORT is the keyword used to sort data.
  • Sorting rearranges the observations in a data set.
  • Data can be sorted in ascending or descending order.
  • Ascending is the default sort order.
  • You can sort by multiple variables.
  • The SORT procedure treats missing values as the smallest possible values.
  • PROC SORT does not produce printed output; use PROC PRINT to display the sorted data.

SYNTAX –

PROC SORT DATA=dataset-name OUT=dataset-name;
BY <DESCENDING> by-variable(s);
RUN;

Where,

  • PROC SORT is the keyword used to sort the data.
  • DATA= names the data set to be read.
  • OUT= names the output data set that receives the sorted data. If OUT= is omitted, the input data set is replaced by its sorted version.
  • The BY statement is required; it specifies one or more variables whose values are used to sort the data.
  • The DESCENDING option in the BY statement sorts observations in descending order of the variable that follows it.

Example 14: Sorting Data by Ascending order

Newlib.Review2018 dataset used for sorting / Sorted data output in ascending order

In Example 14, the PROC SORT step sorts the permanent SAS data set Newlib.Review2018 by the values of the variable Site (in the default ascending order), and the OUT= option creates the temporary data set AscendingSortData to hold the sorted data. The PROC PRINT step prints the AscendingSortData dataset in the Output window.
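The program for this example is not reproduced in the text. A sketch that follows the description, assuming the Newlib library is assigned and contains Review2018 with a variable named Site:

```sas
/* Sort by Site in the default ascending order;       */
/* OUT= leaves the original data set unchanged.       */
proc sort data=Newlib.Review2018 out=AscendingSortData;
  by Site;
run;

proc print data=AscendingSortData;
run;
```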

Example 15: Sorting Data by Descending order

Newlib.Review2018 dataset used for sorting / Sorted data output in descending order

In Example 15, the PROC SORT step sorts the permanent SAS data set Newlib.Review2018 by the values of the variable Site in descending order, and the OUT= option creates the temporary data set DescendingSortData to hold the sorted data. The PROC PRINT step prints the DescendingSortData dataset in the Output window.
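The program for this example is not reproduced in the text. A sketch that follows the description, under the same assumption that Newlib.Review2018 exists with a variable named Site:

```sas
/* DESCENDING applies to the variable that follows it. */
proc sort data=Newlib.Review2018 out=DescendingSortData;
  by descending Site;
run;

proc print data=DescendingSortData;
run;
```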

9.3- Grouping Observations in SAS

SYNTAX –

PROC PRINT DATA=dataset-name;
BY by-variable(s);
RUN;

Where,

  • PROC PRINT is the keyword used to print the data.
  • DATA= names the data set to be read.
  • The BY variable is the variable that the procedure uses to form BY groups.
  • The data set must already be sorted by the BY variable before the observations are grouped.

Example 16: Grouping observation using BY variable

In Example 16, two steps are run: 1- PROC SORT and 2- PROC PRINT. The PROC SORT step sorts the permanent SAS data set Newlib.Review2018 by the values of the variable Site, and the OUT= option creates the temporary data set SortData to hold the sorted data.

The PROC PRINT step then prints SortData in groups formed by the variable Site and generates the report in the Output window.
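The two steps described above could be written as follows. This is a sketch, assuming Newlib.Review2018 exists with a variable named Site.

```sas
/* Step 1: sort, because BY-group processing needs sorted data. */
proc sort data=Newlib.Review2018 out=SortData;
  by Site;
run;

/* Step 2: print one block of observations per value of Site. */
proc print data=SortData;
  by Site;
run;
```

If the PROC SORT step is skipped and the data is not already in Site order, PROC PRINT stops with an error saying that the BY variables are not properly sorted.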

SAS STATISTICS 

Statistical procedures in SAS perform descriptive and inferential statistics, analyzing data through measures and methods such as the mean, standard deviation, frequency distributions, the amount of missing data, skewness, variability, t-tests, correlation and linear regression.

Let’s explore some important statistical procedures that are useful for data analysis.

10.1 PROC MEANS 

The basics of “PROC MEANS”–

  • PROC MEANS is the keyword used to compute descriptive statistics for numeric variables across all observations and within groups of observations.
  • It can calculate the mean, median, mode, standard deviation, minimum, maximum, number of observations, number of missing values, variance, kurtosis, skewness and confidence limits.
  • If you do not specify any statistics keywords, PROC MEANS prints the default statistics for all numeric variables: N (number of non-missing values), Mean (average), Std Dev (standard deviation), Minimum and Maximum.
  • Statistics keywords you can request with the MEANS procedure include:

  • N – number of non-missing values
  • MEAN – mean
  • MEDIAN – median
  • STDDEV – standard deviation
  • MIN – minimum
  • MAX – maximum
  • SUM – sum
  • RANGE – range
  • NMISS – number of missing values
  • MODE – mode

SYNTAX –

PROC MEANS DATA=dataset-name <statistics-keyword options>;
CLASS variable(s);
VAR variable(s);
RUN;

Where,

  • PROC MEANS is the keyword used to calculate descriptive statistics.
  • DATA= names the data set to be read.
  • The CLASS statement names the variables used to group the observations (optional).
  • The VAR statement names the numeric variables to analyze (optional).

Example 17:  SAS Means procedure

  • In Example 17, PROC MEANS computes summary statistics for all numeric variables in the dataset Newlib.health.
  • The output shows the default statistics of PROC MEANS for each numeric variable: N (number of non-missing values), Mean, Std Dev (standard deviation), Minimum and Maximum.
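The program for this example is not reproduced in the text. A minimal sketch, assuming the Newlib library is assigned and contains the health data set:

```sas
/* No keywords and no VAR statement: default statistics */
/* for every numeric variable in the data set.          */
proc means data=Newlib.health;
run;
```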

Example 18:  SAS Means procedure with statistical keyword option.

Summary statistics 

  • In Example 18, PROC MEANS computes the Mean, Median, Standard deviation, Minimum, Maximum and NMiss values of the specified numeric variables in the SAS dataset Newlib.health.
  • Specifying statistics keywords replaces the default output of PROC MEANS (N, mean, std, min and max) with the requested statistics: Mean, Median, Standard deviation, Minimum, Maximum and NMiss.
  • The MAXDEC=2 option limits the output to 2 digits after the decimal point.
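The program for this example is not reproduced in the text. The sketch below requests the statistics listed above. The variables in the VAR statement are an assumption: the text does not say which variables were analyzed here, so Age and Income are borrowed from Example 19.

```sas
/* Requested statistics replace the defaults;          */
/* MAXDEC=2 shows at most two decimal places.          */
proc means data=Newlib.health mean median stddev min max nmiss maxdec=2;
  var age income;   /* assumed variables of interest */
run;
```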

Example 19:  SAS Means procedure by Class

  • In Example 19, PROC MEANS computes only the Mean of the specified variables Age and Income, grouped by the variable Sex.
  • The Results window shows the PROC MEANS output, in which the variable Sex has the categories 1 and 2, with 4 observations in each.
  • The mean of Age and Income is calculated for each Sex group, and the means are displayed with at most 2 digits after the decimal point.
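The program for this example is not reproduced in the text. A sketch that follows the description, assuming Newlib.health contains the variables Sex, Age and Income:

```sas
/* One row of means per value of Sex. */
proc means data=Newlib.health mean maxdec=2;
  class sex;
  var age income;
run;
```

Unlike a BY statement, the CLASS statement does not require the data set to be sorted by the grouping variable first.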

10.2- SAS Frequency Procedure

A frequency distribution is a table that shows the frequency (count) of the data points in a data set. SAS provides the procedure PROC FREQ to calculate frequency distributions. There are two types: 1- one-way frequency distributions and 2- two-way frequency distributions, also called cross-tabulations.

The basics of “PROC FREQ”–

  • PROC FREQ is the keyword used to compute the frequency, percentage, cumulative frequency and cumulative percentage of categorical variables.
  • To specify the variables to be processed by the FREQ procedure, include a TABLE statement.
  • To suppress the display of cumulative frequencies and cumulative percentages, add the NOCUM option to the TABLE statement.

SYNTAX –

PROC FREQ DATA=dataset-name;
TABLE variable-combination / options;
RUN;

Where,

  • PROC FREQ is the keyword used to calculate the frequency count and percentage of the data points.
  • The TABLE statement names the variables to tabulate; for a two-way table, join two variables with an asterisk (row-variable*column-variable).
  • Options that can be used in the TABLE statement include:

  • CHISQ – requests chi-square tests
  • TREND – requests the Cochran-Armitage test for trend
  • EXACT – requests Fisher’s exact test
  • PLCORR – requests the polychoric correlation coefficient
  • NOCUM – suppresses the cumulative frequencies and cumulative percentages

Example 20: One-way frequency distribution 

Frequency count of the dataset 

In Example 20, the PROC FREQ step produces a one-way frequency table for the data set Newlib.Heart, showing the number of observations, percentage, cumulative frequency and cumulative percentage for each value of the categorical variable Survive. You can see the count and percentage of patients who died and who survived in the heart data set.
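The program for this example is not reproduced in the text. A minimal sketch, assuming the Newlib library is assigned and Newlib.Heart has a variable named Survive:

```sas
/* One-way table: one row per value of Survive. */
proc freq data=Newlib.Heart;
  table Survive;
run;
```

Adding the NOCUM option (table Survive / nocum;) would drop the two cumulative columns from this table.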

Example 21: Two-way frequency distribution 

Two-way frequency output

In Example 21, the PROC FREQ step produces a two-way frequency distribution (cross-tabulation) for the data set Newlib.Heart, showing the number of observations for each combination of the variables Sex and Survive. Inside each cell, SAS prints the frequency, the overall percentage, the row percentage and the column percentage. In the category Sex-1, the count for Died is 4 (20 percent) and the count for Surv is 5 (25 percent). Similarly, in the category Sex-2, the count for Died is 5 (30 percent) and the count for Surv is 5 (25 percent).
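The program for this example is not reproduced in the text. A sketch of a two-way table, assuming Newlib.Heart has the variables Sex and Survive:

```sas
/* Two-way table: Sex forms the rows, Survive the columns. */
proc freq data=Newlib.Heart;
  table Sex*Survive;
run;
```

The variable before the asterisk defines the rows and the variable after it defines the columns, so swapping them transposes the table.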

You can use options as mentioned above with the table statement to modify your analysis.

SAS ODS

ODS stands for Output Delivery System. It is mostly used to send output to a destination format chosen by the SAS user, such as HTML, PDF, RTF, CSV or Word. It also helps in sharing output with other platforms and software, and it can combine the results of multiple PROC steps in a single file.

SYNTAX-

ODS outputfiletype FILE=“path with output filename” STYLE=stylename;
PROC step statements;
ODS outputfiletype CLOSE;

Where,

  • ODS is the keyword, followed by the output file type.
  • FILE= gives the delivery path and file name of the output.
  • STYLE= is a formatting option that names one of the built-in styles available in the SAS environment.
  • The PROC step statements produce the output to be captured.
  • The ODS statement with CLOSE closes the destination and generates the output file.

Example 22: Creating HTML Output

SAS report in .html format open via internet explorer

In Example 22, the code creates HTML output using the ODS HTML statement. The ODS HTML statement specifies the physical file path and name ‘Report.html’ and a style available in SAS. The PROC step placed between the ODS HTML statement and ODS HTML CLOSE generates the report of the data set Newlib.review2018 in .html format.
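The program for this example is not reproduced in the text. The sketch below follows the description. The output folder and the choice of the HTMLBlue style are assumptions made for illustration; only the file name Report.html and the data set Newlib.review2018 come from the article.

```sas
/* Open the HTML destination. The folder is a placeholder. */
ods html file='E:\PRACTICAL_EXAMDATA\OUTPUT\Report.html' style=HTMLBlue;

proc print data=Newlib.review2018;
run;

/* Close the destination so the file is written completely. */
ods html close;
```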

Example 23: Creating PDF Output

SAS Report in pdf 

In Example 23, the code creates PDF output using the ODS PDF statement. The ODS PDF statement specifies the physical file path and name ‘Report.PDF’ and a style available in SAS. The PROC step placed between the ODS PDF statement and ODS PDF CLOSE generates the report of the data set Newlib.review2018 in .pdf format. Along with the PROC step, the TITLE and FOOTNOTE statements set the title of the report and its endnote.
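The program for this example is not reproduced in the text. The sketch below follows the description. The output folder, the Journal style and the title and footnote wording are all assumptions made for illustration.

```sas
/* Open the PDF destination. The folder is a placeholder. */
ods pdf file='E:\PRACTICAL_EXAMDATA\OUTPUT\Report.pdf' style=Journal;

title 'Review 2018 Report';               /* hypothetical title    */
footnote 'Source: Newlib.review2018';     /* hypothetical footnote */

proc print data=Newlib.review2018;
run;

ods pdf close;

/* TITLE and FOOTNOTE are global: clear them for later steps. */
title;
footnote;
```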

This brings us to the end of this SAS tutorial. We hope this article helps you understand the concepts of SAS programming. For a detailed course experience, visit Great Learning Academy, where you will find free courses on SAS and more.

Keep Learning!

Great Learning Team