Subhro provides valuable and informative content on SAS, offering a comprehensive understanding of SAS concepts. We have been creating SAS tutorials since 2019, and 9to5sas has become one of the leading free SAS resources available on the internet.
Data manipulation and management are vast fields in which SAS stands out as an incredibly versatile and potent tool. Where Statements in SAS are the key to its functionality, allowing users to refine their data operations with surgical precision.
Do you fully utilize its potential, however? Our post “Advanced Techniques: Mastering the Where Statement in SAS” will ensure just that. Learn how to make the most of this essential SAS statement, discover expert tricks, and elevate your expertise.
In this article, we will help you understand the nuances of this potent statement, helping you unlock the keys to enhancing data management.
You can use the WHERE statement in SAS to filter data when reading from a SAS dataset.
Example:
You may use a WHERE or a subsetting IF statement in this example. There are advantages to using a WHERE statement in SAS instead of a subsetting IF statement.
You have a larger alternative of operators that can be used with a WHERE statement, and if the input data set is indexed, the WHERE statement in SAS is likely more efficient.
You may also use a WHERE statement in a SAS procedure to subset the data being processed.
Note: IF statements are not allowed inside SAS procedures.
In SAS, the WHERE statement can accommodate multiple conditions by using logical operators such as AND , OR , and NOT .
Let’s create a dataset named ‘Employee_data’ in SAS with ‘Name’ and ‘Salary’ as the variables.
Let’s consider an example where we want to extract data for employees earning more than $50,000 but less than or equal to $80,000 from the ‘Employee_data’ dataset.
50000 AND Salary
In this case, the WHERE statement is combined with the AND operator. Only those records from ‘Employee_data’ where ‘Salary’ is greater than $50,000 and less than or equal to $80,000 will be selected and saved to the ‘selected_data’ dataset.
Similarly, the OR operator can be used to select records based on any of the conditions being met. For example, if we want to select employees with salaries either less than $30,000 or more than $70,000, the WHERE statement would be as follows:
70000; RUN;
The NOT operator can be used to exclude records that meet certain conditions. If we want to select all employees who are not high earners (with salaries greater than $70,000), we can use the WHERE statement as follows:
70000); RUN;
Below is the list of operators that you can use with the WHERE statement in SAS.
The IS MISSING and IS NULL operators can handle character or numeric variables in WHERE, ON, and HAVING expressions.
It results in true if the expression results are missing and false if it is not missing.
They also work with the NOT operator.
The BETWEEN-AND operator in SAS is a useful operator for filtering data within a specified range. It can be applied to both numeric and date variables, making it highly versatile for various data analysis tasks. Typically, the BETWEEN-AND operator is used within a WHERE statement to filter data that falls between two values.
Consider a simple example where we have a dataset called ‘sashelp.class’. Suppose we want to filter out names with weight between 60 and 90. The SAS code with BETWEEN-AND would look like this:
The NOT operator can be used with the WHERE statement in many other ways:
Multiple WHERE statements cannot be used in a DATA step as IF statements. If SAS encounters a second WHERE statement, it will replace the first.
The WHERE AND also known as WHERE ALSO will let you use multiple where statements in SAS to add more restrictions.
12 ; run;
The SAS IN operator is a comparison operator that searches for character and numeric values equal to one of the values from a given list of values.
The SAS where in list of values must be in parentheses, with each character value in quotation marks and separated by either a comma or blank.
For example, suppose you want to filter for SUVs, sedans or Wagon cars. You could specify the values as:
Note the use of the strip and Upcase function in the type variable. This ensures that any leading or trailing spaces are removed, and the values will be converted to Uppercase before comparison.
In addition to this, you can use the logical operator NOT to exclude values that are on the list.
You can also use a shorthand notation to specify a range of numbers to search. The range is specified by using the syntax M: N as a value in the list to search, where M is the lower bound and N is the upper bound.
You can replace the below statement in the form of WHERE IN shorthand notation.
The CONTAINS operator returns a true value if the character string on the left side of the operator contains the string on the right side.
^= (or NOT CONTAINS) returns a true value if the character string on the left side of the operator does not contain the string on the right side.
The “contains” operator (?) and the “not contains” operator (^?) match a substring that appears anywhere in the target character variable.
LIKE operator is frequently used for pattern matching, that is, evaluating whether a variable value equals, begins with a specified character, or sounds like a specified value or pattern.
The LIKE expression uses two wildcard operators. When using the LIKE operator, the underscore(_) wildcard takes the place of a single character, whereas the % sign might be substituted for a string of any length (including a null string).
Double quotes (“) are used here to include the apostrophe in “Women’s Dress”. Single quotes can be used when the text string does not contain an apostrophe or other single quotes.
The percentage sign (%) wildcard searches for character values that begin with, end with or even contain certain character strings. Below are some of the examples using wildcards.
The above expression returns all names that begin with “a” and are followed by any characters of any lengths
_ indicates any character at start followed by “a” and ending with any char of any length.
At the beginning of a search, two underscores indicate that any two characters followed by “n” should be filtered.
The above expression would return character values with two characters at the beginning followed by n and any one character at the end.
The SOUNDS-LIKE operator is based on the SOUNDEX algorithm for identifying words that sound alike. The SOUNDEX algorithm is English-biased, so it’s not useful for languages other than English.
You can download the dataset employees.sas7bdat used in the program.
Using the WHERE statement in SAS and a WHERE= dataset option in a DATA step is not a good practice. SAS ignores the WHERE statement when both are used in the same dataset.
Additionally, a WARNING appears in the SAS log indicating that the WHERE statement cannot be applied.
In SAS, both the IF and WHERE statements are used to filter data based on certain conditions. However, they operate at different points in the data processing stage and thus have distinct functionalities and uses.
For example, if you have a dataset named ‘Employee_data’ with ‘Name’ and ‘Salary’ as the variables, and you want to select employees who earn more than $50,000, you could use a WHERE statement or an IF statement as follows:
Using a WHERE Statement:
50000; RUN;
Using an IF Statement:
50000; RUN;
What is the WHERE statement in SAS?
The u003ccodeu003eWHEREu003c/codeu003e statement in SAS is a data step statement used to select specific observations from a dataset based on certain conditions.
Can I use multiple conditions in a SAS WHERE statement?
Yes, the u003ccodeu003eWHEREu003c/codeu003e statement in SAS can handle multiple conditions using logical operators like u003ccodeu003eANDu003c/codeu003e, u003ccodeu003eORu003c/codeu003e, and u003ccodeu003eNOTu003c/codeu003e.
How can I select a range of values using a WHERE statement in SAS?
The u003ccodeu003eBETWEEN-ANDu003c/codeu003e operator can be used in a u003ccodeu003eWHEREu003c/codeu003e statement in SAS to select a range of values. For example, u003ccodeu003eWHERE Salary BETWEEN 40000 AND 60000;u003c/codeu003eTo clarify, the query will retrieve all records where the ‘Salary’ field is within the range of $40,000 to $60,000, inclusive.
What does the ^= operator do in a WHERE statement?
The u003ccodeu003e^=u003c/codeu003e operator is the u0022not containsu0022 operator in SAS. It is used to select observations where a certain character string is not present in a variable. For example, u003ccodeu003eWHERE Position ^= ‘Manager’;u003c/codeu003e would select all records where the ‘Position’ does not contain the word ‘Manager’.
Can I use the WHERE statement with date values in SAS?
Yes, you can use the u003ccodeu003eWHEREu003c/codeu003e statement with date values in SAS. For example, u003ccodeu003eWHERE HireDate BETWEEN ’01JAN2015’d AND ’31DEC2020’d;u003c/codeu003e would select all records where the ‘HireDate’ falls within the specified date range.
Can you use a WHERE statement in proc freq?
Yes, you can use a u003ccodeu003eWHEREu003c/codeu003e statement in u003ccodeu003ePROC FREQu003c/codeu003e in SAS to limit the analysis to a subset of data that meets the specified condition(s). The u003ccodeu003eWHEREu003c/codeu003e statement is used before the u003ccodeu003eTABLESu003c/codeu003e statement in the u003ccodeu003ePROC FREQu003c/codeu003e call.
In conclusion, mastering the Where Statement In SAS is pivotal for anyone seeking to harness the full potential of SAS programming for efficient data management. The ability to accurately apply and interpret WHERE statements, whether used independently or in combination with logical operators, is an indispensable skill in today’s data-driven world. This article has provided detailed insights into the application of the WHERE statement with multiple conditions, enabling you to implement complex logical operations with ease and precision.
Furthermore, we explored the difference between IF and WHERE statements, highlighting the unique strengths of the latter. The utility of the WHERE statement in various PROC procedures like PROC FREQ was also discussed, showing the breadth of its application. We trust that these advanced techniques have enriched your understanding and will empower your journey in SAS programming.
Subhro provides valuable and informative content on SAS, offering a comprehensive understanding of SAS concepts. We have been creating SAS tutorials since 2019, and 9to5sas has become one of the leading free SAS resources available on the internet.
This site uses Akismet to reduce spam. Learn how your comment data is processed.