Susan Slaughter


What is SAS Viya?

In Everything, Guest Blog, SAS on January 10, 2023 at 4:40 pm

I asked Matthew Slaughter to briefly explain what SAS Viya is. Here is his answer:

There are many different products marketed under the label of “Viya,” and it’s not always clear how they relate to each other or how SAS Viya is different from previous versions of SAS. In a nutshell, SAS Viya is the version of SAS that runs on CAS. This raises the next question…

What is CAS?

CAS stands for “Cloud Analytic Services” and is the name of a massively parallel-processing platform for fast in-memory analytics on big data, with dynamic spillover to disk when data sets (also called data tables) get larger than available memory.

Traditional SAS reads and writes data on disk row by row. This can be slow, but it minimizes the amount of memory needed at any point in time, which is why it scales well. (People comparing Viya to traditional SAS often refer to “SAS 9,” but SAS has always worked this way.)

Competing analytical tools are often designed to load entire data sets into memory all at once, which can be faster but will slow down and eventually fail as the size of a data set grows and exceeds the available memory.

CAS provides a “best of both worlds” approach, where large data sets can be loaded entirely into memory but will transition to disk-based processing automatically as needed.

The many faces of SAS Viya

The common element of all the products included in Viya is that they can use CAS as the underlying analytic engine. This includes point-and-click products such as SAS Visual Analytics and SAS Visual Statistics, but it also includes traditional SAS code running in CAS. In many cases the only change needed to make a SAS program take advantage of Viya is to point it at a CAS data set. Viya also includes R and Python packages that allow programmers to manipulate CAS data sets using those languages. Finally, Viya has its own environment for instructors and students, called SAS Viya for Learners, which is separate from SAS OnDemand for Academics (which uses SAS 9).
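To make “point it at a CAS data set” more concrete, here is a minimal sketch. It assumes you are running in a SAS Viya environment with a CAS server available; the session name mysess and the libref mycas are made up for illustration, and whether a given procedure actually runs in CAS depends on your Viya release.

```sas
* Sketch only: assumes a SAS Viya environment with a CAS server available;
* The session name mysess and the libref mycas are hypothetical;
CAS mysess;                            * Start a CAS session;
LIBNAME mycas CAS CASLIB = casuser;    * Libref that points at a caslib;

* Load a traditional SAS data set into CAS as an in-memory table;
DATA mycas.cars;
   SET sashelp.cars;
RUN;

* Familiar procedure code, now pointed at the CAS table;
PROC MEANS DATA = mycas.cars;
   VAR MSRP;
RUN;

CAS mysess TERMINATE;                  * End the CAS session;
```

The only real difference from a traditional program is the libref: the PROC MEANS step itself is unchanged.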

Details, Details: Updates to The Little SAS Book

In Everything, Little SAS Book Series, Publishing, SAS on November 10, 2022 at 11:26 am

Earlier this year we quietly updated The Little SAS Book, Sixth Edition.  While these changes didn’t get a lot of attention, they are, in our opinion, critical to keeping The Little SAS Book useful and accurate. This is especially important for beginners who can’t be expected to know the history of SAS software or how it is evolving.

The updates include countless small changes, but mostly fall into a few broad categories:

  • References to SAS University Edition have been deleted.  When we wrote the Sixth Edition not so long ago, we had no way of knowing that SAS University Edition would soon be relegated to the great bit-bucket in the sky.  As of August 21, 2021, SAS University Edition is no longer supported by SAS Institute.  Microsoft forced this change when they stopped supporting virtual machines in which SAS University Edition ran.  Fortunately for people learning SAS, there is another option: SAS OnDemand for Academics is a cloud-based version of SAS that is free for non-commercial use.
  • We also clarified discussions of data set names, filenames, and paths.  The SAS language is not sensitive to case.  This is still true.  What is less obvious is that some parts of  SAS programs are not technically part of the SAS language. Filenames and paths, and even data set names, depend on your operating environment.  This doesn’t matter much if you are using an operating environment (such as Windows) that is also case insensitive. But it can matter a lot in operating environments (such as UNIX and Linux) that are sensitive to case.  It is possible to run SAS and not know which operating environment you are using. For example, SAS OnDemand for Academics runs on UNIX even if you are accessing it from another type of computer such as a Windows PC. So we took a hard look at the way we describe data set names, filenames, and paths and reworded them for clarity.
  • While we were making changes, we couldn’t resist another small one.  We added the very useful SCAN function to our table of character functions in Section 3.4.  There was just one small problem. Because there was no surplus space, we had to  remove something else to make room for SCAN.  That’s why the ANYALNUM function is now gone. However, this section still includes ANYALPHA, ANYDIGIT, and ANYSPACE so the ANY family of functions is still well represented.
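For readers who have not used SCAN, here is a small sketch of what it does. The variable names are invented for this illustration and do not come from the book.

```sas
* Sketch only: SCAN returns the nth word from a character value;
DATA _NULL_;
   BookName   = 'The Little SAS Book';
   SecondWord = SCAN(BookName, 2);               * Blank is a default delimiter;
   LastWord   = SCAN(BookName, -1);              * Negative n counts from the right;
   ThirdColor = SCAN('red,green,blue', 3, ',');  * Explicit delimiter list;
   PUT SecondWord= LastWord= ThirdColor=;
RUN;
```

The log should show SecondWord=Little LastWord=Book ThirdColor=blue.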

So how can you know if your copy of the Sixth Edition is the original version or the updated one? One easy way is to check the index to see if it includes an entry for the SCAN function.

A more technical way is to look at the back of the title page where the copyright notices appear. Near the bottom of the page, if it says

“October 2019”

then you have the original version. If it says

“Originally published October 2019 Revised March 2022”

then you have the updated version.

It’s in the Details: Keeping The Little SAS Book Accurate

In Enterprise Guide, Everything, Little SAS Book Series, SAS on November 24, 2020 at 9:25 am

The Little SAS Book, Sixth Edition is now a year old.  I have already written posts about What’s New in this edition, and the very cool XLSX LIBNAME engine.  So what more is there to say?  A lot, it turns out.  The Sixth Edition was our biggest rewrite since the Second Edition introduced the new (at the time) Output Delivery System. This post covers a few of the changes you probably didn’t notice:

  • Default output has changed. You are probably aware that the default output has long been HTML (or SASREPORT in Enterprise Guide).  What most SAS users don’t know is that technically it was HTML4, but is now HTML5 (including in Enterprise Guide).  A few years ago, the default changed from HTML3 to HTML4.  If you didn’t notice the change from 3 to 4, then you probably won’t care about the change from 4 to 5 either. But it was a big deal to us because HTML4 stored images in separate files while HTML5 embeds images in the same files as text. This required us to completely rewrite Section 8.12 with its discussion of saving graphics output. That complicated our lives, but it simplifies life for you if you use SAS to create HTML pages with graphics.  No more need to worry about links to your graphics files; now your graphics will be saved inside your HTML pages.
  • Footnotes are gone (except for the few under tables and chapter quotes).  In an effort to maximize readability, we worked important information into the text and deleted the rest.
  • Default ODS style for PDF output has changed from PRINTER to PEARL.
  • Some built-in styles have disappeared entirely including one we used a lot, D3D.  To see the current built-in styles on your system, run this: PROC TEMPLATE; LIST STYLES; RUN;
  • ODS HTML statement requires PATH= in some situations when it didn’t before.  It’s complicated so just include a PATH= option, ok?
  • PROC REPORT no longer requires the NOWINDOWS option to avoid opening a Report window.
  • Some ODS style attribute options have new names.  For example, the option FONT_SIZE= has changed to FONTSIZE=, and BACKGROUND= changed to BACKGROUNDCOLOR=.  The old option names still work, but the new ones are considered more correct.
  • Ellis Island National Monument merged with Statue of Liberty National Monument.  Also Hawaii Volcanoes National Park lost one of its museums.  (The Jaggar Museum was damaged by the eruption of Kilauea.)  We updated our data accordingly.
  • The SAS family grew.  The number of SAS installations increased from 60,000 sites in 134 countries to 83,000 sites in 147 countries.
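A couple of the ODS items above are easier to see in code. This is only a sketch: the folder c:\MyHTML and the file name Class.html are hypothetical, and the style override is just one way to exercise the newer attribute names.

```sas
* Sketch only: the folder c:\MyHTML and the file name are hypothetical;
* Including PATH= avoids guessing where the HTML file will be written;
ODS HTML PATH = 'c:\MyHTML' FILE = 'Class.html';

* Newer attribute names: BACKGROUNDCOLOR= and FONTSIZE=;
PROC PRINT DATA = sashelp.class
      STYLE(HEADER) = {BACKGROUNDCOLOR = lightblue FONTSIZE = 12pt};
   TITLE 'Newer Style Attribute Names';
RUN;

ODS HTML CLOSE;
```

Swapping in the older names BACKGROUND= and FONT_SIZE= should give the same result, since the old names still work.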

I admit that most of these are minor changes for most SAS users, but we pride ourselves on impeccable attention to detail because for programmers sometimes the details matter very much.

Data in a Time of Pandemic

In Everything, SAS on July 29, 2020 at 8:00 am

With COVID-19 spreading worldwide, accurate data have become more important than ever. In this blog, I share some of my favorite sources:

The Economist Reported deaths often underestimate actual deaths. One way to get at the real numbers is to compare total deaths from all causes with the number of deaths typically expected. This “mortality tracker” plots excess deaths, which are a more reliable measure than reported deaths.

Johns Hopkins University This interactive dashboard by the Coronavirus Resource Center at the Bloomberg School of Public Health shows detailed data about the pandemic worldwide.

National Public Radio These interactive graphics by NPR focus on the pandemic in the US.

Avi Schiffmann This webpage may be the most impressive effort by an individual person, and shows that tabular data can be profoundly thought-provoking too.

These articles are also highly recommended:

The Risks–Know Them–Avoid Them This article explains in plain language how COVID-19 spreads and how to keep yourself safe. Share this with your family.

COVID-19 Superspreader Events in 28 Countries: Critical Patterns and Lessons This fascinating article compiles data about superspreader events (SSEs) and reveals a lot about how this virus is spread.

Temporary reduction in daily global CO2 emissions during the COVID-19 forced confinement Finally, something positive: an article about the reduction in CO2 emissions due to the pandemic.

Knowledge is power. Working together we can all stay healthy.

Are you good at debugging SAS code?

In Everything, Little SAS Book Series, SAS on June 11, 2020 at 10:20 am

I have always believed that good debuggers are good coders for the simple reason that once you understand a bug, you will be better able to avoid it in the future. You can test your debugging skills with this blog I wrote for SAS Press.

 

Happy Birthday SAS Press!

In Everything, Little SAS Book Series, Publishing, SAS on June 11, 2020 at 10:14 am

Do a little time travel with this short video showing highlights from the last 30 years for SAS Press and the tech world.

Accessing Excel Files Using LIBNAME XLSX

In Enterprise Guide, Everything, Little SAS Book Series, SAS on March 12, 2020 at 1:51 pm

If you have been using SAS for long, you have probably noticed that there is generally more than one way to do anything. The Little SAS Book has long covered reading and writing Microsoft Excel files with the IMPORT and EXPORT procedures, but for the Sixth Edition we decided it was time to add two more ways: The ODS EXCEL destination makes it easy to convert procedure output into Excel files, while the XLSX LIBNAME engine allows you to access Excel files as if they were SAS data sets.

With the XLSX LIBNAME engine, you can convert an Excel file to a SAS data set (or vice versa) if you want to, but you can also access an Excel file directly without the need for a SAS data set. This engine works for files created using any version of Microsoft Excel 2007 or later in the Windows or UNIX operating environments. You must have SAS 9.4M2 or higher and SAS/ACCESS Interface to PC Files software. A nice thing about this engine is that it works with any combination of 32-bit and 64-bit systems.

The XLSX LIBNAME engine uses the first line in your file for the variable names, scans each full column to determine the variable type (character or numeric), assigns lengths to character variables, and recognizes dates and numeric values containing commas or dollar signs. While the XLSX LIBNAME engine does not offer many options, because you are using an Excel file like a SAS data set, you can use some standard data set options. For example, you can use the RENAME= data set option to change the names of variables, and FIRSTOBS= and OBS= to select a subset of rows.
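As a quick sketch of those data set options, the following borrows the path, sheet name, and variable name from the examples that follow in this post. The particular row numbers are arbitrary.

```sas
* Sketch only: path, sheet, and variable names match the examples in this post;
LIBNAME exfiles XLSX 'c:\MyExcel\Trees.xlsx';

* Print only rows 2 through 4 and rename a variable on the fly;
PROC PRINT DATA = exfiles.sheet1 (FIRSTOBS = 2 OBS = 4
                                  RENAME = (MaxHeight = MaxHeightFeet));
   TITLE 'Subset of Excel File';
RUN;

LIBNAME exfiles CLEAR;
```

Nothing is changed in the Excel file itself; the options only affect what this one step sees.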

Reading an Excel file as is 

Suppose you have the following Excel file containing data about magnolia trees:


With the XLSX LIBNAME engine, SAS can read the file, without first converting it to a SAS data set. Here is a PROC PRINT that prints the data directly from the Excel file.

* Read Excel spreadsheet using XLSX LIBNAME;
LIBNAME exfiles XLSX 'c:\MyExcel\Trees.xlsx';
PROC PRINT DATA = exfiles.sheet1;
   TITLE 'PROC PRINT of Excel File';
RUN;

Here are the results of the PROC PRINT. Notice that the variable names were taken from the first row in the file.

Converting an Excel file to a SAS data set 

If you want to convert an Excel file to a SAS data set, you can do that too. Here is a DATA step that reads the Excel file. The RENAME= data set option changes the variable name MaxHeight to MaxHeightFeet. Then a new variable is computed which is equal to the height in meters.

* Import Excel into a SAS data set;
DATA magnolia;
   SET exfiles.sheet1 (RENAME = (MaxHeight = MaxHeightFeet));
   * Convert feet to meters, rounded to the nearest whole meter;
   MaxHeightMeters = ROUND(MaxHeightFeet * 0.3048);
RUN;

Here is the SAS data set with the new variable:

Writing to an Excel file 

It is just as easy to write to an Excel file as it is to read from it.

* Write a new sheet to the Excel file;
DATA exfiles.trees;
   SET magnolia;
RUN;
LIBNAME exfiles CLEAR;

Here is what the Excel file looks like with the new sheet. Notice that the new tab is labeled with the name of the SAS data set TREES.

Another nice thing about the XLSX LIBNAME engine is that it only locks a spreadsheet while SAS is accessing it. So, generally speaking, it’s not necessary to issue a second LIBNAME statement to clear the libref. However, when I ran this in SAS Enterprise Guide, I found that I could not open the Excel spreadsheet until I cleared the libref. So you can probably skip the LIBNAME CLEAR statement if you are using Display Manager or SAS Studio, but keep it if you are using Enterprise Guide.

The XLSX LIBNAME engine is so flexible and easy to use that we think it’s a great addition to any SAS programmer’s skill set.

For more about the XLSX LIBNAME engine, I recommend this blog by Chris Hemedinger.

The Little SAS Book 6.0: What’s New

In Enterprise Guide, Everything, Little SAS Book Series, SAS on November 7, 2019 at 2:37 pm

Six editions is a lot! If you had told us, back when we wrote the first edition of The Little SAS Book, that someday we would write a sixth, we would have wondered how we could possibly find that much to say. After all, it is supposed to be The Little SAS Book, isn’t it? But the developers at SAS Institute are constantly hard at work inventing new and better ways of analyzing and visualizing data. And some of those ways turn out to be so fundamental that they belong even in a little book about SAS.

Interface independence

One of the biggest changes to SAS software in recent years is the proliferation of interfaces. SAS programmers have more choices than ever before. Previous editions contained some sections specific to the SAS windowing environment (also called Display Manager). We wrote this edition for all SAS programmers whether you use SAS Studio, SAS Enterprise Guide, the SAS windowing environment, or run in batch. That sounds easy, but it wasn’t. There are differences in how SAS behaves with different interfaces, and these differences can be very fundamental. In particular, the system option that sets the rules for names of variables varies depending on how you run SAS. So old sections had to be rewritten, and we added a whole new section showing how to use variable names containing blanks and special characters.
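The system option in question is VALIDVARNAME=. Here is a minimal sketch of variable names containing blanks and special characters; the data set and variable names are invented for this illustration.

```sas
* Sketch only: the data set and variable names are made up;
* VALIDVARNAME=ANY allows blanks and special characters in variable names;
OPTIONS VALIDVARNAME = ANY;

DATA trees;
   'Max Height (ft)'n = 80;
   * A name literal is the name in quotation marks followed by the letter n;
   'Max Height (m)'n = ROUND('Max Height (ft)'n * 0.3048, 0.1);
RUN;

PROC PRINT DATA = trees;
RUN;
```

Setting the option explicitly, rather than relying on the default for your interface, is one way to make a program behave the same everywhere.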

New ways to read and write Microsoft Excel files

Previous editions already covered how to read and write Microsoft Excel files, but SAS developers have created some great new ways. This edition contains new sections about the XLSX LIBNAME engine and the ODS EXCEL destination.
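As a rough sketch of the ODS EXCEL destination, the following sends procedure output to a workbook. The output path and the sheet name are hypothetical.

```sas
* Sketch only: the output path and sheet name are hypothetical;
ODS EXCEL FILE = 'c:\MyExcel\ClassReport.xlsx'
          OPTIONS(SHEET_NAME = 'Means');

PROC MEANS DATA = sashelp.class;
   VAR Height Weight;
RUN;

* The workbook is not complete until the destination is closed;
ODS EXCEL CLOSE;
```

Unlike the XLSX LIBNAME engine, which treats a sheet as data, this writes formatted procedure output.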

More PROC SQL

From the very first edition, The Little SAS Book has always covered PROC SQL. But it was in an appendix, and over time we noticed that most people ignore appendices. So for this edition, we removed the appendix and added new sections on using PROC SQL to

  • Subset your data
  • Join data sets
  • Add summary statistics to a data set
  • Create macro variables with the INTO clause

For people who are new to SQL, these sections provide a good introduction; for people who already know SQL, they provide a model of how to leverage SQL in your SAS programs.
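To give a flavor of these topics, here is a short sketch using the SASHELP.CLASS sample data. It is not taken from the book, the table and macro variable names are invented, and it leaves out joins to stay brief.

```sas
* Sketch only: table and macro variable names are made up;
PROC SQL;
   /* Subset rows and add a summary statistic to every row kept */
   CREATE TABLE teens AS
      SELECT Name, Age, Height, MEAN(Height) AS AvgHeight
         FROM sashelp.class
         WHERE Age >= 13;

   /* Store a count in a macro variable with the INTO clause */
   SELECT COUNT(*) INTO :NumTeens TRIMMED
      FROM teens;
QUIT;

%PUT NOTE: There are &NumTeens students aged 13 or older.;
```

Because the first query mixes a summary function with ordinary columns, SAS writes a note to the log saying that it remerged the summary statistics with the original data. That remerging is what puts the average on every row.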

Updates and additions throughout the book

Almost every section in this edition has been changed in some way. We added new options, made sure everything is up-to-date, and ran every example in every SAS interface noting any differences. For example, PROC SGPLOT has some new options, the default ODS style for PDF has changed, and the LISTING destination behaves differently in different interfaces. Here’s a short list, in no particular order, of new or expanded topics in the sixth edition:

  • More examples with permanent SAS data sets, CSV files, or tab-delimited files
  • More log notes throughout the book showing what to look for
  • LIKE or sounds-like (=*) operators in WHERE statements
  • CROSSLIST, NOCUM, and NOPRINT options in PROC FREQ
  • Grouping data with a user-defined format and the PUT function
  • Iterative DO groups
  • DO WHILE and DO UNTIL statements
  • %DO statements

Even though we have added a lot to this edition, it is still a little book.  In fact, this edition is shorter than the last—by twelve pages! We think this is the best edition yet.

Now Available: The Little SAS Book, Sixth Edition

In Little SAS Book Series, SAS on October 21, 2019 at 11:46 am

I am excited to announce that the sixth edition of The Little SAS Book is now available. We spent over a year rewriting and updating, and this may well be the best edition yet.

You can download a sample chapter or purchase e-book versions (PDF, EPUB or Kindle) by visiting SAS Press’ site.

If, like me, you like to be able to flip the pages and make notes in the margin, then you can get a hard copy (in paperback or hardback!) from Amazon.

SAS Global Forum 2019

In Everything, SAS, SAS Global Forum on April 24, 2019 at 1:57 pm

I’m excited because in a couple of days I will fly to Dallas for SAS Global Forum 2019, the biggest SAS conference of the year, attended by thousands.

If you are coming, I hope you will say hello to me.  If you can’t make it to Dallas, you’ll be glad to know that many presentations will be livecast. Here is the schedule.

A few highlights:

Sunday, April 28, 7:00-8:30 pm CT–Opening Session

Monday, April 29, 8:30-10:00 am CT–General Session: Technology Connection

Tuesday, April 30, 3:00-4:00 pm CT–Career Advice We’d Give to Our Kids: A Panel Discussion

Wednesday, May 1, 10:30-11:30 am CT–The Good, the Bad, and the Creepy: Why Data Scientists Need to Understand Ethics

These presentations may not be available after the conference so check the schedule and make sure to tune in at the right time.