Data in the Time of Pandemic

With COVID-19 spreading worldwide, accurate data have become more important than ever. In this blog, I share some of my favorite sources:

The Economist
Reported deaths often underestimate actual deaths. One way to get at the real numbers is to compare total deaths from all causes with the number of deaths typically expected. This “mortality tracker” plots excess deaths, which are a more reliable measure than reported deaths.

Johns Hopkins University
This interactive dashboard by the Coronavirus Resource Center at the Bloomberg School of Public Health shows detailed data about the pandemic worldwide.

National Public Radio
These interactive graphics by NPR focus on the pandemic in the US.

SAS Institute
This interactive dashboard gives a different look at global COVID-19 data.

Avi Schiffmann
This webpage may be the most impressive effort by an individual, and it shows that tabular data can be profoundly thought-provoking too.

These articles are also highly recommended:

The Risks–Know Them–Avoid Them
This article explains in plain language how COVID-19 spreads and how to keep yourself safe. Share this with your family.

COVID-19 Superspreader Events in 28 Countries: Critical Patterns and Lessons
This fascinating article compiles data about superspreader events (SSEs) and reveals a lot about how this virus is spread.

Temporary reduction in daily global CO2 emissions during the COVID-19 forced confinement
Finally, something positive: an article about the reduction in CO2 emissions due to the pandemic.

Knowledge is power. Working together we can all stay healthy.

Are you good at debugging SAS code?

I have always believed that good debuggers are good coders for the simple reason that once you understand a bug, you will be better able to avoid it in the future. You can test your debugging skills with this blog I wrote for SAS Press.


Happy Birthday SAS Press!

Do a little time travel with this short video showing highlights from the last 30 years for SAS Press and the tech world.

Accessing Excel Files Using LIBNAME XLSX

If you have been using SAS for long, you have probably noticed that there is generally more than one way to do anything. The Little SAS Book has long covered reading and writing Microsoft Excel files with the IMPORT and EXPORT procedures, but for the Sixth Edition we decided it was time to add two more ways: the ODS EXCEL destination makes it easy to convert procedure results into Excel files, while the XLSX LIBNAME engine allows you to access Excel files as if they were SAS data sets.
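As a minimal sketch of the ODS EXCEL approach, the following sends the results of a procedure to a new Excel workbook. The output file name ClassReport.xlsx and the use of the SASHELP.CLASS sample data set are assumptions made for illustration; they are not taken from the book.

```sas
* Sketch: send procedure results to an Excel file;
* The output path below is hypothetical;
ODS EXCEL FILE = 'c:\MyExcel\ClassReport.xlsx';

PROC MEANS DATA = sashelp.class;
   VAR Height Weight;
   TITLE 'Summary of SASHELP.CLASS';
RUN;

* Close the destination so the workbook is written;
ODS EXCEL CLOSE;
```

The workbook is not complete until the ODS EXCEL CLOSE statement runs, so it is worth pairing every ODS EXCEL FILE= statement with a matching CLOSE.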

With the XLSX LIBNAME engine, you can convert an Excel file to a SAS data set (or vice versa) if you want to, but you can also access an Excel file directly without the need for a SAS data set. This engine works for files created using any version of Microsoft Excel 2007 or later in the Windows or UNIX operating environments. You must have SAS 9.4M2 or higher and SAS/ACCESS Interface to PC Files software. A nice thing about this engine is that it works with any combination of 32-bit and 64-bit systems.

The XLSX LIBNAME engine uses the first line in your file for the variable names, scans each full column to determine the variable type (character or numeric), assigns lengths to character variables, and recognizes dates and numeric values containing commas or dollar signs. The XLSX LIBNAME engine does not offer many options of its own, but because you are using an Excel file like a SAS data set, you can use some standard data set options. For example, you can use the RENAME= data set option to change the names of variables, and FIRSTOBS= and OBS= to select a subset of rows.
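Here is a short sketch of those data set options in action. The libref, file path, sheet name, and the variable MaxHeight are borrowed from the magnolia example in this post; the choice of rows 2 through 4 is arbitrary and only for illustration.

```sas
* Sketch: use standard data set options with an Excel sheet;
LIBNAME exfiles XLSX 'c:\MyExcel\Trees.xlsx';

* Check the variable names, types, and lengths the engine assigned;
PROC CONTENTS DATA = exfiles.sheet1;
RUN;

* Print only observations 2 through 4, renaming one variable;
PROC PRINT DATA = exfiles.sheet1 (FIRSTOBS = 2 OBS = 4
                                  RENAME = (MaxHeight = MaxHeightFeet));
   TITLE 'Observations 2-4 from the Excel File';
RUN;
```

Running PROC CONTENTS first is a quick way to confirm that each column was read with the type you expected before you build on it.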

Reading an Excel file as is 

Suppose you have the following Excel file containing data about magnolia trees:

With the XLSX LIBNAME engine, SAS can read the file without first converting it to a SAS data set. Here is a PROC PRINT that prints the data directly from the Excel file.

* Read Excel spreadsheet using XLSX LIBNAME;
LIBNAME exfiles XLSX 'c:\MyExcel\Trees.xlsx';
PROC PRINT DATA = exfiles.sheet1;
   TITLE 'PROC PRINT of Excel File';
RUN;

Here are the results of the PROC PRINT. Notice that the variable names were taken from the first row in the file.

Converting an Excel file to a SAS data set 

If you want to convert an Excel file to a SAS data set, you can do that too. Here is a DATA step that reads the Excel file. The RENAME= data set option changes the variable name MaxHeight to MaxHeightFeet. Then a new variable, MaxHeightMeters, is computed as the height in meters, rounded to the nearest whole number.

* Import Excel into a SAS data set;
DATA magnolia;
   SET exfiles.sheet1 (RENAME = (MaxHeight = MaxHeightFeet));
   MaxHeightMeters = ROUND(MaxHeightFeet * 0.3048);
RUN;

Here is the SAS data set with the renamed and new variables:

Writing to an Excel file 

It is just as easy to write to an Excel file as it is to read from it.

* Write a new sheet to the Excel file;
DATA exfiles.trees;
   SET magnolia;
RUN;

Here is what the Excel file looks like with the new sheet. Notice that the new tab is labeled with the name of the SAS data set TREES.

Another nice thing about the XLSX LIBNAME engine is that it locks a spreadsheet only while SAS is accessing it. So, generally speaking, it’s not necessary to issue a second LIBNAME statement to clear the libref. However, when I ran this in SAS Enterprise Guide, I found that I could not open the Excel spreadsheet unless I cleared the libref. So you can probably skip the LIBNAME CLEAR statement if you are using Display Manager or SAS Studio.
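If you do need to release the file, as I did in SAS Enterprise Guide, clearing the libref takes one statement. This sketch assumes the libref is still named exfiles, as in the examples in this post.

```sas
* Clear the libref so other applications can open the Excel file;
LIBNAME exfiles CLEAR;
```

After this statement runs, the libref no longer exists, so you would need to issue the LIBNAME XLSX statement again before reading or writing more sheets.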

The XLSX LIBNAME engine is so flexible and easy to use that we think it’s a great addition to any SAS programmer’s skill set.

For more about the XLSX LIBNAME engine, I recommend this blog by Chris Hemedinger.

The Little SAS Book 6.0: What’s New

Six editions is a lot! If you had told us, back when we wrote the first edition of The Little SAS Book, that someday we would write a sixth, we would have wondered how we could possibly find that much to say. After all, it is supposed to be The Little SAS Book, isn’t it? But the developers at SAS Institute are constantly hard at work inventing new and better ways of analyzing and visualizing data. And some of those ways turn out to be so fundamental that they belong even in a little book about SAS.

Interface independence

One of the biggest changes to SAS software in recent years is the proliferation of interfaces. SAS programmers have more choices than ever before. Previous editions contained some sections specific to the SAS windowing environment (also called Display Manager). We wrote this edition for all SAS programmers whether you use SAS Studio, SAS Enterprise Guide, the SAS windowing environment, or run in batch. That sounds easy, but it wasn’t. There are differences in how SAS behaves with different interfaces, and these differences can be very fundamental. In particular, the system option that sets the rules for names of variables varies depending on how you run SAS. So old sections had to be rewritten, and we added a whole new section showing how to use variable names containing blanks and special characters.
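To give a flavor of that new section, here is a minimal sketch of variable names containing blanks. It assumes the system option in question is VALIDVARNAME=, and the data set name, variable names, and value are made up for illustration.

```sas
* Sketch: variable names containing blanks;
* Assumes the system option in question is VALIDVARNAME=;
OPTIONS VALIDVARNAME = ANY;

DATA trees;
   * Hypothetical value, for illustration only;
   'Max Height Feet'n = 80;
   'Max Height Meters'n = ROUND('Max Height Feet'n * 0.3048);
RUN;

PROC PRINT DATA = trees;
   VAR 'Max Height Feet'n 'Max Height Meters'n;
RUN;
```

A name written as a quoted string followed by the letter n is called a name literal. With VALIDVARNAME=V7, by contrast, variable names are limited to letters, numerals, and underscores, so the same program would fail.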

New ways to read and write Microsoft Excel files

Previous editions already covered how to read and write Microsoft Excel files, but SAS developers have created some great new ways. This edition contains new sections about the XLSX LIBNAME engine and the ODS EXCEL destination.


From the very first edition, The Little SAS Book has always covered PROC SQL. But it was in an appendix, and over time we noticed that most people ignore appendices. So for this edition, we removed the appendix and added new sections on using PROC SQL to

  • Subset your data
  • Join data sets
  • Add summary statistics to a data set
  • Create macro variables with the INTO clause

For people who are new to SQL, these sections provide a good introduction; for people who already know SQL, they provide a model of how to leverage SQL in your SAS programs.
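The four tasks listed above can be sketched in a single PROC SQL step. This is only an illustration, not an example from the book: it uses the SASHELP.CLASS sample data set, and the table names (teens, withmean, joined) and the macro variable AvgHeight are hypothetical.

```sas
PROC SQL;
   /* Subset your data: keep students aged 13 and over */
   CREATE TABLE teens AS
      SELECT Name, Sex, Age, Height
      FROM sashelp.class
      WHERE Age >= 13;

   /* Add summary statistics to a data set: SAS remerges */
   /* the group mean back onto every row                 */
   CREATE TABLE withmean AS
      SELECT Name, Sex, Height, MEAN(Height) AS MeanHeight
      FROM sashelp.class
      GROUP BY Sex;

   /* Join data sets on the student name */
   CREATE TABLE joined AS
      SELECT t.Name, t.Age, w.MeanHeight
      FROM teens AS t INNER JOIN withmean AS w
      ON t.Name = w.Name;

   /* Create a macro variable with the INTO clause */
   SELECT MEAN(Height) INTO :AvgHeight TRIMMED
      FROM sashelp.class;
QUIT;

%PUT Average height is &AvgHeight;
```

When a summary function is selected alongside columns that are not in the GROUP BY clause, SAS writes a note to the log saying the query required remerging summary statistics with the original data. That note is expected here.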

Updates and additions throughout the book

Almost every section in this edition has been changed in some way. We added new options, made sure everything is up-to-date, and ran every example in every SAS interface, noting any differences. For example, PROC SGPLOT has some new options, the default ODS style for PDF has changed, and the LISTING destination behaves differently in different interfaces. Here’s a short list, in no particular order, of new or expanded topics in the sixth edition:

  • More examples with permanent SAS data sets, CSV files, or tab-delimited files
  • More log notes throughout the book showing what to look for
  • LIKE or sounds-like (=*) operators in WHERE statements
  • Grouping data with a user-defined format and the PUT function
  • Iterative DO groups
  • DO WHILE and DO UNTIL statements
  • %DO statements
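For readers who have not met some of these features, here is a brief sketch of several of them. It is not taken from the book: it uses the SASHELP.CLASS sample data set, and the format name agegrp, the data set names, and the macro name printobs are hypothetical.

```sas
* LIKE and sounds-like (=*) operators in a WHERE statement;
PROC PRINT DATA = sashelp.class;
   WHERE Name LIKE 'J%' OR Name =* 'Jon';
   TITLE 'Names starting with J or sounding like Jon';
RUN;

* Grouping data with a user-defined format and the PUT function;
PROC FORMAT;
   VALUE agegrp LOW -< 13 = 'Preteen'
                13 - HIGH = 'Teen';
RUN;

DATA grouped;
   SET sashelp.class;
   AgeGroup = PUT(Age, agegrp.);
RUN;

* Iterative DO, DO WHILE, and DO UNTIL, each writing three rows;
DATA loops;
   LENGTH LoopType $ 9;
   LoopType = 'Iterative';
   DO i = 1 TO 3;
      OUTPUT;
   END;
   LoopType = 'DO WHILE';
   i = 1;
   DO WHILE (i <= 3);
      OUTPUT;
      i = i + 1;
   END;
   LoopType = 'DO UNTIL';
   i = 1;
   DO UNTIL (i > 3);
      OUTPUT;
      i = i + 1;
   END;
RUN;

* %DO statement inside a macro: one PROC PRINT per observation;
%MACRO printobs(n);
   %DO k = 1 %TO &n;
      PROC PRINT DATA = sashelp.class (FIRSTOBS = &k OBS = &k);
         TITLE "Observation &k";
      RUN;
   %END;
%MEND printobs;

%printobs(2)
```

DO WHILE tests its condition at the top of the loop, while DO UNTIL tests at the bottom, so a DO UNTIL loop always runs at least once.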

Even though we have added a lot to this edition, it is still a little book.  In fact, this edition is shorter than the last—by twelve pages! We think this is the best edition yet.

What is Data Literacy?

It has recently become fashionable to talk about data literacy. This is an important idea so I’m glad to see people discussing it.

To me, data literacy means understanding that data are not dry, dusty, abstract squiggles on a computer screen, but represent living things: people, plants, animals. Having a deep understanding of data enables people to engage with data, to explore data, to interpret data, and to use data to impact their lives and work.

Data literacy necessarily comes with a degree of skepticism, recognizing that data can be not only used, but also misused. In this age of “alternative facts,” it is important to recognize when assertions are supported by data, and when they are not.

Everyone knows that technology is becoming more and more a part of everyday life. Without data literacy, people become passive recipients of technology; with data literacy, they can actively engage with it.

You know you are fluent in a foreign language when you are comfortable speaking it and can communicate what you want to say. The same is true for data literacy; it is about reaching a level of comfort, about seeing the meaning behind the data, and about being able to communicate what you want to say.

Here is how Wikipedia defines data literacy.

SAS Global Forum 2019

I’m excited because in a couple of days I will fly to Dallas for SAS Global Forum 2019, the biggest SAS conference of the year, attended by thousands.

If you are coming, I hope you will say hello to me.  If you can’t make it to Dallas, you’ll be glad to know that many presentations will be livecast. Here is the schedule.

A few highlights:

Sunday, April 28, 7:00-8:30 pm CT–Opening Session

Monday, April 29, 8:30-10:00 am CT–General Session: Technology Connection

Tuesday, April 30, 3:00-4:00 pm CT–Career Advice We’d Give to Our Kids: A Panel Discussion

Wednesday, May 1, 10:30-11:30 am CT–The Good, the Bad, and the Creepy: Why Data Scientists Need to Understand Ethics

These presentations may not be available after the conference so check the schedule and make sure to tune in at the right time.





Everybody Needs Career Development

This year I’ve had the honor of helping to recruit speakers for the Career Development area at SAS Global Forum. We have some fantastic presentations that everyone can benefit from whether you are a student, a new graduate, or a mid-career professional.

I particularly recommend the panel discussion (Career Advice We’d Give to Our Kids) on Tuesday, April 30, 3:00-4:00 pm in Level 2, Ballroom C4. The panelists (Shelley Blozis, AnnMaria De Mars, Paul LaBrec) are all great, so this should be both informative and entertaining.

The following presentations are listed in order by day and time. As you scroll through this list, you may notice that most (but not all!) of these presentations are in Level 1 Room D168.

Poster (available every day)
Tips to Ensure Success in Your New SAS Project
Flora Fang Liu

Tuesday, April 30, 2019

10:00-11:00 Level 1, D168
Don’t Just Survive, Thrive! A Learning-Based Strategy for a Modern Organization Centered Around SAS
Jovan Marjanovic

11:00-12:00 Level 1, D168
The Power of Know-How: Pump Up Your Professional Value by Refining Your SAS Skills
Gina Huff

1:00-1:15 Level 2, Exhibit Hall D, Super Demo 12
SAS Programming Exam Moves to Performance-Based Format
Mark Stevens

1:30-2:00 Level 1, D168
The Why and How of Teaching SAS to High School Students
Jennifer Richards

2:00-2:30 Level 1, D168
Puzzle Me, Puzzle You: How a Thought Experiment Became a Rubik’s Cube Among a Set of Fun Puzzles
Amit Patel, Lewis Mitchell

2:30-3:00 Level 1, D168
How to Land Work as a SAS Professional
Charu Shankar

3:00-3:15 Level 2, Exhibit Hall D, Super Demo 12
Take SAS Certification Exams from Home Online Proctored
Terry Barham

3:00-4:00 Level 2, Ballroom C4
Panel Discussion: Career Advice We’d Give to Our Kids
Shelley Blozis, AnnMaria De Mars, Paul LaBrec

3:00-4:00 Level 1, D168
How To Be an Effective Statistician
Alexander Schacht

4:00-5:00 Level 1, D168
Stories from the Trenches: Tips and Techniques for Career Advancement from a SAS Industry Recruiter
Molly Hall

5:00-5:30 Level 1, D168
How to HOW: Hands-on-Workshops Made Easy
Chuck Kincaid

Wednesday, May 1, 2019

10:00-11:00 Level 2, Ballroom C3
Tell Me a Data Story
Kat Greenbrook

10:00-11:00 Level 2, Ballroom C4
The Good, The Bad, and The Creepy: Why Data Scientists Need to Understand Ethics
Jennifer Priestley

11:30-12:00 Level 1, D168
New to SAS? Helpful Hints for Developing Your Own Professional Development Plan
Kelly Smith

Tips for Learning SAS

New to SAS?  Here are tips from the translator of The Little SAS Book, Fifth Edition.

Hongqiu Gu, Ph.D. works at the China National Clinical Research Center for Neurological Diseases at the National Center for Healthcare Quality Management in Neurological Diseases at Beijing Tiantan Hospital, Capital Medical University.

He shared these important tips for learning SAS well:

1.  Read SAS Documentation

I have not counted the number of SAS books I have read; I would estimate over 50 or 60.  The best books to give me a deep understanding of SAS are in the SAS Documentation, including SAS Language Reference: Concepts, SAS Functions and CALL Routines: Reference, SAS Macro Language: Reference, and so on.  There are lots of excellent books published by SAS Press, and usually they are concise and suitable for quick learners.  However, when I realized that SAS could give me a powerful career advantage, I needed to learn SAS systematically and deeply.  I believe the SAS Documentation provides the most authoritative and comprehensive learning materials.  Besides, the updated SAS Documentation is free to all readers.

2.  Use the SAS Help and Documentation frequently

No one can remember all the syntax and options in SAS.  However, don’t worry, SAS Help and Documentation is our best friend.  I use the SAS Help and Documentation quite often.  Even as an experienced SAS user, I still find many situations in which I need to turn to SAS Help and Documentation.  Every time I use it, I learn something new.

3.  Solve SAS-related questions in SAS communities

As the saying goes, practice makes perfect.  Answering SAS-related questions is a good way to practice.  Questions can come from daily work, from friends around you, or from other SAS users on the web.  From 2013 to 2015, I spent a lot of time in the largest Chinese SAS online community answering SAS-related questions, and I learned many practical skills in a short period.

4.  Make friends with skilled SAS programmers

Learning alone without interacting with others will lead to ignorance.  I have learned a lot from other experienced SAS users and SAS developers.  We share our ideas from time to time, and benefit a lot from the exchange.



The Little SAS Book in China

Recently The Little SAS Book reached a major milestone.  For the first time ever, it was translated into another language.  The language in this case was Chinese, and the translator was Hongqiu Gu, Ph.D. from the China National Clinical Research Center for Neurological Diseases at the National Center for Healthcare Quality Management in Neurological Diseases at Beijing Tiantan Hospital, Capital Medical University.

To mark this achievement, I asked Hongqiu a few questions.

Susan:  First I want to say how honored I am that you translated our book.  It must have been a lot of work.  Receiving a copy of the translation was a highlight of the year for me.  How did you learn SAS?

Hongqiu:  How did I learn SAS?  That is a long story.  I had not heard of SAS before I took an undergraduate statistics course in 2005.  The first time I heard the name “SAS,” I mistook it for SARS (Severe Acute Respiratory Syndrome).  Although the pronunciations of these two words are entirely different for native English speakers, most Chinese people pronounced them as /sa:s/.  At that time, I was not trying to learn SAS well, and I simply wanted to pass the exam.  After the exam, all I had learned about SAS was entirely forgotten.  However, during the preparation of my master’s thesis, I had to do a lot of data cleaning and data analysis work with SAS, and I began to learn SAS enthusiastically.

Susan:  Why did you decide to translate The Little SAS Book?

Hongqiu:  Although I highly recommend the SAS Documentation for learning SAS, most beginners need a concise SAS book to give them a quick overview of what SAS is and what SAS can do.  There is no doubt that The Little SAS Book is the best first SAS book for beginners.  However, it was not easy for a Chinese SAS beginner to get a hardcopy of The Little SAS Book because it was not available in the Chinese market and the price was too high if they shopped overseas.  Another barrier is the language.  Most beginners still want an elementary book in their mother language.  Besides, lots of R books had been introduced and translated into Chinese.  Therefore, I believed there was an urgent need to translate this book into Chinese.  So I tried several times to contact SAS Press to get permission to translate it into Chinese, but received no reply.  Things changed when manager Frank Jiang from SAS China found me after my book, The Romance of SAS Programming, was published by Tsinghua University Press.

Susan:  How long did it take you to translate the book?

Hongqiu:  First, I must state that the Chinese version of The Little SAS Book is a collaborative work.  Manager Frank Jiang from SAS China together with managing editor Yang Liu from Tsinghua University Press did much early-stage work to start this project.  We began the translation in early April 2017 and finished the translation in July 2017.  After that, we took more than three months to complete the two rounds of cross-audit to make sure the translation was correct and typographical errors were minimized.

Members of the translation team include Hongqiu Gu, Adrian Liu, Louanna Kong, Molly Li, Slash Xin, Nick Li, Zhixin Yang, Amy Qian, Wei Wang, and Ke Yang.

Members of the audit team include Silence Zeng, Mary Ma, Wei Wang, Jianping Xue, and Sikan Luan.

Susan:  What was the hardest part of translating it?

Hongqiu:  The book is written in plain English and is easy to understand.  We did not find any part particularly hard to translate.

Susan:  Are there a lot of SAS users in China?

Hongqiu:  There are a lot of SAS users in China.  I’ve no idea what the exact number of SAS users in China is.  With the increasing need for SAS users in medicine, life science, finance and banking industries, SAS users will become more and more prevalent.

Susan:  Thank you for sharing your experiences.  Perhaps someday we can meet in person at SAS Global Forum.