While many lament the shortage of data science talent, it should be more concerning that the vast majority of U.S. workers are increasingly unprepared to navigate today’s tsunami of data. In 2012, the Program for the International Assessment of Adult Competencies (PIAAC) examined the literacy, numeracy, and problem-solving skills of adults aged 16-65 across 23 countries. While countries like Japan and Finland led the pack on numeracy skills, the United States ranked a disappointing 21st out of 23 participating countries. Rather than waiting for technology or our educational system to solve this problem, each company will have to look at what can be done to close the data literacy gap for its own managers and employees.
At Utah’s Loveland Living Planet Aquarium, Chris O’Meara, SVP of Operations, discovered people knew what business questions to ask, but they were unsure how data could provide the answers they needed. To address this problem, he introduced an easier-to-use, self-service BI solution (Domo) as well as a data skills training program that featured casual but informative lunch-and-learn sessions.
While most people had been exposed to statistics in varying degrees in college, it had been anywhere from a couple of years to decades since they graduated. Through the lunch-and-learn sessions, O’Meara was able to refresh their statistical knowledge, standardize on new data terminology, and instill more data-driven behaviors. For example, a few years ago the aquarium’s marketing department relied on almost no data to manage its marketing efforts. It has now become one of the non-profit organization’s most data-driven departments, relying on robust data strategies for measuring and optimizing its marketing efforts.
4 Keys to Minimum Viable Data Literacy
Data literacy can encompass a wide spectrum of skills so it’s important to establish a functional baseline for this type of skill. Just like people don’t need an advanced English degree to be literate, your employees don’t need advanced statistical knowledge and programming skills in Python or R to be data literate. Reading and writing skill levels are often defined by what people can or can’t accomplish in their everyday life—we must do the same for data literacy. For example, someone may be considered illiterate if they struggle to read a food label or complete a job application.
When it comes to basic data literacy, someone should be able to adequately analyze and interpret a standard data table or chart. They should be comfortable with any of the common charts such as the line, bar, area, pie and scatterplot graphs that are found in most business applications, dashboards and news reports today. Ideally, it would be great if everyone knew how to produce their own charts and perform their own analyses, but in my mind, that’s not the minimum standard we’re looking for. At a minimum, we need people to be able to consume and interpret data effectively. To do so, they will need skills in the following four areas:
1. Data knowledge.
Each company, discipline (marketing, finance) and industry (retail, healthcare) will have its own set of unique data terms and datasets. The more your employees understand your company’s data from a business perspective, the better positioned they’ll be to apply it. For example, if you were an online marketer, you should be familiar with basic metrics such as page views, sessions, unique visitors and bounce rate. In addition to knowing the data, you will need numeracy, or some ability to work with numbers. It may surprise some people that much of what data scientists focus on is just arithmetic, and the vast majority of analytics (80%) centers around sums and averages. A basic understanding of statistical concepts and terms will also be helpful, such as knowing what correlation is and the difference between quantitative and qualitative data.
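To show how little arithmetic these basic web metrics require, here is a minimal Python sketch. The session records are made up for illustration, and it assumes bounce rate means the share of sessions with exactly one page view; analytics tools may define it differently.

```python
from statistics import mean

# Hypothetical session log: each entry is (visitor_id, pages_viewed_in_session).
sessions = [
    ("v1", 1),
    ("v2", 4),
    ("v1", 2),
    ("v3", 1),
    ("v4", 3),
]

def summarize(sessions):
    """Return basic web metrics from a list of (visitor_id, pages_viewed) tuples."""
    page_views = sum(pages for _, pages in sessions)
    unique_visitors = len({visitor for visitor, _ in sessions})
    # Assumption: a "bounce" is a session with exactly one page view.
    bounces = sum(1 for _, pages in sessions if pages == 1)
    return {
        "sessions": len(sessions),
        "page_views": page_views,
        "unique_visitors": unique_visitors,
        "avg_pages_per_session": mean(pages for _, pages in sessions),
        "bounce_rate": bounces / len(sessions),
    }

metrics = summarize(sessions)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Everything here is a count, a sum, an average or a ratio, which is the level of numeracy this section has in mind.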
2. Data assimilation.
When you’re presented with new data for interpretation, you need to be adept at orienting yourself to unfamiliar data before consuming it. At this stage, you’re not analyzing or passing any judgment on the data; you’re just assimilating what’s in front of you. You should be accustomed to inspecting the following elements of tables or charts and seeking further clarification if any items are ambiguous or missing:
- Title & labels: Is the table or chart titled and labeled descriptively and clearly?
- Timeframe: What is the date range(s) for the data being presented?
- Data source: Do you know where the data came from?
- Unit(s) of measurement: Do you clearly understand what the metrics in the tables or charts represent?
- Scales: Are the scales of the graph axes clear and effective?
- Calculated metric(s): For ratios, rates, and other formulas do you have a clear understanding of how they are calculated?
- Dimensions: Are the dimensions or categories used to organize or segment the data clear and meaningful?
- Filters: Is it clear whether any specific filters have been applied to the data set (e.g., all customers vs new customers)?
- Sorting: If different values have been sorted or ranked, is it clear what criteria were used?
- Targets: If goals or targets have been added to the charts, is it clear what they represent?
3. Data interpretation.
After you’ve familiarized yourself with the data, you should then be able to analyze and interpret it. Depending on the type of data and its presentation format, it may be examined in many different ways. In general, you should be accustomed to making the following types of observations in charts:
- Trends: What direction is a trended metric heading (up, down, flat)?
- Patterns: What repeatable patterns or cycles are you seeing in the data (e.g., seasonality)?
- Gaps: Are there any obvious gaps or omissions in the dataset?
- Clusters: Are some values bunched closely together in certain areas?
- Skewness: Are values noticeably concentrated or skewed more to one side than another?
- Outliers: Is there a data point that is detached or far removed from the rest of the data points?
- Focus: Has something in the chart or table been emphasized to draw attention to it? Is it obvious why part of the data was highlighted?
- Noise: Is there any extraneous data included that detracts from the main message of the chart?
- Logical: Does the data help to answer a specific business question? Does the data support a proposed conclusion or argument?
4. Data skepticism & curiosity.
In addition to analyzing and interpreting the data, you must also think critically about it. Too often data is accepted at face value. However, it is important to be able to step back and weigh other less obvious factors that may be influencing the results and their interpretation:
- Collection method: Could the method or way in which the data was collected influence the results?
- Credibility: How credible or reliable is the source of the data?
- Bias: Is there potential bias from either the data producer or you as the consumer?
- Truthful: Is the data being manipulated in a way—intentionally or inadvertently—that misrepresents its true meaning?
- Assumptions: Are there any implied assumptions that could be affecting how the numbers are interpreted?
- Context: Is there additional context or background information that is missing and needed to properly understand the data?
- Comparisons: If supplemental data is included for comparison purposes (e.g., period-over-period data), does it offer a fair and relevant comparison? Alternatively, is an obvious comparison missing?
- Causation: Are you potentially confusing correlation with causation, which represents a direct pattern of cause and effect?
- Significance: If the data is statistically significant, is it also practically significant?
- Outliers: Is an outlier important or is it unnecessarily skewing the overall results?
- Quality: Are you able to distinguish between data that is unusable and data that is still directionally helpful?
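The outlier question in the list above can be made concrete with a small Python sketch. The order values are invented for illustration; the point is only that a single extreme value can pull an average far from what is typical, while the median barely moves.

```python
from statistics import mean, median

# Hypothetical order values in dollars; the last one is an obvious outlier.
order_values = [40, 42, 45, 47, 50, 52, 900]

# Summary statistics with the outlier included.
mean_with = mean(order_values)
median_with = median(order_values)

# The same statistics after setting the largest value aside.
without_outlier = sorted(order_values)[:-1]
mean_without = mean(without_outlier)
median_without = median(without_outlier)

print(f"mean with outlier:      {mean_with}")
print(f"median with outlier:    {median_with}")
print(f"mean without outlier:   {mean_without}")
print(f"median without outlier: {median_without}")
```

With these numbers the mean drops from 168 to 46 once the outlier is set aside, while the median only shifts from 47 to 46. Whether the 900 is an error to exclude or the most important order of the week is a business judgment, not a statistical one.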
Scenario: Data Literacy in Action
To illustrate these data literacy principles in action, I’ll use the following scenario of Anne, a digital marketer who has just received a chart (see below) from an analyst on her team. Anne’s first task is to understand what she’s looking at (data assimilation). She notices that the chart has two metrics: page views and bounce rate. They are trended on separate y-axes over a recent 28-day period. Anne is familiar with both of the web metrics displayed in the chart (data knowledge!). However, Anne is unsure whether the data is for their entire website (unlikely due to the low volume of page views) or a specific webpage. She reaches out to the analyst for further clarification.
After confirming the data is for the company’s main marketing landing page, Anne is ready to interpret the data. Her first observation is that the page views appear to be trending up over time. Second, she notices that the page view volume is cyclical based on the days of the week with less volume on weekends. Third, she observes that the bounce rate is usually fairly steady at around 45%. Fourth, she notices a large increase in the bounce rate on Feb 12th, which is concerning as it means visitors are immediately abandoning the site after viewing the landing page.
As Anne ponders what is happening, she suspects the landing page is probably not optimized, and it may need to be redesigned. However, she also knows her team made no significant changes to the page and that something else must have caused the spike in the page’s bounce rate (data skepticism & curiosity). With a little investigation and added context, she discovers her team is testing a new product configurator that is hosted by a third-party partner on its own web domain. Because the partner’s pages were not properly tagged for their web analytics tool, it looks as though visitors are abandoning the website. Because the additional page views weren’t being captured, it created a false spike in the bounce rate even though many visitors were actually trying the new product configurator. Using her data literacy skills, Anne is able to identify a problem and take corrective action to address it (fix her partner’s web analytics tags).
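The mechanism behind Anne’s false spike can be sketched in a few lines of Python. The sessions, page names and the `bounce_rate` helper are hypothetical, and the sketch assumes a bounce is a session in which the analytics tool records exactly one page view.

```python
def bounce_rate(sessions, tagged_pages):
    """Share of sessions in which exactly one page view was recorded.

    Only pages in `tagged_pages` are visible to the (hypothetical)
    analytics tool; visits to untagged pages are silently dropped.
    """
    bounces = 0
    for pages in sessions:
        recorded = [page for page in pages if page in tagged_pages]
        if len(recorded) == 1:
            bounces += 1
    return bounces / len(sessions)

# Hypothetical visits: what visitors actually did after the landing page.
sessions = [
    ["landing"],                  # a genuine bounce
    ["landing", "configurator"],  # tried the partner-hosted configurator
    ["landing", "configurator"],
    ["landing", "pricing"],
]

# Partner pages untagged: configurator visits look like bounces.
reported = bounce_rate(sessions, {"landing", "pricing"})

# Partner pages tagged correctly: only the genuine bounce remains.
corrected = bounce_rate(sessions, {"landing", "pricing", "configurator"})

print(f"reported bounce rate:  {reported:.0%}")
print(f"corrected bounce rate: {corrected:.0%}")
```

In this toy data, visitor behavior is identical in both cases; the reported rate is 75% and the corrected rate is 25% purely because of which pages are tagged.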
Today, Data Literacy Skills Are Needed More Than Ever
Just as literacy lifts civilizations out of ignorance and poverty, greater data literacy can similarly enlighten and enrich your organization. Unfortunately, too many business users still have the wrong impression that data is primarily someone else’s job. In today’s data-rich business environments, it’s quickly becoming everyone’s responsibility to understand, use and communicate data effectively—not just the data experts. Bridging the data literacy divide at your company will accelerate your employees’ ability to embrace the growing amounts of data being placed in front of them.
Interestingly, the repercussions of lingering data illiteracy are graver than just how it can impact businesses. Amidst the cynical cries of fake news and the proposition of alternative facts, it’s critical that society—not just companies—foster greater data literacy so people can better distinguish between fact and fiction. We don’t just need citizen data scientists—we need more data citizens. The best protection we have against deceptive data is a public that is immunized against its negative influence through greater data literacy.
This article was originally published on Forbes.com on March 9, 2017.