Like it or not, ALL businesses are in the business of data science. The sheer volume of digitally stored information, its exponential growth, and its potential not just to transform businesses but to reshape the economic landscape border on the unfathomable. Here are a few numbers to try to wrap your mind around:
IBM estimates that 90% of the data in the world today has been created in the past two years
In 2000, only 25% of stored information was digital – by 2013, stored information reached 1,200 exabytes, with only 2% being non-digital
Stored information grows 4x faster than the world economy
No one would deny that collecting, managing, interpreting, and putting data to use is now mission-critical for all organizations. And to do it, we need data scientists. But what exactly is data science? What do data scientists do? How do we effectively attract and retain the right type of specialists? How will we know when we have the right mix of technical skill and know-how to truly make use of the data that’s available?
More to the point, how do you ensure that you’re fostering the right amount of data fluency within your team, even if that just means knowing which data scientist to talk to, and when, for any given data-related query?
Our clients come to us with these questions all the time. Check out our recent post about the Data Science Drought and how it may be affecting your organization. But first, keep reading below, where we’ve done some of the legwork for you and summarized a few of the most widely used models for defining data science.
Defining by Specialization
The simplest and most common approach is to distinguish by specialization with Type A (“Analysis”) and Type B (“Building”). Type A data scientists are primarily concerned with making sense of data – they interpret it and make it useful – and they work with data in a fairly static state. The result of their work usually comes in the form of a presentation, a report, or an article.
Type B data scientists, on the other hand, are the builders of the data science world. They probably have a statistics background and are usually very strong coders, maybe even software designers. They design and build machine learning models that allow data to perform as needed.
The Type A/Type B paradigm is a helpful way to start parsing through the breadth of work that falls under the banner of data science, but these specialization distinctions do not necessarily translate into the day-to-day data life of practitioners. While some of the shinier, “fetishized” applications of data science like machine learning and deep learning may be reserved for certain specialists, there are core functions that are shared among even the most diverse set of data scientists:
“…Working data scientists make their daily bread and butter through data collection and data cleaning; building dashboards and reports; data visualization; statistical inference; communicating results to key stakeholders; and convincing decision makers of their results.”
Defining by Functionality
There are even finer delineations to be made beyond the binary Type A/Type B distinction. Data science consultant and expert Jonathan Nolis offers three functional components that help to explain the field: business intelligence, decision science, and machine learning.
The first of these, according to Nolis, is really about “taking data that the company has and getting it in front of the right people.” Think data extraction, aggregation, and visualization. The next level is decision science, which uses that organized data to help a company make a decision or set of decisions (predictive analytics, decision support, customized targeting). The final step, then, is asking machine learning to “take data science models and put them into continuous production.” Combining this functional distinction with the Type A/Type B specialization paradigm starts to paint a picture of the data scientist within the organization.
Defining by How Data Scientists Fit into the Organization
McKinsey & Co. drills down further into functional and specialization distinctions in a three-part diagram. It offers a third way of defining data scientists by explaining how they fit into the larger team. It also emphasizes how skills interrelate to allow for meaningful use of data science throughout the organization – weaving business, analytical, and data skills together.