Defining Data Science: Demystifying today's fastest-growing skill set
Like it or not, ALL businesses are in the business of data science. The sheer volume of digitally stored information, its exponential growth, and its potential not just to transform businesses but to reshape the economic landscape border on the unfathomable. Here are a few numbers to try to wrap your mind around:
- IBM estimates that 90% of the data in the world today has been created in the past two years
- In 2000, only 25% of stored information was digital – by 2013, stored information reached 1,200 exabytes, with only 2% being non-digital
- Stored information grows 4x faster than the world economy
No one would deny that collecting, managing, interpreting, and putting data to use is now mission-critical for all organizations. And to do it, we need data scientists. But what exactly is data science? What do data scientists do? How do we effectively attract and retain the right type of specialists? How will we know when we have the right mix of technical skill and know-how to truly make use of the data that's available?
More to the point, how do you ensure that you're fostering the right amount of data fluency within your team, even if that just means knowing which data scientist to talk to, and when, for any given data-related query?
Our clients come to us with these questions all the time. Check out our recent post about the Data Science Drought and how it may be affecting your organization. But first, keep reading below, where we've done some of the legwork for you and summarized a few of the most widely used models for defining data science.
Defining by Specialization
The simplest and most common approach is to distinguish by specialization with Type A (“Analysis”) and Type B (“Building”). Type A data scientists are primarily concerned with making sense of data – they interpret it and make it useful - and they work with data in a fairly static state. The result of their work usually comes in the form of a presentation, a report, or an article.
Type B data scientists, on the other hand, are the builders of the data science world. They probably have a statistics background and are usually very strong coders, maybe even software designers. They design and build machine learning models that allow data to perform as needed.
The Type A/Type B paradigm is a helpful way to start parsing through the breadth of work that falls under the banner of data science, but these specialization distinctions do not necessarily translate into the day-to-day data life of practitioners. While some of the shinier, “fetishized” applications of data science like machine learning and deep learning may be reserved for certain specialists, there are core functions that are shared amongst even the most diverse set of data scientists:
“...Working data scientists make their daily bread and butter through data collection and data cleaning; building dashboards and reports; data visualization; statistical inference; communicating results to key stakeholders; and convincing decision makers of their results.”
Defining by Functionality
There are even finer delineations to be made beyond the binary Type A/Type B distinction. Data science consultant and expert Jonathan Nolis offers three functional components that help to explain the field: business intelligence, decision science, and machine learning.
The first of these, according to Nolis, is really about “taking data that the company has and getting it in front of the right people.” Think data extraction, aggregation, and visualization. The next level is decision science, which uses that organized data to help a company make a decision or set of decisions (predictive analytics, decision support, customized targeting). The final step, then, is machine learning, where the goal is to “take data science models and put them into continuous production.” Combining this functional distinction with the Type A/Type B specialization paradigm starts to paint a picture of the data scientist within the organization.
Defining by how data scientists fit into the organization
McKinsey & Co. drills down further into functional and specialization distinctions across a three-part diagram. It offers a third way of defining data scientists by explaining how they fit into the larger team. It also emphasizes how skills interrelate to allow for meaningful use of data science throughout the organization – weaving business, analytical, and data skills together.
Understanding what data scientists do and how they fit into your organization is critical in a data-centric world and, more importantly, is essential in harnessing the potential 'big data' has to offer – at least for now. As Hugo Bowne-Anderson of DataCamp explains, the necessity of data scientists may actually change the way in which we attract and depend upon them:
The data science revolution across industries and society at large has just begun. Whether the title of data scientist will remain the “sexiest job of the 21st century,” will become more specialized, or will become a set of skills that most working professionals are simply required to have is unclear.
If your organization is looking for help in using data science to drive performance, please contact us or follow the link to some of our case studies to see how we help clients.
For more on data science, consider some of the following articles & posts:
- What data scientists really do according to 35 data scientists, from Harvard Business Review
- 5 reasons to invest in data science, from the Knowlton Group