By Guest Contributor Richard Benjamins
Richard Benjamins, author of A Data-Driven Company, explains whether AI should be part of the data organization or kept separate.
Big data entered business organizations at large scale beginning in 2011, when McKinsey Global Institute wrote the seminal report, ‘Big data: The next frontier for innovation, competition, and productivity’ (Manyika et al., 2011). A similar report about analytics that runs on top of big data was published by McKinsey in 2016 (Henke et al., 2016), and the think tank’s third report, in 2017, examined the impact of AI (Bughin et al., 2017). The phased business-readiness of these technologies is today reflected in how organizations set up related areas.
The first new role and corresponding department that found its way into corporations was the Chief Data Officer (CDO), whose main responsibility was a mixture of setting up big data platforms and performing some kind of analytics on the data, for delivering insights and use cases. A few years later, the role of the Chief Analytics Officer (CAO) was introduced, and there were even hybrid Chief Data and Analytics Officer (CDAO) appointments. With the rise of AI, there are now also plenty of Chief AI Officers (CAIOs).
Many organizations that have embarked on their data and AI journey, as well as those that are about to start, wonder how to organize all these different departments and what their relationships and reporting lines should be. Depending on whether they’re starting from scratch or already on the journey, the decision may be different. However, the underlying concepts and principles remain the same.
CONSIDERATIONS FOR ORGANIZING DATA, ANALYTICS AND AI-RELATED AREAS
First of all, data is the basis for analytics and fuels many AI applications. Indeed, the dominant AI paradigm in business today is data-driven AI, also known as machine learning. Machine learning takes as input large amounts of historical data and generates a program or model that’s able to make predictions on what will happen, for instance, or perform a classification based on new data that the model has never seen before. It might be used to determine which customers will switch to the competition, what components will break down in an industrial plant, what movie to recommend, or whether a particular patient is suffering from a certain disease. Machine learning is therefore dependent on the availability of quality data.
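To make the idea of learning from historical data concrete, here is a minimal sketch in Python. It is an illustration only, not a method taken from the book: the customer data, the two features and the nearest-centroid technique are all assumptions chosen to keep the example short. The code learns one average profile per outcome from past customers, then classifies a customer it has never seen before.

```python
from statistics import mean

# Hypothetical historical data (an assumption for illustration):
# (complaints per month, years as customer) -> did the customer churn?
history = [
    ((5.0, 1.0), True),
    ((4.0, 0.5), True),
    ((6.0, 2.0), True),
    ((0.0, 6.0), False),
    ((1.0, 8.0), False),
    ((0.0, 5.0), False),
]


def train(examples):
    """Learn one centroid (average feature vector) per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the new data point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


# The "model" is generated from historical data ...
model = train(history)

# ... and then applied to customers it has never seen before.
likely_to_churn = predict(model, (4.5, 1.0))
likely_to_stay = predict(model, (0.5, 7.0))
```

Even in this toy version, the predictions can only be as good as the historical records: if the complaint counts were wrong or missing, the learned profiles would be wrong too.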
AI is broader than just machine learning. It also includes natural language processing (NLP), knowledge representation and reasoning, planning and robotics. And while several of these areas have enjoyed breakthroughs in the past few years thanks to machine learning, they aren’t only about machine learning.
For the machine learning type of AI, it’s important that the AI department is not too far from the data department (the CDO), especially at the start of the data journey, when frequent interactions are necessary. In this sense, the same principles behind the choices for data and IT discussed in the previous chapter also apply to how to organize data, analytics and AI. That means that these three areas should report into the same senior executive, and preferably have the same direct manager. The reason is simple: if there are issues between the different areas, the escalation process is straightforward and fast, speeding up resolution. Conversely, having the data, analytics and AI in organizations that report to different senior executives is a recipe for problems and even failure. Different senior execs likely have different tangible objectives, and what might be essential for the CMO might be less relevant for the CIO at a particular time. That might put differing emphasis on the priorities of one team or the other, and they may be out of sync. Solving such a problem requires escalating it all the way up to the board, which takes time and effort, generates frustration and delays progress on the digital transformation.
Of course, if the focus of an AI organization is not on machine learning, but on the other, non-data-intensive areas, it doesn’t hurt to have it in a separate organization, as there is less dependency. As soon as data starts becoming the main driver, though, they should be put together.
Another reason these three areas (data, analytics, AI) should be close together is that these technologies are still relatively new in the business world, and organizations lack experience in how they can work together. As such, mistakes may be made in each distinct area, requiring backtracking on earlier decisions; this backtracking has an impact on what was communicated earlier to the other areas.
THE RELATION WITH DATA MATURITY
As we will see, for many decisions, the data maturity of an organization is an important factor in making the best decision. In other words, depending on how data-mature an organization is, how a particular decision is made is less or more relevant. For relatively immature organizations — those that have recently started this journey — it’s crucial to have the AI and data departments close together, as we have argued here. However, for more mature organizations with significant experience, this is less of an issue. Data-mature organizations have data management and governance in place, ensuring a certain quality and frequency of the data. If the data is good, AI and analytics areas can work on their own. In more data-immature organizations, data availability and quality are not well-organized processes, and much faster, more agile interactions are needed to obtain results.
DO YOU NEED DATA SCIENTISTS OR DATA ENGINEERS?
The rise of big data has been accompanied by the ascent of the data scientist, which has been called the sexiest job of the 21st century (Davenport and Patil, 2012). This motivated many data professionals to profile themselves as data scientists, and has led to the hiring of an abundance of data scientists by organizations looking to strengthen their data team. In reality, however, the majority of effort in a data project is dedicated to accessing and understanding the data and verifying its quality, with a much smaller part dedicated to the analytics or machine learning.
This is a logical consequence of the overall low data maturity of organizations in the past few years. Only when data is fully managed as an asset can full attention be devoted to value creation with analytics and machine learning. But today, such organizations (mostly big tech companies) represent only a small percentage. As a result, many professionals hired as data scientists end up accessing and manipulating data for the sake of data, rather than creating value, and this has led to much frustration.
The lesson learned here is that, in practice, organizations need to hire the right balance of data engineers and data scientists. At the beginning of the data journey, there should be more data engineers than data scientists, and with increasing data maturity the balance can shift in favour of the data scientists … but not earlier!
This phenomenon also has important implications for groups within the different data departments. For organizations starting their data and AI journey, it would be a mistake to separate the data and analytics/AI areas. The data department would do its best to collect, store, organize and make data available, but given the early stage, this process will likely take a long time and generate doubtful data quality. This leads to frustration in the analytics/AI team: they’re waiting for data, and once it is there, it has a lot of problems and they need to refer back to the data department. The further away the departments are in the organizational structure, the more frustration there will be, which will negatively impact the collaboration. As discussed in the previous chapter, the optimal positioning is to have the departments report to the same line manager. If this isn’t possible, one of the alternate solutions we suggested may make sense, such as co-location or dotted line reporting.
How to organize the respective data departments (data, analytics and AI) depends on where the organization is on its data journey. Especially in early phases, it is important to keep them as close together as possible. Fully data-mature organizations have more flexibility in where to put the different teams, as data will be part of the business-as-usual (BAU) processes. It’s always better to view the association between the different departments as a ‘partner’ relationship and not a ‘client-provider’ one.
ABOUT THE AUTHOR
RICHARD BENJAMINS is Chief AI & Data Strategist at Telefonica. He was named one of the 100 most influential people in data-driven business (DataIQ 100, 2018). He is also co-founder and Vice President of the Spanish Observatory for Ethical and Social Impacts of AI (OdiseIA). He was Group Chief Data Officer at AXA, and before that spent a decade in big data and analytics executive positions at Telefonica. He is an expert to the European Parliament’s AI Observatory (EPAIO), a frequent speaker at AI events, and strategic advisor to several start-ups. He was also a member of the European Commission’s B2G data-sharing Expert Group and founder of Telefonica’s Big Data for Social Good department. He holds a PhD in Cognitive Science, has published over 100 scientific articles, and is author of the (Spanish) book, The Myth of the Algorithm: Tales and Truths of Artificial Intelligence.
Are you planning to start working with big data, analytics or AI, but don’t know where to start or what to expect? Have you started your data journey and are wondering how to get to the next level? Want to know how to fund your data journey, how to organize your data team, how to measure the results, how to scale? Don’t worry, you are not alone. Many organizations are struggling with the same questions.
This book discusses 21 key decisions that any organization faces on its journey towards becoming a data-driven and AI company. It is surprising how similar the challenges are across different sectors. This is a book for business leaders who must learn to adapt to the world of data and AI and reap its benefits. It is about how to progress on the digital transformation journey of which data is a key ingredient.