Enterprises have been using analytical machine learning techniques for years to solve business problems related to making predictions on raw data. Today, perception-based techniques driven by deep learning and neural networks are gaining traction around understanding vision and language, both of which have applications within enterprise settings.
When people speak of enterprise artificial intelligence (AI), they often treat it as a generic term or one large entity, without being specific about which technology is being used. This can lead to misunderstandings about what AI can and cannot do, the software and hardware that are required, or even the talent needed to develop the AI solution.
According to Tractica’s definition of AI, statistical ML techniques like random forests, support vector machines, naive Bayes and linear regression, among others, fall under the umbrella term of AI. However, these techniques should be treated separately from the deep learning branch of AI. Deep learning seems to get the lion’s share of artificial intelligence attention due to its association with large internet companies like Google, Facebook and Amazon, as well as the cloud-based AI frameworks that have been in the news. To understand the issues that are specific to machine learning, we decided to question machine learning developers about the challenges they are facing, the software and hardware tools they use and the application markets where they see the most activity.
Our resulting survey – conducted recently by Tractica in collaboration with ITPro Today – has uncovered specific trends related to data science and machine learning development within the enterprise. The survey garnered responses from 50 machine learning developers.
Bottlenecks to Enterprise ML Development
Among the bottlenecks faced by ML developers, data preparation ranks at the top, followed by external code integration and enterprise back-end integration. This is consistent with what we have been hearing from enterprises of all sizes. Cleaning data, labeling data and checking for bias in data are all part of the hard plumbing that is required today to make sure ML-driven AI processes are working. At the same time, bringing in code from external environments continues to be challenging, though vendors claim that this is not the case.
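As a rough illustration of the data preparation plumbing described above, the following minimal Python sketch counts records with missing fields, counts exact duplicates and reports how labels are distributed. The function name `summarize_dataset`, the field names and the sample records are hypothetical and are not drawn from the survey. Label skew is only a crude first signal of possible bias, not a full bias audit.

```python
from collections import Counter


def summarize_dataset(rows, label_key):
    """Report basic data-quality signals for a list of dict records."""
    total = len(rows)

    # Records with at least one missing (None or empty-string) field.
    missing = sum(
        1 for r in rows if any(v is None or v == "" for v in r.values())
    )

    # Exact duplicate records beyond the first occurrence of each.
    seen = Counter(tuple(sorted(r.items())) for r in rows)
    duplicates = sum(count - 1 for count in seen.values())

    # Share of each label among the records that actually carry a label.
    labels = Counter(
        r[label_key] for r in rows if r.get(label_key) not in (None, "")
    )
    labelled = sum(labels.values())
    label_share = (
        {k: v / labelled for k, v in labels.items()} if labelled else {}
    )

    return {
        "total": total,
        "rows_with_missing": missing,
        "duplicate_rows": duplicates,
        "label_share": label_share,
    }


if __name__ == "__main__":
    # Hypothetical sample records, for illustration only.
    sample = [
        {"age": 34, "region": "north", "churned": "no"},
        {"age": None, "region": "south", "churned": "no"},
        {"age": 34, "region": "north", "churned": "no"},
        {"age": 51, "region": "east", "churned": "yes"},
    ]
    print(summarize_dataset(sample, "churned"))
```

In practice, checks like these would typically run before model training, with a heavily skewed label share prompting a closer look at how the data was collected.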
Bottlenecks around back-end integration of ML platforms are also an issue, as enterprises want a smoother flow between models being built and then being deployed in the IT infrastructure. As ML scales in the enterprise, back-end integration will be a bigger issue than it is now.
Interestingly, visualization is a lower-priority issue. From a vendor perspective, however, this is an area of differentiation; most standard solutions have “drag and drop” model development capabilities. The fact that visualization is a low-priority issue suggests that enterprises are less interested in drag-and-drop features and prefer to work at the code level when it comes to ML.
While model portability has been cited as an issue by some vendors and developers, in this survey it is the lowest-ranked barrier, and, as previously mentioned, external code integration ranks higher as a bottleneck. This suggests that ML developers may prefer to build models on open source platforms and then bring them into their enterprise proprietary platforms, rather than use proprietary platforms to build and port models.
Most Popular Software Tools for ML
IBM Watson is the most popular software tool for ML, according to the survey, followed by SAP, Anaconda and SAS. Except for Anaconda, which is an open source distribution, these tools come from traditional enterprise analytics vendors that have made inroads into the AI/ML space. Many of the newer players in the market, such as H2O.ai with its Driverless AI product, have gained prominence. Driverless AI borrows from Google’s AutoML. Some of the other players, like Databricks, RapidMiner and Alteryx, have also been receiving good feedback, as they have been able to innovate more quickly than the larger vendors. Over time, Tractica expects the ML software market to consolidate as hyperscale platforms like Google Cloud, Microsoft Azure and AWS encroach upon the ML developer space and possibly end up acquiring some ML software vendors.
Most Popular Hardware Tools for ML
While machine learning development has largely treated hardware as an afterthought, shifts are occurring in the market. NVIDIA’s launch of data science solutions like Titan RTX, Quadro RTX and RAPIDS is meant to accelerate graphics processing unit (GPU) adoption for ML and take share away from Intel’s dominant CPUs. As the survey shows, Intel Xeon-based CPUs are the preferred option for ML developers. However, GPUs are not far behind, especially when NVIDIA and other GPU platforms, which were represented as separate options in our survey, are considered together. Over time, we expect ML workloads to be more evenly split between GPUs and CPUs. Users will choose a solution based on their performance requirements and budget: GPUs offer a greater speed-up, while CPUs cost less. Another consideration is the growing relevance of ARM-based processor solutions, which can be targeted at ML workloads.
Industries Using ML
Finally, the survey found interesting choices of industries where ML is being deployed. Public sector (government) and business services (HR, customer relationship management [CRM], etc.) seem to be the most popular areas for machine learning development, followed by healthcare and manufacturing. In the “other industries” segment, finance and security are the most popular ones.
The public sector stands out, with a much higher level of activity than expected. Traditionally, the public sector has been slow to deploy ML solutions because of slow progress in digital transformation and pervasive bureaucracy. Unfortunately, the survey does not cover the use cases involved. We will need to dig into the ML activity within the public sector to better understand what this represents, or whether it is an anomaly. Healthcare and manufacturing also stand out within Tractica’s latest Artificial Intelligence Market Forecasts analysis as sectors that are adopting ML-based AI solutions, specifically around patient data analysis and predictive maintenance use cases.