Domino Data Lab, NVIDIA, NetApp Team Up to Help Manage AI/ML Workloads

Domino Data Lab on Sept. 20 announced integrations with NVIDIA GPUs and NetApp data management and storage that make it easier for enterprises to run artificial intelligence (AI) and machine learning (ML) workloads in either data centers or AWS without refactoring them.

Domino's Nexus offering lets customers retarget a workload from a cloud resource to an on-premises resource or other cloud resource with zero code changes.

With data sizes increasing and training workloads requiring more compute, customers are looking for more flexibility as to where they run their AI/ML workloads, according to Thomas Robinson, vice president of strategic partnerships and corporate development at Domino Data Lab.

"That means customers can push workloads to the compute of their choice to localize workloads, distribute to the edge, or save costs by running in an on-premises data center — all without requiring data scientists to refactor code and without DevOps work to manage and push workloads to multiple compute planes," Robinson said.

How Domino, NVIDIA Reference Architecture Benefits Customers

In support of its hybrid MLOps vision, Domino worked with NVIDIA to create an integrated MLOps and on-premises GPU reference architecture.

The reference architecture serves as a blueprint for organizations needing MLOps solutions on accelerated hardware with high performance storage.

This saves customers from having to develop their own architectures as they work to create a Center of Excellence (CoE) for data science and to realize the proven benefits that come with one, Robinson said.

Benefits include:

Greater knowledge sharing across teams
Increased efficiency in data science initiatives
Better alignment of data science and business strategy
Improved talent acquisition

"This reference architecture also allows vendors to provide out-of-the-box support for those deployments," he added.

The architecture has been validated by both technology providers, which are enabling joint ecosystem solution partners such as Mark III Systems to build AI platforms, systems, and software.

"Enterprise customers value integrated solution stacks that have been certified by partners to deliver peak performance and guaranteed compatibility," Robinson said.

In addition, NetApp, a provider of AI data management solutions, validated Domino Nexus as a solution supporting the Domino Enterprise MLOps Platform on Amazon FSx for NetApp ONTAP.

Supporting evolving hybrid workload requirements, the AWS Managed Services (AMS) solution will simplify deployment and management of large-scale applications in hybrid real-time environments.

"We've all heard the analogy that big data is the new oil in our AI economy," Robinson said. "Well, data science models are the engine of that AI economy."

He noted that NetApp is in use at many of Domino's existing enterprise customers to provide the large storage volumes and high throughput that demanding AI and deep-learning workloads require.

"Since both storage and the MLOps layer are required to develop these models, having our products be certified to work together helps customers with their entire stack for ML," he explained.

Robinson pointed out that NetApp leads the industry with its hybrid- and multicloud-ready products, which provide for data access and data movement across and between public and private clouds.

'Playtime Is Over for Data Science'

Having data where it is needed is important, as data scientists use Domino to run their workloads in their infrastructure of choice, he said.

"Five years ago, many of our customers were concerned with ML program support, access to data, and getting models to production," he said, explaining that at the time only 20% of companies were investing in AI.

"But now, playtime is over for data science," he said. "Many of those customers have proven out the impact ML can have on their business. They're now seeing scaling challenges rather than 'getting started' challenges."

New challenges have emerged for customers leading with ML as they have moved from building a few models that prove early value to running larger teams with more compute and storage and a need for model governance, Robinson said.

"Top of mind is developing a hybrid and multi-cloud strategy to help manage compute costs, deal with data gravity and data sovereignty, and avoid vendor lock-in with the hyperscalers," he said.

Second is having a comprehensive system of record where teams and divisions can collaborate on work and establish best practices.

Third, appropriate enterprise governance and security controls are needed to ensure models making decisions for companies are well-monitored and well-controlled.

"Given these trends, we think we'll see evolution in the MLOps market to address hybrid, system-of-record, and enterprise model governance," Robinson said.

About the author

Nathan Eddy is a freelance writer for ITPro Today. He has written for Popular Mechanics, Sales & Marketing Management Magazine, FierceMarkets, and CRN, among others. In 2012 he made his first documentary film, The Absent Column. He currently lives in Berlin.
