Enterprise companies find MLOps critical for reliability and performance

2:28 PM PDT • May 6, 2020

**Image Credits:** Peter Cade / Getty Images

Rish Joshi

Contributor

Rish is an entrepreneur and investor. Previously, he was a VC at Gradient Ventures (Google’s AI fund), co-founded a fintech startup building an analytics platform for SEC filings and worked on deep-learning research as a graduate student in computer science at MIT.

The rise of MLOps

As enterprises adopt auto-ML workflows, one of the issues they’re commonly seeing is that many of the models built by data scientists never make it into production. There are a number of issues that can stop deployment, including models that underperform in pre-production environments, incompatibilities between production environments and the model-training environment, or inconsistencies with production infrastructure.

This is where MLOps comes in.

The world of MLOps has been shaped a fair bit by the evolution of DevOps, which has rocketed to popularity over the past few years. The role of DevOps is to efficiently integrate and deploy source code, and it’s typically managed by a DevOps engineer who works as a bridge between IT and developers.

MLOps is similar, but focuses on the ML model and data sets as opposed to code. These days, data engineers run MLOps, but it’s likely the specialized role of MLOps engineer will come about soon.

There are four components to the modern MLOps workflow:

Continuous Integration: In DevOps, this refers to synchronizing new code with the existing code base, whereas in MLOps, this process refers to synchronizing the data and models. This involves checks such as confirming that a model mathematically converges, making sure it does not result in data-type errors, and running tests on sub-methods within the model to ensure they’re working as expected.

Continuous Deployment: In DevOps, this refers to moving code into production, and it’s the same with MLOps, except with models instead of code. This involves checks such as ensuring that the libraries required for a model to run exist in the production environment, testing the model with sample input data to verify it’s producing the expected outputs and testing performance metrics in pre-production.

Monitoring: Once a model has been deployed, it needs to be actively evaluated to ensure that it’s working as desired, both in terms of accuracy and runtime speed. MLOps solutions look at metrics such as data drift (assessing whether a model is losing its accuracy as input data changes) and performance around run time and latency.

Governance: An enterprise is likely to have many algorithms in production at once, and when a model stops working as expected, a data scientist has to look into the cause. An end-to-end system that tracks, for each model, which data it was trained on, who built it and when, and other such factors makes that investigation easier. Maintaining this data is also helpful for compliance purposes.
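To make a few of these checks concrete, here is a minimal Python sketch of what a pipeline might run at the integration, deployment and monitoring stages. Every function name, threshold and bin count below is an assumption made for illustration, not the behavior of any particular MLOps product, and the population stability index is just one common way to quantify a shift in input data.

```python
import importlib.util
import math

def has_converged(losses, window=3, tol=1e-3):
    """Integration check: the last `window` loss changes are all below `tol`."""
    if len(losses) <= window:
        return False
    recent = losses[-(window + 1):]
    return all(abs(a - b) < tol for a, b in zip(recent, recent[1:]))

def type_errors(row, schema):
    """Integration check: fields whose values do not match the expected type."""
    return [key for key, kind in schema.items() if not isinstance(row.get(key), kind)]

def missing_libraries(required):
    """Deployment check: top-level modules that cannot be found in this environment."""
    return [name for name in required if importlib.util.find_spec(name) is None]

def population_stability_index(reference, live, bins=4, eps=1e-6):
    """Monitoring check: 0.0 for identical distributions, larger as inputs shift."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

In practice each function would feed a pass or fail gate in the pipeline. Where to set the thresholds is a judgment call that depends on the model and the business risk.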

How companies like DataRobot have driven the need for MLOps

DataRobot’s enterprise AI platform helps customers streamline the full ML life cycle across data preparation, model building and model deployment. H2O.ai offers a similar solution called H2O Driverless AI, which provides end-to-end automated AI capabilities. One of the key differences between the two platforms lies in their target users: H2O.ai tends to cater to more technical users, whereas DataRobot serves business and IT folks along with data scientists.

Beyond end-to-end AI workflow platforms, the auto-ML market has been flooded with many companies providing tools for various parts of the enterprise AI stack. Cloud providers, including Amazon, Microsoft and Google, have innovated by developing auto-ML capabilities for cloud customers. Specialized platforms such as Domino Data Lab offer solutions for advanced users, and many tools such as TensorFlow and pre-built classifiers are readily accessible to developers for model building.

In the case of end-to-end AI workflow platforms such as DataRobot, some of the key benefits for enterprises have included the automation of various parts of the workflow, particularly around feature engineering and model generation, and the efficiency that comes with consolidating the entire workflow onto a single platform.

That’s perhaps a lot of buzzwords, so let’s consider the case of a security team at a credit card company assessing fraud risk for users. Let’s assume the input data consists of rows pertaining to end customers, with each row containing metadata including the day the customer’s card was activated, the day it expired and the number of fraudulent events identified in that time frame.

In order to effectively model the fraud risk, the security team would need to take the difference between the card activation and card expiration days and tie that to the number of fraudulent events identified. This is called feature engineering, which involves combining the input features in such a way that helps an ML model learn the underlying patterns as best as possible.
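As a rough illustration of that single engineered feature, here is a minimal Python sketch. The field names (`activated`, `expired`, `fraud_events`) and the sample rows are hypothetical, and dividing events by active days is only one plausible way to tie the two quantities together.

```python
from datetime import date

# Hypothetical customer rows; field names and values are illustrative only.
rows = [
    {"activated": date(2019, 1, 1), "expired": date(2019, 12, 31), "fraud_events": 2},
    {"activated": date(2019, 6, 1), "expired": date(2019, 6, 30), "fraud_events": 1},
]

def add_fraud_rate(row):
    """Return a copy of the row with two engineered features added."""
    days_active = (row["expired"] - row["activated"]).days
    # Guard against zero-length windows to avoid dividing by zero.
    rate = row["fraud_events"] / days_active if days_active > 0 else 0.0
    return {**row, "days_active": days_active, "fraud_per_day": rate}

features = [add_fraud_rate(r) for r in rows]
```

The second card has fewer fraudulent events in total but a higher rate per active day, which is the kind of pattern the raw columns hide.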

This may look simple, but problems often have a large number of input data columns that can greatly increase the number of combinations one has to try — and the relations between different data points may not be easy to discern, either.

Automated feature engineering makes this process simpler by auto-testing many different combinations of input features, quickly and at scale, to help the user pick the best one.
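The idea can be sketched in a few lines of Python. This toy version is an assumption about how such a search might work, not a description of any vendor's implementation: it builds the difference and ratio of every pair of columns and ranks each candidate by its absolute Pearson correlation with the target. The column names and numbers are made up, and the target was deliberately constructed to equal fraud events per active day.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_pairwise_features(columns, target):
    """Score the difference and ratio of every column pair against the target."""
    scored = []
    for a, b in combinations(columns, 2):
        candidates = {
            f"{a} - {b}": [x - y for x, y in zip(columns[a], columns[b])],
            f"{a} / {b}": [x / y if y else 0.0 for x, y in zip(columns[a], columns[b])],
        }
        for name, values in candidates.items():
            scored.append((abs(pearson(values, target)), name))
    return sorted(scored, reverse=True)  # strongest candidate first

# Hypothetical toy data in which risk tracks events per active day.
columns = {
    "fraud_events": [2.0, 1.0, 6.0, 3.0],
    "days_active": [100.0, 10.0, 600.0, 30.0],
    "credit_limit": [5.0, 9.0, 4.0, 7.0],
}
target = [0.02, 0.10, 0.01, 0.10]
ranking = rank_pairwise_features(columns, target)
best_score, best_name = ranking[0]
```

With three columns this produces only six candidates. The number grows quickly as columns and operations are added, which is why automating the search matters.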

Once a user has finalized the set of features, DataRobot’s automated model generation capability lets them run many different types of models on the data, and see which ones perform best. This saves users the time of building models from scratch, and also gives them the benefit of seeing how different models perform.
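A stripped-down version of that idea looks something like the Python sketch below. The three candidate models and the data are hypothetical stand-ins chosen to keep the example self-contained. A real platform would try far richer model families and use more careful validation than a single holdout split.

```python
def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_linear(xs, ys):
    """Ordinary least squares on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """One-nearest-neighbor: return the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def leaderboard(candidates, train, holdout):
    """Fit every candidate on train, then sort by holdout error (best first)."""
    results = []
    for name, fit in candidates.items():
        model = fit(*train)
        results.append((mse(model, *holdout), name))
    return sorted(results)

# Hypothetical data: a roughly linear trend, with the holdout beyond the training range.
train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
holdout = ([5.0, 6.0], [10.1, 11.9])
candidates = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
board = leaderboard(candidates, train, holdout)
```

Because the leaderboard is just a function of the data, it can be rerun with the same candidates whenever the data changes.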

Moreover, in situations where the data is rapidly changing, it gives users the ability to rerun the full set of models and re-determine which ones work best based on new data. In the case of the security team at the credit card company, consider a model that was developed in a particular region. If the security team is tasked with understanding fraud risk in another region and further receives some new data columns specific to that region, it’s possible the initial models won’t perform as well as new models that take all the available data into account.

The consolidation of the entire workflow into a single platform also provides several benefits for users. On the model building side, the coupling of data to a variety of models can make experimenting easier and help debug any issues that come up with the models much quicker. On the model deployment side, it helps with tracking source data and model attributes for models in deployment, both for any changes that become necessary and for governance.
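On the tracking side, the record kept for each deployed model can be quite small. The following Python sketch shows one hypothetical shape for it. The `ModelRecord` fields, the in-memory `registry` dictionary and the sample values are all assumptions for illustration, and a production system would persist this somewhere durable.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ModelRecord:
    """Minimal lineage entry: which model, who built it, when, and on what data."""
    model_name: str
    built_by: str
    built_at: str
    training_data_sha256: str

def fingerprint(data: bytes) -> str:
    """Content hash that identifies the exact training data used."""
    return hashlib.sha256(data).hexdigest()

def register(registry, model_name, built_by, training_data: bytes):
    """Create a record for the model and store it under its name."""
    record = ModelRecord(
        model_name=model_name,
        built_by=built_by,
        built_at=datetime.now(timezone.utc).isoformat(),
        training_data_sha256=fingerprint(training_data),
    )
    registry[model_name] = record
    return record

# Hypothetical usage with a tiny stand-in for a training file.
registry = {}
record = register(
    registry, "fraud-risk-v1", "security-team", b"activated,expired,fraud_events\n"
)
```

Hashing the training data means that, when a model misbehaves, a data scientist can confirm whether the data on hand is the same data the model was built on.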

Though companies like DataRobot and H2O.ai offer end-to-end AI workflow platforms, the drive toward automating these workflows has not been confined to single-vendor solutions. Given the modularity between data prep, feature engineering and model development, enterprises often combine a number of different solutions to satisfy their requirements.

In DataRobot’s case, use of its products alongside Snowflake and Tableau has been a popular ask by customers. Customers also commonly use ML tools offered by cloud providers in conjunction with DataRobot’s and H2O.ai’s products, and both companies provide tight integration with the major cloud providers.

The rapidly expanding MLOps solutions market

The market for MLOps solutions has been growing over the past year as enterprises focused their efforts on model deployment and governance following the widespread adoption of auto-ML tools.

DataRobot recently acquired ParallelM, one of the early entrants in the MLOps space back in 2017, which enables customers to deploy models to infrastructure such as Kubernetes and Spark, either on-premises or on one of the major cloud providers. H2O.ai partnered with ParallelM last year on its MLOps solution, as well.

The MLOps space is also seeing open-source solutions crop up. Kubeflow is an open-source tool that enables MLOps capabilities for deploying to Kubernetes, and, similar to TensorFlow, it began as a project based on Google’s internal ML pipelines. Databricks has released an open-source tool called MLflow, which provides full life cycle workflows for ML development, including MLOps with deployment capabilities to Apache Spark.

The major cloud providers have also made their own forays into this category. Amazon SageMaker has introduced MLOps capabilities by helping customers leverage AWS Lambda and Step Functions for deploying models. Microsoft Azure has enabled tight integration between its auto-ML platform Azure Machine Learning and its Azure DevOps platform to enable MLOps functionality. Google Cloud has similarly moved to providing MLOps capabilities by outlining use of TensorFlow and Kubeflow along with Google Cloud Build.

Enterprises deciding on which MLOps solution to use will likely consider the following two factors: the auto-ML platform they’re using, and the orchestration framework to which they plan to deploy. For enterprises using a cloud auto-ML platform such as Amazon SageMaker, the default choice will likely be to use the associated integrations from the cloud provider and string together an MLOps workflow. The same will likely be true for standalone platforms such as DataRobot, which provide auto-ML tools with an associated MLOps capability.

Kubernetes has become an increasingly popular, scalable orchestration platform for ML workloads. MLOps solutions such as Kubeflow, which helps deploy to Kubernetes, and ParallelM’s MCenter product, which also supports Kubernetes, are likely to see growing adoption, given the widespread use of Kubernetes. Another advantage of Kubernetes is its ability to help streamline hybrid deployments across on-prem and cloud, which many companies demand. OpenAI, for example, uses Kubernetes across both on-prem infrastructure and Microsoft Azure.

The MLOps market is unlikely to be winner-take-all. We’ll likely see continued effort on the part of auto-ML providers to create tight integrations that enable MLOps capabilities for their customers. We’ll also see select deployment practices, such as the use of Kubernetes, continue to grow as developers begin to prioritize deployment possibilities from the outset when they consider different ML workflow platform providers.
