    MLOps (machine learning operations) is an engineering function carried out by a team of programmers, data scientists, and DevOps engineers who train, deploy, and monitor machine learning (ML) models in production.

    Putting artificial intelligence (AI) and ML models into production requires continuous integration and deployment, and MLOps adds the tracking, validation, and governance that this process needs.

    What is the purpose of machine learning ops?

    Adopting machine learning in production is challenging because it involves many components of varying complexity, from data ingestion and preparation to model training, deployment, and monitoring. Meeting the demands of this lifecycle requires collaboration across teams, and the purpose of MLOps is to coordinate that collaboration. The ML lifecycle spans stages ranging from experimentation to continuous integration, delivery, and deployment.

    How does machine learning ops work?

    Modeled after DevOps, MLOps orchestrates a team of ML engineers, data scientists, and IT experts and combines machine learning, app development, and IT operations into one environment.

    With its set of principles and best practices, MLOps supports successful enterprise AI adoption. A typical ML model management system includes the following stages:

    Data acquisition: The collection, ingestion, and preparation of data, integrating all acquired data for validation and analysis.

    Development: The building and training of ML models using libraries of labeled data.

    Pre-production: The validation of the ML system and evaluation of the model to test its readiness for deployment.

    Production: The iterative deployment and continuous monitoring of the ML model in production at scale.
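
    The four stages above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the hard-coded dataset, the threshold "model", and the 0.9 accuracy gate are all hypothetical and chosen only to show how each stage hands off to the next.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Model:
    """Toy model (hypothetical): predicts 1 when the input exceeds a threshold."""
    threshold: float

    def predict(self, x: float) -> int:
        return 1 if x > self.threshold else 0


def acquire_data() -> list[tuple[float, int]]:
    # Data acquisition: collect, ingest, and prepare labeled examples.
    # Hard-coded here; a real system would read from a data source.
    return [(0.1, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]


def develop(train: list[tuple[float, int]]) -> Model:
    # Development: "train" by placing the threshold midway between class means.
    negative_mean = mean(x for x, y in train if y == 0)
    positive_mean = mean(x for x, y in train if y == 1)
    return Model(threshold=(negative_mean + positive_mean) / 2)


def validate(model: Model, holdout: list[tuple[float, int]],
             min_accuracy: float = 0.9) -> bool:
    # Pre-production: evaluate on held-out data before allowing deployment.
    correct = sum(model.predict(x) == y for x, y in holdout)
    return correct / len(holdout) >= min_accuracy


def deploy_and_monitor(model: Model, live_inputs: list[float]) -> list[int]:
    # Production: serve predictions; the returned list stands in for a
    # monitoring log that a real system would inspect continuously.
    return [model.predict(x) for x in live_inputs]


def run_pipeline(live_inputs: list[float]) -> list[int]:
    data = acquire_data()
    train, holdout = data[::2], data[1::2]  # simple alternating split
    model = develop(train)
    if not validate(model, holdout):
        raise RuntimeError("model failed pre-production validation")
    return deploy_and_monitor(model, live_inputs)
```

    Calling run_pipeline([0.2, 0.7]) trains on half of the toy data, checks the model against the other half, and serves predictions only if the validation gate passes. Keeping the gate as a separate step is one way to make "readiness for deployment" an explicit, testable condition.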

    Features of machine learning ops

    MLOps focuses on machine learning projects, borrowing software engineering principles from DevOps, particularly the iterative approach to the writing, delivery, and deployment of enterprise applications.

    The components of MLOps can be divided into three parts: 

    1. Data prep and analytics: Aggregating data and creating reproducible datasets and visualizations. 
    2. Feature engineering: Developing features and making them visible and shareable across data teams. 
    3. ML modeling: Building ML models for deployment to production, which covers these activities:
    • Model training and tuning leverage open source libraries and machine learning tools.
    • Model review and governance involve discovering ML models and collaborating on them by tracking their lineage, versions, and lifecycle transitions.
    • Model inference and serving cover testing and quality assurance, including production specifics, like managing model refresh frequency and inference request times.
    • Model deployment and monitoring put ML models in production by automating permissions, creating clusters, and enabling REST API endpoints.
    • Model retraining automates corrective actions on the deployed model.
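
    Two of the activities above, governance (tracking lineage, versions, and lifecycle transitions) and retraining, can be illustrated with a small in-memory registry. This is a sketch under stated assumptions: the class names, the stage names ("staging", "production", "archived"), the dataset identifiers, and the 0.9 accuracy floor are all hypothetical and do not correspond to any particular MLOps product.

```python
from dataclasses import dataclass, field

# Allowed lifecycle transitions (hypothetical stage names).
ALLOWED_TRANSITIONS = {
    "staging": {"production", "archived"},
    "production": {"archived"},
    "archived": set(),
}


@dataclass
class ModelVersion:
    name: str
    version: int
    dataset_id: str  # lineage: which dataset produced this version
    stage: str = "staging"
    history: list = field(default_factory=lambda: ["staging"])


class ModelRegistry:
    """In-memory registry that tracks versions and lifecycle transitions."""

    def __init__(self):
        self._versions = {}  # model name -> list of ModelVersion

    def register(self, name: str, dataset_id: str) -> ModelVersion:
        # Each registration of the same name gets the next version number.
        versions = self._versions.setdefault(name, [])
        model_version = ModelVersion(name, len(versions) + 1, dataset_id)
        versions.append(model_version)
        return model_version

    def transition(self, model_version: ModelVersion, new_stage: str) -> None:
        # Reject transitions that the lifecycle does not allow.
        if new_stage not in ALLOWED_TRANSITIONS[model_version.stage]:
            raise ValueError(
                f"cannot move from {model_version.stage} to {new_stage}"
            )
        model_version.stage = new_stage
        model_version.history.append(new_stage)


def needs_retraining(recent_accuracy: list, floor: float = 0.9) -> bool:
    # Retraining trigger: flag the model when its average recent accuracy
    # falls below the floor.
    return sum(recent_accuracy) / len(recent_accuracy) < floor
```

    A registry like this records which dataset each version came from and refuses transitions outside the allowed set, so the history of a model stays auditable. The needs_retraining check is one simple way an automated corrective action might be triggered; a real system would likely use richer monitoring signals.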

    Advantages of machine learning ops

    In adopting MLOps, an organization can realize these benefits: 

    1. Efficiency: Teams develop high-quality ML models and deploy them faster.
    2. Scalability: Teams can manage, deploy, and monitor thousands of ML models.
    3. Reduced risk: Greater transparency and support for regulatory compliance minimize the risks of using ML in production.