Open Source MLOps: Version Control and Automation for Machine Learning Pipelines With DVC and CML

Paperback Published on: 26/09/2025
Price: £29.99
Not available
This product is currently unavailable
No stock available in any shop.

Synopsis

Build automated machine learning pipelines using CI/CD techniques applied to the domain of machine learning

Key Features

Create reproducible and automated machine learning pipelines using DVC and CML

Speed up your machine learning development and promote collaboration using CI/CD techniques

Ensure you stay ahead of the curve in the fiercely competitive machine learning market

Book Description

The process of deriving useful insights from machine learning can be an arduous, though rewarding, one, even for data science practitioners. It’s worth investing in any tools or techniques that can assist with the process.

Open Source MLOps with DVC and CML will take you through two such techniques, which will allow you to automate your machine learning pipelines and make them eminently reproducible.

You'll begin with an introduction to Data Version Control (DVC) and learn how it can help you keep track of your machine learning artifacts using a familiar Git-like approach. This will lead you on to building end-to-end machine learning pipelines, complete with visualizations of the results. You'll then move on to Continuous Machine Learning (CML), with which you can automate the training and testing of machine learning models so they can run alongside the rest of your CI/CD pipeline, ensuring stability and reproducibility.

By the end of this book, you will be able to develop reproducible pipelines as directed acyclic graphs and run those pipelines effortlessly in the cloud to speed up the development of your machine learning models.

What you will learn

Create an S3 bucket to act as a remote repository

Use remote storage and a GitHub repository to create a model registry

Construct pipelines in YAML format in the dvc.yaml file

Define for loops within the DVC pipeline to reduce repetition

Share experiments with a coworker

Access and save objects using DVC's Python API

Run CML workloads on AWS EC2 instances including GPU-equipped machines

Report results such as DVC metrics and plots to a GitHub pull request
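
To illustrate what treating a pipeline as a directed acyclic graph means in practice, here is a minimal sketch in plain Python. It is not taken from the book and does not use DVC itself; the stage names (prepare, featurize, train, evaluate) and the run_order helper are hypothetical. It only shows how dependencies between stages determine a valid execution order, and how a circular dependency is rejected.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each stage maps to the set of stages it depends on.
stages = {
    "prepare": set(),
    "featurize": {"prepare"},
    "train": {"featurize"},
    "evaluate": {"train", "featurize"},
}


def run_order(deps):
    """Return an execution order in which every stage follows its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    i.e. if the graph is not acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = run_order(stages)
print(order)

# A circular dependency cannot be ordered, so it is reported as an error.
try:
    run_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
print("cycle detected:", cycle_detected)
```

In this sketch each stage runs only after everything it depends on, which is the property that makes a pipeline described as a graph reproducible from start to finish.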

Who this book is for

This book is primarily for people who want to learn how to use DVC and CML to build pipelines for the deployment of machine learning models. They are most likely to be data scientists, software engineers, or students on PhD or MSc programs who are developing machine learning models. The book may also be useful for those interested in the Data Version Control aspect who are not (or not currently) developing or deploying machine learning models.

A basic knowledge of data analytics, a concern for producing analysis reproducibly, and an eagerness to learn are expected.

Publisher information

  • Publisher: Packt Publishing Limited
  • ISBN: 9781801813204
  • Number of pages: 186
  • Dimensions: 235 x 191 mm
  • Languages: English
