MLOps Engineering at Scale

by Carl Osipov

eBook

$36.99 

Overview

Dodge costly and time-consuming infrastructure tasks, and rapidly bring your machine learning models to production with MLOps and pre-built serverless tools!

In MLOps Engineering at Scale you will learn:

    Extracting, transforming, and loading datasets
    Querying datasets with SQL
    Understanding automatic differentiation in PyTorch
    Deploying model training pipelines as a service endpoint
    Monitoring and managing your pipeline’s life cycle
    Measuring performance improvements

MLOps Engineering at Scale shows you how to put machine learning into production efficiently by using pre-built services from AWS and other cloud vendors. You’ll learn how to rapidly create flexible and scalable machine learning systems without laboring over time-consuming operational tasks or taking on the costly overhead of physical hardware. Following a real-world use case for calculating taxi fares, you will engineer an MLOps pipeline for a PyTorch model using AWS serverless capabilities.

About the technology
A production-ready machine learning system includes efficient data pipelines, integrated monitoring, and means to scale up and down based on demand. Using cloud-based services to implement ML infrastructure reduces development time and lowers hosting costs. Serverless MLOps eliminates the need to build and maintain custom infrastructure, so you can concentrate on your data, models, and algorithms.

About the book
MLOps Engineering at Scale teaches you how to implement efficient machine learning systems using pre-built services from AWS and other cloud vendors. This easy-to-follow book guides you step by step as you set up your serverless ML infrastructure, even if you’ve never used a cloud platform before. You’ll also explore tools like PyTorch Lightning, Optuna, and MLflow that make it easy to build pipelines and scale your deep learning models in production.

What's inside

    Reduce or eliminate ML infrastructure management
    Learn state-of-the-art MLOps tools like PyTorch Lightning and MLflow
    Deploy training pipelines as a service endpoint
    Monitor and manage your pipeline’s life cycle
    Measure performance improvements

About the reader
Readers need to know Python, SQL, and the basics of machine learning. No cloud experience required.

About the author
Carl Osipov implemented his first neural net in 2000 and has worked on deep learning and machine learning at Google and IBM.

Table of Contents

PART 1 - MASTERING THE DATA SET
1 Introduction to serverless machine learning
2 Getting started with the data set
3 Exploring and preparing the data set
4 More exploratory data analysis and data preparation
PART 2 - PYTORCH FOR SERVERLESS MACHINE LEARNING
5 Introducing PyTorch: Tensor basics
6 Core PyTorch: Autograd, optimizers, and utilities
7 Serverless machine learning at scale
8 Scaling out with distributed training
PART 3 - SERVERLESS MACHINE LEARNING PIPELINE
9 Feature selection
10 Adopting PyTorch Lightning
11 Hyperparameter optimization
12 Machine learning pipeline

Product Details

ISBN-13: 9781638356509
Publisher: Manning
Publication date: 03/22/2022
Sold by: SIMON & SCHUSTER
Format: eBook
Pages: 344
File size: 5 MB

About the Author

Carl Osipov has been working in the information technology industry since 2001, with a focus on projects in big data analytics and machine learning in multi-core, distributed systems, such as service-oriented architecture and cloud computing platforms. While at IBM, Carl helped IBM Software Group to shape its strategy around the use of Docker and other container-based technologies for serverless cloud computing using IBM Cloud and Amazon Web Services. At Google, Carl learned from the world’s foremost experts in machine learning and helped manage the company’s efforts to democratize artificial intelligence with Google Cloud and TensorFlow. Carl is an author of over 20 articles in professional, trade, and academic journals; an inventor with six patents at USPTO; and the holder of three corporate technology awards from IBM.

Table of Contents

Preface

Acknowledgments

About this book

About the author

About the cover illustration

Part 1 Mastering the data set

1 Introduction to serverless machine learning

1.1 What is a machine learning platform?

1.2 Challenges when designing a machine learning platform

1.3 Public clouds for machine learning platforms

1.4 What is serverless machine learning?

1.5 Why serverless machine learning?

Serverless vs. IaaS and PaaS

Serverless machine learning life cycle

1.6 Who is this book for?

What you can get out of this book

1.7 How does this book teach?

1.8 When is this book not for you?

1.9 Conclusions

2 Getting started with the data set

2.1 Introducing the Washington, DC, taxi rides data set

What is the business use case?

What are the business rules?

What is the schema for the business service?

What are the options for implementing the business service?

What data assets are available for the business service?

Downloading and unzipping the data set

2.2 Starting with object storage for the data set

Understanding object storage vs. filesystems

Authenticating with Amazon Web Services

Creating a serverless object storage bucket

2.3 Discovering the schema for the data set

Introducing AWS Glue

Authorizing the crawler to access your objects

Using a crawler to discover the data schema

2.4 Migrating to columnar storage for more efficient analytics

Introducing column-oriented data formats for analytics

Migrating to a column-oriented data format

3 Exploring and preparing the data set

3.1 Getting started with interactive querying

Choosing the right use case for interactive querying

Introducing AWS Athena

Preparing a sample data set

Interactive querying using Athena from a browser

Interactive querying using a sample data set

Querying the DC taxi data set

3.2 Getting started with data quality

From "garbage in, garbage out" to data quality

Before starting with data quality

Normative principles for data quality

3.3 Applying VACUUM to the DC taxi data

Enforcing the schema to ensure valid values

Cleaning up invalid fare amounts

Improving the accuracy

3.4 Implementing VACUUM in a PySpark job

4 More exploratory data analysis and data preparation

4.1 Getting started with data sampling

Exploring the summary statistics of the cleaned-up data set

Choosing the right sample size for the test data set

Exploring the statistics of alternative sample sizes

Using a PySpark job to sample the test set

Part 2 PyTorch for serverless machine learning

5 Introducing PyTorch: Tensor basics

5.1 Getting started with tensors

5.2 Getting started with PyTorch tensor creation operations

5.3 Creating PyTorch tensors of pseudorandom and interval values

5.4 PyTorch tensor operations and broadcasting

5.5 PyTorch tensors vs. native Python lists

6 Core PyTorch: Autograd, optimizers, and utilities

6.1 Understanding the basics of autodiff

6.2 Linear regression using PyTorch automatic differentiation

6.3 Transitioning to PyTorch optimizers for gradient descent

6.4 Getting started with data set batches for gradient descent

6.5 Data set batches with PyTorch Dataset and DataLoader

6.6 Dataset and DataLoader classes for gradient descent with batches

7 Serverless machine learning at scale

7.1 What if a single node is enough for my machine learning model?

7.2 Using IterableDataset and ObjectStorageDataset

7.3 Gradient descent with out-of-memory data sets

7.4 Faster PyTorch tensor operations with GPUs

7.5 Scaling up to use GPU cores

8 Scaling out with distributed training

8.1 What if the training data set does not fit in memory?

Illustrating gradient accumulation

Preparing a sample model and data set

Understanding gradient descent using out-of-memory data shards

8.2 Parameter server approach to gradient accumulation

8.3 Introducing logical ring-based gradient descent

8.4 Understanding ring-based distributed gradient descent

8.5 Phase 1: Reduce-scatter

8.6 Phase 2: All-gather

Part 3 Serverless machine learning pipeline

9 Feature selection

9.1 Guiding principles for feature selection

Related to the label

Recorded before inference time

Supported by abundant examples

Expressed as a number with a meaningful scale

Based on expert insights about the project

9.2 Feature selection case studies

9.3 Feature selection using guiding principles

Related to the label

Recorded before inference time

Supported by abundant examples

Numeric with meaningful magnitude

Bring expert insight to the problem

9.4 Selecting features for the DC taxi data set

10 Adopting PyTorch Lightning

10.1 Understanding PyTorch Lightning

Converting PyTorch-model training to PyTorch Lightning

Enabling test and reporting for a trained model

Enabling validation during model training

11 Hyperparameter optimization

11.1 Hyperparameter optimization with Optuna

Understanding loguniform hyperparameters

Using categorical and log-uniform hyperparameters

11.2 Neural network layers configuration as a hyperparameter

11.3 Experimenting with the batch normalization hyperparameter

Using Optuna study for hyperparameter optimization

Visualizing an HPO study in Optuna

12 Machine learning pipeline

12.1 Describing the machine learning pipeline

12.2 Enabling PyTorch-distributed training support with Kaen

Understanding PyTorch-distributed training settings

12.3 Unit testing model training in a local Kaen container

12.4 Hyperparameter optimization with Optuna

Enabling MLFlow support

Using HPO for DcTaxiModel in a local Kaen provider

Training with the Kaen AWS provider

Appendix A Introduction to machine learning

Appendix B Getting started with Docker

Index
