Machine Learning Bookcamp: Build a portfolio of real-life projects
Overview
Summary
In Machine Learning Bookcamp you will:
Collect and clean data for training models
Use popular Python tools, including NumPy, Scikit-Learn, and TensorFlow
Apply ML to complex datasets with images
Deploy ML models to a production-ready environment
The only way to learn is to practice! In Machine Learning Bookcamp, you’ll create and deploy Python-based machine learning models for a variety of increasingly challenging projects. Taking you from the basics of machine learning to complex applications such as image analysis, each new project builds on what you’ve learned in previous chapters. You’ll build a portfolio of business-relevant machine learning projects that hiring managers will be excited to see.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Master key machine learning concepts as you build actual projects! Machine learning is what you need for analyzing customer behavior, predicting price trends, evaluating risk, and much more. To master ML, you need great examples, clear explanations, and lots of practice. This book delivers all three!
About the book
Machine Learning Bookcamp presents realistic, practical machine learning scenarios, along with crystal-clear coverage of key concepts. In it, you’ll complete engaging projects, such as creating a car price predictor using linear regression and deploying a churn prediction service. You’ll go beyond the algorithms and explore important techniques like deploying ML applications on serverless systems and serving models with Kubernetes and Kubeflow. Dig in, get your hands dirty, and have fun building your ML skills!
About the reader
Python programming skills assumed. No previous machine learning knowledge is required.
About the author
Alexey Grigorev is a principal data scientist at OLX Group. He runs DataTalks.Club, a community of people who love data.
Table of Contents
1 Introduction to machine learning
2 Machine learning for regression
3 Machine learning for classification
4 Evaluation metrics for classification
5 Deploying machine learning models
6 Decision trees and ensemble learning
7 Neural networks and deep learning
8 Serverless deep learning
9 Serving models with Kubernetes and Kubeflow
Product Details
Detail | Value |
---|---|
ISBN-13: | 9781638351054 |
Publisher: | Manning |
Publication date: | 11/23/2021 |
Sold by: | SIMON & SCHUSTER |
Format: | eBook |
Pages: | 472 |
File size: | 21 MB |
Table of Contents
Foreword
Preface
Acknowledgments
About this book
About the author
About the cover illustration
1 Introduction to machine learning
1.1 Machine learning
Machine learning vs. rule-based systems
When machine learning isn't helpful
Supervised machine learning
1.2 Machine learning process
Business understanding
Data understanding
Data preparation
Modeling
Evaluation
Deployment
Iterate
1.3 Modeling and model validation
2 Machine learning for regression
2.1 Car-price prediction project
Downloading the dataset
2.2 Exploratory data analysis
Exploratory data analysis toolbox
Reading and preparing data
Target variable analysis
Checking for missing values
Validation framework
2.3 Machine learning for regression
Linear regression
Training linear regression model
2.4 Predicting the price
Baseline solution
RMSE: Evaluating model quality
Validating the model
Simple feature engineering
Handling categorical variables
Regularization
Using the model
2.5 Next steps
Exercises
Other projects
3 Machine learning for classification
3.1 Churn prediction project
Telco churn dataset
Initial data preparation
Exploratory data analysis
Feature importance
3.2 Feature engineering
One-hot encoding for categorical variables
3.3 Machine learning for classification
Logistic regression
Training logistic regression
Model interpretation
Using the model
3.4 Next steps
Exercises
Other projects
4 Evaluation metrics for classification
4.1 Evaluation metrics
Classification accuracy
Dummy baseline
4.2 Confusion table
Introduction to the confusion table
Calculating the confusion table with NumPy
Precision and recall
4.3 ROC curve and AUC score
True positive rate and false positive rate
Evaluating a model at multiple thresholds
Random baseline model
The ideal model
ROC Curve
Area under the ROC curve (AUC)
4.4 Parameter tuning
K-fold cross-validation
Finding best parameters
4.5 Next steps
Exercises
Other projects
5 Deploying machine learning models
5.1 Churn-prediction model
Using the model
Using Pickle to save and load the model
5.2 Model serving
Web services
Flask
Serving churn model with Flask
5.3 Managing dependencies
Pipenv
Docker
5.4 Deployment
AWS Elastic Beanstalk
5.5 Next steps
Exercises
Other projects
6 Decision trees and ensemble learning
6.1 Credit risk scoring project
Credit scoring dataset
Data cleaning
Dataset preparation
6.2 Decision trees
Decision tree classifier
Decision tree learning algorithm
Parameter tuning for decision tree
6.3 Random forest
Training a random forest
Parameter tuning for random forest
6.4 Gradient boosting
XGBoost: Extreme gradient boosting
Model performance monitoring
Parameter tuning for XGBoost
Testing the final model
6.5 Next steps
Exercises
Other projects
7 Neural networks and deep learning
7.1 Fashion classification
GPU vs. CPU
Downloading the clothing dataset
TensorFlow and Keras
Loading images
7.2 Convolutional neural networks
Using a pretrained model
Getting predictions
7.3 Internals of the model
Convolutional layers
Dense layers
7.4 Training the model
Transfer learning
Loading the data
Creating the model
Training the model
Adjusting the learning rate
Saving the model and checkpointing
Adding more layers
Regularization and dropout
Data augmentation
Training a larger model
7.5 Using the model
Loading the model
Evaluating the model
Getting the predictions
7.6 Next steps
Exercises
Other projects
8 Serverless deep learning
8.1 Serverless: AWS Lambda
TensorFlow Lite
Converting the model to TF Lite format
Preparing the images
Using the TensorFlow Lite model
Code for the lambda function
Preparing the Docker image
Pushing the image to AWS ECR
Creating the lambda function
Creating the API Gateway
8.2 Next steps
Exercises
Other projects
9 Serving models with Kubernetes and Kubeflow
9.1 Kubernetes and Kubeflow
9.2 Serving models with TensorFlow Serving
Overview of the serving architecture
The saved_model format
Running TensorFlow Serving locally
Invoking the TF Serving model from Jupyter
Creating the Gateway service
9.3 Model deployment with Kubernetes
Introduction to Kubernetes
Creating a Kubernetes cluster on AWS
Preparing the Docker images
Deploying to Kubernetes
Testing the service
9.4 Model deployment with Kubeflow
Preparing the model: Uploading it to S3
Deploying TensorFlow models with KFServing
Accessing the model
KFServing transformers
Testing the transformer
Deleting the EKS cluster
9.5 Next steps
Exercises
Other projects
Appendix A Preparing the environment
Appendix B Introduction to Python
Appendix C Introduction to NumPy
Appendix D Introduction to Pandas
Appendix E AWS SageMaker
Index