Generative Deep Learning: Teaching Machines To Paint, Write, Compose, and Play

by David Foster

Paperback (2nd ed.)

$79.99 

Overview

Generative AI is the hottest topic in tech. This practical book teaches machine learning engineers and data scientists how to use TensorFlow and Keras to create impressive generative deep learning models from scratch, including variational autoencoders (VAEs), generative adversarial networks (GANs), Transformers, normalizing flows, energy-based models, and denoising diffusion models.

The book starts with the basics of deep learning and progresses to cutting-edge architectures. Through tips and tricks, you'll understand how to make your models learn more efficiently and become more creative.

  • Discover how VAEs can change facial expressions in photos
  • Train GANs to generate images based on your own dataset
  • Build diffusion models to produce new varieties of flowers
  • Train your own GPT for text generation
  • Learn how large language models like ChatGPT are trained
  • Explore state-of-the-art architectures such as StyleGAN2 and ViT-VQGAN
  • Compose polyphonic music using Transformers and MuseGAN
  • Understand how generative world models can solve reinforcement learning tasks
  • Dive into multimodal models such as DALL·E 2, Imagen, and Stable Diffusion

This book also explores the future of generative AI and how individuals and companies can proactively begin to leverage this remarkable technology to gain a competitive advantage.


Product Details

ISBN-13: 9781098134181
Publisher: O'Reilly Media, Incorporated
Publication date: 06/06/2023
Edition description: 2nd ed.
Pages: 453
Sales rank: 567,054
Product dimensions: 7.00(w) x 9.19(h) x 0.92(d) inches

About the Author

David Foster is a data scientist, entrepreneur, and educator specializing in AI applications within creative domains. As cofounder of Applied Data Science Partners (ADSP), he inspires and empowers organizations to harness the transformative power of data and AI. He holds an MA in Mathematics from Trinity College, Cambridge, and an MSc in Operational Research from the University of Warwick, and he is a faculty member of the Machine Learning Institute, focusing on practical applications of AI and real-world problem solving. His research interests include enhancing the transparency and interpretability of AI algorithms, and he has published work on explainable machine learning in healthcare.

Table of Contents

Preface ix

Part I Introduction to Generative Deep Learning

1 Generative Modeling 1

What Is Generative Modeling? 1

Generative Versus Discriminative Modeling 2

Advances in Machine Learning 4

The Rise of Generative Modeling 5

The Generative Modeling Framework 7

Probabilistic Generative Models 10

Hello Wrodl! 13

Your First Probabilistic Generative Model 14

Naive Bayes 17

Hello Wrodl! Continued 20

The Challenges of Generative Modeling 22

Representation Learning 23

Setting Up Your Environment 27

Summary 29

2 Deep Learning 31

Structured and Unstructured Data 31

Deep Neural Networks 33

Keras and TensorFlow 34

Your First Deep Neural Network 35

Loading the Data 35

Building the Model 37

Compiling the Model 41

Training the Model 43

Evaluating the Model 44

Improving the Model 46

Convolutional Layers 46

Batch Normalization 51

Dropout Layers 54

Putting It All Together 55

Summary 59

3 Variational Autoencoders 61

The Art Exhibition 61

Autoencoders 64

Your First Autoencoder 66

The Encoder 66

The Decoder 68

Joining the Encoder to the Decoder 71

Analysis of the Autoencoder 72

The Variational Art Exhibition 75

Building a Variational Autoencoder 78

The Encoder 78

The Loss Function 84

Analysis of the Variational Autoencoder 85

Using VAEs to Generate Faces 86

Training the VAE 87

Analysis of the VAE 91

Generating New Faces 92

Latent Space Arithmetic 93

Morphing Between Faces 94

Summary 95

4 Generative Adversarial Networks 97

Ganimals 97

Introduction to GANs 99

Your First GAN 100

The Discriminator 101

The Generator 103

Training the GAN 107

GAN Challenges 112

Oscillating Loss 112

Mode Collapse 113

Uninformative Loss 114

Hyperparameters 114

Tackling the GAN Challenges 115

Wasserstein GAN 115

Wasserstein Loss 115

The Lipschitz Constraint 117

Weight Clipping 118

Training the WGAN 119

Analysis of the WGAN 120

WGAN-GP 121

The Gradient Penalty Loss 121

Analysis of WGAN-GP 125

Summary 127

Part II Teaching Machines to Paint, Write, Compose, and Play

5 Paint 131

Apples and Organges 132

CycleGAN 135

Your First CycleGAN 137

Overview 137

The Generators (U-Net) 139

The Discriminators 142

Compiling the CycleGAN 144

Training the CycleGAN 146

Analysis of the CycleGAN 147

Creating a CycleGAN to Paint Like Monet 149

The Generators (ResNet) 150

Analysis of the CycleGAN 151

Neural Style Transfer 153

Content Loss 154

Style Loss 156

Total Variance Loss 160

Running the Neural Style Transfer 160

Analysis of the Neural Style Transfer Model 161

Summary 162

6 Write 165

The Literary Society for Troublesome Miscreants 166

Long Short-Term Memory Networks 167

Your First LSTM Network 168

Tokenization 168

Building the Dataset 171

The LSTM Architecture 172

The Embedding Layer 172

The LSTM Layer 174

The LSTM Cell 176

Generating New Text 179

RNN Extensions 183

Stacked Recurrent Networks 183

Gated Recurrent Units 185

Bidirectional Cells 187

Encoder-Decoder Models 187

A Question and Answer Generator 190

A Question-Answer Dataset 191

Model Architecture 192

Inference 196

Model Results 198

Summary 200

7 Compose 201

Preliminaries 202

Musical Notation 202

Your First Music-Generating RNN 205

Attention 206

Building an Attention Mechanism in Keras 208

Analysis of the RNN with Attention 213

Attention in Encoder-Decoder Networks 217

Generating Polyphonic Music 221

The Musical Organ 221

Your First MuseGAN 223

The MuseGAN Generator 226

Chords, Style, Melody, and Groove 227

The Bar Generator 229

Putting It All Together 230

The Critic 232

Analysis of the MuseGAN 233

Summary 235

8 Play 237

Reinforcement Learning 238

OpenAI Gym 239

World Model Architecture 241

The Variational Autoencoder 242

The MDN-RNN 243

The Controller 243

Setup 244

Training Process Overview 245

Collecting Random Rollout Data 246

Training the VAE 248

The VAE Architecture 249

Exploring the VAE 252

Collecting Data to Train the RNN 255

Training the MDN-RNN 257

The MDN-RNN Architecture 258

Sampling the Next z and Reward from the MDN-RNN 259

The MDN-RNN Loss Function 259

Training the Controller 261

The Controller Architecture 262

CMA-ES 262

Parallelizing CMA-ES 265

Output from the Controller Training 267

In-Dream Training 268

In-Dream Training the Controller 270

Challenges of In-Dream Training 272

Summary 273

9 The Future of Generative Modeling 275

Five Years of Progress 275

The Transformer 277

Positional Encoding 279

Multihead Attention 280

The Decoder 283

Analysis of the Transformer 283

BERT 285

GPT-2 285

MuseNet 286

Advances in Image Generation 287

ProGAN 287

Self-Attention GAN (SAGAN) 289

BigGAN 291

StyleGAN 292

Applications of Generative Modeling 296

AI Art 296

AI Music 297

10 Conclusion 299

Index 303
