Foundations of Predictive Analytics / Edition 1

ISBN-10: 0367381680
ISBN-13: 9780367381684
Pub. Date: 09/05/2019
Publisher: Taylor & Francis
Price: $79.95

Overview

Drawing on the authors’ two decades of experience in applied modeling and data mining, Foundations of Predictive Analytics presents the fundamental background required for analyzing data and building models for many practical applications, such as consumer behavior modeling and risk and marketing analytics. It also discusses a variety of practical topics that are frequently missing from similar texts.

The book begins with the statistical and linear algebra/matrix foundations of modeling methods, from distributions to cumulant and copula functions to the Cornish–Fisher expansion and other useful but hard-to-find statistical techniques. It then describes common and unusual linear methods as well as popular nonlinear modeling approaches, including additive models, trees, support vector machines, fuzzy systems, clustering, naïve Bayes, and neural networks. The authors go on to cover methodologies used in time series analysis and forecasting, such as ARIMA, GARCH, and survival analysis. They also present a range of optimization techniques and explore several special topics, such as Dempster–Shafer theory.
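
To give a flavor of the level of detail, the following minimal Python sketch illustrates one of these techniques, sampling from a Gaussian copula as covered in Section 2.16.1. It is our own illustration under assumed parameters (a correlation of 0.7, exponential and lognormal margins), not code from the book or from DataMinerXL.

```python
# Illustrative sketch (not from the book): drawing dependent, non-normal
# variables through a bivariate Gaussian copula (cf. Section 2.16.1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
rho = 0.7                                  # assumed copula correlation
cov = [[1.0, rho], [rho, 1.0]]

# 1. Draw correlated standard normal pairs.
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=10_000)

# 2. Push each coordinate through the standard normal CDF; the uniforms
#    (u1, u2) carry the Gaussian copula dependence structure.
u = stats.norm.cdf(z)

# 3. Apply any inverse marginal CDFs to impose the desired margins.
x1 = stats.expon.ppf(u[:, 0])              # exponential margin
x2 = stats.lognorm.ppf(u[:, 1], s=0.5)     # lognormal margin (shape assumed)

print(np.corrcoef(x1, x2)[0, 1])           # dependence induced by the copula
```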

An in-depth collection of the most important fundamental material on predictive analytics, this self-contained book provides the necessary information for understanding various techniques for exploratory data analysis and modeling. It explains the algorithmic details behind each technique (including underlying assumptions and mathematical formulations) and shows how to prepare and encode data, select variables, use model goodness measures, normalize odds, and perform reject inference.
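
As a concrete taste of that data-preparation material, here is a minimal weight-of-evidence (WOE) encoding sketch in the spirit of Sections 7.3 and 7.5. The additive smoothing constant, the good/bad sign convention, and all names are our assumptions, not the authors’.

```python
# Illustrative sketch (not from the book): weight-of-evidence encoding of a
# binned variable against a binary target, in the spirit of Section 7.5.
import numpy as np
import pandas as pd

def woe_table(bins: pd.Series, target: pd.Series, eps: float = 0.5) -> pd.Series:
    """Return WOE per bin: ln((good_i/good_total) / (bad_i/bad_total)).

    eps is a small additive smoothing count (our assumption) that keeps
    empty good/bad cells from producing infinite WOE values.
    """
    df = pd.DataFrame({"bin": bins, "bad": target.astype(int)})
    grp = df.groupby("bin")["bad"].agg(total="count", bad="sum")
    grp["good"] = grp["total"] - grp["bad"]
    good_rate = (grp["good"] + eps) / (grp["good"].sum() + eps)
    bad_rate = (grp["bad"] + eps) / (grp["bad"].sum() + eps)
    return np.log(good_rate / bad_rate)

# Example: equal-population binning (Section 7.3.2) of a synthetic score
# whose bad rate rises logistically with the score.
rng = np.random.default_rng(1)
x = pd.Series(rng.normal(size=1_000))
y = (pd.Series(rng.uniform(size=1_000)) < 1 / (1 + np.exp(-x))).astype(int)
print(woe_table(pd.qcut(x, q=5, labels=False), y))
```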

Web Resource
The book’s website at www.DataMinerXL.com offers the DataMinerXL software for building predictive models. The site also includes more examples and information on modeling.


Product Details

ISBN-13: 9780367381684
Publisher: Taylor & Francis
Publication date: 09/05/2019
Pages: 338
Product dimensions: 6.12(w) x 9.19(h) x (d)

About the Author

James Wu is a Fixed Income Quant with extensive expertise in a wide variety of applied analytical solutions in consumer behavior modeling and financial engineering. He previously worked at ID Analytics, Morgan Stanley, JPMorgan Chase, Los Alamos Computational Group, and CASA. He earned a PhD from the University of Idaho.

Stephen Coggeshall is the Chief Technology Officer of ID Analytics. He previously worked at Los Alamos Computational Group, Morgan Stanley, HNC Software, CASA, and Los Alamos National Laboratory. During his more than 20-year career, Dr. Coggeshall has helped teams of scientists develop practical solutions to difficult business problems using advanced analytics. He earned a PhD from the University of Illinois and was named 2008 Technology Executive of the Year by the San Diego Business Journal.

Table of Contents

List of Figures xv

List of Tables xvii

Preface xix

1 Introduction 1

1.1 What Is a Model? 1

1.2 What Is a Statistical Model? 2

1.3 The Modeling Process 3

1.4 Modeling Pitfalls 4

1.5 Characteristics of Good Modelers 5

1.6 The Future of Predictive Analytics 7

2 Properties of Statistical Distributions 9

2.1 Fundamental Distributions 9

2.1.1 Uniform Distribution 9

2.1.2 Details of the Normal (Gaussian) Distribution 10

2.1.3 Lognormal Distribution 19

2.1.4 Γ Distribution 20

2.1.5 Chi-Squared Distribution 22

2.1.6 Non-Central Chi-Squared Distribution 25

2.1.7 Student's t-Distribution 28

2.1.8 Multivariate t-Distribution 29

2.1.9 F-Distribution 31

2.1.10 Binomial Distribution 31

2.1.11 Poisson Distribution 32

2.1.12 Exponential Distribution 32

2.1.13 Geometric Distribution 33

2.1.14 Hypergeometric Distribution 33

2.1.15 Negative Binomial Distribution 34

2.1.16 Inverse Gaussian (IG) Distribution 35

2.1.17 Normal Inverse Gaussian (NIG) Distribution 36

2.2 Central Limit Theorem 38

2.3 Estimate of Mean, Variance, Skewness, and Kurtosis from Sample Data 40

2.4 Estimate of the Standard Deviation of the Sample Mean 40

2.5 (Pseudo) Random Number Generators 41

2.5.1 Mersenne Twister Pseudorandom Number Generator 42

2.5.2 Box-Muller Transform for Generating a Normal Distribution 42

2.6 Transformation of a Distribution Function 43

2.7 Distribution of a Function of Random Variables 43

2.7.1 Z = X + Y 44

2.7.2 Z = XY 44

2.7.3 (Z1, Z2, …, Zn) = (X1, X2, …, Xn)Y 44

2.7.4 Z = X/Y 45

2.7.5 Z = max(X,Y) 45

2.7.6 Z = min(X,Y) 45

2.8 Moment Generating Function 46

2.8.1 Moment Generating Function of Binomial Distribution 46

2.8.2 Moment Generating Function of Normal Distribution 47

2.8.3 Moment Generating Function of the Γ Distribution 47

2.8.4 Moment Generating Function of Chi-Square Distribution 47

2.8.5 Moment Generating Function of the Poisson Distribution 48

2.9 Cumulant Generating Function 48

2.10 Characteristic Function 50

2.10.1 Relationship between Cumulative Function and Characteristic Function 51

2.10.2 Characteristic Function of Normal Distribution 52

2.10.3 Characteristic Function of Γ Distribution 52

2.11 Chebyshev's Inequality 53

2.12 Markov's Inequality 54

2.13 Gram-Charlier Series 54

2.14 Edgeworth Expansion 55

2.15 Cornish-Fisher Expansion 56

2.15.1 Lagrange Inversion Theorem 56

2.15.2 Cornish-Fisher Expansion 57

2.16 Copula Functions 58

2.16.1 Gaussian Copula 60

2.16.2 t-Copula 61

2.16.3 Archimedean Copula 62

3 Important Matrix Relationships 63

3.1 Pseudo-Inverse of a Matrix 63

3.2 A Lemma of Matrix Inversion 64

3.3 Identity for a Matrix Determinant 66

3.4 Inversion of Partitioned Matrix 66

3.5 Determinant of Partitioned Matrix 67

3.6 Matrix Sweep and Partial Correlation 67

3.7 Singular Value Decomposition (SVD) 69

3.8 Diagonalization of a Matrix 71

3.9 Spectral Decomposition of a Positive Semi-Definite Matrix 75

3.10 Normalization in Vector Space 76

3.11 Conjugate Decomposition of a Symmetric Definite Matrix 77

3.12 Cholesky Decomposition 77

3.13 Cauchy-Schwarz Inequality 80

3.14 Relationship of Correlation among Three Variables 81

4 Linear Modeling and Regression 83

4.1 Properties of Maximum Likelihood Estimators 84

4.1.1 Likelihood Ratio Test 87

4.1.2 Wald Test 87

4.1.3 Lagrange Multiplier Statistic 88

4.2 Linear Regression 88

4.2.1 Ordinary Least Squares (OLS) Regression 89

4.2.2 Interpretation of the Coefficients of Linear Regression 95

4.2.3 Regression on Weighted Data 97

4.2.4 Incrementally Updating a Regression Model with Additional Data 100

4.2.5 Partitioned Regression 101

4.2.6 How Does the Regression Change When Adding One More Variable? 101

4.2.7 Linearly Restricted Least Squares Regression 103

4.2.8 Significance of the Correlation Coefficient 105

4.2.9 Partial Correlation 105

4.2.10 Ridge Regression 105

4.3 Fisher's Linear Discriminant Analysis 106

4.4 Principal Component Regression (PCR) 109

4.5 Factor Analysis 110

4.6 Partial Least Squares Regression (PLSR) 111

4.7 Generalized Linear Model (GLM) 113

4.8 Logistic Regression: Binary 116

4.9 Logistic Regression: Multiple Nominal 119

4.10 Logistic Regression: Proportional Multiple Ordinal 121

4.11 Fisher Scoring Method for Logistic Regression 123

4.12 Tobit Model: A Censored Regression Model 125

4.12.1 Some Properties of the Normal Distribution 125

4.12.2 Formulation of the Tobit Model 126

5 Nonlinear Modeling 129

5.1 Naive Bayesian Classifier 129

5.2 Neural Network 131

5.2.1 Back Propagation Neural Network 131

5.3 Segmentation and Tree Models 137

5.3.1 Segmentation 137

5.3.2 Tree Models 138

5.3.3 Sweeping to Find the Best Cutpoint 140

5.3.4 Impurity Measure of a Population: Entropy and Gini Index 143

5.3.5 Chi-Square Splitting Rule 147

5.3.6 Implementation of Decision Trees 148

5.4 Additive Models 151

5.4.1 Boosted Tree 153

5.4.2 Least Squares Regression Boosting Tree 154

5.4.3 Binary Logistic Regression Boosting Tree 155

5.5 Support Vector Machine (SVM) 158

5.5.1 Wolfe Dual 158

5.5.2 Linearly Separable Problem 159

5.5.3 Linearly Inseparable Problem 161

5.5.4 Constructing Higher-Dimensional Space and Kernel 162

5.5.5 Model Output 163

5.5.6 C-Support Vector Classification (C-SVC) for Classification 164

5.5.7 ε-Support Vector Regression (ε-SVR) for Regression 164

5.5.8 The Probability Estimate 167

5.6 Fuzzy Logic System 168

5.6.1 A Simple Fuzzy Logic System 168

5.7 Clustering 169

5.7.1 K Means, Fuzzy C Means 170

5.7.2 Nearest Neighbor, K Nearest Neighbor (KNN) 171

5.7.3 Comments on Clustering Methods 171

6 Time Series Analysis 173

6.1 Fundamentals of Forecasting 173

6.1.1 Box-Cox Transformation 174

6.1.2 Smoothing Algorithms 175

6.1.3 Convolution of Linear Filters 176

6.1.4 Linear Difference Equation 177

6.1.5 The Autocovariance Function and Autocorrelation Function 178

6.1.6 The Partial Autocorrelation Function 179

6.2 ARIMA Models 181

6.2.1 MA(q) Process 182

6.2.2 AR(p) Process 184

6.2.3 ARMA(p,q) Process 186

6.3 Survival Data Analysis 187

6.3.1 Sampling Method 190

6.4 Exponentially Weighted Moving Average (EWMA) and GARCH(1, 1) 191

6.4.1 Exponentially Weighted Moving Average (EWMA) 191

6.4.2 ARCH and GARCH Models 192

7 Data Preparation and Variable Selection 195

7.1 Data Quality and Exploration 196

7.2 Variable Scaling and Transformation 197

7.3 How to Bin Variables 197

7.3.1 Equal Interval 198

7.3.2 Equal Population 198

7.3.3 Tree Algorithms 199

7.4 Interpolation in One and Two Dimensions 199

7.5 Weight of Evidence (WOE) Transformation 200

7.6 Variable Selection Overview 204

7.7 Missing Data Imputation 206

7.8 Stepwise Selection Methods 207

7.8.1 Forward Selection in Linear Regression 208

7.8.2 Forward Selection in Logistic Regression 208

7.9 Mutual Information, KL Distance 209

7.10 Detection of Multicollinearity 210

8 Model Goodness Measures 213

8.1 Training, Testing, Validation 213

8.2 Continuous Dependent Variable 215

8.2.1 Example: Linear Regression 217

8.3 Binary Dependent Variable (Two-Group Classification) 218

8.3.1 Kolmogorov-Smirnov (KS) Statistic 218

8.3.2 Confusion Matrix 220

8.3.3 Concordant and Discordant 221

8.3.4 R2 for Logistic Regression 223

8.3.5 AIC and SBC 224

8.3.6 Hosmer-Lemeshow Goodness-of-Fit Test 224

8.3.7 Example: Logistic Regression 225

8.4 Population Stability Index Using Relative Entropy 227

9 Optimization Methods 231

9.1 Lagrange Multiplier 232

9.2 Gradient Descent Method 234

9.3 Newton-Raphson Method 236

9.4 Conjugate Gradient Method 238

9.5 Quasi-Newton Method 240

9.6 Genetic Algorithms (GA) 242

9.7 Simulated Annealing 242

9.8 Linear Programming 243

9.9 Nonlinear Programming (NLP) 247

9.9.1 General Nonlinear Programming (GNLP) 248

9.9.2 Lagrange Dual Problem 249

9.9.3 Quadratic Programming (QP) 250

9.9.4 Linear Complementarity Programming (LCP) 254

9.9.5 Sequential Quadratic Programming (SQP) 256

9.10 Nonlinear Equations 263

9.11 Expectation-Maximization (EM) Algorithm 264

9.12 Optimal Design of Experiment 268

10 Miscellaneous Topics 271

10.1 Multidimensional Scaling 271

10.2 Simulation 274

10.3 Odds Normalization and Score Transformation 278

10.4 Reject Inference 280

10.5 Dempster-Shafer Theory of Evidence 281

10.5.1 Some Properties in Set Theory 281

10.5.2 Basic Probability Assignment, Belief Function, and Plausibility Function 282

10.5.3 Dempster-Shafer's Rule of Combination 285

10.5.4 Applications of Dempster-Shafer Theory of Evidence: Multiple Classifier Function 287

Appendix A Useful Mathematical Relations 291

A.1 Information Inequality 291

A.2 Relative Entropy 291

A.3 Saddle-Point Method 292

A.4 Stirling's Formula 293

A.5 Convex Function and Jensen's Inequality 294

Appendix B DataMinerXL - Microsoft Excel Add-In for Building Predictive Models 299

B.1 Overview 299

B.2 Utility Functions 299

B.3 Data Manipulation Functions 300

B.4 Basic Statistical Functions 300

B.5 Modeling Functions for All Models 301

B.6 Weight of Evidence Transformation Functions 301

B.7 Linear Regression Functions 302

B.8 Partial Least Squares Regression Functions 302

B.9 Logistic Regression Functions 303

B.10 Time Series Analysis Functions 303

B.11 Naive Bayes Classifier Functions 303

B.12 Tree-Based Model Functions 304

B.13 Clustering and Segmentation Functions 304

B.14 Neural Network Functions 304

B.15 Support Vector Machine Functions 304

B.16 Optimization Functions 305

B.17 Matrix Operation Functions 305

B.18 Numerical Integration Functions 306

B.19 Excel Built-in Statistical Distribution Functions 306

Bibliography 309

Index 313
