Multivariate Statistics

Classical Foundations and Modern Machine Learning

Hemant Ishwaran (University of Miami, U.S.A.)

$263

Hardback

Forthcoming

English
Chapman & Hall/CRC
31 March 2025
This book explores multivariate statistics from both traditional and modern perspectives. The first section covers core topics like multivariate normality, MANOVA, discrimination, PCA, and canonical correlation analysis. The second section includes modern concepts such as gradient boosting, random forests, variable importance, and causal inference.

A key theme is leveraging classical multivariate statistics to explain advanced topics and prepare for contemporary methods. For example, linear models provide a foundation for understanding regularization with AIC and BIC, leading to a deeper analysis of regularization through generalization error and the VC theorem. Discriminant analysis introduces the weighted Bayes rule, which leads into modern classification techniques for class-imbalanced machine learning problems. Steepest descent serves as a precursor to matching pursuit and gradient boosting. Axis-aligned trees like CART, a classical tool, set the stage for more recent methods like super greedy trees.

Another central theme is training error. Introductory courses often caution that reducing training error too aggressively can lead to overfitting. At the same time, training error, also referred to as empirical risk, is a foundational concept in statistical learning theory. In regression, training error corresponds to the residual sum of squares, and minimizing it results in the least squares solution, which can lead to overfitting. Regardless of this concern, empirical risk plays a pivotal role in evaluating the potential for effective learning. The principle of empirical risk minimization demonstrates that minimizing training error can be advantageous when paired with regularization. This idea is further examined through techniques such as penalization, matching pursuit, gradient boosting, and super greedy tree constructions.
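As a rough illustration of this point (not taken from the book), the sketch below fits ordinary least squares and a ridge-penalized fit to simulated data and reports training and held-out error for each. The helper names `fit_ridge` and `rss`, the simulated data, and the penalty value `lam=5.0` are all assumptions chosen for the example; ridge is used here only as one simple form of penalization.

```python
import numpy as np

def fit_ridge(X, y, lam=0.0):
    """Return the minimizer of ||y - X b||^2 + lam * ||b||^2 (lam=0 is least squares)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def rss(X, y, beta):
    """Residual sum of squares, i.e. the (unnormalized) training error."""
    resid = y - X @ beta
    return float(resid @ resid)

# Simulated data (an assumption for illustration): 30 observations, 20 predictors,
# only the first three of which affect the response.
rng = np.random.default_rng(0)
n, p = 30, 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
X_train = rng.normal(size=(n, p))
y_train = X_train @ beta_true + rng.normal(size=n)
X_test = rng.normal(size=(1000, p))
y_test = X_test @ beta_true + rng.normal(size=1000)

beta_ols = fit_ridge(X_train, y_train, lam=0.0)    # pure empirical risk minimization
beta_ridge = fit_ridge(X_train, y_train, lam=5.0)  # empirical risk plus an L2 penalty

for name, beta in [("least squares", beta_ols), ("ridge", beta_ridge)]:
    print(name,
          "train MSE:", rss(X_train, y_train, beta) / n,
          "test MSE:", rss(X_test, y_test, beta) / len(y_test))
```

Least squares attains the smaller training error by construction, since it minimizes the residual sum of squares directly. Whether the penalized fit does better on held-out data depends on the data and on the choice of `lam`.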

Key Features:

• Covers both classical and contemporary multivariate statistics.
• Each chapter includes a carefully selected set of exercises that vary in degree of difficulty and are both applied and theoretical.
• The book can also serve as a reference for researchers due to the diverse topics covered, including new material on super greedy trees, rule-based variable selection, and machine learning for causal inference.
• Extensive treatment of trees that provides a comprehensive and unified approach to understanding trees in terms of partitions and empirical risk minimization.
• New content on random forests, including random forest quantile classifiers for class-imbalanced problems, multivariate random forests, subsampling for confidence regions, and super greedy forests. An entire chapter is dedicated to random survival forests, featuring new material on random hazard forests, which extend survival forests to time-varying covariates.
By:   Hemant Ishwaran
Imprint:   Chapman & Hall/CRC
Country of Publication:   United Kingdom
Dimensions:   Height: 254mm,  Width: 178mm
ISBN:   9781032758794
ISBN 10:   1032758791
Pages:   466
Publication Date:   31 March 2025
Audience:   College/higher education,  A/AS level,  Further/Higher Education
Format:   Hardback
Publisher's Status:   Forthcoming
Preface
1. Introduction
2. Properties of Random Vectors and Background Material
3. Multivariate Normal Distribution
4. Linear Regression
5. Multivariate Regression
6. Discriminant Analysis and Classification
7. Generalization Error
8. Principal Component Analysis
9. Canonical Correlation Analysis
10. Newton’s Method
11. Steepest Descent
12. Gradient Boosting
13. Detailed Analysis of L2Boost
14. Coordinate Descent
15. Trees
16. Random Forests
17. Random Forests Variable Selection
18. Splitting Effect on Random Forests
19. Random Survival Forests
20. Causal Estimates using Machine Learning

Dr. Hemant Ishwaran’s work focuses on advancing machine learning techniques for applications in public health, medicine, and informatics. His contributions include the development of open-source tools, such as R packages for his pioneering methods, including the widely used random survival forests, a significant extension of the random forest algorithm in machine learning. His collaborations with healthcare experts have resulted in precision models for cardiovascular disease (CVD), heart transplantation, cancer staging, and resistance to gene cancer therapy.
