A much-needed guide to implementing new technology in workspaces
From experts in the field comes Machine Learning Upgrade: A Data Scientist's Guide to MLOps, LLMs, and ML Infrastructure, a book that provides data scientists and managers with best practices at the intersection of management, large language models (LLMs), machine learning, and data science. This groundbreaking book will change the way that you view the pipeline of data science. The authors provide an introduction to modern machine learning, showing you how it can be viewed as a holistic, end-to-end system, not just a shiny new gadget in an otherwise unchanged operational structure. By adopting a data-centric view of the world, you can begin to see unstructured data and LLMs as the foundation upon which you can build countless applications and business solutions. This book explores a whole world of decision making that hasn't been codified yet, enabling you to forge the future using emerging best practices.
Gain an understanding of the intersection between large language models and unstructured data
Follow the process of building an LLM-powered application while leveraging MLOps techniques such as data versioning and experiment tracking
Discover best practices for training, fine-tuning, and evaluating LLMs
Integrate LLM applications within larger systems, monitor their performance, and retrain them on new data
This book is indispensable for data professionals and business leaders looking to understand LLMs and the entire data science pipeline.
By:
Kristen Kehrer (Data Moves Me LLC),
Caleb Kaiser (Comet)
Imprint: John Wiley & Sons Inc
Country of Publication: United States
Dimensions:
Height: 226mm,
Width: 150mm,
Spine: 18mm
Weight: 272g
ISBN: 9781394249633
ISBN 10: 1394249632
Pages: 240
Publication Date: 20 August 2024
Audience: Professional and scholarly, Undergraduate
Format: Paperback
Publisher's Status: Active
Introduction
1 A Gentle Introduction to Modern Machine Learning
  Data Science Is Diverging from Business Intelligence
  From CRISP-DM to Modern, Multicomponent ML Systems
  The Emergence of LLMs Has Increased ML’s Power and Complexity
  What You Can Expect from This Book
2 An End-to-End Approach
  Components of a YouTube Search Agent
  Principles of a Production Machine Learning System
  Observability
  Reproducibility
  Interoperability
  Scalability
  Improvability
  A Note on Tools
3 A Data-Centric View
  The Emergence of Foundation Models
  The Role of Off-the-Shelf Components
  The Data-Driven Approach
  A Note on Data Ethics
  Building the Dataset
  Working with Vector Databases
  Data Versioning and Management
  Getting Started with Data Versioning
  Knowing “Just Enough” Engineering
4 Standing Up Your LLM
  Selecting Your LLM
  What Type of Inference Do I Need to Perform?
  How Open-Ended Is This Task?
  What Are the Privacy Concerns for This Data?
  How Much Will This Model Cost?
  Experiment Management with LLMs
  LLM Inference
  Basics of Prompt Engineering
  In-Context Learning
  Intermediary Computation
  Augmented Generation
  Agentic Techniques
  Optimizing LLM Inference with Experiment Management
  Fine-Tuning LLMs
  When to Fine-Tune an LLM
  Quantization, QLoRA, and Parameter Efficient Fine-Tuning
  Wrapping Things Up
5 Putting Together an Application
  Prototyping with Gradio
  Creating Graphics with Plotnine
  Adding the Author Selector
  Adding a Logo
  Adding a Tab
  Adding a Title and Subtitle
  Changing the Color of the Buttons
  Click to Download Button
  Putting It All Together
  Deploying Models as APIs
  Implementing an API with FastAPI
  Implementing Uvicorn
  Monitoring an LLM
  Dockerizing Your Service
  Deploying Your Own LLM
  Wrapping Things Up
6 Rounding Out the ML Life Cycle
  Deploying a Simple Random Forest Model
  An Introduction to Model Monitoring
  Model Monitoring with Evidently AI
  Building a Model Monitoring System
  Final Thoughts on Monitoring
7 Review of Best Practices
  Step 1: Understand the Problem
  Step 2: Model Selection and Training
  Step 3: Deploy and Maintain
  Step 4: Collaborate and Communicate
  Emerging Trends in LLMs
  Next Steps in Learning
Appendix: Additional LLM Example
Index
Kristen Kehrer has been providing innovative and practical statistical modeling solutions since 2010. In 2018, she achieved recognition as a LinkedIn Top Voice in Data Science & Analytics. Kristen is also the founder of Data Moves Me, LLC.
Caleb Kaiser is a Full Stack Engineer at Comet. Caleb was previously on the Founding Team at Cortex Labs. Caleb also worked at Scribe Media on the Author Platform Team.