Discover the power of open source observability for your enterprise environment
In Mastering Observability and OpenTelemetry: Enhancing Application and Infrastructure Performance and Avoiding Outages, accomplished engineering leader and open source contributor Steve Flanders unlocks the secrets of enterprise application observability with a comprehensive guide to OpenTelemetry (OTel). Explore how OTel transforms observability, providing a robust toolkit for capturing and analyzing telemetry data across your environment.
You will learn how OTel delivers unmatched flexibility, extensibility, and vendor neutrality, freeing you from vendor lock-in and enabling data sovereignty and portability. You will also discover:
Comprehensive coverage of observability issues and technology: Dive deep into the world of observability and gain a comprehensive understanding of observability fundamentals with practical insights and real-world use cases.
Practical guidance: From instrumentation techniques to advanced tracing strategies, gain the skills needed to create highly observable systems. Learn how to deploy and configure OTel, even in challenging brownfield environments, with step-by-step instructions and hands-on exercises.
An opportunity for community contributions and communication: Join the OTel community, including end-users, vendors, and cloud providers, and shape the future of observability while connecting with experts and peers.
Whether you are a novice or a seasoned professional, Mastering Observability and OpenTelemetry is your roadmap to troubleshooting availability and performance problems by learning to detect anomalies, interpret data, and proactively optimize performance in your enterprise environment. Embark on your journey to observability mastery today!
By: Steve Flanders
Imprint: John Wiley & Sons Inc
Country of Publication: United States
Dimensions: Height: 234mm, Width: 183mm, Spine: 23mm
Weight: 476g
ISBN: 9781394253128
ISBN 10: 1394253125
Series: Tech Today
Pages: 368
Publication Date: 29 October 2024
Audience: Professional and scholarly, Undergraduate
Format: Paperback
Publisher's Status: Active
Foreword; Introduction; The Mastering Series

Chapter 1 What Is Observability?
Definition; Background; Cloud Native Era; Monitoring Compared to Observability; Metadata; Dimensionality; Cardinality; Semantic Conventions; Data Sensitivity; Signals; Metrics; Logs; Traces; Other Signals; Collecting Signals; Instrumentation; Push Versus Pull Collection; Data Collection; Sampling Signals; Observability; Platforms; Application Performance Monitoring; The Bottom Line; Notes

Chapter 2 Introducing OpenTelemetry!
Background; Observability Pain Points; The Rise of Open Source Software; Introducing OpenTelemetry; OpenTelemetry Components; OpenTelemetry Concepts; Roadmap; The Bottom Line; Notes

Chapter 3 Getting Started with the Astronomy Shop
Background; Architecture; Prerequisites; Getting Started; Accessing the Astronomy Shop; Accessing Telemetry Data; Beyond the Basics; Configuring Load Generation; Configuring Feature Flags; Configuring Tests Built from Traces; Configuring the OTel Collector; Configuring OTel Instrumentation; Troubleshooting Astronomy Shop; Astronomy Shop Scenarios; Troubleshooting Errors; Troubleshooting Availability; Troubleshooting Performance; Troubleshooting Telemetry; The Bottom Line; Notes

Chapter 4 Understanding the OpenTelemetry Specification
Background; API Specification; API Definition; API Context; API Signals; API Implementation; SDK Specification; SDK Definition; SDK Signals; SDK Implementation; Data Specification; Data Models; Data Protocols; Data Semantic Conventions; Data Compatibility; General Specification; The Bottom Line; Notes

Chapter 5 Managing the OpenTelemetry Collector
Background; Deployment Modes; Agent Mode; Gateway Mode; Reference Architectures; The Basics; The Binary; Sizing; Components; Configuration; Receivers and Exporters; Processors; Extensions; Connectors; Observing; Relevant Metrics; Health Check Extension; zPages Extension; Troubleshooting; Out of Memory Crashes; Data Not Being Received or Exported; Performance Issues; Beyond the Basics; Distributions; Securing; Management; The Bottom Line; Notes

Chapter 6 Leveraging OpenTelemetry Instrumentation
Environment Setup; Python Trace Instrumentation; Automatic Instrumentation; Manual Instrumentation; Programmatic Instrumentation; Mixing Automatic and Manual Trace Instrumentation; Python Metrics Instrumentation; Automatic Instrumentation; Manual Instrumentation; Programmatic Instrumentation; Mixing Automatic and Manual Metric Instrumentation; Python Log Instrumentation; Manual Metadata Enrichment; Trace Correlation; Language Considerations; .NET; Java; Go; Node.js; Deployment Models; Distributions; The Bottom Line; Notes

Chapter 7 Adopting OpenTelemetry
The Basics; Why OTel and Why Now?; Where to Start?; General Process; Data Collection; Instrumentation; Production Readiness; Maturity Framework; Brownfield Deployment; Data Collection; Instrumentation; Dashboards and Alerts; Greenfield Deployment; Data Collection; Instrumentation; Other Considerations; Administration and Maintenance; Environments; Semantic Conventions; The Future; The Bottom Line; Notes

Chapter 8 The Power of Context and Correlation
Background; Context; OTel Context; Trace Context; Resource Context; Logic Context; Correlation; Time Correlation; Context Correlation; Trace Correlation; Metric Correlation; The Bottom Line; Notes

Chapter 9 Choosing an Observability Platform
Primary Considerations; Platform Capabilities; Marketing Versus Reality; Price, Cost, and Value; Observability Fragmentation; Primary Factors; Build, Buy, or Manage; Licensing, Operations, and Deployment; OTel Compatibility and Vendor Lock-In; Stakeholders and Company Culture; Implementation Basics; Administration; Usage; Maturity Framework; The Bottom Line; Notes

Chapter 10 Observability Antipatterns and Pitfalls
Telemetry Data Missteps; Mixing Instrumentation Libraries Scenario; Automatic Instrumentation Scenario; Custom Instrumentation Scenario; Component Configuration Scenario; Performance Overhead Scenario; Resource Allocation Scenario; Security Considerations Scenario; Monitoring and Maintenance Scenario; Observability Platform Missteps; Vendor Lock-in Scenario; Fragmented Tooling Scenario; Tool Fatigue Scenario; Inadequate Scalability Scenario; Data Overload Scenario; Company Culture Implications; Lack of Leadership Support Scenario; Resistance to Change Scenario; Collaboration and Alignment Scenario; Goals and Success Criteria Scenario; Standardization and Consistency Scenario; Incentives and Recognition Scenario; Feedback and Improvement Scenario; Prioritization Framework; The Bottom Line; Notes

Chapter 11 Observability at Scale
Understanding the Challenges; Volume and Velocity of Telemetry Data; Distributed System Complexity; Observability Platform Complexity; Infrastructure and Resource Constraints; Strategies for Scaling Observability; Elasticity, Elasticity, Elasticity!; Leverage Cloud Native Technologies; Filter, Sample, and Aggregate; Anomaly Detection and Predictive Analytics; Emerging Technologies and Methodologies; Best Practices for Managing Scale; General Recommendations; Instrumentation and Data Collection; Observability Platform; The Bottom Line; Notes

Chapter 12 The Future of Observability
Challenges and Opportunities; Cost; Complexity; Compliance; Code; Emerging Trends and Innovations; Artificial Intelligence; Observability as Code; Service Mesh; eBPF; The Future of OpenTelemetry; Stabilization and Expansion; Expanded Signal Support; Unified Query Language; Community-driven Innovation; The Bottom Line; Notes

Appendix A The Bottom Line
Chapter 1: What Is Observability?; Chapter 2: Introducing OpenTelemetry!; Chapter 3: Getting Started with the Astronomy Shop; Chapter 4: Understanding the OpenTelemetry Specification; Chapter 5: Managing the OpenTelemetry Collector; Chapter 6: Leveraging OpenTelemetry Instrumentation; Chapter 7: Adopting OpenTelemetry; Chapter 8: The Power of Context and Correlation; Chapter 9: Choosing an Observability Platform; Chapter 10: Observability Antipatterns and Pitfalls; Chapter 11: Observability at Scale; Chapter 12: The Future of Observability

Appendix B Introduction
Chapter 2: Introducing OpenTelemetry!; OpenTelemetry Concepts > Roadmap; Chapter 3: Getting Started with the Astronomy Shop; Background > Architecture; Chapter 5: Managing the OpenTelemetry Collector; Background; The Basics > Components; Chapter 12: The Future of Observability; Challenges and Opportunities > Code; Notes

Index
STEVE FLANDERS is a Senior Director of Engineering at Splunk, a Cisco company. Steve is one of the founding members of the OpenTelemetry project.