Apache DataFusion: Building Custom Analytics Engines in Rust

By
- Trex Team
Publisher
- NobleTrex Press

Language: English
Format
Category: Nonfiction

"Apache DataFusion: Building Custom Analytics Engines in Rust"

This book is for experienced Rust developers and data infrastructure engineers who want to build fast, embeddable analytics systems without writing a query engine from scratch. Using Apache DataFusion as the core, it shows how to turn Arrow’s zero-copy columnar memory model into production-grade pipelines, and how to make deliberate architectural choices around extensibility, isolation, and predictable performance in real services.

You’ll learn how DataFusion’s SQL and DataFrame front-ends map into logical plans and expressions, where the boundaries for semantic analysis and type coercion belong, and how to shape and optimize plans safely. The book then goes deeper into physical planning and the ExecutionPlan contract: partitioning and ordering requirements, streaming execution semantics, and physical optimization techniques that reduce data movement. You’ll also implement the key extension points (TableProviders and catalogs, object-store and file-format I/O, Arrow-native UDFs, UDAFs, and UDWFs, and custom physical operators) while avoiding common wrong-answer and performance pitfalls.

Prerequisites include strong Rust fluency, comfort with async/concurrency, and basic familiarity with relational query processing. The emphasis is on engine-authoring workflows: capability contracts, cost and correctness trade-offs, and operational readiness through metrics, spilling, concurrency control, and rigorous testing and diagnostics.

Release date

Ebook: March 9, 2026
