Nonfiction and documentary
"Deequ Data Quality: Constraint‑Based Validation for Big Data Pipelines"
Data quality failures in big data systems rarely look like broken code—they look like “successful” jobs shipping quietly corrupted tables. This book is for experienced data engineers, platform engineers, and analytics/ML practitioners who need enforceable guarantees, not ad‑hoc SQL spot checks. It treats data quality as an engineering discipline: explicit contracts, measurable signals, and operational response patterns that keep pipelines trustworthy without freezing delivery.
You’ll learn Deequ’s core model—metrics plus assertions—and how it maps onto Spark execution, cost, and reproducibility. The book goes deep on authoring production-grade constraints (completeness, uniqueness, validity, ranges, patterns, proportions), composing checks with stable thresholds, and turning failures into actionable diagnostics. It then operationalizes validation via VerificationSuite, showing how to plan analyzer execution, interpret VerificationResult edge cases, and implement gating strategies such as fail-fast, quarantine, and partial publishes. Profiling and constraint suggestion are covered as accelerators—followed by governance and rollout workflows that keep rules maintainable as data and business semantics evolve.
A strong working knowledge of Spark and DataFrames is assumed. Coverage includes longitudinal quality via metrics repositories, regression detection, and alerting, plus advanced patterns for partitioned/incremental data, late arrivals, custom analyzers, and real-world version compatibility across
© 2026 NobleTrex Press (E-book): 6610001179250
Release date
E-book: 9 March 2026