
Nonfiction
"TensorRT‑LLM Optimization: Quantization, Kernel Fusion, and Throughput Engineering"
Built for experienced ML systems engineers, inference specialists, and GPU performance practitioners, this book is a deep guide to making large language models run faster, cheaper, and more predictably with TensorRT‑LLM. Rather than offering generic acceleration advice, it develops a precise mental model of the TensorRT‑LLM stack so readers can understand where performance is won or lost: in quantization choices, graph compilation, fused kernels, KV-cache policy, and serving scheduler behavior.
The book covers the full optimization path from precision strategy and post-training quantization pipelines to engine build configuration, plugin-enabled fusion, attention specialization, and throughput-oriented serving design. Readers will learn how to choose among FP16, BF16, FP8, INT8, and INT4 in hardware-aware ways; validate deployable quantized artifacts; realize fused execution paths in compiled engines; engineer KV-cache behavior for long-context workloads; and benchmark and profile systems with enough rigor to attribute gains to the right layer.
Structured as an advanced, implementation-minded text, the book emphasizes cross-layer tradeoffs rather than isolated tricks. It assumes solid familiarity with transformer inference, CUDA-era GPU concepts, and production deployment concerns, and rewards readers who want durable optimization judgment instead of version-fragile recipes.
© 2026 NobleTrex Press (E-Book): 6610001219079
Release date
E-book: May 8, 2026