Nonfiction
"llama.cpp in the Real World: On‑Device LLM Inference, Profiling, and Deployment"
Large language models no longer belong only in hyperscale clouds. This book is for experienced engineers, systems programmers, ML infrastructure practitioners, and advanced application developers who need to run LLMs locally with rigor, not folklore. Centered on llama.cpp, it treats on-device inference as a serious engineering discipline: one shaped by hardware limits, fast-moving runtime capabilities, reproducibility demands, and real operational trade-offs.
Readers will learn how llama.cpp is structured as an inference stack, why GGUF defines the deployment boundary, and how backend selection, build strategy, quantization, context sizing, and KV-cache planning interact to determine feasibility and performance. The book then moves into repeatable benchmarking, runtime tuning, server deployment through local APIs, multi-user serving behavior, structured outputs, embeddings, reranking, and production-grade observability. Throughout, it emphasizes evidence-based decisions, compatibility awareness, and benchmark methodology that survives version churn.
Rather than offering shallow setup recipes, this book provides a deep, system-level treatment of local LLM deployment in the real world. Familiarity with C/C++ build environments, model-serving concepts, and modern ML tooling will help readers get the most from it, but the presentation remains self-contained and operationally focused.
© 2026 NobleTrex Press (Ebook): 6610001219116
Release date
Ebook: May 8, 2026