Hands-On RAG for Production: Design, Develop, and Deploy Production-Ready RAG Applications

Author(s): Ofer Mendelevitch (Author), Forrest Sheng Bao (Author)

  • Publisher: O’Reilly Media
  • Publication Date: July 7, 2026
  • Edition: 1st
  • Language: English
  • Print length: 356 pages
  • ASIN: B0G48HGR81
  • ISBN-13: 9798341621718

Book Description

Retrieval-augmented generation (RAG) is the go-to strategy for integrating large language models with your organization’s unique knowledge. However, the market is full of RAG pipelines and components, making it hard to choose the right solution for your enterprise’s needs. This book simplifies the process, offering a comprehensive road map to building, refining, and scaling production-grade RAG applications.

Authors Ofer Mendelevitch and Forrest Bao guide you through every phase of development, from data ingestion, embeddings, and vector search to advanced techniques like agentic RAG, multimodal RAG, and GraphRAG. Engineers and architects will learn how to tackle the challenges they’ll encounter when building RAG applications at enterprise scale, including ensuring high accuracy with minimal hallucinations, maintaining low-latency performance, safeguarding data privacy, and providing transparent, explainable responses.

  • Determine whether to build RAG yourself or deploy a RAG-as-a-service platform
  • Build a basic RAG stack that maximizes performance and cost-effectiveness
  • Measure key metrics such as hallucinations, response quality, latency, and cost
  • Address challenges in enterprise deployment, such as compliance with data security and privacy requirements, explainability, and prompt design
  • Implement advanced techniques such as multimodal RAG, agentic RAG, and GraphRAG

Editorial Reviews

From the Inside Flap

“Hands-On RAG for Production doesn’t skip the unglamorous parts. Ofer and Forrest give document parsing, tables, and ingestion the serious treatment they deserve — which is exactly where most real-world RAG systems live or die.”
— Jerry Liu – CEO, LlamaIndex

“We are entering an era where software is no longer just a tool we use, but an intelligence we collaborate with. This book correctly identifies that the future of the ‘Corporate Brain’ relies on the unification of fragmented institutional knowledge through robust RAG pipelines. For teams building these living, breathing engines of insight, this book serves as the definitive guide to mastering the complex interplay between vector stores, agents, and evaluators.”
— Bob van Luijt – Cofounder and CEO, Weaviate

About the Author

Ofer Mendelevitch is an AI and ML leader specializing in building production systems with large language models (LLMs), retrieval-augmented generation (RAG), and agentic workflows. He is the author of Practical Data Science with Hadoop (Addison-Wesley).
Forrest Sheng Bao co-leads the Machine Learning team at Vectara. He has more than 10 years of research experience in artificial intelligence (AI) and natural language processing (NLP). Prior to Vectara, he was an assistant professor at Iowa State University. Forrest holds a PhD in computer science with a minor in electrical engineering from Texas Tech University.
