
Hands-On RAG for Production: Design, Develop, and Deploy Production-Ready RAG Applications
Author(s): Ofer Mendelevitch, Forrest Sheng Bao
- Publisher: O’Reilly Media
- Publication Date: July 7, 2026
- Edition: 1st
- Language: English
- Print length: 356 pages
- ASIN: B0G48HGR81
- ISBN-13: 9798341621718
Book Description
Retrieval-augmented generation (RAG) is the go-to strategy for integrating large language models with your organization’s unique knowledge. However, the market is full of RAG pipelines and components, making it hard to choose the right solution for your enterprise’s needs. This book simplifies the process, offering a comprehensive road map to building, refining, and scaling production-grade RAG applications.
Authors Ofer Mendelevitch and Forrest Bao guide you through every phase of development, from data ingestion, embeddings, and vector search to advanced techniques like agentic RAG, multimodal RAG, and GraphRAG. Engineers and architects will learn how to tackle the challenges of building RAG applications at enterprise scale, including ensuring high accuracy with minimal hallucinations, maintaining low-latency performance, safeguarding data privacy, and providing transparent, explainable responses.
- Determine whether to build RAG yourself or deploy a RAG-as-a-service platform
- Build a basic RAG stack that maximizes performance and cost-effectiveness
- Measure key metrics such as hallucinations, response quality, latency, and cost
- Address challenges in enterprise deployment, such as compliance with data security and privacy requirements, explainability, and prompt design
- Implement advanced techniques such as multimodal RAG, agentic RAG, and GraphRAG
Editorial Reviews
From the Inside Flap
— Jerry Liu – CEO, LlamaIndex
“We are entering an era where software is no longer just a tool we use, but an intelligence we collaborate with. This book correctly identifies that the future of the ‘Corporate Brain’ relies on the unification of fragmented institutional knowledge through robust RAG pipelines. For teams building these living, breathing engines of insight, this book serves as the definitive guide to mastering the complex interplay between vector stores, agents, and evaluators.”
— Bob van Luijt – Cofounder and CEO, Weaviate
About the Author
Forrest Sheng Bao co-leads the Machine Learning team at Vectara. He has more than 10 years of research experience in artificial intelligence (AI) and natural language processing (NLP). Prior to Vectara, he was an assistant professor at Iowa State University. Forrest holds a PhD in computer science with a minor in electrical engineering from Texas Tech University.

