⚠ This demo is getting old!



Databricks is moving fast — it’s now easier than ever to deploy multi-agent systems
and run evals with MLflow 3.0.



Install the dbdemos `ai-agent` demo to explore the latest features.
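If you haven't used dbdemos before, installing a demo is a one-liner from any Databricks notebook. A minimal sketch, assuming the newer demo is published under the `ai-agent` name (check `dbdemos.list_demos()` for the exact catalog):

```python
# Run in a Databricks notebook cell; restart Python after the pip install.
%pip install dbdemos

import dbdemos

dbdemos.list_demos()         # browse the available demos
dbdemos.install('ai-agent')  # assumed demo name from the note above
```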

This content will be deprecated soon.


Deploy Your LLM Chatbots with Mosaic AI Agent Evaluation and Lakehouse Applications



In this tutorial, you will learn how to build your own chatbot assistant to help your customers answer questions about Databricks, using Retrieval Augmented Generation (RAG), Databricks' state-of-the-art DBRX Instruct Foundation Model, and Vector Search.


Scaling your business with a GenAI-Powered Assistant and DBRX Instruct



LLMs are disrupting the way we interact with information, from internal knowledge bases to external, customer-facing documentation or support.



While ChatGPT democratized LLM-based chatbots for consumer use, companies need to deploy personalized models that answer their needs:

- Privacy requirements on sensitive information
- Preventing hallucination
- Specialized content, not available on the Internet
- Specific behavior for customer tasks
- Control over speed and cost
- Deploying models on private infrastructure for security reasons

Introducing Mosaic AI Agent Evaluation



To solve these challenges, custom knowledge bases and models need to be deployed. However, doing so at scale isn't simple and requires:

- Ingesting and transforming massive amounts of data
- Ensuring privacy and security across your data pipeline
- Deploying systems such as a Vector Search index
- Having access to GPUs and deploying efficient LLMs for inference serving
- Training and deploying custom models
- Evaluating your RAG application

This is where the Databricks Data Intelligence Platform comes in. Databricks simplifies all these steps so that you can focus on building your final model, with the best prompts and performance.

GenAI & Maturity Curve





Deploying GenAI can be done in multiple ways:

- **Prompt engineering on public APIs (e.g., Databricks DBRX Instruct, Llama 2, OpenAI)**: answers based on public information (think ChatGPT)
- **Retrieval Augmented Generation (RAG)**: specialize your model with additional content. *This is what we'll focus on in this demo*
- **OSS model fine-tuning**: when you have a large corpus of custom data and need specific model behavior (e.g., executing a task)
- **Train your own LLM**: for full control over the underlying data sources of the model (biomedical, code, finance...)



What is Retrieval Augmented Generation (RAG) for LLMs?





RAG is a powerful and efficient GenAI technique that allows you to improve model performance by leveraging your own data (e.g., documentation specific to your business), without the need to fine-tune the model.

This is done by providing your custom information as context to the LLM. This reduces hallucination and allows the LLM to produce results that provide company-specific data, without making any changes to the original LLM.

RAG has shown success in chatbots and Q&A systems that need to maintain up-to-date information or access domain-specific knowledge.
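At its core, the pattern is simple: retrieve the most relevant documents, then inject them into the prompt. A minimal sketch of that flow, where `retrieve_docs` and `llm` are hypothetical placeholders for the Vector Search query and Foundation Model call covered later in this demo:

```python
# Minimal RAG flow: retrieve context, augment the prompt, call the LLM.
# `retrieve_docs` and `llm` are placeholders for your vector search and
# model serving calls (both shown later in this demo).

def answer_with_rag(question: str, retrieve_docs, llm) -> str:
    # 1. Retrieve the document chunks most relevant to the question.
    docs = retrieve_docs(question, k=3)

    # 2. Inject them as context; the base LLM itself is never modified.
    context = "\n\n".join(docs)
    prompt = (
        "You are a Databricks assistant. Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

    # 3. The grounded prompt reduces hallucination on company-specific topics.
    return llm(prompt)
```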

RAG and Vector Search



To provide additional context to our LLM, we need to search for the documents or articles that are likely to contain the answer to the user's question.
A common solution is to deploy a vector database. This involves creating document embeddings: fixed-size vectors that represent your documents.

The vectors will then be used to perform real-time similarity search during inference.
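As a preview, here is a hedged sketch of such a real-time similarity search with the `databricks-vectorsearch` client; the endpoint and index names are illustrative placeholders, not values from this demo:

```python
from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()
index = vsc.get_index(
    endpoint_name="my_vs_endpoint",                    # hypothetical endpoint
    index_name="main.rag_demo.databricks_docs_index",  # hypothetical index
)

# The index embeds the query text and returns the closest document chunks.
results = index.similarity_search(
    query_text="How do I create a Delta table?",
    columns=["url", "content"],  # columns to return with each match
    num_results=3,
)
```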

01-First Step: Deploy and test your first RAG application in 10 minutes




New to RAG and Mosaic AI Agent Evaluation? Start here if this is your first time implementing a GenAI application leveraging Databricks DBRX.

You will learn how to:

- Create your Vector Search index and send queries to find similar documents
- Build your LangChain model leveraging the Databricks Foundation Model DBRX Instruct (a minimal sketch follows this list)
- Deploy and test your chatbot with the Databricks review app
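As a taste of that second step, here is a minimal sketch of calling the DBRX Instruct Foundation Model endpoint through LangChain, assuming the `langchain-community` package and the pay-per-token `databricks-dbrx-instruct` endpoint are available in your workspace:

```python
from langchain_community.chat_models import ChatDatabricks

# Points at the DBRX Instruct Foundation Model serving endpoint.
chat_model = ChatDatabricks(
    endpoint="databricks-dbrx-instruct",
    max_tokens=200,
)

response = chat_model.invoke("What is Databricks Vector Search?")
print(response.content)
```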


Get started: open the [01-first-step/01-First-Step-RAG-On-Databricks]($./01-first-step/01-First-Step-RAG-On-Databricks) notebook.

02-Simple RAG App: Build a production-grade RAG application



Start here if this is your first time implementing a GenAI application leveraging Databricks DBRX, our state-of-the-art LLM, open and available as a Model Serving endpoint.



You will learn how to:

- Prepare your document dataset, creating text chunks from documentation pages (see the chunking sketch after this list)
- Create your Vector Search index and send queries to find similar documents
- Build a complete LangChain model leveraging the Databricks Foundation Model DBRX Instruct
- Deploy and test your chatbot with the Mosaic AI Agent Evaluation review app
- Ask external experts to test and review your chatbot
- Deploy a front-end application using Databricks Lakehouse Apps
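The chunking step referenced above can be as simple as a recursive splitter. A minimal sketch using LangChain's `RecursiveCharacterTextSplitter`; the chunk sizes are illustrative and should be tuned to your embedding model:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,    # max characters per chunk (illustrative value)
    chunk_overlap=50,  # overlap preserves context across chunk boundaries
)

page = "Delta Lake is an open storage format..."  # a full documentation page
chunks = splitter.split_text(page)  # -> list[str], one entry per chunk
```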

Get started: open the [02-simple-app/01-Data-Preparation-and-Index notebook]($./02-simple-app/01-Data-Preparation-and-Index).

03-Advanced: Going further, build and manage your Evaluation Dataset with Mosaic AI Agent Evaluation




Explore this content to discover how to leverage all the Databricks Data Intelligence Platform capabilities for your GenAI Apps.

You will learn how to:

- Extract information from unstructured documents (PDFs) and create custom chunks
- Leverage the Databricks embedding Foundation Model to compute chunk embeddings
- Create a Self-Managed Vector Search index and send queries to find similar documents
- Build an advanced LangChain model with chat history, leveraging the Databricks Foundation Model DBRX Instruct
- Ask external experts to test and review your chatbot with the Mosaic AI Agent Evaluation review app (an evaluation sketch follows this list)
- Run online LLM evaluation and track your metrics with Databricks Monitoring
- Deploy a front-end application using Databricks Lakehouse Apps
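For the evaluation step referenced above, Mosaic AI Agent Evaluation plugs into `mlflow.evaluate`. A hedged sketch, assuming the `databricks-agents` package is installed and using illustrative evaluation rows:

```python
import mlflow
import pandas as pd

# Illustrative evaluation dataset: requests, your app's responses, and
# (optionally) reference answers for the built-in LLM judges to score.
eval_df = pd.DataFrame({
    "request": ["How do I create a Vector Search index?"],
    "response": ["Use the VectorSearchClient to create a Delta Sync index..."],
    "expected_response": ["Create a VS endpoint, then a Delta Sync index on your table."],
})

results = mlflow.evaluate(data=eval_df, model_type="databricks-agent")
print(results.metrics)  # judge metrics such as correctness and groundedness
```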

Learn more advanced GenAI concepts: open the [03-advanced-app/01-PDF-Advanced-Data-Preparation]($./03-advanced-app/01-PDF-Advanced-Data-Preparation) notebook.

What's next: LLM Fine Tuning



Discover how to fine-tune your LLMs for your RAG applications: `dbdemos.install('llm-fine-tuning')`