Using the MLflow Prompt Engineering UI#

This notebook will show how to use the MLflow Prompt Engineering UI with the Databricks Foundation Model API. The prompt engineering UI lets you combine different models, prompts, and parameter configurations and works seamlessly with the Foundation Model API in Databricks.

Getting Started#

Start a New MLflow Run using Prompt Engineering#

From your MLflow experiment page, click “+New Run” and select “using Prompt Engineering” as shown in the image below.

Starting a new MLflow Run

Select a Foundation Model API Model#

Models served by the Foundation Model API have names that start with databricks-. Select a model from the “Served LLM Model” dropdown.
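If you want to check which Foundation Model API endpoints are available in your workspace before picking one from the dropdown, you can list them programmatically. The following is a minimal sketch, assuming you run it in a Databricks notebook (or have DATABRICKS_HOST/DATABRICKS_TOKEN configured); the exact return type of list_endpoints() can vary between MLflow versions.

```python
# Not part of the UI workflow: a minimal sketch for listing the Foundation Model API
# endpoints available in your workspace with the MLflow Deployments client.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

# Foundation Model API endpoints are the serving endpoints whose names start with "databricks-".
for endpoint in client.list_endpoints():
    name = endpoint["name"]
    if name.startswith("databricks-"):
        print(name)
```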

Selecting a Foundation Model API model

Fill Out the Prompt Template#

The Prompt Engineering UI lets you compare how different prompts and different models perform with the same prompt template. In the example below, we simulate a RAG scenario by providing a source text about Delta Lake. We then define a variable called “question”, written as {{ question }}, where user prompts will be inserted. This lets us see how different prompts and models behave with the same template.

Filling out the prompt template

Note that, at this stage, you can also configure model generation parameters, specifically “temperature,” “Max tokens,” and “Stop Sequences.”
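For reference, the sketch below shows roughly what the UI does when it evaluates a prompt: the {{ question }} variable is substituted into the template, and the resulting prompt is sent to the selected endpoint together with the generation parameters. The endpoint name, template text, and parameter values here are illustrative placeholders, not the exact values used by the UI.

```python
# A rough programmatic equivalent of one prompt evaluation (endpoint name, template
# text, and parameter values are placeholders).
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

# Prompt template with a {{ question }} variable, as configured in the UI.
template = (
    "You are a helpful assistant. Answer the question using only the source text below.\n\n"
    "Source text: Delta Lake is an open-source storage framework that brings ACID "
    "transactions to data lakes.\n\n"
    "Question: {{ question }}"
)

question = "What is Delta Lake?"
prompt = template.replace("{{ question }}", question)

# Chat-style Foundation Model API endpoints accept OpenAI-compatible parameters.
response = client.predict(
    endpoint="databricks-meta-llama-3-1-70b-instruct",  # placeholder: use any model from the dropdown
    inputs={
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.1,  # "temperature" in the UI
        "max_tokens": 250,   # "Max tokens" in the UI
        "stop": ["\n\n"],    # "Stop Sequences" in the UI
    },
)
print(response["choices"][0]["message"]["content"])
```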

Create the first Run#

After filling out the prompt template, enter an initial question, click “Evaluate”, and then click “Create run” at the bottom.

Launching the prompt engineering UI

Adding new prompts and new runs#

At this point, you can see a table with one row (for the initial question) and one column (for the initial model). You can add new questions, which will be inserted into the same template, and new models, which you can evaluate on the same questions.

To add a new question, click the “+” button on the left side of the table and enter the question.

To add a new model, click the “+ New run” button on the upper right side of the interface and repeat the process above for the new model. You can also use the new run interface to configure different templates and variables.

Adding new prompts and models

Evaluating cells after adding new prompts or models#

Adding new prompts or models does not automatically evaluate every model/prompt combination. To fill in the missing results, click the “Evaluate” button in an empty cell, or click “Evaluate all” in a column (run) header.
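Each run created in the prompt engineering UI is a regular MLflow run, and its evaluation results are logged as a run artifact (typically named eval_results_table.json). If you want to work with the results outside the UI, a sketch along these lines should work, assuming that artifact name; the experiment name below is a placeholder for your own.

```python
# A minimal sketch for pulling the evaluation results into a pandas DataFrame.
import mlflow

# Placeholder experiment name: replace with the experiment that holds your prompt engineering runs.
runs = mlflow.search_runs(experiment_names=["/Users/you@example.com/prompt-engineering"])

# "eval_results_table.json" is the artifact name the prompt engineering UI typically uses.
results = mlflow.load_table(
    "eval_results_table.json",
    run_ids=runs["run_id"].tolist(),
)
print(results.head())
```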

Evaluating cells