(nbs:fm_api_openai_sdk)=
# Querying the Foundation Model API with the OpenAI Python SDK

The Databricks Foundation Model API is partially compatible with the OpenAI API, so code written against OpenAI can be pointed at the Databricks Foundation Model API without significant changes. All we need to do is set two environment variables:

- we set `OPENAI_BASE_URL` to `<databricks_workspace_url>/serving-endpoints`
- we set `OPENAI_API_KEY` to a [Databricks PAT](https://docs.databricks.com/en/dev-tools/auth/pat.html) (personal access token)

After defining these variables, we can use the OpenAI Python client to call models served through the Databricks Foundation Model API.

While we generally recommend using the [Foundation Model API Python SDK](https://docs.databricks.com/en/machine-learning/foundation-models/query-foundation-model-apis.html#how-to-query-foundation-model-apis-with-the-python-sdk), the OpenAI client provides a simple way to test open source foundation models in codebases that already call OpenAI models through the OpenAI Python client.

In [None]:
%pip install --upgrade openai databricks-genai-inference
dbutils.library.restartPython()

## Configure the OpenAI client

As mentioned above, to use the Foundation Model API with the OpenAI Python client, we have to set the `OPENAI_BASE_URL` and `OPENAI_API_KEY` environment variables.

In [None]:
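import os

from openai import OpenAI

# Point the OpenAI client at this workspace's serving endpoints.
os.environ["OPENAI_BASE_URL"] = (
    "https://" + spark.conf.get("spark.databricks.workspaceUrl") + "/serving-endpoints/"
)

# Use the API token from the current notebook context.
os.environ["OPENAI_API_KEY"] = (
    dbutils.notebook.entry_point.getDbutils()
    .notebook()
    .getContext()
    .apiToken()
    .getOrElse(None)
)

# With no arguments, the client reads both environment variables.
client = OpenAI()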
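
If setting process-wide environment variables is undesirable, the same two values can be passed to the client constructor instead, since `base_url` and `api_key` are keyword arguments of `OpenAI`. The following is a minimal sketch under that assumption: `build_client_settings` and `make_client` are hypothetical helper names, and the workspace host and token in the usage lines are placeholders, not real values.

```python
def build_client_settings(workspace_url: str, token: str) -> dict:
    """Return keyword arguments for OpenAI(...) (hypothetical helper)."""
    if not token:
        # Assumption: fail early instead of sending unauthenticated requests.
        raise ValueError("A Databricks access token is required")
    host = workspace_url.strip().rstrip("/")
    # Accept either a bare host name or a full URL.
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    return {"base_url": host + "/serving-endpoints", "api_key": token}


def make_client(workspace_url: str, token: str):
    """Build an OpenAI client without modifying os.environ."""
    # Imported lazily so build_client_settings has no third-party dependency.
    from openai import OpenAI

    return OpenAI(**build_client_settings(workspace_url, token))


# Placeholder values for illustration only.
settings = build_client_settings("example.cloud.databricks.com/", "placeholder-token")
print(settings["base_url"])
```

In a notebook, `make_client` could be called with the workspace URL and token obtained in the cell above.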

## Chat Completions with DBRX

With the client configured, we can call any supported Foundation Model API model by name through the OpenAI client. Here, for example, we generate a chat completion with the `databricks-dbrx-instruct` model.

In [None]:
chat_completion = client.chat.completions.create(
    model="databricks-dbrx-instruct",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant",
        },
        {
            "role": "user",
            "content": "What is the relationship between Delta Lake and Parquet?",
        },
    ],
)

print(chat_completion.choices[0].message.content)

1. Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark and big data workloads. It provides schema enforcement, scalable metadata handling, and unifies streaming and batch data processing. Delta Lake is built on top of Apache Parquet, a columnar storage file format.

2. Parquet is a columnar storage file format that is optimized for use with big data processing frameworks like Apache Spark and Apache Hive. It is designed to handle complex data processing tasks and perform well with large datasets.

3. Delta Lake uses the Parquet file format to store data, taking advantage of its columnar storage and performance optimizations. Delta Lake builds on top of Parquet by adding features such as ACID transactions, scalable metadata handling, and unified batch and streaming processing.

4. Delta Lake can be thought of as an extension to Parquet that adds transactional capabilities and other features that are useful for managing large datasets in a distributed
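
The response printed above stops mid-sentence. One plausible cause, which this notebook does not confirm, is that generation reached a token limit. In the OpenAI response format each choice carries a `finish_reason`, and the value `"length"` indicates that the output was cut off by the token limit. The sketch below shows one way to check for that; `was_truncated` is a hypothetical helper, and the stand-in object only mimics the attributes of a real response so that the snippet runs without calling the API.

```python
from types import SimpleNamespace


def was_truncated(chat_completion) -> bool:
    """Return True if the first choice stopped because of the token limit."""
    return chat_completion.choices[0].finish_reason == "length"


# Stand-in for a real response, used only to illustrate the check.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="..."),
        )
    ]
)

if was_truncated(fake_response):
    print("Output was cut off; consider a larger max_tokens value.")
```

With a real response, call `was_truncated(chat_completion)`. `max_tokens` is a standard parameter of `client.chat.completions.create`; the sketch assumes the serving endpoint honors it.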

## Embeddings

In this example, we embed the input text using the `databricks-bge-large-en` model.

In [None]:
client.embeddings.create(
    input="Your text string goes here", model="databricks-bge-large-en"
)

CreateEmbeddingResponse(data=[Embedding(embedding=[0.0018854141235351562, -0.005176544189453125, 0.0297088623046875, 0.0225982666015625, -0.006092071533203125, -0.04730224609375, 0.0394287109375, 0.0119476318359375, 0.045257568359375, 0.0215606689453125, -0.003971099853515625, 0.0191192626953125, -0.0080413818359375, -0.00624847412109375, -0.01488494873046875, -0.03778076171875, 0.0153350830078125, -0.0209503173828125, -0.056610107421875, 0.014892578125, -0.046051025390625, 0.0170135498046875, -0.1043701171875, -0.01270294189453125, -0.024322509765625, 0.0065155029296875, 0.022003173828125, -0.00426483154296875, 0.040252685546875, 0.0408935546875, -0.02923583984375, 0.046661376953125, 0.01485443115234375, -0.0196380615234375, -0.010833740234375, -0.010833740234375, 0.031402587890625, 0.0009326934814453125, -0.050537109375, -0.01299285888671875, -0.01013946533203125, -0.0038661956787109375, -0.0121612548828125, -0.01015472412109375, -0.049072265625, -0.0206298828125, -0.0065574645996093
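
The returned object holds one `Embedding` per input, and the vector itself is available as `response.data[0].embedding`. A common next step is to compare two embeddings, so the sketch below computes cosine similarity in plain Python. `cosine_similarity` is a hypothetical helper, and the short vectors are made-up stand-ins rather than real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Made-up vectors; with real responses use response.data[0].embedding.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 0.0]
print(round(cosine_similarity(v1, v2), 4))
```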

## Completions

In this example, we generate a text completion with the `databricks-mpt-30b-instruct` model.

In [None]:
client.completions.create(
    model="databricks-mpt-30b-instruct",
    prompt="Delta Lake offers ACID transactions. ACID means ",
)

Completion(id='d69e7502-ef5d-4aab-98eb-b4e34e3929ad', choices=[CompletionChoice(finish_reason='stop', index=0, logprobs=None, text='Atomicity, Consistency, Isolation, Durability')], created=None, model='mpt-30b-instruct', object='text_completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=13, prompt_tokens=38, total_tokens=51))
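
In practice we usually want only the generated text and the token counts from this object. The sketch below reads them by attribute; `summarize_completion` is a hypothetical helper, and the stand-in object copies the field values from the output above so that the snippet runs without calling the API.

```python
from types import SimpleNamespace


def summarize_completion(completion) -> dict:
    """Collect the generated text and token usage of the first choice."""
    choice = completion.choices[0]
    return {
        "text": choice.text,
        "finish_reason": choice.finish_reason,
        "total_tokens": completion.usage.total_tokens,
    }


# Stand-in that mirrors the fields shown in the output above.
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(
            text="Atomicity, Consistency, Isolation, Durability",
            finish_reason="stop",
        )
    ],
    usage=SimpleNamespace(completion_tokens=13, prompt_tokens=38, total_tokens=51),
)

print(summarize_completion(fake_completion))
```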

## Conclusion

You now know how to query the Databricks Foundation Model API for chat completions, embeddings, and text completions using the OpenAI Python client. This provides a straightforward way to test whether you can replace OpenAI models with open source models from the Databricks Foundation Model API in your generative AI projects.