Leveraging the Databricks Lakehouse & system tables to forecast your billing





As your billing information is saved in system tables, it's easy to leverage the lakehouse capabilities to forecast your consumption and add alerts.

In this notebook, we'll run some analysis on the current consumption and build a Prophet model to forecast future usage, broken down by SKU and workspace.

Because each workspace can have a different pattern / trend, we'll train a dedicated model for each SKU and workspace, and train them in parallel.


Note: refresh your forecast data every day





Make sure you refresh your forecast every day to get accurate predictions.

To do that, simply click on Schedule and select "Every Day" to create a new Workflow refreshing this notebook on a daily basis.

If you don't, your forecast will quickly become stale and won't reflect consumption changes or trigger alerts.

To make sure this happens, we also added a tracker in the Databricks SQL dashboard to display the number of days since the last forecast.

A note on pricing tables


Note that the pricing table (containing the list price in `$` for each SKU) is available as a system table.

**Please consider these numbers as estimates that do not include any add-ons or discounts. They use list prices, not your contractual rates. Please review your contract for more accurate information.**
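As a sketch of how the list price can be applied, the query below joins the usage system table with the pricing system table. Column names such as `usage_quantity` and the `pricing.default` field follow the published system-table schemas, but verify them against your workspace (and add `cloud`/`currency_code` filters if you operate in several regions):

```python
# Hedged sketch: estimate $ cost by joining usage with list prices.
# The query is built as a string so it can be reviewed before running.
pricing_query = """
SELECT u.workspace_id, u.sku_name, u.usage_date,
       SUM(u.usage_quantity * lp.pricing.default) AS estimated_cost
FROM system.billing.usage u
JOIN system.billing.list_prices lp
  ON u.sku_name = lp.sku_name
 AND u.usage_start_time >= lp.price_start_time
 AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
GROUP BY ALL"""

# In the notebook you would run:
# display(spark.sql(pricing_query))
print(pricing_query)
```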

Granular Forecasts



SKUs are detailed per line of product and per region. To simplify our billing dashboard, we'll merge them under common SKU types, e.g. grouping `STANDARD_ALL_PURPOSE_COMPUTE` and `PREMIUM_ALL_PURPOSE_COMPUTE` as `ALL_PURPOSE`.
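A minimal way to express this grouping is a small Python function that strips the tier prefix and keeps the product family. The list of groups below is an assumption based on common Databricks SKU names; adapt it to the SKUs you actually see (the same logic can also be written as a SQL `CASE` expression or registered as a UDF):

```python
import re

def sku_group(sku_name: str) -> str:
    """Map a detailed SKU name to a coarse SKU group for dashboarding.
    Group names below are assumptions; extend as needed."""
    # Drop the pricing-tier prefix (STANDARD_, PREMIUM_, ENTERPRISE_)
    name = re.sub(r"^(STANDARD|PREMIUM|ENTERPRISE)_", "", sku_name)
    for group in ("ALL_PURPOSE", "JOBS", "DLT", "SQL", "MODEL_INFERENCE"):
        if name.startswith(group):
            return group
    return "OTHER"

print(sku_group("PREMIUM_ALL_PURPOSE_COMPUTE"))  # ALL_PURPOSE
```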

Leveraging Databricks AI_FORECAST function



Databricks provides a built-in AI Forecast capability. See the [AI_FORECAST documentation](https://docs.databricks.com/en/sql/language-manual/functions/ai_forecast.html) for more details.

**Note that this might require the preview to be enabled for your workspace. If the preview isn't enabled, we show you below how to do the forecast using Prophet in Python.**

**Make sure you run these next cells using a SQL warehouse as compute, not a classic cluster, as the AI_FORECAST preview is currently only available on serverless compute.**

*Note: If `AI_FORECAST` isn't yet available in your workspace, you can skip to the next section, where we show you how to do the same in Python.*

Let's now leverage the `AI_FORECAST` function to forecast the future usage cost for each SKU/workspace ID.
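The shape of such a query is sketched below, built as a Python string so it can be reviewed before running. The input table name `billing_forecast_input` and its columns are assumptions for illustration; the `horizon`, `time_col`, `value_col`, and `group_col` parameters follow the `AI_FORECAST` documentation linked above, but check it for the exact signature available in your workspace:

```python
# Hedged sketch of an AI_FORECAST call over daily billing data,
# grouped by workspace and SKU (table/column names are assumptions).
horizon = "2025-01-01"

query = f"""
SELECT * FROM AI_FORECAST(
  TABLE(
    SELECT workspace_id, sku, usage_date, SUM(usage_cost) AS usage_cost
    FROM billing_forecast_input
    GROUP BY ALL
  ),
  horizon   => '{horizon}',
  time_col  => 'usage_date',
  value_col => 'usage_cost',
  group_col => ARRAY('workspace_id', 'sku')
)"""

# On a SQL warehouse you would run:
# display(spark.sql(query))
print(query)
```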

Manual forecast leveraging Prophet and pandas UDFs with Spark



If the AI_FORECAST function isn't available yet in your workspace, you can do the prediction manually in Python with Prophet.

**Note: if you already ran the AI_FORECAST function, you can skip these next steps**
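The parallel-training pattern relies on Spark running one Python function per (workspace, SKU) group. The sketch below shows that structure; the Prophet fit itself is replaced by a rolling-mean placeholder so the grouping logic stays clear, and the table/column names are assumptions:

```python
import pandas as pd

def forecast_group(pdf: pd.DataFrame) -> pd.DataFrame:
    """Runs once per (workspace_id, sku) group; receives a pandas DataFrame.
    The rolling mean below is a placeholder for the Prophet fit."""
    pdf = pdf.sort_values("usage_date")
    # ... fit Prophet on pdf[["usage_date", "usage_cost"]] and predict here ...
    out = pdf[["workspace_id", "sku", "usage_date", "usage_cost"]].copy()
    out["forecast"] = pdf["usage_cost"].rolling(7, min_periods=1).mean()
    return out

# On Databricks, Spark dispatches forecast_group to each group in parallel:
# forecasts = (spark.table("billing_day")
#     .groupBy("workspace_id", "sku")
#     .applyInPandas(forecast_group,
#         schema="workspace_id string, sku string, usage_date date, "
#                "usage_cost double, forecast double"))
```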

Using Prophet to forecast our billing data



Once aggregated at a daily level, billing information can be forecast as a time series. For this demo, we'll use a simple Prophet model to extend the time series into the next quarter.
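The two steps — daily aggregation, then extending the series roughly one quarter ahead — can be sketched with plain pandas. The data here is synthetic and the projection uses a linear trend as a stand-in for Prophet (the actual Prophet call is shown in comments), so treat this as an illustration of the shape of the pipeline, not the model itself:

```python
import numpy as np
import pandas as pd

# Synthetic usage data standing in for the aggregated billing table
usage = pd.DataFrame({
    "usage_time": pd.date_range("2024-01-01", periods=180, freq="D"),
    "usage_cost": np.linspace(100, 150, 180)
                  + np.random.default_rng(0).normal(0, 2, 180),
})

# Step 1: aggregate to a daily series
daily = usage.set_index("usage_time")["usage_cost"].resample("D").sum()

# Step 2: project ~90 days (one quarter) ahead with a linear trend
x = np.arange(len(daily))
slope, intercept = np.polyfit(x, daily.values, 1)
future_x = np.arange(len(daily), len(daily) + 90)
forecast = pd.Series(
    slope * future_x + intercept,
    index=pd.date_range(daily.index[-1] + pd.Timedelta(days=1), periods=90),
)

# With Prophet, the same series would be fitted via:
# from prophet import Prophet
# df = daily.reset_index().rename(columns={"usage_time": "ds", "usage_cost": "y"})
# m = Prophet().fit(df)
```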

Your forecast tables are ready for BI





dbdemos installed a dashboard for you. Open it to start exploring your billing data.

Remember that if you changed your Catalog and Schema, you'll have to update the queries.

Exploring our forecasting data from the notebook



Our forecasting data is ready! We can explore it directly within the notebook, using Databricks built-in widgets or any Python plotting library:
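For instance, a minimal matplotlib chart of actuals vs. forecast could look like the sketch below. The data is synthetic here; in the notebook you would load your forecast table with something like `spark.table(...).toPandas()` (table name depending on your catalog/schema):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the forecast table
dates = pd.date_range("2024-01-01", periods=120, freq="D")
df = pd.DataFrame({
    "date": dates,
    "actual": np.linspace(100, 140, 120),
    "forecast": np.linspace(101, 141, 120),
})

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(df["date"], df["actual"], label="actual cost ($)")
ax.plot(df["date"], df["forecast"], linestyle="--", label="forecast")
ax.set_ylabel("Daily cost ($)")
ax.legend()
# In a Databricks notebook, display(fig) or plt.show() renders the chart inline.
```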