Show Lineage for Delta Tables in Unity Catalog

Unity Catalog captures runtime data lineage for any table-to-table operation executed on a Databricks cluster or SQL warehouse. Lineage works across all languages (SQL, Python, Scala, and R). It can be visualized in Catalog Explorer in near real time and retrieved via the REST API.
Lineage is available at two granularity levels:
- Tables
- Columns: ideal to track GDPR dependencies
Lineage takes into account the table ACLs present in Unity Catalog. If a user is not allowed to see a table at a certain point of the graph, that table's details are redacted, but the user can still see that an upstream or downstream table is present.
Lineage can also include **external assets and workflows** that are run **outside** of Databricks. This external lineage metadata feature is in Public Preview. See [Bring your own data lineage](https://docs.databricks.com/aws/en/data-governance/unity-catalog/external-lineage).
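As a rough illustration of the REST API retrieval mentioned above, the sketch below builds (without sending) a request for the lineage of one table. The endpoint path and the `table_name` / `include_entity_lineage` fields follow the lineage API as documented at the time of writing and should be checked against the current API reference. The host, token, and helper name `build_table_lineage_request` are placeholders introduced here for illustration.

```python
import json
import urllib.request

# Assumed endpoint path for the lineage REST API; confirm it in the API reference.
TABLE_LINEAGE_PATH = "/api/2.0/lineage-tracking/table-lineage"


def build_table_lineage_request(host, token, table_name, include_entity_lineage=True):
    """Build (but do not send) a GET request for the lineage of one table."""
    payload = {
        "table_name": table_name,  # three-level name: catalog.schema.table
        "include_entity_lineage": include_entity_lineage,  # also ask for notebooks, jobs, ...
    }
    return urllib.request.Request(
        url=host.rstrip("/") + TABLE_LINEAGE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="GET",
    )


# Hypothetical workspace URL and token placeholder, not real credentials.
request = build_table_lineage_request(
    host="https://example.cloud.databricks.com",
    token="<personal-access-token>",
    table_name="main.lineage.dinner",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` from a machine that can reach the workspace should return a JSON document describing upstream and downstream objects. Tables the caller is not allowed to see are expected to appear in redacted form, as described above.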
Working with Lineage
No modifications are needed to the existing code to generate lineage. As long as you operate with tables saved in Unity Catalog, Databricks will capture all lineage information for you.
Requirements:
- Source and target tables must be registered in a Unity Catalog metastore.
- External assets (not in the metastore) must be added as external metadata objects and linked to registered securable objects.
- Queries must use Spark DataFrame APIs (e.g., Spark SQL functions returning a DataFrame) or Databricks SQL interfaces (notebooks, SQL query editor).
- To view lineage, users must have at least the `BROWSE` privilege on the parent catalog, and the catalog must be accessible from the workspace.
- Permissions are required on notebooks, jobs, or dashboards as per workspace access control settings.
- For UC-enabled pipelines, users must have `CAN VIEW` permission on the pipeline.
- Streaming lineage between Delta tables requires DBR `11.3 LTS+`.
- Column lineage for Lakeflow Declarative Pipelines requires DBR `13.3 LTS+`.

1/ Create a Delta Table In Unity Catalog
The first step is to create a Delta Table in Unity Catalog.
We want to do that in SQL, to show multi-language support:
1. Use the `CREATE TABLE` command and define a schema
2. Use the `INSERT INTO` command to insert some rows in the table

2/ Create a Delta Table from the Original table
To show dependencies between tables, we create a new table from the previous table `menu` using a `CREATE TABLE AS SELECT` (CTAS) statement that concatenates three columns into a new one.

3/ Create a Delta Table by joining Two Tables
The last step is to create a third table as a join from the two previous ones. This time we will use Python instead of SQL.
- We create a DataFrame with some random data in two columns, `id` and `recipe_id`
- We save this DataFrame as a new table, `main.lineage.price`
- We read the previous two tables, `main.lineage.dinner` and `main.lineage.price`, as two DataFrames
- We join them on `recipe_id` and save the result as a new Delta table, `main.lineage.dinner_price`

4/ Visualize Table Lineage
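The lineage view only has something to show once the tables from steps 1/ to 3/ exist. As a recap, here is a minimal sketch of those three steps, not the demo's actual code. The column types, the sample rows, the extra `price` column, and the helper names `sql_statements` and `build_lineage_demo` are assumptions made for illustration. The table names and the `full_menu` derivation follow the text.

```python
def sql_statements(schema):
    """Steps 1/ and 2/: plain SQL, returned as strings so they can be inspected."""
    # CREATE OR REPLACE keeps the sketch re-runnable without duplicating rows.
    create_menu = (
        f"CREATE OR REPLACE TABLE {schema}.menu "
        "(recipe_id INT, app STRING, main STRING, desert STRING)"
    )
    # Sample rows are made up for illustration.
    insert_menu = (
        f"INSERT INTO {schema}.menu VALUES "
        "(1, 'Ceviche', 'Tacos', 'Flan'), "
        "(2, 'Tomato Soup', 'Souffle', 'Creme Brulee'), "
        "(3, 'Chips', 'Grilled Cheese', 'Cheesecake')"
    )
    # CTAS: `full_menu` is derived from three columns of `menu`.
    create_dinner = (
        f"CREATE OR REPLACE TABLE {schema}.dinner AS "
        "SELECT recipe_id, concat(app, ' + ', main, ' + ', desert) AS full_menu "
        f"FROM {schema}.menu"
    )
    return [create_menu, insert_menu, create_dinner]


def build_lineage_demo(spark, schema="main.lineage"):
    """Run steps 1/ to 3/ with a SparkSession attached to Unity Catalog."""
    for statement in sql_statements(schema):
        spark.sql(statement)

    # Step 3/: build a small price table with the DataFrame API.
    # The `price` column and its random values are illustrative assumptions.
    price = spark.range(1, 4).selectExpr(
        "id", "CAST(id AS INT) AS recipe_id", "round(10 * rand(42), 2) AS price"
    )
    price.write.mode("overwrite").saveAsTable(f"{schema}.price")

    # Read both tables back, join them on recipe_id, and save the result.
    dinner = spark.table(f"{schema}.dinner")
    price = spark.table(f"{schema}.price")
    dinner_price = dinner.join(price, on="recipe_id")
    dinner_price.write.mode("overwrite").saveAsTable(f"{schema}.dinner_price")


# In a Databricks notebook `spark` is predefined; elsewhere this does nothing.
if "spark" in globals():
    build_lineage_demo(spark)
```

Nothing in this sketch is lineage-specific. The SQL statements and the DataFrame writes are ordinary reads and writes against Unity Catalog tables, which is all that lineage capture requires.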
The table lineage can be visualized by following the steps below:
1. Select the `Catalog` explorer icon on the navigation bar to the left.
2. Search for `uc_lineage` in the search tab.
3. Expand the `dbdemos_uc_lineage` schema that is used for this demo.
4. Click the kebab menu on any of the tables `dinner`, `menu`, or `price` under the `dbdemos_uc_lineage` schema.
5. Click the `Open in Catalog Explorer` option.
6. Click the `Lineage` tab.
7. Explore the page, and feel free to click the `See Lineage Graph` option as well.

5/ Visualize Column Lineage
Lineage is also available at the column level, making it useful for tracking column dependencies and ensuring compliance with GDPR standards. Column-level lineage can also be accessed via the API.
You can access column lineage in the `Lineage graph` view by clicking the `+` icon on the edge of the table box and then clicking each column. In this case, the column `full_menu` in the `dinner` table is derived from the three columns `app`, `main`, and `desert` of the `menu` table.
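The same `full_menu` dependency can, in principle, be requested through the API. The sketch below only constructs the request. The endpoint path and the `table_name` / `column_name` fields reflect the column lineage API as documented at the time of writing and should be verified. The host, token, and helper name `build_column_lineage_request` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint path for column-level lineage; confirm it in the API reference.
COLUMN_LINEAGE_PATH = "/api/2.0/lineage-tracking/column-lineage"


def build_column_lineage_request(host, token, table_name, column_name):
    """Build (but do not send) a GET request for the lineage of one column."""
    payload = {"table_name": table_name, "column_name": column_name}
    return urllib.request.Request(
        url=host.rstrip("/") + COLUMN_LINEAGE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="GET",
    )


# Hypothetical workspace URL and token placeholder, not real credentials.
column_request = build_column_lineage_request(
    host="https://example.cloud.databricks.com",
    token="<personal-access-token>",
    table_name="main.lineage.dinner",
    column_name="full_menu",
)
print(column_request.full_url)
print(column_request.data.decode("utf-8"))
```

For the demo tables, a successful call would be expected to list `app`, `main`, and `desert` from `menu` among the upstream columns of `full_menu`.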

6/ Lineage Permission Model
Lineage graphs share the same permission model as Unity Catalog. If a user does not have the `BROWSE` or `SELECT` privilege on a table, they cannot explore its lineage.
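To make the privilege requirement concrete, here is a small sketch that generates the `GRANT` statements an administrator might run so that a principal can see a table in lineage views. The helper name `lineage_grants` and the `analysts` group are hypothetical. Whether `BROWSE` alone is enough or `SELECT` is also wanted depends on your governance policy.

```python
def lineage_grants(principal, table, read_data=False):
    """Return GRANT statements letting `principal` see `table` in lineage views.

    `table` is a three-level name (catalog.schema.table). With read_data=False
    only metadata browsing is granted; with read_data=True the principal can
    also query the table.
    """
    catalog, schema, _ = table.split(".")
    who = f"`{principal}`"  # backticks, since principals often contain '@' or spaces
    grants = [f"GRANT BROWSE ON CATALOG {catalog} TO {who}"]
    if read_data:
        grants += [
            f"GRANT USE CATALOG ON CATALOG {catalog} TO {who}",
            f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {who}",
            f"GRANT SELECT ON TABLE {table} TO {who}",
        ]
    return grants


for statement in lineage_grants("analysts", "main.lineage.dinner", read_data=True):
    print(statement)
```

Each printed statement can be run in a SQL cell or passed to `spark.sql` by a user who is allowed to grant privileges on those objects.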
Lineage graphs display Unity Catalog objects across all workspaces attached to the metastore, as long as the user has adequate object permissions.

Conclusion
Databricks Unity Catalog lets you track data lineage out of the box.
No extra setup is required: just read from and write to your tables, and the engine builds the dependencies for you. Lineage works at the table level and at the column level, which provides a powerful tool for tracking dependencies on sensitive data.
Lineage can also show you the potential impact of updating a table or column and help you find who will be affected downstream.

Existing Limitations
Review the data lineage documentation [[AWS](https://docs.databricks.com/aws/en/data-governance/unity-catalog/data-lineage#lineage-limitations), [GCP](https://docs.databricks.com/gcp/en/data-governance/unity-catalog/data-lineage), [Azure](https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/data-lineage)] for the latest limitations.