Unity Catalog Spice OSS Integration

Spice OSS is a unified SQL query interface and portable runtime to locally materialize, accelerate, and query datasets across databases, data warehouses, and data lakes.

Unity Catalog and Databricks Unity Catalog can be used with Spice OSS as Catalog Connectors to make catalog tables available for query in Spice.

Follow the guide below to connect to open-source Unity Catalog. See the Databricks Unity Catalog quickstart if using Databricks Unity Catalog.

Prerequisites

  • Spice OSS installed: To install Spice, see Spice OSS Installation.
  • Access to an open-source Unity Catalog server with at least one table.
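
One way to confirm the second prerequisite is to ask the Unity Catalog server for its tables over REST. The sketch below assumes a local open-source server at http://localhost:8080 with a catalog named unity and a schema named default; all three values are placeholders rather than values from this guide. It also assumes the server exposes a table-listing endpoint under the same /api/2.1/unity-catalog prefix used in the connector URL. The uc_tables_url helper is hypothetical and only builds the request URL.

```shell
# Placeholder connection details; override them for your own server.
UC_HOST="${UC_HOST:-http://localhost:8080}"
UC_CATALOG="${UC_CATALOG:-unity}"
UC_SCHEMA="${UC_SCHEMA:-default}"

# Build the table-listing URL for a given host, catalog, and schema.
uc_tables_url() {
  printf '%s/api/2.1/unity-catalog/tables?catalog_name=%s&schema_name=%s\n' "$1" "$2" "$3"
}

# Request the table list; print a hint instead of failing if the server is unreachable.
curl -fsS --max-time 10 "$(uc_tables_url "$UC_HOST" "$UC_CATALOG" "$UC_SCHEMA")" \
  || echo "Could not list tables from $UC_HOST"
```

If the response contains at least one table entry, the server is ready for the steps that follow.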

Step 1. Create a new directory and initialize a Spicepod

mkdir uc_quickstart
cd uc_quickstart
spice init

Step 2. Add the Unity Catalog Connector

Configure the spicepod.yaml with:

catalogs:
  - from: unity_catalog:https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name>
    name: uc_quickstart
    params:
      # Configure the object store credentials here

The Unity Catalog connector currently supports Delta Lake tables only and requires object store credentials.

Step 3. Configure object store credentials for Delta Lake tables

AWS S3

params:
  unity_catalog_aws_access_key_id: ${env:AWS_ACCESS_KEY_ID}
  unity_catalog_aws_secret_access_key: ${env:AWS_SECRET_ACCESS_KEY}
  unity_catalog_aws_region: <region> # E.g. us-east-1, us-west-2
  unity_catalog_aws_endpoint: <endpoint> # If using an S3-compatible service, like MinIO

Set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables to the AWS access key and secret key, respectively.
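
As a minimal sketch, the variables can be exported in the same shell session that will later start the runtime, so that the ${env:...} references above can resolve. The values shown are placeholders, not working credentials, and the loop is only a convenience for confirming that both variables are set without printing their contents.

```shell
# Placeholder values for illustration only; substitute real credentials.
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"

# Report whether each variable is set, without echoing the secret itself.
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
  if [ -n "$(printenv "$var")" ]; then
    echo "$var is set"
  else
    echo "$var is missing"
  fi
done
```

The same pattern applies to the Azure variables in the next section.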

Azure Storage

params:
  unity_catalog_azure_storage_account_name: ${env:AZURE_ACCOUNT_NAME}
  unity_catalog_azure_account_key: ${env:AZURE_ACCOUNT_KEY}

Set the AZURE_ACCOUNT_NAME and AZURE_ACCOUNT_KEY environment variables to the Azure storage account name and account key, respectively.

Google Cloud Storage

params:
  unity_catalog_google_service_account: </path/to/service-account.json>

Example Delta Lake Spicepod

version: v1beta1
kind: Spicepod
name: uc_quickstart

catalogs:
  - from: unity_catalog:https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name>
    name: uc_quickstart
    params:
      # delta_lake S3 parameters
      unity_catalog_aws_region: us-west-2
      unity_catalog_aws_access_key_id: ${secrets:aws_access_key_id}
      unity_catalog_aws_secret_access_key: ${secrets:aws_secret_access_key}
      unity_catalog_aws_endpoint: s3.us-west-2.amazonaws.com
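
Before starting the runtime, it can help to check that none of the angle-bracket placeholders used in this guide (such as <unity_catalog_host> or <catalog_name>) are still present in the file. The helper below is a hypothetical convenience: check_spicepod is not part of the Spice CLI, and it only scans the text of the file without validating the Spicepod itself.

```shell
# Scan a Spicepod file for leftover <placeholder> tokens.
# Returns 0 if none are found, 1 if any remain, 2 if the file is missing.
check_spicepod() {
  file="$1"
  if [ ! -f "$file" ]; then
    echo "$file not found"
    return 2
  fi
  # grep -n prints each matching line with its line number.
  if grep -nE '<[A-Za-z_/.-]+>' "$file"; then
    echo "Replace the placeholders listed above before starting the runtime."
    return 1
  fi
  echo "No unresolved placeholders found in $file."
}

# Run the check in the current directory; ignore the status so the shell keeps going.
check_spicepod spicepod.yaml || true
```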

Step 4. Start the Spice runtime and show the available tables

Once the spicepod.yaml is configured, start the Spice runtime:

spice run

In a separate terminal, run the Spice SQL REPL:

spice sql

In the REPL, show the available tables:

SHOW TABLES;
+---------------+--------------+---------------+------------+
| table_catalog | table_schema | table_name    | table_type |
+---------------+--------------+---------------+------------+
| spice         | runtime      | metrics       | BASE TABLE |
| spice         | runtime      | task_history  | BASE TABLE |
| uc_quickstart | default      | taxi_trips    | BASE TABLE |
+---------------+--------------+---------------+------------+

Step 5. Query a dataset

In the SQL REPL, run a query. For example:

-- SELECT * FROM uc_quickstart.<SCHEMA_NAME>.<TABLE_NAME> LIMIT 5;
sql> SELECT fare_amount FROM uc_quickstart.default.taxi_trips LIMIT 5;
+-------------+
| fare_amount |
+-------------+
| 11.4        |
| 13.5        |
| 11.4        |
| 27.5        |
| 18.4        |
+-------------+

Time: 1.4579897499999999 seconds. 5 rows.
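
The same query can also be sent without the REPL, for example from a script. The sketch below assumes the runtime's HTTP endpoint is listening locally on port 8090 and accepts SQL text posted to /v1/sql. Both are assumptions about a default local setup rather than details given in this guide, so adjust SPICE_HTTP if your runtime is configured differently.

```shell
# Assumed HTTP address of a locally running Spice runtime.
SPICE_HTTP="${SPICE_HTTP:-http://localhost:8090}"
QUERY='SELECT fare_amount FROM uc_quickstart.default.taxi_trips LIMIT 5'

# POST the SQL text; print a hint instead of failing if the runtime is unreachable.
curl -fsS --max-time 10 -X POST "$SPICE_HTTP/v1/sql" --data "$QUERY" \
  || echo "Query failed: is 'spice run' active at $SPICE_HTTP?"
```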

More information