Databricks

Overview

Installing the Granulate gAgent on Databricks is designed to be plug-and-play. The installation deploys the Granulate agent on Databricks cluster nodes by adding Granulate's bash CLI installation to the Databricks cluster-scoped init scripts.

Installation

The Granulate gAgent can be added to a Databricks cluster's cluster-scoped init scripts either directly from the cluster's DBFS or from an S3 bucket.

Add the init script to a DBFS directory

  1. Create the DBFS directory in which the init script will be stored by running:

dbutils.fs.mkdirs("dbfs:/databricks/scripts/")

  2. Create the Granulate agent installation script in that directory. Make sure to fill in your customer-specific Download Bucket and Client ID.

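# Write the init script to DBFS; the final True overwrites an existing file.
dbutils.fs.put("dbfs:/databricks/scripts/granulate_run_agents.sh", """#!/bin/bash
set -e
# Replace with your customer-specific Client ID.
export CLIENT_ID="aLU+0EiAhKU8uP7nxR86Zg=="
# Replace <Download Bucket> with your customer-specific download bucket.
curl -s https://s3.amazonaws.com/<Download Bucket>/granulate_run_gagent.sh | sudo \
CLIENT="${CLIENT_ID}" \
SERVICE="databricks-$(echo ${DB_CLUSTER_NAME} | tr ' ' '-' | tr '.' '-')" \
bash
echo "Installation Successful"
exit 0
""", True)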

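Before attaching the script to a cluster, it can be worth confirming that the file landed where expected. The sketch below is one possible check, assuming a Python notebook in which `dbutils` is available. `init_script_preview` is a hypothetical helper name, not part of Granulate or Databricks; it takes the filesystem object as a parameter so that it can also be exercised outside Databricks with a stand-in.

```python
def init_script_preview(fs, path, max_bytes=1024):
    """Return the first bytes of the file at `path`, or None if it cannot be read.

    `fs` is assumed to behave like `dbutils.fs`: it provides
    head(path, max_bytes) and raises an exception for a missing path.
    """
    try:
        return fs.head(path, max_bytes)
    except Exception:
        # A missing or unreadable path is reported as None instead of raising.
        return None
```

In a notebook, `print(init_script_preview(dbutils.fs, "dbfs:/databricks/scripts/granulate_run_agents.sh"))` should show the start of the script, beginning with the `#!/bin/bash` line, while `None` suggests that the file was not written.
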
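Note: The Granulate Service ID is derived automatically from the Databricks environment variable DB_CLUSTER_NAME (the cluster name): spaces and periods are replaced with hyphens, and the result is prefixed with "databricks-".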
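
To preview the Service ID that a given cluster name would produce, the shell pipeline in the script can be mirrored in Python. This is a rough sketch rather than Granulate's own logic: `granulate_service_id` is a hypothetical helper, the cluster name used is made up, and shell glob expansion of the unquoted variable is ignored.

```python
def granulate_service_id(cluster_name: str) -> str:
    """Approximate: databricks-$(echo ${DB_CLUSTER_NAME} | tr ' ' '-' | tr '.' '-')."""
    # The unquoted ${DB_CLUSTER_NAME} is word-split by the shell, so runs of
    # whitespace collapse to one space before tr turns each space into '-'.
    hyphenated = "-".join(cluster_name.split())
    # The second tr replaces every '.' with '-'.
    return "databricks-" + hyphenated.replace(".", "-")


if __name__ == "__main__":
    print(granulate_service_id("ETL Nightly v1.2"))  # databricks-ETL-Nightly-v1-2
```
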

Configure a cluster-scoped init script using the UI

To use the cluster configuration page to configure a cluster to run an init script:

  1. On the cluster configuration page, click the Advanced Options toggle.

  2. At the bottom of the page, click the Init Scripts tab.

  3. In the Destination drop-down, select a destination type. In the example in the preceding section, the destination is DBFS.

  4. Specify the following file path: dbfs:/databricks/scripts/granulate_run_agents.sh
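
For clusters that are defined programmatically rather than through the UI, the same setting can usually be expressed in the cluster specification. The sketch below builds only the relevant fragment, assuming the `init_scripts` field of the Databricks Clusters API with a `dbfs` destination. `init_scripts_fragment` is a hypothetical helper, and the fragment still has to be merged into a complete cluster definition. Supported init script locations have changed across Databricks releases, so it is worth checking the current Databricks documentation before relying on it.

```python
import json


def init_scripts_fragment(script_path: str) -> dict:
    """Build the init_scripts portion of a cluster spec for a DBFS-hosted script."""
    if not script_path.startswith("dbfs:/"):
        raise ValueError("expected a path of the form dbfs:/...")
    return {"init_scripts": [{"dbfs": {"destination": script_path}}]}


if __name__ == "__main__":
    fragment = init_scripts_fragment(
        "dbfs:/databricks/scripts/granulate_run_agents.sh"
    )
    print(json.dumps(fragment, indent=2))
```
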