
Your LakeFlow Gateway Is Burning Azure Credits — How to Attach a Compute Policy and Stop It

Melody Egwuchukwu · April 2026

You are building a LakeFlow Connect pipeline. CDC ingestion is humming, the gateway is running, and everything looks fine until you check your Azure cost report.

The culprit is the ingestion gateway, quietly autoscaling to five workers in the background and burning compute credits around the clock, whether data is flowing or not. There is no alert. No warning in the UI.

The fix is a compute policy that caps the gateway at one worker. What makes this annoying is that you cannot set it through the Databricks UI — there is no policy field on the gateway configuration screen. You need the Databricks CLI. This post covers the setup end-to-end, including specific errors that tripped me up and how to fix them.

What is an ingestion gateway, and why does the compute matter?

When you use LakeFlow Connect to pull data from Azure SQL into Databricks, two things run simultaneously: an ingestion gateway, which connects to the source database and continuously captures change data, and an ingestion pipeline, which applies those changes to your destination tables.

The gateway is the expensive one. By default, it can autoscale up to five workers. For a project with a small dataset, you need exactly one worker. Attaching a compute policy enforces that cap.

What you need before you start

A LakeFlow Connect ingestion gateway already created, permission to create compute policies in your workspace, and a Mac with Homebrew for the CLI install.

Step 1

Create the compute policy in the Databricks UI

Go to: Compute → Policies → Create Policy

Name it minimal_compute_policy and paste this JSON into the definition box:

{
  "num_workers": {
    "type": "fixed",
    "value": 1
  },
  "driver_node_type_id": {
    "type": "allowlist",
    "values": ["Standard_E4d_v4"]
  },
  "node_type_id": {
    "type": "allowlist",
    "values": ["Standard_F4s"]
  }
}

Save the policy. Click into it and copy the policy ID from the URL or the details panel; you will need it shortly.
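If you would rather stay in the terminal, the same policy can be created with the CLI's cluster-policies group (the one Step 5 uses to list policies). This is a sketch assuming the new CLI's --json flag; note that the definition field is the policy JSON wrapped as a string, so its inner quotes must be escaped:

```shell
# Optional sketch: create the policy from the CLI instead of the UI.
# Assumes the new Databricks CLI (v0.2xx) is installed and configured.
# "definition" is the policy JSON escaped into a string; abbreviated here
# to the num_workers cap for readability.
databricks cluster-policies create --json '{
  "name": "minimal_compute_policy",
  "definition": "{\"num_workers\": {\"type\": \"fixed\", \"value\": 1}, \"node_type_id\": {\"type\": \"allowlist\", \"values\": [\"Standard_F4s\"]}}"
}'
```

The response includes the new policy_id, which saves you digging it out of the UI later.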

Step 2

Install the Databricks CLI (Mac)

There are two versions of the Databricks CLI with completely different command structures. The old one — the databricks-cli pip package, v0.18.x — is deprecated, but if you installed it inside a Python virtual environment it is probably still sitting in your $PATH and silently causing issues.

Install the new CLI via Homebrew:

brew tap databricks/tap
brew install databricks

Verify the right version is running:

databricks --version

You should see Databricks CLI v0.2xx. If you see both versions in the output:

Databricks CLI v0.296.0 found at /opt/homebrew/bin/databricks
Your current $PATH prefers running CLI v0.18.0 at /Users/yourname/project/.venv/bin/databricks
💡 Do not panic. The new CLI takes over even when the old one sits higher in your PATH, so v0.296.0 runs automatically. You can ignore the warning, or remove the old package with pip uninstall databricks-cli.
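To see exactly which binaries are competing, list every databricks executable on your PATH; the first result is the one your shell runs when you type the bare command:

```shell
# List all `databricks` executables on PATH, in resolution order.
# The top entry is what runs when you type `databricks` with no path.
which -a databricks
```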
Step 3

Configure the CLI

databricks configure

You will be prompted for two things.

Host — your Databricks workspace URL. Copy it from your browser:

https://adb-123*************.azuredatabricks.net

Do not include anything after .net.

Token — generate a personal access token in Databricks: Profile icon (top right) → Settings → Developer → Access Tokens → Generate new token. Name it something like cli-project and set a 90-day expiry. Copy it immediately — you cannot see it again after closing the dialog.

Verify the connection works:

databricks workspace list /

If you see your workspace folders listed, you are connected. If you get an authentication error, regenerate your token and run databricks configure again.

Step 4

Stop your ingestion gateway

Go to Databricks UI → Jobs & Pipelines → find your ingestion gateway → click Stop.

🛑 Do not skip this. Updating a running pipeline will either fail silently or produce unexpected behaviour.

Step 5

Get your pipeline ID and policy ID

List your pipelines:

databricks pipelines list-pipelines

Find your ingestion gateway in the output and copy its pipeline_id. It looks like a UUID: 22****-af58-****-b101-*********.

List your policies:

databricks cluster-policies list

Find minimal_compute_policy and copy its policy_id.
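Eyeballing long IDs out of table output is error-prone. A sketch of filtering the gateway's pipeline_id programmatically: the echo below stands in for the real JSON output (pipe `databricks pipelines list-pipelines -o json` into the python3 step instead; the IDs here are made up):

```shell
# Sketch: extract the gateway's pipeline_id instead of copying by hand.
# The echo'd sample stands in for `databricks pipelines list-pipelines -o json`;
# the IDs below are placeholders, not real ones.
echo '[{"pipeline_id": "11111111-aaaa-2222-bbbb-333333333333", "name": "gw_ingestion"},
       {"pipeline_id": "44444444-cccc-5555-dddd-666666666666", "name": "silver_transform"}]' |
python3 -c '
import json, sys
pipelines = json.load(sys.stdin)
# Match on the gateway name used in this post
for p in pipelines:
    if p["name"] == "gw_ingestion":
        print(p["pipeline_id"])
'
```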

Step 6

Create your settings JSON file

{
  "id": "YOUR_GATEWAY_PIPELINE_ID",
  "name": "gw_ingestion",
  "catalog": "YOUR_CATALOG_NAME",
  "schema": "00_landing",
  "continuous": true,
  "gateway_definition": {
    "connection_name": "YOUR_CONNECTION_NAME",
    "gateway_storage_catalog": "YOUR_CATALOG_NAME",
    "gateway_storage_schema": "00_landing"
  },
  "clusters": [
    {
      "policy_id": "YOUR_POLICY_ID"
    }
  ]
}

Replace:

- YOUR_GATEWAY_PIPELINE_ID: the pipeline ID from Step 5
- YOUR_CATALOG_NAME: your Unity Catalog name (it appears twice)
- YOUR_CONNECTION_NAME: the connection your gateway uses
- YOUR_POLICY_ID: the policy ID from Step 5

The remaining fields (name, schema, continuous) should match your existing gateway configuration.

Create the file inside your project folder:

nano /your/project/path/gateway_policy.json
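Before moving on, it is worth confirming the file actually parses. A stray trailing comma here surfaces later as a confusing CLI error rather than a clear JSON one:

```shell
# Validate the settings file parses as JSON before sending it.
# Prints the pretty-printed JSON on success, a parse error otherwise.
python3 -m json.tool /your/project/path/gateway_policy.json
```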
Step 7

Run the update command

databricks pipelines update YOUR_GATEWAY_PIPELINE_ID --json @/your/project/path/gateway_policy.json

⚠️ The @ symbol before the file path is not optional. Without it, the CLI tries to parse the file path itself as JSON and throws this error:

Error: error decoding JSON at (inline):1:1: invalid character '/' looking for beginning of value

The @ tells the CLI to read the content from the file at that path, not treat the path itself as the JSON value.

Step 8

Verify the policy attached

databricks pipelines get YOUR_GATEWAY_PIPELINE_ID

In the output, look inside the spec section for a clusters array containing your policy_id. If it is there, the policy is attached.
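If the output is long, you can check the clusters array programmatically. The echo'd sample below stands in for the real output of `databricks pipelines get YOUR_GATEWAY_PIPELINE_ID -o json` (the policy ID is a placeholder):

```shell
# Sketch: confirm a policy_id is attached in the pipeline spec.
# The echo'd sample stands in for the real `pipelines get` JSON output.
echo '{"spec": {"clusters": [{"policy_id": "ABC123DEF456"}]}}' |
python3 -c '
import json, sys
spec = json.load(sys.stdin)["spec"]
clusters = spec.get("clusters", [])
# A missing or empty clusters array means the update did not apply
print(clusters[0]["policy_id"] if clusters else "NO POLICY ATTACHED")
'
```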

💡 If the spec section has no clusters field at all, the update did not apply correctly. Check your JSON for typos in field names and run the update again.

Step 9

Start the gateway and confirm

Go back to Databricks UI → Jobs & Pipelines → your ingestion gateway → Start it.

While it is initialising (this takes 2–5 minutes on Azure), go to Compute in the left sidebar. You should see a cluster spinning up with 1 worker (not 5) and minimal_compute_policy attached.

If the gateway gets stuck on “Waiting for resources” for more than 10 minutes, Azure may not have capacity for Standard_E4d_v4 or Standard_F4s in your region. Edit the policy and try Standard_D4s_v3 as an alternative.

Once the gateway starts and you see one worker spinning up with the policy attached, that is it. You have capped the compute, stopped the credit bleed, and the pipeline behaviour is unchanged.

The two things worth remembering from this setup: you cannot do any of this from the UI, and the gateway must be stopped before you run the update command.

If the gateway gets stuck or the policy does not appear in the spec output, do not restart blindly. Run databricks pipelines get first, confirm what actually applied, then fix and rerun the update before starting again.


#databricks #lakeflow-connect #databricks-cli #azure #data-engineering #cdc
Melody Egwuchukwu
Cloud Data Engineer. Writing about data pipelines, cloud architecture, and the things that trip you up in production.