Able to authenticate against Databricks SQL Connector from a Databricks notebook #535

Open
ewengillies opened this issue Mar 12, 2025 · 0 comments


More of a feature request and very similar to this one: #148

It would be nice to be able to run databricks-sql-connector against a warehouse from a Databricks notebook without leaving the notebook's Python runtime. Other solutions:

  • spark.sql, but this usually means worse price-performance than a Databricks SQL warehouse. Some workloads have just a little bit of heavy SQL and the rest is easy; it's hard to give up the performance of the SQL warehouse once you're used to it.
  • spark.sql with fine-tuned PySpark settings on the notebook cluster can get similar performance, but the tuning is a pain.
  • Serverless PySpark: a pretty good alternative, though RAM limits on the driver node can be a pain. Autoscaling here is also pretty "invisible", so if you have a strong prior about workload size, a SQL warehouse can be better.
  • Embed API keys in dbutils.secrets and pass them to databricks-sql-connector. Annoying: I'm already authenticated in the Databricks notebook, so why carry an API key around?

With Databricks Runtime 15.4 and above, authentication in the SDK's WorkspaceClient() works out of the box in the Databricks runtime. That means you can borrow its OAuth capability to connect to a SQL warehouse without having to generate an API key and embed it anywhere.
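For context, the connector's credentials_provider argument appears to expect a zero-argument callable that returns a header factory, i.e. another callable that produces the auth headers for each request (that is why the lambda below wraps runtime_native_auth). A minimal pure-Python sketch of that shape, with stub values and no Databricks dependencies:

```python
from typing import Callable, Dict

# A header factory returns the auth headers attached to each request.
HeaderFactory = Callable[[], Dict[str, str]]

def make_stub_provider(token: str) -> Callable[[], HeaderFactory]:
    """Build a stand-in for the provider the connector consumes.

    In the real flow, runtime_native_auth(client.config) plays the role
    of the inner header factory; `token` here is a fake placeholder.
    """
    def provider() -> HeaderFactory:
        def headers() -> Dict[str, str]:
            return {"Authorization": f"Bearer {token}"}
        return headers
    return provider

provider = make_stub_provider("fake-token")
header_factory = provider()   # roughly what the connector does internally
print(header_factory())       # {'Authorization': 'Bearer fake-token'}
```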

This is what it looks like (you'll need a pip install databricks-sql-connector first).

from databricks.sdk import WorkspaceClient
from databricks.sdk.credentials_provider import runtime_native_auth
from databricks import sql as dbx_sql

# Picks up notebook-native auth automatically on DBR 15.4+.
client = WorkspaceClient()

# The connector expects a zero-arg callable returning a header factory.
oauth_provider = lambda: runtime_native_auth(client.config)

sql_conn = dbx_sql.connect(
    server_hostname=client.config.host,
    http_path='/sql/1.0/warehouses/<SOME_WAREHOUSE_ID_HERE>',
    credentials_provider=oauth_provider,
)
with sql_conn as a_conn:
    with a_conn.cursor() as cursor:
        cursor.execute("SELECT * FROM samples.nyctaxi.trips LIMIT 10;")
        print(cursor.fetchall())

Not sure if this is worth some documentation or an example, but it solved an issue we've had for a while. Any thoughts?
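One refinement we've considered (a sketch, not an SDK API): runtime_native_auth should return None when you're not inside the Databricks runtime, so for code that also runs locally you could chain providers and fall back to another strategy, such as the SDK's databricks_cli provider. The chaining itself is plain Python and works with any zero-arg providers that return a header factory or None:

```python
def chain_providers(*providers):
    """Return a zero-arg provider that yields the first non-None header factory.

    Each provider is a zero-arg callable returning a header factory or None,
    mirroring how SDK credential strategies signal "not applicable" with None.
    """
    def provider():
        for p in providers:
            factory = p()
            if factory is not None:
                return factory
        raise RuntimeError("no credentials provider was applicable")
    return provider

# Stub demonstration: the first provider is not applicable, the second wins.
not_applicable = lambda: None
fallback = lambda: (lambda: {"Authorization": "Bearer from-fallback"})

factory = chain_providers(not_applicable, fallback)()
print(factory())  # {'Authorization': 'Bearer from-fallback'}
```

In the notebook case you would pass the chained provider as credentials_provider, with the in-runtime strategy first.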
