> **Warning**: This project has been archived and is no longer actively maintained or supported. We highly recommend migrating to the official Databricks SDK for Python, available at https://github.com/databricks/databricks-sdk-py.
azure-databricks-sdk-python is a Python SDK for the Azure Databricks REST API 2.0.
Easily perform all the operations available in the Databricks UI:

```python
from azure_databricks_sdk_python import Client
from azure_databricks_sdk_python.types.clusters import AutoScale, ClusterAttributes

# Authenticate with a personal access token
client = Client(databricks_instance="<instance>", personal_access_token="<token>")

# Define the cluster configuration
spark_conf = {'spark.speculation': True}
autoscale = AutoScale(min_workers=0, max_workers=1)
attributes = ClusterAttributes(cluster_name="my-cluster",
                               spark_version="7.2.x-scala2.12",
                               node_type_id="Standard_D3_v2",
                               spark_conf=spark_conf,
                               autoscale=autoscale)

# Create the cluster and print its id
created = client.clusters.create(attributes)
print(created.cluster_id)
```

azure-databricks-sdk-python is ready for your use-case:
- Clear, standard access to the APIs.
- Custom types for API requests and results.
- Support for personal access token authentication.
- Support for Azure AD authentication.
- Support for Azure AD service principals.
- Free-style API calls with a force mode (bypasses type validation).
- Error handling and proxy support.
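The custom-types and force-mode ideas above can be sketched in plain Python. This is an illustrative pattern only, not the SDK's actual implementation; the `to_payload` helper and its `force` flag are hypothetical names invented for this sketch:

```python
from dataclasses import dataclass, asdict

@dataclass
class AutoScale:
    """Typed request attributes, mirroring the SDK's custom-types idea."""
    min_workers: int
    max_workers: int

def to_payload(autoscale: AutoScale, force: bool = False) -> dict:
    """Convert typed attributes to a REST payload.

    With force=True (the "force mode" idea), validation is skipped and
    the raw dict is sent as-is.
    """
    if not force:
        if autoscale.min_workers < 0 or autoscale.max_workers < autoscale.min_workers:
            raise ValueError("invalid autoscale range")
    return asdict(autoscale)

# Valid attributes pass validation and serialize to a dict
print(to_payload(AutoScale(min_workers=0, max_workers=1)))
```

The benefit of this pattern is that malformed requests fail locally, before any round trip to the REST API, while the force flag keeps an escape hatch for fields the type layer does not yet model.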
Officially supports Python 3.6+, and runs great on PyPy.
Please refer to the progress below:
| Feature | Progress |
|---|---|
| Authentication | 100% ✔ |
| Custom types | 25% |
| API Wrappers | 25% |
| Error handling | 80% |
| Proxy support | 0% |
| Documentation | 20% |
As for specific API wrappers:
| API | Progress |
|---|---|
| Clusters API | 100% ✔ |
| Secrets API | 100% ✔ |
| Token API | 100% ✔ |
| Jobs API | 0% |
| DBFS API | 0% |
| Groups API | 0% |
| Libraries API | 0% |
| Workspace API | 0% |
| Clusters Policies API | 0% |
| Instance Pools API | 0% |
| MLflow API | 0% |
| Permissions API | 0% |
| SCIM API | 0% |
| Token Management API | 0% |
Check the documentation on readthedocs.org.