WIP: Add instructions for running the example driver on GKE #93

mortent · 2025-04-09T00:21:49Z

We currently only provide instructions for running the example driver on Kind. We should provide instructions for how to run the driver on other Kubernetes distributions, as they will include some different steps. This PR adds instructions for how to get the example driver running on GKE.

For now this just adds a new section with the new content. Ideally we should restructure the README file, so that we can provide separate instructions for getting the driver running on a cluster, and then a shared section for running the examples. I decided not to do that in this PR, as I want to make sure we have agreement on a structure first. Created an issue for this: #94

k8s-ci-robot · 2025-04-09T00:21:54Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mortent
Once this PR has been reviewed and has the lgtm label, please assign elezar for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

nojnhuh

I don't have a GCP account where I could test this myself, but the docs look good overall, thanks! I left a few small comments inline.

nojnhuh · 2025-04-09T19:00:24Z

README.md

+enabled by default in GKE since 1.32.1-gke.1489001, so we will create
+a cluster in the rapid channel to make sure we get a recent version.
+
+Since DRA is still a beta feature, we need to explicitely enable it


Suggested change

Since DRA is still a beta feature, we need to explicitely enable it

Since DRA is still a beta feature, we need to explicitly enable it

nojnhuh · 2025-04-09T19:02:53Z

README.md

+  deployments/helm/dra-example-driver
+```
+
+The examples in `demo/gpu-test{1,2,3,4,5}.yaml` works just like with Kind.


Suggested change

The examples in `demo/gpu-test{1,2,3,4,5}.yaml` works just like with Kind.

The examples in `demo/gpu-test{1,2,3,4,5}.yaml` work just like with Kind.

nojnhuh · 2025-04-09T19:06:52Z

README.md

 ./demo/delete-cluster.sh
 ```

+## Installing the example driver on a GKE cluster


It would be great to also run the e2e test on GKE to make sure we don't accidentally break this. I'm not sure if there's an existing pattern to do that in Prow or if GitHub Actions makes that easy. We can definitely address that later.

nojnhuh · 2025-04-09T19:44:09Z

README.md

+  --namespace dra-example-driver \
+  --set=resourcequota.enabled=true \
+  dra-example-driver \
+  deployments/helm/dra-example-driver


Using the chart from the main branch might not always be compatible with the latest released version of the container image. Ideally we'd stick to either building the image from source for the checked out version of the chart (like the e2e CI tests do), or use compatible published release versions of each. I see published releases won't work here immediately without these unreleased changes to the chart, but we could merge the chart changes, cut a chart release, then merge the docs.

Seamless upgrade support for 1.33 is one thing I'm anticipating which will involve changes to both the image and the chart. Installing a chart that is set up for seamless upgrades and an image that isn't will likely cause issues, though that might only be in marginal cases users walking through the demo wouldn't normally hit. I haven't thought that particular scenario all the way through. For sensitive changes like that I'm confident we can work things out such that a little bit of skew is probably fine, but zero skew is of course preferred.

nojnhuh · 2025-04-09T19:49:38Z

deployments/helm/dra-example-driver/values.yaml

    name: ""
+
+resourcequota:
+  enabled: false


I'd vote to enable this for e2e tests for that extra bit of coverage like we do for the webhook which is disabled by default:

dra-example-driver/test/e2e/setup-e2e.sh

Line 36 in e82f291

--set webhook.enabled=true \

k8s-ci-robot · 2025-05-05T19:31:24Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

pohly · 2025-05-13T13:28:02Z

@mortent: please address the comments and rebase.

pohly · 2025-07-21T07:09:04Z

/retitle WIP: Add instructions for running the example driver on GKE

k8s-triage-robot · 2025-10-19T07:44:54Z

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

Mark this PR as fresh with /remove-lifecycle stale
Close this PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

github-project-automation bot added this to Dynamic Resource Allocation Apr 9, 2025

github-project-automation bot moved this to 🆕 New in Dynamic Resource Allocation Apr 9, 2025

k8s-ci-robot requested review from bart0sh and klueska April 9, 2025 00:21

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 9, 2025

pohly moved this from 🆕 New to 👀 In review in Dynamic Resource Allocation Apr 9, 2025

Add instructions for running the example driver on GKE

d441c6d

mortent force-pushed the GKEInstructions branch from fe57805 to d441c6d Compare April 9, 2025 11:45

nojnhuh reviewed Apr 9, 2025

View reviewed changes

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 5, 2025

k8s-ci-robot changed the title ~~Add instructions for running the example driver on GKE~~ WIP: Add instructions for running the example driver on GKE Jul 21, 2025

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 21, 2025

pohly moved this from 👀 In review to 🏗 In progress in Dynamic Resource Allocation Sep 15, 2025

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 19, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

WIP: Add instructions for running the example driver on GKE #93

WIP: Add instructions for running the example driver on GKE #93

mortent commented Apr 9, 2025 •

edited

Loading

Uh oh!

k8s-ci-robot commented Apr 9, 2025

Uh oh!

nojnhuh left a comment

Uh oh!

nojnhuh Apr 9, 2025

Uh oh!

nojnhuh Apr 9, 2025

Uh oh!

nojnhuh Apr 9, 2025

Uh oh!

nojnhuh Apr 9, 2025

Uh oh!

nojnhuh Apr 9, 2025

Uh oh!

k8s-ci-robot commented May 5, 2025

Uh oh!

pohly commented May 13, 2025

Uh oh!

pohly commented Jul 21, 2025

Uh oh!

k8s-triage-robot commented Oct 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

	Since DRA is still a beta feature, we need to explicitely enable it
	Since DRA is still a beta feature, we need to explicitly enable it

	The examples in `demo/gpu-test{1,2,3,4,5}.yaml` works just like with Kind.
	The examples in `demo/gpu-test{1,2,3,4,5}.yaml` work just like with Kind.

WIP: Add instructions for running the example driver on GKE #93

Are you sure you want to change the base?

WIP: Add instructions for running the example driver on GKE #93

Conversation

mortent commented Apr 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

k8s-ci-robot commented Apr 9, 2025

Uh oh!

nojnhuh left a comment

Choose a reason for hiding this comment

Uh oh!

nojnhuh Apr 9, 2025

Choose a reason for hiding this comment

Uh oh!

nojnhuh Apr 9, 2025

Choose a reason for hiding this comment

Uh oh!

nojnhuh Apr 9, 2025

Choose a reason for hiding this comment

Uh oh!

nojnhuh Apr 9, 2025

Choose a reason for hiding this comment

Uh oh!

nojnhuh Apr 9, 2025

Choose a reason for hiding this comment

Uh oh!

k8s-ci-robot commented May 5, 2025

Uh oh!

pohly commented May 13, 2025

Uh oh!

pohly commented Jul 21, 2025

Uh oh!

k8s-triage-robot commented Oct 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

mortent commented Apr 9, 2025 •

edited

Loading