-
Notifications
You must be signed in to change notification settings - Fork 14
feat: extract model-runner APIs from docker/model-cli #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
mdelapenya
commented
Apr 23, 2025
- feat: extract from docker/model-cli
- chore: make method public
- docs: add Go docs for methods, types and vars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have done a standard review looking at style and idiomatic go code. The changes flagged may want to be addressed separately but wanted flag
dockermodelrunner/desktop/desktop.go
Outdated
} | ||
} | ||
return Status{ | ||
Running: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: this looks like a bug as we have already confirmed the service is running?
|
||
import "github.com/docker/docker-sdk-go/dockermodelrunner/modeldistribution/types" | ||
|
||
// ModelCreateRequest represents a model create request. It is designed to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: this sounds all fine but it's used to "pull" models which is confusing. Should this be changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we chose using ModelCreate for adding future operations in the create namespace. Reading the original docs:
// ModelCreateRequest represents a model create request. It is designed to
// follow Docker Engine API conventions, most closely following the request
// associated with POST /images/create. At the moment is only designed to
// facilitate pulls, though in the future it may facilitate model building and
// refinement (such as fine tuning, quantization, or distillation).
I like the idea of using ModelPullRequest
although, after reading that, I'm concerned if we would be stopping ourselves for doing more things in that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As user I would expect pull and create to be seperate, much like docker build and docker pull are seperate.
It would be good to confirm the thinking behind using create before making a call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second pass with some local validation
} | ||
|
||
// Status represents the status of the Docker Model Runner. | ||
type Status struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: whats the use case here as running, status and error have a lot of overlap? It may be better to simplify, have the method return an error even if that indicates not running which is idiomatic behaviour.
status, err := client.Status()
if err != nil {
if errors.Is(err, ErrServiceNotAvailable) {
// Special handling goes here.
}
return fmt.Errorf("status: %w", err)
}
I suspect the common use for this is can I use model runner, for which only an error check is needed the actual state is kind of irrelevant.
Current the status request appears to just be GET /models
so this isn't actually working and a GET /
just returns text:
curl http://model-runner.docker.internal/ -H "Accept: application/json"
Docker Model Runner
The service is running.
Also if the service is not enable, the DNS entry doesn't exist so you actually just a DNS failure, so looks like ErrServiceUnavailable
is not even possible currently.
package main
import (
"net/http"
"fmt"
"io"
"errors"
)
var ErrServiceUnavailable = errors.New("not available")
func URL(path string) string {
return "http://model-runner.docker.internal/" + path
}
// doRequest is a helper function that performs HTTP requests and handles 503 responses
func doRequest(method, path string, body io.Reader) (*http.Response, error) {
req, err := http.NewRequest(method, URL(path), body)
if err != nil {
return nil, fmt.Errorf("new %s request: %w", method, err)
}
if body != nil {
req.Header.Set("Content-Type", "application/json")
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
return nil, err
}
if resp.StatusCode == http.StatusServiceUnavailable {
resp.Body.Close()
return nil, ErrServiceUnavailable
}
return resp, nil
}
func main() {
resp, err := doRequest(http.MethodGet, "/models", nil)
fmt.Printf("%+v, %v\n", resp, err)
}
go run main.go
<nil>, Get "http://model-runner.docker.internal//models": dial tcp: lookup model-runner.docker.internal on 192.168.65.7:53: no such host
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, Status
is populated by querying /engines/status
, endpoint which will be available in Docker Desktop v4.41.
Querying /models
returns the Model Manager (component of the Docker Model Runner)'s status.
Querying /engines/status
returns the Model Scheduler's status, which can be different as this one deals with more things.
I agree that there should be a single endpoint to query for getting the complete status of Docker Model Runner, a status that includes all the necessary details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification @doringeman.
Would definitely be good to have more detailed information. One of the biggest issues with running local models is resources, memory and compute capacity. For MacOS with its unified memory architecture its simpler but on PC it's definitely more challenging, as soon as GPU memory is full performance drops off a cliff making it unusable, so good visibility into that will be key and it would be nice to have a holistic picture from docker as a whole.
Out of interest have you considered exposing via the standard docker host instead of dedicated one, I ask as that would allow a status endpoint to provide details even when docker model runner isn't enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would definitely be good to have more detailed information. One of the biggest issues with running local models is resources, memory and compute capacity. For MacOS with its unified memory architecture its simpler but on PC it's definitely more challenging, as soon as GPU memory is full performance drops off a cliff making it unusable, so good visibility into that will be key and it would be nice to have a holistic picture from docker as a whole.
I agree! It's on our TODO list to implement a "system status"/"process status". 😁
Out of interest have you considered exposing via the standard docker host instead of dedicated one, I ask as that would allow a status endpoint to provide details even when docker model runner isn't enabled?
The ExperimentalEndpointsPrefix
which is also present in this PR, but not used, is the prefix used for querying the Model Runner via the docker socket on the host (currently only available with Docker Desktop). See the code from https://github.com/docker/model-cli/blob/main/commands/root.go#L21.
// handleQueryError is a helper function that handles query errors. | ||
func (c *Client) handleQueryError(err error, path string) error { | ||
if errors.Is(err, ErrServiceUnavailable) { | ||
return ErrServiceUnavailable | ||
} | ||
return fmt.Errorf("error querying %s: %w", path, err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: while this keeps the error clean, the point of wrapping is that doesn't matter, so I would eliminate this.
// ExperimentalEndpointsPrefix is used to prefix all <paths.InferencePrefix> routes on the Docker | ||
// socket while they are still in their experimental stage. This prefix doesn't | ||
// apply to endpoints on model-runner.docker.internal. | ||
const ExperimentalEndpointsPrefix = "/exp/vDD4.40" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Is this out of date? Using as is results in failures when used in URL(path...) e.g. Status()
fails as 80 is not bound in localhost, not sure if I'm missing something?
} | ||
|
||
// URL returns the URL for the Docker Model Runner. | ||
func URL(path string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Is this out date as it doesn't work for me, changing it to be http://model-runner.docker.internal/
without the experimental prefix fixes that.