
Add deploy from catalog to huggingface client #2880

Open · ErikKaum opened this issue Feb 19, 2025 · 7 comments

@ErikKaum (Member)

Add a "one-click" deploy experience to the Python client, similar to what we have in the UI:

  1. Get the list of available catalog models: https://endpoints.huggingface.co/api/catalog/repo-list
  2. Deploy a model, e.g.:
curl -X POST "https://endpoints.huggingface.co/api/catalog/deploy" \
  -H "Content-Type: application/json" \
  -d '{
    "accessToken": ACCESS_TOKEN,
    "namespace": NAMESPACE,
    "repoId": REPO_ID,
    "endpointName": ENDPOINT_NAME
}'
  • For now we omit more "advanced" checks, such as whether a model was removed from the catalog. Let's first see if the feature gets traction. A rough sketch of a client-side wrapper follows below.
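
A minimal sketch of what the Python wrapper could look like, assuming the requests library and the two catalog endpoints above; the function names (list_catalog_models, deploy_from_catalog) and the response shapes are hypothetical:

import requests

CATALOG_API = "https://endpoints.huggingface.co/api/catalog"


def list_catalog_models() -> list[dict]:
    # Hypothetical wrapper around GET /api/catalog/repo-list
    response = requests.get(f"{CATALOG_API}/repo-list")
    response.raise_for_status()
    return response.json()


def deploy_from_catalog(token: str, namespace: str, repo_id: str, endpoint_name: str) -> dict:
    # Hypothetical wrapper around POST /api/catalog/deploy.
    # The current API expects the token in the payload; see the discussion below
    # about moving it to an Authorization header instead.
    payload = {
        "accessToken": token,
        "namespace": namespace,
        "repoId": repo_id,
        "endpointName": endpoint_name,
    }
    response = requests.post(f"{CATALOG_API}/deploy", json=payload)
    response.raise_for_status()
    return response.json()

Usage would then be a two-liner: pick a repoId from list_catalog_models() and pass it to deploy_from_catalog().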
@julien-c (Member)

  • How is this different from the existing create_inference_endpoint (and shouldn't we make that better instead)?
  • Is there a Slack convo about this?
    Thanks!

@ErikKaum (Member, Author)

  • Here is the Slack context where it started; we also had chats with @Wauplin about this and thought it would be a nice thing to test.
  • The difference is the same as deploying manually from the UI vs. deploying from the catalog in the UI: create_inference_endpoint requires the user to specify all the details (which container to use, which hardware, etc.), whereas this just calls deploy and uses the recommended catalog configuration. The details remain opaque to the user. See the comparison sketch below.

Also good to note that api/catalog/repo-list is experimental; we'd just like to quickly test whether this type of feature would be useful in the Python client.
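
To make the contrast concrete, a rough comparison: the first call uses the existing create_inference_endpoint from huggingface_hub (the argument values are illustrative, not recommended settings), while the second is the hypothetical catalog one-liner proposed here:

from huggingface_hub import create_inference_endpoint

# Existing API: the caller spells out the container, hardware, and scaling details.
endpoint = create_inference_endpoint(
    "llama-3-2-3b-instruct",
    repository="meta-llama/Llama-3.2-3B-Instruct",
    framework="pytorch",
    task="text-generation",
    accelerator="gpu",
    vendor="aws",
    region="us-east-1",
    instance_size="x1",
    instance_type="nvidia-l4",
)

# Proposed (hypothetical) catalog deploy: the recommended configuration is applied
# server-side, so only the repo, namespace, and endpoint name are needed.
# endpoint = deploy_from_catalog(
#     token="hf_xxx",
#     namespace="my-org",
#     repo_id="meta-llama/Llama-3.2-3B-Instruct",
#     endpoint_name="llama-3-2-3b-instruct",
# )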

@hanouticelina (Contributor)

👍 I like the idea of being able to programmatically deploy an endpoint without having to worry about the config (device, framework, and all of that) + it seems to be low-hanging fruit.

@Wauplin (Contributor) commented Feb 27, 2025

@ErikKaum a few comments about the API:

  1. Is the "https://endpoints.huggingface.co/api/catalog/" API designed to be stable?
  2. Would it be possible to pass the accessToken as an Authorization header instead of in the payload (as done in every other HF endpoint)? See the sketch after this list.
  3. When trying to pass "llama-3.2-3b-instruct-gth" as endpointName, I'm getting {"message":"Bad Request: Invalid endpoint name. It can only contain lowercase alphanumeric characters or '-' and have a length of 32 characters"}.
    1. Is it possible to put the error message under "error" instead of "message"? (better for error formatting client-side)
    2. Is 32 a hard limit? Seems very low TBH. Repo IDs can go up to 96 chars if I remember correctly.
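
For reference, a minimal sketch of point 2 on the client side, assuming the endpoint accepted a standard Authorization header (all values are placeholders):

import requests

ACCESS_TOKEN = "hf_xxx"                          # placeholder
NAMESPACE = "my-org"                             # placeholder
REPO_ID = "meta-llama/Llama-3.2-3B-Instruct"     # placeholder
ENDPOINT_NAME = "llama-3-2-3b-instruct"          # placeholder

# Hypothetical: token sent as an Authorization header instead of in the JSON payload,
# matching how other HF endpoints authenticate.
response = requests.post(
    "https://endpoints.huggingface.co/api/catalog/deploy",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={"namespace": NAMESPACE, "repoId": REPO_ID, "endpointName": ENDPOINT_NAME},
)
response.raise_for_status()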

@ErikKaum (Member, Author)

  1. If it's possible to mark the Python wrapper as "beta", that would be nice. I could also add a /v1 in there just to make life easier. I don't think we'll change much, but it's nice to be able to change it without breaking things for you.

  2. Yes, I'll put it under "error". And yes, the 32-character limit is a bit annoying, I agree; lifting it would require a bigger rework on the backend side, which might open Pandora's box, so ideally we'd avoid it.

@Wauplin (Contributor) commented Feb 28, 2025

In #2892, methods are flagged as experimental.
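
If I remember correctly, huggingface_hub has an experimental decorator in huggingface_hub.utils that warns callers the API may change; a sketch of how a catalog-deploy method could be flagged (the method name is hypothetical):

from huggingface_hub.utils import experimental


@experimental
def deploy_from_catalog(repo_id: str, namespace: str, endpoint_name: str):
    # Hypothetical catalog-deploy method; the decorator emits a warning that the
    # feature is experimental and may change without notice.
    ...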

Regarding request auth: is it possible to update the endpoint to pass the token as a header instead of in the payload?

@ErikKaum (Member, Author)

Awesome 👍

Ah sorry, I missed replying to that one. Yes, I'll make that change as well!
