Documentation for scw inference
This API allows you to manage your Managed Inference services.
- Deployment commands
  - Create a deployment
  - Delete a deployment
  - Get a deployment
  - Get the CA certificate
  - List inference deployments
  - Update a deployment
- Endpoint management commands
  - Create an endpoint
  - Delete an endpoint
  - Update an endpoint
- Models commands
  - Delete a model
  - Get a model
  - Import a model
  - List models
- Node types management commands
  - List available node types
Deployment commands
Deployment commands.
Create a deployment
Create a new inference deployment related to a specific model.
Usage:
scw inference deployment create [arg=value ...]
Args:
Name | | Description |
---|---|---|
name | Required. Default: <generated> | Name of the deployment |
project-id | | Project ID to use. If none is passed, the default project ID will be used |
model-id | Required | ID of the model to use |
accept-eula | | Accept the model's End User License Agreement (EULA). |
node-type-name | Required | Name of the node type to use |
tags.{index} | | List of tags to apply to the deployment |
min-size | | Defines the minimum size of the pool |
max-size | | Defines the maximum size of the pool |
endpoints.{index}.private-network.private-network-id | | |
endpoints.{index}.disable-auth | Default: false | Disable the authentication on the endpoint. |
quantization.bits | | The number of bits each model parameter should be quantized to. The quantization method is chosen based on this value. |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
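As an illustration, the sketch below assembles a create call from the arguments listed above. The model ID, node type name, deployment name, and pool sizes are placeholder values (assumptions, not real resources). The script only prints the command so it can be reviewed before it is run.

```shell
# Placeholder values (assumptions): replace with a real model ID and node type name.
MODEL_ID="11111111-1111-1111-1111-111111111111"
NODE_TYPE_NAME="example-node-type"

# Assemble the call from documented arguments only.
cmd="scw inference deployment create name=my-deployment"
cmd="$cmd model-id=$MODEL_ID node-type-name=$NODE_TYPE_NAME"
cmd="$cmd min-size=1 max-size=2 tags.0=demo region=fr-par"

# Print the command for review; paste it into a terminal to run it.
echo "$cmd"
```

Valid node type names can be looked up with scw inference node-type list, and model IDs with scw inference model list.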
Delete a deployment
Delete an existing inference deployment.
Usage:
scw inference deployment delete <deployment-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
deployment-id | Required | ID of the deployment to delete |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
Get a deployment
Get the deployment for the given ID.
Usage:
scw inference deployment get <deployment-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
deployment-id | Required | ID of the deployment to get |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
Get the CA certificate
Get the CA certificate used for the deployment of private endpoints. The CA certificate will be returned as a PEM file.
Usage:
scw inference deployment get-certificate [arg=value ...]
Args:
Name | | Description |
---|---|---|
deployment-id | Required | |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
List inference deployments
List all your inference deployments.
Usage:
scw inference deployment list [arg=value ...]
Args:
Name | | Description |
---|---|---|
order-by | One of: created_at_desc, created_at_asc, name_asc, name_desc | Order in which to return results |
project-id | | Filter by Project ID |
name | | Filter by deployment name |
tags.{index} | | Filter by tags |
organization-id | | Filter by Organization ID |
region | Default: fr-par. One of: fr-par, all | Region to target. If none is passed, the default region from the config is used |
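A minimal sketch of a filtered listing, combining several of the arguments above. The name and tag values are hypothetical. The script prints the command instead of running it.

```shell
# Hypothetical filter values; adjust to match your own deployments.
NAME_FILTER="my-deployment"
TAG_FILTER="demo"

# Newest deployments first, across every available region.
cmd="scw inference deployment list name=$NAME_FILTER tags.0=$TAG_FILTER"
cmd="$cmd order-by=created_at_desc region=all"

# Print the command for review; paste it into a terminal to run it.
echo "$cmd"
```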
Update a deployment
Update an existing inference deployment.
Usage:
scw inference deployment update <deployment-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
deployment-id | Required | ID of the deployment to update |
name | | Name of the deployment |
tags.{index} | | List of tags to apply to the deployment |
min-size | | Defines the new minimum size of the pool |
max-size | | Defines the new maximum size of the pool |
model-id | | ID of the model to assign to the deployment |
quantization.bits | | The number of bits each model parameter should be quantized to. The quantization method is chosen based on this value. |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
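For example, resizing the pool of an existing deployment might look like the sketch below. The deployment ID and the sizes are placeholders (assumptions), and the ID is passed as the positional argument shown in the usage line.

```shell
# Placeholder ID (assumption): replace with the ID of a real deployment.
DEPLOYMENT_ID="22222222-2222-2222-2222-222222222222"

# The deployment ID is positional; the remaining arguments use arg=value form.
cmd="scw inference deployment update $DEPLOYMENT_ID min-size=1 max-size=3 region=fr-par"

# Print the command for review; paste it into a terminal to run it.
echo "$cmd"
```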
Endpoint management commands
Endpoint management commands.
Create an endpoint
Create a new Endpoint related to a specific deployment.
Usage:
scw inference endpoint create <deployment-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
deployment-id | Required | ID of the deployment to create the endpoint for |
endpoint.private-network.private-network-id | | |
endpoint.disable-auth | Default: false | Disable the authentication on the endpoint. |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
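The sketch below shows one way to attach a Private Network endpoint to a deployment while keeping authentication enabled. Both IDs are placeholders (assumptions). The script prints the command instead of running it.

```shell
# Placeholder IDs (assumptions): replace with real deployment and Private Network IDs.
DEPLOYMENT_ID="22222222-2222-2222-2222-222222222222"
PRIVATE_NETWORK_ID="33333333-3333-3333-3333-333333333333"

# Build the call step by step so each argument stays readable.
cmd="scw inference endpoint create $DEPLOYMENT_ID"
cmd="$cmd endpoint.private-network.private-network-id=$PRIVATE_NETWORK_ID"
cmd="$cmd endpoint.disable-auth=false"

# Print the command for review; paste it into a terminal to run it.
echo "$cmd"
```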
Delete an endpoint
Delete an existing Endpoint.
Usage:
scw inference endpoint delete <endpoint-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
endpoint-id | Required | ID of the endpoint to delete |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
Update an endpoint
Update an existing Endpoint.
Usage:
scw inference endpoint update <endpoint-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
endpoint-id | Required | ID of the endpoint to update |
disable-auth | | Disable the authentication on the endpoint. |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
Models commands
Models commands.
Delete a model
Delete an existing model from your model library.
Usage:
scw inference model delete <model-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
model-id | Required | ID of the model to delete |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
Get a model
Get the model for the given ID.
Usage:
scw inference model get <model-id ...> [arg=value ...]
Args:
Name | | Description |
---|---|---|
model-id | Required | ID of the model to get |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
Import a model
Import a new model to your model library.
Usage:
scw inference model import [arg=value ...]
Args:
Name | | Description |
---|---|---|
name | Required. Default: <generated> | Name of the model |
project-id | | Project ID to use. If none is passed, the default project ID will be used |
source.url | | |
source.secret | | |
region | Default: fr-par. One of: fr-par | Region to target. If none is passed, the default region from the config is used |
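A minimal sketch of an import call follows. The model name and URL are hypothetical, and the accepted format of source.url is not described in this reference, so treat the value as an assumption. The script prints the command instead of running it.

```shell
# Hypothetical source location (assumption): the accepted URL format is not documented here.
MODEL_URL="https://example.com/path/to/model"

# Only documented arguments are used; source.secret is left out of this sketch.
cmd="scw inference model import name=my-model source.url=$MODEL_URL region=fr-par"

# Print the command for review; paste it into a terminal to run it.
echo "$cmd"
```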
List models
List all available models.
Usage:
scw inference model list [arg=value ...]
Args:
Name | | Description |
---|---|---|
order-by | One of: display_rank_asc, created_at_asc, created_at_desc, name_asc, name_desc | Order in which to return results |
project-id | | Filter by Project ID |
name | | Filter by model name |
tags.{index} | | Filter by tags |
region | Default: fr-par. One of: fr-par, all | Region to target. If none is passed, the default region from the config is used |
Node types management commands
Node types management commands.
List available node types
List all available node types. By default, the node types returned in the list are ordered by creation date in ascending order, though this can be modified via the order_by field.
Usage:
scw inference node-type list [arg=value ...]
Args:
Name | | Description |
---|---|---|
include-disabled-types | | Include disabled node types in the response |
region | Default: fr-par. One of: fr-par, all | Region to target. If none is passed, the default region from the config is used |
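As a short illustration, the sketch below lists node types including disabled ones, using only the arguments documented above. The script prints the command instead of running it.

```shell
# Include disabled node types and target the default region explicitly.
cmd="scw inference node-type list include-disabled-types=true region=fr-par"

# Print the command for review; paste it into a terminal to run it.
echo "$cmd"
```

A name returned by this listing is what the node-type-name argument of scw inference deployment create expects.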