API Reference
API Version: substratus.ai/v1
Package v1 contains API Schema definitions for Substratus.
Resources
Types
ArtifactsStatus
Appears in: DatasetStatus, ModelStatus, NotebookStatus
Field | Description |
---|---|
url string |
Build
Appears in: DatasetSpec, ModelSpec, NotebookSpec, ServerSpec
Field | Description |
---|---|
git BuildGit | Git is a reference to a git repository that will be built within the cluster. Built image will be set in the .spec.image field. |
upload BuildUpload | Upload can be set to request to start an upload flow where the client is responsible for uploading a local directory that is to be built in the cluster. |
BuildGit
Appears in: Build
Field | Description |
---|---|
url string | URL to the git repository to build. Example: https://github.com/my-username/my-repo |
path string | Path within the git repository referenced by url. |
tag string | Tag is the git tag to use. Choose either tag or branch. This tag will be pulled only at build time and not monitored for changes. |
branch string | Branch is the git branch to use. Choose either branch or tag. This branch will be pulled only at build time and not monitored for changes. |
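The tag/branch rule above can be checked before a manifest is submitted. The following is a minimal sketch, not part of Substratus: `build_git` is a hypothetical helper, and it assumes that setting neither `tag` nor `branch` is acceptable while setting both is not.

```python
from typing import Optional


def build_git(url: str, path: Optional[str] = None,
              tag: Optional[str] = None, branch: Optional[str] = None) -> dict:
    """Return a dict shaped like the Build.git fields documented above."""
    if tag and branch:
        # The reference says to choose either tag or branch, so reject both.
        raise ValueError("set either tag or branch, not both")
    git = {"url": url}
    # Only emit optional fields that were actually provided.
    for key, value in (("path", path), ("tag", tag), ("branch", branch)):
        if value is not None:
            git[key] = value
    return {"git": git}


# "main" is an illustrative branch name.
example = build_git("https://github.com/my-username/my-repo", branch="main")
```

The returned dict is intended to be placed under `spec.build` of a Dataset, Model, Notebook, or Server.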
BuildUpload
Appears in: Build
Field | Description |
---|---|
md5Checksum string | MD5Checksum is the MD5 checksum of the tar'd repo root requested to be uploaded and built. |
requestID string | RequestID is the ID of the request to build the image. Changing this ID to a new value can be used to get a new signed URL (useful when a URL has expired). |
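As an illustration of how a client might fill in these two fields, here is a minimal sketch. The helper names are hypothetical, and two details are assumptions the reference does not confirm: that the archive is an uncompressed tar with paths relative to the repo root, and that the checksum is hex-encoded.

```python
import hashlib
import io
import tarfile
import uuid


def tar_directory(root: str) -> bytes:
    """Pack the directory at root into an uncompressed in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # arcname="." stores member paths relative to the repo root.
        tar.add(root, arcname=".")
    return buf.getvalue()


def build_upload(archive: bytes) -> dict:
    """Return a dict shaped like the Build.upload fields documented above."""
    return {
        "upload": {
            # Hex encoding is an assumption; the reference only says "md5 checksum".
            "md5Checksum": hashlib.md5(archive).hexdigest(),
            # A fresh ID per request; a new ID is how a client asks for a new signed URL.
            "requestID": str(uuid.uuid4()),
        }
    }
```

Calling `build_upload` again on the same archive yields the same checksum with a new `requestID`, which matches the described way of refreshing an expired URL.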
Dataset
The Dataset API is used to describe data that can be referenced for training Models.
- Datasets pull in remote data sources using containerized data loaders.
- Users can specify their own ETL logic by referencing a repository from a Dataset.
- Users can leverage pre-built data loader integrations with various sources.
- Training typically requires a large dataset. The Dataset API pulls a dataset once and stores it in a bucket, which is mounted directly into training Jobs.
- The Dataset API allows users to query ready-to-use datasets (`kubectl get datasets`).
- The Dataset API allows Kubernetes RBAC to be applied as a mechanism for controlling access to data.
Field | Description |
---|---|
apiVersion string | substratus.ai/v1 |
kind string | Dataset |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
spec DatasetSpec | Spec is the desired state of the Dataset. |
status DatasetStatus | Status is the observed state of the Dataset. |
DatasetSpec
DatasetSpec defines the desired state of Dataset.
Appears in: Dataset
Field | Description |
---|---|
command string array | Command to run in the container. |
env object (keys:string, values:string) | Environment variables in the container |
image string | Image that contains dataset loading code and dependencies. |
build Build | Build specifies how to build an image. |
resources Resources | Resources are the compute resources required by the container. |
params object (keys:string, values:IntOrString) | Params will be passed into the loading process as environment variables. |
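To show how the DatasetSpec fields fit together, the sketch below assembles a Dataset object as a Python dict and serializes it to JSON. The object name, command, param keys, and resource sizes are all illustrative assumptions; only the field names come from the tables above.

```python
import json


def dataset_manifest(name: str, git_url: str, branch: str) -> dict:
    """Assemble a Dataset object using the field names documented above."""
    return {
        "apiVersion": "substratus.ai/v1",
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {
            # Build the loader image from a Git repository inside the cluster.
            "build": {"git": {"url": git_url, "branch": branch}},
            # Hypothetical loader entrypoint.
            "command": ["python", "load.py"],
            # Hypothetical params; values may be integers or strings.
            "params": {"split": "train", "maxRows": 1000},
            # Illustrative sizes; memory and disk are in gigabytes.
            "resources": {"cpu": 2, "memory": 8, "disk": 50},
        },
    }


manifest = json.dumps(
    dataset_manifest("my-dataset", "https://github.com/my-username/my-repo", "main"),
    indent=2,
)
```

Writing `manifest` to a file and passing it to `kubectl apply -f` is one way to create the object, since kubectl accepts JSON as well as YAML.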
DatasetStatus
DatasetStatus defines the observed state of Dataset.
Appears in: Dataset
Field | Description |
---|---|
ready boolean | Ready indicates that the Dataset is ready to use. See Conditions for more details. |
conditions Condition array | Conditions is the list of conditions that describe the current state of the Dataset. |
artifacts ArtifactsStatus | Artifacts status. |
buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |
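A client that waits on a Dataset typically reads `ready` first and falls back to `conditions` for detail. The sketch below assumes that Condition follows the standard Kubernetes condition shape (`type`, `status`, `reason`, `message`), which this reference does not spell out; the condition types in the usage example are made up.

```python
from typing import List, Tuple


def summarize_status(obj: dict) -> Tuple[bool, List[str]]:
    """Return (ready, descriptions of conditions whose status is not "True")."""
    status = obj.get("status", {})
    ready = bool(status.get("ready", False))
    pending = [
        f'{c.get("type")}: {c.get("reason", "")} {c.get("message", "")}'.strip()
        for c in status.get("conditions", [])
        # Kubernetes conditions carry the strings "True", "False", or "Unknown".
        if c.get("status") != "True"
    ]
    return ready, pending


# Hypothetical object as it might be returned by the API server.
sample = {
    "status": {
        "ready": False,
        "conditions": [
            {"type": "Built", "status": "True", "reason": "JobComplete"},
            {"type": "Loaded", "status": "False", "reason": "JobNotComplete",
             "message": "waiting"},
        ],
    }
}
ready, pending = summarize_status(sample)
```

The same function applies to Model, Notebook, and Server objects, since their status types expose the same `ready` and `conditions` fields.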
GPUResources
Appears in: Resources
Field | Description |
---|---|
type GPUType | Type of GPU. |
count integer | Count is the number of GPUs. |
GPUType
Underlying type: string
Appears in: GPUResources
Model
The Model API is used to build and train machine learning models.
- Base models can be built from a Git repository.
- Models can be trained by combining a base Model with a Dataset.
- Model artifacts are persisted in cloud buckets.
Field | Description |
---|---|
apiVersion string | substratus.ai/v1 |
kind string | Model |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
spec ModelSpec | Spec is the desired state of the Model. |
status ModelStatus | Status is the observed state of the Model. |
ModelSpec
ModelSpec defines the desired state of Model
Appears in: Model
Field | Description |
---|---|
command string array | Command to run in the container. |
env object (keys:string, values:string) | Environment variables in the container |
image string | Image that contains model code and dependencies. |
build Build | Build specifies how to build an image. |
resources Resources | Resources are the compute resources required by the container. |
model ObjectRef | Model should be set in order to mount another model to be used for transfer learning. |
dataset ObjectRef | Dataset to mount for training. |
params object (keys:string, values:IntOrString) | Params are passed into the model training/loading container as environment variables. The environment variable name will be "PARAM_" + uppercase(key). |
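The naming rule for `params` can be mirrored in a few lines, which is handy when writing training code that reads them. This is a sketch of the documented rule only; converting values with `str()` is an assumption about how integers are rendered, and `params_to_env` is a hypothetical helper.

```python
from typing import Dict, Union


def params_to_env(params: Dict[str, Union[int, str]]) -> Dict[str, str]:
    """Map spec.params keys to the environment variable names described above."""
    # Name rule from the reference: "PARAM_" + uppercase(key).
    return {"PARAM_" + key.upper(): str(value) for key, value in params.items()}


# Hypothetical params for a training run.
env = params_to_env({"epochs": 3, "name": "falcon"})
```

Inside the container, training code could then read a value with `os.environ["PARAM_EPOCHS"]`.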
ModelStatus
ModelStatus defines the observed state of Model
Appears in: Model
Field | Description |
---|---|
ready boolean | Ready indicates that the Model is ready to use. See Conditions for more details. |
conditions Condition array | Conditions is the list of conditions that describe the current state of the Model. |
artifacts ArtifactsStatus | Artifacts status. |
buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |
Notebook
The Notebook API can be used to quickly spin up a development environment backed by high performance compute.
- Notebooks integrate with the Model and Dataset APIs to allow for quick iteration.
- Notebooks can be synced to local directories to streamline developer experiences using Substratus kubectl plugins.
Field | Description |
---|---|
apiVersion string | substratus.ai/v1 |
kind string | Notebook |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
spec NotebookSpec | Spec is the desired state of the Notebook. |
status NotebookStatus | Status is the observed state of the Notebook. |
NotebookSpec
NotebookSpec defines the desired state of Notebook
Appears in: Notebook
Field | Description |
---|---|
command string array | Command to run in the container. |
env object (keys:string, values:string) | Environment variables in the container |
suspend boolean | Suspend should be set to true to stop the notebook (Pod) from running. This is a pointer to distinguish between explicit false and not specified. |
image string | Image that contains notebook and dependencies. |
build Build | Build specifies how to build an image. |
resources Resources | Resources are the compute resources required by the container. |
model ObjectRef | Model to load into the notebook container. |
dataset ObjectRef | Dataset to load into the notebook container. |
params object (keys:string, values:IntOrString) | Params will be passed into the notebook container as environment variables. |
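Because `suspend` distinguishes an explicit false from an unset value, a client should omit the field rather than default it. A minimal sketch of that three-state handling follows; `notebook_spec` is a hypothetical helper and the image name is illustrative.

```python
from typing import Optional


def notebook_spec(image: str, suspend: Optional[bool] = None) -> dict:
    """Build a NotebookSpec dict, leaving suspend out when it was not specified."""
    spec = {"image": image}
    # None means "not specified": omit the key instead of sending false.
    if suspend is not None:
        spec["suspend"] = suspend
    return spec


unset = notebook_spec("example.com/notebook:latest")
stopped = notebook_spec("example.com/notebook:latest", suspend=True)
running = notebook_spec("example.com/notebook:latest", suspend=False)
```

Here `unset` carries no `suspend` key at all, while `running` sends an explicit `false`.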
NotebookStatus
NotebookStatus defines the observed state of Notebook
Appears in: Notebook
Field | Description |
---|---|
ready boolean | Ready indicates that the Notebook is ready to serve. See Conditions for more details. |
conditions Condition array | Conditions is the list of conditions that describe the current state of the Notebook. |
artifacts ArtifactsStatus | Artifacts status. |
buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |
ObjectRef
Appears in: ModelSpec, NotebookSpec, ServerSpec
Field | Description |
---|---|
name string | Name of Kubernetes object. |
Resources
Appears in: DatasetSpec, ModelSpec, NotebookSpec, ServerSpec
Field | Description |
---|---|
cpu integer | CPU resources. |
disk integer | Disk size in Gigabytes. |
memory integer | Memory is the amount of RAM in Gigabytes. |
gpu GPUResources | GPU resources. |
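The Resources fields are plain integers, with memory and disk expressed in gigabytes. The sketch below builds such a block with basic validation. The requirement that values be positive is an assumption, `resources` is a hypothetical helper, and the GPU type string is a placeholder because this reference does not list the valid GPUType values.

```python
from typing import Optional


def resources(cpu: int, memory_gb: int, disk_gb: int,
              gpu_type: Optional[str] = None, gpu_count: int = 0) -> dict:
    """Return a dict shaped like the Resources fields documented above."""
    for name, value in (("cpu", cpu), ("memory", memory_gb), ("disk", disk_gb)):
        # Assumption: the API expects positive whole numbers.
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    out = {"cpu": cpu, "memory": memory_gb, "disk": disk_gb}
    if gpu_type is not None:
        if gpu_count < 1:
            raise ValueError("gpu count must be at least 1 when a GPU type is set")
        out["gpu"] = {"type": gpu_type, "count": gpu_count}
    return out


# "example-gpu" is a placeholder, not a real GPUType value.
gpu_block = resources(4, 16, 100, gpu_type="example-gpu", gpu_count=1)
```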
Server
The Server API is used to deploy a server that exposes the capabilities of a Model via an HTTP interface.
Field | Description |
---|---|
apiVersion string | substratus.ai/v1 |
kind string | Server |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
spec ServerSpec | Spec is the desired state of the Server. |
status ServerStatus | Status is the observed state of the Server. |
ServerSpec
ServerSpec defines the desired state of Server
Appears in: Server
Field | Description |
---|---|
command string array | Command to run in the container. |
env object (keys:string, values:string) | Environment variables in the container |
image string | Image that contains model serving application and dependencies. |
build Build | Build specifies how to build an image. |
resources Resources | Resources are the compute resources required by the container. |
model ObjectRef | Model references the Model object to be served. |
params object (keys:string, values:IntOrString) | Params will be passed into the loading process as environment variables. |
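A Server ties a serving image to an existing Model through `spec.model.name`. As a minimal sketch, with a hypothetical helper and purely illustrative object names and image, the object can be assembled like this:

```python
import json


def server_manifest(name: str, image: str, model_name: str) -> dict:
    """Assemble a Server object using the field names documented above."""
    return {
        "apiVersion": "substratus.ai/v1",
        "kind": "Server",
        "metadata": {"name": name},
        "spec": {
            "image": image,
            # ObjectRef: only the name of the Model object is given.
            "model": {"name": model_name},
        },
    }


# All three arguments are illustrative values.
server_json = json.dumps(
    server_manifest("my-server", "example.com/model-server:latest", "my-model"),
    indent=2,
)
```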
ServerStatus
ServerStatus defines the observed state of Server
Appears in: Server
Field | Description |
---|---|
ready boolean | Ready indicates whether the Server is ready to serve traffic. See Conditions for more details. |
conditions Condition array | Conditions is the list of conditions that describe the current state of the Server. |
buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |
UploadStatus
Appears in: DatasetStatus, ModelStatus, NotebookStatus, ServerStatus
Field | Description |
---|---|
signedURL string | SignedURL is a short-lived HTTPS URL. The client is expected to send a PUT request to this URL containing a tar'd Docker build context. A Content-Type of "application/octet-stream" should be used. |
requestID string | RequestID is the request ID that corresponds to this status. Clients should check that this matches the request ID that they set in the upload spec before uploading. |
expiration Time | Expiration is the time at which the signed URL expires. |
storedMD5Checksum string | StoredMD5Checksum is the md5 checksum of the file that the controller observed in storage. |
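The client side of the upload flow described by these fields can be sketched as follows. The function names are hypothetical; the object layout (`spec.build.upload` and `status.buildUpload`) follows the tables in this reference, and comparing the two checksums directly assumes both use the same encoding.

```python
import urllib.request


def make_upload_request(obj: dict, archive: bytes) -> urllib.request.Request:
    """Prepare the PUT request for the signed URL, after checking the request ID."""
    wanted = obj["spec"]["build"]["upload"]["requestID"]
    upload_status = obj["status"]["buildUpload"]
    if upload_status.get("requestID") != wanted:
        # The status belongs to an older request; wait for the controller to catch up.
        raise RuntimeError("status.buildUpload does not match the requested ID yet")
    return urllib.request.Request(
        upload_status["signedURL"],
        data=archive,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def upload_confirmed(obj: dict) -> bool:
    """Report whether the controller observed the checksum the client asked for."""
    stored = obj["status"]["buildUpload"].get("storedMD5Checksum")
    return stored is not None and stored == obj["spec"]["build"]["upload"]["md5Checksum"]
```

The prepared request can be sent with `urllib.request.urlopen(request)`. A client would normally also compare `expiration` with the current time and set a new `requestID` if the URL has expired.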