API Reference

API Version: substratus.ai/v1

Package v1 contains API Schema definitions for Substratus.

Resources

  • Dataset
  • Model
  • Notebook
  • Server

Types

ArtifactsStatus

Appears in:

  • DatasetStatus
  • ModelStatus
  • NotebookStatus

| Field | Description |
| --- | --- |
| url string | |

Build

Appears in:

  • DatasetSpec
  • ModelSpec
  • NotebookSpec
  • ServerSpec

| Field | Description |
| --- | --- |
| git BuildGit | Git is a reference to a git repository that will be built within the cluster. The built image will be set in the .spec.image field. |
| upload BuildUpload | Upload can be set to request an upload flow in which the client is responsible for uploading a local directory that is to be built in the cluster. |

BuildGit

Appears in:

  • Build

| Field | Description |
| --- | --- |
| url string | URL to the git repository to build. Example: https://github.com/my-username/my-repo |
| path string | Path within the git repository referenced by url. |
| tag string | Tag is the git tag to use. Choose either tag or branch. This tag will be pulled only at build time and not monitored for changes. |
| branch string | Branch is the git branch to use. Choose either branch or tag. This branch will be pulled only at build time and not monitored for changes. |
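
The "choose either tag or branch" rule can be checked on the client before a manifest is submitted. The following is a minimal sketch: the helper name `build_git_spec` and the branch name `main` are hypothetical, and whether the API server itself rejects a spec that sets both fields is not stated in this reference.

```python
def build_git_spec(url, path=None, tag=None, branch=None):
    """Return a dict shaped like the BuildGit type (hypothetical helper)."""
    # Client-side check only; server-side behavior is an assumption.
    if tag is not None and branch is not None:
        raise ValueError("set either tag or branch, not both")
    spec = {"url": url}
    # Only include the optional fields that were actually provided.
    for key, value in (("path", path), ("tag", tag), ("branch", branch)):
        if value is not None:
            spec[key] = value
    return spec


# A Build block that points at a branch of the example repository.
build = {
    "git": build_git_spec(
        "https://github.com/my-username/my-repo", branch="main"
    )
}
```

The resulting `build` dict can be placed under `spec.build` of a Dataset, Model, Notebook, or Server.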

BuildUpload

Appears in:

  • Build

| Field | Description |
| --- | --- |
| md5Checksum string | MD5Checksum is the md5 checksum of the tar'd repo root requested to be uploaded and built. |
| requestID string | RequestID is the ID of the request to build the image. Changing this ID to a new value can be used to get a new signed URL (useful when a URL has expired). |
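
As an illustration of how a client might fill in these two fields, the sketch below tars a directory in memory and derives a checksum and a fresh request ID. It rests on assumptions this reference does not confirm: that the archive is an uncompressed tar, that the checksum is hex-encoded, and that a random UUID is an acceptable request ID. The helper names are hypothetical.

```python
import hashlib
import io
import tarfile
import uuid


def tar_directory(root):
    """Return the bytes of an uncompressed tar of `root` (assumed format)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(root, arcname=".")
    return buf.getvalue()


def build_upload_spec(tarball, request_id=None):
    """Return a dict shaped like the BuildUpload type (hypothetical helper)."""
    return {
        # Hex encoding is an assumption; the expected encoding is not documented here.
        "md5Checksum": hashlib.md5(tarball).hexdigest(),
        # A new ID requests a new signed URL, per the field description.
        "requestID": request_id or str(uuid.uuid4()),
    }
```

Calling `build_upload_spec(tar_directory("."))` would produce a block suitable for `spec.build.upload`, under the assumptions above.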

Dataset

The Dataset API is used to describe data that can be referenced for training Models.

  • Datasets pull in remote data sources using containerized data loaders.
  • Users can specify their own ETL logic by referencing a repository from a Dataset.
  • Users can leverage pre-built data loader integrations with various sources.
  • Training typically requires a large dataset. The Dataset API pulls a dataset once and stores it in a bucket, which is mounted directly into training Jobs.
  • The Dataset API allows users to query ready-to-use datasets (kubectl get datasets).
  • The Dataset API allows Kubernetes RBAC to be applied as a mechanism for controlling access to data.

| Field | Description |
| --- | --- |
| apiVersion string | substratus.ai/v1 |
| kind string | Dataset |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
| spec DatasetSpec | Spec is the desired state of the Dataset. |
| status DatasetStatus | Status is the observed state of the Dataset. |

DatasetSpec

DatasetSpec defines the desired state of Dataset.

Appears in:

  • Dataset

| Field | Description |
| --- | --- |
| command string array | Command to run in the container. |
| env object (keys:string, values:string) | Environment variables in the container. |
| image string | Image that contains dataset loading code and dependencies. |
| build Build | Build specifies how to build an image. |
| resources Resources | Resources are the compute resources required by the container. |
| params object (keys:string, values:IntOrString) | Params will be passed into the loading process as environment variables. |
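
To show how these fields fit together, here is a sketch of a complete Dataset object assembled as a Python dict and serialized to JSON, which `kubectl apply -f` accepts alongside YAML. The object name, command, params, and resource sizes are made-up placeholders, not values taken from Substratus.

```python
import json

dataset = {
    "apiVersion": "substratus.ai/v1",
    "kind": "Dataset",
    "metadata": {"name": "example-dataset"},  # hypothetical name
    "spec": {
        # Build from git instead of setting spec.image directly.
        "build": {
            "git": {
                "url": "https://github.com/my-username/my-repo",
                "branch": "main",  # hypothetical branch
            }
        },
        "command": ["python", "load.py"],  # hypothetical entrypoint
        # IntOrString values: both integers and strings are allowed.
        "params": {"source": "example", "limit": 1000},
        "resources": {"cpu": 2, "memory": 8, "disk": 50},
    },
}

manifest_json = json.dumps(dataset, indent=2)
```

Writing `manifest_json` to a file and applying it with `kubectl apply -f` is one way to submit the object.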

DatasetStatus

DatasetStatus defines the observed state of Dataset.

Appears in:

  • Dataset

| Field | Description |
| --- | --- |
| ready boolean | Ready indicates that the Dataset is ready to use. See Conditions for more details. |
| conditions Condition array | Conditions is the list of conditions that describe the current state of the Dataset. |
| artifacts ArtifactsStatus | Artifacts status. |
| buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |

GPUResources

Appears in:

  • Resources

| Field | Description |
| --- | --- |
| type GPUType | Type of GPU. |
| count integer | Count is the number of GPUs. |

GPUType

Underlying type: string

Appears in:

  • GPUResources

Model

The Model API is used to build and train machine learning models.

  • Base models can be built from a Git repository.
  • Models can be trained by combining a base Model with a Dataset.
  • Model artifacts are persisted in cloud buckets.

| Field | Description |
| --- | --- |
| apiVersion string | substratus.ai/v1 |
| kind string | Model |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
| spec ModelSpec | Spec is the desired state of the Model. |
| status ModelStatus | Status is the observed state of the Model. |

ModelSpec

ModelSpec defines the desired state of Model.

Appears in:

  • Model

| Field | Description |
| --- | --- |
| command string array | Command to run in the container. |
| env object (keys:string, values:string) | Environment variables in the container. |
| image string | Image that contains model code and dependencies. |
| build Build | Build specifies how to build an image. |
| resources Resources | Resources are the compute resources required by the container. |
| model ObjectRef | Model should be set in order to mount another model to be used for transfer learning. |
| dataset ObjectRef | Dataset to mount for training. |
| params object (keys:string, values:IntOrString) | Parameters are passed into the model training/loading container as environment variables. The environment variable name will be "PARAM_" + uppercase(key). |
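
The naming rule for params can be mirrored in a few lines, which is handy when predicting which variables training code should read. This is a sketch of the documented rule only: the function name is hypothetical, values are simply converted with `str`, and how keys containing characters other than letters, digits, and underscores are treated is not specified here.

```python
def params_to_env(params):
    """Map spec.params to environment variables: "PARAM_" + uppercase(key)."""
    return {"PARAM_" + key.upper(): str(value) for key, value in params.items()}


# Example params with hypothetical keys and values.
env = params_to_env({"epochs": 3, "base_lr": "1e-4"})
```

Inside the container, code would then read values such as `os.environ["PARAM_EPOCHS"]`.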

ModelStatus

ModelStatus defines the observed state of Model.

Appears in:

  • Model

| Field | Description |
| --- | --- |
| ready boolean | Ready indicates that the Model is ready to use. See Conditions for more details. |
| conditions Condition array | Conditions is the list of conditions that describe the current state of the Model. |
| artifacts ArtifactsStatus | Artifacts status. |
| buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |
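
A client that polls an object can combine ready with conditions to explain why something is not usable yet. The sketch below assumes that each condition follows the standard Kubernetes condition shape (`type`, `status`, `reason`, `message`, with `status` being the string "True", "False", or "Unknown"). The condition types in the sample are invented for illustration.

```python
def summarize_status(status):
    """Return (ready, conditions whose status is not "True")."""
    ready = bool(status.get("ready", False))
    not_true = [
        cond for cond in status.get("conditions", [])
        if cond.get("status") != "True"
    ]
    return ready, not_true


# Sample status with hypothetical condition types and reasons.
sample_status = {
    "ready": False,
    "conditions": [
        {"type": "ExampleBuilt", "status": "True", "reason": "BuildDone"},
        {"type": "ExampleComplete", "status": "False", "reason": "JobRunning"},
    ],
}
ready, pending = summarize_status(sample_status)
```

The same helper applies unchanged to the status of a Dataset, Notebook, or Server, since they share the ready and conditions fields.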

Notebook

The Notebook API can be used to quickly spin up a development environment backed by high performance compute.

  • Notebooks integrate with the Model and Dataset APIs to allow for quick iteration.
  • Notebooks can be synced to local directories to streamline developer experiences using Substratus kubectl plugins.

| Field | Description |
| --- | --- |
| apiVersion string | substratus.ai/v1 |
| kind string | Notebook |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
| spec NotebookSpec | Spec is the desired state of the Notebook. |
| status NotebookStatus | Status is the observed state of the Notebook. |

NotebookSpec

NotebookSpec defines the desired state of Notebook.

Appears in:

  • Notebook

| Field | Description |
| --- | --- |
| command string array | Command to run in the container. |
| env object (keys:string, values:string) | Environment variables in the container. |
| suspend boolean | Suspend should be set to true to stop the notebook (Pod) from running. This is a pointer to distinguish between explicit false and not specified. |
| image string | Image that contains notebook and dependencies. |
| build Build | Build specifies how to build an image. |
| resources Resources | Resources are the compute resources required by the container. |
| model ObjectRef | Model to load into the notebook container. |
| dataset ObjectRef | Dataset to load into the notebook container. |
| params object (keys:string, values:IntOrString) | Params will be passed into the notebook container as environment variables. |
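
Because suspend distinguishes "true", "false", and "not specified", a client has three states to express. One way to do so is a JSON merge patch, where a `null` value removes a field. The helper below is a hypothetical sketch of building such a patch; how the controller reacts to each state beyond what the table says is not covered by this reference.

```python
import json


def suspend_patch(suspend):
    """Build a JSON merge patch for spec.suspend.

    True or False sets the field explicitly; None serializes to null,
    which in a JSON merge patch removes the field (not specified).
    """
    return json.dumps({"spec": {"suspend": suspend}})


stop_patch = suspend_patch(True)
unset_patch = suspend_patch(None)
```

Such a patch could be applied with `kubectl patch notebook my-notebook --type merge -p "$PATCH"`, where `my-notebook` is a placeholder name.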

NotebookStatus

NotebookStatus defines the observed state of Notebook.

Appears in:

  • Notebook

| Field | Description |
| --- | --- |
| ready boolean | Ready indicates that the Notebook is ready to serve. See Conditions for more details. |
| conditions Condition array | Conditions is the list of conditions that describe the current state of the Notebook. |
| artifacts ArtifactsStatus | Artifacts status. |
| buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |

ObjectRef

Appears in:

  • ModelSpec
  • NotebookSpec
  • ServerSpec

| Field | Description |
| --- | --- |
| name string | Name of Kubernetes object. |

Resources

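Appears in:

  • DatasetSpec
  • ModelSpec
  • NotebookSpec
  • ServerSpec

| Field | Description |
| --- | --- |
| cpu integer | CPU resources. |
| disk integer | Disk size in gigabytes. |
| memory integer | Memory is the amount of RAM in gigabytes. |
| gpu GPUResources | GPU resources. |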
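
Since cpu, disk, and memory are plain integers (with disk and memory in gigabytes), a small builder can catch values such as "8Gi" or 0.5 before they reach the API. This is a hypothetical client-side sketch. The valid GPUType strings are not listed in this reference, so the GPU type is passed through unchecked.

```python
def make_resources(cpu=None, memory_gb=None, disk_gb=None,
                   gpu_type=None, gpu_count=None):
    """Return a dict shaped like the Resources type (hypothetical helper)."""
    out = {}
    for key, value in (("cpu", cpu), ("memory", memory_gb), ("disk", disk_gb)):
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{key} must be an integer, got {value!r}")
        out[key] = value
    if (gpu_type is None) != (gpu_count is None):
        raise ValueError("gpu_type and gpu_count must be set together")
    if gpu_type is not None:
        out["gpu"] = {"type": gpu_type, "count": gpu_count}
    return out


# "example-gpu-type" is a placeholder, not a real GPUType value.
res = make_resources(cpu=4, memory_gb=16, disk_gb=100,
                     gpu_type="example-gpu-type", gpu_count=1)
```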

Server

The Server API is used to deploy a server that exposes the capabilities of a Model via an HTTP interface.

| Field | Description |
| --- | --- |
| apiVersion string | substratus.ai/v1 |
| kind string | Server |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |
| spec ServerSpec | Spec is the desired state of the Server. |
| status ServerStatus | Status is the observed state of the Server. |

ServerSpec

ServerSpec defines the desired state of Server.

Appears in:

  • Server

| Field | Description |
| --- | --- |
| command string array | Command to run in the container. |
| env object (keys:string, values:string) | Environment variables in the container. |
| image string | Image that contains model serving application and dependencies. |
| build Build | Build specifies how to build an image. |
| resources Resources | Resources are the compute resources required by the container. |
| model ObjectRef | Model references the Model object to be served. |
| params object (keys:string, values:IntOrString) | Params will be passed into the loading process as environment variables. |
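
The link between a Server and the Model it serves is the `model` ObjectRef, which carries only a name. The sketch below assembles a Server object that way. The function name, object names, and image reference are placeholders chosen for illustration.

```python
import json


def server_manifest(name, model_name, image, params=None):
    """Return a Server object as a dict (hypothetical helper)."""
    spec = {
        "image": image,
        # ObjectRef: only the name of the Model object is given.
        "model": {"name": model_name},
    }
    if params:
        spec["params"] = dict(params)
    return {
        "apiVersion": "substratus.ai/v1",
        "kind": "Server",
        "metadata": {"name": name},
        "spec": spec,
    }


server = server_manifest(
    name="example-server",               # hypothetical
    model_name="example-model",          # hypothetical
    image="registry.example.com/serve",  # hypothetical
)
server_json = json.dumps(server, indent=2)
```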

ServerStatus

ServerStatus defines the observed state of Server.

Appears in:

  • Server

| Field | Description |
| --- | --- |
| ready boolean | Ready indicates whether the Server is ready to serve traffic. See Conditions for more details. |
| conditions Condition array | Conditions is the list of conditions that describe the current state of the Server. |
| buildUpload UploadStatus | BuildUpload contains the status of the build context upload. |

UploadStatus

Appears in:

  • DatasetStatus
  • ModelStatus
  • NotebookStatus
  • ServerStatus

| Field | Description |
| --- | --- |
| signedURL string | SignedURL is a short-lived HTTPS URL. The client is expected to send a PUT request to this URL containing a tar'd docker build context. Content-Type of "application/octet-stream" should be used. |
| requestID string | RequestID is the request ID that corresponds to this status. Clients should check that this matches the request ID that they set in the upload spec before uploading. |
| expiration Time | Expiration is the time at which the signed URL expires. |
| storedMD5Checksum string | StoredMD5Checksum is the md5 checksum of the file that the controller observed in storage. |
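
The client side of the upload flow described in this table can be sketched as: confirm the request ID matches, confirm the URL has not expired, then prepare a PUT with the stated Content-Type. The code below only builds the request and does not send it. It assumes that expiration is serialized as an RFC 3339 timestamp (the usual Kubernetes Time format) and that no headers beyond Content-Type are required, neither of which this reference confirms. The function name is hypothetical.

```python
import urllib.request
from datetime import datetime, timezone


def build_upload_request(upload_status, expected_request_id, tarball, now=None):
    """Validate an UploadStatus dict and return a prepared PUT request."""
    if upload_status.get("requestID") != expected_request_id:
        raise ValueError("status belongs to a different upload request")
    # Assumed RFC 3339 format, e.g. "2030-01-01T00:00:00Z".
    expiration = datetime.fromisoformat(
        upload_status["expiration"].replace("Z", "+00:00")
    )
    now = now or datetime.now(timezone.utc)
    if now >= expiration:
        raise ValueError("signed URL expired; set a new requestID in the spec")
    return urllib.request.Request(
        upload_status["signedURL"],
        data=tarball,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload. Afterwards, comparing storedMD5Checksum with the checksum set in the spec is one way to confirm that the controller saw the same file.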