
2 posts tagged with "introduction"


kubectl notebook

Nick Stogner · 3 min read

Substratus has added the kubectl notebook command!

"Wouldn't it be nice to have a single command that containerized your local directory and served it as a Jupyter Notebook running on a machine with a bunch of GPUs attached?"

The conversation went something like that while we daydreamed about our preferred workflow. At that point in time we were hopping back and forth between Google Colab and our containers while developing an LLM training job.

"Annnddd it should automatically sync file changes back to your local directory so that you can commit your changes to git and kick off a long-running ML training job, containerized with the exact same Python version and packages!"

So we built it!

kubectl notebook -d .

And now it has become an integral part of our workflow as we build out the Substratus ML platform.


Design Goals

  1. One command should build, launch, and sync the Notebook.
  2. Users should only need a Kubeconfig - no other credentials.
  3. Admins should not need to set up networking, TLS, etc.

Implementation

We tackled our design goals using the following techniques:

  1. Implemented as a single Go binary, executed as a kubectl plugin.
  2. Signed URLs let users upload their local directory to a bucket without needing cloud credentials (similar to how popular consumer clouds handle uploads).
  3. Kubernetes port-forwarding serves remote notebooks without requiring admins to deal with networking or TLS concerns, and it leans on existing Kubernetes RBAC for access control.
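The signed-URL technique in (2) can be sketched with a minimal HMAC scheme. This is an illustrative stand-in, not the Substratus implementation: real cloud buckets (GCS, S3) provide their own signed-URL mechanisms, and the URL format and function names below are assumptions. The point is that binding the signature to the client-declared MD5 lets the server vet exactly what will be uploaded without handing out credentials.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signUploadURL mints an upload URL bound to the content MD5 the client
// declared and an expiry timestamp. The host and query format are
// invented for illustration.
func signUploadURL(secret []byte, objectPath, contentMD5 string, expUnix int64) (url, sig string) {
	payload := fmt.Sprintf("PUT\n%s\n%s\n%d", objectPath, contentMD5, expUnix)
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	sig = hex.EncodeToString(mac.Sum(nil))
	url = fmt.Sprintf("https://storage.example.com/%s?md5=%s&exp=%d&sig=%s",
		objectPath, contentMD5, expUnix, sig)
	return url, sig
}

// verifyUpload recomputes the signature server-side before accepting the
// upload, so a client cannot substitute a different payload than the one
// it declared. (A real server would also reject requests past expiry.)
func verifyUpload(secret []byte, objectPath, contentMD5 string, expUnix int64, sig string) bool {
	payload := fmt.Sprintf("PUT\n%s\n%s\n%d", objectPath, contentMD5, expUnix)
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	want := hex.EncodeToString(mac.Sum(nil))
	return hmac.Equal([]byte(want), []byte(sig))
}
```

Because only the server holds the secret, tampering with any field of the URL (the path, the hash, or the expiry) invalidates the signature.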

Some interesting details:

  • Builds are executed remotely for two reasons:
    • Users don't need to install Docker.
    • It avoids pushing massive container images from one's local machine (pip installs often inflate the final image to be much larger than the build context itself).
  • The client requests an upload URL by specifying the MD5 hash of the content it wishes to upload, allowing for server-side signature verification.
  • Builds are skipped entirely if the MD5 hash of the build context already exists in the bucket.
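As a rough sketch of the client side described above (function names are hypothetical, and a real client would first tar the local directory into a build context), the hash-then-upload flow might look like:

```go
package main

import (
	"bytes"
	"crypto/md5"
	"encoding/base64"
	"fmt"
	"net/http"
)

// contentMD5 returns the base64-encoded MD5 of the (already tarred) build
// context: the value a client sends when requesting a signed upload URL,
// and a natural cache key for skipping builds of identical contexts.
func contentMD5(buildContext []byte) string {
	sum := md5.Sum(buildContext)
	return base64.StdEncoding.EncodeToString(sum[:])
}

// uploadContext PUTs the context to a signed URL with a Content-MD5
// header, letting the bucket verify the payload matches the declared hash.
func uploadContext(signedURL string, buildContext []byte) error {
	req, err := http.NewRequest(http.MethodPut, signedURL, bytes.NewReader(buildContext))
	if err != nil {
		return err
	}
	req.Header.Set("Content-MD5", contentMD5(buildContext))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("upload failed: %s", resp.Status)
	}
	return nil
}
```

Since the hash is computed over the context bytes, two identical directories always hash to the same value, which is what makes the skip-if-already-uploaded check possible.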

The system underneath the notebook command:

[diagram]

More to come!

Lazy-loading large models from disk... Incremental dataset loading... Stay tuned to learn more about how Notebooks on Substratus can speed up your ML workflows.

Don't forget to star and follow the repo!

https://github.com/substratusai/substratus


Brandon Bjelland, Nick Stogner, Sam Stoelinga · 3 min read

We are excited to introduce Substratus, the open-source cross-cloud substrate for training and serving ML models with an initial focus on Large Language Models. Fine-tune and serve LLMs on your Kubernetes clusters in your cloud.

Can’t wait? Get started with our quick start docs or jump over to the GitHub repo.

Why Substratus?

Press the fast-button for ML: leverage out-of-the-box container images to load a base model, optionally fine-tune it with your own dataset, and spin up a model server, all without writing any code.

Notebook-integrated workflows: Launch a remote, containerized, GPU-enabled notebook from a local directory with a single command. Develop in the exact same environment as your long running training jobs.

No vendor lock-in: Substratus is open-source and can run anywhere Kubernetes runs.

Keep company data internal: Deploy in your cloud account. Training data and inference APIs stay within your company’s network.

Best practices by default: Substratus models are immutable and contain information about their lineage. Datasets are imported and snapshotted using off-the-shelf manifests. Training executes in containerized environments, using immutable base artifacts. Inference servers are pre-configured to leverage quantization on supported models. GitOps is built-in, not bolted-on.

Guiding Principles

As we continue to develop Substratus, we’re grounded in the following guiding principles:

1. Prioritize Simplicity

We believe the importance of minimizing complexity in software cannot be overstated. In Substratus, we will work hard to keep complexity to a minimum as the project grows. The Substratus API currently consists of four resource types: Datasets, Models, Servers, and Notebooks. The project currently depends on two cloud services outside of the cluster: a bucket and a container registry (we are working on making these optional too). The project does not (and will never) depend on a web of complex components like Istio.
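As an illustration of how small that surface area is, the four kinds might be declared like this (the four kinds come from this post, but the apiVersion and every spec field below are assumptions for illustration, not the actual Substratus schema):

```yaml
# Hypothetical manifests: kinds are real, fields are invented.
apiVersion: substratus.ai/v1
kind: Dataset
metadata:
  name: my-training-data
---
apiVersion: substratus.ai/v1
kind: Model
metadata:
  name: my-tuned-model
spec:
  trainingDataset: my-training-data   # lineage: which snapshot trained it
---
apiVersion: substratus.ai/v1
kind: Server
metadata:
  name: my-model-server
spec:
  model: my-tuned-model
---
apiVersion: substratus.ai/v1
kind: Notebook
metadata:
  name: my-notebook
spec:
  model: my-tuned-model               # develop against the same image
```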

2. Prioritize UX

We believe a company’s most precious resource is its engineers’ time. Substratus seeks to maximize the productivity of data scientists and engineers by providing a best-in-class user experience. We strive to build a set of well-designed primitives that allow ML practitioners to enter a flow state as they move between importing data, training, and serving models.

Roadmap

We are hard at work adding new functionality, focused on creating the most productive and enjoyable platform for ML practitioners. Coming soon:

  1. Support for AWS and Azure
  2. VS Code Notebook Integration
  3. Large-scale distributed training
  4. ML ecosystem integrations

Try Substratus today in your GCP project by following the quick start docs. Let us know what features you would like to see on our GitHub repo and don’t forget to add a star!