Releases: dstackai/dstack-enterprise

0.19.35-v1

30 Oct 08:38

Runpod

Instant Clusters

dstack adds support for Runpod Instant Clusters, enabling multi-node tasks on Runpod:

✗ dstack apply -f nccl-tests.dstack.yaml -b runpod

 Project          main                                    
 User             admin                                   
 Configuration    .dstack/confs/nccl-tests-simple.yaml    
 Type             task                                    
 Resources        cpu=2.. mem=8GB.. disk=100GB.. gpu:1..8 
 Spot policy      auto                                    
 Max price        -                                       
 Retry policy     -                                       
 Creation policy  reuse-or-create                         
 Idle duration    5m                                      
 Max duration     -                                       
 Reservation      -                                       

 #  BACKEND           RESOURCES                          INSTANCE TYPE     PRICE    
 1  runpod (US-KS-2)  cpu=128 mem=2008GB disk=100GB      NVIDIA A100-SXM…  $16.7…   
                      A100:80GB:8                                                   
 2  runpod (US-MO-1)  cpu=128 mem=2008GB disk=100GB      NVIDIA A100-SXM…  $16.7…   
                      A100:80GB:8                                                   
 3  runpod            cpu=160 mem=1504GB disk=100GB      NVIDIA H100 80G…  $25.8…   
    (CA-MTL-1)        H100:80GB:8                                                   
    ...                                                                             
 Shown 3 of 5 offers, $34.464 max

Submit the run nccl-tests? [y/n]: 

Runpod offers clusters of 2 to 8 nodes with H200, B200, H100, and A100 GPUs and InfiniBand networking up to 3200 Gbps.
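A multi-node task on Runpod can be expressed with a configuration along these lines (a minimal sketch; the name, node count, GPU spec, and command are illustrative):

```yaml
type: task
name: nccl-tests

# Provision two cluster nodes together on Runpod
nodes: 2

backends: [runpod]

resources:
  gpu: A100:80GB:8

commands:
  # DSTACK_NODE_RANK / DSTACK_NODES_NUM identify each node in the cluster
  - echo "Node $DSTACK_NODE_RANK of $DSTACK_NODES_NUM"
```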

What's Changed

New Contributors

Full Changelog: dstackai/dstack@0.19.34...0.19.35

0.19.34-v2

27 Oct 15:26

What's changed

0.19.34-v1

23 Oct 10:13

UI

Scheduled runs

The Run details page now shows Schedule and Next time for scheduled runs.
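For reference, a scheduled run is configured via the schedule property in the run configuration (a sketch; the name, cron expression, and command are illustrative):

```yaml
type: task
name: nightly-job

# Run daily at 08:00 UTC; the Run details page shows
# the schedule and the next scheduled time
schedule:
  cron: "0 8 * * *"

commands:
  - python retrain.py
```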

Finished times

The Run and Job list and details pages now display Finished times.

Backends

GCP

GCP G4 instances with NVIDIA RTX PRO 6000 GPUs are now generally available:

> dstack offer -b gcp --gpu RTXPRO6000                                          

 #  BACKEND            RESOURCES                                        INSTANCE TYPE    PRICE      
 1  gcp (us-central1)  cpu=48 mem=180GB disk=100GB RTXPRO6000:96GB:1    g4-standard-48   $4.5001    
 2  gcp (us-central1)  cpu=96 mem=360GB disk=100GB RTXPRO6000:96GB:2    g4-standard-96   $9.0002    
 3  gcp (us-central1)  cpu=192 mem=720GB disk=100GB RTXPRO6000:96GB:4   g4-standard-192  $18.0003   
 4  gcp (us-central1)  cpu=384 mem=1440GB disk=100GB RTXPRO6000:96GB:8  g4-standard-384  $36.0006

Also, GCP A4 instances are supported via reservations.

Runs

SSH keys

dstack now uses server-managed user SSH keys when starting new runs. This allows users to attach to the run from different machines, since the SSH key is automatically replicated to all clients. User-supplied SSH keys are still used if specified explicitly.

Docs

Kubernetes

The Kubernetes backend docs now include a list of required permissions.

What's Changed

Full Changelog: dstackai/dstack@0.19.33...0.19.34

0.19.33-v1

16 Oct 17:38

UI

Dev environments

You can now configure and provision dev environments directly from the user interface.

Note

CLI version 0.19.33 or later is required to attach to runs created from the UI.

GCP

Reservations

You can now configure specifically targeted GCP reservations in fleet configurations to leverage reserved compute capacity:

type: fleet
nodes: 4
placement: cluster
backends: [gcp]
reservation: my-reservation

For reservations shared between projects, use the full syntax:

type: fleet
nodes: 4
placement: cluster
backends: [gcp]
reservation: projects/my-proj/reservations/my-reservation

dstack will automatically locate the specified reservation, match offers to the reservation's properties, and provision instances within the reservation. If there are multiple reservations with the specified name, all of them will be considered for provisioning.

Note

Using reservations requires the compute.reservations.list permission in the project that owns the reservation.

G4 preview

If your GCP project has access to the preview G4 instance type, you can now try it out with dstack.

> dstack offer -b gcp --gpu RTXPRO6000

 #  BACKEND            RESOURCES                                      INSTANCE TYPE   PRICE
 1  gcp (us-central1)  cpu=48 mem=180GB disk=100GB RTXPRO6000:96GB:1  g4-standard-48  $0

To use G4, enable its preview in the backend settings.

projects:
- name: main
  backends:
  - type: gcp
    project_id: my-project
    creds:
      type: default
    preview_features: [g4]

AMD

Kubernetes

The kubernetes backend now allows you to run workloads on AMD GPU-enabled Kubernetes clusters.

> dstack offer -b kubernetes

 #  BACKEND         RESOURCES                                   INSTANCE TYPE         PRICE    
 1  kubernetes (-)  cpu=19 mem=225GB disk=628GB MI300X:192GB:1  pool-iqg23mp4v-po04e  $0
 2  kubernetes (-)  cpu=19 mem=225GB disk=628GB MI300X:192GB:1  pool-iqg23mp4v-po04g  $0

Hot Aisle

The hotaisle backend now supports 8x MI300X instances too.

> dstack offer -b hotaisle --gpu 8:MI300X                         

 #  BACKEND                   RESOURCES                                       INSTANCE TYPE                      PRICE    
 1  hotaisle (us-michigan-1)  cpu=104 mem=1792GB disk=12288GB MI300X:192GB:8  8x MI300X 104x Xeon Platinum 8470  $15.92
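To provision such an instance ahead of time, a fleet configuration along these lines should work (a sketch; the fleet name is illustrative):

```yaml
type: fleet
name: hotaisle-fleet

nodes: 1
backends: [hotaisle]

resources:
  gpu: MI300X:192GB:8
```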

Docker

Default image

The default Docker image now uses CUDA 12.8 (updated from 12.1).

What's changed

Full changelog: dstackai/dstack@0.19.32...0.19.33

0.19.32-v1

09 Oct 11:09

Fleets

Nodes

Maximum number of nodes

The fleet nodes.max property is now respected, allowing you to limit the maximum number of instances in a fleet. For example, to allow at most 10 instances in the fleet, you can do:

type: fleet
name: cloud-fleet
nodes: 0..10

A fleet will be considered for a run only if the run can fit into the fleet without violating nodes.max. If you don't need to enforce an upper limit, you can omit it:

type: fleet
name: cloud-fleet
nodes: 0..

Backends

Nebius

Tags

The nebius backend now supports backend-level and resource-level tags for tagging cloud resources provisioned via dstack:

type: nebius
creds:
  type: service_account
  # ...
tags:
  team: my_team
  user: jake

Credentials file

It's also possible to configure the nebius backend using a credentials file generated by the nebius CLI:

nebius iam auth-public-key generate \
    --service-account-id <service account ID> \
    --output ~/.nebius/sa-credentials.json
projects:
- name: main
  backends:
  - type: nebius
    creds:
      type: service_account
      filename: ~/.nebius/sa-credentials.json

Hot Aisle

The hotaisle backend now supports multi-GPU VMs such as 2x MI300X and 4x MI300X.

dstack apply -f .local/.dstack.yml --gpu amd:2
The working_dir is not set — using legacy default "/workflow". Future versions will default to the
image's working directory.

 #  BACKEND               RESOURCES                                 INSTANCE TYPE        PRICE
 1  hotaisle              cpu=26 mem=448GB disk=12288GB             2x MI300X 26x Xeon…  $3.98
    (us-michigan-1)       MI300X:192GB:2

What's changed

New contributors

Full changelog: dstackai/dstack@0.19.31...0.19.32

0.19.31-v1

02 Oct 13:00

Kubernetes

The kubernetes backend introduces many significant improvements and has now graduated from alpha to beta. It is much more stable and can be reliably used on GPU clusters for all kinds of workloads, including distributed tasks.

Here's what changed:

  • Resource allocation now fully respects the user’s resources specification. Previously, it ignored certain aspects, especially the proper selection of GPU labels according to the specified gpu spec.
  • Distributed tasks now fully work on Kubernetes clusters with fast interconnect enabled. Previously, this caused many issues.
  • Added support for the privileged property.

We’ve also published a dedicated guide on how to get started with dstack on Kubernetes, highlighting important nuances.

Warning

Be aware of breaking changes if you used the kubernetes backend before. The following properties in the Kubernetes backend configuration have been renamed:

  • networking → proxy_jump
  • ssh_host → hostname
  • ssh_port → port

Additionally, the "proxy jump" pod and service names now include a dstack- prefix.
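Put together, a kubernetes backend configuration with the renamed properties looks along these lines (the hostname, port, and kubeconfig path are illustrative):

```yaml
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    proxy_jump:            # previously `networking`
      hostname: 1.2.3.4    # previously `ssh_host`
      port: 32000          # previously `ssh_port`
```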

GCP

A4 spot instances with B200 GPUs

The gcp backend now supports A4 spot instances equipped with B200 GPUs. This includes provisioning both standalone A4 instances and A4 clusters with high-performance RoCE networking.

To use A4 clusters with high-performance networking, you must configure multiple VPCs in your backend settings (~/.dstack/server/config.yml):

projects:
- name: main
  backends:
  - type: gcp
    project_id: my-project
    creds:
      type: default
    vpc_name: my-vpc-0   # regular, 1 subnet
    extra_vpcs:
    - my-vpc-1   # regular, 1 subnet
    roce_vpcs:
    - my-vpc-mrdma   # RoCE profile, 8 subnets

Then, provision a cluster using a fleet configuration:

type: fleet

nodes: 2
placement: cluster

availability_zones: [us-west2-c]
backends: [gcp]

spot_policy: spot

resources:
  gpu: B200:8

Each instance in the cluster will have 10 network interfaces: 1 regular interface in the main VPC, 1 regular interface in the extra VPC, and 8 RDMA interfaces in the RoCE VPC.

Note

Currently, the gcp backend only supports A4 spot instances. Support for other options, such as flex and calendar scheduling via Dynamic Workload Scheduler, is coming soon.

CUDA drivers

CUDA drivers in dstack's default aws, gcp, azure, and oci OS images have been updated from 535 to 570 by @jvstme in dstackai/dstack#3099

CLI

dstack project is now faster

The USER column in dstack project list is now shown only when the --verbose flag is used.
This significantly improves performance for users with many configured projects, reducing execution time from ~20 seconds to as little as 2 seconds in some cases.

dstack init with private repos

Previously, dstack init (or dstack apply with repo credentials) might erroneously detect a private repo as public under specific circumstances. This is resolved now.

What's changed

Full changelog: dstackai/dstack@0.19.29...0.19.31

0.19.29-v1

16 Sep 10:27

Fleets

Over the last few releases, we’ve been reworking how fleets work to radically simplify management and make it fully declarative.

Previously, you had to specify a fleet via fleets explicitly — otherwise, dstack always created a new one. Now, dstack automatically picks an existing fleet if it fits the requirements, creating a new one only when needed.

For more on the fleet roadmap, see this meta issue.

User Interface

Grouping offers by backend

The Offers page in the UI now lets you group available offers by backend, making it easier to compare options across cloud providers.

Breaking changes

  • The tensordock backend hasn’t worked for a long time (due to the API it relied on being deprecated) and has now been removed.

What's changed

Full changelog: dstackai/dstack@0.19.28...0.19.29

0.19.28-v1

10 Sep 10:35

CLI

Argument Handling

The CLI now properly handles unrecognized arguments and rejects them with clear error messages. The ${{ run.args }} interpolation for tasks and services is still supported but now requires the -- pseudo-argument separator:

dstack apply --reuse -- --some=arg --some-option

This change prevents accidental typos in command arguments from being silently ignored.
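For context, ${{ run.args }} expands inside a task's commands, so everything after -- is passed through to the run (a minimal sketch; the name and script are illustrative):

```yaml
type: task
name: train

commands:
  # Arguments after `--` in `dstack apply` are substituted here
  - python train.py ${{ run.args }}
```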

What's Changed

Full Changelog: dstackai/dstack@0.19.27...0.19.28

0.19.27-v1

04 Sep 12:33

Run configurations

Repo directory

It's now possible to specify the directory in the container where the repo is mounted:

type: dev-environment

ide: vscode

repos:
  - local_path: .
    path: my_repo

  # or using short syntax:
  # - .:my_repo

The path property can be an absolute path or a relative path (with respect to working_dir). It's available inside the run as the $DSTACK_REPO_DIR environment variable. If path is not set, the /workflow path is used.
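For instance, commands can rely on $DSTACK_REPO_DIR instead of hard-coding the mount path (a sketch; the repo path and command are illustrative):

```yaml
type: task

repos:
  - local_path: .
    path: my_repo

commands:
  # $DSTACK_REPO_DIR resolves to the configured repo path
  - cd $DSTACK_REPO_DIR && pip install -r requirements.txt
```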

Working directory

Previously, the working_dir property had complicated semantics: it defaulted to the repo path (/workflow), but for tasks and services without commands, the image's working directory was used. You could also specify a custom working_dir relative to the repo directory. This is now reversed: you specify working_dir as an absolute path, and the repo path can be specified relative to it.

Note

During the transition period, the legacy behavior of using /workflow is preserved if working_dir is not set. In future releases, this will be simplified, and working_dir will always default to the image's working directory.

Fleet configuration

Nodes, retry, and target

dstack now indefinitely maintains nodes.min specified for cloud fleets. If instances get terminated for any reason and there are fewer instances than nodes.min, dstack will provision new fleet instances in the background.

There is also a new nodes.target property that specifies the number of instances to provision on fleet apply. Since nodes.min is now always maintained, you may set nodes.target higher than nodes.min to provision more instances than need to be maintained.

Example:

type: fleet
name: default-fleet
nodes:
  min: 1 # Maintain one instance
  target: 2 # Provision two instances initially
  max: 3

dstack will provision two instances. After deleting one instance, there will be one instance left. Deleting the last instance will trigger dstack to re-create the instance.

Offers

The UI now has a dedicated page showing GPU offers available across all configured backends.

DigitalOcean and AMD Developer Cloud

The release adds native integration with DigitalOcean and AMD Developer Cloud.

A backend configuration example:

projects:
- name: main
  backends:
  - type: amddevcloud
    project_name: TestProject
    creds:
      type: api_key
      api_key: ...

For DigitalOcean, set type to digitalocean.

The digitalocean and amddevcloud backends support NVIDIA and AMD GPU VMs, respectively, and allow you to run dev environments (interactive development), tasks (training, fine-tuning, or other batch jobs), and services (inference).

Security

Important

This update fixes a vulnerability in the cloudrift, cudo, and datacrunch backends. Instances created with earlier dstack versions lack proper firewall rules, potentially exposing internal APIs and allowing unauthorized access.

Users of these backends are advised to update to the latest version and re-create any running instances.

What's changed

New contributors

Full changelog: 0.19.10-v2...0.19.27-v1

0.19.26-v1

28 Aug 11:02

Repos

Previously, dstack always required running the dstack init command before use. This also meant that dstack would always mount the current folder as a repo.

With this update, repo configuration is now explicit and declarative. If you want to use a repo in your run, you must specify it with the new repos property. The dstack init command is now only used to provide custom Git credentials when working with private repos.

For example, imagine you have a cloned Git repo with an examples subdirectory containing a .dstack.yml file:

type: dev-environment
name: vscode    

repos:
  # Mounts the parent directory of `examples` (must be a Git repo)
  #   to `/workflow` (the default working directory)
  - ..

ide: vscode

When you run this configuration, dstack fetches the repo on the instance, applies your local changes, and mounts it—so the container always matches your local repo.

Sometimes you may want to mount a Git repo without cloning it locally. In that case, simply provide a URL in repos:

type: dev-environment
name: vscode    

repos:
  # Clone the specified repo to `/workflow` (the default working directory)
  - https://github.com/dstackai/dstack

ide: vscode

If the repo is private, dstack will automatically try to use your default Git credentials (from ~/.ssh/config or ~/.config/gh/hosts.yml).

To configure custom Git credentials, use dstack init.

Note

If you previously initialized a repo via dstack init, it will still be mounted. Be sure to migrate to repos, as implicitly configured repos are deprecated and will stop working in future releases.

If you no longer want to use the implicitly configured repo, run dstack init --remove.

Note

Currently, you can configure only one repo per run configuration.

Fleets

Previously, when dstack added new instances to existing fleets, it ignored the fleet configuration and used only the run configuration for which the instance was created. This could result in fleets containing instances that didn’t match their configuration.

This has now been fixed: fleet configurations and run configurations are intersected so that provisioned instances respect both. For example, given a fleet configuration:

type: fleet
name: cloud-fleet
placement: any
nodes: 0..2
backends:
  - runpod

and a run configuration:

type: dev-environment
ide: vscode
spot_policy: spot
fleets:
  - cloud-fleet

dstack will provision a RunPod spot instance in cloud-fleet.

This change lets you define main provisioning parameters in fleet configurations, while adjusting them in run configurations as needed.

Note

Currently, the run plan does not take fleet configuration into account when showing offers, since the target fleet may not be known beforehand. We plan to improve this by showing offers for all candidate fleets.

Examples

Wan2.2

We've added a new example demonstrating how to use Wan2.2, the new open-source SOTA text-to-video model, to generate videos.

Internals

Pyright integration

We now use pyright for type checking dstack Python code in CI. If you contribute to dstack, we recommend you configure your IDE to use pyright/pylance with standard type checking mode.

What's changed

Full changelog: dstackai/dstack@0.19.25...0.19.26