diff --git a/blueprints/per-user-dev-environments.html.md b/blueprints/per-user-dev-environments.html.md
index 0c0df40484..0d5a612a41 100644
--- a/blueprints/per-user-dev-environments.html.md
+++ b/blueprints/per-user-dev-environments.html.md
@@ -2,7 +2,7 @@
 title: Per-User Dev Environments with Fly Machines
 layout: docs
 nav: guides
-date: 2025-04-02
+date: 2026-05-07
 ---
@@ -61,6 +61,7 @@ You'll want to spin up at least one Machine per user app (but apps can have as m
 - **Machines & volumes are tied to physical hardware:** hardware failures can destroy machines and attached volumes. **Always persist important user data** (code, config, outputs) to external storage (like [Tigris Data](/docs/tigris/#main-content-start) or AWS S3).
 - **Your users will break their environments:** pre-create standby machines to handle hardware & runtime failures, or the inevitable user or robot poisoned environment. Pre-create standby machines that you can quickly activate in these scenarios.
 - **Machine restarts reset ephemeral filesystem:** the temporary Fly Machine filesystem state resets on Machine restarts, ensuring clean environments. However, volume data remains persistent, making it useful for retaining user progress or state.
+- **Prefer one app per user:** putting all user Machines into a single app and routing to them with dynamic routing works, but autostop won't behave the way you expect at scale. The Fly Proxy's stop loop is rate-limited: it [stops or suspends at most one Machine per region each pass, and runs every few minutes](/docs/reference/fly-proxy-autostop-autostart/#fly-proxy-process-to-stop-or-suspend-machines). That's fine for a normal app, but with thousands of Machines the loop can't keep up, and most of your idle Machines stay running. Machines in a single app also share app-level secrets and a flat [private network](/docs/networking/private-networking/), so a compromised user environment can reach every other Machine in the app. If you do keep all user Machines in one app, implement stop-when-idle behavior in your app or orchestrator (see [apps that shut down when idle](/docs/launch/autostop-autostart/#apps-that-shut-down-when-idle)) instead of relying on the Fly Proxy to keep most Machines stopped.
 ## Related reading
diff --git a/launch/autostop-autostart.html.markerb b/launch/autostop-autostart.html.markerb
index a6d75eb0de..ef0edfbbfa 100644
--- a/launch/autostop-autostart.html.markerb
+++ b/launch/autostop-autostart.html.markerb
@@ -14,6 +14,8 @@ Get all the details of [how Fly Proxy autostop/autostart works](/docs/reference/
 Autostop/autostart works well for apps with highly variable workloads, for smaller apps with low or sporadic traffic, and for most apps that aren't receiving requests continuously. You can reduce resource usage and costs by using autostop/autostart to manage your Fly Machines as demand decreases and increases. You'll never have to run excess Machines to handle peak load; you'll only run, and get charged for, the number of Machines that you need. You can choose to keep one or more Machines running in your primary region.
+Autostop/autostart isn't a fit for every workload. The Fly Proxy's stop loop runs every few minutes and stops at most one Machine per region per pass. That's fine for normal apps, but if you're running thousands of Machines in a single app (for example, [per-user dev environments](/docs/blueprints/per-user-dev-environments/)), the loop can't keep up. In that case, use [one app per user](/docs/machines/guides-examples/one-app-per-user-why/) with [dynamic routing](/docs/networking/dynamic-request-routing/), or have your app shut itself down when idle.
+
 ## Configure autostop/autostart
 The autostop/autostart settings are part of each service in an app's `fly.toml` file. See the [[[services]]](/docs/reference/configuration/#the-services-sections) or [[http_service]](/docs/reference/configuration/#the-http_service-section) docs for details about service configuration. You can also add services to [private apps](#private-apps).
diff --git a/reference/fly-proxy-autostop-autostart.html.markerb b/reference/fly-proxy-autostop-autostart.html.markerb
index 915ee43890..ff3e989563 100644
--- a/reference/fly-proxy-autostop-autostart.html.markerb
+++ b/reference/fly-proxy-autostop-autostart.html.markerb
@@ -39,6 +39,10 @@ Fly Proxy determines excess capacity per region as follows:
 * the proxy checks if the Machine has any traffic
 * if the Machine has no traffic (a load of 0), then the proxy stops or suspends the Machine
+
+**At scale**, the rate-limited loop becomes the constraint: with thousands of Machines in a single app, the proxy can't stop them fast enough to keep most idle Machines stopped. If that's your use case (for example, [per-user dev environments](/docs/blueprints/per-user-dev-environments/)), use [one app per user](/docs/machines/guides-examples/one-app-per-user-why/) with [dynamic routing](/docs/networking/dynamic-request-routing/), or have your app shut down when idle.
+
+
 ### Fly Proxy process to start Machines
 When `auto_start_machines = true` in your `fly.toml`, the Fly Proxy restarts a Machine in the nearest region when required.
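
Reviewer note: the "loop can't keep up" claim in the added text can be checked with rough arithmetic. The sketch below is a back-of-the-envelope estimate, not a model of the Fly Proxy's actual scheduling. The 5-minute pass interval is an assumption (the docs only say "every few minutes"), the function names are hypothetical, and it assumes stopped Machines are not started again in the meantime.

```python
def max_stops_per_day(regions: int, pass_interval_minutes: float) -> float:
    """Upper bound on Machines the stop loop can stop per day for one app.

    Assumes at most one Machine is stopped per region on each pass,
    as described in the docs text above.
    """
    passes_per_day = 24 * 60 / pass_interval_minutes
    return regions * passes_per_day


def days_to_stop_all(idle_machines: int, regions: int,
                     pass_interval_minutes: float) -> float:
    """Best-case days needed to stop every idle Machine in one app."""
    return idle_machines / max_stops_per_day(regions, pass_interval_minutes)


if __name__ == "__main__":
    # ASSUMPTION: a pass every 5 minutes; the real interval is not specified here.
    print(max_stops_per_day(regions=1, pass_interval_minutes=5))
    print(days_to_stop_all(idle_machines=5000, regions=1, pass_interval_minutes=5))
```

Under those assumptions, a single-region app gets at most a few hundred stops per day, so an app with thousands of idle Machines would need many days to drain even if none of them restarted.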