feat(cli): add concurrent images pulling#4394
Conversation
sweatybridge
left a comment
There was a problem hiding this comment.
Can we check how docker compose downloads images and import that as a library instead? https://github.com/docker/compose/blob/main/pkg/compose/images.go
|
What about the console logs when pulling images concurrently? Docker compose has a nice display for each image and clears up when downloads complete. |
|
Hmm ok, I will explore a bit more if you don't mind. Their progress logs is something I've always wanted to try. |
If it's only the progress bar you're interested in, then maybe that's doable, it seems like it's handled on it's own separate package in their codebase. |
|
Made a new version keeping the concurrency logic in our code (since it's not exposed by compose). But improving the implementation by looking how compose does it (errgroup, channel). Also added a 1s delay at the start between each image pull, this reduce greatly the number of "api rate limit" encountered which overall reduce the total pulling time. And re-use part of compose for better logging (docker-compose style): Also thank's for this new more detailed log (time for each image) we can see the largest image (postgres) takes ~1min to download, which match our total time of running from cached image (30s) + network speed (1min to download the largest). |
…image-pulls-in-supabase-cli-start
Pull Request Test Coverage Report for Build 19627513060Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
|
Closing in favor of: #4447 Which branch from 3b6ecd4 before the attempt to directly use the docker-compose approach. Reason are:
|
…image-pulls-in-supabase-cli-start
…image-pulls-in-supabase-cli-start
…se#5681) ## What changed `supabase start` could start containers before their Docker images had finished downloading. The command ran two uncoordinated image-acquisition paths: 1. A best-effort concurrent pre-pull (`pullImagesUsingCompose`) using docker-compose's `Pull` with `PullOptions{IgnoreFailures: true}`. It only targets the primary registry and, by design, **silently swallows per-image pull failures** (the `IgnoreFailures` flag is the hook that lets the registry fallback recover). 2. An authoritative lazy per-container pull inside `utils.DockerStart` → `DockerResolveImageIfNotCached` (multi-registry fallback: ECR → GHCR → Docker Hub). So any image the concurrent pre-pull failed to cache — a transient registry/network/rate-limit hiccup, common on a fresh machine pulling 10+ images at once — was pulled **later**, during the `Starting database… / Starting containers…` phase. That is the "start doesn't wait for pulls" behaviour from the issue. The pre-pull was added in supabase#4394, matching the reporter's "last few versions" regression window. ## The fix Add `ensureImagesCached`, a completeness pass that runs immediately after the best-effort pre-pull and before any container starts. It resolves every project image through the **same** multi-registry fallback resolver `DockerStart` already uses (`DockerResolveImageIfNotCached`), fanned out concurrently via the existing `utils.WaitAll` primitive. After it returns, every required image is guaranteed present in the local cache, so the per-container `DockerStart` calls become pure cache hits and never pull mid-start. On the happy path it is just N cheap image inspects; an image that genuinely cannot be pulled from any registry now fails the start cleanly **before** any container is created, instead of limping into a half-pulled start. The compose pre-pull (and its `IgnoreFailures`) is kept as the fast concurrent progress UI — it is simply no longer relied on for completeness. ## TypeScript port The native-TS port already pulls in a preparation phase that is awaited before startup and fails hard on pull errors, so it does not have this bug. This PR adds regression guards locking that contract in: - `Stack.unit.test.ts`: `stack.start()` aborts and starts zero containers when a docker pull fails. - `prefetch.unit.test.ts`: preparation fails with `DockerPullError` when the whole registry fallback chain fails. Fixes supabase#5068



What kind of change does this PR introduce?
supabase startall images are pulled at the time the service is starting, so each services need to wait until it's started before pulling something else. This change the logic into two steps:On my machine/network, this reduce the "cold start" (no images at all) of
supabase startfrom ~2:43.19 total to ~1:30.33 total.Note that even with all images already present there is a ~30s uncompressible delay to start all the containers. So this only bring the cold start download overhead from ~2.13min to ~1min.