Improve download UX and correctness #1764

Open
fl0rianr wants to merge 1 commit into lemonade-sdk:main from fl0rianr:fix/download_manager

fl0rianr (Collaborator) commented Apr 29, 2026

Summary

(completely reworked)

Improves backend install progress for multi-artifact downloads.

Backend installs are not always a single file: vLLM ROCm can use split archives, and ROCm-stable backends may need a TheRock runtime after the backend archive. This PR makes progress reporting consistent across those cases without changing how artifacts are selected, downloaded, or extracted.

TL;DR: Progress now follows the actual install steps instead of relying on hardcoded assumptions about one or two files.

What changed

  • Split archive parts report progress as individual logical files.
  • total_download_size is only used when the real full total is known.
  • Backend + runtime installs are treated as one logical install flow.
  • Completion is only reported after all required backend/runtime work is done.
  • Already-installed runtimes are not shown as extra download steps.
  • Newly created backend installs are rolled back if a required runtime step fails.
  • The Download Manager uses file-count progress until all file sizes are known.
  • While follow-up sizes are unknown, the total is shown as a lower bound based on the known sizes, e.g. 1.5 GB+.
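
The display rules in the last two bullets can be sketched as follows. This is a minimal illustration, not the code in this PR: `LogicalFile`, `format_size`, and `summarize` are hypothetical names, and the decimal GB/MB formatting is an assumption. Only the `total_download_size` field name comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LogicalFile:
    """Hypothetical record for one logical file in an install."""
    name: str
    size_bytes: Optional[int]  # None while the size is not yet known


def format_size(num_bytes: int) -> str:
    """Format a byte count as decimal GB or MB with one decimal place."""
    if num_bytes >= 10**9:
        return f"{num_bytes / 10**9:.1f} GB"
    return f"{num_bytes / 10**6:.1f} MB"


def summarize(files: Sequence[LogicalFile], done_files: int, done_bytes: int) -> dict:
    """Return a progress fraction and a total-size label.

    done_files: number of files that are fully finished
    done_bytes: bytes downloaded so far across all files
    """
    if not files:
        return {"fraction": 0.0, "total_label": "", "total_download_size": None}

    known = [f.size_bytes for f in files if f.size_bytes is not None]
    known_total = sum(known)

    if len(known) == len(files) and known_total > 0:
        # Every size is known: use byte progress against the real total.
        return {
            "fraction": min(done_bytes / known_total, 1.0),
            "total_label": format_size(known_total),
            "total_download_size": known_total,
        }

    # At least one size is unknown: fall back to file-count progress
    # and show the known sizes as a lower bound.
    return {
        "fraction": done_files / len(files),
        "total_label": format_size(known_total) + "+",
        "total_download_size": None,
    }
```

With one finished 1.5 GB archive and a runtime of unknown size, this sketch reports file-count progress of 0.5 and the label `1.5 GB+`. Once the runtime size is known, it switches to byte-based progress against the full total.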

Why this is safer

This makes backend installs safer because the UI no longer has to infer completion from partial progress data.

The install flow now has clearer invariants:

  • current-file bytes describe the current file only
  • total size is only used when it is actually known
  • completion means the whole backend install is complete
  • runtime steps are only included when they actually need to run

This reduces the risk of showing a backend as complete while required follow-up work is still pending, and it prevents partially installed backend directories from being left behind when a runtime step fails.
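
The cleanup behavior can be illustrated with a small sketch. It assumes the backend lives in a single directory and that the backend and runtime steps are plain callables. `install_with_rollback` and the stub functions are hypothetical and are not taken from the PR.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def install_with_rollback(
    backend_dir: Path,
    install_backend: Callable[[Path], None],
    install_runtime: Callable[[], None],
) -> None:
    """Install a backend, then its runtime; undo a fresh backend on runtime failure."""
    # Remember whether this call creates the directory, so that a
    # pre-existing install is never deleted by the rollback.
    created_here = not backend_dir.exists()
    install_backend(backend_dir)
    try:
        install_runtime()
    except Exception:
        if created_here:
            shutil.rmtree(backend_dir, ignore_errors=True)
        raise


if __name__ == "__main__":
    def fake_backend(path: Path) -> None:
        # Stand-in for extracting the backend archive.
        path.mkdir(parents=True, exist_ok=True)
        (path / "backend.bin").write_bytes(b"stub")

    def failing_runtime() -> None:
        # Stand-in for a runtime step that fails.
        raise RuntimeError("runtime download failed")

    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "backend"
        try:
            install_with_rollback(target, fake_backend, failing_runtime)
        except RuntimeError:
            pass
        print(target.exists())  # False
```

Tracking `created_here` is the design choice that matters: only a directory created by this install attempt is removed, so a failed runtime step cannot delete a backend that was already present.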

Why this is future-proof

The old flow effectively assumed simple cases: one backend archive, or one backend archive plus one runtime step.

The new flow models backend installation as a sequence of logical files/steps. That makes it easier to support future backend layouts, for example:

  • split archives with more than two parts
  • multiple runtime artifacts
  • backend installs where some follow-up artifacts are already present
  • future backends that mix archive parts and runtime dependencies

Adding another runtime step should not require reworking progress semantics again; it should only add another step to the install sequence.
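
One way to picture the sequence model is the sketch below. It is written under assumptions: `Step`, `build_plan`, `progress_events`, the event fields, and the artifact names are all hypothetical, and the actual implementation in this PR may be structured differently.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Set


@dataclass
class Step:
    """Hypothetical logical step: an archive part or a runtime artifact."""
    label: str
    kind: str  # "archive_part" or "runtime"


def build_plan(
    archive_parts: List[str],
    runtimes: List[str],
    installed_runtimes: Set[str],
) -> List[Step]:
    """Build the install sequence, skipping runtimes that are already present."""
    plan = [Step(part, "archive_part") for part in archive_parts]
    plan += [Step(rt, "runtime") for rt in runtimes if rt not in installed_runtimes]
    return plan


def progress_events(plan: List[Step]) -> Iterator[Dict]:
    """Yield one event per step, then a single completion event at the end."""
    total = len(plan)
    for index, step in enumerate(plan, start=1):
        yield {
            "file": step.label,
            "file_index": index,
            "total_files": total,
            "complete": False,
        }
    # Completion is reported only after every step has been processed.
    yield {"complete": True}
```

In this sketch, a three-part split archive with a missing runtime produces four steps, and the same backend with the runtime already installed produces three. Supporting a second runtime artifact means passing one more entry in `runtimes`; the progress logic stays unchanged.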

Why this is low risk

This does not change the actual install mechanism:

  • no new download sources
  • no release selection changes
  • no archive extraction changes
  • no backend version resolution changes
  • no changes to HuggingFace/model downloads

The PR only makes progress metadata, completion timing, and cleanup behavior more accurate for existing backend install flows.

Scope

Limited to backend install progress, display, and cleanup after failed runtime installs.

Model downloads are intentionally untouched because the HuggingFace manifest path already has explicit file sizes, total size, resume handling, and validation.

Validation

Manually checked:

  • vLLM ROCm split archive progress continues across parts
  • lower-bound total size is shown while later sizes are unknown
  • ROCm-stable + TheRock completes only after TheRock finishes
  • normal single-archive backend installs still behave normally
  • llama.cpp ROCm with TheRock missing shows runtime as a follow-up step
  • llama.cpp ROCm with TheRock already installed stays a normal single backend download

  • progress never reaches 100% before the install is complete
  • progress covers every file of a multi-file download
  • no hardcoded limit of two files