User-pulled embedding models missing "embeddings" label: /v1/embeddings returns 501 #1745

@huveewomg

Platform

Windows

Lemonade Version

10.2.0

GPU / APU Model

AMD Ryzen AI 7 Pro 350 w/ Radeon 860M

Component

Other / Not sure

Bug Description

Embedding models pulled from the HuggingFace section in the Lemonade app are missing the embeddings label in user_models.json. They only receive ["custom"]. Without this label, the llamacpp backend does not pass --embeddings to llama-server, and the /v1/embeddings endpoint returns 501.

The HF variants flow (hf_variants.cpp) generates the correct suggested_labels based on the checkpoint name containing "embed", but the labels are not carried through to register_user_model() when pulling through the app UI. Only curated server models (e.g. Qwen3-Embedding-, nomic-embed-) have the label applied correctly.
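The expected behavior can be illustrated with a minimal sketch. `suggest_labels` is a hypothetical helper, not Lemonade's actual code (the real logic lives in hf_variants.cpp and is not shown in this report), and the case-insensitive match is an assumption.

```python
def suggest_labels(checkpoint: str) -> list:
    """Hypothetical helper: derive labels for a user-pulled checkpoint."""
    # Per the report, user-pulled models currently end up with only "custom".
    labels = ["custom"]
    # Assumption: a case-insensitive substring match on "embed".
    if "embed" in checkpoint.lower():
        labels.append("embeddings")
    return labels


print(suggest_labels("Abiray/zembed-1-Q4_K_M-GGUF:Q4_K_M"))
```

If the labels computed this way were passed through to register_user_model(), the entry in user_models.json would carry "embeddings" without manual editing.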

Steps to Reproduce

  1. In the Lemonade app, browse the HuggingFace model section
  2. Pull an embedding model (e.g. Abiray/zembed-1-Q4_K_M-GGUF)
  3. Check %USERPROFILE%\.cache\lemonade\user_models.json
  4. Labels show ["custom"] only; "embeddings" is missing
  5. Load the model
  6. Send a request to /v1/embeddings:
     curl http://localhost:13305/v1/embeddings -H "Content-Type: application/json" \
       -d '{"model":"user.zembed-1-Q4_K_M-GGUF","input":["test"]}'
  7. Response: 501 "This server does not support embeddings. Start it with --embeddings"
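Steps 3 and 4 can be automated with a short script. This is only a sketch: it assumes the file sits in the .cache\lemonade directory under the user profile, and `models_missing_embeddings` is a hypothetical name that applies the same "embed" heuristic described in the report.

```python
import json
from pathlib import Path

# Assumed location, based on step 3; adjust if your install differs.
USER_MODELS = Path.home() / ".cache" / "lemonade" / "user_models.json"


def models_missing_embeddings(models: dict) -> list:
    """Return entries whose checkpoint mentions "embed" but lack the label."""
    return [
        name
        for name, entry in models.items()
        if "embed" in entry.get("checkpoint", "").lower()
        and "embeddings" not in entry.get("labels", [])
    ]


if __name__ == "__main__":
    if USER_MODELS.exists():
        models = json.loads(USER_MODELS.read_text(encoding="utf-8"))
        print(models_missing_embeddings(models))
```

On an affected install, the printed list should include the pulled embedding model.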

Expected vs Actual Behavior

Expected: A model with "embed" in its name should automatically receive the "embeddings" label, matching the behavior of curated server models.

Actual: Only ["custom"] is assigned. The suggested_labels from the HF variants endpoint are lost before reaching register_user_model(). Users must manually edit user_models.json to add "embeddings".

Log Output

Additional Context

$ curl -s http://localhost:13305/v1/embeddings -H "Content-Type: application/json" \
  -d '{"model":"user.zembed-1-Q4_K_M-GGUF","input":["test"]}'

{"error":{"details":{"response":{"error":{"code":501,"message":"This server does not support embeddings. Start it with `--embeddings`","type":"not_supported_error"}},"status_code":501},"message":"llama-server request failed","type":"backend_error"}}
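The llama-server error is nested inside Lemonade's own error envelope, so a client checking only the outer "type" would see just "backend_error". A rough sketch of how a client might surface the inner code and message, assuming the response shape shown above (`backend_status` is a hypothetical helper):

```python
import json
from typing import Optional, Tuple

# Response body copied from the report above.
SAMPLE = (
    '{"error":{"details":{"response":{"error":{"code":501,'
    '"message":"This server does not support embeddings. '
    'Start it with `--embeddings`","type":"not_supported_error"}},'
    '"status_code":501},"message":"llama-server request failed",'
    '"type":"backend_error"}}'
)


def backend_status(body: str) -> Tuple[Optional[int], Optional[str]]:
    """Extract the nested llama-server error code and message, if present."""
    payload = json.loads(body)
    inner = (
        payload.get("error", {})
        .get("details", {})
        .get("response", {})
        .get("error", {})
    )
    return inner.get("code"), inner.get("message")


code, message = backend_status(SAMPLE)
print(code, message)
```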

user_models.json before manual fix:

{
  "zembed-1-Q4_K_M-GGUF": {
    "checkpoint": "Abiray/zembed-1-Q4_K_M-GGUF:Q4_K_M",
    "labels": ["custom"],
    "recipe": "llamacpp",
    "suggested": true
  }
}
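Until the label is propagated automatically, the manual edit can be scripted. The following is a sketch of that workaround, not an official tool: `add_label` and `patch_file` are hypothetical helpers, and the model may need to be reloaded before the change takes effect.

```python
import json
from pathlib import Path


def add_label(models: dict, name: str, label: str = "embeddings") -> bool:
    """Add `label` to models[name]["labels"]; return True if the entry changed."""
    labels = models[name].setdefault("labels", [])
    if label in labels:
        return False
    labels.append(label)
    return True


def patch_file(path: Path, name: str) -> None:
    """Rewrite user_models.json only when the label was actually missing."""
    models = json.loads(path.read_text(encoding="utf-8"))
    if add_label(models, name):
        path.write_text(json.dumps(models, indent=2), encoding="utf-8")
```

For the entry above, `add_label(models, "zembed-1-Q4_K_M-GGUF")` turns the labels into ["custom", "embeddings"]; calling it a second time leaves the entry unchanged.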

Labels

bug
