GeminiEmbedder.create_batch silently returns wrong number of vectors for gemini-embedding-2-preview (and -2) #1467

@elimydlarz

Summary

GeminiEmbedder.create_batch(input_data_list) silently returns fewer vectors than inputs when used with gemini-embedding-2-preview or gemini-embedding-2, causing the caller's zip(..., strict=True) to raise ValueError. The root cause is that gemini-embedding-2* does not batch via embed_content(contents=[...]): it returns a single embedding regardless of how many strings are in the list. gemini-embedding-001 returns the right count, because graphiti already pre-empts the problem for that model by hard-coding batch_size = 1. The same defence is missing for the -2* family.

Reproduction

Tested against graphiti-core 0.29.0, google-genai 1.74.0, Python 3.12.

import asyncio
from google import genai
from graphiti_core.embedder.gemini import GeminiEmbedder, GeminiEmbedderConfig

async def main():
    client = genai.Client()
    for model in ["gemini-embedding-2-preview", "gemini-embedding-001"]:
        cfg = GeminiEmbedderConfig(embedding_model=model)
        emb = GeminiEmbedder(cfg, client=client)
        print(f"{model}: batch_size={emb.batch_size}")
        vectors = await emb.create_batch(["Eli", "Anthropic", "Sydney"])
        print(f"  create_batch(3 inputs): got {len(vectors)} vectors")

asyncio.run(main())

Output:

gemini-embedding-2-preview: batch_size=100
  create_batch(3 inputs): got 1 vectors        ← BUG
gemini-embedding-001: batch_size=1
  create_batch(3 inputs): got 3 vectors        ← graphiti's special case saves it

gemini-embedding-2 (the production sibling, no -preview suffix) behaves the same as -2-preview.

Symptom in real use

When gemini-embedding-2-preview is configured and an add_episode call extracts ≥2 entities, graphiti's dedup pipeline trips on the partial result. The traceback (graphiti-core 0.29.0):

File ".../graphiti_core/utils/maintenance/node_operations.py", line 446, in _semantic_candidate_search
    for node, query_vector in zip(extracted_nodes, query_vectors, strict=True)
ValueError: zip() argument 2 is shorter than argument 1

In 0.28.2 there is no _semantic_candidate_search, but the same bug is latent in nodes.py:1079 (create_entity_node_embeddings) — zip(filtered_nodes, name_embeddings, strict=True) will fail the moment a single episode produces ≥2 entities.

Why create_batch returns fewer than expected

GeminiEmbedder.create_batch calls:

result = await self.client.aio.models.embed_content(
    model=...,
    contents=batch,                     # list of N strings
    config=types.EmbedContentConfig(output_dimensionality=...),
)
for embedding in result.embeddings:    # only iterates len(result.embeddings) times
    all_embeddings.append(embedding.values)

For gemini-embedding-2-preview/-2, result.embeddings has length 1 regardless of len(batch). The for-loop appends one item, no exception is raised, and the function returns a list shorter than input_data_list. Callers that use zip(..., strict=True) then fail.
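To make the silent shortfall concrete, here is a self-contained sketch that swaps the real client for a stub. FakeModels is hypothetical and only imitates the behaviour reported above (one embedding per call, whatever the input length); collect mirrors the loop quoted from create_batch and is not the library's actual code.

```python
import asyncio
from types import SimpleNamespace


class FakeModels:
    """Hypothetical stub: always returns exactly one embedding per call."""

    async def embed_content(self, *, model, contents, config=None):
        # Ignores len(contents), imitating the reported -2* behaviour.
        return SimpleNamespace(embeddings=[SimpleNamespace(values=[0.1, 0.2, 0.3])])


async def collect(models, batch):
    """Same shape as the loop quoted above: no check against len(batch)."""
    all_embeddings = []
    result = await models.embed_content(model="stub-model", contents=batch)
    for embedding in result.embeddings:
        all_embeddings.append(embedding.values)
    return all_embeddings


batch = ["Eli", "Anthropic", "Sydney"]
vectors = asyncio.run(collect(FakeModels(), batch))
print(f"{len(batch)} inputs, {len(vectors)} vectors, no exception raised")
```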

This isn't a transient API hiccup — it's reproducible across runs and across EmbedContentConfig variations (with/without task_type, output_dimensionality). Models matching embed* reported by models.list():

models/gemini-embedding-001         supports embedContent + asyncBatchEmbedContent
models/gemini-embedding-2-preview   supports embedContent + asyncBatchEmbedContent
models/gemini-embedding-2           supports embedContent + asyncBatchEmbedContent

asyncBatchEmbedContent is a separately supported action and appears to be the proper batched API for the -2* family. With -2*, embed_content(contents=[…]) treats the list as parts of a single document, not as a batch.
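A listing like the one above can be produced with a small filter. This is a sketch under assumptions: embedding_actions is a hypothetical helper, and it assumes each model object exposes name and supported_actions attributes, which is what google-genai's client.models.list() is expected to yield but is not confirmed in this report. The sample data is stubbed, and "models/some-text-model" is an invented name.

```python
from types import SimpleNamespace


def embedding_actions(models):
    """Map model name -> supported actions, for names containing 'embed'."""
    return {
        m.name: list(m.supported_actions or [])
        for m in models
        if "embed" in m.name
    }


# Stubbed stand-ins for model objects (assumed attribute names).
sample = [
    SimpleNamespace(
        name="models/gemini-embedding-2",
        supported_actions=["embedContent", "asyncBatchEmbedContent"],
    ),
    SimpleNamespace(name="models/some-text-model", supported_actions=["generateContent"]),
]

report = embedding_actions(sample)
for name, actions in report.items():
    print(name, "supports", " + ".join(actions))
```

Against a live client, the same helper would presumably be called as embedding_actions(client.models.list()).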

Suggested fix

Either (a) extend the existing special case to the -2* family:

# graphiti_core/embedder/gemini.py
if batch_size is None and self.config.embedding_model in (
    "gemini-embedding-001",
    "gemini-embedding-2-preview",
    "gemini-embedding-2",
):
    self.batch_size = 1

or (b) detect the partial-batch return and fall back to per-input calls:

result = await self.client.aio.models.embed_content(...)
if not result.embeddings:
    raise Exception("No embeddings returned")
if len(result.embeddings) != len(batch):
    # Count mismatch: fall back to one request per input
    for item in batch:
        single = await self.client.aio.models.embed_content(
            model=..., contents=[item], config=...
        )
        all_embeddings.append(single.embeddings[0].values)
    continue  # next batch in the enclosing loop; skip the append below
for embedding in result.embeddings:
    all_embeddings.append(embedding.values)

(a) is consistent with the existing pattern. (b) is more defensive against future model behavior changes.
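Option (b) can be exercised in isolation. The sketch below is not graphiti code: embed_batch_checked and collapsing_embed are hypothetical names, and the embed callable stands in for whatever wraps embed_content. It assumes only that the callable takes a list of strings and returns a list of vectors.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical signature: list of inputs in, list of vectors out.
EmbedFn = Callable[[list[str]], Awaitable[list[list[float]]]]


async def embed_batch_checked(embed: EmbedFn, batch: list[str]) -> list[list[float]]:
    """Try one batched call; fall back to per-input calls on a count mismatch."""
    vectors = await embed(batch)
    if not vectors:
        raise ValueError("No embeddings returned")
    if len(vectors) == len(batch):
        return vectors
    # Partial result: re-embed each input on its own.
    out: list[list[float]] = []
    for item in batch:
        single = await embed([item])
        if len(single) != 1:
            raise ValueError(f"Expected 1 embedding, got {len(single)}")
        out.append(single[0])
    return out


async def collapsing_embed(batch: list[str]) -> list[list[float]]:
    """Stub imitating the reported behaviour: one vector for the whole list."""
    return [[float(len(" ".join(batch)))]]


result = asyncio.run(
    embed_batch_checked(collapsing_embed, ["Eli", "Anthropic", "Sydney"])
)
print(len(result))
```

The trade-off is one wasted request per mismatched batch, so (a) remains cheaper for models already known to misbehave.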

Workaround for users today

Pass batch_size=1 explicitly:

GeminiEmbedder(
    GeminiEmbedderConfig(embedding_model="gemini-embedding-2-preview"),
    client=genai_client,
    batch_size=1,
)

Environment

  • graphiti-core: 0.29.0 (also reproduces in 0.28.2 via the nodes.py caller)
  • google-genai: 1.74.0
  • Python: 3.12.11
  • Platform: macOS arm64
