Nullable vector compact encoding is mishandled in sparse binlog reads and result reorder #49881

@marcelo-cjl

Summary

Nullable vector fields are encoded as compact vector data plus logical-row ValidData: len(ValidData) is the logical row count, while the vector payload contains only non-null rows. A few non-index paths still treat nullable vector data as full-size row-aligned data. This can corrupt nullable sparse vector reads and can fail result reordering when nullable vector fields are returned with order_by.
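For readers less familiar with the layout, here is a minimal sketch of the compact encoding described above. The `nullableVectors` type and `encodeCompact` function are hypothetical names used only for illustration, and vectors are modeled as `[][]float32` rather than the real field-data types.

```go
package main

// nullableVectors is a hypothetical stand-in for a nullable vector field.
// validData has one entry per logical row; rows holds only non-null vectors.
type nullableVectors struct {
	validData []bool
	rows      [][]float32
}

// encodeCompact drops nil rows from the payload and records them in validData,
// so len(validData) is the logical row count and len(rows) is the valid count.
func encodeCompact(logical [][]float32) nullableVectors {
	out := nullableVectors{validData: make([]bool, 0, len(logical))}
	for _, v := range logical {
		out.validData = append(out.validData, v != nil)
		if v != nil {
			out.rows = append(out.rows, v)
		}
	}
	return out
}
```

With four logical rows of which two are null, `validData` has length 4 and `rows` has length 2.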

This issue tracks two concrete risks found on master at aedb81f825577a91fc78f6e8dfb104de2d839fc7. ArrayOfVector is intentionally excluded here because it is not fully supported as nullable vector end-to-end.

Problem 1: nullable SparseFloatVector binlog read expands null rows into Contents

PayloadReader.GetSparseFloatVectorFromPayload() appends nil to SparseFloatVectorFieldData.Contents for null rows:

That produces expanded, row-aligned Contents. However, the writer enforces compact encoding for nullable sparse vectors, requiring len(Contents) == valid_count:
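To make the invariant concrete (this is not the writer's actual code, only a sketch under assumed names), a check of this shape would accept compact data and reject the expanded form. `checkCompact` and its parameters are hypothetical.

```go
package main

import "fmt"

// checkCompact is a hypothetical helper: it verifies that the number of
// payload rows equals the number of true entries in validData.
func checkCompact(validData []bool, payloadRows int) error {
	valid := 0
	for _, ok := range validData {
		if ok {
			valid++
		}
	}
	if payloadRows != valid {
		return fmt.Errorf("payload has %d rows, want %d valid rows out of %d logical rows",
			payloadRows, valid, len(validData))
	}
	return nil
}
```

For ValidData `[false, true, true]`, a payload of 2 rows passes, while the expanded shape with 3 rows fails.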

AddInsertData() then appends the read sparse data and builds L2PMapping from ValidData, assuming compact data:

After that, GetRow() maps logical row -> physical valid-row offset:

If any null row appears before a later valid row, the physical offset points to the wrong entry because Contents contains null placeholders. This can affect deserialize -> query/retrieve, compaction, and storage v2 record serialization:
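The mismatch can be reproduced in isolation. The sketch below uses hypothetical `buildL2P` and `getRow` helpers that mirror the logical-to-physical idea described above; they are not the actual implementations, and sparse rows are modeled as plain `[]byte`.

```go
package main

// buildL2P maps each logical row to its offset among valid rows,
// or -1 when the logical row is null.
func buildL2P(validData []bool) []int {
	l2p := make([]int, len(validData))
	next := 0
	for i, ok := range validData {
		if ok {
			l2p[i] = next
			next++
		} else {
			l2p[i] = -1
		}
	}
	return l2p
}

// getRow resolves a logical row against contents, assuming contents is compact.
func getRow(contents [][]byte, l2p []int, logical int) []byte {
	phys := l2p[logical]
	if phys < 0 {
		return nil
	}
	return contents[phys]
}
```

Take ValidData `[false, true, true]` with valid rows A and B. Against compact contents `[A, B]`, logical row 2 resolves to B. Against expanded contents `[nil, A, B]`, the same lookup reads index 1 and returns A, and logical row 1 reads the nil placeholder at index 0.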

Current unit coverage also asserts the expanded shape, which appears to encode the bug as expected behavior:

Expected behavior: nullable sparse payload reader should return compact Contents containing only valid rows, while returning ValidData with logical row count.

Problem 2: proxy result reorder assumes nullable vector data is full-size

pickFieldData() already uses FieldDataIdxComputer to translate logical row offsets to compact vector data offsets:

But orderByOperator.reorderFieldData() still validates and moves vector data as if every logical result row has a physical vector row. Examples:

For nullable vector output, data is compact, so an order_by search/query result that also outputs a nullable vector field with null rows may fail with an internal length mismatch or reorder the wrong data. Current ValidData reorder tests cover scalar nullable fields only, not compact nullable vectors:

Expected behavior: result reorder should preserve ValidData over logical result rows and move the vector payload using the logical-to-physical mapping, the same invariant as AppendFieldData()/FieldDataIdxComputer.
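A rough sketch of the expected reorder semantics, assuming a hypothetical `reorderNullableVectors` helper where `order[i]` is the logical source row that should land at result position i. Vectors are modeled as `[][]float32` for brevity, which differs from the real flat FloatVector layout.

```go
package main

// reorderNullableVectors is a hypothetical sketch. It reorders validData by
// logical row and moves only the valid rows of the compact payload, so the
// output stays compact.
func reorderNullableVectors(validData []bool, compact [][]float32, order []int) ([]bool, [][]float32) {
	// Logical row -> offset into compact, -1 for null rows.
	l2p := make([]int, len(validData))
	next := 0
	for i, ok := range validData {
		l2p[i] = -1
		if ok {
			l2p[i] = next
			next++
		}
	}

	newValid := make([]bool, 0, len(order))
	newCompact := make([][]float32, 0, len(compact))
	for _, src := range order {
		newValid = append(newValid, validData[src])
		if phys := l2p[src]; phys >= 0 {
			newCompact = append(newCompact, compact[phys])
		}
	}
	return newValid, newCompact
}
```

The key point is that the length check belongs on ValidData (logical rows), not on the vector payload, whose length is the valid count.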

Suggested validation

Add focused regression tests for:

  1. Sparse nullable payload round trip with null-first and interleaved null patterns, asserting that the Contents length equals the valid count and that GetRow() returns the correct rows.
  2. order_by/reorderFieldData with nullable FloatVector and SparseFloatVector output where ValidData has null rows before valid rows.
  3. End-to-end insert -> flush -> load -> query/search output for nullable sparse vector with interleaved null rows.
