Nullable vector compact encoding is mishandled in sparse binlog reads and result reorder

## Summary

Nullable vector fields are encoded as compact vector data plus logical-row `ValidData`: `len(ValidData)` is the logical row count, while the vector payload contains only non-null rows. A few non-index paths still treat nullable vector data as full-size, row-aligned data. This can corrupt nullable sparse vector reads and can cause result reordering to fail when nullable vector fields are returned with `order_by`.
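To make the invariant concrete, here is a minimal standalone sketch of the logical-to-physical mapping implied by compact encoding. `logicalToPhysical` is a hypothetical helper written for this issue, not a Milvus function, and the `-1` sentinel for null rows is an assumption of the sketch.

```go
package main

import "fmt"

// logicalToPhysical maps each logical row to its offset in the compact
// payload, or -1 when the row is null.
// Hypothetical helper for illustration only; not part of Milvus.
func logicalToPhysical(validData []bool) []int {
	mapping := make([]int, len(validData))
	next := 0
	for i, valid := range validData {
		if valid {
			mapping[i] = next
			next++
		} else {
			mapping[i] = -1
		}
	}
	return mapping
}

func main() {
	// Four logical rows, two of them null.
	validData := []bool{false, true, false, true}
	// Compact payload: only two physical rows are stored.
	payload := [][]float32{{0.1, 0.2}, {0.3, 0.4}}

	for row, phys := range logicalToPhysical(validData) {
		if phys < 0 {
			fmt.Printf("row %d: null\n", row)
			continue
		}
		fmt.Printf("row %d: %v\n", row, payload[phys])
	}
}
```

Any path that indexes the payload by logical row, or that checks the payload length against `len(ValidData)`, breaks this invariant.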

This issue tracks two concrete risks found on master at `aedb81f825577a91fc78f6e8dfb104de2d839fc7`. `ArrayOfVector` is intentionally excluded here because it is not fully supported as nullable vector end-to-end.

## Problem 1: nullable SparseFloatVector binlog read expands null rows into Contents

`PayloadReader.GetSparseFloatVectorFromPayload()` appends `nil` to `SparseFloatVectorFieldData.Contents` for null rows:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/storage/payload_reader.go#L1070

That produces expanded, row-aligned `Contents`. However, the writer enforces compact encoding for nullable sparse vectors, requiring `len(Contents) == valid_count`:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/storage/payload_writer.go#L940

`AddInsertData()` then appends the read sparse data and builds `L2PMapping` from `ValidData`, assuming compact data:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/storage/data_codec.go#L846

After that, `GetRow()` maps logical row -> physical valid-row offset:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/storage/insert_data.go#L761

If any null row appears before a later valid row, the physical offset points to the wrong entry because `Contents` contains null placeholders. This can affect deserialize -> query/retrieve, compaction, and storage v2 record serialization:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/storage/serde.go#L1337

Current unit coverage also asserts the expanded shape, which appears to encode the bug as expected behavior:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/storage/payload_test.go#L2763

Expected behavior: the nullable sparse payload reader should return compact `Contents` containing only valid rows, together with `ValidData` whose length is the logical row count.
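The mismatch can be reproduced in isolation. The sketch below is not Milvus code: `physicalOffset` and `compactContents` are hypothetical helpers, and the one-byte row contents are placeholders rather than real sparse vector encodings. It only shows why a compact offset applied to an expanded slice returns the wrong row.

```go
package main

import "fmt"

// physicalOffset returns the number of valid rows before the given logical
// row, which is the row's offset in a compact payload.
// Hypothetical helper for illustration only.
func physicalOffset(validData []bool, row int) int {
	offset := 0
	for i := 0; i < row; i++ {
		if validData[i] {
			offset++
		}
	}
	return offset
}

// compactContents drops the entries that belong to null rows, turning an
// expanded row-aligned slice into the compact shape.
// Hypothetical helper for illustration only.
func compactContents(expanded [][]byte, validData []bool) ([][]byte, error) {
	if len(expanded) != len(validData) {
		return nil, fmt.Errorf("expanded length %d != logical row count %d",
			len(expanded), len(validData))
	}
	compact := make([][]byte, 0, len(expanded))
	for i, valid := range validData {
		if valid {
			compact = append(compact, expanded[i])
		}
	}
	return compact, nil
}

func main() {
	validData := []bool{false, true, true}
	// Expanded shape: a nil placeholder sits in the slot of the null row.
	expanded := [][]byte{nil, []byte("A"), []byte("B")}

	row := 2 // logical row 2 is valid and should yield "B"
	off := physicalOffset(validData, row)

	// Compact offset applied to expanded data picks the wrong entry.
	fmt.Printf("expanded[%d] = %q\n", off, expanded[off])

	compact, err := compactContents(expanded, validData)
	if err != nil {
		panic(err)
	}
	// The same offset applied to compact data picks the right entry.
	fmt.Printf("compact[%d]  = %q\n", off, compact[off])
}
```

With a null row first, logical row 2 maps to physical offset 1, which is row 1's data in the expanded slice and row 2's data in the compact slice.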

## Problem 2: proxy result reorder assumes nullable vector data is full-size

`pickFieldData()` already uses `FieldDataIdxComputer` to translate logical row offsets to compact vector data offsets:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/proxy/search_pipeline.go#L1231

But `orderByOperator.reorderFieldData()` still validates and moves vector data as if every logical result row has a physical vector row. Examples:

- FloatVector expects `len(data) == n * dim`: https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/proxy/search_pipeline.go#L2155
- SparseFloatVector expects `len(contents) == n`: https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/proxy/search_pipeline.go#L2243

For nullable vector output, data is compact, so an `order_by` search/query result that also outputs a nullable vector field with null rows may fail with an internal length mismatch or reorder the wrong data. Current `ValidData` reorder tests cover scalar nullable fields only, not compact nullable vectors:

- https://github.com/milvus-io/milvus/blob/aedb81f825577a91fc78f6e8dfb104de2d839fc7/internal/proxy/search_pipeline_test.go#L4146

Expected behavior: result reorder should preserve `ValidData` over logical result rows and move the vector payload using the logical-to-physical mapping, the same invariant as `AppendFieldData()`/`FieldDataIdxComputer`.
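A rough sketch of what a compact-aware reorder could look like for a FloatVector field follows. `reorderCompactFloatVector` and its signature are hypothetical and do not mirror `reorderFieldData()`; `order[i]` is assumed to be the source logical row for output position `i`. The length check is against `valid_count * dim`, not `n * dim`.

```go
package main

import "fmt"

// reorderCompactFloatVector reorders a compact nullable float vector column.
// order[dst] is the source logical row for output row dst.
// Hypothetical sketch for illustration only; not the Milvus implementation.
func reorderCompactFloatVector(data []float32, validData []bool, dim int, order []int) ([]float32, []bool, error) {
	// Build the logical-to-physical mapping and count valid rows.
	l2p := make([]int, len(validData))
	validCount := 0
	for i, valid := range validData {
		if valid {
			l2p[i] = validCount
			validCount++
		} else {
			l2p[i] = -1
		}
	}
	// Compact data holds one vector per valid row, not per logical row.
	if len(data) != validCount*dim {
		return nil, nil, fmt.Errorf("compact data length %d, want %d",
			len(data), validCount*dim)
	}

	outData := make([]float32, 0, len(data))
	outValid := make([]bool, len(order))
	for dst, src := range order {
		if src < 0 || src >= len(validData) {
			return nil, nil, fmt.Errorf("order index %d out of range", src)
		}
		outValid[dst] = validData[src]
		if validData[src] {
			p := l2p[src]
			outData = append(outData, data[p*dim:(p+1)*dim]...)
		}
	}
	return outData, outValid, nil
}

func main() {
	// Logical rows 1 and 3 are valid; dim is 2.
	data := []float32{1, 1, 2, 2}
	validData := []bool{false, true, false, true}
	order := []int{3, 0, 1, 2}

	outData, outValid, err := reorderCompactFloatVector(data, validData, 2, order)
	if err != nil {
		panic(err)
	}
	fmt.Println(outData, outValid)
}
```

The output stays compact: `ValidData` is permuted over logical rows, and only the vectors of valid rows are copied, in their new order.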

## Suggested validation

Add focused regression tests for:

1. Sparse nullable payload round trip with null-first and interleaved null patterns, asserting that `Contents` length equals the valid count and that `GetRow()` returns the correct rows.
2. `order_by`/`reorderFieldData` with nullable FloatVector and SparseFloatVector output where `ValidData` has null rows before valid rows.
3. End-to-end insert -> flush -> load -> query/search output for nullable sparse vector with interleaved null rows.

