Summary
Nullable vector fields are encoded as compact vector data plus logical-row ValidData: len(ValidData) is the logical row count, while the vector payload contains only non-null rows. A few non-index paths still treat nullable vector data as full-size row-aligned data. This can corrupt nullable sparse vector reads and can fail result reordering when nullable vector fields are returned with order_by.
This issue tracks two concrete risks found on master at aedb81f825577a91fc78f6e8dfb104de2d839fc7. ArrayOfVector is intentionally excluded here because it is not fully supported as nullable vector end-to-end.
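The compact layout described above can be illustrated with a minimal sketch. The type `nullableVector` and the helper `physicalOffsets` are hypothetical names used only for illustration; they are not Milvus APIs, and the real code paths use `ValidData` and `L2PMapping` instead.

```go
package main

import "fmt"

// nullableVector is a hypothetical stand-in for a nullable vector column:
// valid has one entry per logical row, payload has one entry per non-null row.
type nullableVector struct {
	valid   []bool
	payload [][]float32
}

// physicalOffsets returns, for each logical row, its index into the compact
// payload, or -1 when the row is null.
func physicalOffsets(valid []bool) []int {
	offsets := make([]int, len(valid))
	next := 0
	for i, ok := range valid {
		if ok {
			offsets[i] = next
			next++
		} else {
			offsets[i] = -1
		}
	}
	return offsets
}

// row returns the vector at logical row i, or nil when that row is null.
func (v nullableVector) row(i int) []float32 {
	off := physicalOffsets(v.valid)[i]
	if off < 0 {
		return nil
	}
	return v.payload[off]
}

func main() {
	// Four logical rows, two of them null, so the payload holds two vectors.
	v := nullableVector{
		valid:   []bool{false, true, false, true},
		payload: [][]float32{{1, 2}, {3, 4}},
	}
	fmt.Println(physicalOffsets(v.valid)) // [-1 0 -1 1]
	fmt.Println(v.row(3))                 // [3 4]
}
```

The invariant both problems below depend on is that the payload length equals the number of true entries in the validity slice, never the logical row count.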
Problem 1: nullable SparseFloatVector binlog read expands null rows into Contents
PayloadReader.GetSparseFloatVectorFromPayload() appends nil to SparseFloatVectorFieldData.Contents for null rows:
milvus/internal/storage/payload_reader.go, line 1070 in aedb81f:
    for i := 0; i < binaryArray.Len(); i++ {
That produces expanded, row-aligned Contents. However, the writer enforces compact encoding for nullable sparse vectors, len(Contents) == valid_count:
milvus/internal/storage/payload_writer.go, line 940 in aedb81f:
    if len(data.Contents) != validCount {
AddInsertData() then appends the read sparse data and builds L2PMapping from ValidData, assuming compact data:
milvus/internal/storage/data_codec.go, line 846 in aedb81f:
    vec.AppendAllRows(singleData)
After that, GetRow() maps logical row -> physical valid-row offset:
milvus/internal/storage/insert_data.go, line 761 in aedb81f:
    func (data *SparseFloatVectorFieldData) GetRow(i int) interface{} {
If any null row appears before a later valid row, the physical offset points to the wrong entry because Contents contains null placeholders. This can affect deserialize -> query/retrieve, compaction, and storage v2 record serialization:
milvus/internal/storage/serde.go, line 1337 in aedb81f:
    // Use len(validData) as logical row count, GetRow takes logical index
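The misread described above can be reproduced in isolation. This is a minimal sketch, not the Milvus implementation: `physicalOffset` and `lookup` are hypothetical helpers that mimic a GetRow-style accessor which assumes compact data, and the byte strings stand in for serialized sparse rows.

```go
package main

import "fmt"

// physicalOffset counts the valid rows before logical row i, which is the
// index a compact payload would use for that row.
func physicalOffset(valid []bool, i int) int {
	n := 0
	for _, ok := range valid[:i] {
		if ok {
			n++
		}
	}
	return n
}

// lookup mimics an accessor that assumes compact contents: nil for null rows,
// otherwise the entry at the physical offset.
func lookup(contents [][]byte, valid []bool, i int) []byte {
	if !valid[i] {
		return nil
	}
	return contents[physicalOffset(valid, i)]
}

func main() {
	valid := []bool{false, true, false, true}
	compact := [][]byte{[]byte("A"), []byte("B")}
	// Expanded shape: null placeholders occupy slots in the slice.
	expanded := [][]byte{nil, []byte("A"), nil, []byte("B")}

	fmt.Printf("compact:  row1=%q row3=%q\n",
		lookup(compact, valid, 1), lookup(compact, valid, 3))
	fmt.Printf("expanded: row1=%q row3=%q\n",
		lookup(expanded, valid, 1), lookup(expanded, valid, 3))
}
```

With the expanded slice, logical row 1 reads the null placeholder at index 0 and logical row 3 reads the entry that belongs to row 1, so every valid row after the first null is shifted.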
Current unit coverage also asserts the expanded shape, which appears to encode the bug as expected behavior:
milvus/internal/storage/payload_test.go, line 2763 in aedb81f:
    require.Equal(t, numRows, len(readValid))
Expected behavior: the nullable sparse payload reader should return compact Contents containing only valid rows, while returning ValidData with the logical row count.
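A reader loop with the expected shape might look like the sketch below. It assumes a simplified column model: `nullableBinaryColumn` is a hypothetical stand-in for the Arrow binary array the payload reader iterates over, with a nil entry representing a null row. The actual fix would need to use the array's own null check rather than a nil comparison.

```go
package main

import "fmt"

// nullableBinaryColumn is a hypothetical stand-in for the binary array read
// from the payload; a nil entry represents a null row.
type nullableBinaryColumn [][]byte

// readCompact returns contents holding only the non-null rows and validData
// holding one flag per logical row.
func readCompact(col nullableBinaryColumn) (contents [][]byte, validData []bool) {
	validData = make([]bool, len(col))
	contents = make([][]byte, 0, len(col))
	for i, row := range col {
		if row == nil {
			continue // null row: validData[i] stays false, nothing is appended
		}
		validData[i] = true
		contents = append(contents, row)
	}
	return contents, validData
}

func main() {
	col := nullableBinaryColumn{nil, []byte("A"), nil, []byte("B")}
	contents, validData := readCompact(col)
	fmt.Println(len(contents), validData) // 2 [false true false true]
}
```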
Problem 2: proxy result reorder assumes nullable vector data is full-size
pickFieldData() already uses FieldDataIdxComputer to translate logical row offsets to compact vector data offsets:
milvus/internal/proxy/search_pipeline.go, line 1231 in aedb81f:
    fieldsData := make([]*schemapb.FieldData, len(fields))
But orderByOperator.reorderFieldData() still validates and moves vector data as if every logical result row has a physical vector row. Examples:
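- FloatVector expects len(data) == n * dim:
  milvus/internal/proxy/search_pipeline.go, line 2155 in aedb81f:
      data := vectors.GetFloatVector().GetData()
- SparseFloatVector expects len(contents) == n:
  milvus/internal/proxy/search_pipeline.go, line 2243 in aedb81f:
      contents := sparseData.GetContents()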
For nullable vector output, data is compact, so an order_by search/query result that also outputs a nullable vector field with null rows may fail with an internal length mismatch or reorder the wrong data. Current ValidData reorder tests cover scalar nullable fields only, not compact nullable vectors:
milvus/internal/proxy/search_pipeline_test.go, line 4146 in aedb81f:
    // Test reorderFieldData with ValidData (nullable fields)
Expected behavior: result reorder should preserve ValidData over logical result rows and move the vector payload using logical-to-physical mapping, the same invariant as AppendFieldData()/FieldDataIdxComputer.
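One way to satisfy this invariant for a dense float vector is sketched below. The function `reorderNullableFloatVector` and its parameters are hypothetical and operate on plain slices rather than schemapb.FieldData; the point is only that validation uses the valid-row count and that payload moves go through the logical-to-physical mapping.

```go
package main

import (
	"errors"
	"fmt"
)

// reorderNullableFloatVector reorders a compact float vector payload.
// data holds dim floats per non-null row, valid has one flag per logical row,
// and order[k] is the logical source row for output position k.
func reorderNullableFloatVector(data []float32, valid []bool, dim int, order []int) ([]float32, []bool, error) {
	// Build logical -> physical offsets and count the valid rows.
	phys := make([]int, len(valid))
	validCount := 0
	for i, ok := range valid {
		phys[i] = -1
		if ok {
			phys[i] = validCount
			validCount++
		}
	}
	// Validate against the valid-row count, not the logical row count.
	if len(data) != validCount*dim {
		return nil, nil, errors.New("payload length does not match valid row count")
	}
	outData := make([]float32, 0, len(data))
	outValid := make([]bool, len(order))
	for k, src := range order {
		if src < 0 || src >= len(valid) {
			return nil, nil, errors.New("order index out of range")
		}
		if !valid[src] {
			continue // null row: keep outValid[k] false, move no payload
		}
		outValid[k] = true
		start := phys[src] * dim
		outData = append(outData, data[start:start+dim]...)
	}
	return outData, outValid, nil
}

func main() {
	valid := []bool{false, true, true}
	data := []float32{1, 2, 3, 4} // logical rows 1 and 2, dim 2
	outData, outValid, err := reorderNullableFloatVector(data, valid, 2, []int{2, 0, 1})
	fmt.Println(outData, outValid, err) // [3 4 1 2] [true false true] <nil>
}
```

The same approach would apply to SparseFloatVector with one Contents entry per valid row instead of dim floats.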
Suggested validation
Add focused regression tests for:
- Sparse nullable payload round trip with null-first and interleaved null patterns, asserting that Contents length equals valid count and GetRow() returns the correct rows.
- order_by/reorderFieldData with nullable FloatVector and SparseFloatVector output where ValidData has null rows before valid rows.
- End-to-end insert -> flush -> load -> query/search output for nullable sparse vector with interleaved null rows.