Skip to content

Add msgpack-jackson3 module#983

Draft
komamitsu wants to merge 17 commits into
msgpack:mainfrom
komamitsu:support-jackson3
Draft

Add msgpack-jackson3 module#983
komamitsu wants to merge 17 commits into
msgpack:mainfrom
komamitsu:support-jackson3

Conversation

@komamitsu
Copy link
Copy Markdown
Member

Summary

  • Adds a new msgpack-jackson3 submodule (jackson-dataformat-msgpack3) that ports the existing Jackson 2.x integration to Jackson 3.x (tools.jackson namespace)
  • Requires Java 17+ and Jackson 3.1.2; built separately so it does not affect users on Java 8
  • Uses JUnit 4 via com.github.sbt:junit-interface — no extra sbt plugins needed

Key differences from msgpack-jackson

  • Package namespace: com.fasterxml.jacksontools.jackson
  • TokenStreamLocation replaces JsonLocation; byte offset passed as columnNr (truncates for inputs > 2 GB, documented with comment)
  • JavaInfo / char-array optimization removed — always false on Java 17+ since String.value is byte[]
  • OSGi importPackage excludes android.os and sun.* matching msgpack-core
  • javacOptions set to --source 17 --target 17

Test plan

  • ./sbt "msgpack-jackson3/test" — 77 tests pass
  • ./sbt "msgpack-jackson3/test:compile" — no checkstyle errors
  • Existing msgpack-core and msgpack-jackson tests unaffected

komamitsu added 3 commits May 16, 2026 23:59
Introduces a new msgpack-jackson3 subproject that provides MessagePack
serialization for Jackson 3.x (tools.jackson), which requires Java 17+.
Tests use JUnit 4 via junit-interface to keep the build simple.
Replace the fragile JAVA17_HOME/JAVA_HOME path resolution with a simple
requirement to run sbt under Java 17+ (managed via mise). Drop redundant
-encoding/-Xlint flags already provided by buildSettings.
- Remove JavaInfo.java: STRING_VALUE_FIELD_IS_CHARS is always false on Java 17+
  since String.value is byte[], not char[]; simplify parser and generator accordingly
- Remove dead writeCharArrayTextKey method and tempChars field
- Add OsgiKeys.importPackage exclusions for android.os and sun.* packages
- Add comments on (int) cast overflow risk in currentTokenLocation/currentLocation
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new msgpack-jackson3 submodule intended to provide MessagePack <-> Jackson integration for Jackson 3.x (tools.jackson.*) while keeping the existing Jackson 2.x module intact.

Changes:

  • Adds a new msgpack-jackson3 sbt subproject targeting Java 17 and Jackson tools.jackson.core:jackson-databind:3.1.2.
  • Ports the MessagePack TokenStreamFactory, JsonParser, JsonGenerator, and mapper/key-serialization glue to the Jackson 3 API surface.
  • Adds a full test suite (plus benchmark tests) for parser/generator/factory/mapper behavior under the new module.

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
build.sbt Adds msgpack-jackson3 subproject definition and aggregates it under root.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/ExtensionTypeCustomDeserializers.java Extension-type custom deserializer registry for parser embedded objects.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/JsonArrayFormat.java Jackson annotation introspector to serialize POJOs as arrays (schema-less).
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackExtensionType.java POJO + serializer for MessagePack extension types.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackFactory.java Jackson TokenStreamFactory implementation for MessagePack.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackGenerator.java Jackson JsonGenerator implementation that writes MessagePack.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackKeySerializer.java Key serializer that routes non-String map keys through MessagePack key handling.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackMapper.java ObjectMapper wrapper + builder helpers for BigInteger/BigDecimal formatting.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackParser.java Jackson JsonParser implementation that reads MessagePack.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackReadContext.java Read context implementation for tracking nesting/name/dup detection.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackSerializedString.java SerializableString wrapper used for MessagePack key serialization.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackSerializerFactory.java Serializer factory override to enforce MessagePack key serialization.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/TimestampExtensionModule.java Jackson module for MessagePack timestamp extension <-> Instant.
msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/Tuple.java Simple tuple used for cached unpacker bookkeeping.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/ExampleOfTypeInformationSerDe.java Example test for embedding type metadata for polymorphic objects.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/MessagePackDataformatForFieldIdTest.java Tests for integer-key support and mixed key handling.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/MessagePackDataformatForPojoTest.java POJO round-trip tests (normal, nested, schema-less array format, etc.).
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/MessagePackDataformatTestBase.java Shared test fixtures and helper POJOs.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/MessagePackFactoryTest.java Factory copy/creation tests.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/MessagePackGeneratorTest.java Generator behavioral tests (objects/arrays/primitives/keys/big numbers/etc.).
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/MessagePackMapperTest.java Mapper builder behavior tests for BigInteger/BigDecimal-as-string options.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/MessagePackParserTest.java Parser behavioral tests for objects/arrays/primitives/extensions/errors.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/TimestampExtensionModuleTest.java Tests for MessagePack timestamp extension serialization/deserialization.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/benchmark/Benchmarker.java Benchmark harness used by benchmark tests.
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/benchmark/MessagePackDataformatHugeDataBenchmarkTest.java Benchmark test for huge data (JSON vs MessagePack).
msgpack-jackson3/src/test/java/org/msgpack/jackson/dataformat/benchmark/MessagePackDataformatPojoBenchmarkTest.java Benchmark test for POJO ser/de (JSON vs MessagePack).
Comments suppressed due to low confidence (2)

msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackGenerator.java:501

  • getBytesIfAscii writes into bytes[i] while iterating i from offset to offset+len. When offset != 0 this will write past the end of the newly allocated bytes array (length len) and can throw ArrayIndexOutOfBoundsException. Index into the destination array relative to offset (or use a separate destination index).
    private byte[] getBytesIfAscii(char[] chars, int offset, int len)
    {
        byte[] bytes = new byte[len];
        for (int i = offset; i < offset + len; i++) {
            char c = chars[i];
            if (c >= 0x80) {
                return null;
            }
            bytes[i] = (byte) c;
        }
        return bytes;

msgpack-jackson3/src/main/java/org/msgpack/jackson/dataformat/MessagePackGenerator.java:531

  • writeByteArrayTextValue ignores the provided offset/len when the bytes are ASCII: it wraps the entire text array in AsciiCharString. This can serialize extra bytes outside the intended slice. Consider copying the [offset, offset+len) range (or store the slice metadata) when creating AsciiCharString.
    private void writeByteArrayTextValue(byte[] text, int offset, int len) throws IOException
    {
        if (areAllAsciiBytes(text, offset, len)) {
            addValueNode(new AsciiCharString(text));
            return;
        }
        addValueNode(new String(text, offset, len, DEFAULT_CHARSET));
    }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +388 to +400
else if (v instanceof ByteBuffer) {
ByteBuffer bb = (ByteBuffer) v;
int len = bb.remaining();
if (bb.hasArray()) {
messagePacker.packBinaryHeader(len);
messagePacker.writePayload(bb.array(), bb.arrayOffset(), len);
}
else {
byte[] data = new byte[len];
bb.get(data);
messagePacker.packBinaryHeader(len);
messagePacker.addPayload(data);
}
Comment on lines +141 to +151
private String unpackString(MessageUnpacker messageUnpacker) throws IOException
{
int strLen = messageUnpacker.unpackRawStringHeader();
if (strLen <= tempBytes.length) {
messageUnpacker.readPayload(tempBytes, 0, strLen);
return new String(tempBytes, 0, strLen);
}
else {
byte[] bytes = messageUnpacker.readPayload(strLen);
return new String(bytes, 0, strLen, StandardCharsets.UTF_8);
}
Comment on lines +18 to +24
import com.fasterxml.jackson.annotation.JsonFormat;

import tools.jackson.core.Version;
import tools.jackson.databind.ObjectMapper;
import tools.jackson.databind.cfg.MapperBuilder;
import tools.jackson.databind.cfg.MapperBuilderState;

Comment on lines +6 to +10

import com.fasterxml.jackson.annotation.JsonFormat;

import static com.fasterxml.jackson.annotation.JsonFormat.Shape.ARRAY;

Comment on lines +69 to +111
return 0;
}

@Override
public int appendQuoted(char[] chars, int i)
{
return 0;
}

@Override
public int appendUnquotedUTF8(byte[] bytes, int i)
{
return 0;
}

@Override
public int appendUnquoted(char[] chars, int i)
{
return 0;
}

@Override
public int writeQuotedUTF8(OutputStream outputStream)
{
return 0;
}

@Override
public int writeUnquotedUTF8(OutputStream outputStream)
{
return 0;
}

@Override
public int putQuotedUTF8(ByteBuffer byteBuffer)
{
return 0;
}

@Override
public int putUnquotedUTF8(ByteBuffer byteBuffer)
{
return 0;
MessagePackExtensionType ext = p.readValueAs(MessagePackExtensionType.class);
if (ext.getType() != EXT_TYPE) {
throw new RuntimeException(
String.format("Unexpected extension type (0x%X) for Instant object", ext.getType()));
Comment on lines +18 to +20
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
komamitsu added 5 commits May 17, 2026 12:19
- MessagePackParser.unpackString: use UTF-8 for short strings (<=64 bytes)
  to match the long-string path; platform default charset was used before
- MessagePackGenerator.writeNumber(String): try BigInteger before Double
  to avoid precision loss for large integer strings
- MessagePackGenerator.writeString(Reader, int): handle len=-1 (unknown
  length) by reading until EOF instead of throwing NegativeArraySizeException
Implement MessagePackFactoryBuilder extending DecorableTSFBuilder, wire
it into MessagePackFactory via a builder constructor and rebuild(), and
fix snapshot() to return a copy instead of this. Add tests covering
rebuild() and snapshot() behavior.
Three bugs where non-zero offset/position was ignored:
- getBytesIfAscii: bytes[i] -> bytes[i - offset]
- writeByteArrayTextValue: store slice instead of whole array
- ByteBuffer: use arrayOffset() + position() as payload start

All three bugs exist identically in msgpack-jackson; documented in
plans/preexisting-issues.md with practical impact notes.
@github-actions github-actions Bot added the doc label May 17, 2026
komamitsu added 8 commits May 17, 2026 18:41
- close(): set _closed=true directly instead of calling super.close(),
  which would unconditionally close the underlying stream regardless of
  AUTO_CLOSE_TARGET
- endCurrentContainer(): reset currentState to IN_ROOT when closing the
  root container, preventing stale state if the generator is reused

Both bugs exist in msgpack-jackson; documented in plans/preexisting-issues.md
Use getChars(offset, offset+len) instead of toCharArray() which copies
the entire string unnecessarily.
Replace the string-based lossy check with `decimal.compareTo(BigDecimal.valueOf(doubleValue)) != 0`,
avoiding two stripTrailingZeros + toEngineeringString allocations per BigDecimal write.
…dString stubs, and Version

- MessagePackGenerator: implement streamWriteContext() using SimpleStreamWriteContext;
  wire writeStartArray/Object, endCurrentContainer, writeName, _verifyValueWrite,
  currentValue, and assignCurrentValue to track write context correctly
- MessagePackParser: on close, null out the byte-array source in the ThreadLocal to
  release large payload references; InputStream sources are preserved to keep the
  sequential-read-from-same-stream behavior working
- MessagePackSerializedString: implement all stub append/write/put methods
- Add PackageVersion (0.9.12) and use it in generator, parser, and factory
- Note: messageBufferOutputHolder ThreadLocal not fixed — clearing it causes 23%
  serialization regression by defeating the OutputStreamBufferOutput reuse cache
- MessagePackMapper.version() now returns PackageVersion.VERSION instead
  of Version.unknownVersion(), consistent with factory and parser.
- writeName(SerializableString): remove IllegalArgumentException for
  unknown implementations (fall back to writeName(getValue())); also
  fix missing writeContext.writeName() call for MessagePackSerializedString.
- unpackString(): delegate to messageUnpacker.unpackString() instead of
  manual header-read + payload copy; removes tempBytes allocation.
  ~25% deserialization speedup on pojo benchmark (fewer copies).
- Rename local variable 'scheme' -> 'schema' in test.
msgpack-jackson3 requires Java 17+, so char[]-backed strings (Java 8)
are not a concern. Java 9+ compact strings make the old AsciiCharString
optimization redundant for the char[] path.

- Rename AsciiCharString -> RawUtf8String; extend its use to all raw
  UTF-8 byte[] inputs (not just ASCII), eliminating the areAllAsciiBytes
  scan and the wasteful decode+re-encode for non-ASCII content.
- Simplify writeCharArrayTextValue: drop getBytesIfAscii scan and just
  use new String(char[], offset, len); packString handles encoding
  efficiently via compact strings.
- Simplify writeRaw(String, int, int): drop getChars -> char[] detour,
  use substring directly.
- Simplify writeString(Reader, int) len<0 path: drop StringBuilder ->
  char[] copy, use sb.toString() directly.
- Remove unused imports and fields (Nullable, Charset, StandardCharsets,
  DEFAULT_CHARSET).
@komamitsu komamitsu marked this pull request as draft May 17, 2026 15:20
The previous Benchmarker/System.nanoTime() approach had no forking, no
dead-code elimination (Blackhole), and no JIT-aware warmup, producing
~25ms stdev results. Replace with proper JMH benchmarks modelled on
https://github.com/FasterXML/jackson-benchmarks.

Changes:
- Add sbt-jmh 0.4.7 plugin; enable JmhPlugin on msgpack-jackson3
- Add --add-opens JVM args for forked JMH processes (required for
  msgpack-core's Unsafe usage)
- Remove commons-math3 test dependency (no longer needed)
- Delete Benchmarker, MessagePackDataformatPojoBenchmarkTest,
  MessagePackDataformatHugeDataBenchmarkTest
- Add src/jmh/java benchmark sources:
  * model/: MediaItem, MediaContent, Image, Size, Player, MediaItems
    (canonical jvm-serializers dataset, same as jackson-benchmarks)
  * BenchmarkState: pre-serialized bytes + ObjectMapper instances
  * MsgpackReadBenchmark: readPojoMsgpack / readPojoJson
  * MsgpackWriteBenchmark: writePojoMsgpack / writePojoJson
  * NopOutputStream: /dev/null sink for write benchmarks

Run: ./sbt "msgpack-jackson3/jmh:run"
Smoke-test: ./sbt 'msgpack-jackson3/jmh:run -f 1 -wi 1 -i 1'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants