ClickHouse: add PQS/CERT/CODDTest oracles and lift type system to ADT #4
Open
fm4v wants to merge 39 commits into
The ClickHouse provider previously supported only the five TLP variants (Where / Distinct / GroupBy / Aggregate / Having) and NoREC. This change adds three more general-purpose oracles to ClickHouseOracleFactory.

PQS (Pivoted Query Synthesis, Rigger & Su, OSDI 2020). The classical SQLancer PQS implementation requires every AST node to expose a Java-side getExpectedValue() that mirrors the DBMS' evaluation semantics. ClickHouse's expression AST does not provide this for most of the generated tree, and reproducing ClickHouse's coercion / NULL / arithmetic rules in Java would be an open-ended effort. This implementation delegates rectification to the server: for each randomly generated predicate, the pivot row's values are embedded as literals in a one-row subquery and ClickHouse itself evaluates the predicate. Based on TRUE / FALSE / NULL the predicate is kept, negated, or wrapped in IS NULL so the conjunction is guaranteed to hold for the pivot row. Containment is checked with INTERSECT, which handles NULL semantics correctly in ClickHouse.

CERT (Cardinality Estimation Restriction Testing). Builds a random SELECT, mutates it through a WHERE / AND / OR / DISTINCT toggle with a known monotonicity direction, then asserts the actual row count moves the way the mutation predicts. A plan-similarity gate on EXPLAIN PLAN output skips cases where the plan diverges enough that the comparison stops being meaningful. ClickHouse doesn't surface single-number cardinality estimates that the JDBC client can read, so this uses actual row counts -- still catches optimizer-driven row loss such as predicate-pushdown bugs and faulty DISTINCT dedup. JOIN / GROUPBY / HAVING / LIMIT mutators are skipped: LIMIT isn't serialized by the visitor, and the others need richer query shapes than the existing generator produces.

CODDTest (Cross-Optimization Decision Differential Testing). Runs the same query twice with a random subset of optimizer flags toggled on vs off (injected as a per-query SETTINGS clause to avoid leaking state into neighbouring oracle runs sharing the same connection) and asserts the two result sets are identical. The flag list is deliberately conservative -- rewrites with high blast radius (analyzer enable/disable, JOIN algorithm) are excluded because they tend to surface stylistic differences rather than correctness bugs.

ClickHouseSchema.ClickHouseRowValue's constructor is promoted from package-private to public so the PQS oracle in oracle/pqs/ can construct it for the base class diagnostic logging.

Smoke-tested against a release ClickHouse 26.5 server, single thread, 30-second budget per oracle: PQS ~10 q/s with 94% successful statements; CERT ~12 q/s with 98%; CODDTest ~16 q/s with 97%. No false positives in any run.

Checkstyle clean (`mvn checkstyle:check`), naming convention check passes (`python src/check_names.py`).
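The server-delegated PQS rectification described above (keep on TRUE, negate on FALSE, wrap in IS NULL on NULL) can be sketched roughly as follows. This is a minimal illustration, not the oracle's code: the class and method names are hypothetical, and the probe result is assumed to arrive as a `Boolean` where `null` stands for SQL NULL.

```java
import java.util.List;

/** Hypothetical sketch of PQS rectification; names are not from the PR. */
final class PqsRectifySketch {

    /**
     * Rewrites a predicate so that it evaluates to TRUE for the pivot row,
     * given the value the server returned when probing it against that row.
     * A null probeResult models SQL NULL (assumption of this sketch).
     */
    static String rectify(String predicate, Boolean probeResult) {
        if (probeResult == null) {
            return "((" + predicate + ") IS NULL)";
        }
        return probeResult ? "(" + predicate + ")" : "(NOT (" + predicate + "))";
    }

    /** Joins already-rectified predicates; every conjunct holds for the pivot row. */
    static String conjunction(List<String> rectified) {
        return rectified.isEmpty() ? "1" : String.join(" AND ", rectified);
    }

    public static void main(String[] args) {
        String where = conjunction(List.of(
                rectify("c0 > 1", Boolean.TRUE),
                rectify("c1 = 'a'", Boolean.FALSE),
                rectify("c0 + c2", null)));
        System.out.println("SELECT * FROM t WHERE " + where);
    }
}
```

Because each conjunct is individually forced to TRUE for the pivot row, the conjunction holds for it as well, which is the property the containment check relies on.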
fm4v added a commit to ClickHouse/ClickHouse that referenced this pull request May 15, 2026
Removes the ci/docker/sqlancer-test/overlay/ Java sources, the Dockerfile COPY step that overlays them onto the cloned fork, and the PQS/CERT/CODDTest entries from the TESTS array. The three new oracles are being added to the ClickHouse fork of SQLancer directly (ClickHouse/sqlancer#4). Once that lands, this PR will bump the pinned fork commit and re-add the names to TESTS.
The initial CERT and CODDTest implementations diverged from their papers in ways that defeated the test signal.

CERT was using actual row counts from running the queries and a bidirectional mutator framework. Per Ba and Rigger, ICSE 2024, the property under test is `EstCard(Q', D) <= EstCard(Q, D)` -- the *estimator's* projection, with Q' strictly more restrictive than Q, and "CERT eschews executing queries". This rewrite:

* Reads cardinality from `EXPLAIN ESTIMATE`, summing `rows` across the per-table tuples it returns. The query is never executed.
* Restricts mutations to one direction. `mutateWhere`/`mutateAnd` always AND-tighten or introduce a WHERE; `mutateOr` drops an OR operand (per the paper's restrictive-OR rule) or falls back to AND; `mutateDistinct` only promotes ALL -> DISTINCT. All return `increase=false`.
* Skips runs where the estimator returns nothing (non-MergeTree engines, `ORDER BY tuple()`, unsupported expressions), and skips runs where the structural-similarity gate on `EXPLAIN PLAN` shows too much drift.

CODDTest was toggling random optimizer flags via per-query `SETTINGS` clauses and comparing results. Per Zhang and Rigger, SIGMOD 2025, the oracle is constant-folding-driven: take a subexpression E in Q, evaluate E to a value V via an auxiliary query A, build a folded query F by substituting V for E, then assert results of Q and F are identical. This rewrite implements the scalar-subquery variant (same as DuckDBCODDTestOracle in the upstream PR sqlancer#1054):

    aux: SELECT min(c)/max(c) FROM t -> V
    Q:   SELECT * FROM t WHERE col op (SELECT min/max(c) FROM t)
    F:   SELECT * FROM t WHERE col op V

Only `Int32`/`String` columns are folded since they are the only types the existing schema generator and `ClickHouseSchema.getConstant` support; NULL auxiliary results are skipped (NULL-propagation would make the predicate UNKNOWN for every row and the equivalence does not hold).

Verified locally against a release ClickHouse 26.5 server:

* CERT: ~6 q/s effective (most attempts skip because no estimate responds to the random mutation), 0 false positives in a 30s window.
* CODDTest: ~22 q/s, 96-97% successful statements, 0 false positives.

`mvn checkstyle:check` clean, `mvn package -Dmaven.test.skip=true` succeeds.

Papers:

* CERT https://doi.org/10.1145/3597503.3639076
* CODDTest https://doi.org/10.1145/3709674
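The CERT property above, `EstCard(Q', D) <= EstCard(Q, D)` with per-table `rows` summed and empty estimates skipped, can be sketched as below. This is an illustration under stated assumptions: the class, enum, and method names are hypothetical, and the per-table row estimates are assumed to have already been read from `EXPLAIN ESTIMATE` into a list.

```java
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical sketch of the CERT comparison; names are not from the PR. */
final class CertCheckSketch {

    enum Verdict { SKIP, PASS, FAIL }

    /** Sums the per-table row estimates; empty input means the estimator returned nothing. */
    static OptionalLong totalEstimate(List<Long> perTableRows) {
        if (perTableRows.isEmpty()) {
            return OptionalLong.empty();
        }
        long sum = 0;
        for (long rows : perTableRows) {
            sum += rows;
        }
        return OptionalLong.of(sum);
    }

    /** Checks that the restricted query's estimate does not exceed the original's. */
    static Verdict check(List<Long> original, List<Long> restricted) {
        OptionalLong q = totalEstimate(original);
        OptionalLong qPrime = totalEstimate(restricted);
        if (q.isEmpty() || qPrime.isEmpty()) {
            return Verdict.SKIP; // no estimate available, nothing to compare
        }
        return qPrime.getAsLong() <= q.getAsLong() ? Verdict.PASS : Verdict.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(check(List.of(8192L, 100L), List.of(100L)));
    }
}
```

Neither query is executed in this scheme; only the two estimates are compared.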
…pqs-cert-coddtest
Extends the PQS oracle to cover three paper elements that the initial landing skipped, with no change to the rectification contract or the existing single-table behavior.

* Multi-table pivot rows (Section 3.1): 1-3 random non-empty tables. Predicates are generated over the union of their table-qualified columns, the rectification probe stitches one literal-alias subquery per pivot table, and the rectified query uses an implicit cross-join with all rectified predicates in WHERE.
* Optional elaborations (Section 3.2) attached probabilistically: DISTINCT, GROUP BY of all pivot columns, ORDER BY. Each preserves the pivot row's presence in the result set by construction.
* IS NULL rectification path is now reachable: ClickHouseExpressionGenerator gains an opt-in allowNullLiterals flag (default false to keep existing oracles unchanged) which, when enabled by PQS, occasionally injects a NULL leaf so the probe can legitimately return SQL NULL.

Validated against ClickHouse 26.5.1.111: 4 threads x 5 minutes = 21,211 oracle queries; multi-table FROM hits 11,518 (3-table: 2,762), DISTINCT 510, GROUP BY 474, ORDER BY 511, IS NULL rectifications 154, zero AssertionError / reportMissingPivotRow / Exception.
Empirical probe of `EXPLAIN ESTIMATE` shows it only responds to predicates that prune MergeTree primary-key granules; LIMIT, DISTINCT, JOIN-type, and bare GROUP BY are invariant. With the default `index_granularity=8192` and the ~10-30 row inserts the schema generator emits, every table fits in a single granule and the estimate cannot move regardless of restriction, so the oracle was trivially passing every run. This commit fixes that and also covers the only paper rule (HAVING) that moves the estimate.

* Bulk-load 50,000 rows from `numbers()` into the chosen table on each check if it sits below that threshold. Idempotent and bounded; tables already amplified are left alone. Verified by inspection: 8 of 9 test tables hit 50k rows on a 60s smoke run; the 9th had MATERIALIZED columns that legitimately rejected `INSERT SELECT`.
* Query `system.columns.is_in_primary_key` to discover the table's PK columns and duplicate them 4x in the column list passed to the expression generator. A random leaf is now ~67% likely to be a PK column (1 PK out of 3 cols, otherwise 33%), so a generated predicate much more often hits the granule pruner.
* Sometimes (25%) build Q with `GROUP BY <pk_col>` so the new HAVING mutator can fire. The HAVING mutator AND-tightens HAVING with a fresh PK-biased predicate; ClickHouse pushes PK predicates in HAVING down through the optimizer to the scan, making them granule-prune-capable. Falls back to AND-tightening WHERE when no GROUP BY is present so the mutator is never a no-op.
* Apply 1-3 random restriction rules per attempt (paper allows multiple).

JOIN, bare GROUPBY, and LIMIT remain excluded because they are invariant under ClickHouse's `EXPLAIN ESTIMATE` and would add no bug-finding power. This decision is captured inline.
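The ~67% figure for PK-biased leaves follows from simple counting, sketched below. The sketch assumes that "duplicate them 4x" means each PK column appears four times in total in the list the generator draws from uniformly; the class and method names are hypothetical.

```java
/** Hypothetical sketch of the PK-bias arithmetic; names are not from the PR. */
final class PkBiasSketch {

    /**
     * Probability that a uniformly drawn entry of the weighted column list
     * is a primary-key column, when each PK column is listed copiesPerPk times
     * and every other column is listed once (assumption of this sketch).
     */
    static double pkLeafProbability(int pkColumns, int otherColumns, int copiesPerPk) {
        int pkEntries = pkColumns * copiesPerPk;
        return (double) pkEntries / (pkEntries + otherColumns);
    }

    public static void main(String[] args) {
        // 1 PK column out of 3 columns, PK listed four times: 4 / (4 + 2)
        System.out.printf("biased:   %.3f%n", pkLeafProbability(1, 2, 4));
        // Same table without duplication: 1 / (1 + 2)
        System.out.printf("unbiased: %.3f%n", pkLeafProbability(1, 2, 1));
    }
}
```

With one PK column among three, the weighted list has 4 PK entries and 2 others, giving 4/6, compared with 1/3 when no duplication is applied.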
JDK 26 is the current release (March 2026); CI was pinned to JDK 11 and the pom enforced `source/target=11` via the Eclipse compiler (ecj). The latest ecj on Maven Central (3.45.0) only goes up to JDK 24, so to move forward we switch the maven-compiler-plugin to standard javac, drop the ecj/plexus dependencies and the `org.eclipse.jdt.core.prefs` compiler arguments, and set `<release>26</release>`. The `.settings/` directory remains for IDE use only.

* `maven-compiler-plugin` 3.10.1 -> 3.13.0, `<release>26</release>`, no compilerId override, no ecj deps.
* `maven-javadoc-plugin` 3.4.1 -> 3.11.2 and `<source>` bumped to 26.
* `.github/workflows/main.yml` and `release.yml`: every `java-version: '11'` -> `'26'` (27 occurrences total). Distribution remains Adoptium Temurin, which ships JDK 26 binaries.

Verified locally on Temurin 26.0.1: `mvn package -Dmaven.test.skip=true` clean, sqlancer jar runs both TLPWhere and CERT oracles against ClickHouse 26.5 with no JVM-level errors. The few remaining warnings (`System::loadLibrary` from the ClickHouse JDBC LZ4 native, `Unsafe` from Guava) are advisory and unrelated to this change.
Use JDK 25 (the current LTS, released Sept 2025) rather than JDK 26 (non-LTS, released March 2026). Same javac-via-maven-compiler-plugin setup as the previous commit; just flips the source/release level and the CI java-version.

* `pom.xml`: maven-compiler-plugin `<release>` 26 -> 25; release profile maven-javadoc-plugin `<source>` 26 -> 25.
* `.github/workflows/main.yml` and `release.yml`: every `java-version: '26'` -> `'25'` (27 occurrences). Distribution remains Adoptium Temurin.

Verified locally on Temurin 25.0.3: `mvn clean compile test-compile` green; produced jar is class-file major version 69 (Java 25).
The initial implementation covered only the scalar non-correlated subquery case (Section 3.1 case 2 of Zhang & Rigger SIGMOD '25). Extend to follow Algorithm 1 from the paper, picking one mode uniformly per check.

Modes:

1. Constant expression (Section 3.1 case 1, was missing). Generates a random column-free expression via the existing `ClickHouseExpressionGenerator`, evaluates it with `SELECT toTypeName(phi), phi`, and substitutes the literal back. The generator's `generateExpressionWithExpression` is seeded with a few typed constant leaves -- this is necessary because `generateExpressionWithColumns` short-circuits to a single constant when called with an empty column list.
2. Scalar non-correlated subquery (Section 3.1 case 2). The previous implementation's `min/max(col)` path, restated in the new framework.
3. Dependent expression (Section 3.2, was missing). Generates a random expression over one outer column k, builds a `SELECT DISTINCT k, phi FROM t` mapping, folds phi to a `CASE WHEN k = v_i THEN r_i ...` wrapped in `cast(..., 'expectedType')` so the folded predicate sees the same operand type as the original through compound predicates.

The outer predicate template is also varied (bare comparison, AND/OR compounds, NOT) so phi passes through richer constant-folding paths than the previous fixed `col op phi`.

Validated against ClickHouse 26.5.1.111: 4 threads, 5 minutes, 64,071 queries executed, 98% successful statement rate, 0 false positives.
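The dependent-expression fold in mode 3 (a `CASE WHEN k = v_i THEN r_i ...` wrapped in `cast(..., 'expectedType')`) can be sketched as a string builder. This is a rough illustration only: the class and method names are hypothetical, keys and results are assumed to be already rendered as SQL literals, and the sketch does not handle NULL keys (which `k = NULL` would never match) or type names that themselves contain single quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the dependent-expression fold; names are not from the PR. */
final class DependentFoldSketch {

    /**
     * Builds the folded replacement for phi from the (k -> phi) mapping
     * obtained via the auxiliary SELECT DISTINCT query.
     * Keys and values are SQL literals, passed through verbatim.
     */
    static String fold(String keyColumn, Map<String, String> mapping, String expectedType) {
        if (mapping.isEmpty()) {
            throw new IllegalArgumentException("empty mapping: nothing to fold");
        }
        StringBuilder sb = new StringBuilder("cast(CASE");
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            sb.append(" WHEN ").append(keyColumn).append(" = ").append(entry.getKey())
              .append(" THEN ").append(entry.getValue());
        }
        sb.append(" END, '").append(expectedType).append("')");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>(); // keeps aux-query row order
        mapping.put("1", "10");
        mapping.put("2", "20");
        System.out.println("SELECT * FROM t WHERE c0 < " + fold("k", mapping, "Int64"));
    }
}
```

The outer cast is what keeps the operand type of the folded expression aligned with the type reported for the original phi.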
Replace the flat (ClickHouseDataType, String) representation in
ClickHouseLancerDataType with a recursive ClickHouseType ADT (Primitive,
Nullable, LowCardinality, Unknown) plus a four-predicate capability layer.
Re-route every dispatch site that previously AssertionError'd on anything
outside {Int32, String}, add a defensive reflection parser, and extend
ClickHouseCast to cover every v1 primitive kind via a propagating
ClickHouseUnsupportedConstant sentinel.
Activates two new feature flags (--test-nullable-types,
--test-lowcardinality-types, both on by default) so the generator now emits
Nullable and LowCardinality columns. CODDTest's filter and legacy string
parser, CERT's generatorExprFor, and the table generator's PARTITION/SAMPLE/
ORDER clause emission are all rewritten to dispatch via the new capabilities.
Live SQLancer smoke against ClickHouse 26.5 (10 min, 4 oracles, 70k+
queries) surfaced three v1-introduced rejections and they are now handled:
allow_suspicious_low_cardinality_types is set on the JDBC URL when the LC
flag is on; allow_nullable_key=1 is added to MergeTree SETTINGS so wrapped
columns can participate in PARTITION/ORDER/SAMPLE; the
CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN family is added to ClickHouseErrors.
Plan and brainstorm documents that drove the implementation are included
under docs/. CI test enumeration in .github/workflows/main.yml is extended
to run the seven new test classes.
This fork only ships changes to the ClickHouse provider, so the per-DBMS matrix in .github/workflows/main.yml was 19 jobs we never read. Removes citus, cockroachdb, databend, datafusion, duckdb, hive, spark, hsqldb, mariadb, materialize, mysql, oceanbase, postgres, presto, sqlite, tidb, yugabyte, and doris. Keeps `misc` (project-wide style/PMD/Checkstyle/SpotBugs via `mvn verify` plus the misc unit tests and naming convention check) and `clickhouse` (the DBMS job that exercises the type-system foundation tests).
Add two complementary differential-testing capabilities:

1. SEMR oracle (--oracle SEMR) picks one "should-be-result-preserving" ClickHouse optimizer setting from a curated list, runs the same generated SELECT once with the setting forced 0 and once forced 1, and fails when the two multisets diverge. Targets cross-configuration consistency bugs of the shape documented at ClickHouseTLPHavingOracle.java:42 (ClickHouse#12264).
2. --random-session-settings + --random-session-settings-budget apply a random subset of a curated execution-mode catalog via SET k=v on the per-database JDBC connection. Every other oracle (TLP*, NoREC, PQS, CERT, CODDTest) implicitly runs under a different setting profile each database.

The two features are mutually exclusive in a single run (rejected at startup with a single clear error). The catalog excludes optimizer-rewrite settings from the randomization list to protect CERT/CODDTest invariants, and excludes settings hardcoded by TLPHaving/TLPAggregate from both lists. Setting churn (unknown setting, out-of-range value) is absorbed via a new expected-error catalog so it never surfaces as an oracle failure.

Plan: docs/plans/2026-05-17-001-feat-clickhouse-semr-oracle-settings-randomization-plan.md
The expression generator picked column leaves and operators independently of type, so a String column could feed an arithmetic operator and a Float column could feed gcd/lcm/intDiv. Against ClickHouse 26.2, system.query_log showed ~96% of SQLancer failures were ILLEGAL_TYPE_OF_ARGUMENT (Code 43) from this mismatch, with smaller contributions from NO_COMMON_TYPE join keys (386) and typed-comparison constants (53/32).

Five mechanical fixes against the same workload (--oracle TLPDistinct --random-session-settings true, 400 queries, seed 12345):

* generateExpressionWithColumns filters to numeric columns and the recursive descent stays in the numeric pool. Falls back to an Int32 constant when the table has only non-numeric columns.
* BINARY_FUNCTION splits into integer-only (intDiv/gcd/lcm with plain integer column refs) and any-numeric (max2/min2/pow with the recursive descent). ClickHouse promotes most math wrappers (sin, cos, sqrt, log...) to Float64, so the integer-only branch keeps leaves as bare column refs to stay integer-typed end to end. generateExpressionWithExpression also routes through getRandomAnyNumeric since its pre-built expression leaves are usually aggregate Floats.
* generateExpression(type, depth) now defaults rightLeafType to leftLeafType, inverting the previous "force same type with low probability" coin flip that produced Int32-vs-String comparisons.
* generateJoinClause enumerates (left, right) column pairs, prefers same-type, falls back to numeric-vs-numeric, and throws IgnoreMeException when no compatible key combination exists. Avoids server roundtrips for joins that would error with NO_COMMON_TYPE.
* Off-by-one in four column-picker call sites: getNotCachedInteger(0, size-1) excluded the last index; corrected to size.

Result: SELECT failure rate against ClickHouse 26.2.17.31 dropped from 41.6% to 0.09% on the same seeded workload, with the remaining 4 failures being runtime division-by-zero (out of scope for type fixes) and stray edge cases.
…atalog entry

Two infrastructure changes that benefit every ClickHouse oracle:

- Bump the CI ClickHouse image from 24.3.1.2672 to :head so wrong-result bugs in the active stable line surface earlier. The pin traded early regression detection for reproducibility; we now accept slight CI churn in exchange for catching regressions before they reach a tagged release.
- Add "is found in GROUP BY in query" and "(ILLEGAL_AGGREGATION)" to the expected-expression-error catalog. ClickHouse 26's new analyzer raises a different error string than the 24.x branch when a positional GROUP BY reference (GROUP BY 1) resolves to an aggregate SELECT-list column -- the old "Illegal value (aggregate function) for positional argument in GROUP BY" pattern was the 24.x form; both must be absorbed so the generator's harmless aggregate-positional output doesn't surface as an oracle finding in 26+. Surfaced via the EET HAVING-mode regression run but benefits TLPHaving and any future HAVING-using oracle equally.
Add the SIGMOD '25 paper's companion to CODDTest. Where CODDTest folds a sub-expression to its precomputed value and asserts the result is unchanged, EET goes the inverse direction: inject an expression that should fold to a fixed value (tautology, contradiction, or algebraic identity) and assert the rewrite is semantics-preserving. Same target bug class (optimizer constant-folding / short-circuit / partial-eval), orthogonal attack axis.

Selectable via --oracle EET. Each check() picks one of four modes uniformly:

- WHERE injection. Generate a base predicate `pred` and random `e`; conjoin `pred AND (3VL-tautology over e)` and assert rows unchanged, or `pred AND (3VL-contradiction over e)` and assert rows empty. The 3VL shapes are `(((e) OR NOT (e)) OR (e) IS NULL)` and `(((e) AND NOT (e)) AND (e) IS NOT NULL)` with binding-tight parens on every reference to `e` -- ClickHouse's parser binds OR looser than both NOT and AND, so an unparenthesized injection inside `pred AND ...` would parse the wrong way.
- HAVING injection. Same shapes injected into an aggregated query's HAVING clause. Reuses TLPHaving's `aggregate_functions_null_for_empty=1, enable_optimize_predicate_expression=0` SETTINGS suffix on both sides of the comparison to dodge ClickHouse issue #12264; not applying it produces false positives indistinguishable from EET findings.
- Expression-position rewrite. Pick a SELECT-list column `x`, probe its runtime type via `toTypeName`, wrap as `if(taut, x, x)`, `multiIf(taut, x, junk, x)`, or `CASE WHEN taut THEN x ELSE x END` (and the contradiction-negated form). Both arms share `x`'s type; the junk-branch value is `defaultValueOfTypeName(typeOfX)` -- a typed non-NULL default, picked because `cast(NULL, 'LowCardinality(...)')` is rejected at parse time (LowCardinality is not nullable). Each rewrite is wrapped in `cast(..., 'TypeOfX')` to neutralize the type widening some identities introduce.
- Algebraic identity. Type-safe substitution from a five-entry catalog (`ClickHouseEETIdentities`): `plus(x,0)`, `multiply(x,1)`, `concat(x,'')`, `coalesce(x,x)`, `if(true,x,x)`. Each entry carries a predicate that gates application to a safe type family. Float and Decimal are excluded from `plus`/`multiply` (NaN / -0.0 formatting and scale-coercion false positives). String only for `concat`.

Reuses `CODDTestBase` for failure-attribution fields; the naming mismatch is a deliberate trade-off acknowledged in the plan rather than mechanically duplicating six fields for the second oracle in this family.

Validated against ClickHouse 26.5.1.111 with a 27K-query burn-in plus the 1000-query integration test (T18_, --num-threads 1). No oracle assertion failures.

Plan in docs/plans/2026-05-18-001-feat-clickhouse-eet-oracle-plan.md.

Paper: Zhang and Rigger, "Constant Optimization Driven Database System Testing", SIGMOD '25 (DOI 10.1145/3709674).
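The two 3VL shapes used for WHERE and HAVING injection can be sketched as below, together with a small three-valued evaluator that checks they are constant for every truth value of `e`. The class and method names are hypothetical, and `null` models SQL NULL; this is an illustration of the shapes quoted above, not the oracle's implementation.

```java
/** Hypothetical sketch of the EET 3VL shapes; names are not from the PR. */
final class EetShapesSketch {

    /** Renders the tautology shape with parentheses around every use of e. */
    static String tautology(String e) {
        return "(((" + e + ") OR NOT (" + e + ")) OR (" + e + ") IS NULL)";
    }

    /** Renders the contradiction shape with parentheses around every use of e. */
    static String contradiction(String e) {
        return "(((" + e + ") AND NOT (" + e + ")) AND (" + e + ") IS NOT NULL)";
    }

    // Three-valued logic over Boolean, where null stands for SQL NULL.
    static Boolean not(Boolean a) {
        return a == null ? null : !a;
    }

    static Boolean or(Boolean a, Boolean b) {
        if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) {
            return true;
        }
        return (a == null || b == null) ? null : false;
    }

    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) {
            return false;
        }
        return (a == null || b == null) ? null : true;
    }

    /** Evaluates ((e OR NOT e) OR e IS NULL) under three-valued logic. */
    static Boolean evalTautology(Boolean e) {
        return or(or(e, not(e)), e == null);
    }

    /** Evaluates ((e AND NOT e) AND e IS NOT NULL) under three-valued logic. */
    static Boolean evalContradiction(Boolean e) {
        return and(and(e, not(e)), e != null);
    }

    public static void main(String[] args) {
        System.out.println("(c0 > 1) AND " + tautology("c1 + 1"));
        for (Boolean e : new Boolean[] {true, false, null}) {
            System.out.println(e + " -> " + evalTautology(e) + " / " + evalContradiction(e));
        }
    }
}
```

The `IS NULL` / `IS NOT NULL` arm is what makes the shapes hold when `e` is NULL: without it, `e OR NOT e` would evaluate to NULL rather than TRUE.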
Adds max_execution_time=120 to the JDBC URL. Without this cap, occasional heavyweight random queries hit the 300s socket_timeout and produce ambiguous client-side timeout exceptions instead of clean server-side error codes (3 such timeouts observed in a 15-min 2026-05-18 baseline run). The server-side cap surfaces as TIMEOUT_EXCEEDED, absorbed by the matching "Timeout exceeded: elapsed" + "(TIMEOUT_EXCEEDED)" multi-word substrings added to ClickHouseErrors.
Adds the implementation plan for three orthogonal query-generator additions: aggregate combinator chains (-If, -OrNull, -OrDefault, -Distinct, -Array, -State, -Merge, -ForEach, -Resample, -Map), set operations with explicit ALL/DISTINCT keywords (UNION ALL/DISTINCT, INTERSECT, EXCEPT) plus a new ClickHouseTLPSetOpOracle, and ARRAY JOIN structural plumbing (blocked on type-system v2 for activation). Sequenced as commit-level milestones on this branch, with per-phase yield gates measured against a pre-Phase-A baseline. Deepened against five reviewer agents; auto-fixes applied silently, strategic decisions integrated based on user direction (full combinator matrix, single-PR bundling, per-phase yield gates, EXCEPT operator coverage).
Adds compress=false to the JDBC URL. clickhouse-jdbc 0.9.6 has a defect in its LZ4-over-chunked-HTTP decoder (ClickHouseLZ4InputStream + ChunkedInputStream interaction) that fires MalformedChunkCodingException: CRLF expected at end of chunk mid-response, surfaced at the JDBC layer as SQLException: Failed to read value for column. Observed 16 times across the 2026-05-18 15-min baseline (0.33% per-query rate); validated server-side via clickhouse-client (native protocol) which returns valid data for every failing query — confirming the bug is in the driver, not in ClickHouse. With compression off the buggy code path is bypassed entirely: the response stream becomes the raw chunked HTTP body, no LZ4 frame parsing. Trade-off: ~3x larger responses on the wire, but SQLancer's queries are small and the connection is loopback, so net throughput is unaffected. Revisit when clickhouse-jdbc fixes the LZ4 decoder upstream.
The 0.9.8 driver tightened URL-param validation (ClientConfigProperties at 0.9.8 rejects unknown keys with ClientMisconfigurationException, whereas 0.9.6 silently forwarded them as server settings). The pre-existing URL params allow_suspicious_low_cardinality_types, allow_experimental_analyzer, and max_execution_time are ClickHouse server settings and must now be prefixed with `clickhouse_setting_` to pass through. Reconfirmed during the bump: ClickHouseLZ4InputStream.class is byte-identical between 0.9.6 and 0.9.8 (md5 3519c1f7…), so the LZ4-over-chunked-HTTP decoder bug persists. `compress=false` remains load-bearing.
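A rough sketch of how a connection URL could be assembled under the 0.9.8 rule, with server settings carrying the `clickhouse_setting_` prefix and driver options passed as-is. The helper class, its method, and the host, port, and database values are hypothetical; values are not URL-encoded in this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical URL builder; not part of SQLancer or clickhouse-jdbc. */
final class JdbcUrlSketch {

    static final String SERVER_SETTING_PREFIX = "clickhouse_setting_";

    /** Appends driver options verbatim and server settings with the required prefix. */
    static String buildUrl(String hostPort, String database,
                           Map<String, String> driverOptions,
                           Map<String, String> serverSettings) {
        StringJoiner query = new StringJoiner("&");
        driverOptions.forEach((k, v) -> query.add(k + "=" + v));
        serverSettings.forEach((k, v) -> query.add(SERVER_SETTING_PREFIX + k + "=" + v));
        String base = "jdbc:clickhouse://" + hostPort + "/" + database;
        return query.length() == 0 ? base : base + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> driver = new LinkedHashMap<>();
        driver.put("compress", "false");
        Map<String, String> server = new LinkedHashMap<>();
        server.put("max_execution_time", "120");
        server.put("allow_suspicious_low_cardinality_types", "1");
        System.out.println(buildUrl("localhost:8123", "default", driver, server));
    }
}
```

Keeping the two maps separate makes it harder to forget the prefix when a new server setting is added to the URL.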
…umbing

Add three orthogonal generator-surface expansions and two new TLP-family oracles.

Set operations:
- ClickHouseSetOperation AST with explicit ALL/DISTINCT keyword variants for UNION / INTERSECT / EXCEPT (six SetOpKind values). Visitor + ToString folded so nested set-ops auto-parenthesise; a top-level set-op renders without outer parens.
- ClickHouseTLPSetOpOracle exercises four invariants on the canonical TLP partition (p, NOT p, p IS NULL): UNION ALL multiset equality, UNION DISTINCT set equality, INTERSECT pairwise disjointness, EXCEPT coverage + pairwise disjointness. Renders explicit operator keywords; SETTINGS pinning is belt-and-suspenders. Local guards reject aggregate fetch-columns, non-deterministic predicates, and multi-column shapes that would mask bugs. Startup probe disables the oracle when *_default_mode settings are unknown.

Aggregate combinators:
- ClickHouseAggregateCombinator + chain field on ClickHouseAggregate. Backward-compatible: empty chain renders plain SUM(x) form; non-empty folds the suffixes into the camelCase function name (sumIf, sumIfArray, etc.) and appends per-suffix extra args inside one paren group. Order-preserving.
- Generator emits chains under --test-aggregate-combinators, default off. Weighted suffix picker; per-suffix extra-arg grammar (-If takes one boolean, -Resample takes three integers, all others none).
- ClickHouseTLPCombinatorOracle catalog: sumIf, countIf, avgOrNull, sumOrNull, minIf, maxIf. -OrNull family forces aggregate_functions_null_for_empty=0 to avoid double-encoding the empty-NULL semantics.

ARRAY JOIN structural plumbing:
- arrayJoinExprs + arrayJoinLeft on ClickHouseSelect, emitted between FROM and any regular JOIN clauses. Default empty -- the generator never populates it until type-system v2 introduces Array column generation.

Error catalog:
- getSetOpErrors / getCombinatorErrors / getArrayJoinErrors with multi-word substring discipline. UNKNOWN_SETTING family deliberately excluded so the set-op startup probe's signal stays visible to future audits.

Oracle factory: --oracle=SetOpTLP, --oracle=CombinatorTLP wired through.

Options: --test-set-op-tlp, --test-aggregate-combinators, --test-combinator-tlp, --test-array-join (all default off).
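The combinator-chain rendering rule (empty chain keeps the plain `SUM(x)` form; a non-empty chain folds suffixes into the function name and appends per-suffix extra arguments in order) can be sketched as follows. The class, record, and method names are hypothetical, and placing the extra arguments after the aggregate's own argument is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of combinator-chain rendering; names are not from the PR. */
final class CombinatorRenderSketch {

    /** One combinator suffix (without the leading dash) and the extra args it contributes. */
    record Suffix(String name, List<String> extraArgs) {}

    static String render(String function, String arg, List<Suffix> chain) {
        if (chain.isEmpty()) {
            // Backward-compatible plain form, e.g. SUM(c0)
            return function.toUpperCase(Locale.ROOT) + "(" + arg + ")";
        }
        StringBuilder name = new StringBuilder(function.toLowerCase(Locale.ROOT));
        List<String> args = new ArrayList<>();
        args.add(arg);
        for (Suffix suffix : chain) {   // chain order is preserved
            name.append(suffix.name());
            args.addAll(suffix.extraArgs());
        }
        return name + "(" + String.join(", ", args) + ")";
    }

    public static void main(String[] args) {
        System.out.println(render("sum", "c0", List.of()));
        System.out.println(render("sum", "c0", List.of(new Suffix("If", List.of("c1 > 0")))));
    }
}
```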
Iterated against clickhouse-server:head (26.5.1) and surfaced three issues in my own work plus one candidate ClickHouse behaviour anomaly.

SetOpTLP: drop unsound pairwise invariants. The pairwise `Tp INTERSECT Tnp ≡ ∅` and `DISTINCT(Tp) EXCEPT DISTINCT(Tnp) ≡ DISTINCT(Tp)` forms in the original plan are unsound on projections: TLP partitions rows, but a SELECT-list expression can collapse rows from disjoint partitions to identical projected values (constant fetchCol, `c0/c0`, etc.). Replace with:
- INTERSECT subset: `DISTINCT(branch) INTERSECT DISTINCT(T) ≡ DISTINCT(branch)`.
- EXCEPT coverage only (the chained 4-way form already valid).

SetOpTLP: NaN/Infinity guard. SQL equality says `NaN != NaN`, so a single NaN value breaks INTERSECT/EXCEPT-routed comparisons even when row sets are correct. Skip via `IgnoreMeException` when any result row is `NaN`, `inf`, `Infinity`, or their negative forms. Applies to UNION_DISTINCT, INTERSECT, and EXCEPT modes; UNION_ALL multiset equality remains sound.

CombinatorTLP: fix countIf identity. `countIf` over empty input always returns 0 (count's identity), but `sum(toUInt64(c))` over empty with `aggregate_functions_null_for_empty=1` returns NULL. The two sides diverge exactly on empty branches (e.g., LEFT ANTI JOIN with no unmatched rows). Pin `countIf`'s identity to `null_for_empty=0` so the sum-based rewrite also returns 0 on empty. Other -If identities (sumIf, minIf, maxIf) stay at =1 because their aggregate's empty-input return (NULL) coincides on both sides.

Table generator: restrict engines to MergeTree only. Log/TinyLog/StripeLog and Memory engines diverge from MergeTree on parts, projections, skipping indexes, and mutation semantics; oracle-level false positives swamp any bug-finding signal.

Validation also surfaced one candidate ClickHouse anomaly worth filing separately: with `LEFT ANTI JOIN`, the value of `right_side.<join-key>` in the SELECT projection differs between the no-WHERE form (returns 0 / default) and the with-WHERE form (returns the left-side join key value). Reproducible on 26.5.1.761 with both old and new analyzer. Not addressed by this commit.
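The NaN/Infinity guard for SetOpTLP can be sketched as a scan over the textual result values. The class and method names are hypothetical, the values are assumed to be available as strings, and a real oracle would throw its skip exception where this sketch only returns a flag.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of the non-finite-value guard; names are not from the PR. */
final class NonFiniteGuardSketch {

    // Spellings treated as non-finite, compared case-insensitively.
    private static final Set<String> NON_FINITE =
            Set.of("nan", "inf", "infinity", "-nan", "-inf", "-infinity");

    /** Returns true when any rendered value is NaN or an infinity; null entries model SQL NULL. */
    static boolean hasNonFinite(List<String> renderedValues) {
        for (String value : renderedValues) {
            if (value != null && NON_FINITE.contains(value.trim().toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasNonFinite(List.of("1", "2.5")));   // finite values only
        System.out.println(hasNonFinite(List.of("1", "NaN")));   // contains NaN
    }
}
```

Matching whole values rather than substrings avoids skipping rows that merely contain text such as "information" in a String column.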
… buffering Root-cause analysis of the ~10% JDBC-stream failures from the 2026-05-18 SetOpTLP/TLPWhere baselines: the underlying cause is *not* the JDBC driver or the LZ4 path -- it's a fundamental quirk of ClickHouse's HTTP protocol. When a query starts producing rows, the server has already committed to `HTTP/1.1 200 + Transfer-Encoding: chunked`. If execution then errors on some row (e.g., a generated `intDiv(c0, c0)` row hits c0=0 → ILLEGAL_DIVISION mid-stream), the server can't change its mind on the HTTP status -- it writes the plain-text exception into the already-binary response body and closes the connection without the proper terminating chunk. clickhouse-jdbc 0.9.8's `BinaryStreamReader.readDoubleLE` / `readIntLE` then hits EOF and surfaces as `SQLException: Failed to read value for column ...` with `ConnectionClosedException: Premature end of chunk coded message body` underneath. Verified deterministically with a 10M-row table whose mid-row triggers ILLEGAL_DIVISION: hits the exact same exception chain we saw in the wild. Fix: raise `http_response_buffer_size` to 100 MB and pin `wait_end_of_query=1` on the JDBC connection URL. With this, ClickHouse buffers the full response server-side before sending; if execution errors, the server can still emit HTTP 500 + a parseable error body, and the JDBC driver decodes it as a proper `ServerException`. `wait_end_of_query=1` alone is insufficient -- it only takes effect when the response fits the default buffer (a few MB). 100 MB covers every SQLancer-generated result observed so far without giving the server license to allocate gigabytes per query under concurrent load. Verification: 4-thread / 3-min SetOpTLP burn-in went from ~10% JDBC-stream failure rate to zero. Throughput per database appears lower in the new runs because per-DB sessions now complete their full --num-queries budget instead of terminating early on transport failures -- aggregate query throughput is unchanged. 
The earlier LZ4-decoder hypothesis (kept for the `compress=false` rationale) was a contributing path on 0.9.6 but not the root cause; the LZ4 stack trace was visually similar but came from a separate driver defect.
New oracles registered in ClickHouseOracleFactory:
- QccCache — cross-query cache-poisoning differential (ClickHouse#104781)
- SortedUnionLimitBy — sorted UNION ALL + outer LIMIT BY/DISTINCT (#103231)
- RowPolicy — `USING p` filters identically to `WHERE p` (#97076)
- TableFunctionIN — numbers(N) + IN-vs-OR equivalence (#103835)
- ViewEquivalence — view read == inlined SELECT (#100390)
Generator surface:
- PREWHERE clause now emitted by TLPBase (10%); unlocks #104781 shape.
- FINAL emitted on engines where it is legal — Replacing/Summing/
Aggregating/Collapsing variants. Engine tracked via system.tables.
- Skip indexes (bloom_filter, set, minmax, ngrambf_v1) and projections
(column-subset / aggregating) emitted at CREATE TABLE.
- ReplacingMergeTree and SummingMergeTree added to the engine pool.
- Bare large-integer literals (256/65536/2147483648) folded into
predicates ~10% — targets #101287.
SEMR expansion (Settings-Equivalence-Multiset-Result oracle):
- use_query_condition_cache, use_skip_indexes_on_data_read,
use_index_for_in_with_subqueries, optimize_use_implicit_projections —
cache and skip-index toggles.
- transform_null_in, lazy_columns_replication, compile_expressions,
compile_aggregate_expressions, optimize_aggregators_of_group_by_keys,
optimize_trivial_count_query — bug-history-driven additions covering
#95674, #94339, #103809, #105054, #100794.
Schema:
- ClickHouseTable now records the engine string (read from
system.tables) and exposes supportsFinal() for shape gating.
Verification: ~4000-query multi-oracle burn on ClickHouse 26.5.1.111
(TLPDistinct, TLPGroupBy, TLPHaving, SEMR, QccCache, SortedUnionLimitBy,
RowPolicy, TableFunctionIN, ViewEquivalence) — zero regressions.
v1 emitted only Primitive(Int32|String) optionally wrapped in Nullable/LowCardinality. v2 extends the ADT with four new constructors (FixedString(N), Decimal(P,S), DateTime64(prec), Array(T)) plus a new DateTime Kind, and widens the generator's pick set to every entry of Kind with weighted distribution biased toward bug-bait surfaces (UInt32/UInt64 for mixed-width JOIN keys, Date/DateTime for partition expressions, FixedString/Decimal/DateTime64 at low rate).

ReplacingMergeTree(ver) and SummingMergeTree(col) engine args now pick a viable column from the new type pool; previously both were emitted with empty args because the only numeric the v1 picker chose (Int32) was signed. ARRAY JOIN, plumbed earlier behind --test-array-join, now fires when the FROM table carries an Array column.

Refactor: pre-build the dummy column list with its final types and hand that to ColumnBuilder so the in-memory column list matches what the server sees -- the previous flow generated types twice and engine args picked from a list that didn't match the emitted DDL. Cast rendering uses the compound type (FixedString(5), Decimal(9,3)) not the bare JDBC enum tag, otherwise the cast text drops the parameter slots and CH rejects with NUMBER_OF_ARGUMENTS_DOESNT_MATCH.

Error catalog grows with the new emission surface: BAD_ARGUMENTS substrings for engine-arg / PK overlap, decimal "Too many digits" + ARGUMENT_OUT_OF_BOUND for over-magnitude decimal literals (also clamped at emission), and Date / DateTime parse failures.

Smoke run on live ClickHouse 26.5.1.111: 4324 queries / 98% successful-statement rate on TLPWhere with all v2 flags on, zero uncaught AssertionErrors across TLPWhere / QccCache / PQS.
Ten profile/fix iterations against clickhouse-server:head (26.5.1.779), 6
sqlancer threads, --oracle TLPWhere. Two-run 90s A/B with stock HEAD vs.
this branch: mean CH-side stmts 13,942 -> 15,581 (+11.8%); thread-death
rate 2/3 runs -> 1/3 runs.
ClickHouseErrors.java
Absorb TOO_LARGE_STRING_SIZE (Code 131). CH 26.5 surfaces oversized
FixedString literals with that wrapper rather than FIXED_STRING; without
it the generator's expected-errors filter let two of six threads die on
the very first INSERT batch (~33% capacity gone at t=0).
ClickHouseProvider.java
- DROP DATABASE ... SYNC + drop the two Thread.sleep(1000) calls in
createDatabase. The sleeps date to the 2020 module rewrite with no
justifying comment; modern Atomic-engine rename-on-drop makes them
unnecessary, and they were costing ~84 thread-seconds per 1080s budget.
- Move clickhouse_setting_max_execution_time / allow_experimental_analyzer
/ allow_suspicious_low_cardinality_types off the JDBC URL into SET
statements at connect time. http_response_buffer_size and
wait_end_of_query stay on the URL: the former is consumed at the moment
the server commits to a chunked response (re-SETting too late) and the
latter is HTTP-only (SET returns UNKNOWN_SETTING).
ComparatorHelper.java
Replace replaceAll("[\\.]0+$", "") in getResultSetFirstColumnAsString
with a regex-free char-from-end scan. Was ~24% of all sqlancer-side
CPU samples because it ran on every row of every oracle result set
(Pattern.compile + Matcher.<init> showed up 400+ times per profile).
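A minimal sketch of what such a scan could look like, assuming single-line values (the regex's `$` also matches before a final line terminator, which this version does not replicate). The class and method names are hypothetical and are not taken from ComparatorHelper.

```java
final class TrailingZeroStripper {
    /**
     * Regex-free stand-in for s.replaceAll("[\\.]0+$", "") on single-line input:
     * drops a trailing ".0", ".00", ... and leaves every other string untouched.
     */
    static String stripTrailingZeroFraction(String s) {
        int i = s.length();
        // Walk backwards over the run of trailing '0' characters.
        while (i > 0 && s.charAt(i - 1) == '0') {
            i--;
        }
        boolean sawZero = i < s.length();
        // Strip only when at least one zero is immediately preceded by the decimal point.
        if (sawZero && i > 0 && s.charAt(i - 1) == '.') {
            return s.substring(0, i - 1);
        }
        return s;
    }
}
```

The work is proportional to the number of trailing zeros, and the common no-match case returns the input instance without allocating.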
Main.java
Drop per-write flush in StateLogger.write. The current-database log
file is closed (and therefore implicitly flushed) on both success and
AssertionError paths of DBMSExecutor.run, so reproducer integrity is
preserved -- the only behavior we lose is the very last few queries
being durable if the JVM is hard-killed before the finally-block close
runs.
common/log/SQLLoggableFactory.java
Fast-path createLoggable for newline-free input (the overwhelming
common case): a single LoggedString allocation when the query already
ends with ';' and has no suffix, otherwise one StringBuilder pass that
escapes \n / \r inline instead of two separate String.replace passes.
Each logged statement funnels through here twice when
--log-execution-time=true.
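The single-pass escape could be sketched roughly as follows. This is an illustration only: the class and method names are hypothetical, and it assumes the escape maps a raw LF / CR to the two-character sequences `\n` / `\r`.

```java
final class NewlineEscaper {
    /** Replaces raw LF / CR with the two-character sequences \n / \r in one pass. */
    static String escapeLineBreaks(String query) {
        // Fast path: nothing to escape, so hand back the same instance without copying.
        if (query.indexOf('\n') < 0 && query.indexOf('\r') < 0) {
            return query;
        }
        StringBuilder sb = new StringBuilder(query.length() + 8);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\n') {
                sb.append("\\n");
            } else if (c == '\r') {
                sb.append("\\r");
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```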
common/query/SQLQueryAdapter.java
setEscapeProcessing(false) on every Statement / PreparedStatement.
SQLancer never emits JDBC escape syntax ({fn ...}, {call ...},
{escape '\'}), so the driver's escape-to-native preprocessor is pure
overhead -- worth ~70 String.replaceAll samples per profile on
clickhouse-jdbc 0.9.8. Wrapped in try/catch in case a driver throws
on the setter.
common/visitor/ToStringVisitor.java
Pre-size the shared StringBuilder to 512 chars. Observed sqlancer SQL
is p50=49, p90=176, p99=223, max=769; the default capacity of 16 was
forcing up to six grow-and-arraycopy cycles per AST -> SQL render
(capacity grows as 2n+2: 16, 34, 70, 142, 286, 574, 1150).
Captures the non-obvious traps from the perf workstream so a future
session (human or agent) doesn't have to rediscover them:
- ClickHouse head container needs CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
+ CLICKHOUSE_SKIP_USER_SETUP=1 or the default user has no network
access ("password is incorrect" surface error).
- wait_end_of_query is HTTP-only; http_response_buffer_size is consumed
at the moment the server commits to a chunked response, so both have
to stay on the JDBC URL. Other clickhouse_setting_* params are safe to
apply via SET after connect.
- mvn package must include -Djacoco.skip=true under JDK 25 (JaCoCo
0.8.12 chokes on class file major version 69).
- Sqlancer's CLI parser is positional: global options before the DBMS
subcommand, DBMS options after, otherwise jcommander rejects them.
- This shell ships a poisoned JAVA_TOOL_OPTIONS / ASAN_OPTIONS that
every java/mvn/jfr call has to unset.
…review
Targets the 36 wrong-result bugs surveyed in the QA-intel yearly review.
Each oracle maps to a specific bug cluster from the report:
* SEMRMulti -- pairwise/tripled SEMR over expanded SEMR_SETTINGS, catches
optimizer-pass-interaction regressions like #100029 / #93483.
* KeyCondition -- wraps predicate column refs in materialize() and diffs
against the indexed path; catches #92492 (KeyCondition regex pruning).
* PartitionMirror -- builds a no-PARTITION-BY sister table per iteration
and diffs SELECTs; catches #90240 and any future partition-pruning or
physical-INSERT-routing wrong-result bug.
* Parallelism -- same SELECT under serial / parallel / two-level GROUP BY
profiles; catches #99109, #99111, #80439.
* Cast -- diffs accurateCast vs accurateCastOrNull on the fitting rows;
catches #100697 / #100471 / #100740 / #101763 / #100049.
* JoinAlgorithm -- diffs JOIN result multisets under {hash, partial_merge,
grace_hash}; catches #100781 and the SEMI/ANTI conversion family.
* SchemaRoundtrip -- creates two parallel tables under
data_type_default_nullable=0 vs =1 with explicit NOT NULL and asserts
the resulting column types match; catches #97287 / private#53340.
Supporting changes:
- SEMR_SETTINGS expanded with 9 result-preserving optimizer flags
(optimize_skip_unused_shards, optimize_functions_to_subcolumns,
optimize_rewrite_regexp_functions, query_plan_use_logical_join_step,
query_plan_direct_read_from_text_index, query_plan_text_index_add_hint,
read_in_order_use_buffering, compile_sort_description).
- ClickHouseJoin.JoinType gains LEFT_ANY, RIGHT_ANY, ANY_INNER,
LEFT_SEMI, RIGHT_SEMI; the visitor renders all five. The existing
join generator (Randomly.fromOptions(values())) picks them
automatically, so every JOIN-aware oracle (TLP, NoREC, SEMR, JoinAlg)
now covers ANY/SEMI shapes.
- --semr-arity flag controls the SEMRMulti hypercube arity.
Smoke-tested against clickhouse/clickhouse-server:head 26.5.1.805 over
4 threads x ~55s: ~10k oracle iterations across all 7 oracles, 0
threads shut down, 95% successful-statements rate. Pre-existing SEMR
and TLPWhere oracles re-checked, no regression.
Three structural issues surfaced during the first 25-oracle 15-minute run
and three regressions from W3 (ANY/SEMI join expansion) surfaced once the
existing TLP / NoREC / SEMR oracles started picking the new join types.
Fixes here are scope-minimal: each addresses one observed failure mode
with the smallest change that actually held up across a re-run.
* ClickHouseProvider.getDatabaseName(): when the comma-joined --oracle
list has 25 entries, the resulting database name plus the appended ".sql.tmp"
metadata suffix overflows the ext4 255-byte filename limit. ClickHouse
surfaces this as "Code: 458 CANNOT_UNLINK" on every DROP/CREATE DATABASE
and the worker thread dies. Substitute a stable short hash when the
suffix would push the name past 200 bytes; single-oracle runs keep the
readable suffix.
* ClickHouseProvider: max_execution_time lowered from 120s to 30s. With
the W3 JOIN-shape expansion the generator now emits multi-table FROM
clauses ("SELECT * FROM t1, t2, t3") regularly; at 120s the Cartesian
result can monopolise a worker thread for the full 2 minutes draining
the JDBC stream. 30s preserves the "clean TIMEOUT_EXCEEDED rather than
ambiguous socket_timeout" property of the original cap with bounded
per-thread blockage.
* ClickHouseExpressionGenerator.getRandomJoinClauses (both call sites):
restrict the random pick to DETERMINISTIC_JOIN_TYPES. ANY / SEMI break
TLP / NoREC / SEMR multiset equality by construction (their per-row
match choice is algorithm-dispatched); the dedicated JoinAlgorithm
oracle already filters these at oracle level. Caught as TLPWhere
"size of the result sets mismatch" with RIGHT ANY JOIN in run #1.
Plus three disk-pressure mitigations for the dev container (.claude/
clickhouse-config/). Without them, a 6-thread 15-minute run produces
~1 GB of /var/lib/clickhouse + /var/log/clickhouse-server cruft (>98%
observability, not user data); with them, ~150 MB:
- log_level.xml drops the server file logger from trace to warning.
Kills ~80% of system.text_log growth (the table-shaped mirror of the
file logger). File-log growth is dampened too but is dominated by
ERROR-level stack traces from sqlancer's malformed queries, which the
level cap can't touch.
- trace_log_disabled.xml uses <trace_log remove="remove"/> to remove the
table at config-merge time. On a fresh container the table does not
exist; on a retrofit the write pipeline is short-circuited and the
table sits at 0 rows.
- system_log_ttl.xml caps processors_profile_log retention at 1 hour via
the config-driven <ttl> element. ALTER TABLE ... MODIFY TTL is NOT
durable for system tables (observed on 26.5.1.805) -- the server
reapplies the config-defined engine on restart and the ALTER is lost.
clickhouse-disk-cleanup.sh is the manual sibling: idempotent, drops
orphan sqlancer databases + truncates system *_log tables + in-place
truncates the file logs. Used between long runs.
CLAUDE.md docker-run snippet updated to mount the three XML files
individually into /etc/clickhouse-server/config.d/ (subdirectory mounts
don't work -- ClickHouse's config processor scans flat *.xml only).
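The database-name shortening from the first fix in this commit can be sketched as below. This is a hedged illustration, not the code in `ClickHouseProvider.getDatabaseName()`: the class name, the `_` separator, the SHA-256 digest and the 12-hex-digit length are all assumptions. Only the 200-byte threshold comes from the commit message.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DatabaseNameShortener {
    static final int MAX_BYTES = 200; // threshold quoted in the commit message

    /** Keeps the readable suffix when it fits, otherwise swaps in a stable short hash. */
    static String name(String prefix, String oracleSuffix) {
        String readable = prefix + "_" + oracleSuffix;
        if (readable.getBytes(StandardCharsets.UTF_8).length <= MAX_BYTES) {
            return readable;
        }
        return prefix + "_" + shortHash(oracleSuffix);
    }

    /** First 6 bytes of SHA-256 as 12 hex digits; the same input always yields the same name. */
    static String shortHash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 6; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }
}
```

A content hash rather than a random ID keeps the name reproducible across runs with the same `--oracle` list, which is what "stable" appears to require here.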
Both failures predate the W1-W4 oracle work; the branch had been failing on every commit since 2026-05-18 for these reasons. Confirmed via gh run list against main (last green 2026-05-16) -- the regressions are infrastructure-level on the branch, not introduced by code changes.

* .github/workflows/main.yml: pass CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 and CLICKHOUSE_SKIP_USER_SETUP=1 to the test ClickHouse container. The clickhouse/clickhouse-server:head entrypoint tightened sometime between 2026-05-16 and 2026-05-18: when no user/password is set AND these flags are absent, network access for the `default` user is disabled and every JDBC connection fails with `Code: 194 REQUIRED_PASSWORD`. This was already documented in .claude/CLAUDE.md for local dev; bringing CI in line.
* mvn formatter:format applied to satisfy the eclipseformat validate hook. Eighteen files reformatted; no behavioural change. The formatter primarily reflows multi-line javadoc paragraphs and the occasional break-around-string-concat. ClickHouseTLPBase.java was the first reported offender; running format surfaced the rest in one pass.

Both fixes are independent of the W1-W4 oracle additions and the subsequent dev-environment hardening commit -- they fix the workflow, not the engine or generator.
* New `sqlancer.clickhouse.transport` package (~1100 LOC):
- `ClickHouseTransport` interface (executeUpdate / executeQuery / close).
- `ClickHouseHttpTransport`: raw HTTP POST to `/`, format
`TabSeparatedWithNamesAndTypes`. Uses `java.net.HttpURLConnection`
(no Apache HC, no clickhouse-jdbc), explicit keep-alive, JVM-wide
`http.maxConnections=32` so each of sqlancer's 6+ worker threads reuses
one kept-alive TCP socket instead of opening a new one per request.
- `ClickHouseTransportConnection` / `ClickHouseTransportStatement` /
`ClickHouseTransportResultSet`: minimum-viable `java.sql.*` shims
so the rest of sqlancer's code (`SQLConnection`, `SQLQueryAdapter`,
`ComparatorHelper`, ClickHouseSchema, every oracle) consumes the
HTTP backend transparently. Unused JDBC methods throw, so coverage
gaps surface loudly. Reads UInt64 via BigInteger -> long with
clamp so legitimate values above `Long.MAX_VALUE` don't crash PQS.
* `ClickHouseOptions`: new `Transport` enum + `--transport` flag.
Default is `JDBC` for backwards compatibility; opt into HTTP with
`--transport http` when the clickhouse-jdbc 0.9.8 chunked-transfer
failures (MalformedChunkCodingException family) become a problem.
* `ClickHouseProvider.createDatabase` forks on the option: HTTP path
builds a transport with the same per-connection settings the JDBC
URL carried (max_execution_time=30, wait_end_of_query=1,
http_response_buffer_size=100MB, analyzer/LowCardinality flags),
wraps it in the Connection shim, and returns the same `SQLConnection`.
JDBC path is the existing code with two small fixes folded in:
1. `clickhouse_setting_max_execution_time=30` moved from a session
SET to the JDBC URL so it applies to the first request rather
than after a SET handshake (closes the SocketTimeout / 5-min
JDBC-read-timeout window observed in the 2026-05-19 baseline).
2. `socket_timeout` lowered to 60000 ms, paired tighter around the
30 s server cap so a stalled connection releases the thread in
~60 s instead of 5 min.
* `ClickHouseErrors`: absorb the HTTP/transport flakes that surfaced
as false-positive oracle trips in the 2026-05-19 run:
- `MalformedChunkCodingException` / `CRLF expected at end of chunk`
- `TruncatedChunkException` / `Truncated chunk (expected size:`
- `ConnectionClosedException` / `Premature end of chunk coded ...`
- `SocketTimeoutException` / `Read timed out` /
`Query request failed (attempt:` / `DataTransferException`
- `cannot be presented as long` (PQS pivot row, UInt64 > 2^63-1).
Smoke (60s, 6 threads, all 25 oracles, fresh head image):
AssertionErrors 38 (baseline) -> 1 (post-fix).
Threads-shut-down 22 -> 1.
The lone remaining trip is a real ClickHouse bug filed as #105355.
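The UInt64 clamp mentioned for `ClickHouseTransportResultSet` could look roughly like the sketch below. The class and method names are hypothetical; only the BigInteger -> long conversion with saturation at `Long.MAX_VALUE` is taken from the description above.

```java
import java.math.BigInteger;

final class UInt64Clamp {
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Parses a UInt64 decimal string and saturates at Long.MAX_VALUE instead of overflowing. */
    static long toClampedLong(String uint64Text) {
        BigInteger value = new BigInteger(uint64Text);
        if (value.signum() < 0) {
            throw new NumberFormatException("UInt64 cannot be negative: " + uint64Text);
        }
        // Anything above 2^63-1 collapses to Long.MAX_VALUE; smaller values convert exactly.
        return value.compareTo(LONG_MAX) > 0 ? Long.MAX_VALUE : value.longValueExact();
    }
}
```

Clamping is lossy: every value above 2^63-1 becomes indistinguishable, so a caller that needs exact comparison would have to keep the textual form instead.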
Three independent fixes for false-positive AssertionError trips that were burying real bug signal in the 2026-05-19 head-ClickHouse baseline.

* ClickHouseTableGenerator: refuse `ORDER BY tuple()` for ReplacingMergeTree / SummingMergeTree. With an empty sort key these engines dedupe/sum across ALL rows, so visible cardinality drifts with the merge schedule; any oracle that compares two SELECTs against the same table sees racy row counts. Fingerprint: the false (756/126), (78/13), (5/1) TLPWhere/TLPDistinct trips in the 2026-05-19 run all had ratios matching one cartesian factor's pre-merge part count. Fall back to `ORDER BY <first column>` for the engines that need a real key; plain MergeTree is untouched.
* NoRECOracle: snapshot per-table `SELECT count()` immediately before Q1 and again after Q2. If the count drifts between the two reads, the divergence is a state-drift artifact (the same merge race the table generator above is now closing off, plus any future race we haven't enumerated). Absorb with `IgnoreMeException` rather than asserting. The check is generic across DBMSes -- it's cheap (two `count()` queries per oracle iteration) and the failure mode it guards against is universal.
* .claude/CLAUDE.md: document `-Xmx4g` for long sqlancer runs. The default heap fills mid-run on dense reproducer dumps and 37 of 38 saved `logs/clickhouse/database*.log` files in the 2026-05-19 48-min baseline were OOM-truncated -- the AssertionError reproducer wrote schema + INSERTs successfully but the JVM died before serialising the failing query. 4 GiB is enough for a 25-oracle composite x 6 threads x multi-hour run; the truncated reproducers were the single biggest blocker to acting on the divergences.

Smoke (60 s x 6 threads x all 25 oracles, post-fix):
ReplacingMergeTree/SummingMergeTree+`ORDER BY tuple()` in schemas: 0
`the counts mismatch (X and Y)!` NoREC asserts: 0
OOM-truncated reproducer files: 0
The remaining trips point at real or otherwise-uncovered bugs (one is ClickHouse #105355).
Switch the JDBC connection from `compress=false` (workaround for the ClickHouseLZ4InputStream + Apache HC ChunkedInputStream interaction bug in clickhouse-jdbc 0.9.6/0.9.8) to `compress=true&client.use_http_compression=true`. That routes response decoding through Apache Commons Compress's lz4-framed decoder via the `CompressedEntity` HTTP path, leaving the buggy native-protocol class out of the picture entirely. Verified on the wire via a local proxy: the driver now sends `Accept-Encoding: lz4` and the server responds with `Content-Encoding: lz4`, auto-appended `enable_http_compression=1` per request. A 500-row × 1 KB SELECT smoke and a 315-query sqlancer run finished with 0 thread shutdowns and no `Premature end of chunk coded message body` failures.
When --log-each-select=true (default), the per-thread reproducer file should be self-sufficient: schema setup + bulk load + failing query in order. Three oracles were silently running DDL/data statements via SQLQueryAdapter(..., useLogger=false), so when one of them tripped an AssertionError the saved logs/clickhouse/database*.log was missing prerequisite statements and could not be replayed standalone.

- CERTOracle: ANALYZE TABLE
- PartitionMirrorOracle: DROP/CREATE of the mirror table plus INSERT INTO mirror SELECT * FROM source
- SchemaRoundtripOracle: the round-trip CREATE pair plus cleanup DROPs

Each new write is gated on state.getOptions().logEachSelect() to match the existing SQLQueryAdapter logging convention.

Also document the locally-patched clickhouse-jdbc 0.9.8 jar in .claude/CLAUDE.md (upstream PR ClickHouse/clickhouse-java#2857) so a fresh checkout knows the jar shipped in target/lib/ is not stock and records the rebuild recipe + verification command.
…e oracle trips Replacing/Summing/Aggregating/Collapsing tables collapse same-ORDER-BY-key rows only at merge time, so two consecutive SELECTs against the same table can see different cardinalities if a background merge fires between them. TLPWhere / NoREC / CODDTest each compare two such reads and so each periodically asserted on a transient pre-merge cardinality. ClickHouseOptimizingOracle is a thin decorator wrapping every oracle the provider hands sqlancer's main loop: on each iteration it runs OPTIMIZE TABLE <t> FINAL for every dedupe-engine table (gated by the existing supportsFinal() check) before delegating. After the first iteration the optimize is a server-side no-op (one round-trip per dedupe table) because no inserts happen during the oracle phase. Optimize errors are absorbed. Verified end-to-end: in a 25-oracle composite smoke run over head ClickHouse, system.query_log records ~1.9k successful OPTIMIZE FINAL calls and system.part_log shows 1.5k MergeParts events -- the wrapper is doing real merging work. The merge-race trips from the 2026-05-19 baseline (db13's 240/208 etc.) no longer reproduce at converged state.
Replaces clickhouse-jdbc 0.9.8 (the historical JDBC driver) with two
interchangeable transports, both requesting TabSeparatedWithNamesAndTypes
and parsed through a shared ClickHouseTsvParser:
--transport client (default): backed by com.clickhouse.client.api.Client
(clickhouse-java client-v2 0.9.8). Brings
httpclient5 + connection pooling.
Server-side settings (max_execution_time,
wait_end_of_query, http_response_buffer_size,
analyzer, suspicious-LowCardinality) attach
per-query via QuerySettings.serverSetting
so pooled connections all carry them.
--transport http : raw HttpURLConnection, zero extra deps.
Fallback when an Apache HC regression
appears under client-v2.
The dropped jdbc-v2 was the source of every infrastructure-noise reproducer
in the 2026-05-20 1-hour 25-oracle baseline (9 of 12 trips):
* UInt64 -> long overflow in PivotedQuerySynthesisOracle.fetchPivotRow
(jdbc-v2's getLong() can't represent values >= 2^63). client-v2 hands
us textual values via the TSV parser; no primitive coercion. (4 trips
-> 0)
* java.time.DateTimeException: Instant exceeds minimum or maximum from
JDBC's getTimestamp() on out-of-range DateTime64. Same fix. (1 trip -> 0)
* java.io.EOFException / chunked-decoder EOF in AbstractBinaryFormatReader.
client-v2 owns the response stream directly; the binary reader path is
not on the call chain. (1 trip -> 0)
* OutOfMemoryError in AbstractBinaryFormatReader.getString during eager
result-set materialisation. TSV parsing happens once per row and writes
to a List<String>; same data, smaller graph. (3 trips with -Xmx4g -> 0)
* ConnectionClosedException: Premature end of chunk coded message body
close-time noise -- previously suppressed by a locally-patched build of
clickhouse-jdbc 0.9.8 (the isStreamDrainException classifier in
ResultSetImpl.close). That patch is no longer needed; the local jar
burden disappears with it.
Other changes:
* pom.xml: drop clickhouse-jdbc:0.9.8:all (the uber-jar of jdbc-v2 +
client-v2 + http-client + data); add client-v2:0.9.8 directly. Also drop
log4j-slf4j2-impl -- it was an SLF4J 2.x provider competing with
slf4j-simple at every JVM start ("Class path contains multiple SLF4J
providers"). The transitively-present slf4j-log4j12 and slf4j-reload4j
are 1.x bindings that SLF4J 2.x's ServiceLoader does not pick up, so
they are inert and need no exclusion. Startup is now silent.
* ClickHouseHttpTransport: TSV parsing extracted into the shared
ClickHouseTsvParser; both transports feed bytes through it.
* ClickHouseOptions.Transport: JDBC enum value renamed to CLIENT, which
remains the default. --transport flag description rewritten.
* ClickHouseProvider: createDatabaseJdbc + applyConnectionLevelSettings
deleted. createDatabaseClient mirrors createDatabaseHttp -- builds the
settings map, hands it to the transport, runs the DROP/CREATE/USE,
wraps the resulting Connection in SQLConnection.
* .claude/CLAUDE.md: dropped the "Local-patched clickhouse-jdbc driver"
section; replaced with "Wire transports" describing the two modes and
what JDBC-specific failure classes go away.
Verified end-to-end via a 2-minute 25-oracle 4-thread smoke against head
ClickHouse: zero JDBC-noise reproducers; 4 clean NoREC count-mismatch
trips surfaced, all attributable to sqlancer-side false positives from
ORDER BY on non-deterministic expressions (addressed separately).
The 2026-05-20 25-oracle smoke had 4 of 4 NoREC reproducers attributable to Replacing/Summing/Aggregating/Collapsing tables with non-deterministic ORDER BY keys (NaN-producing functions, many-duplicate single-column ORDER BY). At converged state none reproduced as oracle bugs -- they were all sqlancer false positives stemming from the engine's collapse-at-merge-time semantics. The OPTIMIZE FINAL pre-iteration wrapper added in 47b4bb1 helped catch the before-Q1 race but cannot defeat NaN-keyed ORDER BY (collapse remains non-deterministic after FINAL). Until the table generator refuses function-of-numeric ORDER BY for dedupe engines, the simpler fix is to pin to plain MergeTree.

- ClickHouseTableGenerator.start(): always pick ClickHouseEngine.MergeTree. The enum values stay so downstream helpers (isMergeTreeFamily, renderEngineArgs, the Replacing/Summing-specific fallbacks) still compile; re-enabling later is just a one-line revert.
- ClickHouseProvider.getTestOracle(): the @Override registration of ClickHouseOptimizingOracle is commented out. With supportsFinal() returning false for every emitted table, the wrapper's optimize loop is a no-op; the decorator class itself is kept for the re-enable path.

Verified: 3-hour 25-oracle composite run over head ClickHouse (commit following) generated 44 databases, none of which used a dedupe engine in the saved reproducer logs.
…s to reproducer log

Commit 406aa06 taught the three oracles to call state.getLogger().writeCurrent(stmt) for their DDL/INSERT prerequisites. That writes to database<N>-cur.log (the live tail file, truncated each iteration), but the persisted database<N>.log dumped on AssertionError is built from state.getStatements() via logger.logException -- and only state.getState().logStatement(...) adds to that list. So when a saved reproducer came from a non-CERT iteration (e.g. NoREC tripped after CERT had bulk-loaded the table), the persisted .log was missing the bulk INSERT, and replays produced low-cardinality data that could not reproduce the original mismatch.

The fix is to call BOTH paths from each of:

- CERTOracle: ensureLargeEnough's INSERT INTO ... SELECT ... FROM numbers(N) bulk-load.
- PartitionMirrorOracle: dropMirror / mirrorDdl / insertMirror.
- SchemaRoundtripOracle: createOff / createOn / dropOff / dropOn.

Verified on a 3-hour 25-oracle run: 32 of 38 saved reproducer logs now contain the `FROM numbers(` bulk INSERT (was 0 in the 2026-05-20 run), making replays self-sufficient.
The previous parse() materialised the whole response body into a single
`byte[]` via `ByteArrayOutputStream.toByteArray()`, then decoded it to a
single `String`, then split that string into a List<String> of lines, then
parsed each line. With a result body > 2 GiB the `ByteArrayOutputStream`
hits Java's `Integer.MAX_VALUE - 8` array-length cap and the JVM throws
`OutOfMemoryError: Required array length 2147483639 + 9 is too large` --
no amount of `-Xmx` helps because the limit is a fundamental property of
single-array allocation, not heap headroom.
In the 2026-05-21 3-hour 25-oracle baseline this was 12 of 38 reproducer
trips (32%).
Refactor: parse one TSV line at a time via `BufferedReader.readLine()` over
the `InputStream`. The transient buffer is bounded by BufferedReader's
default (8 KB) regardless of total response size. The only memory still
proportional to the result is the materialised `List<List<String>>` that
the oracles ultimately consume, which scales linearly and fits in the
8 GiB heap for everything sqlancer generates.
Safety of `readLine()` over TSV: literal newlines and carriage returns are
escaped as `\n` / `\r` (two-character sequences) inside values per the
TabSeparated spec, so `BufferedReader` only splits on the row delimiter --
never inside a value.
Verified on the live server via reflective parser invocation:
1 M rows: 344 ms, 186 MB heap (was OK before)
10 M rows: 3.8 s, 1.5 GB heap (was OK before)
50 M rows: 18 s, 7.0 GB heap (was OOM at 2 GiB byte-buffer before)
300 M rows: exits with `OutOfMemoryError: Java heap space` on List
growth at -Xmx4g (a soft heap limit, not the hard array-
length cap -- still tractable by raising -Xmx, or by a
future lazy-iterator transport interface).
Also removes the now-unused `splitTsvLines(String)` helper.
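The line-at-a-time approach described above can be sketched as follows. This is a simplified illustration, not the contents of `ClickHouseTsvParser`: the class and method names are hypothetical, only the `\t`, `\n`, `\r` and `\\` escapes are decoded (anything else, such as `\N` for NULL, is passed through), and the two header rows of `TabSeparatedWithNamesAndTypes` are returned like any other row.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class StreamingTsvSketch {
    /** Reads TSV one line at a time; only the parsed rows are retained in memory. */
    static List<List<String>> parse(InputStream in) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Limit -1 keeps trailing empty fields, so '' values survive.
                String[] cells = line.split("\t", -1);
                List<String> row = new ArrayList<>(cells.length);
                for (String cell : cells) {
                    row.add(unescape(cell));
                }
                rows.add(row);
            }
        }
        return rows;
    }

    /** Convenience wrapper for in-memory input. */
    static List<List<String>> parseText(String tsv) {
        try {
            return parse(new ByteArrayInputStream(tsv.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes \t, \n, \r and \\; any other backslash sequence is left as-is. */
    static String unescape(String cell) {
        if (cell.indexOf('\\') < 0) {
            return cell;
        }
        StringBuilder sb = new StringBuilder(cell.length());
        for (int i = 0; i < cell.length(); i++) {
            char c = cell.charAt(i);
            if (c != '\\' || i + 1 == cell.length()) {
                sb.append(c);
                continue;
            }
            char next = cell.charAt(++i);
            switch (next) {
                case 't': sb.append('\t'); break;
                case 'n': sb.append('\n'); break;
                case 'r': sb.append('\r'); break;
                case '\\': sb.append('\\'); break;
                default: sb.append('\\').append(next); // e.g. \N stays untouched
            }
        }
        return sb.toString();
    }
}
```

Note that an empty line is kept as a one-cell row holding the empty string rather than being skipped, which is the behaviour a single-column `''` value needs.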
A 30-min × 25-oracle × 6-thread run captured 8 reproducers; reproducing each against CH directly showed all 8 were sqlancer-side artifacts:

1. NoREC (N-K vs N) counts: TSV parser dropped single-column empty strings (3 of 8 reproducers). The skip in ClickHouseTsvParser was justified by a "trailing newline cleanup" comment, but BufferedReader.readLine() doesn't produce a phantom trailing-empty read. The skip silently lost legitimate '' rows.
2. NoREC `WHERE abs(sin(c1))` ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER (1): the WHERE-filter variant of this error wasn't in the expected-errors catalog.
3. CODDTest `'' < (true)` CANNOT_PARSE_BOOL (1): generator composes String/Bool comparisons that CH legitimately rejects.
4. TLPCombinator content mismatches (2): `ComparatorHelper.assumeResultSetsAreEqual` did lexical string compare on float columns, firing on 1-ULP rendering differences between equivalent forms like avgOrNull vs sum/count.
5. JoinAlgorithm OOM (1): JVM heap exhaustion on cartesian self-join result, not a CH bug; documented in CLAUDE.md.

Rather than just patch the TSV skip, replace the hand-rolled parser with client-v2's RowBinaryWithNamesAndTypesFormatReader, already on the classpath. Binary length-prefixed strings make empty vs NULL structurally distinct, the type ladder handles UInt64/BigInteger and DateTime/LocalDateTime correctly (eliminating the open PQS pivot-row caveat), and we lose the hand-rolled escape parser entirely.

Two build-side adjustments needed:

- pom.xml pins guava 33.4.0-jre. client-v2's reader uses ImmutableMap.Builder.buildKeepingLast() (guava 31.1+); the transitive auto-service:1.0.1 pulls 31.0.1 which lacks it, raising NoSuchMethodError on every binary SELECT.
- The reader's ctor requires a timezone (refuses to build otherwise); ClickHouseRowBinaryParser passes UTC. Sqlancer only uses textual values via getString(int) so the zone choice is cosmetic.

Comparator side: when set equality fails on string compare, retry with Double.parseDouble + Double.toString canonicalisation on numeric-looking entries. Non-numeric strings pass through, so non-float mismatches still fire.

Errors catalog: add CANNOT_PARSE_BOOL, "Expected boolean value but get", and "(ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER)".

Validation: 15-min × 25-oracle × 6-thread run with the new transport ran the full duration with 0 reproducers / 0 thread shutdowns / 0 top-level errors.

CLAUDE.md updated with the new wire-transport specifics, corrected --timeout-seconds description (it's the total run wall-clock, not per-statement), and removed the stale JDBC OOM note that no longer applies.
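The comparator-side canonicalisation can be illustrated with the sketch below. It is an assumption-laden approximation: the class name and the cheap first-character pre-filter are hypothetical, and only the `Double.parseDouble` + `Double.toString` round trip with pass-through for non-numeric strings comes from the description above.

```java
final class FloatCanonicaliser {
    /** Numeric-looking strings are re-rendered via double; anything else passes through. */
    static String canonical(String value) {
        if (value == null || value.isEmpty()) {
            return value;
        }
        char first = value.charAt(0);
        // Hypothetical pre-filter: skip the parse attempt for obviously non-numeric text.
        boolean mayBeNumber = Character.isDigit(first) || first == '-' || first == '+' || first == '.';
        if (!mayBeNumber) {
            return value;
        }
        try {
            return Double.toString(Double.parseDouble(value));
        } catch (NumberFormatException e) {
            return value; // e.g. dates such as 2024-01-01 start with a digit but are not numbers
        }
    }
}
```

The round trip only normalises rendering (for example `1.50` and `1.5` both become `1.5`); two values that parse to different doubles still produce different strings. It also maps large integers onto doubles, so it seems safest as a retry after exact comparison has already failed, as the commit describes.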
The ClickHouse provider previously only supported the five TLP variants (
Where/Distinct/GroupBy/Aggregate/Having) andNoREC, and its schema generator only emitted two column types (Int32andString). This PRNullable(T)andLowCardinality(T)columns, and every dispatch site that previouslyAssertionError'd on anything outside{Int32, String}is re-routed through the new capabilities.1. New oracles
PQS — Pivoted Query Synthesis (Rigger & Su, OSDI '20)
The classical SQLancer PQS implementation requires every AST node to expose a Java-side
getExpectedValue()that mirrors the DBMS' evaluation semantics. ClickHouse's expression AST does not provide that for most of the generated tree, and reproducing ClickHouse's coercion / NULL / arithmetic rules in Java would be open-ended. This implementation delegates rectification to the server: for each randomly-generated predicate the pivot row's values are embedded as literals in a one-row subquery and ClickHouse itself evaluates the predicate. Based on TRUE / FALSE / NULL the predicate is kept, negated, or wrapped inIS NULL. Containment is checked withINTERSECT.CERT — Cardinality Estimation Restriction Testing (Ba & Rigger, ICSE '24, DOI)
Generates a random query Q, derives a strictly more restrictive Q' from it, and asserts the cardinality restriction monotonicity property `EstCard(Q', D) ≤ EstCard(Q, D)`. The estimate is read from `EXPLAIN ESTIMATE` and the query is never executed: this oracle tests the estimator, not the runtime. Mutations are one-directional and restrictive: AND-tighten the WHERE, drop an OR operand from an existing disjunction, or promote `ALL` → `DISTINCT`. A structural-similarity gate on `EXPLAIN PLAN` skips runs where the plan diverges enough to make the two estimates incomparable.
`EXPLAIN ESTIMATE` only responds meaningfully to PK / partition-key / index filters on MergeTree tables. For non-MergeTree engines or `ORDER BY tuple()` tables it returns an empty result; those attempts are dropped via `IgnoreMeException`.
CODDTest — Constant-Optimization-Driven Database System Testing (Zhang & Rigger, SIGMOD '25, DOI)
For a query Q with a sub-expression E, builds an auxiliary A that evaluates E in isolation, reads the resulting constant V, then builds a folded query F by substituting V for E in Q. Constant folding is semantics-preserving, so any discrepancy between Q's and F's result sets is a logic bug.
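The Q / A / F construction described above can be sketched in a few lines of Java. This is a minimal illustration, not the oracle's code: `CoddFoldSketch`, its method names, and the `run` executor (a function from SQL text to rendered rows) are all hypothetical, and the constant is substituted as plain text, whereas a real oracle has to render V as a correctly typed literal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the CODDTest check; names are illustrative only.
final class CoddFoldSketch {

    /** Auxiliary query A: evaluates the sub-expression E in isolation. */
    static String auxiliaryQuery(String expr) {
        return "SELECT " + expr;
    }

    /** Folded query F: Q with the first occurrence of E replaced by the constant V. */
    static String foldedQuery(String query, String expr, String constant) {
        int at = query.indexOf(expr);
        if (at < 0) {
            throw new IllegalArgumentException("sub-expression not found in query");
        }
        return query.substring(0, at) + constant + query.substring(at + expr.length());
    }

    /** Order-insensitive comparison of two result sets given as rendered rows. */
    static boolean sameRows(List<String> left, List<String> right) {
        List<String> a = new ArrayList<>(left);
        List<String> b = new ArrayList<>(right);
        Collections.sort(a);
        Collections.sort(b);
        return a.equals(b);
    }

    /**
     * Runs A to obtain V, builds F, and reports whether Q and F agree.
     * Assumes A returns exactly one row holding V; a false result would
     * indicate a logic bug in the system behind `run`.
     */
    static boolean holds(String query, String expr, Function<String, List<String>> run) {
        String constant = run.apply(auxiliaryQuery(expr)).get(0);
        return sameRows(run.apply(query), run.apply(foldedQuery(query, expr, constant)));
    }
}
```

With Q = `SELECT c0 FROM t0 WHERE c0 > (1 + 2)` and E = `(1 + 2)`, the auxiliary query is `SELECT (1 + 2)` and, if it returns `3`, the folded query is `SELECT c0 FROM t0 WHERE c0 > 3`.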
All three flavors from the paper land: constant expression (Section 3.1 case 1), scalar non-correlated subquery (Section 3.1 case 2), and dependent expression with a `CASE` mapping (Section 3.2). The outer predicate template varies (bare comparison, AND/OR compounds, NOT), so phi passes through richer constant-folding paths than a fixed `col op phi`.
2. Type-system foundation v1 (replaces the flat `(ClickHouseDataType, String)` model)
Plan: `docs/plans/2026-05-16-001-feat-clickhouse-type-system-foundation-plan.md`. Brainstorm origin: `docs/brainstorms/clickhouse-type-system-foundation-requirements.md`.
The previous model, a `ClickHouseLancerDataType` wrapping a `ClickHouseDataType` enum plus a textual representation, could not encode parameters (so `ClickHouseDataType.of("Decimal(9,2)")` silently normalised to `Decimal`) and could not express wrappers like `Nullable` or `LowCardinality`. Every oracle bailed on anything outside `{Int32, String}` because the constant emitters, cast paths, and oracle filters all defaulted to `AssertionError`.
This PR introduces:
- `ClickHouseType`: a sealed recursive ADT with four constructors:
  - `Primitive(Kind)` for atomic types (`Int8`…`Int256`, `UInt8`…`UInt256`, `Float32`, `Float64`, `Bool`, `String`, `UUID`, `Date`, `Date32`, `IPv4`, `IPv6`),
  - `Nullable(inner)` and `LowCardinality(inner)` with `canWrap` rules (no nested Nullable, no LC on Float/Bool/UUID/IPv*, etc.),
  - `Unknown(raw)` as a defensive fallback the reflection parser uses for any type string it does not recognise.
- `isNumeric()`, `supportsLiteralEmission()`, `hasNullSemantics()` (true iff the outer term is `Nullable`).
- `ClickHouseTypeParser`: a hand-written recursive-descent parser for ClickHouse type strings. It recognises every primitive plus `Nullable(…)` and `LowCardinality(…)` and nested combinations, and falls back to `Unknown(raw)` for everything else. It never throws.
- `ClickHouseUnsupportedConstant`: a sentinel returned by `ClickHouseCast.castToInt`/`castToReal`/`castToText`/`isTrue`/`convertInternal` when a coercion is not defined for the input. It propagates through cast pipelines and raises `IgnoreMeException` on any attempt to compare, evaluate, or consume it. This replaces every `default: throw new AssertionError(...)` in `ClickHouseCast`.
- `ClickHouseLancerDataType`: adds `getTypeTerm(): ClickHouseType` while keeping `getType(): ClickHouseDataType` returning the root primitive (lossy compatibility for legacy callers; documented).
- `--test-nullable-types` and `--test-lowcardinality-types`, both default `true`, threaded through `ClickHouseLancerDataType.getRandom(state)` and `ClickHouseColumn.createDummy(name, table, state)`.
- `ClickHouseExpressionGenerator.generateConstant` and `ClickHouseSchema.getConstant` now dispatch on the ADT term: `Primitive(kind)` → per-kind emitter; `Nullable(inner)` → with small probability emit `NULL`, else recurse; `LowCardinality(inner)` → transparent, recurse; `Unknown` → `IgnoreMeException`.
- The `aggType != Int32 && aggType != String` filters at `ClickHouseCODDTestOracle.java:175-177,208-210` are rewritten as an `isFoldableColumnTerm(ClickHouseType)` capability check that accepts `Int*`/`UInt*`/`Bool`/`String` and any `Nullable`/`LowCardinality` wrapper of those.
- The local `baseTypeName`/`parseType`/`renderLiteral` string parsers are migrated onto `ClickHouseTypeParser`.
- `generatorExprFor(ClickHouseType)` composes generators: `Nullable(inner)` wraps the inner generator with `if(rand() % 10 = 0, NULL, …)`, `LowCardinality(inner)` is transparent at INSERT (ClickHouse coerces), and `Unknown` throws `IgnoreMeException`.
- `PARTITION BY`/`ORDER BY`/`SAMPLE BY` expression generation now validates the result and retries up to 5 times before dropping the clause. The checks correspond to the ClickHouse rejections `"Sorting key cannot contain constants"` and `"Floating point partition key is not supported"`.
- `ClickHouseProvider` adds `allow_suspicious_low_cardinality_types=1` to the JDBC URL when the LC flag is on, and `ClickHouseTableGenerator` adds `allow_nullable_key=1` to the `MergeTree` `SETTINGS` clause so wrapped columns can appear in ORDER/PARTITION/SAMPLE keys.
- `ClickHouseErrors` gains the v1 type-family patterns surfaced by live smoke runs: `ILLEGAL_TYPE_OF_ARGUMENT`, LowCardinality conversion/nested-type rejections, `SUSPICIOUS_TYPE_FOR_LOW_CARDINALITY`, `Partition key contains nullable columns`, `allow_nullable_key`, `CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN`, and `"Cannot convert NULL value to non-Nullable type"`.

TLP three-valued logic is explicitly deferred. The third-partition composition lives in `sqlancer.common.oracle.TernaryLogicPartitioningOracleBase` (cross-DBMS), and the requirements doc forbids common-module edits. TLP runs with `enableNullable=true`, but the partitions remain two-valued; that is a v1.0.5 follow-up (see the plan).
3. Other changes
- `ClickHouseSchema.ClickHouseRowValue`'s constructor is promoted from package-private to public so PQS in `oracle/pqs/` can construct it for the base class's diagnostic logging.
- `.github/workflows/main.yml`: updates the `-Dtest=…` list for the `clickhouse` CI job and drops the other DBMS jobs (`citus`, `cockroachdb`, `databend`, `datafusion`, `duckdb`, `hive`, `spark`, `hsqldb`, `mariadb`, `materialize`, `mysql`, `oceanbase`, `postgres`, `presto`, `sqlite`, `tidb`, `yugabyte`, `doris`). Keeps `misc` (project-wide style/PMD/Checkstyle/SpotBugs + misc unit tests + naming convention check) and `clickhouse`.

Verification
Unit tests (89 total, all pass) — seven new test classes covering the ADT, parser, cast extension, generation surface, CODDTest filter, CERT generator, and TableGenerator validators. CI runs them on every push.
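To give a sense of the kind of ADT and parser properties such tests can pin down, here is a minimal sketch. It is not the PR's `ClickHouseType` or `ClickHouseTypeParser`: the `TypeSketch` class, its nested records, and the reduced primitive set are hypothetical. For brevity the parser here also enforces the wrap rules, so an invalid nesting comes back as `Unknown`. It additionally rejects `Nullable` over `LowCardinality`, which is an assumption about ClickHouse's nesting rules rather than something stated in this PR. Requires Java 17 for sealed interfaces and records.

```java
import java.util.Set;

// Hypothetical sketch of a sealed type ADT with a non-throwing parser.
final class TypeSketch {
    sealed interface Term permits Primitive, Nullable, LowCardinality, Unknown {}
    record Primitive(String kind) implements Term {}
    record Nullable(Term inner) implements Term {}
    record LowCardinality(Term inner) implements Term {}
    record Unknown(String raw) implements Term {}

    // Reduced primitive set, for illustration only.
    static final Set<String> KINDS = Set.of("Int32", "UInt8", "Float64", "Bool", "String", "UUID");
    // Kinds that may not sit under LowCardinality, following the rules quoted in this PR.
    static final Set<String> NO_LC = Set.of("Float32", "Float64", "Bool", "UUID", "IPv4", "IPv6");

    /** No nested Nullable; this sketch also disallows Nullable over LowCardinality. */
    static boolean canWrapNullable(Term inner) {
        return inner instanceof Primitive;
    }

    /** LowCardinality may wrap an allowed primitive, optionally under one Nullable. */
    static boolean canWrapLowCardinality(Term inner) {
        Term base = inner instanceof Nullable n ? n.inner() : inner;
        return base instanceof Primitive p && !NO_LC.contains(p.kind());
    }

    /** True iff the outermost constructor is Nullable. */
    static boolean hasNullSemantics(Term term) {
        return term instanceof Nullable;
    }

    /** Never throws: anything unrecognised becomes Unknown(raw). */
    static Term parse(String raw) {
        if (raw == null) {
            return new Unknown("");
        }
        String text = raw.trim();
        Term term = parseTerm(text);
        return term == null ? new Unknown(text) : term;
    }

    // Recursive descent over "Wrapper(inner)" shapes; null means "not recognised".
    private static Term parseTerm(String s) {
        if (KINDS.contains(s)) {
            return new Primitive(s);
        }
        String arg = argumentOf(s, "Nullable");
        if (arg != null) {
            Term inner = parseTerm(arg);
            return inner != null && canWrapNullable(inner) ? new Nullable(inner) : null;
        }
        arg = argumentOf(s, "LowCardinality");
        if (arg != null) {
            Term inner = parseTerm(arg);
            return inner != null && canWrapLowCardinality(inner) ? new LowCardinality(inner) : null;
        }
        return null;
    }

    // Returns the text between "name(" and the final ")", or null if s has another shape.
    private static String argumentOf(String s, String name) {
        String prefix = name + "(";
        if (s.startsWith(prefix) && s.endsWith(")")) {
            return s.substring(prefix.length(), s.length() - 1).trim();
        }
        return null;
    }
}
```

Because records compare structurally, a test can assert that `parse("LowCardinality(Nullable(String))")` equals the term built by hand, and that a parameterised string such as `Decimal(9,2)` comes back as `Unknown` with the raw text preserved instead of being silently normalised.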
`mvn -DskipTests=true -Dspotbugs.skip=true verify` is clean (formatter, Checkstyle, PMD). SpotBugs is broken on JDK 25 in this repo (`Unsupported class file major version 69` from the bundled ASM); this affects every class, not just new code.
Live SQLancer smoke against ClickHouse 26.5.1.111, 4 threads × 10 minutes, across four oracles (TLP / NoREC / PQS / CODDTest), both type flags ON:
[Results table not recoverable from the extracted text; surviving cells: `CREATE TABLE`, `Nullable(String)`, `Nullable(Int32)`, `LowCardinality(Int32)`.]
The live smoke also caught three v1-introduced rejections mid-run; all are fixed in this PR:
- `LowCardinality(Int32)` rejected by default (`SUSPICIOUS_TYPE_FOR_LOW_CARDINALITY`): fixed by adding `allow_suspicious_low_cardinality_types=1` to the JDBC URL.
- Nullable key columns rejected (`ILLEGAL_COLUMN`, `allow_nullable_key`): fixed by adding `allow_nullable_key=1` to MergeTree `SETTINGS`.
- `CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN` from MATERIALIZED clauses casting NULL → non-Nullable target: added to `ClickHouseErrors`.

The final post-fix run completed all 600 seconds with zero unhandled errors.
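The two settings-based fixes above amount to string construction, which the following sketch illustrates. `KeySettingsSketch` and its helpers are hypothetical, as are the host, port, and table shape in the usage note. Passing a server setting as a plain URL query parameter is an assumption here; how settings are accepted depends on the JDBC driver version in use.

```java
// Hypothetical sketch: where the two settings end up. Not the PR's code.
final class KeySettingsSketch {

    /** Appends name=value to a JDBC URL, using '?' or '&' as appropriate. */
    static String withUrlSetting(String jdbcUrl, String name, String value) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + name + "=" + value;
    }

    /** Builds a MergeTree CREATE TABLE whose SETTINGS clause allows nullable key columns. */
    static String createMergeTree(String table, String columns, String orderBy) {
        return "CREATE TABLE " + table + " (" + columns + ")"
                + " ENGINE = MergeTree ORDER BY " + orderBy
                + " SETTINGS allow_nullable_key = 1";
    }
}
```

For example, `withUrlSetting("jdbc:clickhouse://localhost:8123/default", "allow_suspicious_low_cardinality_types", "1")` yields a URL ending in `?allow_suspicious_low_cardinality_types=1`, and `createMergeTree("t0", "c0 Nullable(Int32), c1 LowCardinality(Int32)", "c0")` produces a statement in which the nullable column `c0` can serve as the sorting key.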
Downstream
The ClickHouse repo will pick these up via a Dockerfile pin bump once this lands: ClickHouse/ClickHouse#104988.