Skip to content

fix: calcite optimization adds LITERAL_AGG(true)#963

Open
mbwhite wants to merge 2 commits into
substrait-io:mainfrom
mbwhite:isthmus-literal-agg
Open

fix: calcite optimization adds LITERAL_AGG(true)#963
mbwhite wants to merge 2 commits into
substrait-io:mainfrom
mbwhite:isthmus-literal-agg

Conversation

@mbwhite

@mbwhite mbwhite commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

fix: calcite optimization adds literal_agg

I was testing running various Calcite optimisations; TPC/H 16 specifically was optimised in such a way that it wasn't possible for the Isthmus SubstraitRelVisitor to handle it.

The issue was that calcite had added LITERAL_AGG(true).


In Apache Calcite, LITERAL_AGG(true) is an internal aggregate function injected during the decorrelation process (specifically when handling ANY, SOME, or IN subqueries).

Here is why it is there and what it does:

1. The Problem: Handling Empty Subqueries

In your "before" plan, there is a <> SOME(...) correlated subquery. By SQL standards, if a subquery inside a quantified comparison (SOME/ANY) returns zero rows, the entire condition evaluates to FALSE (or UNKNOWN), not NULL.

2. The Solution: Flagging Matches

When Calcite transforms the correlated subquery into a join (LogicalCorrelate / LogicalAggregate), it needs a foolproof way to know if the subquery actually produced any rows for a given correlated key ($cor0.L_PARTKEY).

  • LITERAL_AGG(true) evaluates the literal value true for every row entering the aggregate.
  • Because it is an aggregate function, if the subquery returns zero rows, LITERAL_AGG(true) will return NULL.
  • If the subquery returns one or more rows, it returns true.

3. How Calcite Uses It

Look closely at the massive LogicalFilter(condition=[OR(...)] right above the correlate in your "after" plan.

Calcite uses the output of LITERAL_AGG(true) (which maps to column $19 or $20 in that flattened row) to evaluate the exact short-circuiting logic of the SOME operator:

If it's NULL: The subquery was empty $\rightarrow$ handle as false/null.
If it's TRUE: The subquery returned data $\rightarrow$ proceed to check the actual value comparisons (like COUNT and MAX).

It essentially acts as a highly optimized, null-aware boolean flag for row existence during complex subquery unnesting.


🤖 built with assistance from AI

mbwhite added 2 commits June 26, 2026 14:27
Signed-off-by: matthew brian white <whitemat@uk.ibm.com>
Signed-off-by: matthew brian white <whitemat@uk.ibm.com>
@nielspardon nielspardon changed the title fix: calcite optimization adds LITERAL_AGG(true) fix: calcite optimization adds LITERAL_AGG(true) Jun 29, 2026
@nielspardon

Copy link
Copy Markdown
Member

for some reason your PR header had a trailing whitespace in it

@mbwhite

mbwhite commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

thanks @nielspardon

@nielspardon

Copy link
Copy Markdown
Member

Nice fix — the approach is correct: within a grouped aggregate every group is non-empty, so LITERAL_AGG(true) always yields true, and the null-when-empty semantics come from the decorrelation LEFT JOIN. Re-inserting the literal via a Project is the right call.

One bug, which I reproduced locally (PR branch, JDK 17):

The wrapper's passthrough references are typed with the whole aggregate struct. newRootStructReference(int, Type) assigns its argument verbatim as the reference type, so passing aggRel.getRecordType() types every passthrough column as the entire struct instead of the scalar field. The wrapper Project's record type comes out as:

[Struct{[I64,I64,I64,I64,Decimal]}, Struct{…}, Struct{…}, Struct{…}, Struct{…}, Bool]

i.e. all five passthroughs are the whole struct; only the re-inserted Bool literal is correct. Conversion doesn't throw (parents reference Calcite types), but the POJO carries wrong types and the wrapper subtree doesn't round-trip. Fix matches fromGroupSet just above:

-      projectExprs.add(FieldReference.newRootStructReference(i, aggRel.getRecordType()));
+      projectExprs.add(FieldReference.newInputRelReference(i, aggRel));
@@
-        projectExprs.add(
-            FieldReference.newRootStructReference(realAggIndex, aggRel.getRecordType()));
+        projectExprs.add(FieldReference.newInputRelReference(realAggIndex, aggRel));

With that change the columns become [I64,I64,I64,I64,Decimal,Bool] and everything round-trips.

The current test's structural assertions check the outer SELECT projection, not the wrapper, so they pass even with the bug. Suggested replacement that actually inspects the wrapper, plus a guard test for the groupings.size() > 1 branch (both verified: red on this PR as-is, green with the fix):

@Test
void conversionHandlesLiteralAggInsertedBySubQueryRemoveRule()
    throws SqlParseException, IOException {
  String query =
      "select e1.l_orderkey from lineitem e1 "
          + "where e1.l_quantity <> some ("
          + "  select l_quantity from lineitem e2 where e2.l_partkey = e1.l_partkey)";

  RelRoot relRoot = SubstraitSqlToCalcite.convertQuery(query, TPCH_CATALOG);
  HepPlanner hepPlanner =
      new HepPlanner(
          new HepProgramBuilder()
              .addRuleInstance(CoreRules.FILTER_SUB_QUERY_TO_CORRELATE)
              .build());
  hepPlanner.setRoot(relRoot.rel);
  RelNode decorrelated =
      RelDecorrelator.decorrelateQuery(
          hepPlanner.findBestExp(),
          RelFactories.LOGICAL_BUILDER.create(relRoot.rel.getCluster(), null));

  // Pin the trigger so a future Calcite bump can't silently stop exercising this path.
  assertTrue(containsLiteralAgg(decorrelated), "test setup no longer produces LITERAL_AGG");

  Plan.Root planRoot =
      assertDoesNotThrow(
          () ->
              SubstraitRelVisitor.convert(
                  RelRoot.of(decorrelated, relRoot.kind), EXTENSION_COLLECTION));

  // The fix inserts a Project directly over the Aggregate; inspect THAT, not the outer SELECT.
  Project wrapper =
      findProjectOverAggregate(planRoot.getInput())
          .orElseThrow(() -> new AssertionError("expected a Project wrapping the Aggregate"));

  assertTrue(
      wrapper.getExpressions().stream()
          .anyMatch(e -> e instanceof Expression.BoolLiteral && ((Expression.BoolLiteral) e).value()),
      "LITERAL_AGG(true) should be re-inserted as a boolean true literal");

  // Passthroughs must carry the scalar field type, not the whole aggregate struct.
  assertTrue(
      wrapper.getRecordType().fields().stream().noneMatch(f -> f instanceof Type.Struct),
      "wrapper columns must be scalar; fields=" + wrapper.getRecordType().fields());

  // The wrapper subtree must survive a proto round-trip with its schema intact.
  ExtensionCollector ec = new ExtensionCollector();
  io.substrait.proto.Rel proto = new RelProtoConverter(ec).toProto(wrapper);
  Rel rt = new ProtoRelConverter(ec, extensions).from(proto);
  assertEquals(wrapper.getRecordType(), rt.getRecordType(), "wrapper schema must round-trip");
}

@Test
void literalAggCombinedWithGroupingSetsIsRejected() {
  RelNode input = builder.values(new String[] {"a", "b"}, 1, 2, 3, 4).build();
  RexBuilder rexBuilder = creator.rex();
  AggregateCall literalAgg =
      AggregateCall.create(
          SqlInternalOperators.LITERAL_AGG, false, false, false,
          List.of(rexBuilder.makeLiteral(true)), List.of(), -1, null,
          RelCollations.EMPTY, typeFactory.createSqlType(SqlTypeName.BOOLEAN), "li");
  ImmutableBitSet g0 = ImmutableBitSet.of(0);
  ImmutableBitSet g1 = ImmutableBitSet.of(1);
  RelNode aggregate =
      LogicalAggregate.create(input, List.of(), g0.union(g1), List.of(g0, g1), List.of(literalAgg));

  UnsupportedOperationException ex =
      assertThrows(
          UnsupportedOperationException.class,
          () -> SubstraitRelVisitor.convert(RelRoot.of(aggregate, SqlKind.SELECT), EXTENSION_COLLECTION));
  assertTrue(ex.getMessage().contains("GROUPING SETS"), ex.getMessage());
}

private static boolean containsLiteralAgg(RelNode node) {
  if (node instanceof org.apache.calcite.rel.core.Aggregate
      && ((org.apache.calcite.rel.core.Aggregate) node)
          .getAggCallList().stream()
              .anyMatch(c -> c.getAggregation().getKind() == SqlKind.LITERAL_AGG)) {
    return true;
  }
  return node.getInputs().stream().anyMatch(OptimizerIntegrationTest::containsLiteralAgg);
}

private static Optional<Project> findProjectOverAggregate(Rel rel) {
  if (rel instanceof Project && ((Project) rel).getInput() instanceof Aggregate) {
    return Optional.of((Project) rel);
  }
  return rel.getInputs().stream()
      .map(OptimizerIntegrationTest::findProjectOverAggregate)
      .filter(Optional::isPresent).map(Optional::get).findFirst();
}

@nielspardon nielspardon self-requested a review June 30, 2026 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants