Skip to content

Add mtree vs table-diff performance benchmark (not run in CI)#127

Merged
mason-sharp merged 2 commits into
mainfrom
ace-192-test
Jun 29, 2026
Merged

Add mtree vs table-diff performance benchmark (not run in CI)#127
mason-sharp merged 2 commits into
mainfrom
ace-192-test

Conversation

@mason-sharp

Copy link
Copy Markdown
Member

Add TestMtreeVsTableDiffPerformance, an integration benchmark comparing the wall-clock cost of mtree table-diff (DiffMtree) against brute-force table-diff on a 1M-row table as divergence grows. It validates that the concurrent CDC drain and prompt LSN-based stop keep mtree competitive on large tables.

The test builds a merkle tree on synced data, then runs three rounds of deterministic, evenly-scattered divergence (0.01% / 0.1% / 1%, selected via (index * 2654435761) % 1000000 < threshold). For each it times both diff paths and logs the speedup. Only the 0.01% round asserts mtree is faster; 0.1% and 1% are informational, since scattered divergence saturates most leaf blocks at higher rates.

This test is intentionally NOT wired into .github/workflows/test.yml: it is a benchmark whose pass/fail at low divergence rides on a narrow timing margin (~1.15x), which would be flaky on shared CI runners. Run it on demand with:

go test -count=1 -v ./tests/integration -run 'TestMtreeVsTableDiffPerformance$'

It is gated by testing.Short() so it never runs in short-mode passes.

Add TestMtreeVsTableDiffPerformance, an integration benchmark comparing
the wall-clock cost of mtree table-diff (DiffMtree) against brute-force
table-diff on a 1M-row table as divergence grows. It validates that the
concurrent CDC drain and prompt LSN-based stop keep mtree competitive on
large tables.

The test builds a merkle tree on synced data, then runs three rounds of
deterministic, evenly-scattered divergence (0.01% / 0.1% / 1%, selected
via (index * 2654435761) % 1000000 < threshold). For each it times both
diff paths and logs the speedup. Only the 0.01% round asserts mtree is
faster; 0.1% and 1% are informational, since scattered divergence
saturates most leaf blocks at higher rates.

This test is intentionally NOT wired into .github/workflows/test.yml: it
is a benchmark whose pass/fail at low divergence rides on a narrow
timing margin (~1.15x), which would be flaky on shared CI runners. Run
it on demand with:

  go test -count=1 -v ./tests/integration -run 'TestMtreeVsTableDiffPerformance$'

It is gated by testing.Short() so it never runs in short-mode passes.
@mason-sharp mason-sharp requested a review from danolivo June 25, 2026 18:45
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds TestMtreeVsTableDiffPerformance, which builds a baseline mtree on synced node data, injects deterministic node2-only divergence at three thresholds, and compares DiffMtree() with brute-force table diff while checking detected row counts.

Changes

Mtree diff performance test

Layer / File(s) Summary
Harness setup and baseline mtree
tests/integration/mtree_perf_test.go
Adds the test entrypoint, shared performance constants, node/table setup, baseline data population, one-time mtree build timing, and cleanup for the mtree, table, and generated diff artifacts.
Comparison helpers
tests/integration/mtree_perf_test.go
Adds canonical node-pair diff counting, deterministic node2-only divergence under repair mode, and timing for DiffMtree() versus a fresh brute-force table-diff task.
Threshold loop and assertions
tests/integration/mtree_perf_test.go
Runs the 0.01%, 0.1%, and 1% divergence cases, compares both diff paths after each update, asserts divergence detection, and logs completion.

Poem

I hopped through rows with timing bright,
The mtree flashed through data light.
The table-diff went thump, thump, thud,
Yet both found splashes in the mud.
🐇 Baseline set; the burrow sings.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the new mtree vs table-diff performance benchmark and notes it is excluded from CI.
Description check ✅ Passed The description matches the new integration benchmark, its divergence rounds, and its CI/short-mode gating.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ace-192-test

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@codacy-production

codacy-production Bot commented Jun 25, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 21 complexity · 0 duplication

Metric Results
Complexity 21
Duplication 0

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
tests/integration/mtree_perf_test.go (1)

65-76: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Tie scatterMod directly to targetRows.

The comments say scatterMod must match targetRows, but the constants can drift independently and silently invalidate the deterministic row-count assumption.

Proposed change
-		scatterMod  = 1000000
+		scatterMod  = targetRows
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/mtree_perf_test.go` around lines 65 - 76, The deterministic
row-selection setup in the performance test can drift because scatterMod is
hardcoded separately from targetRows. Update the constants in mtree_perf_test.go
so scatterMod is derived directly from targetRows in the same const block, and
keep the existing scatterMult/scatterMod comment aligned with targetRows to
preserve the exact row-count assumption.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/mtree_perf_test.go`:
- Around line 222-224: The current assertions in runComparison only check that
mtree and brute-force detect some divergence, but they do not verify the
expected number of differing rows; update the test to assert the exact expected
diff counts for both mc and bc in mtree_perf_test.go using runComparison and the
r.label-driven predicate, so the test fails if either implementation misses most
divergent rows.
- Around line 120-123: The cleanup in the mtree performance test is too broad
and can delete unrelated diff reports. Update the cleanup logic in the test
around the filepath.Glob and os.Remove loop to only target diff files created by
this schema/table, or better, use the exact report paths returned by the diff
writer functions so the test only removes its own artifacts.
- Around line 88-111: Register cleanup before any setup that can fail in
mtree_perf_test: the current cleanup for mtreeTask is added only after table
creation, inserts, RunChecks, and MtreeInit, so failures can leave
customers_perf_test or partial merkle-tree state behind. Move the MtreeTeardown
registration to immediately after newMerkleTreeTask creates mtreeTask, before
calling RunChecks or MtreeInit, so the task is always cleaned up even if setup
aborts early.

---

Nitpick comments:
In `@tests/integration/mtree_perf_test.go`:
- Around line 65-76: The deterministic row-selection setup in the performance
test can drift because scatterMod is hardcoded separately from targetRows.
Update the constants in mtree_perf_test.go so scatterMod is derived directly
from targetRows in the same const block, and keep the existing
scatterMult/scatterMod comment aligned with targetRows to preserve the exact
row-count assumption.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8e930132-d08f-42be-8d2e-e393e274a34b

📥 Commits

Reviewing files that changed from the base of the PR and between 5fb45b6 and 49735ff.

📒 Files selected for processing (1)
  • tests/integration/mtree_perf_test.go

Comment thread tests/integration/mtree_perf_test.go Outdated
Comment thread tests/integration/mtree_perf_test.go Outdated
Comment thread tests/integration/mtree_perf_test.go Outdated
- Register cleanup before setup so failures can't leak table/mtree state
- Scope diff-file cleanup glob to this test's table
- Assert exact diff counts (== threshold) instead of just non-zero
- Sanitize table identifiers (pgx.Identifier) + // nosemgrep on raw SQL
- Extract an mtreePerf harness to keep functions under complexity/length limits

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
tests/integration/mtree_perf_test.go (2)

81-84: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Pin the mtree diff output mode.

compare() hard-codes td.Output = "json", but the mtree side inherits whatever env.newMerkleTreeTask defaulted Output to. If that default is empty or changes later, the 0.01% gate will time different work on each side. Set h.task.Output explicitly and reuse it for td.Output.

Proposed change
 	h.task = env.newMerkleTreeTask(t, h.qualifiedTableName, h.nodes)
 	h.task.BlockSize = blockSize
 	h.task.MaxCpuRatio = 1.0 // parallelise build/recompute across CPUs
+	h.task.Output = "json"
@@
-	td.Output = "json"
+	td.Output = h.task.Output

Also applies to: 173-179

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/mtree_perf_test.go` around lines 81 - 84, The mtree
performance comparison is using mismatched output settings because compare()
forces td.Output to json while h.task inherits the default from
env.newMerkleTreeTask. Update the setup in mtree_perf_test.go to explicitly set
h.task.Output to the same pinned mode and reuse that value when assigning
td.Output in compare(), so both sides benchmark the same work regardless of
defaults. Use the existing symbols newMerkleTreeTask, h.task, and compare() to
locate and align the output configuration in both places mentioned by the
review.

33-40: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Derive the scatter modulus from the seeded row count.

The “exactly threshold rows” guarantee only holds while perfScatterMod matches the number of seeded rows. Since newMtreePerf() already takes rows, a later change to targetRows or helper reuse will silently invalidate the exact-count assertions. Store the modulus on mtreePerf from rows, or assert that invariant before calling divergeOnN2().

Proposed change
 type mtreePerf struct {
 	t                  *testing.T
 	ctx                context.Context
 	env                *testEnv
 	task               *mtree.MerkleTreeTask
 	tableName          string
 	qualifiedTableName string
 	tableIdent         string // pgx-sanitized "schema"."table" for raw SQL
 	nodes              []string
 	pairKey            string
+	scatterMod         int
 	blockSize          int
 	compareUnitSize    int
 }
 
 func newMtreePerf(t *testing.T, ctx context.Context, env *testEnv, tableName string, rows, blockSize, compareUnitSize int) *mtreePerf {
 	t.Helper()
 	h := &mtreePerf{
 		t:                  t,
 		ctx:                ctx,
 		env:                env,
 		tableName:          tableName,
 		qualifiedTableName: fmt.Sprintf("%s.%s", testSchema, tableName),
 		tableIdent:         pgx.Identifier{testSchema, tableName}.Sanitize(),
 		nodes:              []string{env.ServiceN1, env.ServiceN2},
+		scatterMod:         rows,
 		blockSize:          blockSize,
 		compareUnitSize:    compareUnitSize,
 	}
 }
 
 func (h *mtreePerf) divergeOnN2(threshold int) {
 	h.env.withRepairMode(h.t, h.ctx, h.env.N2Pool, func(conn *pgxpool.Conn) {
 		_, err := conn.Exec(h.ctx, fmt.Sprintf(
-			`UPDATE %s SET email = 'scatter%d_' || index::text WHERE (index * %d) %% %d < %d`,
-			h.tableIdent, threshold, perfScatterMult, perfScatterMod, threshold,
+			`UPDATE %s SET email = 'scatter%d_' || index::text WHERE (index * %d) %% %d < %d`,
+			h.tableIdent, threshold, perfScatterMult, h.scatterMod, threshold,
 		))
 		require.NoError(h.t, err, "failed to diverge rows on node2")
 	})
 }

Also applies to: 152-156

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/mtree_perf_test.go` around lines 33 - 40, The scatter
predicate currently relies on a fixed perfScatterMod constant, so the
exact-count guarantee can break if the seeded row count changes or the helper is
reused. Update mtreePerf/newMtreePerf to store the modulus from the rows value
used to seed the tree, and have divergeOnN2 use that per-instance modulus
instead of the global constant. If keeping the constant, add an explicit
invariant check before divergence to ensure the modulus matches the seeded row
count.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/integration/mtree_perf_test.go`:
- Around line 81-84: The mtree performance comparison is using mismatched output
settings because compare() forces td.Output to json while h.task inherits the
default from env.newMerkleTreeTask. Update the setup in mtree_perf_test.go to
explicitly set h.task.Output to the same pinned mode and reuse that value when
assigning td.Output in compare(), so both sides benchmark the same work
regardless of defaults. Use the existing symbols newMerkleTreeTask, h.task, and
compare() to locate and align the output configuration in both places mentioned
by the review.
- Around line 33-40: The scatter predicate currently relies on a fixed
perfScatterMod constant, so the exact-count guarantee can break if the seeded
row count changes or the helper is reused. Update mtreePerf/newMtreePerf to
store the modulus from the rows value used to seed the tree, and have
divergeOnN2 use that per-instance modulus instead of the global constant. If
keeping the constant, add an explicit invariant check before divergence to
ensure the modulus matches the seeded row count.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0c2f593d-f06b-46ee-afbc-97b7e19d97c3

📥 Commits

Reviewing files that changed from the base of the PR and between 49735ff and 8056db3.

📒 Files selected for processing (1)
  • tests/integration/mtree_perf_test.go

@mason-sharp mason-sharp merged commit 5664a04 into main Jun 29, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants