Add mtree vs table-diff performance benchmark (not run in CI)#127
Conversation
Add TestMtreeVsTableDiffPerformance, an integration benchmark comparing the wall-clock cost of mtree table-diff (DiffMtree) against brute-force table-diff on a 1M-row table as divergence grows. It validates that the concurrent CDC drain and prompt LSN-based stop keep mtree competitive on large tables. The test builds a merkle tree on synced data, then runs three rounds of deterministic, evenly-scattered divergence (0.01% / 0.1% / 1%, selected via (index * 2654435761) % 1000000 < threshold). For each it times both diff paths and logs the speedup. Only the 0.01% round asserts mtree is faster; 0.1% and 1% are informational, since scattered divergence saturates most leaf blocks at higher rates. This test is intentionally NOT wired into .github/workflows/test.yml: it is a benchmark whose pass/fail at low divergence rides on a narrow timing margin (~1.15x), which would be flaky on shared CI runners. Run it on demand with: go test -count=1 -v ./tests/integration -run 'TestMtreeVsTableDiffPerformance$' It is gated by testing.Short() so it never runs in short-mode passes.
📝 WalkthroughWalkthroughAdds ChangesMtree diff performance test
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 21 |
| Duplication | 0 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/integration/mtree_perf_test.go (1)
65-76: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winTie
scatterModdirectly totargetRows.The comments say
scatterModmust matchtargetRows, but the constants can drift independently and silently invalidate the deterministic row-count assumption.Proposed change
- scatterMod = 1000000 + scatterMod = targetRows🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/mtree_perf_test.go` around lines 65 - 76, The deterministic row-selection setup in the performance test can drift because scatterMod is hardcoded separately from targetRows. Update the constants in mtree_perf_test.go so scatterMod is derived directly from targetRows in the same const block, and keep the existing scatterMult/scatterMod comment aligned with targetRows to preserve the exact row-count assumption.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/integration/mtree_perf_test.go`:
- Around line 222-224: The current assertions in runComparison only check that
mtree and brute-force detect some divergence, but they do not verify the
expected number of differing rows; update the test to assert the exact expected
diff counts for both mc and bc in mtree_perf_test.go using runComparison and the
r.label-driven predicate, so the test fails if either implementation misses most
divergent rows.
- Around line 120-123: The cleanup in the mtree performance test is too broad
and can delete unrelated diff reports. Update the cleanup logic in the test
around the filepath.Glob and os.Remove loop to only target diff files created by
this schema/table, or better, use the exact report paths returned by the diff
writer functions so the test only removes its own artifacts.
- Around line 88-111: Register cleanup before any setup that can fail in
mtree_perf_test: the current cleanup for mtreeTask is added only after table
creation, inserts, RunChecks, and MtreeInit, so failures can leave
customers_perf_test or partial merkle-tree state behind. Move the MtreeTeardown
registration to immediately after newMerkleTreeTask creates mtreeTask, before
calling RunChecks or MtreeInit, so the task is always cleaned up even if setup
aborts early.
---
Nitpick comments:
In `@tests/integration/mtree_perf_test.go`:
- Around line 65-76: The deterministic row-selection setup in the performance
test can drift because scatterMod is hardcoded separately from targetRows.
Update the constants in mtree_perf_test.go so scatterMod is derived directly
from targetRows in the same const block, and keep the existing
scatterMult/scatterMod comment aligned with targetRows to preserve the exact
row-count assumption.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8e930132-d08f-42be-8d2e-e393e274a34b
📒 Files selected for processing (1)
tests/integration/mtree_perf_test.go
- Register cleanup before setup so failures can't leak table/mtree state - Scope diff-file cleanup glob to this test's table - Assert exact diff counts (== threshold) instead of just non-zero - Sanitize table identifiers (pgx.Identifier) + // nosemgrep on raw SQL - Extract an mtreePerf harness to keep functions under complexity/length limits
There was a problem hiding this comment.
🧹 Nitpick comments (2)
tests/integration/mtree_perf_test.go (2)
81-84: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winPin the mtree diff output mode.
compare()hard-codestd.Output = "json", but the mtree side inherits whateverenv.newMerkleTreeTaskdefaultedOutputto. If that default is empty or changes later, the 0.01% gate will time different work on each side. Seth.task.Outputexplicitly and reuse it fortd.Output.Proposed change
h.task = env.newMerkleTreeTask(t, h.qualifiedTableName, h.nodes) h.task.BlockSize = blockSize h.task.MaxCpuRatio = 1.0 // parallelise build/recompute across CPUs + h.task.Output = "json" @@ - td.Output = "json" + td.Output = h.task.OutputAlso applies to: 173-179
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/mtree_perf_test.go` around lines 81 - 84, The mtree performance comparison is using mismatched output settings because compare() forces td.Output to json while h.task inherits the default from env.newMerkleTreeTask. Update the setup in mtree_perf_test.go to explicitly set h.task.Output to the same pinned mode and reuse that value when assigning td.Output in compare(), so both sides benchmark the same work regardless of defaults. Use the existing symbols newMerkleTreeTask, h.task, and compare() to locate and align the output configuration in both places mentioned by the review.
33-40: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winDerive the scatter modulus from the seeded row count.
The “exactly
thresholdrows” guarantee only holds whileperfScatterModmatches the number of seeded rows. SincenewMtreePerf()already takesrows, a later change totargetRowsor helper reuse will silently invalidate the exact-count assertions. Store the modulus onmtreePerffromrows, or assert that invariant before callingdivergeOnN2().Proposed change
type mtreePerf struct { t *testing.T ctx context.Context env *testEnv task *mtree.MerkleTreeTask tableName string qualifiedTableName string tableIdent string // pgx-sanitized "schema"."table" for raw SQL nodes []string pairKey string + scatterMod int blockSize int compareUnitSize int } func newMtreePerf(t *testing.T, ctx context.Context, env *testEnv, tableName string, rows, blockSize, compareUnitSize int) *mtreePerf { t.Helper() h := &mtreePerf{ t: t, ctx: ctx, env: env, tableName: tableName, qualifiedTableName: fmt.Sprintf("%s.%s", testSchema, tableName), tableIdent: pgx.Identifier{testSchema, tableName}.Sanitize(), nodes: []string{env.ServiceN1, env.ServiceN2}, + scatterMod: rows, blockSize: blockSize, compareUnitSize: compareUnitSize, } } func (h *mtreePerf) divergeOnN2(threshold int) { h.env.withRepairMode(h.t, h.ctx, h.env.N2Pool, func(conn *pgxpool.Conn) { _, err := conn.Exec(h.ctx, fmt.Sprintf( - `UPDATE %s SET email = 'scatter%d_' || index::text WHERE (index * %d) %% %d < %d`, - h.tableIdent, threshold, perfScatterMult, perfScatterMod, threshold, + `UPDATE %s SET email = 'scatter%d_' || index::text WHERE (index * %d) %% %d < %d`, + h.tableIdent, threshold, perfScatterMult, h.scatterMod, threshold, )) require.NoError(h.t, err, "failed to diverge rows on node2") }) }Also applies to: 152-156
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/mtree_perf_test.go` around lines 33 - 40, The scatter predicate currently relies on a fixed perfScatterMod constant, so the exact-count guarantee can break if the seeded row count changes or the helper is reused. Update mtreePerf/newMtreePerf to store the modulus from the rows value used to seed the tree, and have divergeOnN2 use that per-instance modulus instead of the global constant. If keeping the constant, add an explicit invariant check before divergence to ensure the modulus matches the seeded row count.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/integration/mtree_perf_test.go`:
- Around line 81-84: The mtree performance comparison is using mismatched output
settings because compare() forces td.Output to json while h.task inherits the
default from env.newMerkleTreeTask. Update the setup in mtree_perf_test.go to
explicitly set h.task.Output to the same pinned mode and reuse that value when
assigning td.Output in compare(), so both sides benchmark the same work
regardless of defaults. Use the existing symbols newMerkleTreeTask, h.task, and
compare() to locate and align the output configuration in both places mentioned
by the review.
- Around line 33-40: The scatter predicate currently relies on a fixed
perfScatterMod constant, so the exact-count guarantee can break if the seeded
row count changes or the helper is reused. Update mtreePerf/newMtreePerf to
store the modulus from the rows value used to seed the tree, and have
divergeOnN2 use that per-instance modulus instead of the global constant. If
keeping the constant, add an explicit invariant check before divergence to
ensure the modulus matches the seeded row count.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0c2f593d-f06b-46ee-afbc-97b7e19d97c3
📒 Files selected for processing (1)
tests/integration/mtree_perf_test.go
Add TestMtreeVsTableDiffPerformance, an integration benchmark comparing the wall-clock cost of mtree table-diff (DiffMtree) against brute-force table-diff on a 1M-row table as divergence grows. It validates that the concurrent CDC drain and prompt LSN-based stop keep mtree competitive on large tables.
The test builds a merkle tree on synced data, then runs three rounds of deterministic, evenly-scattered divergence (0.01% / 0.1% / 1%, selected via (index * 2654435761) % 1000000 < threshold). For each it times both diff paths and logs the speedup. Only the 0.01% round asserts mtree is faster; 0.1% and 1% are informational, since scattered divergence saturates most leaf blocks at higher rates.
This test is intentionally NOT wired into .github/workflows/test.yml: it is a benchmark whose pass/fail at low divergence rides on a narrow timing margin (~1.15x), which would be flaky on shared CI runners. Run it on demand with:
go test -count=1 -v ./tests/integration -run 'TestMtreeVsTableDiffPerformance$'
It is gated by testing.Short() so it never runs in short-mode passes.