Skip to content

perf: skip merge on subtrees the memo proves are unaffected#35

Merged
AmitMY merged 1 commit into
mainfrom
perf/merge-lazy-out
Jun 26, 2026
Merged

perf: skip merge on subtrees the memo proves are unaffected#35
AmitMY merged 1 commit into
mainfrom
perf/merge-lazy-out

Conversation

@AmitMY

@AmitMY AmitMY commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Reworked from the earlier lazy-out version (which read messy and only gave ~5%) to reuse the get_merges memo as an "is this subtree affected?" check, per review.

_merges (when memoized — the default) already lists every mergeable subsequence in the subtree, including nested ones (yield from child.get_merges()). So if the chosen merge isn't among them, this subtree contains it nowhere and merge() can return self immediately — skipping the scan, the child recursion, and the rebuild in one membership check:

cached = getattr(self, "_merges", None)
if cached is not None and merge not in cached:
    return self
# ... otherwise the plain #29 scan/rebuild ...

Falls back to the scan when the node isn't memoized (e.g. TRADE_MEMORY_FOR_SPEED=False, or pre-seeded merges before any get_merges), so behavior is unchanged there. Output identical; 137 tests pass (rebased on current main, which now has #32/#33/#34/#36).

Measured back-to-back, 3 identical rounds (50 texts / 200 merges; time = best of 3):

main this PR Δ time Δ mem
BPE 0.655s / 1.89MB 0.428s / 2.14MB −35% +0.25MB
BNE n=4 0.966s / 4.80MB 0.845s / 3.63MB −13% −1.17MB
Boundless 0.765s / 1.99MB 0.510s / 2.28MB −33% +0.29MB

Far bigger than the original lazy-out (−4–7%): an unaffected word now skips the scan, recursion, and rebuild in a single C-level in check. Memory is mixed and small (BNE −1.2MB from avoided transient rebuilds; BPE/Boundless +~0.25MB).

🤖 Generated with Claude Code

Comment thread complex_tokenization/graph.py Outdated
out.extend(nodes[i:])
if len(out) == 1:
return out[0]
merged_nodes = tuple(node.merge(token, merge) for node in out)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i love the idea, but think we might need a better implementation - this is a bit messy and not straightforward

@AmitMY

AmitMY commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Fair — I tried two cleaner rewrites and benchmarked both against main; both lose the speedup:

main inline (current) _contains pre-check extracted _replace_top
BPE 0.841s 0.796s (−6.9%) 0.856s (~flat) 0.840s (~flat)
BNE 1.443s 1.377s (−4.8%) 1.440s (~flat) 1.448s (~flat)
Boundless 0.991s 0.907s (−4.0%) 0.957s (−3%) 0.945s (~flat)

The win comes specifically from the single inline pass: the _contains version double-scans the words that do contain the merge, and extracting a helper adds a method call per word per merge — both erase the gain.

So it's a real tension: the speedup needs the inline scan, which is the part that reads as messy. Given this is the smallest of the five wins and resists a clean expression, my recommendation is to close this one rather than add code that's either messy or doesn't help. The other four (slots, skip-flatten, argmax-items, interning) carry the bulk of the improvement. Happy to close it unless you'd rather keep the inline version as-is.

When get_merges is memoized, _merges lists every mergeable subsequence in
the subtree. If the chosen merge isn't among them, this subtree contains
it nowhere, so merge() returns self immediately — skipping the scan, the
child recursion, and the rebuild. Falls back to the plain scan when the
node isn't memoized.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AmitMY AmitMY force-pushed the perf/merge-lazy-out branch from 5babe23 to c4d8cee Compare June 26, 2026 13:56
@AmitMY AmitMY changed the title perf: defer out-list construction in merge until first match perf: skip merge on subtrees the memo proves are unaffected Jun 26, 2026
@AmitMY AmitMY merged commit 2f6cdf0 into main Jun 26, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant