Add --chunk-concurrent-size for parallel row-copy by dnovitski · Pull Request #1688 · github/gh-ost

dnovitski · 2026-05-22T09:01:52Z

Summary

Port of #1398 by @shaohk to current master, with correctness improvements.

Adds --chunk-concurrent-size flag that allows multiple row-copy chunks to execute in parallel within each iteration using errgroup. Default is 1 (no behavior change).

Motivation

On large tables with fast storage (NVMe/SSD), the single-threaded row-copy loop can become a bottleneck. This flag enables parallel chunk copying to improve migration throughput.

Performance Results

1M rows, ADD COLUMN extra_col INT DEFAULT 0, Docker MySQL 8.0, chunk-size=1000:

Concurrency	Wall-clock	Wall-clock reduction vs. baseline
1 (default)	30.6s	baseline
4	23.9s	22%
8	20.9s	32%

Benefits scale with table size and storage throughput.

Key Design Decisions

Two execution paths: concurrency=1 matches master's retry semantics exactly (range calc inside retry loop for hook-based chunk size reduction); concurrency>1 pre-calculates ranges under mutex for safe parallel execution
Thread-safe range calculation: CalculateNextIterationRangeEndValues(advanceCursor bool) protected by mutex, returns *IterationRangeValues struct with isolated Min/Max per goroutine
No shared mutable state in hot path: SQL warnings returned as function value (eliminates data race on shared MigrationLastInsertSQLWarnings field)
errgroup with real migration context: Proper cancellation propagation when any chunk fails
DB pool auto-sizing: Connection pool increased when --chunk-concurrent-size exceeds default pool size
Backward compatible: Default concurrency=1 preserves existing single-threaded behavior exactly
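
The concurrent path described above (ranges handed out under a mutex, each goroutine working on its own copy, first failure cancelling the rest) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses only the standard library (`sync.WaitGroup` plus a cancellable context) as a stand-in for errgroup, it assumes integer keys, and the names `chunkRange`, `rangeCursor`, `copyIteration` and `demoCopy` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// chunkRange is a hypothetical stand-in for the PR's IterationRangeValues:
// every goroutine receives its own copy, so Min/Max are never shared.
type chunkRange struct{ Min, Max int }

// rangeCursor hands out consecutive half-open ranges [Min, Max). Its mutex
// plays the role of the lock around CalculateNextIterationRangeEndValues.
type rangeCursor struct {
	mu        sync.Mutex
	next, end int // next unassigned key, and one past the last key
	chunkSize int
}

// nextRange returns the next range, or ok=false once the keys are exhausted.
func (c *rangeCursor) nextRange() (r chunkRange, ok bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.next >= c.end {
		return chunkRange{}, false
	}
	hi := c.next + c.chunkSize
	if hi > c.end {
		hi = c.end
	}
	r = chunkRange{Min: c.next, Max: hi}
	c.next = hi
	return r, true
}

// copyIteration pre-calculates up to n ranges, then copies them in parallel.
// The first error cancels the context seen by the remaining workers.
func copyIteration(ctx context.Context, c *rangeCursor, n int,
	copyChunk func(context.Context, chunkRange) error) (int, error) {
	var ranges []chunkRange
	for i := 0; i < n; i++ {
		r, ok := c.nextRange()
		if !ok {
			break
		}
		ranges = append(ranges, r)
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, r := range ranges {
		wg.Add(1)
		go func(r chunkRange) {
			defer wg.Done()
			if err := copyChunk(ctx, r); err != nil {
				once.Do(func() { firstErr = err; cancel() })
			}
		}(r)
	}
	wg.Wait()
	return len(ranges), firstErr
}

// demoCopy runs one iteration over 10 keys with chunk size 3 and 4 workers.
func demoCopy() (ranges, rows int, err error) {
	c := &rangeCursor{next: 0, end: 10, chunkSize: 3}
	var mu sync.Mutex
	ranges, err = copyIteration(context.Background(), c, 4,
		func(_ context.Context, r chunkRange) error {
			mu.Lock()
			defer mu.Unlock()
			rows += r.Max - r.Min // pretend the chunk was copied
			return nil
		})
	return ranges, rows, err
}

func main() {
	fmt.Println(demoCopy())
}
```

In the real migrator the range end comes from a query against the source table rather than integer arithmetic, which is why the range calculation has to be serialized at all.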

Changes from original #1398

Adapted to current master API (builder pattern, receiver names, retryBatchCopyWithHooks)
Fixed thread-safety: ApplyIterationInsertQuery returns SQL warnings instead of writing to shared field
Correct retry behavior: single-threaded path recalculates range on retry (matches master); concurrent path retries same range (INSERT IGNORE is idempotent)
Proper IncludeMinValues handling for first iteration
Uses real migration context (not context.Background())
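
The difference between the two retry behaviours listed above can be illustrated with a small sketch. It is an assumption-laden simplification: `retryRecalculated`, `retrySameRange`, `chunkRange` and `demoRetry` are hypothetical names, and the "hook" is simulated by shrinking a local chunk size after the first failure.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkRange is a hypothetical integer-keyed range [Min, Max).
type chunkRange struct{ Min, Max int }

// retryRecalculated recomputes the range on every attempt, so a chunk-size
// reduction made between attempts takes effect (single-threaded path).
func retryRecalculated(maxRetries int, calc func() chunkRange,
	op func(chunkRange) error) (err error) {
	for attempt := 0; attempt < maxRetries; attempt++ {
		if err = op(calc()); err == nil {
			return nil
		}
	}
	return err
}

// retrySameRange retries a fixed, pre-calculated range (concurrent path).
// This is only safe when op is idempotent, as INSERT IGNORE is.
func retrySameRange(maxRetries int, r chunkRange,
	op func(chunkRange) error) (err error) {
	for attempt := 0; attempt < maxRetries; attempt++ {
		if err = op(r); err == nil {
			return nil
		}
	}
	return err
}

// demoRetry records the chunk size seen on each attempt for both strategies.
func demoRetry() (recalcSizes, sameSizes []int) {
	chunkSize := 1000
	// failOnce fails its first call and halves chunkSize, standing in for a
	// hook that reduces the chunk size after a failed batch.
	failOnce := func(sizes *[]int) func(chunkRange) error {
		calls := 0
		return func(r chunkRange) error {
			*sizes = append(*sizes, r.Max-r.Min)
			calls++
			if calls == 1 {
				chunkSize = 500
				return errors.New("simulated failure")
			}
			return nil
		}
	}
	_ = retryRecalculated(3,
		func() chunkRange { return chunkRange{0, chunkSize} },
		failOnce(&recalcSizes))
	chunkSize = 1000
	_ = retrySameRange(3, chunkRange{0, chunkSize}, failOnce(&sameSizes))
	return recalcSizes, sameSizes
}

func main() {
	fmt.Println(demoRetry())
}
```

In this sketch the recalculating strategy sees the reduced size on its second attempt, while the fixed-range strategy repeats the original range.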

Testing

All CI checks pass (build, lint, CodeQL, migration tests on MySQL 5.7/8.0/8.4/Percona)
Performance benchmarked (22-32% improvement with concurrency 4-8)
Data integrity verified (1M rows, checksum match)
TestRetryBatchCopyWithHooks passes (hook-based chunk size reduction works correctly)

Checklist

Tests pass
Documentation updated (doc/command-line-flags.md)
Backward compatible (default=1)

Based on work by @shaohk in #1398.

shaohk · 2026-06-07T00:08:00Z

There's a problem with concurrent chunk copying: during the chunk operation, executing INSERT ... SELECT holds the auto-increment lock on the target table, which puts a ceiling on the concurrency gains you can achieve.

dnovitski · 2026-06-08T09:10:09Z

Thanks @shaohk — you're right to flag this, and since most tables do have an AUTO_INCREMENT column it's worth being precise about when it bites.

The AUTO-INC ceiling depends on innodb_autoinc_lock_mode:

Mode 0/1: INSERT ... SELECT is classified as a "bulk insert", so InnoDB holds the table-level AUTO-INC lock for the whole statement — and that's decided by statement type, not by whether we supply values (we do copy the source PK values, but it doesn't exempt us). Concurrent chunks serialize, so the ceiling is real here. Mode 1 was the default through MySQL 5.7.
Mode 2 (interleaved): no table-level AUTO-INC lock — just a lightweight per-allocation mutex. This is the default on MySQL 8.0+, and it's safe for gh-ost specifically because gh-ost mandates binlog_format=ROW (statement-based replication is the only reason mode 2 is otherwise flagged "unsafe"). So on a typical 8.0 deployment the AUTO-INC ceiling you describe is largely gone by default.

So I'd frame it as: the hard serialization you're describing applies under mode 0/1; under mode 2 it becomes soft contention.

That said — even under mode 2, parallel copy doesn't scale linearly, and the dominant cap is actually in gh-ost's own loop rather than MySQL. The next-chunk boundary calculation (CalculateNextIterationRangeEndValues) runs under a global mutex and includes a round-trip + indexed scan of the source, because each chunk's start depends on the previous chunk's max cursor. So boundary computation is fully serialized and only the INSERT...SELECT runs in parallel — classic Amdahl. On top of that there's secondary-index/redo contention on the ghost table and gh-ost's replication-lag/load throttling, which usually paces the migration well before insert concurrency does.
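
To put a number on the Amdahl argument, here is a small worked calculation. It inverts Amdahl's law, S = 1 / ((1 - p) + p/n), to get the parallel fraction p implied by a measured speedup S at n workers, and feeds it the wall-clock figures from the PR description. Treat the result as a rough indication only: those timings cover the whole migration, not just row-copy, and the function names are illustrative.

```go
package main

import "fmt"

// amdahlSpeedup returns the ideal speedup with n workers when a fraction p
// of the single-threaded runtime can run in parallel.
func amdahlSpeedup(p float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n))
}

// parallelFraction inverts Amdahl's law: given a measured speedup s at n
// workers, it returns the parallel fraction p that would explain it.
// From 1/s = 1 - p*(1 - 1/n) it follows that p = (1 - 1/s) / (1 - 1/n).
func parallelFraction(s float64, n int) float64 {
	return (1 - 1/s) / (1 - 1/float64(n))
}

func main() {
	const baseline = 30.6 // seconds at concurrency 1, from the PR description
	measured := []struct {
		n    int
		secs float64
	}{{4, 23.9}, {8, 20.9}}
	for _, m := range measured {
		s := baseline / m.secs
		p := parallelFraction(s, m.n)
		fmt.Printf("n=%d speedup=%.2fx implied parallel fraction=%.2f\n", m.n, s, p)
	}
}
```

For the two measurements this gives an implied parallel fraction of roughly 0.29 and 0.36, which is consistent with most of the wall-clock time still being spent in serialized work.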

I think the right things to do here are (1) document that --chunk-concurrent-size > 1 wants innodb_autoinc_lock_mode = 2, and ideally detect mode 0/1 on an auto-increment table at startup and warn that the speedup won't materialize; and (2) fix gh-ost's own loop so that it no longer caps throughput internally.
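
A startup check along the lines of (1) might look like the sketch below. It is only the decision logic: `autoIncWarning` is a hypothetical helper, not an existing gh-ost function, and it assumes the caller has already read the mode (for example via `SELECT @@global.innodb_autoinc_lock_mode`) and knows whether the table has an AUTO_INCREMENT column.

```go
package main

import "fmt"

// autoIncWarning returns a warning message when parallel row-copy is
// requested on an auto-increment table while innodb_autoinc_lock_mode is
// 0 or 1, and an empty string otherwise. Hypothetical helper for illustration.
func autoIncWarning(lockMode int, hasAutoIncrement bool, concurrency int) string {
	if concurrency <= 1 || !hasAutoIncrement || lockMode == 2 {
		return ""
	}
	return fmt.Sprintf(
		"--chunk-concurrent-size=%d with innodb_autoinc_lock_mode=%d: "+
			"INSERT ... SELECT holds the table-level AUTO-INC lock, "+
			"so parallel chunks may serialize",
		concurrency, lockMode)
}

func main() {
	for _, mode := range []int{0, 1, 2} {
		if w := autoIncWarning(mode, true, 4); w != "" {
			fmt.Println("WARNING:", w)
		} else {
			fmt.Printf("mode %d: no warning\n", mode)
		}
	}
}
```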

Does that match what you were seeing?

olegkv · 2026-06-12T20:54:08Z

On MySQL 8, INSERT throughput scales much better than 20-30% when the number of parallel workers increases from 1 to 4-8.
My experiments show a speedup of several times from parallel inserts into the same table alone.
INSERTs are expected to be several times slower than range SELECTs, so selecting ranges should not be a bottleneck with 4 workers.
Auto-increment works well for parallel inserts by default; even with mode 1, parallel inserts still increase throughput several times.
I suspect something here is not fully parallel.

@shaohk

Port of PR github#1398 by @shaohk: allows multiple row-copy chunks to execute in parallel within each iteration using errgroup.

Key changes:
- Add IterationRangeValues struct for thread-safe range passing
- Serialize range calculation with CalculateNextIterationRangeEndValuesLock
- Rewrite iterateChunks to spawn N goroutines per queue item via errgroup
- Return SQL warnings from ApplyIterationInsertQuery (eliminates race on shared MigrationLastInsertSQLWarnings field)
- Increase DB connection pool when concurrency > default pool size
- Add --chunk-concurrent-size CLI flag (default 1, no behavior change)

Co-authored-by: shaohk <shaohk@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…r, cut per-chunk round-trips

The --chunk-concurrent-size parallel row-copy only ran the INSERTs in parallel; the boundary calculation and the per-chunk transaction overhead serialized work and capped the achievable speedup well below the hardware's parallel-insert ceiling. This addresses three of those caps.

Prefetch range producer (overlap serialized boundary calc with INSERTs):
- A single dedicated producer goroutine is the sole caller of CalculateNextIterationRangeEndValues and streams pre-computed ranges into a buffered channel, so boundary scans now overlap the parallel INSERTs of earlier work instead of stalling between batches.
- Split iterateChunks into iterateChunksSingle (unchanged single-threaded semantics) and iterateChunksConcurrent.
- Size the applier pool for concurrentSize + producer + headroom.

#1 Per-chunk round-trips (applier.go):
- ApplyIterationInsertQuery sent BEGIN / SET SESSION / INSERT / COMMIT as four round-trips per chunk. It now sends "SET SESSION ...; INSERT ..." as a single autocommit, multi-statement round-trip on one pinned connection. The applier pool already enables multiStatements + interpolateParams + autocommit; RowsAffected() reports the INSERT (last statement), and the optional SHOW WARNINGS runs on the same pinned connection. 4 round-trips -> 1.

#2 Persistent worker pool (migrator.go):
- Replace the per-batch errgroup+g.Wait barrier (which stalled N workers on the slowest chunk every N chunks) with continuous dispatch to an errgroup bounded by SetLimit(concurrentSize) for a 200ms time quantum. Workers stay saturated; the only barrier is at the quantum boundary. The time bound keeps executeWriteFuncs returning to apply binlog events and re-check throttling, preserving row-copy/event mutual exclusion. Checkpoints record the last contiguous completed range (not the producer's prefetched cursor), so resume restarts from fully-copied data.

Benchmarked on MySQL 8.0.46 (innodb_autoinc_lock_mode=2), 2.1M rows: copy time vs the prior parallel impl improved up to 32% (chunk=200, conc=4: 22s->15s; chunk=1000, conc=8: 8s->6s). Data integrity verified by row count + checksum.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dnovitski requested review from meiji163, rashiq and timvaillancourt as code owners May 22, 2026 09:01

dnovitski force-pushed the pr-1398-rebased branch 5 times, most recently from 18f91e8 to de32943 Compare May 22, 2026 11:23

dnovitski mentioned this pull request May 23, 2026

Multithreaded replication, parallel row-copy with DML merge, frontier filter, and heartbeat lag throttle #1665

Open

dnovitski and others added 2 commits June 14, 2026 00:54

dnovitski force-pushed the pr-1398-rebased branch from ecaeb56 to e989096 Compare June 13, 2026 22:56

dnovitski wants to merge 2 commits into github:master from dnovitski:pr-1398-rebased
