test(amber): move state-mat e2e into amber-integration#5682
Open
Yicong-Huang wants to merge 5 commits into
Open
test(amber): move state-mat e2e into amber-integration#5682Yicong-Huang wants to merge 5 commits into
Yicong-Huang wants to merge 5 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5682 +/- ##
============================================
- Coverage 53.09% 52.74% -0.35%
+ Complexity 2546 2543 -3
============================================
Files 1076 1090 +14
Lines 41762 42150 +388
Branches 4513 4529 +16
============================================
+ Hits 22174 22233 +59
- Misses 18278 18592 +314
- Partials 1310 1325 +15
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Contributor
✅ No material benchmark regressions detected🟢 13 better · 🔴 0 worse · ⚪ 2 noise (<±5%) · 0 without baseline
Baseline detailsLatest main
Raw CSVconfig_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,443.54,200,128000,451,0.275,22207.79,29681.99,29681.99
1,100,10,64,20,2112.87,2000,1280000,947,0.578,106607.71,137774.55,137774.55
2,1000,10,64,20,17798.41,20000,12800000,1124,0.686,889964.31,953406.43,953406.43 |
b9a13f2 to
9f13e26
Compare
Drop the macOS-only `pyamber-state-materialization-mac` diagnostic job and the sqlite-backed `SqlCatalog` override it relied on. Mark the two e2e tests with @pytest.mark.integration so they run in the `amber-integration` ubuntu job, which already provisions postgres + iceberg catalog DB + MinIO + Lakekeeper and runs `pytest -m integration` as its last step. Catalog initialization now mirrors test_iceberg_document.py:45 — postgres-backed JdbcCatalog with a tempdir warehouse — so we exercise the real prod catalog code path instead of an sqlite shim. The `StorageConfig.initialize` call is wrapped in a session-autouse fixture (rather than at module import) so it co-exists with test_iceberg_document.py's import-time initialize regardless of pytest collection order. The unit-style `test_process_start_channel_persists_produce_state_on_start_output` that the mac job also ran is unaffected: it monkeypatches the output manager and is already picked up by the regular `pyamber` job's `pytest -m "not integration"` step. Closes apache#5681
9f13e26 to
009d9e4
Compare
Signed-off-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com>
The macos-latest matrix entry was failing at "Set up job" / "docker:
command not found" because GitHub-hosted macOS runners have no
Docker and `services:` containers are Linux-only. Provision the same
stack natively on macOS:
* postgres@16 via brew, with a `postgres` superuser created so the
psql steps stay OS-agnostic;
* minio via `brew install minio`, backgrounded with nohup;
* lakekeeper from the upstream aarch64-apple-darwin tarball
(published v0.11.0 onward — same version as the Linux docker tag);
* minio-mc via brew for the warehouse bucket-init step;
* protoc release asset switched on \$RUNNER_OS.
The job-level `services.postgres` block is also removed since it's a
Linux-only feature; both branches now start postgres via a step.
Each docker-dependent step branches on \$RUNNER_OS inside its `run:`
script so the YAML keeps one step per concept rather than
Linux/macOS pairs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
protoc 3.19.4 predates upstream arm64-mac builds — the previous osx-aarch_64 URL 404'd. GitHub's Apple Silicon macOS runners ship Rosetta 2 preinstalled, so the x86_64 binary runs transparently. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aglinxinyuan
left a comment
Contributor
There was a problem hiding this comment.
The change is correct. I will review it after the CI passes.
protoc's import resolver walks each -I in order. With `-I=/usr/local/include` listed first, ambermessage.proto's `import "org/apache/.../controlcommands.proto"` caused protoc to stat `/usr/local/include/org/apache/...`. On Linux that path returns ENOENT and protoc falls through to the next -I; on macOS the unzip-created /usr/local/include lacks +x for the runner user, so the stat returns EACCES and protoc aborts. Widen the existing chmod from `/usr/local/include/google` to `/usr/local/include` so the parent dir is traversable on both OSes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this PR?
Drops the macOS-only
pyamber-state-materialization-macdiagnostic job and the sqlite-backedSqlCatalogoverride it relied on. The two state-materialization e2e tests now run under the existingamber-integrationubuntu job, which already provisions postgres + iceberg catalog DB + MinIO + Lakekeeper and runspytest -m integrationas its last step.Concretely:
pyamber-state-materialization-macmacOS job (build.yml:742)SqlCatalog(sqlite) injected in module fixtureJdbcCatalog, matchingtest_iceberg_document.py:45pytest -sv <path>from that job@pytest.mark.integration+ picked up byamber-integration'spytest -m integrationStorageConfig.initializeis wrapped in a session-autouse fixture (rather than called at module import time) so it co-exists withtest_iceberg_document.py's import-timeinitializeregardless of pytest collection order.The unit-style
test_process_start_channel_persists_produce_state_on_start_outputthat the mac job also ran is untouched: it monkeypatches the output manager and is already picked up by the regularpyamberjob'spytest -m "not integration"step.Any related issues, documentation, discussions?
Closes #5681.
How was this PR tested?
Locally against the existing texera-dev infra (postgres on 5432 with
texera_iceberg_catalogschema initialized viasql/iceberg_postgres_catalog.sql):And the regular pyamber suite still green:
In CI, the
amber-integrationjob picks these tests up automatically because they're marked@pytest.mark.integration. The mac job is gone so PRs no longer pay for a separate macOS runner just for these two tests.Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.7)