fix(examples): use runnable local data paths in zinc_protein example #5698
wanghan-iapcm wants to merge 1 commit
zinc_se_a_mask.json referenced its training/validation data with repo-root-relative paths (examples/zinc_protein/...), so the example only trained when launched from the repository root; from examples/zinc_protein the paths resolved to examples/zinc_protein/examples/zinc_protein/... and did not exist. Every other example uses paths relative to its own directory. Use local train_data_dp_mask/ and val_data_dp_mask/ paths, and extend test_examples.py to assert that each example's data systems resolve relative to the example directory (previously only argument-checking was verified, so unresolvable data paths passed). Fix deepmodeling#5694
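The path doubling described above can be reproduced with a few lines of Python. This is only an illustrative sketch: `resolve_system` is a hypothetical helper, not a DeePMD-kit function, and it assumes a relative data path is simply joined onto the directory the run is launched from.

```python
from pathlib import PurePosixPath


def resolve_system(system: str, launch_dir: str) -> PurePosixPath:
    """Join a relative data path onto the launch directory (hypothetical helper)."""
    return PurePosixPath(launch_dir) / system


OLD_PATH = "examples/zinc_protein/train_data_dp_mask/"  # repo-root-relative (before)
NEW_PATH = "train_data_dp_mask/"                        # example-relative (after)

# Launched from the repository root, the old path points at the data.
from_root = resolve_system(OLD_PATH, ".")

# Launched from the example directory, the old path doubles the prefix.
from_example_old = resolve_system(OLD_PATH, "examples/zinc_protein")

# The new path resolves to the data when launched from the example directory.
from_example_new = resolve_system(NEW_PATH, "examples/zinc_protein")

print(from_root)         # examples/zinc_protein/train_data_dp_mask
print(from_example_old)  # examples/zinc_protein/examples/zinc_protein/train_data_dp_mask
print(from_example_new)  # examples/zinc_protein/train_data_dp_mask
```

With the example-relative path, launching from `examples/zinc_protein` reaches the same directory that the old path only reached from the repository root.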
Walkthrough
The zinc_protein example JSON's training_data and validation_data systems paths were updated to remove a redundant directory prefix so they resolve correctly when run from the example's own directory. A new test was added to verify system data paths exist for all examples.
Codecov Report
All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##           master    #5698      +/-   ##
==========================================
- Coverage   81.97%   81.78%   -0.20%
==========================================
  Files         959      959
  Lines      105748   105748
  Branches     4102     4104       +2
==========================================
- Hits        86684    86483     -201
- Misses      17573    17771     +198
- Partials     1491     1494       +3
Problem
examples/zinc_protein/zinc_se_a_mask.json referenced its training and validation data with repo-root-relative paths (examples/zinc_protein/train_data_dp_mask/ and examples/zinc_protein/val_data_dp_mask/), while the data actually lives next to the config. Every other example uses paths relative to its own directory. As a result this example only trained when launched from the repository root; run from examples/zinc_protein, the paths resolved to examples/zinc_protein/examples/zinc_protein/..., which does not exist.

Fix
Use local train_data_dp_mask/ and val_data_dp_mask/ paths, matching the config-relative convention of the other examples.

Test
source/tests/common/test_examples.py already loaded this config, but only ran argument-checking (normalize), so an unresolvable data path passed. This PR adds test_data_paths_exist, which asserts that every example's training_data/validation_data systems resolve relative to the config's own directory. Across all listed examples this fails only for the two zinc_protein paths on the current tree and passes for every other example; after the fix it passes. It permanently guards the config-relative-path convention.

Known limitation
The new check covers the single-task input_files list (where this example lives); multi-task configs use a different nested data schema and are out of scope for this fix.

Fix #5694
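The check described under Test could be sketched roughly as follows. This is not the code added by the PR: `missing_data_systems` and `demo` are hypothetical names, and the sketch assumes the single-task layout `training -> training_data / validation_data -> systems`, where `systems` is either one string or a list of strings. It builds a throwaway copy of the directory layout instead of reading the real repository.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def missing_data_systems(config_path: Path) -> list[str]:
    """Return data systems that do not exist relative to the config's directory.

    Hypothetical helper; assumes the single-task schema described in the lead-in.
    """
    config = json.loads(config_path.read_text())
    training = config.get("training", {})
    missing = []
    for section in ("training_data", "validation_data"):
        systems = training.get(section, {}).get("systems", [])
        # Assumption: a bare string is treated as a single system path.
        if isinstance(systems, str):
            systems = [systems]
        for system in systems:
            if not (config_path.parent / system).exists():
                missing.append(system)
    return missing


def demo() -> tuple[list[str], list[str]]:
    """Check a mock zinc_protein config before and after the path change."""
    with tempfile.TemporaryDirectory() as tmp:
        example_dir = Path(tmp) / "examples" / "zinc_protein"
        for name in ("train_data_dp_mask", "val_data_dp_mask"):
            (example_dir / name).mkdir(parents=True)
        config_path = example_dir / "zinc_se_a_mask.json"

        def write_config(prefix: str) -> None:
            config = {
                "training": {
                    "training_data": {"systems": [prefix + "train_data_dp_mask/"]},
                    "validation_data": {"systems": [prefix + "val_data_dp_mask/"]},
                }
            }
            config_path.write_text(json.dumps(config))

        write_config("examples/zinc_protein/")  # repo-root-relative paths
        before = missing_data_systems(config_path)
        write_config("")  # example-relative paths
        after = missing_data_systems(config_path)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("unresolved before:", before)
    print("unresolved after:", after)
```

Resolving against `config_path.parent` rather than the current working directory is what makes the check independent of where the test suite is launched from.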