fix(examples): use runnable local data paths in zinc_protein example #5698

Queued
wanghan-iapcm wants to merge 1 commit into deepmodeling:master from wanghan-iapcm:fix-zinc-protein-example-paths

@wanghan-iapcm wanghan-iapcm commented Jul 1, 2026

Collaborator

Problem

examples/zinc_protein/zinc_se_a_mask.json referenced its training and validation data with repo-root-relative paths (examples/zinc_protein/train_data_dp_mask/ and examples/zinc_protein/val_data_dp_mask/), while the data actually lives next to the config. Every other example uses paths relative to its own directory. As a result this example only trained when launched from the repository root; run from examples/zinc_protein, the paths resolved to examples/zinc_protein/examples/zinc_protein/..., which does not exist.
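The resolution problem described above can be reproduced with plain path joins. This is a minimal sketch, not code from the PR: the `/repo` checkout location and the `resolve` helper are hypothetical, and only the two data path strings come from the description.

```python
from pathlib import PurePosixPath


def resolve(cwd: str, rel: str) -> str:
    """Join a relative data path onto a working directory (hypothetical helper)."""
    return str(PurePosixPath(cwd) / rel)


# Path as written in the old config (repo-root-relative) and in the fixed one.
old_path = "examples/zinc_protein/train_data_dp_mask/"
new_path = "train_data_dp_mask/"

# "/repo" is an assumed checkout location, used only for illustration.
repo_root = "/repo"
example_dir = "/repo/examples/zinc_protein"

# Old path, launched from the repository root: points at the real data.
print(resolve(repo_root, old_path))
# Old path, launched from the example directory: the prefix is doubled.
print(resolve(example_dir, old_path))
# New path, launched from the example directory: points at the real data.
print(resolve(example_dir, new_path))
```

The first and third joins name the same directory, while the second contains the duplicated `examples/zinc_protein` segment reported in the issue.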

Fix

Use local train_data_dp_mask/ and val_data_dp_mask/ paths, matching the config-relative convention of the other examples.

Test

source/tests/common/test_examples.py already loaded this config, but only ran argument-checking (normalize), so an unresolvable data path passed. This PR adds test_data_paths_exist, which asserts that every example's training_data/validation_data systems resolve relative to the config's own directory. Across all listed examples this fails only for the two zinc_protein paths on the current tree and passes for every other example; after the fix it passes. It permanently guards the config-relative-path convention.
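The kind of check described above might look roughly like the sketch below. It is not the PR's `test_data_paths_exist`: the helper names `missing_data_paths` and `demo` are hypothetical, and it assumes the `training_data` and `validation_data` blocks sit under a top-level `training` key with a `systems` entry that is either a string or a list.

```python
import json
import tempfile
from pathlib import Path


def missing_data_paths(config_file: Path) -> list:
    """Return the systems entries that do not exist relative to the config's directory."""
    config = json.loads(config_file.read_text())
    # Assumption: the data blocks are nested under a top-level "training" key.
    training = config.get("training", {})
    missing = []
    for section in ("training_data", "validation_data"):
        systems = training.get(section, {}).get("systems", [])
        if isinstance(systems, str):
            systems = [systems]
        for system in systems:
            if not (config_file.parent / system).exists():
                missing.append(system)
    return missing


def demo() -> tuple:
    """Build a throwaway example directory and check an old-style and a fixed config."""
    with tempfile.TemporaryDirectory() as tmp:
        example_dir = Path(tmp) / "examples" / "zinc_protein"
        (example_dir / "train_data_dp_mask").mkdir(parents=True)
        (example_dir / "val_data_dp_mask").mkdir()

        def write(name: str, prefix: str) -> Path:
            config = {
                "training": {
                    "training_data": {"systems": [prefix + "train_data_dp_mask/"]},
                    "validation_data": {"systems": [prefix + "val_data_dp_mask/"]},
                }
            }
            path = example_dir / name
            path.write_text(json.dumps(config))
            return path

        old_config = write("old.json", "examples/zinc_protein/")
        new_config = write("new.json", "")
        return missing_data_paths(old_config), missing_data_paths(new_config)


if __name__ == "__main__":
    old_missing, new_missing = demo()
    print("old config, missing:", old_missing)
    print("new config, missing:", new_missing)
```

Resolving against `config_file.parent` rather than the current working directory is what makes the check independent of where the test runner is launched.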

Known limitation

The new check covers the single-task input_files list (where this example lives); multi-task configs use a different nested data schema and are out of scope for this fix.

Fix #5694

Summary by CodeRabbit

  • Bug Fixes
    • Updated example data paths to use shorter, relative locations so they resolve correctly from the example directory.
    • Added a check that example input files reference existing system data paths, helping catch broken paths earlier.

zinc_se_a_mask.json referenced its training/validation data with
repo-root-relative paths (examples/zinc_protein/...), so the example only
trained when launched from the repository root; from examples/zinc_protein
the paths resolved to examples/zinc_protein/examples/zinc_protein/... and did
not exist. Every other example uses paths relative to its own directory.

Use local train_data_dp_mask/ and val_data_dp_mask/ paths, and extend
test_examples.py to assert that each example's data systems resolve relative
to the example directory (previously only argument-checking was verified, so
unresolvable data paths passed).

Fix deepmodeling#5694
@dosubot dosubot Bot added the bug label Jul 1, 2026
@wanghan-iapcm wanghan-iapcm requested a review from njzjz July 1, 2026 05:27
coderabbitai Bot commented Jul 1, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between 0e5c170 and 147c8bd.

📒 Files selected for processing (2)
  • examples/zinc_protein/zinc_se_a_mask.json
  • source/tests/common/test_examples.py

Walkthrough

The zinc_protein example JSON's training_data and validation_data systems paths were updated to remove a redundant directory prefix so they resolve correctly when run from the example's own directory. A new test was added to verify system data paths exist for all examples.

Changes

Zinc Protein Example Path Fix

Layer / File(s) | Summary
Corrected example data paths (examples/zinc_protein/zinc_se_a_mask.json) | Removed the examples/zinc_protein/ prefix from training_data.systems and validation_data.systems paths so they resolve relative to the example directory.
Path-existence validation test (source/tests/common/test_examples.py) | Added test_data_paths_exist, which loads each example's input.json, reads training_data/validation_data systems, and asserts each referenced path exists relative to the example directory.

Estimated code review effort: 1 (Trivial) | ~5 minutes

Related issues: #5694 (directly addresses the repo-root-relative path issue in the zinc_protein example).

Suggested labels: documentation, tests

Suggested reviewers: (none identified from provided context)

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title clearly summarizes the main change: making the zinc_protein example use local, runnable data paths.
Linked Issues check | ✅ Passed | The PR fixes the reported repo-root-relative paths by switching to local example-relative paths and adds a guard test.
Out of Scope Changes check | ✅ Passed | The new test is directly related to the path-fix objective and no unrelated changes are evident.
@wanghan-iapcm wanghan-iapcm enabled auto-merge July 1, 2026 06:09
@wanghan-iapcm wanghan-iapcm requested review from njzjz and removed request for njzjz July 1, 2026 06:11
codecov Bot commented Jul 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.78%. Comparing base (0e5c170) to head (147c8bd).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5698      +/-   ##
==========================================
- Coverage   81.97%   81.78%   -0.20%     
==========================================
  Files         959      959              
  Lines      105748   105748              
  Branches     4102     4104       +2     
==========================================
- Hits        86684    86483     -201     
- Misses      17573    17771     +198     
- Partials     1491     1494       +3     

@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Jul 1, 2026
Development

Successfully merging this pull request may close these issues.

[Code scan] Use runnable local data paths in zinc_protein example

2 participants