fix(tf): accept standard checkpoint inputs in change-bias #5702

Queued
wanghan-iapcm wants to merge 1 commit into deepmodeling:master from wanghan-iapcm:fix-tf-change-bias-ckpt-input

Conversation

@wanghan-iapcm

@wanghan-iapcm wanghan-iapcm commented Jul 1, 2026

Collaborator

Problem

The TensorFlow change-bias dispatcher only accepts inputs ending in .pb, .pbtxt, .ckpt, .meta, .data, or .index, and rejects everything else with RuntimeError("The model provided must be a checkpoint file or frozen model file (.pb)").

However, the CLI examples and the frozen-model fallback message recommend passing a checkpoint directory, and real TensorFlow checkpoint prefixes commonly look like model.ckpt-1000. A checkpoint directory has no recognized suffix, and a model.ckpt-1000 prefix ends in -1000, so both are rejected before _change_bias_checkpoint_file() — which already resolves the checkpoint via the checkpoint state file — can run.
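The rejection described above can be reproduced with a plain suffix test. The following is a minimal sketch, not the dispatcher's actual code: the suffix list is quoted from the description, while `RECOGNIZED_SUFFIXES` and `has_recognized_suffix` are hypothetical names introduced only for illustration.

```python
# Suffixes quoted from the PR description. The constant and helper names
# are hypothetical and do not come from deepmd/tf/entrypoints/change_bias.py.
RECOGNIZED_SUFFIXES = (".pb", ".pbtxt", ".ckpt", ".meta", ".data", ".index")


def has_recognized_suffix(input_file: str) -> bool:
    """Return True if the path ends with one of the recognized suffixes."""
    # str.endswith accepts a tuple and checks each candidate in turn.
    return input_file.endswith(RECOGNIZED_SUFFIXES)


# A step-suffixed prefix and a bare directory name both fail a pure
# suffix check, which is why they never reached the checkpoint handler.
for path in ("frozen_model.pb", "model.ckpt", "model.ckpt-1000", "checkpoint_dir"):
    print(f"{path!r}: recognized={has_recognized_suffix(path)}")
```

Only the first two paths pass the check, so a suffix test alone cannot accept the inputs that the CLI examples recommend.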

Fix

Route suffix-less inputs to the checkpoint handler when a TensorFlow checkpoint state file is present in the effective directory (the input itself if it is a directory, otherwise its parent). Inputs without such a state file still raise the original error. _change_bias_checkpoint_file() now also resolves checkpoint_dir correctly when the input is a directory.
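A rough sketch of the fallback condition described above, under stated assumptions: the state file is taken to be the file named `checkpoint` that TensorFlow writes next to its checkpoints, and the helper names `effective_checkpoint_dir` and `looks_like_checkpoint_input` are hypothetical, not the names used in the patch.

```python
import os
import tempfile

# TensorFlow names its checkpoint state file "checkpoint".
CHECKPOINT_STATE_FILE = "checkpoint"


def effective_checkpoint_dir(input_path: str) -> str:
    """Return the input itself if it is a directory, otherwise its parent."""
    if os.path.isdir(input_path):
        return input_path
    # A bare prefix such as "model.ckpt-1000" has an empty dirname,
    # so fall back to the current directory.
    return os.path.dirname(input_path) or "."


def looks_like_checkpoint_input(input_path: str) -> bool:
    """Return True if a checkpoint state file sits in the effective directory."""
    state_file = os.path.join(effective_checkpoint_dir(input_path), CHECKPOINT_STATE_FILE)
    return os.path.isfile(state_file)


with tempfile.TemporaryDirectory() as ckpt_dir, tempfile.TemporaryDirectory() as empty_dir:
    # Simulate a checkpoint directory by creating an empty state file.
    open(os.path.join(ckpt_dir, CHECKPOINT_STATE_FILE), "w").close()

    print(looks_like_checkpoint_input(ckpt_dir))                                   # directory input
    print(looks_like_checkpoint_input(os.path.join(ckpt_dir, "model.ckpt-1000")))  # prefix input
    print(looks_like_checkpoint_input(os.path.join(empty_dir, "model.xyz")))       # no state file
```

Keying the fallback on the state file rather than on the shape of the name keeps arbitrary suffix-less paths on the original error path.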

Test

source/tests/tf/test_change_bias.py previously exercised only .pb, .ckpt, and bad-suffix inputs, never a bare directory or a ckpt-<step> prefix. Two new tests mock _change_bias_checkpoint_file and assert that it is invoked for a model.ckpt-1000 prefix and for a checkpoint directory, in each case with a checkpoint state file present in the effective directory. On the current dispatcher both inputs raise the "checkpoint file or frozen model file" RuntimeError before reaching the handler; after the fix both route correctly. The existing rejection tests (e.g. a model.xyz file with no checkpoint state file) still raise.
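The testing approach can be illustrated with a self-contained sketch. It does not import deepmd: `toy`, `_dispatch`, and `checkpoint_handler` are hypothetical stand-ins for the real module, `change_bias()`, and `_change_bias_checkpoint_file()`, and the toy dispatcher implements only the fallback branch. Only the error message is taken from the description.

```python
import os
import tempfile
import types
import unittest
from unittest import mock

# Hypothetical stand-in for the real module; not deepmd code.
toy = types.SimpleNamespace()


def _real_handler(input_path):
    raise AssertionError("the handler should be mocked in these tests")


def _dispatch(input_path):
    # Effective directory: the input itself if it is a directory, else its parent.
    directory = input_path if os.path.isdir(input_path) else os.path.dirname(input_path)
    if os.path.isfile(os.path.join(directory, "checkpoint")):
        # Look the handler up at call time so that mock.patch.object takes effect.
        return toy.checkpoint_handler(input_path)
    raise RuntimeError(
        "The model provided must be a checkpoint file or frozen model file (.pb)"
    )


toy.checkpoint_handler = _real_handler
toy.dispatch = _dispatch


class DispatchTest(unittest.TestCase):
    def setUp(self):
        self.tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self.tmp.cleanup)

    def _write_state_file(self):
        with open(os.path.join(self.tmp.name, "checkpoint"), "w") as f:
            f.write('model_checkpoint_path: "model.ckpt-1000"\n')

    def test_prefix_routes_to_handler(self):
        self._write_state_file()
        prefix = os.path.join(self.tmp.name, "model.ckpt-1000")
        with mock.patch.object(toy, "checkpoint_handler") as handler:
            toy.dispatch(prefix)
        handler.assert_called_once_with(prefix)

    def test_directory_routes_to_handler(self):
        self._write_state_file()
        with mock.patch.object(toy, "checkpoint_handler") as handler:
            toy.dispatch(self.tmp.name)
        handler.assert_called_once_with(self.tmp.name)

    def test_without_state_file_raises(self):
        with self.assertRaises(RuntimeError):
            toy.dispatch(os.path.join(self.tmp.name, "model.xyz"))


def run_tests():
    """Run the sketch's tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DispatchTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Mocking the handler keeps the tests focused on routing: no real checkpoint needs to be loaded, and a single state file on disk is enough to exercise the new branch.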

Fix #5683

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of checkpoint inputs so bias changes now work reliably for both checkpoint file prefixes and checkpoint directories.
    • Unrecognized input paths are now treated as checkpoint-based inputs only when a valid checkpoint state is present, reducing incorrect routing.
  • Tests

    • Added coverage for checkpoint prefix and directory cases to verify the correct bias-change path is used.

The change-bias dispatcher only routed inputs ending in .pb/.pbtxt/.ckpt/
.meta/.data/.index and rejected everything else with "must be a checkpoint
file or frozen model file (.pb)". Standard TensorFlow checkpoint prefixes such
as model.ckpt-1000 carry no recognized suffix, and checkpoint directories have
none either, so the inputs recommended by the CLI docs and the frozen-model
fallback message were rejected before _change_bias_checkpoint_file (which
already reads the checkpoint state file) could run.

Route directory and suffix-less prefix inputs to the checkpoint handler when a
TensorFlow "checkpoint" state file is present in the effective directory, and
make _change_bias_checkpoint_file resolve checkpoint_dir for a directory input.
Add dispatch tests for a step-suffixed prefix and a checkpoint directory.

Fix deepmodeling#5683
@dosubot dosubot Bot added the bug label Jul 1, 2026
@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b40f37b4-02a1-4cc7-89aa-c93d0f65950d

📥 Commits

Reviewing files that changed from the base of the PR and between b22a3c3 and 4027229.

📒 Files selected for processing (2)
  • deepmd/tf/entrypoints/change_bias.py
  • source/tests/tf/test_change_bias.py

📝 Walkthrough

The change_bias() entrypoint now recognizes checkpoint directories and checkpoint prefixes with step suffixes (e.g., model.ckpt-1000) by checking for a TensorFlow checkpoint state file, routing such inputs to _change_bias_checkpoint_file(). That helper's checkpoint_dir derivation was also updated to handle directory inputs. Corresponding unit tests were added.
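For context on what the checkpoint state file holds: it is a small text file named `checkpoint` whose `model_checkpoint_path` entry names the latest checkpoint prefix. TensorFlow provides `tf.train.latest_checkpoint` to read it; the sketch below is a TensorFlow-free approximation for illustration only. `read_latest_checkpoint_prefix` is a hypothetical helper, it ignores escaping rules in the file format, and it is not how `_change_bias_checkpoint_file()` is implemented.

```python
import os
import re
import tempfile
from typing import Optional


def read_latest_checkpoint_prefix(checkpoint_dir: str) -> Optional[str]:
    """Return the latest checkpoint prefix recorded in the state file, if any."""
    state_file = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None
    with open(state_file) as f:
        for line in f:
            # Match only the "model_checkpoint_path" entry; lines starting
            # with "all_model_checkpoint_paths" do not match this pattern.
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"\s*$', line)
            if match:
                path = match.group(1)
                # Relative entries are resolved against the checkpoint directory.
                return path if os.path.isabs(path) else os.path.join(checkpoint_dir, path)
    return None


with tempfile.TemporaryDirectory() as ckpt_dir:
    with open(os.path.join(ckpt_dir, "checkpoint"), "w") as f:
        f.write('model_checkpoint_path: "model.ckpt-1000"\n')
        f.write('all_model_checkpoint_paths: "model.ckpt-1000"\n')
    print(read_latest_checkpoint_prefix(ckpt_dir))
```

Because the state file already records the latest prefix, a directory input carries enough information for the handler to locate the checkpoint once checkpoint_dir is derived correctly.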

Changes

Checkpoint Input Handling

| Layer / File(s) | Summary |
| --- | --- |
| Fallback routing for checkpoint directories and prefixes (deepmd/tf/entrypoints/change_bias.py) | Adds a fallback branch for unrecognized INPUT suffixes that checks for a checkpoint state file and routes to _change_bias_checkpoint_file(); updates checkpoint_dir derivation to support both directory and prefix inputs. |
| Tests for new checkpoint routing (source/tests/tf/test_change_bias.py) | Imports the module via importlib to patch internal helpers, and adds tests verifying checkpoint prefix and checkpoint directory inputs are routed to _change_bias_checkpoint_file. |

Estimated code review effort: 2 (Simple) | ~15 minutes

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant change_bias
  participant FileSystem
  participant ChangeBiasCheckpointFile as _change_bias_checkpoint_file

  User->>change_bias: INPUT (checkpoint dir or prefix)
  change_bias->>change_bias: check known file suffixes
  alt suffix unrecognized
    change_bias->>FileSystem: check checkpoint_dir/checkpoint exists
    FileSystem-->>change_bias: exists
    change_bias->>ChangeBiasCheckpointFile: forward INPUT and args
    ChangeBiasCheckpointFile->>ChangeBiasCheckpointFile: derive checkpoint_dir (dir or parent)
  else suffix recognized
    change_bias->>change_bias: existing dispatch logic
  end

Suggested labels: Python

Suggested reviewers: njzjz, iProzd

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main fix: accepting standard TensorFlow checkpoint inputs in change-bias. |
| Linked Issues check | ✅ Passed | The code now accepts checkpoint directories and prefixes when checkpoint state exists, matching the linked issue's requirements. |
| Out of Scope Changes check | ✅ Passed | The changes stay focused on change-bias checkpoint handling and its tests, with no obvious unrelated additions. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
@wanghan-iapcm wanghan-iapcm requested a review from njzjz July 1, 2026 08:11
@github-actions github-actions Bot added the Python label Jul 1, 2026
@codecov

codecov Bot commented Jul 1, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.70%. Comparing base (0e5c170) to head (4027229).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5702      +/-   ##
==========================================
- Coverage   81.97%   81.70%   -0.27%     
==========================================
  Files         959      966       +7     
  Lines      105748   106151     +403     
  Branches     4102     4139      +37     
==========================================
+ Hits        86684    86731      +47     
- Misses      17573    17901     +328     
- Partials     1491     1519      +28     

@njzjz-bot njzjz-bot left a comment

Contributor

Looks good to me. The fallback dispatch now accepts both checkpoint directories and standard TensorFlow checkpoint prefixes when a checkpoint state file is present, while preserving the existing handling for explicit checkpoint component files. The added tests cover both newly accepted input forms. CI is green.

Reviewed by OpenClaw 2026.6.8 (844f405)
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

@njzjz njzjz left a comment

Member

LGTM

@njzjz njzjz added this pull request to the merge queue Jul 1, 2026
Development

Successfully merging this pull request may close these issues.

[Code scan] Accept standard TensorFlow checkpoint inputs in change-bias

3 participants