Skip to content

fix: propagate frame-level data modifier failures#5709

Merged
njzjz merged 1 commit into
deepmodeling:masterfrom
wanghan-iapcm:fix-data-modifier-future-result
Jul 2, 2026
Merged

fix: propagate frame-level data modifier failures#5709
njzjz merged 1 commit into
deepmodeling:masterfrom
wanghan-iapcm:fix-data-modifier-future-result

Conversation

@wanghan-iapcm

@wanghan-iapcm wanghan-iapcm commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

Problem

DeepmdData.get_single_frame() runs the data modifier through a one-off ThreadPoolExecutor but never calls future.result(). The context manager waits for the submitted task to finish, but any exception raised inside modifier.modify_data(...) stays stored on the Future and is never surfaced. As a result, frame-level data loading silently ignores modifier failures: callers receive the unmodified frame, and when caching is enabled that bad frame can be cached for future use. Other code paths call the modifier directly and do propagate errors, so the behavior was inconsistent.

Fix

Call future.result() before leaving the modifier block so an exception raised inside the modifier propagates instead of being swallowed (and the unmodified frame cached).

Test

A new test constructs a DeepmdData with a modifier whose modify_data always raises, and asserts that get_single_frame re-raises the error and does not cache the failed frame. On the current code the error is swallowed (no exception, frame cached); after the fix the error propagates.

Fix #5690

Summary by CodeRabbit

  • Bug Fixes
    • Improved error handling when preparing a single frame so failures from data modifiers are no longer hidden.
    • Prevented invalid frames from being saved and reused after a modifier error.
  • Tests
    • Added coverage to verify modifier failures are reported correctly and do not populate the frame cache.

DeepmdData.get_single_frame ran the data modifier through a one-off
ThreadPoolExecutor but never called future.result(). The context manager waited
for the task to finish, but an exception raised inside modify_data stayed stored
on the Future and was never surfaced, so callers received the unmodified frame
and, with caching enabled, that bad frame could be cached for future use.

Call future.result() before leaving the modifier block so modifier exceptions
propagate. Add a test with a raising modifier asserting get_single_frame
re-raises and does not cache the failed frame.

Fix deepmodeling#5690
@wanghan-iapcm wanghan-iapcm requested a review from njzjz July 2, 2026 01:39
@dosubot dosubot Bot added the bug label Jul 2, 2026
@github-actions github-actions Bot added the Python label Jul 2, 2026
@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

DeepmdData.get_single_frame now calls future.result() after submitting the modifier's modify_data task in a ThreadPoolExecutor, propagating exceptions instead of silently ignoring them. A new unit test verifies the exception propagates and the failing frame is not cached.

Changes

Modifier error propagation

Layer / File(s) Summary
Force exception propagation from modifier future
deepmd/utils/data.py
get_single_frame now calls future.result() after submitting self.modifier.modify_data(...), ensuring exceptions raised in the modifier propagate instead of being silently dropped.
Test for modifier failure propagation
source/tests/common/test_deepmd_data.py
Adds _RaisingModifier and TestDeepmdDataModifierError to confirm ValueError from a failing modifier propagates and the frame is not stored in _modified_frame_cache.

Estimated code review effort: 1 (Trivial) | ~5 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: propagating frame-level data modifier failures.
Linked Issues check ✅ Passed The patch matches issue #5690 by surfacing modifier exceptions and adding a test that prevents caching failed frames.
Out of Scope Changes check ✅ Passed The changes stay focused on exception propagation and its regression test, with no unrelated scope added.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
deepmd/utils/data.py (1)

526-535: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Consider dropping the single-task ThreadPoolExecutor.

Only one task is submitted here, then immediately awaited via future.result() — the executor buys no parallelism, just thread-pool setup/teardown cost per get_single_frame call (a hot path via preload_and_modify_all_data_torch). Other modifier call sites in this class (get_test, _load_batch_set) call modify_data directly.

♻️ Proposed simplification
         if self.modifier is not None:
-            with ThreadPoolExecutor(max_workers=num_worker) as executor:
-                # Apply modifier if it exists
-                future = executor.submit(
-                    self.modifier.modify_data,
-                    frame_data,
-                    self,
-                )
-                # propagate any exception raised inside the modifier instead of
-                # silently returning (and caching) an unmodified frame
-                future.result()
+            # Apply modifier directly; exceptions propagate naturally
+            self.modifier.modify_data(frame_data, self)
             if self.use_modifier_cache:
                 # Cache the modified frame to avoid recomputation
                 self._modified_frame_cache[index] = copy.deepcopy(frame_data)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/utils/data.py` around lines 526 - 535, The single-task
ThreadPoolExecutor in get_single_frame adds setup/teardown overhead without any
parallelism because only one modify_data call is submitted and immediately
awaited. Simplify get_single_frame by calling self.modifier.modify_data
directly, matching the existing get_test and _load_batch_set call sites, while
still preserving the exception propagation behavior currently achieved by
future.result().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@deepmd/utils/data.py`:
- Around line 526-535: The single-task ThreadPoolExecutor in get_single_frame
adds setup/teardown overhead without any parallelism because only one
modify_data call is submitted and immediately awaited. Simplify get_single_frame
by calling self.modifier.modify_data directly, matching the existing get_test
and _load_batch_set call sites, while still preserving the exception propagation
behavior currently achieved by future.result().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f0b6ac40-63bc-4117-8027-3b12b765ca99

📥 Commits

Reviewing files that changed from the base of the PR and between 0f19772 and 3909d62.

📒 Files selected for processing (2)
  • deepmd/utils/data.py
  • source/tests/common/test_deepmd_data.py

@codecov

codecov Bot commented Jul 2, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.09%. Comparing base (0e5c170) to head (3909d62).
⚠️ Report is 14 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5709      +/-   ##
==========================================
- Coverage   81.97%   81.09%   -0.88%     
==========================================
  Files         959      980      +21     
  Lines      105748   109355    +3607     
  Branches     4102     4207     +105     
==========================================
+ Hits        86684    88681    +1997     
- Misses      17573    19147    +1574     
- Partials     1491     1527      +36     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@njzjz njzjz added this pull request to the merge queue Jul 2, 2026
Merged via the queue into deepmodeling:master with commit 4c94171 Jul 2, 2026
60 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Code scan] Propagate frame-level data modifier failures

2 participants