Skip to content

Avoid duplicate on-chain HTLC claims after replay#4583

Open
joostjager wants to merge 14 commits into
lightningdevkit:mainfrom
joostjager:onchain-claim-replay-fixes
Open

Avoid duplicate on-chain HTLC claims after replay#4583
joostjager wants to merge 14 commits into
lightningdevkit:mainfrom
joostjager:onchain-claim-replay-fixes

Conversation

@joostjager

@joostjager joostjager commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

Fixes #4572

@ldk-reviews-bot

ldk-reviews-bot commented Apr 30, 2026

Copy link
Copy Markdown

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@codecov

codecov Bot commented Apr 30, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 94.66019% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.22%. Comparing base (0c7e6e7) to head (f73f75c).
⚠️ Report is 28 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/chain/channelmonitor.rs 79.00% 20 Missing and 1 partial ⚠️
lightning/src/chain/onchaintx.rs 99.58% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4583      +/-   ##
==========================================
- Coverage   87.16%   86.22%   -0.95%     
==========================================
  Files         161      156       -5     
  Lines      109251   108965     -286     
  Branches   109251   108965     -286     
==========================================
- Hits        95230    93950    -1280     
- Misses      11547    12402     +855     
- Partials     2474     2613     +139     
Flag Coverage Δ
fuzzing-fake-hashes ?
fuzzing-real-hashes ?
tests 86.22% <94.66%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@joostjager joostjager self-assigned this Apr 30, 2026
@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch 2 times, most recently from 531576d to 9e0f886 Compare May 1, 2026 11:06
@joostjager joostjager marked this pull request as ready for review May 1, 2026 11:22
@joostjager joostjager changed the title Onchain claim replay fixes Avoid duplicate on-chain HTLC claims after replay May 1, 2026
@joostjager joostjager removed the request for review from valentinewallace May 1, 2026 11:24
@ldk-claude-review-bot

ldk-claude-review-bot commented May 1, 2026

Copy link
Copy Markdown
Collaborator

I've re-reviewed the entire diff, focusing on areas not covered in my prior pass. I traced the index-only matching in htlc_output_resolution_on_chain (sound, since claim_htlcs! only runs for the confirmed counterparty commitment and resolved-HTLC indices belong to that same commitment), verified the removal of the CSV arm from confirmation_threshold (the separate MaturingOutput still maxes ANTI_REORG_DELAY with CSV, so the spendable output and balance accounting remain correct), and checked both call sites of get_broadcasted_holder_htlc_descriptors (skipping resolved HTLCs is correct in both). I also confirmed outpoint_claim_state's three-level priority and the reorg-resurrection path interaction.

No issues found.

No new line-specific issues to report. The dedup layering (ChannelMonitor htlcs_resolved_on_chain filtering + OnchainTxHandler outpoint_claim_state guard), the ClaimId canonicalization, the HTLCSpendConfirmation/MaturingOutput separation, and the new tests all hold up under scrutiny.

One non-blocking observation (not a bug, no inline comment): htlc_output_resolution_on_chain matches resolved HTLCs purely by commitment_tx_output_idx without comparing the commitment txid. This is safe given the current call paths (only the single confirmed commitment's HTLCs are ever in scope, and reorgs remove stale htlcs_resolved_on_chain entries), but it is a fragile invariant — a future caller that iterates HTLCs from a different commitment than the one that confirmed could get false-positive suppression.

Comment thread lightning/src/chain/onchaintx.rs Outdated
requests.retain(|req| {
let outpoint = req.outpoint();
if self.claimable_outpoints.get(outpoint).is_some() {
if self.is_outpoint_spend_waiting_threshold_conf(outpoint) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this break for reorgs where a counterparty's transaction gets unconfirmed? ie if we have a event awaiting confirmations for the counterparty claiming an outpoint, but then that gets reorg'd out and not replayed, do we still manage to broadcast our own claim (and RBF it)?

I assume that this case is only reachable if we receive a preimage after a counterparty's timeout claim is confirming?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think on a reorg the item in onchain_events_awaiting_threshold_conf is revived again? Basically the claim is passed around to different data structures, but never lost?

Comment thread lightning/src/chain/channelmonitor.rs Outdated

fn is_htlc_output_spent_on_chain(&self, htlc: &HTLCOutputInCommitment) -> bool {
if let Some(transaction_output_index) = htlc.transaction_output_index {
// This is a monitor-level HTLC generation filter. OnchainTxHandler

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why exactly do we need the duplicate?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It stems from OnchainTxHandler not persisting state

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The duplication covers a restart/replay gap after finality.

OnchainTxHandler persists active claim state, but drops it once the claim reaches anti-reorg finality. It does not keep an ever-growing resolved-outpoint tombstone set.

After that cleanup, a replayed preimage update can look like a fresh claim request to OnchainTxHandler. ChannelMonitor does persist final HTLC resolution in htlcs_resolved_on_chain, so this filter uses that state to suppress only already-resolved HTLC outpoints.

Comment thread lightning/src/chain/channelmonitor.rs Outdated
// still guards package state for outpoints split out by confirmed
// spends; here we avoid recreating HTLC claim requests once the
// monitor has observed resolution.
self.onchain_events_awaiting_threshold_conf.iter().any(|entry| match entry.event {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as the previous commit - does this break broadcasting our conflicting claim on reorg?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, here indeed a claim is never submitted and also cannot be resurrected.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now only checking for deeply confirmed outpoints, but need to look at the consequences of that.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I would imagine any issues we have with deeply-confirmed would apply to less-confirmed as well, only more likely.

@joostjager joostjager May 4, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added resolve HTLC spends at anti-reorg finality. Previously, there was a gap between anti-reorg finality and CSV maturity where the monitor could resubmit a claim to the onchain handler. The commit closes that by adding HTLC spends to htlcs_resolved_on_chain at anti-reorg finality.

That should not affect balance reporting, since any CSV-delayed output remains tracked separately. Before anti-reorg finality, the claim still goes to the onchain handler, so its reorg handling stays intact. After anti-reorg finality, replayed claims are suppressed by the monitor.

Let me know what you think.

@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch 2 times, most recently from 37d6b77 to 1c3b725 Compare May 1, 2026 18:26
joostjager added 5 commits May 4, 2026 09:36
Have ChannelMonitor hand singular ClaimRequests to OnchainTxHandler.

Convert them to PackageTemplates only after duplicate filtering.

This makes the single-outpoint invariant explicit at that boundary.
Clarify ChannelMonitor comments around on-chain event thresholds.
Some events only wait for anti-reorg finality, while CSV-delayed
outputs wait until spendable through the same threshold queue.
Move repeated OnchainTxHandler setup into shared test helpers so the
claim-replay coverage can focus on the behavior under test.
Add a monitor test for an inbound HTLC claimed by preimage from a
holder commitment. Confirm that the claimable balance remains unchanged
after the HTLC-success spend reaches anti-reorg finality but before the
CSV-delayed output is spendable.
Treat HTLCSpendConfirmation entries as irrevocably resolved once
the commitment HTLC output spend reaches anti-reorg finality. Do
not wait for CSV maturity of any delayed output created by that
spend.

Delayed outputs remain tracked separately as MaturingOutput entries,
keeping claimable balances alive until they are CSV-mature and can be
surfaced as SpendableOutputs.
@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch 2 times, most recently from a3fbe2d to 3196617 Compare May 4, 2026 10:44
@joostjager joostjager requested a review from TheBlueMatt May 5, 2026 12:46
OnchainEvent::HTLCSpendConfirmation { on_to_local_output_csv: Some(csv), .. } => {
// A CSV'd transaction is confirmable in block (input height) + CSV delay, which means
// it's broadcastable when we see the previous block.
OnchainEvent::FundingSpendConfirmation { on_local_output_csv: Some(csv), .. } => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the same apply to FundingSpendConfirmation? Maybe its worth asserting this exists when pushing HTLCSpendConfirmations (might have to move self.check_tx_and_push_spendable_outputs above self.is_resolving_htlc_output)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered making the change for FundingSpendConfirmation too, but it breaks more things. Balance calc needs to be updated, but also the changed behavior permeates up and leads to problems. I don't think it is needed functionally because there's no preimage claim coming in that triggers a sweep?

I did add a fixup for the assert you describe, that indeed also swaps those two calls).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmmm, it would be really nice if we made this behavior consistent, though? Its strange that we are now so inconsistent. Also, did you check through git history to find out why we do this? Will this implicitly break downgrade to some (potentially-ancient) version?

@joostjager joostjager May 22, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think FundingSpendConfirmation got CSV-delayed because it was used to drive claimable balance visibility for the holder commitment to-self output. HTLCSpendConfirmation later copied that same threshold behavior, but it probably didn't need to because the delayed HTLC-success output is already tracked separately as a MaturingOutput, which this PR makes use of.

If starting from scratch, it seems better to be more clear about the distinction between confirmed and spendable. But for this PR, I wouldn't want to extend the scope to include that.

For downgrades, I think get_htlc_balance will keep working because it gives precedence to the MaturingOutput event.

Comment thread lightning/src/chain/channelmonitor.rs Outdated
if htlc.offered && htlc.payment_hash == matching_payment_hash {
if htlc.offered
&& htlc.payment_hash == matching_payment_hash
&& !self.is_htlc_output_resolved_on_chain(htlc)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we at least log a huge "we lost money" error here (same elsewhere)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added logging fixup, but not completely certain it's now in the exact right places.

Comment thread lightning/src/chain/onchaintx.rs Outdated
}

fn is_outpoint_spend_waiting_threshold_conf(&self, outpoint: &BitcoinOutPoint) -> bool {
self.onchain_events_awaiting_threshold_conf.iter().any(|entry| {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this look at the timelocked set too to simplify the callsite?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added fixup that refactors to outpoint_claim_state helper

@ldk-reviews-bot

Copy link
Copy Markdown

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

Check that any HTLCSpendConfirmation carrying a local-output CSV
has a matching delayed MaturingOutput. Scan spendable outputs before
recording HTLC spend confirmations so the invariant is present when
the assertion runs.
@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch from 3196617 to f73f75c Compare May 6, 2026 18:09
@joostjager joostjager requested a review from TheBlueMatt May 7, 2026 12:44
@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 3rd Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 6th Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt TheBlueMatt removed their request for review May 21, 2026 23:23
@TheBlueMatt

Copy link
Copy Markdown
Collaborator

Waiting for a response on #4583 (comment) and @wpaulino's review.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 2nd Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 3rd Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 4th Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 5th Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 6th Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 7th Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 8th Reminder

Hey @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Comment thread lightning/src/chain/channelmonitor.rs Outdated
&& match &entry.event {
OnchainEvent::MaturingOutput {
descriptor: SpendableOutputDescriptor::DelayedPaymentOutput(descriptor),
} => descriptor.to_self_delay == csv,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this should probably also match on the HTLC output index

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in f: pin HTLC spend assert to the delayed output

assert!(tx_handler.claimable_outpoints.is_empty());

// If the spend reorgs out, the contentious outpoint is resurrected into
// the delayed package.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: may be worth checking that replaying the original claim requests after the reorg doesn't result in duplicates as well

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in "f: cover claim replay after reorg resurrection"

Aggregated HTLC spends create one delayed output per HTLC input, all
sharing the same transaction and CSV delay, so txid and CSV alone do
not identify the output. Match on the output index paired with the
spending input as well.

AI tools were used in preparing this commit.
A replayed holder HTLC claim may arrive as a single-outpoint
request after earlier requests were merged into a delayed package.
Check whether an existing delayed package already covers the new
request instead of requiring exact outpoint-set equality.

Add focused OnchainTxHandler coverage and a ChannelMonitor regression
through claim_funds for both current anchor variants.
When a transaction spends one outpoint from a delayed package, the
split outpoint is tracked as a ContentiousOutpoint until the spend
reaches anti-reorg finality. Reject replayed claim requests for those
pending-spent outpoints so they are not added back before the spend
reaches anti-reorg finality or reorgs out.

Add an OnchainTxHandler regression that replays a holder claim during
that pending-spent window and verifies reorg resurrection still works.
Classify duplicate outpoint state in one helper.

Preserve existing filter ordering and timelock logging.
Replay the original claim requests once the contentious spend reorgs
out and verify the resurrected delayed package is not duplicated.

AI tools were used in preparing this commit.
Filter regenerated HTLC claim requests once ChannelMonitor has persisted
anti-reorg finality for the commitment HTLC output spend.

This keeps replayed preimage updates from recreating claims after
OnchainTxHandler has cleaned up its active retry state, relying on the
monitor's persisted HTLC resolution state.
Log when a replayed preimage claim is skipped because the
HTLC output reached anti-reorg finality without that preimage.
Hash HTLC claim outpoints in canonical order so the same logical HTLC
set produces the same ClaimId regardless of descriptor order.

Add a unit test covering reversed descriptor order.
@joostjager joostjager force-pushed the onchain-claim-replay-fixes branch from f73f75c to 977444d Compare June 10, 2026 13:07
@joostjager joostjager requested a review from wpaulino June 10, 2026 13:09
@wpaulino

Copy link
Copy Markdown
Contributor

LGTM after squash

@joostjager joostjager requested a review from TheBlueMatt June 11, 2026 05:33
@joostjager

Copy link
Copy Markdown
Contributor Author

Regarding the back port question: I do wonder if without this PR there can be zombie claims that never disappear. But even if that is a possibility, it might not be critical enough. Unless it does something like using up wallet utxos for bumping.

@wpaulino

Copy link
Copy Markdown
Contributor

Good point, using up wallet UTXOs would be annoying. AFAICT though this PR doesn't clean up zombie claims, it merely prevents them?

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

1 similar comment
@ldk-reviews-bot

Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

1 similar comment
@ldk-reviews-bot

Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Duplicate Delayed Holder HTLC Claim Replay After Force-Close

5 participants