Fix PBTree flush for negative child address#17955
Conversation
HTHou
left a comment
There was a problem hiding this comment.
Thanks for the fix. The core change makes sense to me: RecordUtils.node2Buffer(child) serializes the child segment address, so the updated-child path needs to assign a valid segment address before serializing a non-measurement child.
I have two things I think should be addressed or clarified before merging:
-
Please add a regression test for the exact path fixed here: an updated non-measurement child whose segment address is still negative, followed by
forceFlushAndRelease()/scheduleFlushAll(). This is a persistence/liveness bug and the PR currently has no test guarding against the negative address being serialized again. -
There is still an adjacent liveness hole in
scheduleFlushAll(): after a region is added toflushingRegionSet, the async task can return early whenstore == nullor!regionToStore.containsKey(regionId), before reachingexecuteFlush()and itsfinallyblock that removes the region fromflushingRegionSet. Since this PR is also trying to prevent force-flush helpers from looping forever, it would be good to either clearflushingRegionSetin an outer taskfinally, or explicitly leave a follow-up if this is intentionally out of scope.
| } | ||
|
|
||
| if (!child.isMeasurement() && getNodeAddress(child) < 0) { | ||
| if (child.isDevice() && child.getAsDeviceMNode().isUseTemplate()) { |
There was a problem hiding this comment.
Could you clarify why the updated-child path rejects using-template devices here? writeNewChildren() allocates a segment for non-measurement children without this check, while this path will now fail if an updated template device still has a negative address. If the invariant is that such nodes can never reach this branch, a regression test or a short comment would make that clear; otherwise this should probably follow the new-child allocation behavior and use the existing i18n message for any real template-child write rejection.
There was a problem hiding this comment.
Addressed in e35299c. I added a regression test for the updated non-measurement child with a negative segment address, and changed the updated-child path to follow writeNewChildren(): allocate a segment before serialization without adding an extra template-device rejection in this branch. I also fixed the scheduleFlushAll()/scheduleFlush early-return path so flushingRegionSet is cleared from an outer finally even when the store is already closed or removed.
| throw new MetadataException( | ||
| String.format( | ||
| "Adding or updating children of device using template [%s] is NOT allowed.", | ||
| child.getFullPath())); | ||
| } |
There was a problem hiding this comment.
Addressed in e35299c. I removed the hard-coded template-device exception from this updated-child allocation branch and aligned it with writeNewChildren(), so there is no non-i18n message here now. The real template-child write rejection remains in SchemaFile.writeMNode() with the existing i18n message.
Fix PBTree flush when an updated non-measurement child still has a negative segment address. The parent record now allocates a segment for that child before serialization, preventing later flush attempts from repeatedly failing with negative-address nodes. Also propagate failures from force flush so test helpers do not loop forever when a flush cannot make progress. Tests: .\mvnw.cmd -pl iotdb-core/datanode -DskipTests spotless:check passed. Attempted target UT with -am, but local build was blocked by generated thrift sources missing javax.annotation in iotdb-thrift.