Add test for UpdateStatistics chaining set and remove by nmshr · Pull Request #3557 · apache/iceberg-python

nmshr · 2026-06-26T13:12:19Z

Context

The intention of this PR is to shed light on a potential issue.
It is meant as an inquiry to understand what the maintainers feel about this scenario.
This is perhaps an edge case, so feel free to prioritise it accordingly.

Observation

When executing update_statistics on a table, there are two methods which can be chained by design: set_statistics and remove_statistics.

set_statistics appends to _updates using += (line 56).
remove_statistics replaces _updates using = (line 65).

When the two are chained, a remove_statistics call discards every update queued by the preceding set_statistics calls, so those statistics are never committed. Without the required statistics files, the query engine may end up scanning more data.

See pyiceberg/table/update/statistics.py, lines 56 and 65.
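
To make the observation concrete, here is a minimal sketch of the append-versus-replace pattern described above. ToyUpdateStatistics is a hypothetical stand-in written for this illustration, not the pyiceberg class; its method names mirror the real ones, but the integer arguments and the tuple entries are assumptions made to keep the example self-contained.

```python
class ToyUpdateStatistics:
    """Hypothetical stand-in, NOT the pyiceberg implementation.

    It only mimics the two assignment styles described in the observation.
    """

    def __init__(self) -> None:
        self._updates: tuple = ()

    def set_statistics(self, snapshot_id: int) -> "ToyUpdateStatistics":
        # Appends with +=, so previously queued updates survive.
        self._updates += (("set", snapshot_id),)
        return self

    def remove_statistics(self, snapshot_id: int) -> "ToyUpdateStatistics":
        # Replaces with =, so anything queued before this call is lost.
        self._updates = (("remove", snapshot_id),)
        return self


chained = ToyUpdateStatistics().set_statistics(1).remove_statistics(2)
print(chained._updates)  # (('remove', 2),) -- the earlier "set" entry is gone
```

In this toy model, switching remove_statistics to += would keep both entries. Whether that is the right change for pyiceberg is for the maintainers to decide.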

AI Use Disclaimer

I used Claude Code to understand test coverage and discover improvement opportunities. I also used it to verify that the test adds value and will not waste the reviewers' time. I did not use Claude Code to write any code, commit messages, or descriptions; all writing, including this, is my own.

Anshul Mishra added 3 commits June 26, 2026 18:41

Add test for UpdateStatistics chaining set and remove

c374600

Mark test as xfail to document expected failure

35220e3

fixing typos

e55ab1f

nmshr wants to merge 3 commits into apache:main from nmshr:test_update_statistics_chains_set_and_remove

1 participant
