[perf] Make PrettyPrinter format lazily so output can be budget-capped#14588

Draft: Pierre-Sassoulas wants to merge 3 commits into pytest-dev:main from Pierre-Sassoulas:pprint-lazy-budget

Pierre-Sassoulas (Member) commented on Jun 13, 2026

Refactor required prior to #14523.

_format and the per-type helpers now yield their output as a stream of string chunks instead of writing to a file-like object, and pformat joins them. On top of that, pformat_lines pulls from the formatter only until a budget is reached:

pformat_lines(obj, max_lines=None, max_chars=None)

It stops on the first chunk that reaches either budget, so a huge collection costs O(budget) rather than O(N). Either dimension may be None (unbounded); with both None the whole object is formatted.

Benchmark (PrettyPrinter alone, width 80)::

    list(range(500_000)):
        pformat().splitlines()        ~805 ms
        pformat_lines(max_lines=11)   ~0.027 ms      (~30000x)

    [8 small ints] (common small diff):
        pformat().splitlines()        ~0.0133 ms
        pformat_lines(max_lines=11)   ~0.0163 ms     (+~3 us)

    ["x"*100_000] * 3 (flat, few huge elements):
        pformat_lines(max_chars=640)  stops after ~100_000 chars
                                      (one element) instead of 300_000

Pierre-Sassoulas added the "skip news" label (used on PRs to opt out of the changelog requirement) on Jun 13, 2026
Pierre-Sassoulas marked this pull request as draft on June 13, 2026 at 16:30
…apped

``_format`` and the per-type helpers now ``yield`` their output as a
stream of string chunks instead of writing to a file-like object, and
``pformat`` joins them. On top of that, ``pformat_lines`` pulls from the
formatter only until a budget is reached::

    pformat_lines(obj, max_lines=None, max_chars=None)

It stops on the first chunk that reaches *either* budget, so a huge
collection costs O(budget) rather than O(N). Either dimension may be
``None`` (unbounded); with both ``None`` the whole object is formatted.
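
The idea above can be illustrated with a minimal sketch. It is not the code in this PR: ``format_chunks`` and ``pformat_lines_sketch`` are hypothetical names, the formatter only handles lists (one item per line), and the exact stopping and truncation rules are assumptions made for the illustration.

```python
from typing import Iterator, List, Optional


def format_chunks(obj: object, indent: int = 0) -> Iterator[str]:
    """Hypothetical lazy formatter: yields string chunks, one list item per line."""
    if isinstance(obj, list):
        yield "["
        for i, item in enumerate(obj):
            if i:
                # Separator chunk: comma, newline, then indentation.
                yield ",\n" + " " * (indent + 1)
            yield from format_chunks(item, indent + 1)
        yield "]"
    else:
        yield repr(obj)


def pformat_lines_sketch(
    obj: object,
    max_lines: Optional[int] = None,
    max_chars: Optional[int] = None,
) -> List[str]:
    """Pull chunks only until one of the budgets is reached (assumed semantics)."""
    parts: List[str] = []
    newlines = 0
    chars = 0
    for chunk in format_chunks(obj):
        parts.append(chunk)
        chars += len(chunk)
        newlines += chunk.count("\n")
        # Stop on the first chunk that reaches either budget.
        if max_lines is not None and newlines >= max_lines:
            break
        if max_chars is not None and chars >= max_chars:
            break
    lines = "".join(parts).splitlines()
    # Assumption: a trailing partial line is dropped; chars are not trimmed.
    return lines if max_lines is None else lines[:max_lines]
```

With ``max_lines=11`` the loop above consumes only the first dozen items of a 500_000-element list, because the generator is never advanced past the chunk that hits the budget.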

Motivation
----------
Assertion diffs are truncated to a handful of lines/chars before being
shown. Formatting the whole of a large ``==`` comparison and then
throwing almost all of it away is pure waste. With a lazy formatter the
truncating caller simply stops pulling once it has enough.

Benchmark (``PrettyPrinter`` alone, width 80)::

    list(range(500_000)):
        pformat().splitlines()        ~805 ms
        pformat_lines(max_lines=11)   ~0.027 ms      (~30000x)

    [8 small ints] (common small diff):
        pformat().splitlines()        ~0.0133 ms
        pformat_lines(max_lines=11)   ~0.0185 ms     (+~5 us)

    ["x"*100_000] * 3 (flat, few huge elements):
        pformat_lines(max_chars=640)  stops after ~100_000 chars
                                      (one element) instead of 300_000

Why a lazy generator rather than a fast path + budget stream
------------------------------------------------------------
An earlier approach kept a cheap ``pformat().splitlines()`` fast path
guarded by ``len(obj) <= max_lines`` plus a flatness check, falling back
to a write-intercepting budget-stream class for the rest. Two problems:

* ``len(obj)`` is only a *lower* bound on the line count — one nested
  element (``[{...50 keys...}]``) expands to many lines — so the guard
  needed the flatness scan to stay correct, and even then it bounded
  only *lines*, never *chars*: a flat container of a few enormous
  strings has almost no lines but blows the char budget.
* it was two code paths plus a stream class plus an exception used for
  control flow.
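
Both failure modes of the ``len(obj)`` guard can be reproduced with the standard-library ``pprint``, used here only as a stand-in for the vendored printer (the variable names are illustrative):

```python
import pprint

# One nested element: len() is 1, but the formatted output spans many lines.
nested = [{f"key_{i}": i for i in range(50)}]
nested_lines = pprint.pformat(nested, width=80).splitlines()

# Flat container of a few enormous strings: few lines, huge char count.
huge = ["x" * 100_000] * 3
huge_text = pprint.pformat(huge, width=80)
huge_lines = huge_text.splitlines()

print(len(nested), len(nested_lines))
print(len(huge), len(huge_lines), len(huge_text))
```

In the first case a line budget based on ``len(obj)`` undercounts; in the second the line count looks harmless while the character count is far over any reasonable budget.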

Because the formatter is lazy, "stop pulling at the budget" is the whole
optimisation: correct regardless of how lines/chars are distributed
across elements, bounding both dimensions, with no ``len()`` proxy to
get wrong and no fast/slow branch. The common small-diff case costs only
~5 us more than the unbounded path (it is never the bottleneck — a
failing assertion isn't hot), while large comparisons drop by orders of
magnitude.

``_pprint_set``/``_pprint_dict`` also try a plain ``sorted`` first and
fall back to the ``_safe_key`` wrapper only for unorderable mixes.
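
A minimal sketch of that "try plain ``sorted``, then fall back" pattern follows. ``SafeKey`` and ``sorted_with_fallback`` are hypothetical names; the wrapper is modelled on the type-name-then-``id`` ordering that CPython's ``pprint`` uses for unorderable values, and may differ from the helper in this PR.

```python
from typing import Any, Iterable, List


class SafeKey:
    """Hypothetical sort key that tolerates unorderable mixes of types."""

    __slots__ = ("obj",)

    def __init__(self, obj: Any) -> None:
        self.obj = obj

    def __lt__(self, other: "SafeKey") -> bool:
        try:
            return self.obj < other.obj
        except TypeError:
            # Fall back to an arbitrary but consistent order: type name, then id.
            return (str(type(self.obj)), id(self.obj)) < (
                str(type(other.obj)),
                id(other.obj),
            )


def sorted_with_fallback(items: Iterable[Any]) -> List[Any]:
    """Use the cheap plain sort first; wrap keys only if it fails."""
    items = list(items)
    try:
        return sorted(items)
    except TypeError:
        return sorted(items, key=SafeKey)
```

The homogeneous case, which is by far the most common, pays no per-element wrapper cost; only a mix such as ``{1, "a", 2}`` takes the slower path.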

This diverges structurally from the upstream CPython ``pprint`` it was
vendored from; the module header notes it is no longer kept in sync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

In ``pformat_lines``'s budget loop, ``chunk.count("\n")`` ran on every
chunk, but most chunks (brackets, indentation, item reprs) contain no
newline. Guarding the call with ``"\n" in chunk`` skips it on those and
recovers part of the per-chunk budget-tracking overhead: formatting an
8-element list under a budget drops from ~0.0185 ms to ~0.0163 ms
(versus ~0.0132 ms for an uncapped ``pformat().splitlines()``, so the
budget overhead roughly halves, from ~+5 us to ~+3 us).
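
The guard can be sketched in isolation as below. The function names are hypothetical and the chunk list only imitates what a formatter might yield; the point is that both loops compute the same count, while the guarded one skips ``str.count`` for chunks without a newline.

```python
from typing import Iterable


def count_newlines_unguarded(chunks: Iterable[str]) -> int:
    total = 0
    for chunk in chunks:
        total += chunk.count("\n")  # called on every chunk
    return total


def count_newlines_guarded(chunks: Iterable[str]) -> int:
    total = 0
    for chunk in chunks:
        if "\n" in chunk:  # cheap membership test first
            total += chunk.count("\n")
    return total


# Illustrative chunk stream: mostly brackets and reprs, few separators.
sample_chunks = ["[", "1", ",\n ", "2", ",\n ", "3", "]"]
```

Any speed difference depends on the interpreter and the chunk mix, so it is worth measuring with ``timeit`` on representative input rather than assuming the figures quoted above carry over.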

The win is small and only matters on the ``-v`` truncating path of a
failing assertion (the default path doesn't format the diff at all), so
this is kept as a separate commit — easy to drop if the extra branch
isn't judged worth it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>