Add get_db_overview for fast, lightweight run listing #8266

Merged
astafan8 merged 4 commits into microsoft:main from astafan8:astafan8/dataset-db-overview
Jul 3, 2026

@astafan8 astafan8 commented Jul 3, 2026

Summary

Adds get_db_overview, a fast, lightweight way to list the runs in a QCoDeS
database. It issues a single JOIN query against the runs and experiments
tables to collect run metadata (experiment/sample names, timestamps, record
counts, guids) without instantiating a DataSet object per run.

This avoids the expensive experiments() + data_sets() enumeration, so
listing a database with many thousands of runs becomes near-instant instead of
taking minutes. It is intended for tools that display a table of runs (dataset
browsers, e.g. plottr's inspectr).
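
The single-JOIN idea can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the code in this PR: the two-table schema below is a heavily simplified assumption (the real QCoDeS tables carry many more columns), and `list_runs` and the `experiment`/`sample` keys are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE experiments (
        exp_id INTEGER PRIMARY KEY, name TEXT, sample_name TEXT
    );
    CREATE TABLE runs (
        run_id INTEGER PRIMARY KEY,
        exp_id INTEGER REFERENCES experiments(exp_id),
        name TEXT,
        guid TEXT
    );
    INSERT INTO experiments VALUES (1, 'cooldown', 'chip_A');
    INSERT INTO runs VALUES
        (1, 1, 'gate_sweep', 'guid-0001'),
        (2, 1, 'bias_sweep', 'guid-0002');
    """
)


def list_runs(conn):
    """Return one plain dict per run from a single JOIN query.

    No per-run object is constructed; each row is copied into a dict.
    """
    rows = conn.execute(
        """
        SELECT r.run_id, r.name, r.guid,
               e.name AS experiment, e.sample_name AS sample
        FROM runs AS r
        JOIN experiments AS e ON e.exp_id = r.exp_id
        ORDER BY r.run_id
        """
    ).fetchall()
    return {row["run_id"]: dict(row) for row in rows}


overview = list_runs(conn)
```

The cost is one query and one row copy per run, which is why this kind of listing stays cheap as the number of runs grows.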

What's included

  • New module qcodes/dataset/sqlite/db_overview.py with:
    • get_db_overview(path_to_db=None, *, conn=None, start_run_id=0, extra_columns=None)
    • RunOverviewDict — the TypedDict describing each run's overview entry
  • Both are exported on the public qcodes.dataset namespace (added to
    qcodes/dataset/__init__.py __all__).
  • Tests in tests/dataset/test_db_overview.py.

Notes / design choices

  • records is a best-effort data-point count: for completed runs the run
    description shapes are preferred (authoritative final count), falling back
    to the results-table row count, and otherwise reported as 0 (unknown). For
    in-progress runs the live results-table row count is preferred.
  • start_run_id supports cheap incremental refresh (only fetch runs newer than
    a known run_id).
  • extra_columns lets callers pull ad-hoc metadata columns added via
    DataSet.add_metadata (e.g. plottr's inspectr_tag) into the overview.
    Columns that don't exist in the runs table are silently skipped.
  • Either path_to_db (opened read-only, then closed) or an existing conn
    (left open) may be supplied.

Origin

This was prototyped and battle-tested in plottr and is being upstreamed so it
can be shared. A companion plottr PR switches to this implementation when
available and keeps a local fallback for older QCoDeS versions.

Status

Draft — opening for early feedback on the public API shape (function name,
argument names, and the RunOverviewDict key names). Happy to rename
experiment/sample keys or adjust the records heuristic based on review.


Note

This pull request was created by an agent on behalf of @astafan8.

Mikhail Astafev and others added 2 commits July 3, 2026 12:25
Add `qcodes.dataset.sqlite.db_overview.get_db_overview`, which fetches a
lightweight overview of the runs in a database via a single JOIN query on the
`runs` and `experiments` tables, without instantiating a DataSet per run.
This makes listing databases with many thousands of runs near-instant.

The function and its return type `RunOverviewDict` are exported on the public
`qcodes.dataset` namespace. An `extra_columns` argument allows reading
ad-hoc metadata columns added via `DataSet.add_metadata`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

codecov Bot commented Jul 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.12%. Comparing base (ee06e5c) to head (a4b7885).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8266      +/-   ##
==========================================
+ Coverage   71.04%   71.12%   +0.08%     
==========================================
  Files         304      305       +1     
  Lines       31911    32001      +90     
==========================================
+ Hits        22670    22760      +90     
  Misses       9241     9241              

astafan8 commented Jul 3, 2026

Companion plottr PR that consumes this API (with a local fallback for older QCoDeS): toolsforexperiments/plottr#462

Comment thread src/qcodes/dataset/sqlite/db_overview.py Outdated
Comment thread src/qcodes/dataset/sqlite/db_overview.py
Comment thread src/qcodes/dataset/sqlite/db_overview.py
Comment thread src/qcodes/dataset/sqlite/db_overview.py
Comment thread src/qcodes/dataset/sqlite/db_overview.py Outdated
Comment thread src/qcodes/dataset/sqlite/db_overview.py Outdated
…fallback

- Make conn/start_run_id/extra_columns keyword-only.
- Remove the leftover schema-stability note from the module docstring.
- Catch sqlite3.Error rather than bare Exception around the queries.
- Document why the extra_columns dict update needs a type: ignore.
- Drop the result_counter fallback for the record count: result_counter is the
  run's ordinal within its experiment, not a data-point count, so it produced a
  misleading number; report 0 (unknown) instead.
- Add tests for the in-progress, missing-results-table and query-error branches.
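
For illustration, the record-count fallback after this change might look roughly like the sketch below. It is not the code in this commit: `count_rows` and `estimate_records` are hypothetical helpers, the `shapes` mapping (parameter name to shape tuple) and the way a count is derived from it are assumptions, and falling back to the shapes for an in-progress run is likewise assumed.

```python
import math
import sqlite3


def count_rows(conn, table_name):
    """Row count of a results table, or None if it is missing or unreadable."""
    try:
        cursor = conn.execute(f'SELECT COUNT(*) FROM "{table_name}"')
        return cursor.fetchone()[0]
    except sqlite3.Error:
        # Narrow catch: only database errors are treated as "unknown".
        return None


def estimate_records(completed, shapes, row_count):
    """Pick a best-effort data-point count; 0 means unknown."""
    # Assumption: the shape-based count is the summed size of each shape.
    from_shapes = (
        sum(math.prod(shape) for shape in shapes.values()) if shapes else None
    )
    if completed:
        candidates = (from_shapes, row_count)  # shapes first for finished runs
    else:
        candidates = (row_count, from_shapes)  # live row count first otherwise
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return 0  # no result_counter fallback: report unknown instead


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "results-1-1" (x REAL, y REAL);
    INSERT INTO "results-1-1" VALUES (0, 1), (1, 2), (2, 3);
    """
)
live_rows = count_rows(conn, "results-1-1")
missing_rows = count_rows(conn, "results-9-9")
```

Returning 0 when nothing reliable is available keeps the column numeric while avoiding a plausible-looking but wrong count.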

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread docs/changes/newsfragments/8266.new Outdated
Comment thread tests/dataset/test_db_overview.py Outdated
Comment thread tests/dataset/test_db_overview.py Outdated
Comment thread tests/dataset/test_db_overview.py Outdated
…onfig

- Document in the docstring and newsfragment that snapshots are not read (they
  can be large and slow) and that the record count may be less precise than
  DataSet.number_of_results.
- Rewrite the tests to build the database via an explicit connection object
  (load_or_create_experiment(conn=...)/new_data_set(conn=...)) so the global
  QCoDeS config is not mutated, and close the connection via request.addfinalizer
  so it is cleaned up even if a test fails.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@astafan8 astafan8 marked this pull request as ready for review July 3, 2026 13:39
@astafan8 astafan8 requested a review from a team as a code owner July 3, 2026 13:39
@astafan8 astafan8 enabled auto-merge July 3, 2026 14:13
@astafan8 astafan8 added this pull request to the merge queue Jul 3, 2026
Merged via the queue into microsoft:main with commit 17f088b Jul 3, 2026
17 checks passed