Background
Follow-up to #19. That issue/PR (#25) fixed the SDK's incorrect 50k cap so the query results endpoint can now return up to the instance's configured query.maxLimit (e.g. 100,000 on Cloud).
However, the interactive query results endpoint is itself capped at query.maxLimit. Data-extraction / pipeline workflows that need millions of rows (the use case that motivated #19) can't be served by that endpoint — the Lightdash UI uses a separate CSV/download export path for those, governed by query.csvCellsLimit / query.csvMaxLimit (default ceiling 5,000,000 cells).
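To make the cell-based ceiling concrete, here is a minimal sketch of the arithmetic. It assumes cells are counted as rows × columns, as the note on query.csvCellsLimit in this issue suggests. The function names and the `ExportTooLargeError` class are hypothetical and not part of the SDK or of Lightdash.

```python
class ExportTooLargeError(ValueError):
    """Hypothetical error for a result set that exceeds the cell limit."""


def max_rows_within_cell_limit(cells_limit: int, n_columns: int) -> int:
    """Largest row count whose rows * columns stays within cells_limit."""
    if n_columns <= 0:
        raise ValueError("n_columns must be positive")
    return cells_limit // n_columns


def ensure_within_cell_limit(n_rows: int, n_columns: int, cells_limit: int) -> None:
    """Raise instead of truncating when the export would exceed the limit."""
    cells = n_rows * n_columns
    if cells > cells_limit:
        raise ExportTooLargeError(
            f"{n_rows} rows x {n_columns} columns = {cells} cells "
            f"exceeds the limit of {cells_limit} cells"
        )
```

Under that assumption, a 20-column query against the default 5,000,000-cell ceiling could export at most 250,000 rows, so wide queries hit the limit well before "millions of rows".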
Proposed solution
Add an export-based fetch path to the SDK for result sets larger than query.maxLimit, e.g.:
result = query.to_df(via="export")  # or query.export() / result.download()
This would POST to the CSV/scheduler-style export endpoint, poll for the generated file, and stream it back (CSV → DataFrame), rather than paging the interactive results API.
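The poll-then-stream part of that flow could look roughly like the sketch below. It deliberately does not name any Lightdash endpoint, since those are still to be confirmed: the status check is passed in as a callable, and every name here (`poll_until_ready`, `iter_csv_records`, the "return a file URL or None" contract) is an assumption made for illustration.

```python
import csv
import time
from typing import Callable, Iterable, Iterator, Optional


def poll_until_ready(
    check_status: Callable[[], Optional[str]],
    interval_s: float = 1.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call check_status() until it returns a file URL; raise on timeout.

    check_status is assumed to return None while the export is still running.
    """
    deadline = clock() + timeout_s
    while True:
        file_url = check_status()
        if file_url is not None:
            return file_url
        if clock() >= deadline:
            raise TimeoutError("export did not finish before the timeout")
        sleep(interval_s)


def iter_csv_records(lines: Iterable[str]) -> Iterator[dict]:
    """Lazily parse CSV text lines into dicts keyed by the header row."""
    yield from csv.DictReader(lines)
```

In an SDK, `check_status` would wrap whatever job-status request the export API ends up requiring, and the downloaded file could go straight to `pandas.read_csv`, which accepts a file-like object, rather than through `iter_csv_records`. Injecting `sleep` and `clock` keeps the polling loop testable without real waiting.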
Notes / open questions
Confirm the exact endpoint(s): the download//csv or scheduler export API used by the UI's "Export" / "Download results".
Respect query.csvCellsLimit (rows × columns) and surface a clear error when exceeded, consistent with the no-silent-truncation behaviour added in #19 (Support fetching results beyond the 50k row limit).
API shape: to_df() / to_records() vs. a dedicated export() method.
References
query.maxLimit, query.csvCellsLimit, query.csvMaxLimit (see docs/export-limits.md in lightdash/lightdash)