Better handling of unsupported types in args #34

Draft: mdbenito wants to merge 5 commits into LadybugDB:main from mdbenito:fix/handle-unsupported-args

@mdbenito

Contributor

Passing a dataframe or Arrow table to any clause other than the scanning ones (LOAD FROM / COPY ... FROM) results in lb dying with UNREACHABLE_CODE: we hit the end of a switch without handling LogicalTypeId::POINTER, which is the type pyLogicalType maps dataframes to. That type is only handled for LOAD / COPY ... FROM scan replacement.

This is not a big deal, but I think it's nicer to have an informative error.

This PR:

  • adds some validation inside the Python bindings before expression evaluation happens.
  • improves previous checking for "being in a scan query" to also catch LOAD WITH HEADERS.
  • adds a few tests (in a separate module since they span polars+pandas+arrow)
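
As a rough illustration of the first bullet, validation of this kind could look like the sketch below. It is not the code in this PR: `is_dataframe_like`, `check_parameters`, the `TypeError`, and the error text are hypothetical, and matching on module and class names is an assumption made so the check does not need to import pandas, polars, or pyarrow.

```python
from typing import Any, Mapping

# (top-level module, class name) pairs treated as tabular objects.
# Assumption for this sketch; the real bindings may detect them differently.
_TABULAR_TYPES = {
    ("pandas", "DataFrame"),
    ("polars", "DataFrame"),
    ("pyarrow", "Table"),
}


def is_dataframe_like(value: Any) -> bool:
    """Return True if the value's type looks like a pandas/polars/pyarrow table."""
    cls = type(value)
    root_module = cls.__module__.split(".")[0]
    return (root_module, cls.__name__) in _TABULAR_TYPES


def check_parameters(parameters: Mapping[str, Any], is_scan_query: bool) -> None:
    """Raise an informative error before the query reaches the engine."""
    if is_scan_query:
        return  # tabular objects are allowed for scan queries
    for name, value in parameters.items():
        if is_dataframe_like(value):
            raise TypeError(
                f"Parameter '{name}' is a {type(value).__name__}, which is only "
                "supported as a scan source in LOAD FROM / COPY ... FROM."
            )
```

Calling `check_parameters(params, is_scan_query=False)` with a dataframe among `params` would then fail in Python with a readable message instead of reaching the unhandled switch case.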

@mdbenito changed the title from "Avoid crashing on unsupported args" to "Better handling of unsupported types in args" Jun 25, 2026
Comment thread src_py/connection.py
    if not self._is_python_scan_object(value):
        msg = (
            "Binder exception: Trying to scan from unsupported data type "
            "INT8[]. The only parameter types that can be scanned from "

Contributor Author

This looked like a typo

Comment thread src_py/connection.py
 def _maybe_raise_scan_unsupported_object(query: str) -> None:
     match = re.search(
-        r"\bLOAD\s+FROM\s+([A-Za-z_][A-Za-z0-9_]*)\b", query, re.IGNORECASE
+        r"\b(LOAD|COPY)\b.*?\bFROM\s+([A-Za-z_][A-Za-z0-9_]*)\b",

Contributor Author

The regex should catch all of LOAD, LOAD WITH... and COPY FROM

@mdbenito marked this pull request as draft June 25, 2026 19:34
@adsharma

Contributor

The regexes could be replaced with something like this in the C API:

https://github.com/LadybugDB/ladybug/blob/main/src/c_api/prepared_statement.cpp#L37-L39

That's preferable to generating Python bindings from Cypher.g4 and using them like so:

# After parsing, check the statement context type
if isinstance(statement_context, IC_CopyFromContext):
    # This is a COPY FROM statement
    pass
elif isinstance(statement_context, IC_LoadFromContext):
    # This is a LOAD FROM statement
    pass

because of the maintenance cost (every time the grammar changes, we need to regenerate the bindings).
