Skip to content

Add the ability to construct nodes#13

Open
sgrif wants to merge 1 commit into
mainfrom
sg-make-fns
Open

Add the ability to construct nodes#13
sgrif wants to merge 1 commit into
mainfrom
sg-make-fns

Conversation

@sgrif

@sgrif sgrif commented Jun 29, 2026

Copy link
Copy Markdown
Collaborator

All the APIs in this commit are intended to be internal, but are beefy enough to be reviewed on their own. Notably this does not allow the creation of a node without manual juggling of a MemoryContext, which is not a type we want to make public.

In this commit we generate code for every function defined in include/nodes/makefuncs.h and include/nodes/value.h. Ideally we would wrap all of these functions in a C function that does PG_TRY(); PG_CATCH();, but there is no generic way to write that function in C, generating that code would be fairly painful, and all of these functions are purely allocating and setting fields so we should never have a longjmp happen anyway. If we are concerned about UB here, I might go ahead and set the exception stack to NULL at the start of these functions so any errors that are raised abort the process, but that's a future discussion. It will be UB if any of these functions begin eloging in the future, but I just don't see a middle ground for this. The only existing case today where an error should ever be raised is in an OOM condition, which we consider to be sufficiently impossible in this context that we can ignore it.

Our generated code takes the types that the associated accessor methods on the structs would return, not the raw types that the C function would generate (with the exception of slice instead of iterators for PgList, as we do actually need a contiguous pointer in that case). We generate a bit of glue code to then convert that to the raw type that PG wants. Notably, we take &str and &[Node] for c strings and lists respectively. We don't want callers to have to care about putting strings on the PG heap, nor do we want them to have to care about list internals. So instead we take the convenient Rust types and copy onto the PG heap. This is potentially more expensive, but is ultimately necessary for any arguments we'll receive with the exception of a &'static str.

The tricky part of building this API was ensuring that all lifetimes are invariant. We don't want it to be possible to construct a node on one MemoryContext that points to a different one. This is because ultimately the lifetimes end up being erased, in order to allow MemoryContext to be passed around as part of an owned structure with no lifetime such as ParseResult. Because soundness around doing this can be tricky, we've pulled in an external dependency that provides a safe API for doing this. Semantically it's quite similar to stack pinning.

With this change, and Unique as a type indicating that there are no other references to a given node, we can look towards providing functions that allow actually manipulating AST structures.

All the APIs in this commit are intended to be internal, but are beefy
enough to be reviewed on their own. Notably this does not allow the
creation of a node without manual juggling of a MemoryContext, which is
not a type we want to make public.

In this commit we generate code for every function defined in
`include/nodes/makefuncs.h` and `include/nodes/value.h`. Ideally we
would wrap all of these functions in a C function that does PG_TRY();
PG_CATCH();, but there is no generic way to write that function in C,
generating that code would be fairly painful, and all of these functions
are purely allocating and setting fields so we should never have a
longjmp happen anyway. If we are concerned about UB here, I might go
ahead and set the exception stack to NULL at the start of these
functions so any errors that are raised abort the process, but that's a
future discussion. It will be UB if any of these functions begin
`elog`ing in the future, but I just don't see a middle ground for this.
The only existing case today where an error should ever be raised is in
an OOM condition, which we consider to be sufficiently impossible in
this context that we can ignore it.

Our generated code takes the types that the associated accessor methods
on the structs would return, not the raw types that the C function would
generate (with the exception of slice instead of iterators for PgList,
as we do actually need a contiguous pointer in that case). We generate a
bit of glue code to then convert that to the raw type that PG wants.
Notably, we take `&str` and `&[Node]` for c strings and lists
respectively. We don't want callers to have to care about putting
strings on the PG heap, nor do we want them to have to care about list
internals. So instead we take the convenient Rust types and copy onto
the PG heap. This is potentially more expensive, but is ultimately
necessary for any arguments we'll receive with the exception of a
`&'static str`.

The tricky part of building this API was ensuring that all lifetimes are
invariant. We don't want it to be possible to construct a node on one
MemoryContext that points to a different one. This is because ultimately
the lifetimes end up being erased, in order to allow `MemoryContext` to
be passed around as part of an owned structure with no lifetime such as
`ParseResult`. Because soundness around doing this can be tricky, we've
pulled in an external dependency that provides a safe API for doing
this. Semantically it's quite similar to stack pinning.

With this change, and `Unique` as a type indicating that there are no
other references to a given node, we can look towards providing
functions that allow actually manipulating AST structures.
@sgrif sgrif requested review from levkk and meskill June 29, 2026 20:59

@levkk levkk left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it but I might need to see an example to love it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants