Add the ability to construct nodes#13
Open
sgrif wants to merge 1 commit into
Open
Conversation
All the APIs in this commit are intended to be internal, but are beefy enough to be reviewed on their own. Notably this does not allow the creation of a node without manual juggling of a MemoryContext, which is not a type we want to make public. In this commit we generate code for every function defined in `include/nodes/makefuncs.h` and `include/nodes/value.h`. Ideally we would wrap all of these functions in a C function that does PG_TRY(); PG_CATCH();, but there is no generic way to write that function in C, generating that code would be fairly painful, and all of these functions are purely allocating and setting fields so we should never have a longjmp happen anyway. If we are concerned about UB here, I might go ahead and set the exception stack to NULL at the start of these functions so any errors that are raised abort the process, but that's a future discussion. It will be UB if any of these functions begin `elog`ing in the future, but I just don't see a middle ground for this. The only existing case today where an error should ever be raised is in an OOM condition, which we consider to be sufficiently impossible in this context that we can ignore it. Our generated code takes the types that the associated accessor methods on the structs would return, not the raw types that the C function would generate (with the exception of slice instead of iterators for PgList, as we do actually need a contiguous pointer in that case). We generate a bit of glue code to then convert that to the raw type that PG wants. Notably, we take `&str` and `&[Node]` for c strings and lists respectively. We don't want callers to have to care about putting strings on the PG heap, nor do we want them to have to care about list internals. So instead we take the convenient Rust types and copy onto the PG heap. This is potentially more expensive, but is ultimately necessary for any arguments we'll receive with the exception of a `&'static str`. The tricky part of building this API was ensuring that all lifetimes are invariant. We don't want it to be possible to construct a node on one MemoryContext that points to a different one. This is because ultimately the lifetimes end up being erased, in order to allow `MemoryContext` to be passed around as part of an owned structure with no lifetime such as `ParseResult`. Because soundness around doing this can be tricky, we've pulled in an external dependency that provides a safe API for doing this. Semantically it's quite similar to stack pinning. With this change, and `Unique` as a type indicating that there are no other references to a given node, we can look towards providing functions that allow actually manipulating AST structures.
levkk
approved these changes
Jun 29, 2026
levkk
left a comment
There was a problem hiding this comment.
I like it but I might need to see an example to love it
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
All the APIs in this commit are intended to be internal, but are beefy enough to be reviewed on their own. Notably this does not allow the creation of a node without manual juggling of a MemoryContext, which is not a type we want to make public.
In this commit we generate code for every function defined in
include/nodes/makefuncs.handinclude/nodes/value.h. Ideally we would wrap all of these functions in a C function that does PG_TRY(); PG_CATCH();, but there is no generic way to write that function in C, generating that code would be fairly painful, and all of these functions are purely allocating and setting fields so we should never have a longjmp happen anyway. If we are concerned about UB here, I might go ahead and set the exception stack to NULL at the start of these functions so any errors that are raised abort the process, but that's a future discussion. It will be UB if any of these functions begineloging in the future, but I just don't see a middle ground for this. The only existing case today where an error should ever be raised is in an OOM condition, which we consider to be sufficiently impossible in this context that we can ignore it.Our generated code takes the types that the associated accessor methods on the structs would return, not the raw types that the C function would generate (with the exception of slice instead of iterators for PgList, as we do actually need a contiguous pointer in that case). We generate a bit of glue code to then convert that to the raw type that PG wants. Notably, we take
&strand&[Node]for c strings and lists respectively. We don't want callers to have to care about putting strings on the PG heap, nor do we want them to have to care about list internals. So instead we take the convenient Rust types and copy onto the PG heap. This is potentially more expensive, but is ultimately necessary for any arguments we'll receive with the exception of a&'static str.The tricky part of building this API was ensuring that all lifetimes are invariant. We don't want it to be possible to construct a node on one MemoryContext that points to a different one. This is because ultimately the lifetimes end up being erased, in order to allow
MemoryContextto be passed around as part of an owned structure with no lifetime such asParseResult. Because soundness around doing this can be tricky, we've pulled in an external dependency that provides a safe API for doing this. Semantically it's quite similar to stack pinning.With this change, and
Uniqueas a type indicating that there are no other references to a given node, we can look towards providing functions that allow actually manipulating AST structures.