feat(closes OPEN-11315): modernize the LangChain/LangGraph callback handler for v1 #654
Open
viniciusdsmello wants to merge 1 commit into
…andler for v1

Modernizes the sync + async Openlayer LangChain/LangGraph callback handler for LangChain v1 (OPEN-11315). The callbacks API stayed backwards-compatible across v1, so these are gap-fills and modernizations rather than a migration.

High-priority fixes:
- Wire on_retriever_start/end/error into the sync handler (previously only the async handler had them) so synchronous RAG pipelines produce a RETRIEVER step and populate the context column; fall back to the handler-owned trace when a root-level retriever has no external trace context.
- Fall back to serializing AIMessage.tool_calls in _extract_output when the generation text is empty, and preserve tool_calls in _message_to_dict, so tool-only agent turns no longer record empty output.

Modernization:
- Tokens: read AIMessage.usage_metadata first, then llm_output / generation_info; capture input/output token details (cache_read, cache_creation, reasoning) under step metadata.
- Provider: use metadata["ls_provider"] as the primary source (mapped to Openlayer provider names), with the _type map and LiteLLM prefixes as fallbacks.
- LangGraph metadata: chain steps prefer the runnable's own name, falling back to langgraph_node, then the serialized id (matching the TS handler) so node internals keep their real names; metadata["thread_id"] auto-maps to session_id unless an explicit session is set (opt-out via map_thread_id_to_session).
- v1 content blocks: normalize list / content-block message content (text joined into content, non-text blocks preserved under content_blocks).
- Drop the removed langchain.schema / langchain.callbacks.base import fallback; import from langchain_core only.

Tests:
- Add an offline test suite for the handler (real langchain_core objects, publishing disabled).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
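As a rough sketch of the token lookup order described in the commit message above (`usage_metadata` first, then `llm_output`, then `generation_info`), the following stdlib-only Python uses plain dicts in place of real LangChain objects. The function name `pick_token_info`, the `token_usage` key used for the two fallbacks, and the output field names are assumptions for illustration, not the handler's actual code.

```python
from typing import Any, Dict, Mapping, Optional


def pick_token_info(
    usage_metadata: Optional[Mapping[str, Any]] = None,
    llm_output: Optional[Mapping[str, Any]] = None,
    generation_info: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return token counts from the first source that has them."""
    # 1) Preferred source: the message's usage_metadata mapping.
    if usage_metadata:
        prompt = int(usage_metadata.get("input_tokens", 0))
        completion = int(usage_metadata.get("output_tokens", 0))
        info: Dict[str, Any] = {
            "prompt_tokens": prompt,
            "completion_tokens": completion,
            "total_tokens": int(usage_metadata.get("total_tokens", prompt + completion)),
        }
        # Keep cache / reasoning breakdowns only when they were reported.
        details = {
            key: dict(usage_metadata[key])
            for key in ("input_token_details", "output_token_details")
            if usage_metadata.get(key)
        }
        if details:
            info["token_details"] = details
        return info

    # 2) and 3) Fallbacks, tried in order. The "token_usage" key is an assumption.
    for source in (llm_output, generation_info):
        usage = (source or {}).get("token_usage")
        if usage:
            prompt = int(usage.get("prompt_tokens", 0))
            completion = int(usage.get("completion_tokens", 0))
            return {
                "prompt_tokens": prompt,
                "completion_tokens": completion,
                "total_tokens": int(usage.get("total_tokens", prompt + completion)),
            }

    # Nothing reported: return zero counts rather than raising.
    return {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
```

Returning from the first non-empty source keeps the precedence explicit: a populated `usage_metadata` always wins, even if `llm_output` also carries counts.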
## Summary

Modernizes the sync and async Openlayer LangChain/LangGraph callback handler (`langchain_callback.py`) for OPEN-11315. The callbacks API stayed backwards-compatible across LangChain v1, so these are gap-fills and modernizations rather than a migration.

## 🔴 High-priority fixes
- `on_retriever_start/end/error` existed only on `AsyncOpenlayerHandler`; the sync handler now has them too, so synchronous RAG pipelines produce a `RETRIEVER` step and populate the `context` column. Also fixes context being dropped when a sync root-level retriever has no external trace context (an external trace still wins).
- `_extract_output` falls back to serializing `AIMessage.tool_calls` when the generation text is empty (tool-only agent turns previously recorded `""`), and `_message_to_dict` preserves `tool_calls`. Backwards-compatible for messages without tool calls.

## 🟡 / 🟢 Modernization
- `_extract_token_info` reads `AIMessage.usage_metadata` first, then falls back to `llm_output` / `generation_info`; captures `input_token_details` / `output_token_details` (cache_read, cache_creation, reasoning) under `step.metadata["token_details"]`.
- `metadata["ls_provider"]` is now the primary provider source (mapped to Openlayer's canonical names), with the `_type` map and LiteLLM model prefixes as fallbacks.
- Chain steps prefer the runnable's own `name`, falling back to `metadata["langgraph_node"]`, then the serialized id (`name → langgraph_node → id`), matching the TypeScript handler. `langgraph_node` is inherited by every run nested inside a node, so preferring it over `name` would relabel all of a node's internal LCEL runs (`RunnableSequence` / `Prompt` / …), and even nested sub-graphs, with the parent node's name, collapsing the tree. LangGraph already sets `name` to the node name at node boundaries, so nodes stay identifiable while the inner structure keeps its real names.
- `metadata["thread_id"]` auto-maps to the trace `session_id` (opt out with `map_thread_id_to_session=False`), and never clobbers an explicitly provided session. `post_process_trace` already promotes `session_id` to its column, so it works end-to-end.
- `_message_to_dict` normalizes list / content-block content (text joined into `content`, non-text blocks preserved under `content_blocks`).
- Drops the `langchain.schema` / `langchain.callbacks.base` import fallback (removed in v1); imports from `langchain_core` only, and the `ImportError` now suggests `pip install langchain-core`.

## Tests
New offline suite `tests/lib/integrations/test_langchain_callback.py` drives the handler with real `langchain_core` objects (publishing disabled): retriever capture, tool-call output, token/provider extraction, content blocks, and the LangGraph naming + `thread_id → session_id` rules. 34/34 passing.

## End-to-end validation
Validated against a real LangGraph `create_react_agent` workflow streaming to an Openlayer inference pipeline (`langchain-core` 1.x, `langgraph` 1.x, `langchain-openai` 1.x):

- Graph nodes (`buildSystemPrompt` / `callAgent` / `rewrite`) → nested react agent (`agent` / `tools` / `agent`) → tool calls and chat completions, with node names, nested LCEL names, and the sub-graph name all preserved (this is what motivated the `name`-first precedence above).
- `thread_id → session_id`, `user_id`, provider (OpenAI), and token counts all land on the resulting row.

## New public surface (additive)
- `map_thread_id_to_session: bool = True` constructor param on both handlers.
- `_message_to_dict` may add a `content_blocks` key for v1 list content; `token_details` appears under step metadata when present.

## Notes
- `on_agent_action` / `on_agent_finish` are left as-is (dead with `create_agent` / LangGraph, but kept for `langchain-classic` `AgentExecutor` compatibility).
- An issue in `tests/test_tracing_core.py` surfaces only when run single-process (masked by `-n auto`); confirmed identical on a clean `main`.

🤖 Generated with Claude Code
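The `name → langgraph_node → id` precedence and the `thread_id → session_id` rule from the Modernization section can be illustrated with a minimal, stdlib-only sketch. The function names `resolve_step_name` and `resolve_session_id`, the `"chain"` default, and the choice to use the last element of the serialized id are hypothetical; the handler's real implementation may differ.

```python
from typing import Any, Mapping, Optional, Sequence


def resolve_step_name(
    name: Optional[str],
    metadata: Optional[Mapping[str, Any]],
    serialized_id: Optional[Sequence[str]],
) -> str:
    """Hypothetical helper: name first, then langgraph_node, then the serialized id."""
    if name:
        # The runnable's own name wins, so runs nested inside a node keep their real names.
        return name
    node = (metadata or {}).get("langgraph_node")
    if node:
        return str(node)
    if serialized_id:
        # Assumption: the last path element is the class name.
        return str(serialized_id[-1])
    return "chain"  # hypothetical default when nothing is available


def resolve_session_id(
    explicit_session_id: Optional[str],
    metadata: Optional[Mapping[str, Any]],
    map_thread_id_to_session: bool = True,
) -> Optional[str]:
    """Hypothetical helper: an explicit session always wins over thread_id."""
    if explicit_session_id is not None:
        return explicit_session_id
    if map_thread_id_to_session:
        thread_id = (metadata or {}).get("thread_id")
        if thread_id is not None:
            return str(thread_id)
    return None
```

With `name` checked first, a `RunnableSequence` running inside the `callAgent` node is labeled `RunnableSequence` rather than `callAgent`, which is the tree-collapsing case the precedence is meant to avoid.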