$λ_A$: A Typed Lambda Calculus for LLM Agent Composition

2026-04-13

Programming Languages · Multiagent Systems · Software Engineering
AI summary

The authors introduce a formal system, $λ_A$, for specifying and analyzing large language model (LLM) agents. It extends a well-known mathematical framework, the typed lambda calculus, with features for oracle calls, loops, random choices, and changing environments. They prove that the system is well-behaved and use it to build a tool that checks whether agent configurations are complete and will work properly. Analyzing many real-world agent configurations, they found that most have structural problems, and that their tool identifies these issues more accurately when configuration files and code are analyzed together. They also show that popular agent frameworks fit within their system, making it a common foundation for understanding LLM agent design.

typed lambda calculus, large language models, agent composition, type safety, termination, linting, operational semantics, probabilistic choice, fixpoints, Coq mechanization
Authors
Qin Liu
Abstract
Existing LLM agent frameworks lack formal semantics: there is no principled way to determine whether an agent configuration is well-formed or will terminate. We present $λ_A$, a typed lambda calculus for agent composition that extends the simply-typed lambda calculus with oracle calls, bounded fixpoints (the ReAct loop), probabilistic choice, and mutable environments. We prove type safety, termination of bounded fixpoints, and soundness of derived lint rules, with partial Coq mechanization (1,567 lines, 43 completed proofs). As a practical application, we derive a lint tool that detects structural configuration errors directly from the operational semantics. An evaluation on 835 real-world GitHub agent configurations shows that 94.1% are structurally incomplete under $λ_A$, with YAML-only lint precision at 54%, rising to 96--100% under joint YAML+Python AST analysis on 175 samples. This gap quantifies, for the first time, the degree of semantic entanglement between declarative configuration and imperative code in the agent ecosystem. We further show that five mainstream paradigms (LangGraph, CrewAI, AutoGen, OpenAI SDK, Dify) embed as typed $λ_A$ fragments, establishing $λ_A$ as a unifying calculus for LLM agent composition.
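To make the abstract's constructs concrete, here is a minimal sketch, in Python, of how the three extensions beyond the simply-typed lambda calculus (oracle calls, bounded fixpoints, probabilistic choice) might be modeled. All names (`Oracle`, `BFix`, `Choice`, `run_bfix`) are illustrative assumptions, not the paper's actual formalization; in particular, the explicit iteration bound is what makes termination of the ReAct-style loop immediate.

```python
import random
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical encodings of λ_A's extra constructs (names are ours, not the paper's).

@dataclass
class Oracle:
    """Oracle call: an external LLM invocation, modeled here as a plain function."""
    ask: Callable[[str], str]

@dataclass
class BFix:
    """Bounded fixpoint: a ReAct-style loop with an explicit iteration bound."""
    bound: int
    step: Callable[[str], Tuple[str, bool]]  # state -> (new state, done?)

@dataclass
class Choice:
    """Probabilistic choice: take `left` with probability p, else `right`."""
    p: float
    left: Callable[[str], str]
    right: Callable[[str], str]

def run_bfix(fix: BFix, state: str) -> str:
    # The explicit bound guarantees termination no matter what `step` does.
    for _ in range(fix.bound):
        state, done = fix.step(state)
        if done:
            break
    return state

def run_choice(ch: Choice, state: str, rng: random.Random) -> str:
    return ch.left(state) if rng.random() < ch.p else ch.right(state)

# Example: a loop whose step appends a marker and signals completion after three.
def react_step(state: str) -> Tuple[str, bool]:
    state = state + "."
    return state, state.endswith("...")

print(run_bfix(BFix(bound=10, step=react_step), "go"))       # stops early when done
print(run_bfix(BFix(bound=5, step=lambda s: (s + ".", False)), ""))  # bound forces exit
```

The second call shows the key termination property: even a step function that never signals completion runs for at most `bound` iterations, which is the intuition behind the paper's termination theorem for bounded fixpoints.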