NTILC: Neural Tool Invocation via Learned Compression

📅 2026-06-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses three problems that arise when large language models must choose among many tools: context inflation, higher inference latency, and lower tool selection accuracy. All three stem from placing full tool descriptions in the context, where irrelevant tools interfere with selection. The authors propose the NTILC framework, which maps user intents and tool specifications into a shared embedding space and replaces in-context lookup with external neural retrieval. Only the schema of the selected tool is then passed to the model, which generates the invocation arguments. The framework is trained with a signature-aware composite objective that augments semantic similarity with tool signature constraints (such as argument schema, type compatibility, and return types), combining Circle Loss with a Functional Margin Loss to separate tools that are semantically similar but signature-incompatible. On public tool-selection and function-calling datasets, NTILC reduces context token consumption by over 95% and inference latency by up to 74% compared to long-context ICT baselines.
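The retrieve-then-condition flow described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the registry entries, the function names, and especially `embed`, which is a toy bag-of-words stand-in for the learned encoder rather than anything taken from the paper.

```python
import json
import math
from collections import Counter

# Hypothetical registry; tool names and schemas are illustrative only.
REGISTRY = {
    "get_weather": {
        "description": "current weather forecast for a city",
        "parameters": {"city": "string"},
    },
    "send_email": {
        "description": "send an email message to a recipient",
        "parameters": {"to": "string", "body": "string"},
    },
    "convert_currency": {
        "description": "convert an amount of money between currencies",
        "parameters": {"amount": "number", "src": "string", "dst": "string"},
    },
}


def embed(text):
    """Toy stand-in for a learned encoder: bag-of-words token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Tool vectors are computed once, outside the model's context window.
TOOL_VECTORS = {name: embed(spec["description"]) for name, spec in REGISTRY.items()}


def select_tool(query):
    """Return the name of the tool whose vector is closest to the query."""
    q = embed(query)
    return max(TOOL_VECTORS, key=lambda name: cosine(q, TOOL_VECTORS[name]))


def build_prompt(query):
    """Build a prompt that contains only the selected tool's schema."""
    name = select_tool(query)
    schema = json.dumps({"name": name, "parameters": REGISTRY[name]["parameters"]})
    return f"Tool schema: {schema}\nUser request: {query}\nArguments (JSON):"
```

In this sketch the prompt length depends on a single schema rather than on the size of the registry, which is the property the summary attributes to external retrieval.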
📝 Abstract
Agentic tool-calling language models depend on large registries of callable APIs, functions, and local actions. Placing full tool specifications directly in the prompt incurs a cost that scales linearly with the size of the tool registry, rapidly consuming the context budget. As the registry grows, this leads to higher latency and degrades selection accuracy, particularly due to interference from irrelevant tools. We overcome these limitations by introducing NTILC, a neural tool selection and invocation framework that replaces in-context registry lookup with learned latent retrieval. NTILC maps both user intent and tool specifications into a shared embedding space, enabling tool selection via external retrieval rather than in-context lookup. The language model is conditioned only on the selected tool schema, allowing for precise, constrained argument generation. Central to our approach is a signature-aware composite objective, which augments semantic similarity with constraints derived from tool signatures (e.g., argument schema, type compatibility, and return types). By combining Circle Loss with a Functional Margin Loss, the model enforces separation between tools that are semantically similar but incompatible under their execution signatures. We evaluate NTILC on public tool-selection and function-calling datasets and report context token usage, retrieval accuracy, and selection latency metrics. Across these settings, NTILC reduces context window consumption by over 95% and inference latency by up to 74% compared to long-context ICT baselines.
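For readers unfamiliar with the Circle Loss term named in the abstract, the following is a minimal sketch of its standard published form (Sun et al., 2020) applied to query-to-tool similarity scores. The `margin` and `gamma` values are illustrative assumptions, not the paper's settings. The Functional Margin Loss is not defined on this page, so it is deliberately left out rather than guessed at.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def _softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def circle_loss(pos_sims, neg_sims, margin=0.25, gamma=32.0):
    """Circle Loss over similarity scores in [-1, 1].

    pos_sims: similarities between a query and its correct tool(s).
    neg_sims: similarities between the query and incorrect tools, which
              could include semantically close but incompatible tools.
    margin, gamma: illustrative hyperparameters (assumed, not from the paper).
    """
    o_p, o_n = 1.0 + margin, -margin      # optima for positive / negative scores
    d_p, d_n = 1.0 - margin, margin       # decision margins
    # Each score is weighted by its distance from the optimum, clipped at zero.
    pos_logits = [-gamma * max(o_p - s, 0.0) * (s - d_p) for s in pos_sims]
    neg_logits = [gamma * max(s - o_n, 0.0) * (s - d_n) for s in neg_sims]
    return _softplus(_logsumexp(pos_logits) + _logsumexp(neg_logits))
```

The loss is small when the correct tool scores well above the distractors and grows as a distractor overtakes it, so hard negatives contribute the largest gradients.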
Problem

Research questions and friction points this paper is trying to address.

tool selection
context budget
latency
tool registry
in-context learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

learned latent retrieval
signature-aware objective
tool invocation
embedding space
context compression