🤖 AI Summary
Enterprises face significant challenges in integrating large language models (LLMs), including data silos, poor interoperability among heterogeneous models and APIs, and difficulty guaranteeing quality of service (QoS), particularly cost, accuracy, and latency. Method: This paper proposes a blueprint architecture for compound AI systems in enterprise applications, centered on "streams" that coordinate the flow of data and instructions among agents. It introduces dual registries, one for agents and one for data, to unify management and enable semantic search across proprietary models, third-party APIs, and multimodal enterprise data. QoS-driven data and task planners decompose tasks, map them to agents and data, and optimize execution. The architecture uses streams for cross-agent collaboration and integrates metadata-driven governance, LLM-enhanced workflow orchestration, and multimodal data governance. Contribution/Results: Illustrated with an implementation for an HR use case, the architecture aims to improve task completion and response controllability, enable plug-and-play integration of heterogeneous resources, and reduce customization and operational costs.
📝 Abstract
Large language models (LLMs) have gained significant interest in industry due to their impressive capabilities across a wide range of tasks. However, the widespread adoption of LLMs presents several challenges, such as integration into existing applications and infrastructure; utilization of company-proprietary data, models, and APIs; and meeting cost, quality, responsiveness, and other requirements. To address these challenges, there is a notable shift from monolithic models to compound AI systems, with the promise of more powerful, versatile, and reliable applications. However, progress thus far has been piecemeal, with proposals for agentic workflows, programming models, and extended LLM capabilities, without a clear vision of an overall architecture. In this paper, we propose a 'blueprint architecture' for compound AI systems that orchestrate agents and data for enterprise applications. In our proposed architecture, the key orchestration concept is 'streams', which coordinate the flow of data and instructions among agents. Existing proprietary models and APIs in the enterprise are mapped to 'agents', defined in an 'agent registry' that serves agent metadata and learned representations for search and planning. Agents can utilize proprietary data through a 'data registry' that similarly registers enterprise data of various modalities. Tying it all together, data and task 'planners' break down, map, and optimize tasks and queries for given quality of service (QoS) requirements such as cost, accuracy, and latency. We illustrate an implementation of the architecture for a use case in the HR domain and discuss opportunities and challenges for 'agentic AI' in the enterprise.
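To make the registry and planner roles concrete, here is a minimal sketch in Python of how an agent registry and a QoS-aware task planner might interact. It is not the paper's implementation: the class names (`Agent`, `AgentRegistry`, `QoS`), the `plan` function, the agent names, and all cost, accuracy, and latency figures are hypothetical. The sketch also assumes that cost and latency add up across steps and that accuracy multiplies across steps, which is a simplification, and it omits streams and the data registry entirely.

```python
from dataclasses import dataclass
from itertools import product
from math import prod
from typing import Optional


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent metadata as a registry might store it."""
    name: str
    capability: str   # task type the agent handles, e.g. "extract"
    cost: float       # assumed cost per call (arbitrary units)
    accuracy: float   # assumed expected accuracy in [0, 1]
    latency: float    # assumed seconds per call


@dataclass(frozen=True)
class QoS:
    """Hypothetical end-to-end quality-of-service requirements."""
    max_cost: float
    min_accuracy: float
    max_latency: float


class AgentRegistry:
    """Stores agents and looks them up by capability."""

    def __init__(self) -> None:
        self._by_capability: dict[str, list[Agent]] = {}

    def register(self, agent: Agent) -> None:
        self._by_capability.setdefault(agent.capability, []).append(agent)

    def find(self, capability: str) -> list[Agent]:
        return list(self._by_capability.get(capability, []))


def plan(steps: list[str], registry: AgentRegistry, qos: QoS) -> Optional[list[Agent]]:
    """Pick one agent per step, minimizing total cost subject to the QoS limits.

    Brute-force search over all combinations; returns None if nothing fits.
    """
    candidates = [registry.find(step) for step in steps]
    best: Optional[tuple[float, list[Agent]]] = None
    for combo in product(*candidates):
        cost = sum(a.cost for a in combo)
        latency = sum(a.latency for a in combo)
        accuracy = prod(a.accuracy for a in combo)  # assumes independent steps
        if cost > qos.max_cost or latency > qos.max_latency:
            continue
        if accuracy < qos.min_accuracy:
            continue
        if best is None or cost < best[0]:
            best = (cost, list(combo))
    return None if best is None else best[1]


# Hypothetical HR-style task: extract fields from resumes, then rank candidates.
registry = AgentRegistry()
registry.register(Agent("resume-parser-small", "extract", 1.0, 0.90, 0.5))
registry.register(Agent("resume-parser-large", "extract", 5.0, 0.99, 2.0))
registry.register(Agent("ranker-api", "rank", 2.0, 0.95, 1.0))
registry.register(Agent("ranker-llm", "rank", 8.0, 0.98, 3.0))

chosen = plan(["extract", "rank"], registry,
              QoS(max_cost=10.0, min_accuracy=0.90, max_latency=4.0))
print([a.name for a in chosen] if chosen else "no feasible plan")
```

In this toy setup the cheapest pairing misses the accuracy floor and the most accurate pairing exceeds the cost cap, so the planner settles on a mixed assignment. A real planner would need a smarter search than exhaustive enumeration once the number of steps or agents grows.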