Optimizing FaaS Platforms for MCP-enabled Agentic Workflows

📅 2026-01-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Deploying multi-agent workflows on serverless platforms faces significant challenges in state management and scalability. This work proposes FAME, an architecture that decouples agent paradigms such as ReAct into three composable Function-as-a-Service (FaaS) components—Planner, Actor, and Evaluator—orchestrated via AWS Step Functions and built upon LangGraph. To overcome the inherent statelessness of FaaS, FAME introduces automated memory persistence, Lambda-based encapsulation of Model Context Protocol (MCP) servers, tool output caching, and function fusion strategies. Experimental results on paper summarization and log analysis tasks demonstrate that FAME achieves up to a 13× reduction in latency, an 88% decrease in input tokens, and a 66% cost saving, while substantially improving workflow completion rates.
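The Planner/Actor/Evaluator decomposition described above can be sketched as three Lambda-style handlers that exchange a JSON state payload, with Step Functions looping until the Evaluator marks the task done. This is a minimal illustrative sketch, not the paper's actual code: all handler names, the payload shape, and the stubbed tool results are assumptions, and the Step Functions loop is simulated locally in plain Python.

```python
# Hypothetical decomposition of a ReAct loop into three FaaS handlers.
# In FAME-style deployments, Step Functions would pass `state` between
# separate Lambda functions; here the orchestration loop is simulated locally.

def planner_handler(state, context=None):
    """Choose the next action from the remaining plan steps."""
    steps = state.setdefault("plan", ["search", "summarize"])
    done_so_far = len(state.get("observations", []))
    state["next_action"] = steps[done_so_far] if done_so_far < len(steps) else None
    return state

def actor_handler(state, context=None):
    """Execute the chosen action (stub standing in for an MCP tool call)."""
    action = state["next_action"]
    state.setdefault("observations", []).append(f"result-of-{action}")
    return state

def evaluator_handler(state, context=None):
    """Decide whether the workflow is complete; the orchestrator branches on `done`."""
    state["done"] = len(state.get("observations", [])) >= len(state["plan"])
    return state

# Local simulation of the orchestration loop:
state = {}
while True:
    state = planner_handler(state)
    if state["next_action"] is None:
        break
    state = actor_handler(state)
    state = evaluator_handler(state)
    if state["done"]:
        break
```

Splitting the loop this way keeps each function short-lived, which is how the architecture avoids the per-invocation timeouts a monolithic agent loop would hit.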

📝 Abstract
Agentic workflows that use autonomous AI Agents powered by Large Language Models (LLMs) and Model Context Protocol (MCP) servers are rapidly rising. This introduces challenges in scalable cloud deployment and state management. Traditional hosting on Virtual Machines (VMs) is resource-intensive and lacks elasticity. Functions-as-a-Service (FaaS) platforms offer modularity, autoscaling and cost efficiency, but are inherently stateless. In this paper, we present FAME, a FaaS-based architecture for orchestrating MCP-enabled agentic workflows. FAME decomposes agentic patterns such as ReAct into composable agents (Planner, Actor and Evaluator), each a FaaS function built using LangGraph, orchestrated together as a FaaS workflow. This enables modular composition via AWS Step Functions and avoids the function timeouts seen in monolithic agentic workflows. To address context persistence across user requests in a conversation, FAME automates agent memory persistence and injection using DynamoDB. It also optimizes MCP server deployment through AWS Lambda wrappers, caches tool outputs in S3, and proposes function fusion strategies. We evaluate FAME on two representative applications, research paper summarization and log analytics, under diverse memory and caching configurations. Results show up to 13x latency reduction, 88% fewer input tokens and 66% cost savings, along with improved workflow completion rates. This demonstrates the viability of serverless platforms for hosting complex, multi-agent AI workflows at scale.
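The tool-output caching the abstract mentions (S3-backed in the paper) can be sketched as a wrapper around an MCP tool call that keys results on the tool name and its arguments. This is an assumed sketch: the function names, the key scheme, and the dict standing in for the S3-backed store are illustrative, not the paper's API.

```python
import hashlib
import json

# Illustrative FAME-style tool-output caching: a wrapper around an MCP tool
# checks a cache before invoking the tool. In a real deployment `cache`
# would be an S3-backed store; a dict is used here so the demo is runnable.

def cache_key(tool_name, arguments):
    """Deterministic key from the tool name and its JSON-serialized arguments."""
    payload = json.dumps({"tool": tool_name, "args": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_tool_call(tool_name, arguments, tool_fn, cache):
    """Return (result, cache_hit); invoke the tool only on a miss."""
    key = cache_key(tool_name, arguments)
    if key in cache:               # hit: skip the (expensive) MCP call
        return cache[key], True
    result = tool_fn(**arguments)  # miss: invoke the tool and store the result
    cache[key] = result
    return result, False

# Demo with a stub tool standing in for an MCP log-analysis server:
calls = []
def grep_logs(pattern):
    calls.append(pattern)
    return f"lines matching {pattern}"

cache = {}
r1, hit1 = cached_tool_call("grep_logs", {"pattern": "ERROR"}, grep_logs, cache)
r2, hit2 = cached_tool_call("grep_logs", {"pattern": "ERROR"}, grep_logs, cache)
```

Caching repeated tool outputs is one plausible source of the reported input-token savings, since cached observations need not be regenerated or re-fetched on every agent turn.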
Problem

Research questions and friction points this paper is trying to address.

FaaS
Agentic Workflows
MCP
State Management
Serverless
Innovation

Methods, ideas, or system contributions that make the work stand out.

FaaS
MCP
Agentic Workflows
Serverless Architecture
Context Persistence
🔎 Similar Papers
2024-05-22 · 2024 IEEE International Conference on Cloud Engineering (IC2E) · Citations: 4
Varad Kulkarni
PhD Candidate, Indian Institute of Science, Bangalore
Distributed Systems · Edge & Hybrid Cloud Computing · Serverless
Vaibhav Jha
Indian Institute of Science, Bangalore, India
Nikhil Reddy
Indian Institute of Science, Bangalore, India
Yogesh L. Simmhan
Indian Institute of Science, Bangalore, India