LLM Agents, Part 5 - Communication Protocol in Agentic Systems
How should agents and services communicate, coordinate, and keep track of tasks?
In our multi-agent systems series, we started by introducing what agents are and how multi-agent systems emerge as a natural evolution of software architecture as workflows grow more complex. We then explored how Service-Oriented Architecture (SOA) can be applied to create flexible, modular multi-agent systems, using a biotech sales organization as our example.
While SOA provides a solid foundation for structuring our multi-agent system, it doesn't fully capture the dynamic nature of complex, real-time interactions. SOA tells us what the components are, but it doesn't address how these components interact or manage their internal workflows. To create truly responsive and adaptable systems, ones that can eventually mimic some degree of agency, we need to go beyond static structures and incorporate patterns that handle the flow of information and the progression of tasks.
Today, we’ll be building on that foundation by introducing another critical architectural component: Event-Driven Architecture (EDA).
Service-Oriented Architecture (SOA)
Before getting into EDA, let's first recap why Service-Oriented Architecture really matters for multi-agent systems. SOA is all about breaking down complex processes into manageable, independent services. In our biotech sales system, SOA allowed us to modularize the entire sales process into independent, specialized services. Each service, such as lead generation, lead qualification, viability assessment, and proposal writing, operates like a specialized team within a business, each performing a distinct role. The key here is that these services communicate with each other through well-defined interfaces, promoting loose coupling. This, we argued, results in a system that won't fall apart every time you need to make a change.
For example, let's say you decide to develop more advanced logic for lead qualification. In a monolithic system, this could be a nightmare, potentially affecting everything from data input to proposal generation. But with SOA, you can update the lead qualification service without disrupting other services like proposal writing or viability assessment. This flexibility is what makes SOA such a powerful bedrock for multi-agent systems.
Event-Driven Architecture (EDA)
Now, let's talk about Event-Driven Architecture. EDA isn't new; it's a well-established design pattern that's been around for years. But its application in multi-agent systems, while still in its infancy, is where things get interesting, and potentially messy if not handled correctly.
EDA is a software design pattern that emphasizes real-time system response to events. An "event" is a significant state change, such as a new lead being identified or a proposal being finalized. In EDA, components produce and consume these events, triggering further actions across the architecture. This promotes decoupled, asynchronous interactions, making systems more flexible and scalable.
This approach has been used in systems like enterprise applications, where services like customer orders, payments, and inventory updates function independently but are synchronized through events. The same principles now apply to multi-agent systems, where agents can respond to events without being tightly coupled to other agents or services.
Why EDA in Multi-Agent Systems?
In the context of our biotech sales system, EDA allows us to design a responsive system where services and agents react to events as they occur, without waiting for direct interactions. For example, when the lead generation service identifies a new potential client, it produces a "New Lead Identified" event without knowing, or being affected by, how other services or agents handle that event. The event triggers actions in the services that subscribe to that event type, such as lead qualification and market analysis. This architecture makes it easier to adapt to evolving business needs, because events and agent interactions can be adjusted without significant system-wide changes:
Real-time Responsiveness: When an event such as a new lead being identified occurs, multiple agents, such as the lead qualification and market analysis agents, can start working immediately.
Decoupling: One of the core principles of EDA is decoupling. Agents or services react to events independently, without calling each other directly. In the biotech example, the lead qualification agent doesn’t need to know how the lead generation agent works. It just reacts to the event that the lead generation agent produces. This allows the system to remain modular and flexible.
Scalability: New agents, say a pricing analysis agent, can be easily integrated to listen for relevant events and act without disrupting existing workflows.
Mechanics of EDA in Multi-Agent Systems
1. Event Producers and Consumers
In EDA, agents or services are categorized as event producers (those that trigger events) or event consumers (those that react to events). In many cases, an agent can play both roles. For example:
The lead generation service identifies a potential client and creates a "New Lead Identified" event.
The lead qualification service consumes this event, evaluates the lead, and produces a "Lead Qualified" event.
The proposal generation service consumes the "Lead Qualified" event to start preparing the proposal.
This approach allows for greater flexibility, as services can operate independently but are still coordinated by events.
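To make the producer and consumer roles concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the handler names, the "ProposalStarted" event, and the scoring rule are hypothetical, and a real system would hand events to a broker rather than an in-memory queue.

```python
from collections import deque

# Hypothetical consumers: each takes one event (a dict) and returns the
# list of new events it produces, so it is also a producer.
def qualify_lead(event):
    # Placeholder scoring rule, purely illustrative.
    score = 85 if event["potentialValue"] >= 100_000 else 40
    if score < 50:
        return []  # unqualified leads produce no follow-up event
    return [{**event, "eventType": "LeadQualified", "qualificationScore": score}]

def start_proposal(event):
    return [{"eventType": "ProposalStarted", "companyName": event["companyName"]}]

# Which consumers react to which event type.
CONSUMERS = {
    "NewLeadIdentified": [qualify_lead],
    "LeadQualified": [start_proposal],
}

def run(initial_event):
    """Process events in FIFO order until no consumer produces more."""
    pending = deque([initial_event])
    seen = []
    while pending:
        event = pending.popleft()
        seen.append(event["eventType"])
        for consumer in CONSUMERS.get(event["eventType"], []):
            pending.extend(consumer(event))
    return seen

# The lead generation service acts as the first producer.
new_lead = {
    "eventType": "NewLeadIdentified",
    "companyName": "PharmaCorp",
    "potentialValue": 500000,
}
print(run(new_lead))
# ['NewLeadIdentified', 'LeadQualified', 'ProposalStarted']
```

Note that neither handler calls the other. The only thing they share is the event type they agree on.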
2. The Event Bus: System Coordination
The event bus is the backbone of the EDA system, routing events between producers and consumers. In the biotech sales scenario, it ensures that when the lead qualification service produces a "Lead Qualified" event, it is automatically routed to all relevant services, such as proposal generation, pricing strategy, and market analysis.
This centralized coordination ensures that agents and services stay decoupled, yet the entire system stays synchronized as events flow through the architecture.
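As a rough illustration of that routing, the sketch below implements a tiny in-process publish/subscribe bus. The EventBus class and its method names are assumptions made for this example, not an existing library, and a production system would typically sit on top of a message broker instead.

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self):
        # Maps an event type to the handlers subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event):
        # Deliver the event to every handler registered for its type.
        # Iterate over a copy so handlers may subscribe during delivery.
        for handler in list(self._subscribers.get(event["eventType"], [])):
            handler(event)

bus = EventBus()
received = []

# Three hypothetical services subscribe to the same event type.
for service in ("proposal_generation", "pricing_strategy", "market_analysis"):
    bus.subscribe(
        "LeadQualified",
        lambda event, name=service: received.append((name, event["companyName"])),
    )

# The producer publishes once; the bus fans the event out.
bus.publish({"eventType": "LeadQualified", "companyName": "PharmaCorp"})
print(received)
# [('proposal_generation', 'PharmaCorp'), ('pricing_strategy', 'PharmaCorp'), ('market_analysis', 'PharmaCorp')]
```

The producer only ever talks to the bus, so adding a fourth subscriber requires no change on the producing side.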
3. Event Schemas: Standardizing Communication
Event schemas define the structure of each event's data, so that every producer and consumer reads the same fields in the same way.
For example, in the biotech sales system, a "Lead Qualified" event schema might look like this:
{
  "eventType": "LeadQualified",
  "lead_Id": "12345",
  "companyName": "PharmaCorp",
  "potentialValue": 500000,
  "productInterest": ["Lab Equipment", "AI Drug Discovery"],
  "qualificationScore": 85
}
This standardization allows agents to communicate consistently and interpret data correctly, much like predefined contracts (APIs) do in a microservices architecture.
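One way to enforce such a contract on the consumer side is to parse each incoming event into a typed object and reject anything that does not match. The sketch below uses a standard-library dataclass. The class name and the choice to reject unknown fields are assumptions for illustration, and the field names simply mirror the example event above.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class LeadQualified:
    # Field names mirror the example "Lead Qualified" event.
    lead_Id: str
    companyName: str
    potentialValue: int
    productInterest: list
    qualificationScore: int

    @classmethod
    def from_json(cls, raw):
        """Parse a JSON string, raising ValueError if it breaks the schema."""
        data = json.loads(raw)
        if data.pop("eventType", None) != "LeadQualified":
            raise ValueError("not a LeadQualified event")
        try:
            # The constructor fails on missing or unexpected fields.
            return cls(**data)
        except TypeError as exc:
            raise ValueError(f"schema mismatch: {exc}") from exc

raw = """{
  "eventType": "LeadQualified",
  "lead_Id": "12345",
  "companyName": "PharmaCorp",
  "potentialValue": 500000,
  "productInterest": ["Lab Equipment", "AI Drug Discovery"],
  "qualificationScore": 85
}"""

event = LeadQualified.from_json(raw)
print(event.companyName, event.qualificationScore)
# PharmaCorp 85
```

This only checks which fields are present, not their types. A stricter setup could add type checks or use a schema validation library.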
In LLM-based agentic architectures, large language models are often used to create and/or interpret the parts of the schema that are best captured as natural language or a domain-specific language. For example, the event produced by the “Proposal Writing” service might contain a field called “content” that holds a nested dictionary with the section titles and paragraphs of the proposal. That nested dictionary is likely to be generated using an LLM call (potentially a RAG-based sub-system). The receiver of the event, in turn, would likely need an LLM call of its own to interpret and react to that data object.
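The sketch below shows the shape of such an event, with the model call replaced by a stub so the example runs on its own. The function names, the "ProposalDrafted" event type, the section titles, and the prompts are all hypothetical. In practice fake_llm would be swapped for a real model client.

```python
def fake_llm(prompt):
    """Stand-in for a real model call; returns canned text."""
    return f"[generated text for: {prompt}]"

def write_proposal(lead_event, llm=fake_llm):
    """Producer: structured fields are copied, free text comes from the LLM."""
    sections = ["Executive Summary", "Proposed Solution"]
    return {
        "eventType": "ProposalDrafted",
        "lead_Id": lead_event["lead_Id"],
        # Nested dictionary of section title -> generated paragraph.
        "content": {
            title: llm(f"{title} for {lead_event['companyName']}")
            for title in sections
        },
    }

def review_proposal(proposal_event, llm=fake_llm):
    """Consumer: needs its own LLM call to interpret the free-text content."""
    text = "\n".join(proposal_event["content"].values())
    return llm(f"List open questions in this proposal: {text}")

lead = {"eventType": "LeadQualified", "lead_Id": "12345", "companyName": "PharmaCorp"}
proposal = write_proposal(lead)
print(list(proposal["content"]))
# ['Executive Summary', 'Proposed Solution']
```

The structured fields (event type, lead ID) can still be validated against a schema, while the "content" field carries text that only another model, or a person, can meaningfully interpret.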
Challenges and Considerations
While EDA has many benefits, there are also some challenges to consider:
Event Storage: As the system grows, the number of events increases, making efficient event storage crucial. Event sourcing patterns, which are commonly used in traditional EDA systems, can be applied to reconstruct system states from past events.
Debugging Complexity: Tracing the flow of events in a large system can be challenging, especially when issues arise. Distributed tracing tools are often required to pinpoint problems in the event chain.
Over-communication: If systems are not carefully managed, they can become overwhelmed with too many events. It’s important to balance responsiveness with efficiency to avoid performance bottlenecks.
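The event sourcing idea mentioned under Event Storage can be sketched in a few lines: current state is never stored directly but rebuilt by replaying the event log. The status names and the "ProposalFinalized" event type below are assumptions for illustration.

```python
def apply_event(state, event):
    """Return a new state with one event applied (the input is not mutated)."""
    leads = dict(state)
    lead_id = event["lead_Id"]
    if event["eventType"] == "NewLeadIdentified":
        leads[lead_id] = "new"
    elif event["eventType"] == "LeadQualified":
        leads[lead_id] = "qualified"
    elif event["eventType"] == "ProposalFinalized":
        leads[lead_id] = "proposal_finalized"
    return leads

def replay(event_log):
    """Rebuild the status of every lead from the ordered event log."""
    state = {}
    for event in event_log:
        state = apply_event(state, event)
    return state

log = [
    {"eventType": "NewLeadIdentified", "lead_Id": "12345"},
    {"eventType": "NewLeadIdentified", "lead_Id": "67890"},
    {"eventType": "LeadQualified", "lead_Id": "12345"},
]

print(replay(log))
# {'12345': 'qualified', '67890': 'new'}
```

Replaying only a prefix of the log, for example replay(log[:2]), shows what the system looked like at an earlier point, which also helps with the debugging challenge described above.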
Event-based Communication Protocol
In this article we explored event-driven architectures as one of the more promising communication protocols in multi-agent systems. We looked at how this architecture choice complements the service-oriented architecture that we previously discussed as a design pattern for constructing multi-agent products based on existing business workflows. In the next parts of this series, we will introduce more architecture concepts, and will eventually discuss how they will combine to create a full picture of multi-agent systems.