Hexagonal Architecture Applied to AI Agents

Hexagonal and AI agents speak the same language

We design AI agents applying the hexagonal pattern because it maps precisely onto how agents reason and how they connect to the real world. The agent's core — reasoning, domain rules, decision criteria — sits isolated at the center. Integrations with LLMs, internal systems, and external tools sit at the edge, as interchangeable adapters.

The result is an agent that is testable in isolation, portable across models and providers, and whose maintenance cost grows linearly with domain complexity rather than multiplicatively with the number of integrations. This article describes the architecture we apply on serious AI projects and how to build it step by step.

The pattern in five minutes

Hexagonal splits the system into three layers:

Domain. Pure business logic: entities, rules, decisions. No outward dependencies.
Ports. Interfaces the domain defines to talk with the world. "I need to look up a customer" is a port; how it gets looked up isn't.
Adapters. Concrete implementations of the ports: the adapter that calls HubSpot, the adapter that queries the ERP, the adapter that writes to PostgreSQL. Interchangeable without touching the domain.

Key rule: the domain depends only on the ports it defines. Adapters depend on those ports and on the domain types they exchange. Never the other way around.
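The three layers and the dependency rule fit in a few lines of Java. This is a minimal sketch: CustomerLookupPort, DiscountPolicy, InMemoryCustomerAdapter, and the "10% for five or more orders" rule are hypothetical names invented for illustration, and the in-memory map stands in for a real CRM or database client.

```java
import java.util.Map;
import java.util.Optional;

// Domain entity: plain data, no framework or vendor types.
record Customer(String id, int ordersLastYear) {}

// Port: declared by the domain and phrased in domain terms.
interface CustomerLookupPort {
    Optional<Customer> findById(String customerId);
}

// Domain logic: depends only on the port (hypothetical rule, for illustration).
final class DiscountPolicy {
    private final CustomerLookupPort customers;

    DiscountPolicy(CustomerLookupPort customers) {
        this.customers = customers;
    }

    /** Returns the discount percentage: 10 for customers with 5+ orders, else 0. */
    int discountFor(String customerId) {
        return customers.findById(customerId)
                .filter(c -> c.ordersLastYear() >= 5)
                .map(c -> 10)
                .orElse(0);
    }
}

// Adapter: implements the port. This in-memory version stands in for a
// CRM, ERP, or database client; the domain cannot tell the difference.
final class InMemoryCustomerAdapter implements CustomerLookupPort {
    private final Map<String, Customer> rows;

    InMemoryCustomerAdapter(Map<String, Customer> rows) {
        this.rows = Map.copyOf(rows);
    }

    @Override
    public Optional<Customer> findById(String customerId) {
        return Optional.ofNullable(rows.get(customerId));
    }
}
```

Note the direction of the arrows: DiscountPolicy mentions only CustomerLookupPort, while InMemoryCustomerAdapter mentions the port and the Customer entity. Nothing in the domain names the adapter.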

Mapping to the AI agent

An AI agent fits this pattern surprisingly naturally. Three components of the agent, three layers:

Agent domain. Contains the reasoning, the decision policies, and the business rules the agent applies. An OrderQualificationAgent class doesn't know GPT, doesn't know HubSpot, doesn't know Postgres. It knows the domain: what a qualified lead is, when to escalate, what threshold triggers HITL (human in the loop). The domain can be tested with stubs without calling a real LLM.

Ports. The abstractions the agent's domain needs to operate:

LLMPort — "give me a decision over this context". Doesn't know whether Claude, GPT, Gemini, or a proprietary model sits behind.
ToolPort — "execute this action on an external system". Doesn't know whether the action runs over REST, MCP, a direct Java client call, or an async queue.
MemoryPort — "retrieve what the agent remembers about this entity". Doesn't know whether memory lives in a vector store, a key-value store, or markdown files in the repo.
ObservabilityPort — "record this decision". Doesn't know whether it goes to a Grafana dashboard, a file, or an external audit system.

Adapters. Concrete implementations the domain never sees:

AnthropicLLMAdapter, OpenAILLMAdapter — implement LLMPort against the specific provider.
HubSpotMCPToolAdapter, SAPClientToolAdapter, InternalAPIToolAdapter — implement ToolPort against each integrated system.
PgVectorMemoryAdapter, FileSystemMemoryAdapter — implement MemoryPort.
OpenTelemetryObservabilityAdapter — logs every decision with standard traces.

The domain receives the ports via dependency injection. The day you change the LLM provider, you swap the adapter. The domain doesn't notice.
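A minimal sketch of that swap, assuming a hand-written composition root rather than any particular DI framework. The Prompt and Decision shapes, the stubbed adapter bodies, the AgentFactory class, and the provider keys are all hypothetical; real adapters would call the provider SDKs where the stubs return canned values.

```java
// Hypothetical minimal shapes for the types the port exchanges.
record Prompt(String text) {}
record Decision(String outcome, String producedBy) {}

interface LLMPort {
    Decision decide(Prompt p);
}

// Stub adapters: a real one would call the provider's API here.
final class AnthropicLLMAdapter implements LLMPort {
    @Override
    public Decision decide(Prompt p) {
        return new Decision("QUALIFIED", "anthropic");
    }
}

final class OpenAILLMAdapter implements LLMPort {
    @Override
    public Decision decide(Prompt p) {
        return new Decision("QUALIFIED", "openai");
    }
}

// Domain: sees only the port it was handed.
final class OrderQualificationAgent {
    private final LLMPort llm;

    OrderQualificationAgent(LLMPort llm) {
        this.llm = llm;
    }

    Decision qualify(String leadSummary) {
        return llm.decide(new Prompt("Qualify this lead: " + leadSummary));
    }
}

// Composition root: the only place that names concrete adapters.
final class AgentFactory {
    private AgentFactory() {}

    static OrderQualificationAgent create(String provider) {
        LLMPort llm = switch (provider) {
            case "anthropic" -> new AnthropicLLMAdapter();
            case "openai" -> new OpenAILLMAdapter();
            default -> throw new IllegalArgumentException("Unknown LLM provider: " + provider);
        };
        return new OrderQualificationAgent(llm);
    }
}
```

Changing provider means changing the string passed to AgentFactory.create; OrderQualificationAgent is not recompiled or edited. In a Quarkus project this selection would usually be delegated to CDI instead of a factory like this one.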

What a well-structured agent looks like

A minimal skeleton in code (Quarkus, modern Java; the pattern applies identically in Python, TypeScript, or Go):

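// Domain
public final class OrderQualificationAgent {
    private final LLMPort llm;
    private final ToolPort tools;
    private final MemoryPort memory;
    private final ObservabilityPort obs;

    // Ports are injected through the constructor; the domain never instantiates adapters
    public OrderQualificationAgent(LLMPort llm, ToolPort tools,
                                   MemoryPort memory, ObservabilityPort obs) {
        this.llm = llm;
        this.tools = tools;
        this.memory = memory;
        this.obs = obs;
    }

    public Decision qualify(Lead lead) {
        var context = memory.recall(lead.customerId());
        var decision = llm.decide(buildPrompt(lead, context));

        if (decision.requiresAction()) {
            tools.execute(decision.action());
        }
        obs.record(lead, decision);
        return decision;
    }

    // Prompt assembly is domain logic too; body elided in this skeleton
    private Prompt buildPrompt(Lead lead, Context context) { /* ... */ }
}

// Ports (interfaces)
public interface LLMPort { Decision decide(Prompt p); }
public interface ToolPort { Result execute(Action a); }
// Signatures below are inferred from the calls above; Context and the String id are placeholders
public interface MemoryPort { Context recall(String customerId); }
public interface ObservabilityPort { void record(Lead lead, Decision d); }

// Adapters
public final class AnthropicLLMAdapter implements LLMPort { /* ... */ }
public final class HubSpotMCPToolAdapter implements ToolPort { /* ... */ }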

No import in the domain points to an external library. The imports point to interfaces (ports) of the domain itself. That restriction is the heart of the pattern.

Concrete benefits on an AI agent

Four benefits the architecture delivers out of the box:

Real testability of the reasoning. The domain gets tested with deterministic stubs. FakeLLMPort returns controlled decisions; FakeToolPort records which actions were requested without executing them; FakeMemoryPort returns synthetic contexts. Domain tests run in milliseconds, with no inference cost, and validate exactly the agent's business logic — not the model's.
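A sketch of what those fakes could look like, under simplifying assumptions: the record shapes (Prompt, Action, Result, Decision) and the String-based memory context are hypothetical, and the agent is reduced to the recall, decide, execute flow from the skeleton.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal shapes; real domain types would be richer.
record Prompt(String text) {}
record Action(String name) {}
record Result(boolean succeeded) {}
record Decision(String outcome, Action action) {
    boolean requiresAction() { return action != null; }
}

interface LLMPort { Decision decide(Prompt p); }
interface ToolPort { Result execute(Action a); }
interface MemoryPort { String recall(String customerId); }

// Domain under test: recall context, ask for a decision, act if required.
final class OrderQualificationAgent {
    private final LLMPort llm;
    private final ToolPort tools;
    private final MemoryPort memory;

    OrderQualificationAgent(LLMPort llm, ToolPort tools, MemoryPort memory) {
        this.llm = llm;
        this.tools = tools;
        this.memory = memory;
    }

    Decision qualify(String customerId) {
        String context = memory.recall(customerId);
        Decision decision = llm.decide(new Prompt("Customer " + customerId + ": " + context));
        if (decision.requiresAction()) {
            tools.execute(decision.action());
        }
        return decision;
    }
}

// Fake: returns the decision the test scripted and keeps the last prompt.
final class FakeLLMPort implements LLMPort {
    private final Decision scripted;
    Prompt lastPrompt;

    FakeLLMPort(Decision scripted) { this.scripted = scripted; }

    @Override
    public Decision decide(Prompt p) {
        lastPrompt = p;
        return scripted;
    }
}

// Fake: records requested actions without executing anything.
final class FakeToolPort implements ToolPort {
    final List<Action> requested = new ArrayList<>();

    @Override
    public Result execute(Action a) {
        requested.add(a);
        return new Result(true);
    }
}

// Fake: returns a fixed synthetic context for any customer.
final class FakeMemoryPort implements MemoryPort {
    @Override
    public String recall(String customerId) {
        return "3 prior orders, no open tickets";
    }
}
```

A domain test then scripts a decision, calls qualify, and asserts on what FakeToolPort recorded and what prompt FakeLLMPort received. No network call is involved, so the test is deterministic.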

Model change as a managed exercise. Moving from Claude to GPT or from a frontier model to an open-weights one happens by swapping the LLMPort adapter. The domain stays the same. The new adapter's eval against the existing dataset decides whether the migration passes.
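One way to express that gate in code, as a sketch only: EvalCase, AdapterEval, exact-match scoring on the outcome label, and the minimum pass rate are assumptions made for illustration. A real eval would likely score more than a single label.

```java
import java.util.List;

// Hypothetical minimal shapes for the port contract.
record Prompt(String text) {}
record Decision(String outcome) {}

interface LLMPort {
    Decision decide(Prompt p);
}

// One labelled example from the evals dataset (hypothetical shape).
record EvalCase(Prompt prompt, String expectedOutcome) {}

final class AdapterEval {
    private AdapterEval() {}

    /** Fraction of cases where the adapter's outcome matches the expected label. */
    static double passRate(LLMPort candidate, List<EvalCase> dataset) {
        if (dataset.isEmpty()) {
            throw new IllegalArgumentException("Evals dataset must not be empty");
        }
        long passed = dataset.stream()
                .filter(c -> c.expectedOutcome().equals(candidate.decide(c.prompt()).outcome()))
                .count();
        return (double) passed / dataset.size();
    }

    /** The migration passes only if the candidate reaches the required pass rate. */
    static boolean migrationPasses(LLMPort candidate, List<EvalCase> dataset, double minPassRate) {
        return passRate(candidate, dataset) >= minPassRate;
    }
}
```

Because the harness takes any LLMPort, the same dataset runs unchanged against the current adapter and the candidate, which is what makes the comparison meaningful.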

Failure isolation by integration. If HubSpot is down, HubSpotMCPToolAdapter fails; the domain receives it as Result.failed() and decides what to do — retry, escalate to a human, return a degraded response. The fallback policy lives in the domain, not scattered through each adapter.
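A sketch of a fallback policy living in the domain, assuming a simple "retry up to N times, then escalate" rule. Only Result.failed() comes from the text above; Result.success(), the Outcome enum, ToolFallbackPolicy, and FlakyToolAdapter (a stand-in that simulates an outage) are hypothetical.

```java
// Hypothetical shapes: the text above only names Result.failed().
record Action(String name) {}
record Result(boolean succeeded) {
    static Result success() { return new Result(true); }
    static Result failed() { return new Result(false); }
}

interface ToolPort {
    Result execute(Action a);
}

enum Outcome { DONE, ESCALATED_TO_HUMAN }

// Domain-level fallback policy: bounded retries, then escalation.
final class ToolFallbackPolicy {
    private final ToolPort tools;
    private final int maxAttempts;

    ToolFallbackPolicy(ToolPort tools, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.tools = tools;
        this.maxAttempts = maxAttempts;
    }

    Outcome run(Action action) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tools.execute(action).succeeded()) {
                return Outcome.DONE;
            }
        }
        return Outcome.ESCALATED_TO_HUMAN;
    }
}

// Adapter stand-in: fails a configured number of times, then succeeds.
// It translates the simulated transport error into Result.failed()
// so that no exception leaks into the domain.
final class FlakyToolAdapter implements ToolPort {
    private int failuresLeft;
    int calls = 0;

    FlakyToolAdapter(int failuresBeforeSuccess) {
        this.failuresLeft = failuresBeforeSuccess;
    }

    @Override
    public Result execute(Action a) {
        calls++;
        try {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new IllegalStateException("simulated outage");
            }
            return Result.success();
        } catch (IllegalStateException e) {
            return Result.failed();
        }
    }
}
```

The sketch retries immediately for brevity; a production policy would probably add backoff. The point is where the decision sits: the adapter only reports failure, and the domain chooses between retrying and escalating.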

Cross-cutting observability. ObservabilityPort gets called from the domain on every key decision. Traceability isn't "added" at the end of the project; it emerges naturally from the architecture.
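To make the observability point concrete, here is a small sketch in which every decision path passes through the port. ScoringAgent, its score threshold of 70, DecisionRecord, and InMemoryObservabilityAdapter are hypothetical; a production adapter might emit traces instead of appending to a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes for a decision and its audit record.
record Decision(String outcome, String rationale) {}
record DecisionRecord(String leadId, String outcome, String rationale) {}

interface ObservabilityPort {
    void record(String leadId, Decision decision);
}

// Domain: a threshold rule standing in for the real reasoning.
final class ScoringAgent {
    private final ObservabilityPort obs;

    ScoringAgent(ObservabilityPort obs) {
        this.obs = obs;
    }

    Decision qualify(String leadId, int score) {
        Decision decision = score >= 70
                ? new Decision("QUALIFIED", "score " + score + " meets threshold 70")
                : new Decision("HITL", "score " + score + " below threshold 70, human review");
        obs.record(leadId, decision); // both branches are recorded
        return decision;
    }
}

// Adapter: keeps records in memory, in call order.
final class InMemoryObservabilityAdapter implements ObservabilityPort {
    private final List<DecisionRecord> records = new ArrayList<>();

    @Override
    public void record(String leadId, Decision decision) {
        records.add(new DecisionRecord(leadId, decision.outcome(), decision.rationale()));
    }

    /** Returns an immutable snapshot of everything recorded so far. */
    List<DecisionRecord> records() {
        return List.copyOf(records);
    }
}
```

Since the domain calls the port itself, the in-memory adapter doubles as a test probe: a domain test can assert that a given input produced exactly one record with the expected outcome.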

Anti-patterns that hexagonal prevents

Three frequent errors in agent projects that this architecture cuts off at the root:

The "Frankenstein agent": code where OpenAI API calls live in the same method as SQL queries, HTTP integrations, and business logic. Without layer separation, any change in the model or the database touches multiple places.

The "photocopy agent": the team tries to support two LLM providers by cloning the agent's code and branching between the copies with conditionals. Hexagonal replaces that with two interchangeable adapters behind one port.

The "untestable agent": code that requires calling the real LLM to verify any branch of behavior. This produces slow, expensive, non-deterministic tests. With hexagonal, domain tests are as fast as any other unit test.

A real case: commercial agent over Salesforce and SAP

A client of ours has a lead-qualification agent that operates over two distinct systems: Salesforce as CRM and SAP as a backup ERP. History: the client migrated models twice during the project's first year (Claude 3.x to Claude 4.x, then a trial with an open-weights model on proprietary infrastructure for sensitive-data cases).

Hexagonal design:

Domain: LeadQualificationAgent with the commercial policy (ICP, thresholds, escalation, fit criteria). Tested with 280 synthetic scenarios.
Ports: LLMPort, CRMToolPort, ERPToolPort, MemoryPort, ObservabilityPort.
Adapters: AnthropicLLMAdapter, OpenWeightsLLMAdapter (transparent substitutes), SalesforceMCPAdapter, SAPGRPCAdapter.

Outcome of the two model migrations: neither touched the domain. Each migration was a pull request with a new adapter, an evals dataset run, and approval if metrics held. Average migration time: five days, including human review. In a non-hexagonal agent, the equivalent change would have meant essentially rebuilding half the system.

How we build it in production

In serious AI agent projects, hexagonal isn't technical decoration; it's the decision that keeps the three-year cost of ownership under control. Before picking a model, before writing a prompt, we define ports: what the agent's domain needs to know about the world and what it can leave abstract.

The agent's repository lives at the client's, with CI/CD, domain tests, and an evals dataset that runs against every LLMPort adapter before promotion. The architecture is executable from the first commit.

Behind each of these systems is a team of engineers who believe serious AI engineering starts by separating what changes fast — models, providers, integrations — from what has to hold for years: the agent's business logic. Hexagonal architecture by default, simplified when it helps.

If your AI agent mixes business logic with LLM calls and external-system calls at the same level, and every model change feels like rebuilding the system, we can audit the current architecture and hand you the refactor to hexagonal with a phased plan.