Building an Agent-First Email Assistant with Cloudflare Durable Objects

Exploring agentic architectures using Cloudflare Workers and Durable Objects

January 16, 2026

The source code is available at meleksomai/os, and you can try interacting with the AI Assistant by sending an email to hello@somai.me.

As I argued in my previous post, I believe the future of software lies in agent-first systems. During the holidays, I finally had time to explore what it would mean to build software around LLMs from first principles.1 Around the same time, I found myself confronting a backlog of emails requiring the usual tedious work of triaging, prioritizing, responding, and scheduling.

This led me to build an AI assistant that operates autonomously over email on my behalf, a textbook example of an agentic application. Despite its persistent dysfunction, email remains the most resilient protocol for asynchronous human (and soon machine) communication. It is a well-defined protocol with clear inputs (incoming messages) and outputs (responses, scheduling actions), making it an ideal testbed. The assistant must navigate the complexities of human communication, context retention, and task execution, all while adhering to the constraints of email as a medium.

This project would serve dual purposes: reducing my email burden while also exploring the design space of agent-first systems. It also felt like an acceptable risk, since I wasn't relying on this system for anything mission-critical, yet it would be launched into the wild and exposed to the real world. I remain concerned about potential jailbreaking or hallucinations from the AI assistant, and I will explore some of these challenges later in the post. But seeing how the system would fail and what edge cases would emerge when deployed in the wild is part of the experiment, after all.

Design Principles

Starting from first principles required defining clear requirements that would guide the architecture and implementation of this AI Assistant. My goal is to build an AI Assistant that can manage emails autonomously — providing context-aware responses, scheduling meetings, and handling routine inquiries without human intervention. For this to work, the AI Assistant must:

  • intercept and understand emails sent to a specific email inbox (e.g., hello@somai.me). The AI Assistant should be able to parse the email content, extract relevant information, and understand the context of the conversation.
  • maintain context per contact to adjust its behavior safely. The assistant remembers prior interactions with each individual contact to inform future responses. Technically, this means maintaining separate state and memory for each contact, with no cross-contamination of context between contacts.
  • connect to tools and services to perform actions on my behalf, such as scheduling meetings, sending follow-up emails, or retrieving information from external sources. The Model Context Protocol (MCP) is useful for connecting LLMs with external tools and services.
  • learn from my personal feedback and improve over time. The assistant should adapt its responses based on my personal feedback, refining its understanding of preferences and communication styles. This requires a feedback loop where I can provide corrections or suggestions to the assistant, and it can incorporate that feedback into its future behavior.
  • be secure and private since email often contains sensitive information. The assistant must ensure that all data is handled securely and without exposing sensitive information to unauthorized parties.

Platform: Cloudflare Workers + Durable Objects

Cloudflare has emerged as a compelling alternative to traditional cloud providers, offering AWS-level primitives with a developer experience closer to Vercel's. Cloudflare has a solid serverless platform with Cloudflare Workers, Cloudflare Email Routing, Cloudflare KV, and most recently Cloudflare Durable Objects.2

Cloudflare Durable Objects

The Durable Objects service is the most significant shift in serverless computing since AWS Lambda. Rather than thinking about scale in terms of microservices and distributed systems, Durable Objects let you think about scale in terms of stateful instances. Each instance is a self-contained unit of compute with its own memory, storage, and lifecycle. The scaling unit is no longer a stateless function but a stateful object that can maintain its own context over time. This is well-suited to agent-first systems, where each agent can be a Durable Object instance with its own state and behavior. Durable Objects also add scheduling, which gives each instance its own lifecycle; WebSockets for streaming and long-running connections (ideal for agents); and single-threaded execution, which enormously simplifies how we think about concurrency and race conditions.3

Agents require infrastructure designed for continuity, not ephemerality.

In my experience, building agent-first systems on traditional serverless platforms is fraught with complexity. Serverless assumes ephemerality: functions execute, return results, and disappear. State, if needed, lives elsewhere, in a database or external storage. This works well for request-response architectures but breaks down when the same paradigm is applied to agentic computing. And it is only going to get worse as agents become more sophisticated and stateful.4

Architecture

The architecture is simple: an email is received by Cloudflare Email Routing and forwarded to a Cloudflare Worker, which routes it to a specific agent instance (a Durable Object). The agent processes the email, runs LLMs, updates its state, and optionally sends a reply.

Architecture diagram

I will focus on the most interesting parts of the architecture. You can check out the source code on meleksomai/os.

Cloudflare Worker

The Cloudflare Worker is the entry point for handling incoming emails. It uses the Agents SDK helper function routeAgentEmail to route emails to the appropriate agent instance.

cloudflare worker logic
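
Stripped to its essentials, the Worker looks roughly like this (a simplified sketch, assuming the Agents SDK's routeAgentEmail helper and its address-based resolver, with Env carrying the agent binding):

```ts
import { routeAgentEmail, createAddressBasedEmailResolver } from "agents";

export default {
  // Cloudflare Email Routing invokes this handler for every message
  // delivered to the routed address (e.g., hello@somai.me).
  async email(message: ForwardableEmailMessage, env: Env, ctx: ExecutionContext) {
    await routeAgentEmail(message, env, {
      // Built-in resolver: picks the agent instance from the address.
      // We will swap this for a custom thread-based resolver below.
      resolver: createAddressBasedEmailResolver("EmailAgent"),
    });
  },
};
```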

When an email is received, the routing logic should extract the sender's email address and use it to determine the correct agent instance. If an instance for that contact does not exist, a new one is created. However, email threads complicate things. If I have an ongoing conversation with someone (someone@example.com) and I reply to their email, I want the reply to be routed to the same agent instance that is handling my conversation with that person, regardless of the sender. Email headers solve this problem: email defines a set of headers that identify the thread, the most relevant being Message-ID, In-Reply-To, and References.5 Email clients use these headers to group messages into threads and stack conversations.

Below is a simple function that extracts the root thread ID from the email headers using the References and In-Reply-To headers.

./resolver.ts
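
The actual resolver lives in the repo; a simplified sketch of the idea looks something like this (the helper name is illustrative, and we assume the single-parent convention from the RFC):

```ts
// Extracts a stable thread identifier from standard email headers.
// Per RFC 5322, References lists ancestor Message-IDs oldest-first,
// so its first entry is the root of the thread.
export function resolveThreadId(headers: Headers): string | null {
  const references = headers.get("References");
  if (references) {
    // Message-IDs look like <abc@example.com>; take the first (oldest) one.
    const root = references.match(/<[^>]+>/);
    if (root) return root[0];
  }

  // Fall back to the direct parent when References is absent.
  const inReplyTo = headers.get("In-Reply-To");
  if (inReplyTo) return inReplyTo.trim();

  // No threading headers: treat the message's own ID as a new thread root.
  return headers.get("Message-ID")?.trim() ?? null;
}
```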

Thread-based routing enables another powerful pattern: I can reply to an email thread from my personal address with only my AI Assistant (hello@somai.me) as the recipient. That way, I can have a private side conversation with the assistant about the ongoing thread, without exposing its content to the other participants, and use it to adjust the assistant's behavior or provide additional context.

Cloudflare provides some email routing resolvers out of the box in its Agents SDK, such as createCatchAllEmailResolver, createAddressBasedEmailResolver, and createHeaderBasedEmailResolver. However, none of them fit my use case: I needed custom routing logic based on both the email thread and the contact.

Custom Routing Resolver

Since we are able to extract the thread ID from the email headers, we can use that to maintain a mapping between thread IDs and agent instance IDs. I created a custom email resolver createThreadBasedEmailResolver that can maintain this mapping using a Cloudflare KV store. When a new email is received, the routing logic checks for the sender's email address and the thread identifiers in the email headers. If a match is found in the KV store, the email is routed to the corresponding agent instance. If no match is found, a new agent instance is created, and the thread identifier is stored in the KV store.

resolver.ts
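
A condensed sketch of the resolver (the exact resolver contract depends on the Agents SDK version; here I assume it is an async function that returns the agent binding name and instance id):

```ts
import { resolveThreadId } from "./resolver";

// Maps an incoming email to an agent instance through a KV-backed index:
// thread:<root-message-id> -> agent id, and contact:<address> -> agent id.
export function createThreadBasedEmailResolver(
  kv: KVNamespace,
  agentName = "EmailAgent"
) {
  return async (email: ForwardableEmailMessage) => {
    const threadId = resolveThreadId(email.headers);
    const sender = email.from;

    // Prefer an existing mapping for this thread, then for this contact.
    const existing =
      (threadId && (await kv.get(`thread:${threadId}`))) ||
      (await kv.get(`contact:${sender}`));
    if (existing) return { agentName, agentId: existing };

    // First contact: create a fresh agent id and persist both mappings.
    const agentId = crypto.randomUUID();
    await kv.put(`contact:${sender}`, agentId);
    if (threadId) await kv.put(`thread:${threadId}`, agentId);
    return { agentName, agentId };
  };
}
```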

Cloudflare KV Store

The KV store maintains the mapping between thread IDs and agent instance IDs. It is attached to the Worker that handles email routing through the wrangler.jsonc configuration file.

wrangler.jsonc
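
The relevant binding is a few lines of configuration (the namespace name and id below are placeholders):

```jsonc
{
  // ...rest of the Worker configuration...
  "kv_namespaces": [
    {
      // Binding the Worker uses to read/write the thread -> agent mapping.
      "binding": "THREADS",
      "id": "<kv-namespace-id>"
    }
  ]
}
```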

The only missing piece is updating the routing logic to use this custom resolver.

cloudflare worker with custom resolver
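
This is the same entry point as before, now passing the custom resolver (assuming the THREADS KV binding from the configuration above):

```ts
import { routeAgentEmail } from "agents";
import { createThreadBasedEmailResolver } from "./resolver";

export default {
  async email(message: ForwardableEmailMessage, env: Env, ctx: ExecutionContext) {
    await routeAgentEmail(message, env, {
      // env.THREADS is the KV namespace holding thread -> agent mappings.
      resolver: createThreadBasedEmailResolver(env.THREADS),
    });
  },
};
```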

AI Agent

Now that we have the routing logic in place, we can focus on the AI Agent itself. Cloudflare Agents SDK provides a base Agent class that we can extend to implement our logic. It is basically a wrapper around Durable Objects that provides useful abstractions for building agentic systems.

Memory

Each Durable Object instance has its own memory: a simple SQLite database. This makes memory management straightforward: each agent instance stores its own state in its own database. The memory schema stays simple because we do not have to worry about multi-tenancy or cross-contamination of context between contacts.

For my use case, the memory schema is pretty straightforward. Each agent instance maintains a list of messages (emails) exchanged with the contact, a context string that summarizes the conversation, and a summary of the contact's preferences and behavior.

memory.ts
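
In TypeScript, the state looks roughly like this (field names are illustrative, not the exact schema in the repo):

```ts
// One record per email exchanged on the thread handled by this instance.
export interface StoredMessage {
  id: string;                         // Message-ID header
  from: string;
  subject: string;
  body: string;
  direction: "inbound" | "outbound";
  receivedAt: string;                 // ISO timestamp
}

// The full memory of a single agent instance (one contact / thread).
export interface AgentMemory {
  messages: StoredMessage[];
  context: string;      // rolling summary of the conversation
  preferences: string;  // what the agent has learned about the contact
}
```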

When initializing the agent instance, we set up the initial memory state. We also define methods to update the memory state as the agent processes incoming emails and generates responses that should be persisted.

agent.ts
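
With the Agents SDK, this boils down to an initialState plus a few setState helpers (a sketch; Env is the generated Worker environment type, and the import paths are illustrative):

```ts
import { Agent } from "agents";
import type { AgentMemory, StoredMessage } from "./memory";

export class EmailAgent extends Agent<Env, AgentMemory> {
  // Persisted by the SDK in the Durable Object's own SQLite storage;
  // this is the value a fresh instance starts with.
  initialState: AgentMemory = {
    messages: [],
    context: "",
    preferences: "",
  };

  // Append a processed email and keep the rolling context up to date.
  recordMessage(message: StoredMessage, updatedContext?: string) {
    this.setState({
      ...this.state,
      messages: [...this.state.messages, message],
      context: updatedContext ?? this.state.context,
    });
  }
}
```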

Routing Logic

At the entry point, the agent receives an email and determines whether it is from me (the owner) or from an external contact. The agent then routes the email to the appropriate workflow: either the owner workflow or the external contact workflow. This separation isolates the logic for handling emails from me versus emails from others.

We could call this defensive programming for AI agents, and that is exactly what it is. Separating the logic for owner emails from external-contact emails with deterministic routing makes the system less prone to jailbreaking and unintended behavior.

snippet ./agent.ts
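
The branch itself is plain, boring TypeScript, which is exactly the point (a sketch; the handler name and AgentEmail type follow the Agents SDK, and OWNER_EMAIL is a placeholder for my personal address):

```ts
import { Agent } from "agents";
import type { AgentEmail } from "agents";
import type { AgentMemory } from "./memory";

const OWNER_EMAIL = "me@example.com"; // placeholder, set via configuration

export class EmailAgent extends Agent<Env, AgentMemory> {
  // Entry point for emails routed to this instance
  // (memory helpers from the previous sketch omitted).
  async onEmail(email: AgentEmail) {
    const sender = email.from.toLowerCase();

    // Deterministic branch: the owner gets the flexible Loop Agent,
    // everyone else goes through the constrained review workflow.
    if (sender === OWNER_EMAIL) {
      await this.handleOwnerEmail(email);
    } else {
      await this.handleContactEmail(email);
    }
  }

  private async handleOwnerEmail(email: AgentEmail) {
    // Invokes the Loop Agent sketched in the next sections.
  }

  private async handleContactEmail(email: AgentEmail) {
    // Invokes the Workflow Agent sketched in the next sections.
  }
}
```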

Workflow vs Loop Agents

To process emails, I use two distinct design patterns.

  • Loop Agents: The first is using a Loop Agent for handling emails from me (the owner). The Loop Agent pattern is useful for orchestrating dynamic workflows where the agent needs to iterate over a set of tasks until a certain condition is met. The Loop Agent has total freedom over the steps that it can execute from the list of tools it has access to and can decide when to stop based on the context and feedback it receives. The Loop Agent is therefore the most flexible and powerful pattern for building complex workflows. However, it is also less deterministic and more prone to unintended behavior if not carefully designed.

  • Workflow Agents: The second is a more structured, step-by-step workflow, adapted from Anthropic's guide on building effective agents, which emphasizes defining clear steps for the agent to follow. The Workflow Agent is more deterministic and easier to reason about since each step is explicitly defined. However, it is also less flexible and may not handle dynamic workflows as effectively as the Loop Agent.

The two patterns are interchangeable: each implements the same AgentExecutor interface and is invoked the same way, as shown below. Which one to use depends on the specific use case and the requirements of the workflow.

./agent/workflows/agent.ts
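
The shared contract is small (a sketch; the field names are illustrative):

```ts
import type { AgentMemory } from "../memory";

// Minimal shape of an email handed to an executor.
export interface IncomingEmail {
  from: string;
  subject: string;
  body: string;
}

export interface AgentExecutorResult {
  reply?: string;       // draft or final reply body, if any
  memory: AgentMemory;  // updated memory to persist
}

// Both the Workflow Agent and the Loop Agent implement this interface,
// so the email agent can invoke either one without caring which it is.
export interface AgentExecutor {
  run(email: IncomingEmail, memory: AgentMemory): Promise<AgentExecutorResult>;
}
```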

Workflow Agents For Reviewing Incoming Emails

For incoming emails from external contacts, I used a Workflow Agent that follows a defined set of steps. Some of the steps are AI-powered (e.g., classify email, generate draft), while others are deterministic (e.g., send email, update context).

Architecture diagram

You should think of LLMs as stochastic functions that perform specific tasks (e.g., classification, generation) rather than as monolithic agents that try to do everything. This modular approach allows for better control over the agent's behavior and reduces the risk of unintended actions. Since each tool invocation has a deterministic interface (typed input and output), we can compose LLM-powered tools into larger workflows with reasonably predictable behavior.6

./agent/workflows/reply-contact-workflow.ts
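
A condensed sketch of the workflow (classifyEmail, draftReply, and sendEmail stand in for the tools described in the next section; the real file has more steps and error handling):

```ts
import type { AgentExecutor, AgentExecutorResult, IncomingEmail } from "./agent";
import type { AgentMemory } from "../memory";
import { classifyEmail } from "../tools/classify-email-tool";
import { draftReply } from "../tools/draft-email-tool";   // hypothetical helper
import { sendEmail } from "../tools/send-email-tool";     // hypothetical helper

// A fixed, step-by-step pipeline: AI-powered steps (classify, draft)
// sandwiched between deterministic ones (send, persist context).
export class ReplyContactWorkflow implements AgentExecutor {
  async run(email: IncomingEmail, memory: AgentMemory): Promise<AgentExecutorResult> {
    // 1. AI step: classify intent and risk with a structured output.
    const classification = await classifyEmail(email, memory);

    // 2. High-risk or sensitive emails are parked for owner approval.
    if (classification.requiresApproval) {
      return { memory: appendNote(memory, "Awaiting owner approval") };
    }

    // 3. AI step: generate a context-aware draft reply.
    const draft = await draftReply(email, memory, classification);

    // 4. Deterministic step: send the reply.
    await sendEmail({ to: email.from, subject: `Re: ${email.subject}`, body: draft });

    // 5. Deterministic step: fold the exchange back into memory.
    return {
      reply: draft,
      memory: appendNote(memory, `Replied (${classification.intents.join(", ")})`),
    };
  }
}

// Tiny helper to keep the rolling context string up to date.
function appendNote(memory: AgentMemory, note: string): AgentMemory {
  return { ...memory, context: `${memory.context}\n${note}`.trim() };
}
```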

LLM-Powered Loop Agents For Owner Emails

For emails from me (the owner), I used a Loop Agent that can handle more dynamic workflows. The Loop Agent can iterate over a set of tasks until a certain condition is met. This is useful for handling emails that may require multiple steps or iterations to resolve. Rather than defining a fixed workflow, the Loop Agent can decide what actions to take based on the context and feedback it receives. It uses an LLM to determine the next action to take, allowing for more flexibility and adaptability. You can think of this as a more free-form agent that can handle complex interactions.

The Loop Agent is implemented with the Vercel AI SDK. It defines a set of possible actions (tools) that it can invoke, and the LLM decides which action to take based on the current context.

Agent Loop diagram

The logic is therefore handled by writing the system prompt that guides the LLM on how to behave and what actions to take. You can think of it as using plain English to program the agent's behavior. The system prompt defines the agent's role, available actions, decision framework, and important guidelines to follow.

./agent/workflows/owner-loop-agent.ts
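
A condensed sketch using the AI SDK's multi-step tool calling (the model provider and model name are assumptions, the tool bodies are stubbed, and older AI SDK versions use maxSteps instead of stopWhen):

```ts
import { generateText, stepCountIs, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import type { AgentExecutor, AgentExecutorResult, IncomingEmail } from "./agent";
import type { AgentMemory } from "../memory";

// The system prompt is where the behavior is "programmed" in plain English.
const SYSTEM_PROMPT = `You are my email assistant. Use the available tools to
handle the owner's request, and stop once the request is resolved.`;

export class OwnerLoopAgent implements AgentExecutor {
  async run(email: IncomingEmail, memory: AgentMemory): Promise<AgentExecutorResult> {
    const result = await generateText({
      model: openai("gpt-4o"),
      system: SYSTEM_PROMPT,
      prompt: `Context so far:\n${memory.context}\n\nNew email from the owner:\n${email.body}`,
      // The LLM picks among these tools on each iteration of the loop.
      tools: {
        sendEmail: tool({
          description: "Send an email on the owner's behalf",
          inputSchema: z.object({ to: z.string(), subject: z.string(), body: z.string() }),
          // Stub: the real tool delegates to the Resend-backed sender below.
          execute: async ({ to, subject }) => ({ sent: true, to, subject }),
        }),
        updateContext: tool({
          description: "Store a fact or preference the owner shared",
          inputSchema: z.object({ note: z.string() }),
          execute: async ({ note }) => ({ stored: note }),
        }),
      },
      // Bound the loop: the agent stops when done, or after 8 tool-calling steps.
      stopWhen: stepCountIs(8),
    });

    return { reply: result.text, memory };
  }
}
```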

Tools

In both the Workflow and Loop Agents, tools are used to perform specific tasks. You can think of tools as functions that the agent can invoke to perform actions. They can be deterministic functions (e.g., send email, update context) or AI-powered functions (e.g., classify email, generate draft). Each tool has a defined input and output interface, allowing the agent to invoke them as needed. The Vercel AI SDK provides a powerful and easy-to-use framework for building tools that agents can call.

The AI Agent uses a set of tools to perform specific tasks. These tools include:

  • Email Classification Tool: Classifies incoming emails into categories (e.g., inquiry, complaint, follow-up) to determine the appropriate response strategy.
  • Email Drafting Tool: Generates draft responses based on the email content and context.
  • Context Update Tool: Updates the agent's memory based on new information from incoming emails.
  • Email Sending Tool: Sends emails on behalf of the agent.

Cloudflare Email Routing currently does not support sending multiple emails directly from Durable Objects, so I used Resend as an external email-sending service. This is a temporary workaround until Cloudflare adds support for sending emails directly from Durable Objects.
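
As a concrete example of a deterministic tool, here is roughly what the sending tool looks like with the AI SDK's tool helper and the Resend SDK (the factory name and schema are illustrative):

```ts
import { tool } from "ai";
import { z } from "zod";
import { Resend } from "resend";

// Deterministic tool: no LLM involved, just a typed wrapper around Resend.
export function createSendEmailTool(apiKey: string, from: string) {
  const resend = new Resend(apiKey);

  return tool({
    description: "Send an email from the assistant's address",
    inputSchema: z.object({
      to: z.string().email(),
      subject: z.string(),
      body: z.string(),
    }),
    execute: async ({ to, subject, body }) => {
      const { data, error } = await resend.emails.send({
        from, // e.g. "Assistant <hello@somai.me>"
        to,
        subject,
        text: body,
      });
      if (error) return { sent: false, error: error.message };
      return { sent: true, id: data?.id };
    },
  });
}
```

Both the Workflow and Loop Agents can register this tool alongside the AI-powered ones, since they all share the same input/output contract.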

Email Classification Tool

I want to walk through one of the AI-powered tools in detail: the email classification tool. This tool is responsible for classifying incoming emails into categories to determine the appropriate response strategy.

Although the classification is handled by an LLM, the tool forces the output to conform to a strict schema using Zod validation. This ensures that the agent receives structured, predictable data it can use to make decisions. LLMs excel at generating structured data, and Zod validates at runtime that the output conforms to the expected schema. One trick I use is to include the Zod schema definition in the system prompt so that the LLM knows exactly what structure to follow.

There are three main parts to the email classification tool:

  • Input Schema (lines 6-10): Defines the expected input structure for the tool. In this case, it expects the current agent memory state, which includes messages and context.
  • Output Schema (lines 12-31): Defines the expected output structure for the tool. The output includes the classified intents, risk level, recommended action, whether approval is required, and comments.
  • Execution Logic (lines 58-65): The core logic of the tool that uses an LLM to classify the email based on the input state. It constructs a prompt that includes the email content, historical messages, and context, then invokes the LLM to generate the classification.
./agent/tools/classify-email-tool.ts
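
The full file is in the repo (the line references above point at it); a condensed sketch of the same idea, using the AI SDK's generateObject to enforce the Zod schema, looks like this (model choice and category names are assumptions):

```ts
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import type { AgentMemory } from "../memory";

// Output schema mirroring the fields described above: intents, risk level,
// recommended action, approval flag, and free-text comments.
export const classificationSchema = z.object({
  intents: z.array(z.enum(["inquiry", "complaint", "follow-up", "scheduling", "other"])),
  riskLevel: z.enum(["low", "medium", "high"]),
  recommendedAction: z.enum(["reply", "ignore", "escalate"]),
  requiresApproval: z.boolean(),
  comments: z.string(),
});

export type EmailClassification = z.infer<typeof classificationSchema>;

export async function classifyEmail(
  email: { subject: string; body: string },
  memory: AgentMemory
): Promise<EmailClassification> {
  // generateObject validates the model output against the Zod schema at runtime,
  // so downstream steps always receive well-formed, predictable data.
  const { object } = await generateObject({
    model: openai("gpt-4o-mini"),
    schema: classificationSchema,
    system: "Classify the incoming email for an autonomous email assistant.",
    prompt: `Conversation context:\n${memory.context}\n\nSubject: ${email.subject}\n\n${email.body}`,
  });
  return object;
}
```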

Summary

In this post, I walked through the architecture and implementation of my personal AI Assistant using Cloudflare Workers and Durable Objects. The key components include custom email routing based on email threads, per-contact memory backed by each Durable Object's SQLite storage, and distinct agent patterns (Workflow and Loop Agents) for handling different types of emails. The use of tools, both deterministic and AI-powered, allows for modular and composable agent behavior.

We are still in the early days of AI Agents, and there is much to explore and improve. I think the combination of Cloudflare's serverless platform and the Agents SDK provides a powerful foundation for building scalable and efficient AI-powered applications. However, the tooling still feels a bit confusing, and the frameworks are evolving rapidly. I started this project as an experiment, but it took several iterations to get to a prototype that is architecturally sound and functionally useful.

In terms of safety and security, I took several precautions to ensure that the agent behaves responsibly. The separation of owner and external contact workflows helps prevent unintended actions. The use of strict schemas for tool outputs ensures that the agent receives structured and predictable data. Additionally, the routing logic based on email threads helps maintain context and continuity in conversations.

Send me an email at hello@somai.me and you will be routed to the agent. It is early, but it already works.

Footnotes

  1. By building "from first principles," I mean starting not from existing software patterns or human workflows, but from the fundamental capabilities and constraints of LLMs themselves. Most AI products today attempt to fit language models into interfaces designed for humans—dashboards, buttons, forms. The first-principles approach inverts this: what would software look like if we designed it around what an agent can naturally do? In the case of email, this meant asking: given that LLMs are stateless, context-dependent, and excel at language understanding, what architecture would allow an agent to operate autonomously within a protocol that was never designed for machines? The answer required rethinking memory, state, and interaction patterns from the ground up—not retrofitting AI into an inbox UI.

  2. Vercel vs AWS vs Cloudflare: Vercel is an amazing platform. It has historically been great for web apps. The recent fluid compute, AI SDK, and Workflow are very powerful and have tons of community support. However, Vercel is tightly coupled to Next.js and the web framework lifecycle. For this project, it felt unnatural for an agent-first system. I really like the craft the team at Vercel put into building all the products and services, but I did not want to build on top of a web app. AWS is a powerful hyper-scaler. It has become, though, an onerous platform to start with. I certainly do not want to set up VPCs, configure Bedrock, manage IAM, connect my CI with CloudFormation, and orchestrate a mountain of infrastructure pieces just to get a prototype running. Moreover, their foray into AI with AWS Bedrock and its sub-services like AgentCore does not seem to resonate with the new community of developers. For instance, I find the development experience on AWS Lambda to be quite challenging compared to Vercel. AWS's new AI services do not have the same "care" and dynamism that AWS used to put into building things like DynamoDB or S3 back in the day.

  3. I am excited but also skeptical. I am curious to see how Cloudflare will execute on this vision over time. There are still many open questions around Durable Objects, and I have yet to see successful businesses and ideas built on top of them.

  4. We already see the emergence of new infrastructure constructs and primitives: sandboxed execution environments that allow agents to invoke tools safely, dedicated agent VMs that maintain state between invocations, RAG systems that ground LLM reasoning in retrieved knowledge, and orchestration layers (LangGraph, Temporal, AWS Step Functions) that coordinate multi-step agentic workflows. All of these point to a future where agents require continuity rather than ephemerality. Traditional serverless platforms are not designed for this paradigm shift.

  5. For more information about email threading and headers, see the RFC 5322bis draft (draft-ietf-emailcore-rfc5322bis-12). Note that we assume In-Reply-To always references a single parent, so we can walk backwards through the References field to find the parent of each message listed there. This approach does not handle replies with multiple parents (which the RFC discourages). https://datatracker.ietf.org/doc/html/draft-ietf-emailcore-rfc5322bis-12#name-identification-fields

  6. This is similar to the concept of tool use in agentic systems where LLMs are used as components that can be orchestrated by higher-level logic. By breaking down the agent's behavior into discrete steps, we can better manage complexity and ensure that the agent behaves as intended. This might seem confusing and it is. But as we build more sophisticated agentic systems, we will need to adopt such modular and composable architectures to manage the complexity of agent behavior.