Commerce Protocols Solved Checkout. The Hard Part Comes Next.

I hate running out of pantry staples and prefer to keep backups of the essentials, especially coffee. We have a Sonos speaker with Alexa running in the kitchen. When I use up the last of a pouch of coffee, before I grab the next bag, I call out, “Alexa, tell AnyList to add Peet’s coffee to the grocery list.”

It’s ordinary now and has been for years, but I can see in hindsight that it was setting us up for this day. I speak at an ordinary volume, in an ordinary kitchen, as if a butler were there. I’m no longer experimenting or playing with these products. They’re part of my life, the same as the computer I’m typing on.

When Siri was first introduced over a decade ago, that was the moment all of this felt inevitable, not futuristic. Now it almost feels boring. The task completes, and I move on to drinking my coffee.

I had not unlocked my phone or opened an app. I didn’t navigate to the grocery list, search for the item, or confirm a save. I simply called out my request into one system, and another system updated state on my behalf. Peet’s coffee appeared in AnyList, an app I have used and trusted for years, but one Alexa does not own.

When Amazon shut down the List Skill API, AnyList had to use an invocation name for its custom skill. Before that, I could just say “add Peet’s coffee to the grocery list” and Alexa would infer that my list lived in AnyList. A lot of us were frustrated when we had to start naming the tool explicitly in the request. But it also immediately dawned on me what Amazon was setting up. The interface I used was not the system that executed the work. Authority still propagated because I had already granted permission for those systems to cooperate. The coordination happened quietly enough that it was easy to miss, and maybe that is exactly why it matters. The biggest platform shifts usually arrive disguised as convenience.

For most of the internet era, we accepted a different model. The company that owned the interface owned the relationship. If you logged into a retailer, that retailer controlled the path. If you logged into a bank, the bank controlled the path. If you opened a service app, that app defined the terms of interaction and the limits of what could happen next.

We built a lot of software around that assumption. We built identity around it, loyalty around it, and customer support around it. It was not sinister. It was structural. Every digital system was its own island, and the human was responsible for island hopping.

What is changing is not that the islands disappear. What is changing is who does the crossing and where the user experience is anchored.

Voice assistants were an early hint. They started as thin convenience layers over existing systems, then gradually became the place where intent was expressed first. That same pattern is now showing up with coding agents, shopping agents, and service agents. More often, the first move is not opening the system of record. The first move is asking the agent you trust.

I have seen this firsthand with coding agents too. When an agent I was working with recommended a deployment platform, it did not pick based on brand or marketing. It picked the option with the least friction. No campaign language. No opinion about brand. Just an operational judgment about ease of execution. I had just written about this idea that “you don’t win, you qualify” in the context of retailers and GEO, but experiencing it as a user still caught me off guard. Systems that are easier for agents to operate will be selected more often, and systems that are harder to operate will lose share of interaction over time, even if their human-facing marketing is stronger.

This is one reason the current protocol work in commerce matters so much. Universal Commerce Protocol (UCP) and Agentic Commerce Protocol (ACP) are meaningful progress. They are giving agents structured rails for discovery, offer handling, authentication, and checkout. Adoption will take off, and that progress also reveals what comes next.

Checkout is necessary, but checkout is only a thin slice of a much longer relationship. Buying has edges you can define clearly. Stewardship does not. Stewardship stretches across history, identity, habit, and change over time.

But the relationship outlasts the receipt. You can see it in something as simple as apparel. Putting my old Gap, Inc. hat on, what would need to be added to these frameworks to allow a major brand like The Gap to create an agent for its customer that can do everything? I’m thinking about the entire customer lifecycle and history of engagement. What I learned deep in customer files and purchase history was that a large brand like The Gap, with its sub-brands, has customers who start their lives in Gap, get their first job wearing a suit from Banana Republic, and stay active in Athleta. Later, when they get married and have kids, those kids are outfitted in Baby Gap, and the family ends up in “Jingle Jammies” from Old Navy. That consumer wants the entire customer history and context to be available. Hell, I’d like to say to this agent, “Remember those jammies my wife bought a couple Christmases ago? Estimate today’s sizes, or pull recent purchase history on the kids, and get us all matching fun t-shirts for spring break this year.”

There are so many things this agent could do if it had more access, and I’d want to control that access through traditional settings and permissions. In data models, Gap, Banana Republic, Athleta, and Old Navy may be separate brands. In lived experience, it is one evolving story.

That is not transaction execution. It is customer experience stewardship.

The same inversion becomes even clearer with my truck. I drive a 2014 Ford F-150, and like anyone who has owned a vehicle long enough, I have a mental map of its quirks. I know how it sounds on a cold morning. I know when it feels just a little off before a dashboard warning ever appears.

What I do not want is to manage that relationship by logging into a maze of disconnected portals. I do not want to think about which interface contains recall notices, which one has service records, or which one has maintenance timelines. I’m tired of scanning through a folder of service order receipts and paperwork. More than anything, I do not want Ford to own my interface for this problem.

I want my agent to own the interface.

I want to ask one question, in one place, and have the answer come back as a coherent plan. I want the agent to talk to Ford systems, dealership systems, and service data in the background, then surface only what I need to decide.

That is not an incremental improvement to the customer experience. It is a structural inversion. For decades, owning the customer interface was the business model. The portal was not a convenience. It was a moat. And whoever owned the moat owned the relationship. What I am describing is a world where manufacturers, retailers, and service providers remain essential to the system but no longer control the surface. The agent does.

When the interface is agent-primary, authority, execution, and data ownership no longer have to be bound to the same company in the same moment. They can be coordinated through delegated permission.

You see the same need outside commerce too. Healthcare is an obvious example. Families do not want to track every portal, every appointment system, every billing surface, and every prescription workflow by hand. They want one trusted conversational layer that can carry context, ask for approvals when needed, and keep continuity across institutions.

My use case is more involved because I have a disabled son, but the pattern is universal. Any parent with a kid has a pediatrician, a dentist, and likely at least one specialist. Change providers once or move once, and records start spreading across portals. We carry separate insurance for medical and dental, yet sometimes dental work needs to be billed to health insurance. Without a password manager, my wife and I could not keep track of it all. Add in scheduling tools, prescription pickups at the pharmacy, follow-up items from recent visits, and doctor signatures for summer camp, and you have a small bureaucracy that every family runs by hand. Even my parents have raised this coordination challenge with their own medical providers, therapists, and dentists. Here again, I want my agent to own the interface and orchestrate this web of data and transactions on my behalf.

That is why a recent “Markdown for Agents” feature announcement from Cloudflare gives more signal than it might appear at first glance. Cloudflare sits on global traffic paths. They do not need to guess from a trend deck about where interaction patterns are heading. They can observe change directly at infrastructure scale. When a company that built its reputation helping sites detect and control automated traffic starts helping sites become readable and operable for agents, that is not a cosmetic product tweak. It is infrastructure responding to what it is already seeing. Enough agents with delegated authority are now visiting enough websites that the companies behind those websites are realizing their content, and eventually their transactions, need to be legible to an AI.

This is also why I find the merchant agent versus consumer agent framing less useful than it sounds. Both types of agents will exist because they solve different problems. Merchant agents will remain strong in brand-specific depth, catalog knowledge, inventory nuance, fulfillment policy, and loyalty mechanics. Consumer-primary agents will remain strong in cross-brand continuity, personal context, and long-horizon coordination.

I do think many retailers will not make it through this AI revolution in commerce. Some merchants will survive and earn the right to stay engaged. Likely it will be mega retailers such as Walmart with its Sparky assistant, The Gap, and grocery leaders such as Kroger. Then there is the consumer-driven ecosystem still being fought over by Siri, Alexa, Gemini, and Meta. But all brands and merchants will have to make themselves available to that consumer-driven ecosystem to stay relevant. For example, I might not want to engage a Ford Motor Company AI because I don’t have that much going on with Ford, but I would like to say, “ChatGPT, are there any upcoming maintenance items I should plan for with Dorothy, or any new recalls?” (Dorothy is my truck.)

The real bottleneck is not whether one of these disappears. The bottleneck is continuity between them.

Most agent interactions today are still session-bound. They authenticate, execute, and expire. That model is perfectly fine for narrow tasks. It starts to break when the relationship itself is the product and when the value depends on memory, context, and repeated action over time.

What is missing is a durable delegation layer. Durable delegation is not just a longer token lifetime. It is a model where a user can grant persistent, scoped authority to a trusted agent with clear boundaries, clear auditability, clear revocation, and stable identity continuity across systems. It is a model where lifecycle surfaces like returns, warranties, subscriptions, maintenance events, and service history are treated as first-class operations rather than custom exceptions.
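To make that less abstract, here is a minimal sketch of what a durable grant could look like as a data structure. Everything in it is hypothetical: the `DelegationGrant` name, the scope strings, and the audit record shape are my own illustration and are not taken from UCP, ACP, or any shipping product. It only shows the four properties named above living in one object: scoped authority, an audit trail, revocation, and a stable user and agent identity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DelegationGrant:
    """Hypothetical record of authority a user hands to one agent."""
    user_id: str                 # stable identity of the person delegating
    agent_id: str                # stable identity of the agent acting for them
    scopes: frozenset            # allowed operations, e.g. "warranty:read"
    revoked: bool = False
    audit_log: list = field(default_factory=list)

    def authorize(self, operation: str) -> bool:
        """Allow an operation only if the grant is live and in scope.

        Every attempt is logged, whether it is allowed or denied.
        """
        allowed = (not self.revoked) and operation in self.scopes
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent": self.agent_id,
            "operation": operation,
            "allowed": allowed,
        })
        return allowed

    def revoke(self) -> None:
        """The user withdraws the grant; later requests are denied."""
        self.revoked = True


# Illustrative usage with made-up identifiers and scopes.
grant = DelegationGrant(
    user_id="user-123",
    agent_id="agent-abc",
    scopes=frozenset({"warranty:read", "service_history:read"}),
)
grant.authorize("warranty:read")   # allowed: live grant, in scope
grant.authorize("order:create")    # denied: outside the granted scopes
grant.revoke()
grant.authorize("warranty:read")   # denied: the grant was revoked
```

Notice that there is no expiry timestamp here. The grant lasts until the user revokes it, which is the difference from a session token. A real system would also need the hard parts this sketch skips, such as verifying identity across companies and storing the audit log somewhere the user can inspect it.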

I feel this every time I need something simple, like checking a warranty, updating a subscription, or confirming a service date. The cognitive load is not the task itself; it’s remembering which system owns which slice of data and which login gets me there. I end up bouncing across multiple portals that already have what they need, while I manually stitch the answer together in my head. From a builder’s perspective, that is the absurd part: the systems can exchange state better than I can reconcile tabs, passwords, and receipts. The quiet realization is that this coordination work should not belong to the human at all.

Without that layer, we can automate purchase completion but not lifecycle stewardship. And clearly we humans want this. It’s been in our sci-fi stories forever, and it had a bit of a breakout moment with OpenClaw, which demonstrated the pent-up demand for a durable agent that is delegated real power by its human user.

In August 2025, before the Universal Commerce Protocol (UCP) was announced, I jotted down some pieces I felt were missing from the Agentic Commerce Protocol (ACP). Chief among them were account-management features, not too dissimilar to what B2B commerce platforms support for large organizations with many actors handling various transactions. It was also clear that the same features could support more limited consumer uses, like parental controls that let kids shop and learn about commerce within boundaries.

None of this implies a clean overnight transition. Platform incentives are real. Liability concerns are real. Standards always lag product pressure for a while. Some ecosystems will continue favoring temporary, per-task authority patterns long after others move toward persistent delegation.

But when I look at the pressure in the system, the direction still looks consistent. Lifecycle complexity keeps growing. Users keep signaling they prefer one trusted conversational locus over dozens of equal interfaces. Merchants keep benefiting when interaction friction drops. Agents keep preferring systems that are easier to operate end to end. And history keeps showing that when friction is optional, markets gradually remove it.

So the point is not that every company will implement one universal protocol at once. The point is that the primitives will converge because the incentives are converging. Different stacks will move at different speeds, but the architectural center of gravity is already shifting.

This is the client-to-actor transition.

In the current shape of the web, agents behave mostly like clients. They make requests inside short-lived permission windows and then disappear. In the emerging shape, trusted agents begin to behave more like durable actors. They carry identity, scoped authority, continuity, and accountability over time on behalf of a person.

Seen that way, this is not simply a story about agents placing orders. It is a story about unbundling pieces we historically kept fused together. Interface can separate from execution. Authority can separate from ownership. Coordination can span systems without requiring one system to own the human relationship outright.

When Amazon forced Alexa requests to include the tool name, such as AnyList, it felt like a small moment. But it was also a preview. Delegation already exists in fragments, and the fragments are becoming legible.

Commerce protocols solved the first layer. Agents can now buy. That is real progress, and the work behind it was not trivial.

The hard part comes next. The next layer is stewardship.

The question is no longer whether agents will transact for us. The question is how quickly we will build the authority models needed for agents to steward our digital lives responsibly. Meanwhile, the kitchen is quiet, the coffee is restocked, and the coordination that once required my attention already happens somewhere beyond my view.
