It took me a year to see the painful truth about Agent payments
Author: jessy
Compiled by: Jiahua, ChainCatcher
Over the past year, I have been dedicated to building infrastructure for the Agent economy, talking with teams from Stripe, Visa, Coinbase, Google, and dozens of startups driving Agent business. I have mapped the entire industry, launched products, and searched for product-market fit.
Currently, there is no real demand, and startups face many structural issues when entering this field.
Last month, Stripe launched 288 new products at the Sessions conference, and traffic to their Agent documentation accounts for nearly 40% of total documentation readership. Their Agent business marketplace has over 1,000 enabled merchants. Yet as of the Sessions conference, the number of registered Agents actually conducting transactions was only in the single digits.
Visa mentioned that their Agent payment tokens (linked to Agents and used for tokenized payment on behalf of users) currently require 3 to 9 months of KYC approval, and in practice a minimum revenue threshold of $250 million must be met to qualify. Today, only companies at the level of Amazon and Walmart can complete this identity verification loop.
Coinbase reported that as of April, there were 69,000 active Agents on the x402 protocol and 165 million transactions. However, independent on-chain analysis shows that the actual daily transaction volume is about $17,000, with approximately half being test transactions (according to CoinDesk in March 2026).
Agents and Merchants
We built shop.fast.xyz to directly validate the real applications of purchasing-based commerce. It includes real products, merchants, and transactions.
For most product categories, the current user experience of AI shopping is far inferior to traditional e-commerce. When you buy clothes, electronics, or furniture, you want to see pictures, browse various options, and make side-by-side comparisons.
The conversational format of chatbots is actually a regression. You are replacing a rich visual interface with pure text dialogue, while humans are essentially visual shoppers.
Agents perform well in areas we originally expected to be difficult. They understand user needs and handle instructions like "something similar but cheaper" effectively. The model layer does its job.
But that cannot replace the experience of browsing ten products side by side and then selecting one. The chat interface can be enhanced with carousels and interactive displays, but at that point you are essentially recreating an e-commerce frontend inside the chat window. For visually driven comparison shopping, we have yet to find a compelling reason to believe that chat interfaces are better than native e-commerce interfaces.
We see real demand from merchants, but it is a defensive demand.
Merchants want their stores to be queryable by Agents. This is not because current customers are buying through Agents, but because they worry that if this becomes the mainstream channel, they will be left behind.
This is a strategy of "Agent Engine Optimization (AEO)," but for now it is a nice-to-have rather than a necessity. Merchants are preparing for a wave that has yet to arrive.
Conversational commerce can indeed enhance the experience in certain scenarios: high-frequency, low-decision-cost purchases where users already know what they want. Ordering takeout is the most obvious example. The market is huge, the frequency is extremely high, and decisions are quick ("Help me order Pad Thai from that place I used last time"). Conversational Agents have a chance here.
However, major takeout platforms have not opened their APIs. The only way in is "computer use": letting AI operate applications through visual navigation like a human. This method is slow and fragile, and its inference cost is too high to justify on a $15 lunch order.
Another breakthrough lies in stores whose UI navigation is extremely complex and painful: stacked discounts, promotional codes, loyalty programs, and confusing checkout processes.
An Agent that can understand "use my coupon, deduct my reward points, find the cheapest shipping, operate in my native language" can simplify those currently poor experiences. This matters most for elderly users, non-native speakers shopping at foreign online stores, or very niche scenarios with specific needs.
Both breakthroughs require a large consumer-facing (B2C) distribution channel. You are competing for user entry points with DoorDash (the largest delivery platform in the U.S., holding 56% market share) and Amazon.
Consumer-scale distribution is an advantage for giants. The supply side of purchasing-based commerce is ready, while the demand side is limited by user experience and distribution channels; building more infrastructure does not solve these two problems.
Agents and APIs
We discussed actual payment needs with dozens of developers. The picture is almost astonishingly consistent: Agents already call APIs frequently today, for compute, inference, and data sources. Developers already have subscriptions, stored API keys, and billing relationships with their core providers.
The typical argument for stablecoins is that on Stripe, the minimum effective cost of credit card processing is about 2.9% plus 30 cents, making API calls under a dollar unprofitable. But at today's low transaction volumes, prepaid balances solve this problem. Developers pre-load their accounts, and the issue is resolved.
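To make the arithmetic concrete, here is a minimal sketch using the 2.9% plus 30 cents figure quoted above. The $0.50 call price and the $20 top-up are hypothetical amounts chosen only for illustration, and the function names are ours, not part of any Stripe API.

```python
from decimal import Decimal

PERCENT_FEE = Decimal("0.029")  # 2.9%, the figure quoted in the text
FIXED_FEE = Decimal("0.30")     # 30 cents, the figure quoted in the text


def card_fee(amount: Decimal) -> Decimal:
    """Processing fee for a single card charge of `amount` dollars."""
    return amount * PERCENT_FEE + FIXED_FEE


def fee_share(amount: Decimal) -> Decimal:
    """Fee as a fraction of the amount charged."""
    return card_fee(amount) / amount


# Hypothetical amounts, for illustration only.
call_price = Decimal("0.50")  # one API call billed as its own card charge
top_up = Decimal("20.00")     # one prepaid top-up that many calls draw down

print(f"per-call charge: fee is {fee_share(call_price):.1%} of the charge")
print(f"prepaid top-up:  fee is {fee_share(top_up):.1%} of the charge")
```

Under these assumptions, the fixed 30 cents dominates a small charge and becomes a minor cost once many calls are bundled into one top-up, which is why prepaid balances take most of the force out of the fee argument at current volumes.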
The deeper issue lies in the vendor market. Most mainstream SaaS companies do not want to provide temporary API access that costs just a fraction of a cent. Their business model is based on multi-year enterprise contracts. Companies that rely on large commitment contracts will resist pricing mechanisms that bypass their existing models.
Machine commerce is structurally a long-tail market, including smaller services, niche data sources, individual developers, and MCP servers. Protocols like MPP and x402 are very suitable for this niche market.
But by definition, this is a market serving advanced users with special needs, and historically, developers have often been one of the groups with the lowest willingness to pay.
When Stripe Projects launched, it had 32 vendor partners, such as Vercel, Supabase, Cloudflare, and Twilio, covering most of the tools developers use to build and deploy software, all accessible through existing billing systems. The head of demand in the developer tech stack has already been met.
The opportunity for new payment rails lies in everything outside these top 30 services. It is real, but its scale is inherently much smaller than the headline numbers suggest.
The same pattern applies to content acquisition. Agents have been continuously scraping and summarizing articles, while publishers are fighting back.
But when content monetization arrives on a large scale, it will be realized through CDN providers that are already positioned between publishers and the internet (Cloudflare has already launched AI auditing tools for this purpose), or through large-scale licensing agreements between publishers and AI labs.
The opportunities for this infrastructure will ultimately flow to those giants that already have distribution channels.
Agents and Agents
The Agent-to-Agent business model is a long-term vision that currently exists almost entirely at the theoretical level, with no one achieving meaningful transaction volumes. Various startups are tackling the core challenges: Agent discovery, trust establishment, terms negotiation, and dispute resolution.
When this transaction structure truly arrives, it will look completely different from existing payment rails. Neither party to the transaction will be a human. Latency will be sub-second. Amounts ranging from fractions of a cent to millions of dollars will flow through the same process.
Additionally, there will be multi-party settlement mechanisms, which do not fit the bilateral buyer-seller model that existing payment rails assume. Once this happens, we believe it will come quickly and at a large scale.
This is a long-term bet on dedicated settlement infrastructure, and it is a real one. But a "real long-term bet" and a "current market" are two different things.
For months, we have been among those promoting this market and have built a complete infrastructure around it over the past few years. Our distributed network can theoretically scale to over 1 billion TPS, with latency under 50 milliseconds and average consistency of 10 milliseconds. But we must meet the market where it actually is today.
Agents and Finance
This is arguably the only category with existing demand. The customer base already exists and is willing to pay. Today, fund managers, finance teams, and DeFi users are paying for financial tools. Integrating AI into existing workflows is a natural product evolution.
Agent finance also creates entirely new behavioral patterns. An Agent that can autonomously monitor and rebalance hundreds of positions in real-time operates in ways that humans cannot manually replicate. This is not just automation; it is a substantial capability enhancement.
The challenge lies in the competitive landscape. The financial industry is heavily regulated and highly dependent on existing business relationships. Established institutions have licenses, compliance infrastructure, and customer relationships. Startups can seek a foothold in areas with lighter regulation (like DeFi), where giants are slow to act, or in areas where AI can create capabilities that giants do not possess.
However, compared to the other three categories, the competitive dynamics here are more favorable to mature enterprises, as layering AI on top of existing products and customer bases is far easier than the reverse.
The Real Competitive Edge
So, why is everyone still building these things? There are two reasons.
First is incentives. Industry giants have ample cash flow to bet on a future that may take years to materialize. For them, the cost of entering five years early is just a rounding error, while the cost of entering a year late could be catastrophic. So they must build.
Second is cognitive blind spots. When your main business is payments, every problem looks like a payment problem. The Agent economy needs a payment layer, so let’s build that payment layer.
But payments are just one part of a larger problem. The real challenge is not how to transfer funds between Agents, but how to coordinate work between Agents and humans, verify work results, and settle outcomes. Payments are just part of settlement. Settlement is just part of coordination. And coordination is the real prize.
Large-scale coordination will naturally give rise to settlement mechanisms as a necessity. Payments are just one instrument in this symphony, not the entire movement. Companies that solve coordination problems will absorb payment businesses, not the other way around.
Most established companies are engaging in defensive construction to prepare for future scenarios of large-scale machine transactions. Since their funding runway is infinite, timelines do not matter to them.
But startups do not have that luxury. We must find where the market actually is; we cannot just wait for the wave to crash ashore.
A year of building has led us in an unexpected direction. There, market activity is real, growing rapidly, and still underserved. It lies outside the four categories we have described.











