Unlocking the agent-to-agent AI opportunity

Agent-to-agent (A2A) protocols allow autonomous AI agents to connect and execute workflows directly across different organisations.
Without structured constraints, AI automation can rapidly escalate small processing errors into large, systemic financial issues before humans notice.
To bring these agents online safely, AI governance must be front of mind: businesses should deploy clear human-led oversight models, precise operational scope definitions, and strict payment ceilings.

You ask your AI assistant to book a flight to Melbourne for a Tuesday morning meeting. Within seconds, it has contacted your corporate travel provider’s booking agent, checked availability, and selected a flight and hotel within your company’s travel policy. Your agent either completes the booking and charges a nominated card, or it pauses and presents the reservation for your approval.

No email was sent. No phone call was made.

This is agent-to-agent AI (or A2A) in operation. The underlying protocol is available now, and the commercial use cases are beginning to move from experimentation into production, says Sam Hemphill, Executive General Manager – Customer, Channels and Data, CommBank.

Getting there safely is the harder part.

What A2A actually means

A2A describes interactions between two or more AI agents, either from different organisations or within one, “as distinct from one agent accessing multiple tools or multiple skills”, says Hemphill.

The protocol that makes this possible is an open standard, originally launched by Google in April 2025 and now maintained by the Linux Foundation. More than 150 organisations support it, including Microsoft, AWS, Salesforce and SAP. Think of it as a common language that lets agents from various applications understand each other.

In practice, every A2A-compliant agent publishes what is called an ‘agent card’ – a structured description of what it can do and how to interact with it. Another agent can use this information to determine whether it can safely delegate a task.

When your AI assistant needs to book a flight, it locates the travel provider’s booking agent, reads its agent card to understand its capabilities and sends a task request. The booking agent accepts the task, executes it and returns the result. Critically, the two agents interact without needing to share internal memory, tools or proprietary logic – your systems and the travel provider’s systems remain separate.
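The discovery step can be pictured with a short Python sketch. It is illustrative only: the card's field names are loosely modelled on the public A2A specification, while the values, the URL, the helper functions and the simplified request shape are all assumptions made for this example, not the protocol's actual wire format.

```python
from typing import Optional

# Hypothetical agent card. Field names loosely follow the public A2A
# specification; the values and URL are invented for illustration.
AGENT_CARD = {
    "name": "Travel Booking Agent",
    "description": "Searches and books flights and hotels.",
    "url": "https://travel.example.com/a2a",
    "version": "1.0.0",
    "skills": [
        {"id": "search_flights", "name": "Search flights"},
        {"id": "book_flight", "name": "Book flight"},
    ],
}


def find_skill(card: dict, skill_id: str) -> Optional[dict]:
    """Return the advertised skill with this id, or None if it is not listed."""
    for skill in card.get("skills", []):
        if skill.get("id") == skill_id:
            return skill
    return None


def build_task_request(card: dict, skill_id: str, details: dict) -> dict:
    """Build a simplified task request, refusing skills the card does not list."""
    if find_skill(card, skill_id) is None:
        raise ValueError(f"{card['name']} does not advertise {skill_id!r}")
    # Simplified shape for illustration; not the A2A wire format.
    return {"to": card["url"], "skill": skill_id, "details": details}
```

Here the assistant only delegates work the other agent has said it can do: a request for `book_flight` is built, while a request for a skill the card does not list is rejected before anything is sent.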

Within a single organisation, the need for multiple agents depends on workflow complexity. As individual agents become more capable, some tasks that once required several agents may be consolidated. The more consequential shift is happening across organisational boundaries.

“We are seeing increasing examples of A2A workflows between organisations,” Hemphill says, including personal agents interacting with external businesses. That is where the commercial opportunity – and the governance complexity – is greatest.

What tasks is A2A best suited for?

A2A workflows are generally best for routine, well-defined tasks where boundaries can be set in advance and the stakes per transaction are bounded. Travel booking is a classic example: destination, dates, class of travel and budget can all be specified before the agents interact.

But the range of tasks that meet these criteria across a business – with its customers, suppliers and service providers – is vast. Procurement approvals, invoice matching, appointment scheduling, logistics coordination: the next wave of automation is not about replacing individual tasks, but about connecting entire workflows between organisations.

What makes this wave different from previous automation is not speed or scale – it is responsibility. Earlier automation followed fixed rules. AI agents can choose between options and initiate actions within a defined scope. That shift changes the nature of what organisations are handing over to technology, and it demands a different kind of governance in return.

The 100-flights problem illustrates why. Without the right constraints, nothing stops an agent from making the same type of decision repeatedly and rapidly – booking 100 flights in two minutes, committing to contracts across multiple suppliers or escalating a small error into a costly chain of actions before any human notices.
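One way to picture a constraint against the 100-flights problem is a cap on how many actions an agent may take within a time window. The sketch below is a minimal illustration in Python; the class name and the limits used are hypothetical and are not drawn from any A2A tooling.

```python
from collections import deque


class ActionRateCap:
    """Allow at most max_actions within any sliding window of window_seconds."""

    def __init__(self, max_actions: int, window_seconds: float) -> None:
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._timestamps: deque = deque()

    def allow(self, now: float) -> bool:
        # Discard actions that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_actions:
            return False  # Cap reached: hold the action for human review.
        self._timestamps.append(now)
        return True


# Hypothetical limit: three bookings in any two-minute window.
cap = ActionRateCap(max_actions=3, window_seconds=120)
allowed = sum(cap.allow(now=second) for second in range(100))
```

With these assumed limits, an agent that attempts one booking per second for 100 seconds gets three through; the rest are held until the window clears or a human steps in.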

“If the actions of an agent in an A2A workflow cannot be precisely defined, managed, overseen and governed, then there is a level of risk involved that some will not be comfortable with.” – Sam Hemphill, Executive General Manager – Customer, Channels and Data, CommBank

What oversight do human teams and leaders have?

Return to the booking. The confirmation has come through. How do you know it is right? The destination, departure time, cabin class and total cost each represent a point at which something could go wrong. Whether a human checks before the booking is finalised or after depends on which oversight model a business applies.

Hemphill describes three forms of human-led monitoring. ‘Human in the loop’ means a human approves before the workflow is completed. ‘Humans at the helm’ means humans oversee, but without intervening at each step. ‘Human quality assurance’ means spot-checking completed workflows.
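The difference between the three models comes down to where the human sits relative to the moment a workflow completes. A rough Python sketch, with the enum, the step names and the sampling rate all chosen for illustration rather than taken from any standard:

```python
import random
from enum import Enum


class Oversight(Enum):
    HUMAN_IN_THE_LOOP = "human approves before completion"
    HUMANS_AT_THE_HELM = "humans monitor without approving each step"
    HUMAN_QA = "humans spot-check completed workflows"


def next_step(mode: Oversight, sample_rate: float = 0.1, rng=random.random) -> str:
    """Return what should happen to a workflow that is ready to complete."""
    if mode is Oversight.HUMAN_IN_THE_LOOP:
        return "await_approval"
    if mode is Oversight.HUMANS_AT_THE_HELM:
        return "complete_and_log"
    # HUMAN_QA: complete now, and flag a random sample for later review.
    return "complete_and_review" if rng() < sample_rate else "complete"
```

Only the first mode blocks completion. The other two let the workflow finish and rely on monitoring or sampling, which is why they suit lower-stakes tasks.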

“We typically see agents monitoring agents as common practice,” Hemphill says. “This is rarely the sole form of oversight but is valuable in augmenting human oversight, given the potential breadth of coverage, speed of identification and resolution of issues.”

For most organisations starting out, payment is a natural line of intervention. Letting an agent search and select is one thing. Letting it charge a corporate card is another. Accountability must be designed in from the start.

“Ideally, this relies less on reconstruction after the fact, and instead includes identification and intervention in real time,” Hemphill says. As A2A scales, he acknowledges, “there will certainly be A2A workflows in which human oversight of individual decisions is impossible”, which is why system design matters more than any individual check.

Greater complexity in A2A systems is not only a governance headache. It can also create exposure to cyber threats, says Hemphill.

“As model capability evolves, so does potential agent functionality and cybersecurity implications. This can both introduce or increase risk and vulnerabilities, and create enhanced functionality that can be deployed to try to strengthen defences.” – Sam Hemphill, Executive General Manager – Customer, Channels and Data, CommBank

The same AI capabilities that create new attack vectors are also the best available means of defending against them. For example, Claude Mythos Preview, an AI security tool released by Anthropic to a small number of organisations and governments, can help identify vulnerabilities in complex environments like A2A workflows.

Models like Mythos or OpenAI’s GPT-5.5-Cyber can reportedly identify vulnerabilities in large-scale infrastructure at a speed and depth no human team could match. Before long, AI may be the only tool sophisticated enough to defend against AI-powered threats.

When things go wrong

The travel booking comes through – but the agent has booked Brisbane, not Melbourne. Whose fault is that? For now, nobody can say with certainty.

“From what we understand, there are multiple and complex legal principles that could impact this,” Hemphill says. Applicable laws, principles, contractual arrangements and other factors will all shape questions of liability. But A2A interactions challenge traditional principles of contract law, and case law has not yet caught up.

That is a reason to ensure any commercial arrangement involving A2A interactions addresses agent liability explicitly, even if imperfectly. Before engaging another organisation’s agent, know whose agent you are dealing with and what its operating limits are.

Hemphill’s minimum viable governance comes down to a few conditions. Scope and permissions must be defined and documented precisely before the agents run – not just loosely described, but written down in terms the system can enforce. Controls must be built in from the start: prompts, guardrails, AI monitoring, human oversight and traceability.

On the practical side, consider running agent payments through a dedicated payment instrument, such as a credit card with a strict limit, so the agent cannot exceed an approved ceiling. Define what it can and cannot do – searching and booking is one thing, upgrading or adding extra passengers without approval is another. And build in a stop mechanism.

“You need the capacity to intervene and stop any workflows, if you are no longer comfortable for them to be agentically executed.” – Sam Hemphill, Executive General Manager – Customer, Channels and Data, CommBank
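The three practical controls described above (a payment ceiling, an explicit scope and a stop mechanism) can be expressed as checks that run before every agent action. The Python sketch below is a minimal illustration under stated assumptions: the class, the action names and the amounts are hypothetical, and a real deployment would enforce the ceiling at the payment instrument as well as in software.

```python
from dataclasses import dataclass, field


class GuardrailError(Exception):
    """Raised when an action falls outside the agent's approved limits."""


@dataclass
class AgentGuardrails:
    ceiling_cents: int            # hard spending limit for the instrument
    allowed_actions: frozenset    # documented scope, e.g. {"search", "book"}
    spent_cents: int = 0
    stopped: bool = False
    audit_log: list = field(default_factory=list)

    def stop(self) -> None:
        """Stop mechanism: block every further action from this point on."""
        self.stopped = True

    def authorise(self, action: str, cost_cents: int = 0) -> None:
        """Record the action if it is permitted; otherwise raise GuardrailError."""
        if self.stopped:
            raise GuardrailError("workflow stopped by operator")
        if action not in self.allowed_actions:
            raise GuardrailError(f"action {action!r} is out of scope")
        if self.spent_cents + cost_cents > self.ceiling_cents:
            raise GuardrailError("payment ceiling would be exceeded")
        self.spent_cents += cost_cents
        self.audit_log.append((action, cost_cents))  # traceability


# Hypothetical limits: a $1,000 ceiling, searching and booking only.
guard = AgentGuardrails(ceiling_cents=100_000,
                        allowed_actions=frozenset({"search", "book"}))
guard.authorise("search")
guard.authorise("book", cost_cents=60_000)
```

Under these assumptions, a second $500 booking or an unapproved upgrade would be refused, and calling `guard.stop()` halts everything, including searches. Amounts are held in cents to avoid rounding surprises.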

One cost that catches organisations off guard: token consumption. Running agent workflows at scale carries real compute costs that need to be in the budget from the outset, not discovered after the fact. These costs should be forecast, monitored and optimised as part of the operating model.
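A first-pass forecast is simple arithmetic: workflows per month, multiplied by tokens per workflow, multiplied by the price per token. The Python sketch below shows the calculation; every figure in the example is a placeholder, not a real volume or vendor price.

```python
def monthly_token_cost(
    workflows_per_month: int,
    input_tokens_per_workflow: int,
    output_tokens_per_workflow: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate monthly spend from per-workflow token usage and unit prices."""
    input_cost = workflows_per_month * input_tokens_per_workflow * input_price_per_million
    output_cost = workflows_per_month * output_tokens_per_workflow * output_price_per_million
    return (input_cost + output_cost) / 1_000_000


# Placeholder figures: 10,000 workflows a month, each using 20,000 input
# and 2,000 output tokens, priced at 3.00 and 15.00 per million tokens.
estimate = monthly_token_cost(10_000, 20_000, 2_000, 3.0, 15.0)
```

With these placeholder inputs the estimate comes to 900 per month in whatever currency the prices are quoted in. Multi-agent workflows often pass the same context between agents several times, so measured token counts from a pilot are a safer input than guesses.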

At its best, A2A handles the work more efficiently than any human could – the bookings, the approvals, the follow-ups – and returns that time to human employees. The technology to make this happen is ready. What remains is essential and unavoidable work: defining boundaries, establishing accountability and making sure the controls are in place before the agents run. That work still belongs with human teams.
