
“You can’t blame it on the box”

The Financial Reporting Council’s Mark Babington says accountability for agentic AI is yours. The vendors won’t warrant agent behaviour. The system integrators won’t carry the risk. The answer? Don’t put the box in charge of anything you can’t afford to get wrong.


Mark Babington is executive director of regulatory standards at Britain’s Financial Reporting Council, the body that regulates auditors and accountants. This month, he told the Financial Times something about agentic AI every CTO needs to read carefully.

“You can’t blame it on the box. If you use this technology, you are accountable for it.”

Buyer Beware

Microsoft, Workday, Salesforce, ServiceNow and Oracle were asked a direct question last month. How much liability do you accept when a customer’s agent makes the wrong call?

Microsoft declined to comment. The other four didn’t reply at all.

That silence is the story. These are the five vendors selling agentic AI hardest into the enterprise market. Between them, they touch most of the systems a large business runs on. None of them will say, on the record, what they will stand behind.

Box Can’t Be Blamed

To understand why no vendor will accept liability for an AI agent, you need to understand what the technology is.

A traditional piece of software is deterministic. Same input, same output. Every time. That predictability is what makes warranties possible. The vendor knows what their product will do. They can stand behind it.

Agentic AI is different. It is non-deterministic. The same input can produce different outputs on different runs. The system reasons probabilistically. It can hallucinate facts that aren’t there. It can take actions nobody anticipated. This is not a bug being fixed in the next release. It is how the technology works.
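Here is the contrast in toy form. The first function is classic software. The second is a stand-in for an LLM-backed agent that samples from a probability distribution. The actions, weights and temperature are invented for illustration; real agents sample over tokens, not a three-item menu, and nothing here is any vendor’s stack.

```python
import random

def vat_due(net: float, rate: float = 0.20) -> float:
    """Deterministic: same input, same output, every run.
    This is the property a vendor warranty rests on."""
    return round(net * rate, 2)

def toy_agent_decision(temperature: float = 0.8) -> str:
    """Non-deterministic stand-in for an agent. Identical inputs
    can yield different actions because the choice is sampled."""
    actions = ["approve", "escalate", "reject"]
    weights = [0.6, 0.3, 0.1]  # invented model preferences
    adjusted = [w ** (1.0 / temperature) for w in weights]  # temperature reshapes the odds
    return random.choices(actions, weights=adjusted, k=1)[0]

assert vat_due(1000.0) == vat_due(1000.0)  # holds on every run
print({toy_agent_decision() for _ in range(50)})
# Usually prints all three actions. There is no single output to warrant.
```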

Malcolm Dowden, senior technology lawyer at Pinsent Masons, put the consequence plainly. “If you’re giving a warranty about how something will behave, but it’s inherently unpredictable, then that makes it an uncomfortable contractual promise to make.”

Read any current enterprise AI contract and you will find the language has shifted. Less warranty, more “monitoring, observability and audits.” Those are services. They are not promises about outcomes. The vendor will help you watch the agent. The vendor will not stand behind what the agent does.

The Governance Pitch, And Why It Doesn’t Hold

The industry knows this is a problem. Its answer is something Gartner calls “guardian agents”: AI systems that monitor other AI systems for exceptions.

The pitch is that you can deploy agents safely because a second layer of agents will watch the first.

Independent AI researcher Dirk Roeckmann identified the structural problem that no AI governance layer yet solves.

“Non-determinism, first and higher order hallucinations and lack of formal verification of agentic actions are unsolved problems in the agentic workflow. It is not enough to let a non-deterministic review agent evaluate the actions of a non-deterministic executor or planning agent.”

Translation. You cannot fix a probabilistic AI system’s errors by asking another probabilistic AI system to check them. The checker has the same fundamental flaw as the thing being checked.

This is not a configuration issue. It is not a training issue. It is, as Roeckmann said, an unsolved problem.
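A back-of-envelope simulation makes the arithmetic plain. The error rates below are invented; the point is the shape of the result, not the numbers.

```python
import random

P_EXECUTOR_ERR = 0.05   # invented: the executor agent errs 5% of the time
P_REVIEWER_MISS = 0.10  # invented: the guardian misses 10% of those errors

trials = 100_000
uncaught = sum(
    1 for _ in range(trials)
    if random.random() < P_EXECUTOR_ERR     # executor gets it wrong
    and random.random() < P_REVIEWER_MISS   # guardian fails to catch it
)
print(f"uncaught error rate: {uncaught / trials:.3%}")  # ~0.5%, never 0%
```

The residual shrinks. It never reaches zero. And the multiplication assumes the two agents fail independently. Agents built on similar models share failure modes, so the true residual is worse than the arithmetic suggests.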

You can see the trap. The vendors selling agentic AI cannot warrant its behaviour because the behaviour is unpredictable. Their answer is to add a second layer of the same unpredictable technology and call it governance. The governance layer cannot solve the problem either, because it has the same flaw. The CTO is being sold an answer to a question the technology cannot currently answer.

What you need is a different intelligence checking the AI. A system that does not share the same failure mode. Humans are the obvious example. Embodied, accountable, capable of saying no. But once you put a human in the loop on every consequential decision, you have to ask what the agent is for in the first place.
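The minimal version of that loop is a gate. The names below are hypothetical, a sketch of the pattern rather than anyone’s product: the agent proposes, and anything consequential waits for a human to dispose.

```python
from typing import Callable

# Hypothetical list: anything here is too costly to get wrong.
CONSEQUENTIAL = {"send_payment", "reject_candidate", "file_vat_return"}

def execute(action: str, payload: dict,
            human_approves: Callable[[str, dict], bool]) -> str:
    """Gate pattern: the agent proposes, a human disposes.
    The checker is a different intelligence, so it does not
    share the agent's failure mode."""
    if action in CONSEQUENTIAL and not human_approves(action, payload):
        return f"blocked: {action} awaits sign-off"
    return f"executed: {action}"

print(execute("draft_reply", {}, lambda a, p: False))                # runs unsupervised
print(execute("send_payment", {"gbp": 25_000}, lambda a, p: False))  # blocked
```

The wider you draw that consequential set, the less the agent does on its own. Which is exactly the question.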

Where The Law Lands

If the vendor cannot warrant the behaviour, and the governance layer cannot verify the behaviour, the law has nowhere else to put the liability. It lands on the human who chose to deploy the system.

That is what “you can’t blame it on the box” means. It is not regulatory throat-clearing. It is the legal consequence of a technology nobody can stand behind.

Under UK data protection law, the user organisation is the data controller. The data controller is the liable party.

A practical example. If you deploy an AI agent to screen job applications and it discriminates, the Information Commissioner’s Office holds you responsible. Not the AI vendor. You.

Lawyers are negotiating hard to push that risk back onto vendors through contractual provisions on bias testing and explainability. They are losing more of those negotiations than they are winning.

Gartner predicts unlawful AI decision-making will generate over $10 billion in remediation costs by mid-2026. Their analysts have started talking about “defensible AI.” Systems whose decisions can withstand scrutiny when somebody asks how they were made. That phrase exists because current deployments cannot.

I’ve Seen This Black Box Before

I want to bring some history into this. I worked on SAP rollouts in the early days. We sat across the table from decision-makers and asked them a straight question. Did they want a black box approach, or a skills transfer approach?

The big system integrators preferred the black box. They built the system, after all. They handed it over. The client got something that worked on day one. They also got something nobody inside the business understood. Every change after go-live meant another invoice. Every upgrade meant another project. Every new requirement meant another consultant on site.

Skills transfer was the other path. The SI worked alongside the client’s team. They taught as they built. By the end of the engagement, the client owned the system in every sense that mattered. They could run it. They could change it. They could grow it as the business grew.

The companies that chose the black box paid for it for 20 years. The companies that chose skills transfer ran their own shops.

Agentic AI puts the same question on the table. The stakes are higher this time. The vendor will not warrant the behaviour. The SI will not carry the risk. The regulator has named you, the CTO, as the accountable party. The only people in the room with skin in the game are your own.

That changes what good procurement looks like. Your team must understand the agent well enough to challenge it. They need to know what data it touches. They need to know how it reasons. They need to know when it is wrong. A black box agent nobody inside the business can interrogate is not just a legal exposure. It is an operational dead end. When the Information Commissioner calls, “the vendor built it” is not a defence.

There is a second question underneath all this, and it matters as much as the first. Who owns the data the agent feeds on?

Business data is not internet data. The emails your sales team sends. The contracts your legal team negotiates. The notes your finance team keeps on every customer. The records of every decision your board has ever taken. That information is worth more than the entire scraped contents of the public web. It is the asset.

When a vendor offers to plug an agent into your email, your CRM, your document store and your chat platform, the question is not just what the agent will do. The question is what the vendor will see. What they will store. What they will train on. What they will keep when the contract ends.

I do not like the idea of Microsoft scraping all our business emails. I do not think any CTO should. That is not a fringe view. It is a governance position. It belongs in every procurement conversation about agentic AI from the first meeting onwards.

Hierarchy of agents. Hierarchy of data. Ownership of both. These are the questions that decide whether you end up running the system, or the system ends up running you.

Where That Leaves A CTO

I am not anti-AI. Pivot is not anti-AI. Used carefully, inside well-defined processes with humans verifying the output, agentic tools can do useful work.

That is not what is being sold to enterprises right now. What is being sold is software that “actively runs the business.” Oracle’s phrase, not mine. Decisions in HR, finance, supply chain and regulatory output. The places where getting it wrong ends careers. The places the regulator has told you are yours to answer for.

Three things I would tell any CTO sitting with this decision.

Read what your vendor warrants in writing. If the answer is monitoring and observability, that is not a warranty. That is a service.

Ask your implementation partner what their stake in the outcome is. If their answer is implementation fees, that is an honest answer. Price the advice accordingly. Nobody recommending a system they will be paid to install is a neutral source.

Put standard before agentic. Processes that don’t need an agent shouldn’t have one. Standard SAP is auditable, defensible and boring in the way the regulator wants you to be boring. Save the agents for places where the upside is real and the downside is recoverable.

Pivot doesn’t sell agents. We don’t earn licence revenue. That doesn’t make us right about everything. It means when we tell a client to slow down on this, we are not arguing against our own commission.

The CTO carries the risk. They should expect to get straight advice from the people in the room with them.

You can’t blame it on the box. Don’t put the box in charge of anything you can’t afford to get wrong.
