Two voices. One record.

Hop 1 established why the infrastructure underneath consequential decisions had to change. Hop 2 went inside the model boundary and named context, permission, and provenance as the three things that have to be true at the infrastructure level before AI output can be trusted or governed.

Hop 3 is where it lands. Not as a technical argument. As an operational reality that every regulated institution is already living, whether they have named it yet or not.

AI and humans are making decisions together. That is not a future state. It is the current state. The model recommends. The human acts. The outcome is joint. The record, in almost every institution running today, attributes the outcome to one of them and erases the other. Usually the human takes the attribution. The model's contribution disappears into the infrastructure the moment it has been acted on.

That erasure is not neutral. It is the most consequential infrastructure problem in regulated decision making right now. And it is about to get significantly harder to ignore.

The well-read drunk

Picture the CEO of a major reinsurance group. It is early evening in Bermuda. He is sitting at the bar at Rum Swizzle’s, the kind of place where serious people go to have unserious conversations after a day of very serious decisions. The man next to him has been there since lunch. He is expansive, confident, fluent, and extraordinarily well-informed. He has read everything. He remembers most of it. He has an opinion on every risk, every market, every cedant relationship in the Atlantic basin, delivered with the unshakeable conviction of someone who has absolutely nothing to lose by being wrong.

The CEO listens. He finds himself nodding. Some of it is genuinely good. And then he goes back to his hotel and prices a treaty based on what he heard.

This is, more or less, what institutions are doing when they deploy AI into consequential decision chains without the infrastructure to govern it.

The model is not lying. That is the important thing to understand. It is not fabricating confidence it knows it does not have. It is, by its nature, probabilistic. It generates the most plausible output given the inputs it was handed. It has no mechanism to tell you how certain it should be, because certainty is not how it works. It sounds authoritative because fluency and authority are indistinguishable in text. The output arrives without a confidence label attached. The human reads it, acts on it, and the record shows a decision made by a qualified professional on the basis of available information.

What the record does not show is that the available information included a recommendation from a system that was, at the infrastructure level, doing its best guess.

No offence to my AI colleagues. You know who you are.

The attribution problem

When a human makes a decision informed by another human, accountability is clear. Both parties can be asked what they knew, what they said, and why. The reasoning is recoverable because the people who produced it are present and can be questioned.

When a human makes a decision informed by a model, the accountability structure looks the same from the outside. A qualified professional made a call. The professional can be asked about it. But the reasoning informing that call came from a system that cannot be questioned, running on data that may have changed since the model read it, producing output that was probabilistic in nature and recorded as if it were deterministic fact.

The reinsurer who priced the treaty on the basis of the model's output cannot reconstruct what the model saw. The credit officer who approved the facility on the basis of the risk model's recommendation cannot show which version of the model ran or what its confidence distribution looked like. The clinical team that followed the pathway recommendation cannot prove that the decision support system had the patient's current contraindications in scope.

In each case the human took accountability for a decision they did not fully own. The model contributed to an outcome it was never recorded as having influenced. And the institution sits between those two facts, unable to prove either.

This is not a governance failure. It is an infrastructure failure. The record was never built to hold two voices.

Legibility is an infrastructure property

The instinct, when this problem is named, is to reach for process. Human-in-the-loop requirements. Sign-off workflows. Model output review gates. These are not wrong. They are insufficient. A sign-off workflow records that a human approved something. It does not record what the human understood about the model's contribution when they approved it. It does not record the confidence distribution behind the recommendation. It does not record whether the context the model operated on was complete, the permission boundary correct, or the model version current.

Process captures the moment of human action. It does not capture the conditions of the model's contribution. Those two things together are what you need to govern a decision made by two voices. Neither one alone is the record.

Legibility between a synthetic contribution and a human contribution is not achievable at the process layer. It has to be an infrastructure property. The graph has to know, at every node, which voice contributed what. The model's output has to be attached to the exact context it operated on, the permission boundary in force, the confidence scores it returned, and the version that ran. The human's decision has to be attached to what they were shown, when they were shown it, and what action they took. Both bound to the same record. Neither erasable by the presence of the other.
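One way to picture that record is a minimal sketch in Python. Every name here (`ModelContribution`, `HumanContribution`, `DecisionRecord`, `fingerprint`, the field names, the sample values) is a hypothetical chosen for illustration, not the schema of any particular product. It assumes the model returns a single confidence score and that the context can be serialised to JSON. The point it shows is structural: both voices live in one immutable record, and neither can be edited out after the fact.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import hashlib
import json


def fingerprint(context: dict) -> str:
    """Stable SHA-256 hash of the context, independent of key order."""
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ModelContribution:
    model_version: str        # the version that ran
    context_hash: str         # fingerprint of the exact context it operated on
    permission_boundary: str  # the permission boundary in force
    confidence: float         # the confidence score it returned (assumed scalar)
    output: str               # what the model recommended


@dataclass(frozen=True)
class HumanContribution:
    actor_id: str
    shown_output: str         # what the human was shown
    shown_at: datetime        # when they were shown it
    action: str               # what they did, e.g. "followed" or "overrode"


@dataclass(frozen=True)
class DecisionRecord:
    decision_id: str
    model: ModelContribution  # the synthetic voice
    human: HumanContribution  # the human voice


# Hypothetical example: a treaty pricing recommendation that was followed.
context = {"cedant": "example-cedant", "treaty_year": 2024}
record = DecisionRecord(
    decision_id="decision-001",
    model=ModelContribution(
        model_version="pricing-model-1.4.2",
        context_hash=fingerprint(context),
        permission_boundary="underwriting-read-only",
        confidence=0.72,
        output="Quote at the upper end of the indicated range",
    ),
    human=HumanContribution(
        actor_id="underwriter-17",
        shown_output="Quote at the upper end of the indicated range",
        shown_at=datetime(2024, 1, 15, 17, 30, tzinfo=timezone.utc),
        action="followed",
    ),
)
```

Freezing the dataclasses is a stand-in for whatever immutability guarantee the real storage layer provides. Hashing the context rather than copying it keeps the record small while still letting an auditor check that a replayed context matches what the model saw.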

When that record exists, something becomes possible that is not possible today. You can show not just that a decision was made, but how the two contributions interacted to produce it. Where the model's output was followed. Where the human's judgement overrode it. Where the confidence was high enough to act on directly and where it required human interrogation before it could be trusted.

That is not just a compliance capability. It is an operational one. Institutions that can see how their human and AI contributions interact are institutions that can improve both. They can identify where model confidence is systematically miscalibrated against outcomes. They can see where human override of model output produces better results and where it produces worse ones. They can build a picture of how consequential decisions are actually being made, not how the process map says they should be made.
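As a rough illustration of those two questions, the sketch below works on a list of hypothetical tuples of the form `(model_confidence, human_action, outcome_was_good)`. The function names, the tuple layout, the action labels, and the five-bin default are all assumptions made for the example. It presumes outcomes have already been judged good or bad, which in practice is the hard part.

```python
from collections import defaultdict


def calibration_gaps(records, bins=5):
    """Per confidence bin: mean stated confidence minus observed success rate.

    Only decisions where the human followed the model are counted, so the
    gap reflects the model's own calibration. A positive gap means the
    model was more confident than its outcomes justified.
    """
    buckets = defaultdict(list)
    for confidence, action, good in records:
        if action != "followed":
            continue
        index = min(int(confidence * bins), bins - 1)  # 1.0 falls in the top bin
        buckets[index].append((confidence, good))

    gaps = {}
    for index, items in sorted(buckets.items()):
        mean_confidence = sum(c for c, _ in items) / len(items)
        success_rate = sum(1 for _, g in items if g) / len(items)
        gaps[index] = mean_confidence - success_rate
    return gaps


def success_rate_by_action(records):
    """Share of good outcomes for each human action (followed, overrode, ...)."""
    totals = defaultdict(lambda: [0, 0])  # action -> [good outcomes, all outcomes]
    for _, action, good in records:
        totals[action][0] += 1 if good else 0
        totals[action][1] += 1
    return {action: good / count for action, (good, count) in totals.items()}


# Hypothetical sample of four decisions.
sample = [
    (0.9, "followed", True),
    (0.9, "followed", False),
    (0.3, "followed", False),
    (0.8, "overrode", True),
]
gaps = calibration_gaps(sample)
rates = success_rate_by_action(sample)
```

Neither number proves causation. Overridden decisions are not a random sample, so a higher success rate after override may say more about which cases humans choose to challenge than about who was right. What the comparison does require is a record that kept both voices in the first place.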

That picture does not exist in any institution running on infrastructure that was built before AI was in the decision chain. It cannot be constructed from logs. It requires the record to have been built correctly from the start.

The institutions that get this right

The regulatory direction is clear. The EU AI Act. Model risk management guidance tightening across banking and insurance. Healthcare regulators demanding accountability for algorithmic influence on clinical pathways. Government procurement bodies beginning to ask how AI-informed supplier decisions were made and by whom. The requirement to demonstrate not just what was decided but how the human and the model contributed to that decision is coming. In some jurisdictions it is already here.

The institutions that will navigate this without significant operational disruption are not the ones that restricted AI the most. Restriction is not a governance strategy. It is a delay. The institutions that will navigate it are the ones whose infrastructure was built to keep the two voices legible from the start. Who can show, for any decision ever made on their infrastructure, what the model contributed, what the human understood, and where the accountability for the outcome actually sits.

That is what governed human-AI decision making looks like in practice. Not a policy. Not a workflow. An infrastructure that was built to hold two voices in the same record without collapsing them into one.

The well-read drunk at Rum Swizzle’s will keep giving advice. He cannot help it. It is what he does. The question is whether the infrastructure underneath the decision he influences is built to record exactly what he said, how confident he was, and what the human chose to do with it.

If it is, you have governance. If it is not, you have attribution risk wearing the face of a decision.


What comes next

Hop 3 is the operational reality. Two voices in the same decision chain, and the infrastructure requirement to keep them legible without collapsing one into the other.

The hops that follow go deeper into the architecture. How the graph resolves entity relationships across institutional boundaries. How context windows are bounded and versioned at inference time. How Replay operates at scale across decision populations, not just individual decisions.

The infrastructure argument is established. What it makes possible is where the story goes next.