Inside the model boundary

Hop 1 established the infrastructure argument. The reasoning behind consequential decisions was evaporating at the moment those decisions were made. The graph fixes that. Provenance at every node. Replay as a native property of the architecture, not a feature bolted on afterward.

That argument holds. But it leaves a question open. For Replay to mean anything, the record it preserves has to be trustworthy. And for the record to be trustworthy, three things have to be true before the model runs. Not after. Not at the application layer. Before.

Context. Permission. Provenance.

Get any one of them wrong at the infrastructure level and no amount of logging, reporting, or documentation on top of it will produce a record that survives scrutiny. This is what it means to go inside the model boundary.

Context is not data

The most common misconception about AI in regulated decision-making is that feeding a model more data makes it more accurate. Sometimes it does. More often it makes the model more confidently wrong in ways that are harder to detect.

A model assessing a complex reinsurance treaty is not failing because it lacks data. It is failing because nothing around that data tells the model what it means in relation to everything else the institution knows. The cedant's five-year loss development pattern exists in one system. The current exposure accumulation sits in another. The applicable policy wording version, the one that was actually in force when the risk was bound, is in a document that may or may not have been ingested, may or may not reflect the amendment agreed three months prior, and has no relationship in the infrastructure to the entity it governs.

The model runs. It produces a recommendation. The recommendation is based on context that is incomplete, unverified, and unstructured. Nobody knows this at the time. The output looks authoritative. The underwriter acts on it.

The same failure appears across every sector this problem touches. A credit model running over a structured facility without access to the borrower's current covenant position. A clinical decision support system recommending a pathway without the patient's current medication interactions in scope. A procurement model assessing a supplier without the current sanctions register as a live input. In each case the model is not broken. The context handed to it is.

Context, properly understood, is not a collection of data points. It is a structured set of relationships between entities, with trust levels, timestamps, and source attribution attached to each one. It has a boundary — what was in scope at the moment the model ran — and that boundary has to be recorded as part of the decision record. Not inferred afterward. Recorded at the time.

When context lives in the graph as a first-class property of every node and every relationship, the model boundary becomes explicit. You can show exactly what was in scope. You can show what was not. You can show why. That is what makes the output defensible. Not the model. The context the model operated on.
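One way to picture context as relationships rather than data points is the minimal sketch below. It is an illustration under stated assumptions, not a description of any particular product: the names ContextEdge, ContextBoundary and resolve_boundary, the trust threshold, and the staleness cut-off are all hypothetical. Each relationship carries its own source, trust level and timestamp, and the boundary records what was excluded and why, at the time it was resolved.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ContextEdge:
    """One relationship between two entities, with its own metadata."""
    subject: str      # e.g. "treaty:T-1" (hypothetical identifier)
    relation: str     # e.g. "governed_by"
    target: str       # e.g. "wording:v3"
    source: str       # system of record the fact came from
    trust: float      # 0.0 (unverified) to 1.0 (verified); scale is an assumption
    as_of: datetime   # when the fact was last confirmed


@dataclass(frozen=True)
class ContextBoundary:
    """What was in scope when the model ran, what was left out, and why."""
    resolved_at: datetime
    in_scope: tuple[ContextEdge, ...]
    excluded: tuple[tuple[ContextEdge, str], ...]  # (edge, reason)


def resolve_boundary(edges, min_trust, not_before, now):
    """Split candidate edges into in-scope and excluded, keeping the reason."""
    in_scope, excluded = [], []
    for edge in edges:
        if edge.trust < min_trust:
            excluded.append((edge, "trust below threshold"))
        elif edge.as_of < not_before:
            excluded.append((edge, "stale"))
        else:
            in_scope.append(edge)
    return ContextBoundary(now, tuple(in_scope), tuple(excluded))


utc = timezone.utc
edges = [
    ContextEdge("treaty:T-1", "governed_by", "wording:v3",
                "document store", 0.9, datetime(2024, 5, 1, tzinfo=utc)),
    ContextEdge("treaty:T-1", "has_loss_history", "losses:5y",
                "claims system", 0.3, datetime(2024, 5, 1, tzinfo=utc)),
]
boundary = resolve_boundary(
    edges,
    min_trust=0.5,
    not_before=datetime(2024, 1, 1, tzinfo=utc),
    now=datetime(2024, 6, 1, tzinfo=utc),
)
for edge, reason in boundary.excluded:
    print(edge.target, "excluded:", reason)
```

The point of the sketch is the shape of the record, not the filtering rule. Whatever rule an institution applies, the excluded set and the reasons are stored alongside the in-scope set rather than inferred later.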

Permission is a reasoning problem, not an access problem

Most institutions have solved permission as an access control problem. Roles, credentials, authentication layers, data classification tiers. These are necessary. They are not sufficient. In regulated environments, permission is not just about who can see what. It is about what a model is allowed to reason over on behalf of whom.

In a reinsurance syndicate, a lead underwriter and a follow market participant are both authorised users of the same platform. They do not have the same view of the risk. The lead has access to the full submission, the loss history, the broker correspondence. The follow sees what they are entitled to see under the terms of the slip. If an AI model runs over the full dataset and produces a recommendation that is then presented to a follow market participant, something has gone wrong. Not at the access control layer. At the reasoning layer. The model reasoned over data the participant was not entitled to have in scope. The recommendation is contaminated. And there is no record of the contamination because the infrastructure was never built to capture it.

The same structure appears in banking. A relationship manager and a risk committee member have different permission boundaries over the same credit file. In healthcare, a referring clinician and a treating specialist see the same patient through different permission lenses. In each case, permission is not a binary. It is a relationship property. And if it is not modelled as a relationship property in the infrastructure, it cannot be enforced at the reasoning layer.

When permission is a graph property — attached to relationships between entities, not to database roles — the model boundary can be drawn correctly before inference runs. The model reasons only over what is in scope for the specific context and the specific participant. That boundary is recorded. It becomes part of the provenance chain. If a decision is ever challenged, you can show not just what the model saw, but what it was permitted to see, and that the two things were the same.
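The idea of drawing the boundary before inference can be sketched in a few lines. This is an assumption-laden illustration, not an actual implementation: Entitlement, draw_model_boundary and run_inference are hypothetical names, and the "model" is any callable. Permission is held as an edge between a participant and an item, the boundary is computed from those edges, and the record keeps both what was permitted and what the model was actually handed so the two can be compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entitlement:
    """A permission edge: participant -> item, granted on some stated basis."""
    participant: str
    item: str
    basis: str  # e.g. "slip terms" (hypothetical label)


def draw_model_boundary(participant, candidate_items, entitlements):
    """Return (permitted, withheld) for one participant, before inference."""
    granted = {e.item for e in entitlements if e.participant == participant}
    permitted = [i for i in candidate_items if i in granted]
    withheld = [i for i in candidate_items if i not in granted]
    return permitted, withheld


def run_inference(model, participant, candidate_items, entitlements):
    """Run the model only over permitted items and record the boundary."""
    permitted, withheld = draw_model_boundary(
        participant, candidate_items, entitlements
    )
    seen = list(permitted)   # exactly what is handed to the model
    output = model(seen)     # withheld items never reach the model
    return {
        "participant": participant,
        "permitted": permitted,
        "seen": seen,
        "withheld": withheld,
        "output": output,
    }


entitlements = [
    Entitlement("lead", "submission", "slip terms"),
    Entitlement("lead", "loss_history", "slip terms"),
    Entitlement("follow", "submission", "slip terms"),
]
items = ["submission", "loss_history", "broker_correspondence"]
record = run_inference(lambda xs: len(xs), "follow", items, entitlements)
print(record["seen"], record["withheld"])
```

In this sketch the follow market participant's run never touches the loss history, and the record shows that "seen" and "permitted" are the same list.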

That is a materially different capability from access control. Access control keeps the wrong people out. Graph-native permission keeps the reasoning honest.

Provenance is the record that has to survive

Context and permission determine what goes into the model. Provenance is what comes out the other side. Not the recommendation. The complete record of the conditions under which the recommendation was produced.

Eight months after a complex reinsurance treaty is bound, a major loss event triggers a coverage dispute. The cedant's position is that the risk was priced incorrectly. The reinsurer's position is that the pricing reflected the information available at the time and the underwriting judgement applied to it. The regulator wants to understand how the AI model's output influenced the final decision, what data it operated on, and whether the human oversight applied was meaningful or nominal.

Without infrastructure built to capture provenance at the moment of inference, none of those questions can be answered with evidence. The model version that ran may no longer be in production. The data it operated on may have been updated. The context boundary is unrecorded. The permission scope is unverifiable. What remains is a number — the price — and the memory of the people who were in the room.

Memory does not survive a Lloyd's dispute. It does not survive a Basel IV model risk review. It does not survive a NICE appraisal challenge or a public procurement audit. In every one of those environments, the question is the same: can you prove what was known, what ran, and what was decided, at the exact moment the decision was made?

Provenance in the graph is not a log. A log records that something happened. Provenance records what was true at the moment it happened. The model version. The context in scope. The permission boundary in force. The confidence scores returned. The human action taken in response. All of it attached to the same node. All of it traversable backward from any subsequent point in the chain.
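As a rough illustration of the difference, the sketch below attaches the listed conditions to a single node and walks the chain backward from a later decision. The field names, the ProvenanceNode type and the trace_back helper are hypothetical, chosen only to mirror the list above. A real graph store would hold far more than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceNode:
    """What was true at the moment of one decision, held on one node."""
    node_id: str
    parent_id: Optional[str]            # earlier decision this one built on
    model_version: str
    context_in_scope: tuple[str, ...]
    permission_boundary: tuple[str, ...]
    confidence: float
    human_action: str                   # e.g. "accepted", "overridden"


def trace_back(nodes, start_id):
    """Walk from a decision back to the start of its chain."""
    chain, visited = [], set()
    current = nodes.get(start_id)
    while current is not None and current.node_id not in visited:
        visited.add(current.node_id)    # guard against accidental cycles
        chain.append(current)
        current = nodes.get(current.parent_id) if current.parent_id else None
    return chain


quote = ProvenanceNode("quote", None, "pricing-1.2",
                       ("loss_history", "wording:v3"), ("lead",),
                       0.81, "accepted")
bind = ProvenanceNode("bind", "quote", "pricing-1.2",
                      ("loss_history", "wording:v3", "exposure"), ("lead",),
                      0.77, "overridden")
nodes = {n.node_id: n for n in (quote, bind)}

for node in trace_back(nodes, "bind"):
    print(node.node_id, node.model_version, node.human_action)
```

Because each node is immutable and carries its own conditions, walking backward from "bind" recovers what was true at "quote" without consulting any system that may have changed since.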

This is what Replay actually means at the infrastructure level. Not replaying the interface. Replaying the conditions. Showing, with evidence, that the decision was made correctly given what was known, what the model produced, and what the human understood at that moment. That record does not degrade. It does not depend on anyone's recollection. It was captured when it was true.

Why the application layer cannot fix this

There is a persistent belief that context, permission, and provenance can be retrofitted. That logging pipelines, audit middleware, and documentation workflows added on top of existing systems will produce the same result as infrastructure that captures these things natively.

They will not. And the reason is architectural, not organisational.

An application layer sits above the infrastructure. It can record what it sees. It cannot record what it did not have access to. If the context handed to the model was incomplete, the application layer records incomplete context. If the permission boundary was wrong at inference time, the application layer records the wrong boundary. If the model version that ran is no longer in the registry, the application layer has nothing to record. The application layer is downstream of every failure this infrastructure is designed to prevent. Adding it does not move the capture point. It just produces more documentation of the same gap.

The capture point has to be at the infrastructure layer. Context resolved and bounded before inference. Permission enforced as a graph property before the model boundary is drawn. Provenance attached at the moment of inference, not reconstructed afterward. This is not a design preference. It is the only architecture that produces a record capable of answering the questions that regulated institutions will be required to answer.
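The ordering described above can be sketched as a single function. This is a simplified illustration under assumptions: governed_inference, its parameters and the dictionary-based record are all hypothetical, and a list stands in for the graph. What it shows is the capture point. The record takes a frozen copy of the bounded context at the moment of inference, so later updates to the live data do not alter what the record says the model saw.

```python
import copy
from datetime import datetime, timezone


def governed_inference(model, model_version, resolve_context,
                       permitted_keys, store, now=None):
    """Resolve context, enforce permission, infer, then capture provenance."""
    # 1. Context resolved and bounded before inference.
    context = resolve_context()
    # 2. Permission enforced before the model boundary is drawn.
    boundary = {k: v for k, v in context.items() if k in permitted_keys}
    # 3. The model runs only over the bounded context.
    output = model(boundary)
    # 4. Provenance attached now, as a copy, not reconstructed afterward.
    record = {
        "captured_at": (now or datetime.now(timezone.utc)).isoformat(),
        "model_version": model_version,
        "context_in_scope": copy.deepcopy(boundary),
        "excluded_keys": sorted(set(context) - set(boundary)),
        "output": output,
    }
    store.append(record)
    return output, record


live_data = {"loss_history": [1.0, 2.0], "broker_notes": "confidential"}
store = []
output, record = governed_inference(
    model=lambda ctx: sum(ctx["loss_history"]),
    model_version="pricing-1.2",
    resolve_context=lambda: live_data,
    permitted_keys={"loss_history"},
    store=store,
)

# The source system is updated after the decision was made.
live_data["loss_history"].append(99.0)
print(record["context_in_scope"]["loss_history"])
```

A logging layer added afterward would only be able to read live_data as it stands now. The record captured at inference still holds the two values the model actually operated on.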

The institutions that understand this are not waiting for the regulatory requirement to crystallise before they act. They are building on infrastructure that captures these three things correctly because the alternative — scrambling to retrofit provenance onto systems that were never designed to preserve it — has a known outcome. It produces documentation that looks like an audit trail and survives almost nothing.

What comes next

Hop 2 is the inside of the model boundary. Context, permission, and provenance as the three properties that have to be true at the infrastructure level before any AI operating on consequential decisions can be trusted or governed.

Hop 3 goes to the frontier of where this is heading. AI and humans are already making decisions together. The question that nobody has properly answered yet is how you keep their contributions legible when they are operating in the same chain. Two voices. One record. That is an infrastructure problem. And it is the one that defines what governed human-AI decision-making actually looks like in practice.