When AI agents start touching financial data, auditors get specific. The question isn't "did the close balance?" — they've always asked that. The new question is: "Show me what the system decided, why it decided it, and who approved it."
Controllers who have lived through this conversation know the anxiety. The close is done, the numbers tie, but somewhere in the stack a machine made a routing decision on a $78,000 reconciling item, flagged an anomaly in a travel account at 3 a.m., and recommended a period lock submission. The output is defensible. The decision trail? That depends entirely on what the system was built to preserve.
This post is about what that trail looks like when it's designed for an auditor from the start — not reconstructed for one after the fact. For context on why an agentic close looks different from an automated one, an earlier post in this series walks through the broader model. Here we go one level deeper: what the agent actually leaves behind, and why it matters for your external audit.
Two kinds of "explainability"
The word has been used loosely enough that it's worth being precise.
One version of explainability is a generated summary. The agent finishes an action, notices you hovering over it, and produces a paragraph that sounds plausible — "I routed this item because the confidence score was above threshold and the amount was below materiality." It reads well. It might even be accurate. But it was written after the fact, from the same model that took the action, and it has no structural relationship to what actually happened inside the system. You can't give that paragraph to an auditor and call it documentation.
The other version is a captured trail. The system records what it did mid-flight — the data sources it queried, the model outputs it received, the reasoning checks it passed or failed — and makes that record available for review. That execution log is surfaced on demand, but it was written during execution, not composed afterwards for consumption.
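A captured trail of this kind can be sketched in a few lines. Everything here is illustrative — the class name, the step names, and the fields are hypothetical, not InsightLens's actual runtime — but it shows the structural point: each entry is appended as the action executes, so the record is a byproduct of execution rather than a narrative written later.

```python
import time

class ExecutionTrail:
    """Minimal sketch of runtime capture (names and schema are illustrative)."""

    def __init__(self):
        self.steps = []

    def record(self, step: str, **detail):
        # Appended at the moment the step runs; the trail cannot be
        # "regenerated" later because it was never generated at all.
        self.steps.append({"step": step, "at": time.time(), **detail})

trail = ExecutionTrail()
trail.record("query_source", endpoint="/odata/ReconItems", rows=1)
trail.record("model_score", model="anomaly-v2", score=0.94)
trail.record("governance_check", principle="materiality", passed=False)
```

Handing an auditor `trail.steps` is a different act from handing them a paragraph the model wrote about itself.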
That distinction is the one that matters for your audit. When a controller asks "how did this decision get made?" they deserve an answer from the second category, and InsightLens is built to give one. That choice shaped every part of the design.
What "three layers" actually means
Every consequential action an InsightLens agent takes produces three things. All three. Every time.
Layer 1: The output. The thing the controller sees — a routing decision, an anomaly flag, a period lock recommendation, a proposed journal entry. This is what lands in the cockpit chat and in the widget panel. Readable, actionable, referencing materiality and confidence scores in plain English.
Layer 2: The reasoning trail. Directly below the output, available with a single click, is the execution record: which SAP OData endpoints were called and what they returned (as hashes, never raw payloads), which BigQuery historical patterns were matched, which Vertex AI model scored the item and which features drove that score. For a $78,400 unreconciled entry, the reasoning trail might show that the top driver was a delta against a matching intercompany invoice on the counterparty side — a pattern the model recognized from a nearly identical $74,000 item in Period 11 of the prior year.
That's the reasoning chain the agent traversed, captured at runtime — not a summary written after the fact.
Layer 3: The signed audit row. Every action writes an immutable record to BigQuery — timestamped, signed, with a five-year retention. The record contains metadata: who, what, when, where, and a content hash that ties the audit entry back to the action. Raw financial payload stays in SAP, where it belongs. What the audit log holds is the decision record: the agent identity, the action taken, the governance assertions that cleared, and the Checkpoint outcome if one was required.
The architecture supports independent reproduction of the decision from Layer 2 and Layer 3 alone, without requiring access to the agent runtime. An auditor reviewing the close doesn't need to interview the system; the trail answers the question.
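The shape of a Layer 3 row can be sketched as follows. The schema, field names, and key handling here are illustrative assumptions, not the product's implementation — in particular, a real deployment would sign with a managed key, not a constant — but the sketch shows the two properties the text describes: the raw payload is reduced to a hash, and the metadata itself is signed so tampering is detectable.

```python
import hashlib
import hmac
import json
import time

# Illustrative only: a production system would use a KMS-managed key,
# never a constant embedded in code.
SIGNING_KEY = b"demo-signing-key"

def build_audit_row(agent_id, action, payload, assertions):
    """Build a signed, payload-free audit record (hypothetical schema)."""
    # Hash the raw payload so the row ties back to the action without
    # the financial data ever leaving the system of record.
    content_hash = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()
    row = {
        "agent_id": agent_id,
        "action": action,
        "timestamp": time.time(),
        "content_hash": content_hash,
        "assertions_passed": assertions,
    }
    # Sign the metadata so later tampering with the row is detectable.
    row["signature"] = hmac.new(
        SIGNING_KEY, json.dumps(row, sort_keys=True).encode(), hashlib.sha256
    ).hexdigest()
    return row

row = build_audit_row(
    "recon-agent",
    "route_reconciling_item",
    {"document": "4711", "amount": 78400.00},
    ["grounding", "authority", "explainability", "materiality"],
)
```

An auditor (or anyone with the key) can recompute the signature from the stored fields and confirm the row is exactly what was written at decision time.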
The governance layer is the enforcement mechanism
Explainability tells auditors what happened. The governance layer is what prevents the wrong thing from happening in the first place.
InsightLens operates under five reasoning principles that are evaluated before any consequential action executes: grounding (claims trace to source data), authority (the agent acts within its defined boundary), explainability (the trail is captured), reversibility (irreversible actions require a human Checkpoint), and materiality (items above your threshold cannot be auto-posted, regardless of confidence score).
The last two are worth dwelling on.
A 0.94-confidence routing decision on a $78,000 item is impressive. It's also wrong to act on automatically, because the amount is above your materiality threshold. The materiality principle isn't a business rule you can override in configuration. It's a structural constraint — the code path for auto-posting above threshold doesn't exist. An agent that scores high confidence on an above-materiality item still can't post. The governance check runs independently of the model's confidence.
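The structural point is easiest to see in code. This is a hedged sketch, not the product's routing logic — the threshold value and function names are assumptions — but it shows what "the code path doesn't exist" means: the above-materiality branch returns before confidence is ever consulted.

```python
from dataclasses import dataclass

# Illustrative threshold; in practice this would come from entity
# configuration.
MATERIALITY_THRESHOLD = 50_000.00

@dataclass
class RoutingDecision:
    item_id: str
    amount: float
    confidence: float

def route(decision: RoutingDecision) -> str:
    # The material branch returns first: there is no path from
    # "above threshold" to "auto_post", whatever the model scored.
    if decision.amount >= MATERIALITY_THRESHOLD:
        return "escalate_to_controller"
    if decision.confidence >= 0.90:
        return "auto_post"
    return "queue_for_review"
```

A call like `route(RoutingDecision("item-1", 78_000.00, 0.94))` escalates to the controller despite the 0.94 score, because the confidence check is simply unreachable for that amount.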
The reversibility principle operates similarly for period lock. A period lock in SAP is irreversible without management override — once submitted, the fiscal period closes, and reopening it is an event your auditors will notice. InsightLens's agent can reach a 0.96 readiness score, identify no blockers, and recommend submission — but the submission itself requires a human Checkpoint. Always. The agent prepares the decision; the controller makes it.
The earlier piece in this series explains the broader model. What the governance layer adds is the structural guarantee that reasoning about a decision and taking that decision are two separate acts, and only the second one requires your authorization.
What real-time anomaly detection looks like in practice
Most journal anomaly processes run on a cycle. Month-end, a sample of journals gets reviewed. High-risk accounts get a second look. The sample rates vary; the pattern is consistent. You're reviewing what happened, after the fact, against a subset of entries.
An event-driven architecture changes the timing. When SAP posts a document, an event fires. The event reaches the Journal Anomaly Agent in under two seconds. The agent runs the posting through a model trained on the last two years of journal entries for your entity — an Isolation Forest that has learned the normal cadence, the normal amounts, the normal accounts, and the normal users for every posting type. A posting that deviates materially from all of those dimensions gets a score and a flag before the controller has looked away from the screen.
Here's what that looks like in a live demonstration against SAP data: a posting appears in real time — €18,500 to a travel expense account, posted by the batch interface user at 3:14 a.m., document type vendor invoice. The model scores it at 0.92. Three features drive it: the amount is suspiciously round (real expense receipts rarely land flat), the batch user doesn't typically post to travel accounts, and 3 a.m. is outside the entity's normal business-hours window. The model also surfaces a comparable pattern from Period 7 of the prior year — a posting that was subsequently reversed as a miscoded vendor invoice.
The model doesn't know what the entry is. It knows the entry is unusual, it knows why it's unusual, and it can show its work. The controller dismisses it in one click with a note for WD+2 reclassification. The anomaly is closed; the signal is preserved. If the same pattern recurs next period, the agent will remember.
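The scoring mechanics can be sketched with scikit-learn's Isolation Forest. The training data, featurization, and parameters below are stand-ins invented for illustration — the real model's feature set is richer — but the sketch shows how a posting that deviates on amount, hour, and user familiarity at once gets flagged.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(7)

# Stand-in training data for two years of "normal" postings to a travel
# account: (amount, posting hour, how often this user posts to this account).
normal = np.column_stack([
    rng.normal(2_100, 650, 5_000),   # typical receipt amounts
    rng.normal(14, 2.5, 5_000),      # business-hours postings
    rng.uniform(0.5, 1.0, 5_000),    # users on familiar accounts
])

model = IsolationForest(contamination=0.01, random_state=0).fit(normal)

# The walkthrough's 3:14 a.m. batch posting, featurized the same way:
# a flat EUR 18,500, hour 3, a user who almost never posts here.
candidate = np.array([[18_500.0, 3.0, 0.02]])
flagged = model.predict(candidate)[0] == -1   # -1 marks an outlier
```

The same `model.score_samples` output that drives the flag is what lands in the Layer 2 reasoning trail: the score and the features behind it, not a prose rationalization.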
The period lock guardrail is the logical counterpart to real-time anomaly detection — the mechanism that ensures nothing flagged remains unresolved before the period closes. That design merits its own treatment, and we'll cover it in the next piece in this series.
What the CAO sees when the close is done
The controller has been in the cockpit for six hours. The CAO walks in and asks: "How did the close go, and are we audit-ready?"
That question has always required a reconstruction — gathering metrics from AFC, from the reconciliation platform, from journal review, from the period lock confirmation. By the time you've assembled the answer, the close has been done for a day.
The CAO command center in InsightLens answers that question as a single view, available the moment the period locks. Close duration against the trailing six-period average. Exception auto-resolution rate. Average controller review time per escalated item. Audit trail completeness — one hundred percent, every action signed. Governance violations caught and blocked — which is the number to highlight for your audit committee, because it proves the guardrails did something during the close.
In a demonstration scenario built on SAP data, the close ran 34% faster than the trailing average. Ninety-three percent of exceptions were handled without controller time. Twelve minutes of average controller review per escalated item, against a 38-minute baseline. Three governance violations blocked. Every agent action audited. These are demonstration metrics, not observed customer outcomes — but they reflect the instrumentation the architecture makes possible from day one.
Those metrics don't require assembly. They're the natural output of a close that was instrumented from the start.
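"Don't require assembly" is literal: each command-center figure is a one-line aggregate over telemetry the close already emitted. The numbers below are illustrative stand-ins echoing the demonstration figures above, not customer data, and the variable names are hypothetical.

```python
from statistics import mean

# Demonstration-style close telemetry (illustrative, not customer data).
trailing_close_hours = [52, 49, 55, 50, 53, 51]   # trailing six periods
current_close_hours = 34
exceptions = {"auto_resolved": 93, "escalated": 7}
audit_rows_signed, agent_actions = 412, 412

speedup = 1 - current_close_hours / mean(trailing_close_hours)     # ~34% faster
auto_rate = exceptions["auto_resolved"] / sum(exceptions.values()) # 0.93
trail_complete = audit_rows_signed == agent_actions                # every action signed
```

When the instrumentation exists from day one, audit-readiness is a query, not a project.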
Finance teams operating without a continuous, signed audit trail spend significant time reconstructing evidence for decisions made during the close — a preparation sprint that exists precisely because the decision trail wasn't designed to be auditable at the time of the close. When the trail is continuous and signed, that sprint compresses. See how the InsightLens Controllership module approaches this across the full close cycle.
The question worth sitting with
There is a reasonable version of skepticism here. You've seen finance technology vendors promise auditability before. The audit trail they delivered was a log file nobody could read, a report that required a DBA to extract, or a summary that turned out to be generated rather than captured.
The fair test is specific. Ask your next vendor: if your agent auto-routed a $70,000 reconciling item and your auditor asks why, what exactly do you hand them? Is it a pre-generated explanation, or the actual execution record? Does the audit entry exist in your own data infrastructure — in your BigQuery, under your retention policy, accessible to your team — or does it live in the vendor's systems and disappear with the contract?
These are answerable questions. The answers reveal whether explainability is a UI feature or an architecture property.
InsightLens is built on the position that it has to be an architecture property. The close can't be audit-ready by accident. The trail can't be reconstructed after the fact. The governance layer can't be a soft check that confident models can override. Those are the conditions under which a controller can look an auditor in the eye and say: "Here is what the agent did. Here is why. Here is where to look."
Start the conversation
If you're preparing for your next close and want to see the three-layer explainability trail in a live demo — walking through a full close cycle on SAP data — we'd welcome thirty minutes.
Book a 30-Minute Demo — see the cockpit, the reasoning trail, the governance layer, and the CAO command center end to end against a live close cycle.
Start a Finance AI Readiness Assessment — a structured conversation about where your close stands today: close cycle duration, exception handling, audit trail completeness, and where agent-assisted workflows would make the biggest difference.