01 Use case · insurance

AI agents that pass an
AI Act audit.

Insurance is one of the most exposed verticals under the EU AI Act. Claim handling, fraud detection, risk-based pricing, and reassurer sign-off all land squarely in Annex III high-risk territory. This page walks a single claim from intake to regulator audit, showing exactly which evidence Cullis Mastio captures at each step.


02 Why insurance is first

Three compliance regimes meet on
the same claim file.

The forcing function is regulatory, not commercial. Insurers deploying AI agents now face an overlapping set of obligations that all want the same evidence: who decided what, when, based on which model run, with which input.

EU AI Act

Annex III, point 5

"AI systems intended to be used by private entities to evaluate the creditworthiness of natural persons or establish their credit score" plus "AI systems intended to be used for risk assessment and pricing in relation to natural persons in the case of life and health insurance" are high-risk. Claim handling that drives payout decisions falls under the same risk class via Art. 6 read jointly with Annex III.

Articles 9 (risk management), 12 (record-keeping), 15 (accuracy, robustness, cybersecurity), 72 (post-market monitoring).

IDD

Insurance Distribution Directive Art. 17

Distributors must act "honestly, fairly and professionally in accordance with the best interests of their customers." When an AI agent assesses claims or prices a contract, the decision and the data behind it must be reconstructable for the customer's complaint procedure and for the national authority's supervisory review.

Articles 17 (general principles), 20 (information for customers), 30 (suitability assessment).

DORA

Digital Operational Resilience Act

Insurers are explicitly in scope under Art. 2(1)(c). Third-party ICT providers, including the software stack behind AI agents, must satisfy Art. 28 risk management requirements. Self-hosting does not exempt the vendor from third-party classification, but it reduces the controls the insurer must run on the provider's environment.

Articles 5 (governance), 6 (ICT risk management framework), 28 (third-party ICT risk).


03 The claim journey

One claim, six agents,
one audit chain.

The scenario below is the same one that runs in reference/insurance-demo/ in the public repository. You can clone it and replay it locally.

  1. 01

    Customer files a claim.

    The customer-facing portal receives the claim. A claim-intake agent process starts. Cullis Mastio mints a fresh X.509 certificate for the process, signed by the org CA, with an embedded SPIFFE identity spiffe://yourcompany.eu/claim-intake. The cert lifetime is 15 minutes; rotation is automatic.

    Audit chain entries: agent enrollment, cert mint, SPIFFE ID assignment, cert thumbprint.

  2. 02

    The agent reads the claim and calls an LLM.

    The claim-intake agent loads the policy file, the customer's history, and the new claim narrative. It calls the LLM through Mastio's gateway. Mastio writes a hash of the prompt, the model identifier (claude-sonnet-4-6, gpt-5, or whichever you allow), the response token count, the latency, and the cost.

    Audit chain entries: LLM call event, prompt SHA-256, model name, token count, cost, latency, policy decision (allowed / denied / rate-limited).

  3. 03

    A fraud-detection agent reviews.

    The claim is enqueued for a second pass by a fraud-detection agent. It runs its own authentication, mints its own cert, gets its own audit trail. Mastio links the two events through a shared claim_id dimension: the audit chain shows fraud-review of claim 12345 was performed by fraud-detection at timestamp X with a hash that binds it to the intake event.

    Audit chain entries: linked event claim.fraud_review, parent event ID, outcome (cleared / flagged), reasoning hash.

  4. 04

    A senior human adjuster approves with override.

    A human, not an agent, opens the case in the claims dashboard. The dashboard authenticates them via Frontdesk SSO (Keycloak, Okta, or Azure AD). When they click approve, the dashboard captures the human's principal ID, the timestamp, the override reason they typed, and writes a signed event. The signature uses the human's own dashboard session key, not the agent's cert.

    Audit chain entries: human override event, principal ID, override reason, dashboard session hash, downstream action (payout / referral / denial).

  5. 05

    Reassurer agent at the partner carrier picks up.

    For claims above the retention threshold, a reassurer agent at the reassurance partner must ratify. This is the cross-organization step. Cullis Court federates the event: the reassurance partner's Mastio receives a sealed envelope that Court routes but cannot open. The reassurer agent reads the case and signs its decision, and the envelope returns through the same path.

    Audit chain entries: cross-org event reassurer.ratify, partner org ID, sealed envelope hash, Court routing receipt, dual-write confirmation on both Mastios.

  6. 06

    Eighteen months later, a regulator asks.

    An AI Act conformity assessor (or your national insurance authority) requests evidence for claim 12345. Your compliance team exports the audit chain segment for that claim ID. The export is a signed bundle that the regulator can verify externally, without trusting Cullis or your IT team. The verification proves the chain is unbroken from intake to payout, every signature ties back to a known principal, and no event was tampered with.

    Auditor verification: cullis-cli audit verify --bundle claim-12345.tar --pubkey regulator.pem returns OK or names the broken link.
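The linking that the journey relies on can be sketched as a plain hash chain. The sketch below is illustrative only: the event fields and the hashing scheme (SHA-256 over the previous hash plus a canonical JSON body) are simplifying assumptions, not Mastio's actual wire format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the chain head before any event

def chain_event(prev_hash: str, event: dict) -> dict:
    """Bind an event to the chain: hash(prev_hash || canonical event body)."""
    body = json.dumps(event, sort_keys=True).encode()
    event_hash = hashlib.sha256(prev_hash.encode() + body).hexdigest()
    return {**event, "prev_hash": prev_hash, "event_hash": event_hash}

# The four in-house steps of the walkthrough, reduced to toy events.
chain = []
head = GENESIS
for event in [
    {"type": "agent.enroll", "agent_id": "claim-intake",
     "spiffe_id": "spiffe://yourcompany.eu/claim-intake"},
    {"type": "llm.call", "claim_id": 12345,
     "prompt_sha256": hashlib.sha256(b"claim narrative").hexdigest()},
    {"type": "claim.fraud_review", "claim_id": 12345, "outcome": "cleared"},
    {"type": "human.approve", "claim_id": 12345,
     "principal_id": "adjuster-7", "override_reason": "manual payout"},
]:
    entry = chain_event(head, event)
    chain.append(entry)
    head = entry["event_hash"]
```

Because each event_hash covers the previous hash, reordering, dropping, or editing any step changes every hash after it, which is what makes the exported segment tamper-evident.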


04 What gets recorded

Every step, hashed and chained.

Each event written to the audit log is hashed and chained to the previous one. The chain head is anchored periodically via RFC 3161 timestamping. An auditor exporting the chain for a specific claim receives a self-contained bundle that verifies without needing to reach back to Cullis or your systems.

Step · Event type · Captured fields
01 Intake · agent.enroll, cert.mint · agent_id, SPIFFE ID, cert thumbprint, expires_at, signing CA fingerprint
02 LLM call · llm.call · prompt SHA-256, model name, completion tokens, cost, latency_ms, policy decision, decision reason
03 Fraud review · claim.fraud_review · parent_event_id, fraud_agent_id, outcome enum, reasoning SHA-256
04 Human override · human.approve · principal_id, dashboard_session_id, override_reason, downstream_action, signature
05 Reassurer ratify · reassurer.ratify · partner_org_id, sealed_envelope_hash, court_routing_id, dual_write_confirmation_local, dual_write_confirmation_remote
06 Audit export · audit.export · requester_id, claim_id_filter, time_window, bundle_sha256, anchor_tsa_token

05 Where Cullis ends

The substrate, not the brain.

Compliance with the AI Act is not a single-vendor problem. Cullis is honest about its scope: identity, policy enforcement before the call, and tamper-evident audit. The rest stays where it should: with your data scientists, your MLOps team, and your fairness reviewers.

  • Does not analyze model bias or fairness across demographic groups. Use Fairlearn, AIF360, Aequitas, or your in-house tooling. Cullis provides the per-decision provenance those tools need to operate on a defensible data set.
  • Does not test model accuracy or calibration. Use your model evaluation pipeline. Cullis records which model ran when, so your evaluation can correlate accuracy drift with audit events.
  • Does not make the policy. The PDP enforces decisions you write (allow/deny rules, scope-based access, rate limits). Cullis runs them; your compliance team designs them.
  • Does not replace the human-in-the-loop requirement. Article 14 of the AI Act mandates human oversight on high-risk systems. Cullis makes the human override evidence trustworthy; it does not remove the human.
  • Does not file regulatory paperwork on your behalf. Conformity assessments, post-market monitoring reports, and incident notifications under Article 73 remain your operational responsibility. Cullis ships the substrate; you write the report.
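The kind of rule set the PDP enforces can be illustrated with a toy evaluator. The rule shape and field names below are invented for illustration and are not Cullis's policy language; in particular, the rate limiter here is a simple counter with no time window.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Rule:
    agent: str                        # agent identity the rule applies to
    action: str                       # e.g. "llm.call", "payout.issue"
    effect: str                       # "allow" or "deny"
    max_per_minute: Optional[int] = None  # optional rate limit on allows

@dataclass
class PDP:
    rules: list
    counters: dict = field(default_factory=dict)

    def decide(self, agent: str, action: str) -> str:
        for rule in self.rules:
            if rule.agent == agent and rule.action == action:
                if rule.effect == "deny":
                    return "denied"
                if rule.max_per_minute is not None:
                    key = (agent, action)
                    self.counters[key] = self.counters.get(key, 0) + 1
                    if self.counters[key] > rule.max_per_minute:
                        return "rate-limited"
                return "allowed"
        return "denied"  # default-deny: no matching rule, no call

# Rules your compliance team might write; Cullis only executes them.
pdp = PDP(rules=[
    Rule("claim-intake", "llm.call", "allow", max_per_minute=2),
    Rule("claim-intake", "payout.issue", "deny"),
])
```

Each decide() outcome corresponds to the policy decision field (allowed / denied / rate-limited) recorded on the llm.call audit event.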

06 Get started

A 90-day pilot on one BU.

We are looking for two to three insurance design partners in the EU for the second half of 2026. The scope is conservative and reversible: one AI lab business unit, five to ten agents in non-production-critical workloads, no SLA. In exchange we ask for a 30-minute fortnightly review call and the willingness to give a verbal reference if the pilot works. No logo, no public quote required.

Scope

One BU, 5-10 agents

Pick one insurance line (motor, life, health, P&C) or one process (claim intake, fraud, pricing). Five to ten agent identities are enough to stress the audit chain without putting production at risk.

Duration

90 days

Long enough to see a real claim lifecycle. Short enough to decide cleanly. Termination clause: 30 days' notice, source code escrow included.

Cost

EUR 15-25k all-in

Covers pilot license, onboarding, weekly review cadence, and a Phase-1 external pentest run in parallel by an independent assessor (mediaservice.net or NCC Group). The Letter of Attestation is yours at pilot end.

Output

Decision-ready evidence

A signed audit bundle for ten real claim journeys, a compliance mapping against your in-house AI Act and IDD controls, and a pentest Letter of Attestation. Enough to decide whether to go to production.