Observability · 9 min read

85% of Australian AI Projects Hit an Explainability Wall. Here Is How to Build Through It.

A global study of 600 CIOs found 85% say traceability and explainability gaps have already delayed or stopped AI projects from reaching production. In Australia, that gap is now a compliance deadline. Here is the engineering playbook for closing it before December 2026.

Rahul Pagidi

Data Engineer. Azure 6x Microsoft Certified. Monash University.

In February 2026, Dataiku and Harris Poll surveyed 600 CIOs globally. The single finding that should concern every Australian technology leader: 85% say traceability or explainability gaps have already delayed or stopped AI projects from reaching production.

Not 85% in theory. In practice. Projects that were technically functional, potentially valuable — killed because nobody could explain what the system was doing well enough to get it through governance, legal review, or board sign-off.

In Australia, this problem is about to acquire a hard deadline.

From 10 December 2026, any Australian APP entity using AI or computer programs to make or substantially assist in decisions that significantly affect individuals must be able to produce a meaningful explanation of those decisions on request. The explainability gap that has been a soft blocker is becoming a hard legal requirement. The OAIC launched its first proactive compliance sweep in January 2026. The regulator is already checking.

This article is not about the philosophy of explainability. It is the engineering playbook — what you need to build, how it fits into your architecture, and how to get there before December without rebuilding everything from scratch.

What the Explainability Wall Actually Is

Before covering the solution, it is worth being precise about the problem — because "explainability" is used loosely enough that it obscures what needs to be built.

There are two different things people mean when they say AI explainability, and only one of them is your problem.

The first kind is interpretability: understanding why a model produces a given output from its internal weights, attention patterns, and activation states. This is the academic and philosophical version — making the black box transparent. For deep neural networks, it is hard, expensive, and often not necessary.

The second kind is operational observability: recording what the system did, what data it used, what steps it took, and what decision it produced — in enough detail that a human can reconstruct the reasoning. This is the pragmatic version. It is achievable with current engineering. It is what the December 2026 Privacy Act obligations require. And it is what 85% of CIOs are missing.

The good news: you do not need to open the black box. You need to build a transparent wrapper around it.

Why Australian Businesses Are Specifically Exposed Right Now

The Dataiku finding applies globally. Three factors make it more acute for Australian businesses in 2026.

The Privacy Act deadline is specific and approaching. The 10 December 2026 automated decision-making transparency obligations are not aspirational guidance — they are enforceable law. From that date, an individual who is significantly affected by an automated decision can request a meaningful explanation. If your system cannot produce one, that is a Privacy Act breach. The penalties for serious or repeated breaches reach AUD $50 million for bodies corporate.

The OAIC is already checking. Australia's privacy regulator launched its first proactive compliance sweep in January 2026, targeting approximately 60 organisations across financial services, health, retail, telecommunications, professional services, and digital platforms. The regulator is not waiting for complaints. It is actively looking at AI-related personal data handling. If you receive a sweep notice with systems that have no audit trail, you are already in a difficult position.

The time to build is shorter than it looks. Adding observability to an AI system that was not designed for it is not a configuration change — it is a retrofit. For complex production systems, this can be a 6–12 week engineering project. With eight months until the December deadline and typical enterprise delivery timelines, organisations that have not started are already in the pressure zone.

The Four Layers of Observability-First AI

Building through the explainability wall requires four layers, each addressing a different aspect of the problem. These can be built independently but are designed to work together.

Layer 1: Run-Level Logging

Every time your AI system executes — whether triggered by a human action, a scheduled job, or an event — it generates a run. That run needs a complete, immutable record:

  • Run ID: A unique identifier for this specific execution
  • Trigger: What caused the system to run (user request, schedule, API call, event)
  • Input snapshot: The exact data state at the moment the run began — not a reference to a live record that may have since changed, but an immutable copy of the inputs
  • Output: What the system produced — the decision, recommendation, score, or action
  • Status: Success, partial, failure, escalated
  • Duration and timestamps: When each phase started and completed
  • Agent or model version: Which specific version of the system was running at the time

The input snapshot is the piece most often missing from retrofitted systems. If an individual asks "why did your AI decline my application in October 2025?" and the underlying data has since changed, you need the October state — not the current state. Build input snapshotting from the start.
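As a minimal sketch of the run record above (all function and field names here are illustrative, not a specific library's API), using only the standard library. The key detail is that the input snapshot is serialised at run start, so later changes to the live data cannot alter the record:

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone

def create_run_record(trigger: str, inputs: dict, model_version: str) -> dict:
    """Open a run record at the moment execution begins."""
    # Freeze the inputs now: a serialised copy, not a reference to a live record.
    snapshot = json.dumps(inputs, sort_keys=True, default=str)
    return {
        "run_id": str(uuid.uuid4()),
        "trigger": trigger,                       # user request, schedule, API call, event
        "input_snapshot": snapshot,               # exact data state at run start
        "input_hash": hashlib.sha256(snapshot.encode()).hexdigest(),  # tamper evidence
        "model_version": model_version,           # which version was running at the time
        "started_at": datetime.now(timezone.utc).isoformat(),
        "status": "running",                      # later: success | partial | failure | escalated
        "output": None,
        "completed_at": None,
    }

def close_run_record(record: dict, output: dict, status: str = "success") -> dict:
    """Finalise the record. In production, write once to an audit store; never update in place."""
    return {**record,
            "output": output,
            "status": status,
            "completed_at": datetime.now(timezone.utc).isoformat()}
```

The hash of the snapshot is one simple way to make a retrofitted log trustworthy: it lets a reviewer later verify that the stored inputs have not been edited.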

Layer 2: Step-Level Tracing

Run logs tell you what went in and what came out. Tracing tells you what happened in between.

A complex AI system — an agent that calls multiple tools, processes documents in stages, or applies a sequence of rules — needs distributed tracing. Each step in the process is a span in the trace, with:

  • Span ID and parent span ID (linking steps to the overall run)
  • Tool or sub-process invoked (what the system did at this step)
  • Inputs to this step (what data the step received)
  • Output of this step (what it produced)
  • Timing (how long this step took)
  • Decision or rule applied (what logic governed this step's output)

This is the layer that makes "what did your AI system do?" answerable in specific, step-by-step detail. For a loan document processing agent, the trace shows: read document, extracted income figure, cross-referenced against bank statement, identified discrepancy, flagged for human review. For a Privacy Act explanation request, this trace is the explanation.

OpenTelemetry is the open standard for distributed tracing and is well-supported across the major cloud providers (AWS X-Ray, Azure Monitor, Google Cloud Trace). For smaller systems, structured logging with trace IDs is often sufficient.

Layer 3: Decision Rationale

For decisions that significantly affect individuals — the specific category triggering the Privacy Act obligations — tracing alone may not be sufficient. The December 2026 obligations require a "meaningful explanation." That means a human-readable account of the key factors that led to the decision.

Building decision rationale requires a deliberate design choice at the output layer of your AI system: the system must generate an explanation alongside the decision, not just the decision itself.

For rule-based systems, this is straightforward: the decision rationale is the set of rules that fired. For ML model outputs (scores, classifications), the rationale is typically the top contributing features or signals.

Practical patterns:

  • For classification systems: log the confidence score, the top three features contributing to the classification, and their direction of influence (increased or decreased probability)
  • For scoring systems: log the normalised contribution of each input variable to the final score
  • For document processing agents: log the specific text excerpts and data points that informed each extraction decision
  • For agentic workflows: log the "thinking" or reasoning step the agent used to make its routing or action decision

These rationale records need to be stored alongside the run and trace logs — linked by the same run ID — so that when an explanation request arrives, you can pull a complete explanation in one query.
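A minimal sketch of the classification pattern above, assuming per-feature contribution values are available from the model (the feature names and the additive-contribution assumption are illustrative):

```python
def build_rationale(run_id: str, score: float, contributions: dict[str, float]) -> dict:
    """Record the top three contributing factors alongside the decision,
    keyed by the same run ID as the run and trace records."""
    top = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:3]
    return {
        "run_id": run_id,   # one query pulls run record, trace, and rationale together
        "score": score,     # the confidence or decision score as produced
        "top_factors": [
            {"feature": name,
             "contribution": round(value, 3),
             "direction": "increased" if value > 0 else "decreased"}
            for name, value in top
        ],
    }

rationale = build_rationale(
    "run-001",
    score=0.34,
    contributions={"income_mismatch": -0.41, "ltv_ratio": -0.12,
                   "credit_history": 0.22, "employment_length": 0.05},
)
```

For models without native contribution values, the same record shape can be filled from a post-hoc attribution method; the storage and linkage pattern does not change.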

Layer 4: The Explanation Request Process

Observability infrastructure is useless without a process to use it. The December 2026 obligations require not just the technical capability to explain decisions but an operational process for doing so when asked.

The process needs four components:

Intake: A clearly communicated channel for individuals to submit explanation requests (covered in the privacy policy and any decision notifications)

Lookup: An internal capability to search your run logs, traces, and decision rationale by individual identifier, decision date, and product — and return a structured summary within a reasonable timeframe

Translation: The technical explanation in human-readable form — not "Model v2.3.1 returned output: 0.34" but "Your application was declined because the income figure on your payslip did not match the deposits in your bank statement for the same period."

Response: Delivery of the explanation to the individual, documented and logged

This process should be tested before December 2026 — not on the first live request. Run internal test cases, identify gaps in your logs, and adjust.
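The translation step is the part most teams under-specify. One workable pattern is a reviewed template table that maps rationale factors to plain-language sentences, so the human-readable explanation is assembled from the stored rationale record rather than written ad hoc. A sketch, with hypothetical factor names and template text:

```python
# Hypothetical template table, maintained and reviewed like any other content asset.
TEMPLATES = {
    "income_mismatch": ("The income figure on your payslip did not match the "
                        "deposits in your bank statement for the same period."),
    "ltv_ratio": "The requested loan amount was high relative to the property value.",
}

def translate(decision: str, rationale: dict) -> str:
    """Render a stored rationale record as a human-readable explanation.
    Only factors that worked against the applicant, and that have a
    reviewed template, appear in the response."""
    reasons = [TEMPLATES[f["feature"]]
               for f in rationale["top_factors"]
               if f["direction"] == "decreased" and f["feature"] in TEMPLATES]
    return f"Your application was {decision} because: " + " ".join(reasons)
```

Because the templates are fixed text, legal and compliance teams can review every sentence the system might emit before the first live request arrives.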

Industry Architectures for Australian Markets

Financial Services: Loan Decisioning

The decision: Application approved, declined, or referred to manual review.

Observability requirements: Input snapshots of all submitted documents and extracted data at decision time. Step-level traces showing document reading, data extraction, cross-referencing, and rule evaluation. Decision rationale showing which criteria were met, which were not, and what thresholds applied.

The explanation request scenario: "My mortgage application was declined by your AI. I want to know why."

What you need to return: A human-readable summary showing the key factors — income verification result, credit reference check, loan-to-value ratio — and which of those factors fell outside the lending criteria. Not the model internals. Not the algorithm. The business logic that produced the outcome.

Architecture note: Store run records in a write-once audit database (DynamoDB with point-in-time recovery, or Postgres with immutable event tables). Index by applicant identifier and decision date. Target query time for explanation lookup: under 30 seconds.
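The access pattern behind that 30-second target is a single indexed read, never a scan. A toy sketch of the append-only store and its composite index (an in-memory dict stands in for DynamoDB or Postgres; all names are illustrative):

```python
from collections import defaultdict

# Stand-in for the write-once audit database, keyed by (applicant_id, decision_date).
AUDIT_STORE: defaultdict = defaultdict(list)

def write_run(record: dict) -> None:
    """Append-only write; existing records are never updated or deleted."""
    key = (record["applicant_id"], record["decision_date"])
    AUDIT_STORE[key].append(record)

def lookup(applicant_id: str, decision_date: str) -> list:
    """Explanation lookup: one indexed read returns every run for
    that applicant on that date, in write order."""
    return AUDIT_STORE[(applicant_id, decision_date)]
```

In DynamoDB terms the key pair maps to a partition key and sort key; in Postgres, to a composite index on an insert-only event table.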

Healthcare: Clinical Triage

The decision: Referral urgency classification and routing.

Observability requirements: Input snapshot of the referral at submission time. Step-level trace showing clinical data extraction, urgency criteria evaluation, specialist routing decision. Decision rationale noting the clinical signals (symptoms, duration, age/comorbidity factors) that contributed to the urgency classification.

The explanation request scenario: A patient's GP asks why the referral was classified as routine rather than urgent.

What you need to return: The clinical factors assessed and the specific criteria that resulted in the routine classification — with the caveat that all classifications are subject to clinical review.

Architecture note: All processing must occur within Australian jurisdiction. Use AWS Sydney (ap-southeast-2) or Azure Australia East. Health information has heightened Privacy Act protections — store explanation records accordingly.

Professional Services: Contract Review

The decision: Contract risk rating and flagged clause identification.

Observability requirements: Input snapshot of the contract at review time. Trace showing clause identification, categorisation, and risk assessment steps. Decision rationale noting which clauses triggered risk flags and which specific contract terms or legislative references applied.

The explanation request scenario: A client asks why a particular clause was flagged as non-standard.

What you need to return: The specific clause text, the category it was classified as (e.g., limitation of liability), the firm's standard position on that category, and the deviation that triggered the flag.

Architecture note: Confidentiality is paramount. Run explanation requests over the same data-access control layer as the primary system — only authorised staff can retrieve explanation records for specific clients.

Building the Retrofit vs. Building It Right

For organisations with existing AI systems that lack observability, there are two paths.

The retrofit: Adding observability layers to an existing production system. This is achievable but requires careful design to avoid breaking existing functionality and to ensure the new logs are trustworthy (not reconstructed after the fact). Expect 6–12 weeks for a complex system. Start with run-level logging (usually the easiest to add) and work down to step-level tracing and decision rationale.

Building it right: For any new AI system, build observability in from the first line of code. The cost is approximately 20–30% more engineering time upfront. The benefit is a system that is compliant from day one, easier to debug, and dramatically easier to maintain. There is no retrofitting cost later.

For the December 2026 deadline: if your existing AI systems lack observability and the remediation timeline is tight, prioritise Tier 1 use cases — the ones making decisions that most significantly affect individuals (credit, employment, health, access to services). These carry the highest enforcement risk and the most urgent compliance need.

The Governance Layer on Top

Observability infrastructure enables governance but does not replace it. Once you have the logs, traces, and decision rationale, you need:

  • Periodic review: Regular sampling of decisions and their explanations by a human reviewer. Are the explanations coherent? Are they consistent with what the system was designed to do?
  • Anomaly detection: Alerts when decision rationale patterns change unexpectedly — a signal that model drift or data quality issues are affecting outputs
  • Explanation register: A record of all explanation requests received, responses provided, and outcomes — demonstrating to the OAIC that your process works
  • Escalation path: What happens when an explanation request reveals a decision that may have been incorrect? Who reviews it? What is the remediation process?

This governance layer is lightweight in a well-designed system. It is expensive and chaotic in a system built without observability.
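The anomaly-detection item can start very simply. Assuming you track each factor's share of top-ranked rationales per review period (a metric and threshold of my own choosing here, purely as a sketch), a drift check is a few lines:

```python
def rationale_drift(baseline: dict, current: dict, threshold: float = 0.15) -> list:
    """Flag factors whose share of top-ranked rationales shifted beyond the
    threshold since the baseline period. A crude proxy for model drift or
    data-quality issues; tune the threshold to your decision volume."""
    return [factor for factor in baseline
            if abs(current.get(factor, 0.0) - baseline[factor]) > threshold]
```

A flagged factor does not prove a problem; it triggers the periodic-review step above on a targeted sample of runs.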

The 90-Day Build Plan

For Australian businesses that are behind on this and need to get to compliance by December 2026, here is a practical sequence:

Days 1–14: Inventory and triage
Identify every AI system that makes or substantially assists in decisions affecting individuals. For each: does it have run-level logging? Step-level tracing? Decision rationale? A process for explanation requests? Categorise by compliance gap size.

Days 15–45: Build run-level logging for Tier 1 systems
For your highest-risk systems (lending, clinical triage, employment screening, access control), implement run-level logging with input snapshots first. This is the fastest win and the most critical gap to close.

Days 46–75: Add step-level tracing and decision rationale
For each Tier 1 system: implement distributed tracing and add decision rationale generation at the output layer. This is the engineering-heavy phase — budget accordingly.

Days 76–90: Test the explanation request process
Run internal test cases. Simulate an explanation request from an affected individual. Can you retrieve the relevant run record, trace, and rationale? Can you translate it into a human-readable explanation? What is the response time?

If you start now, you have time to do this properly. If you start in September, you are in emergency mode.

The Competitive Advantage You Can Build

The businesses that close the explainability gap before December 2026 are not just avoiding a compliance risk. They are building a capability that their competitors in the same sector — the ones that have deployed AI without observability — cannot match.

Observability-first AI can be deployed in APRA-regulated financial services where unexplainable AI cannot. It can be used for employment screening decisions where Privacy Act obligations are most acute. It can be presented to enterprise clients with their own data governance requirements as evidence of responsible AI.

The 85% of AI projects that hit the explainability wall are blocked from some of the highest-value use cases. The 15% that built observability in are operating without that constraint.

For Australian mid-market businesses: the explainability gap is solvable. The time to solve it is now.


*Akira Data builds observability-first AI systems for Australian businesses — run logs, distributed tracing, decision rationale, and explanation request processes built in from day one. Every engagement includes full audit trail infrastructure compliant with the December 2026 Privacy Act obligations.*

*Our [Privacy-Safe AI Implementation](/services#privacy) service (from AUD $20,000) includes a compliance gap analysis, observability architecture, and the technical build required to close your December 2026 gaps. Our [Agentic Workflow Build](/services#workflow) includes production-grade observability as a standard component — not an optional extra.*

*This article references the Dataiku/Harris Poll CIO survey (February 2026, n=600), the Privacy and Other Legislation Amendment Act 2024, and the OAIC's January 2026 compliance sweep. It is general information and does not constitute legal advice.*
