EvoXplain Ltd  ·  arXiv:2512.22240  ·  UK provisional patent filed  ·  AGPL-3.0
Falsifiable XAI for high-stakes machine learning

Identifiability in predictions does not imply identifiability in mechanisms.

EvoXplain detects when a model's explanations are pipeline artefacts rather than mechanistic reality — providing falsifiable evidence for model risk teams in pharma, biotech, financial services, and AI regulation.

01 / problem

Two models that agree on every prediction can disagree completely on why.

Modern ML pipelines exhibit mechanistic non-identifiability: equally accurate models, reached by retraining the same pipeline with different seeds, splits, or hyperparameters, can rely on disjoint feature sets to make their decisions. Averaging explanations across reruns produces a ghost: a vector that resembles no mechanism any individual model actually uses, yet it is what regulators, clinicians, and auditors are typically shown.
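
The ghost effect can be illustrated with a toy example. The sketch below is not EvoXplain code: the four-feature attribution vectors, the two basins, and all variable names are hypothetical, chosen only to show how a mean over disjoint mechanisms spreads weight across features that no single run uses together.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical attribution vectors over four features.
# Basin A relies only on features 0-1, basin B only on features 2-3.
basin_a = np.array([1.0, 1.0, 0.0, 0.0])
basin_b = np.array([0.0, 0.0, 1.0, 1.0])

# Pretend 100 equally accurate reruns landed half in each basin.
runs = np.stack([basin_a] * 50 + [basin_b] * 50)

# The averaged explanation that would typically be reported.
ghost = runs.mean(axis=0)

# Similarity of the average to every individual run.
sims = np.array([cosine(ghost, r) for r in runs])

# Number of features with non-zero attribution, per run and for the average.
support_per_run = (runs != 0).sum(axis=1)
ghost_support = int((ghost != 0).sum())

print("basin A vs basin B cosine:", cosine(basin_a, basin_b))
print("max cosine(ghost, run):   ", sims.max())
print("features used per run:    ", support_per_run.max())
print("features used by ghost:   ", ghost_support)
```

In this toy setting the average credits all four features, while every individual run uses only two, so the reported vector matches neither mechanism.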

~/audit/explanation_drift.log
$ audit --pipeline tcga_lr_cgrid --seeds 800..899 --lens shap
[ok] predictive accuracy:    0.842 ± 0.011   stable
[ok] prediction agreement:   0.971           stable
[!!] attribution cosine var: 0.184           unstable
[!!] decoupling ratio:       ~41×            predictions stable, mechanisms not
$ cluster --within-split --silhouette-min 0.25
k* = 3 SHAP basins detected
basin centroids encode disjoint biological pathways
averaged explanation corresponds to no individual basin
[verdict] ghost average detected — explanation is not mechanistic
02 / approach

A nine-block falsification battery for explanation reliability.

EvoXplain treats the attribution-vector manifold as a topology with reproducible basins that are pipeline-dependent and biologically meaningful. The battery is engineered to kill the multiplicity hypothesis under every plausible null: randomness, sampling noise, model instability, hyperparameter drift, and attribution-method choice. Multiplicity that survives every null is real.

block 01 · Stochasticity & seed robustness · killed · BC + TCGA
block 02 · Sampling noise & null perturbation · killed · BC + TCGA
block 03 · Model instability under retraining · killed · BC + TCGA
block 04 · Hyperparameter sensitivity (factorial ARM-D) · killed · BC + TCGA
block 05 · Attribution-method dependence (multi-lens) · killed · BC + TCGA
block 06 · Cross-pipeline persistence (LR · DNN · XGB · ENet) · killed · BC + TCGA
block 07 · Within-split clustering integrity · killed · BC + TCGA
block 08 · Silhouette thresholding & k=1 null · killed · BC + TCGA
block 09 · Semantic content of basin centroids · killed · BC + TCGA
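
Block 08 (silhouette thresholding with a k=1 null) can be sketched as follows. This is an illustrative reading, not the EvoXplain harness: it assumes basins are found by clustering per-run attribution vectors, that the number of basins falls back to k=1 whenever no clustering reaches a minimum silhouette (0.25 here, mirroring the `--silhouette-min 0.25` flag in the log above), and it swaps in a small NumPy k-means for whatever clustering method the battery actually uses. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50):
    """Tiny k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def silhouette(X, labels):
    """Mean silhouette coefficient (Euclidean); needs >= 2 clusters."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = D[i, same].sum() / (same.sum() - 1)  # mean intra-cluster distance
        b = min(D[i, labels == j].mean() for j in clusters if j != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

def select_k(X, k_max=6, silhouette_min=0.25):
    """Return (k*, best silhouette); k* = 1 if no k clears the threshold."""
    best_k, best_s = 1, -1.0
    for k in range(2, k_max + 1):
        labels = kmeans_labels(X, k)
        if len(np.unique(labels)) < 2:
            continue
        s = silhouette(X, labels)
        if s > best_s:
            best_k, best_s = k, s
    return (best_k, best_s) if best_s >= silhouette_min else (1, best_s)

# Synthetic stand-in for attribution vectors: three well-separated basins.
rng = np.random.default_rng(0)
blobs = np.vstack([c + 0.1 * rng.standard_normal((20, 3))
                   for c in np.eye(3) * 5.0])
k_star, score = select_k(blobs)
print("k* =", k_star, "silhouette =", round(score, 3))
```

The point of the fallback is that a clustering algorithm will always return k clusters if asked; requiring a minimum silhouette gives unstructured attribution clouds a way to be reported as a single basin.
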
03 / results

Validated on TCGA pan-cancer and BRCA subtype benchmarks.

Across a 100×100 design (100 splits × 100 seeds, seeds 800–899) on TCGA pan-cancer and TCGA-BRCA Luminal-vs-Basal, the falsification battery confirmed that explanation multiplicity persists under every null — and that the resulting basins encode distinct biological pathways, not noise.

9/9 falsification blocks killed (BC + TCGA pan-cancer)
~41× decoupling ratio (attribution var ÷ accuracy var)
k*=3 SHAP basins on TCGA (k*=2 LIME · disjoint pathways)
+0.85 inter-basin cosine (BRCA Luminal vs Basal)
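
The decoupling ratio is described above as attribution variance divided by accuracy variance. A minimal sketch of that ratio follows, assuming attribution variance is estimated as the variance of pairwise cosine similarities between per-run attribution vectors; the source does not specify the estimator, so this choice, the function names, and the toy inputs are all assumptions and will not reproduce the reported ~41×.

```python
import numpy as np

def pairwise_cosines(attributions):
    """Cosine similarity for every unordered pair of runs (rows must be non-zero)."""
    unit = attributions / np.linalg.norm(attributions, axis=1, keepdims=True)
    sims = unit @ unit.T
    upper = np.triu_indices(len(attributions), k=1)
    return sims[upper]

def decoupling_ratio(accuracies, attributions):
    """Variance of attribution similarity divided by variance of accuracy.

    Assumed estimator: population variance (NumPy default, ddof=0) on both
    sides. Undefined when accuracy does not vary at all across runs.
    """
    attr_var = np.var(pairwise_cosines(attributions))
    acc_var = np.var(accuracies)
    return float(attr_var / acc_var)

# Toy inputs: accuracy barely moves, attributions flip between two features.
accuracies = np.array([0.84, 0.85, 0.84, 0.85])
attributions = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [1.0, 0.0],
                         [0.0, 1.0]])

ratio = decoupling_ratio(accuracies, attributions)
print("decoupling ratio:", ratio)
```

A ratio well above 1 indicates that explanations vary far more across reruns than predictive performance does, which is the pattern the metric is meant to flag.
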
04 / applications

Built for environments where the wrong explanation has consequences.

EvoXplain is designed for organisations where ML decisions are subject to scientific scrutiny, regulatory audit, or legal challenge — and where averaged or single-seed explanations are insufficient evidence.

/ 01

Pharma & Biotech

Detect when biomarker attributions, drug-target rankings, or patient-stratification features are pipeline artefacts before they enter a study design. Distinguish reproducible biological signal from explanation drift across retraining runs.

aligned with: FDA AI/ML-based SaMD action plan · GxP model documentation
/ 02

Financial Services

Provide model risk teams with falsifiable evidence that explanation features are mechanistically grounded — not artefacts of the training pipeline. Integrates into model validation, challenger reviews, and adverse-action documentation.

aligned with: SR 11-7 model risk management · EBA model governance · ECOA
/ 03

Policy & Regulation

Equip auditors and conformity assessment bodies with a reproducible test for whether claimed explanations of high-risk AI systems are robust under retraining — addressing the gap between transparency obligations and post-hoc XAI fragility.

aligned with: EU AI Act art. 13/14 · ISO/IEC 42001 · NIST AI RMF
05 / contact

Open research. Commercial deployments by enquiry.

EvoXplain is released under AGPL-3.0, and a UK provisional patent has been filed. The codebase, falsification battery, and reproducibility scripts are open. Commercial deployment, dual-licensing, and bespoke integration are handled through direct enquiry.

RESEARCH & OPEN SOURCE

The framework, preprint, and falsification harness are publicly available. Issues, replications, and academic collaborations are welcome.

COMMERCIAL & LICENSING

Integration into regulated ML pipelines, dual-licensing for commercial use, model-risk consulting, and conformity-assessment partnerships.