EvoXplain — XAI infrastructure for high-stakes ML

01 / problem

Two models that agree on every prediction can disagree completely on why.

Modern ML pipelines exhibit mechanistic non-identifiability: equally accurate models, reached by retraining the same pipeline with different seeds, splits, or hyperparameters, can rely on disjoint feature sets to make their decisions. Averaging explanations across reruns produces a ghost: a vector that resembles no mechanism any individual model actually uses, yet it is what regulators, clinicians, and auditors are typically shown.
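The ghost effect can be illustrated with a minimal sketch. The attribution vectors below are invented for illustration (six hypothetical features, two hypothetical basins) and are not EvoXplain output; the helper names `cosine` and `mean_vector` are likewise assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

# Hypothetical attributions from four retrained models.
# Basin A relies on features 0-2, basin B on features 3-5.
basin_a = [[0.9, 0.8, 0.7, 0.0, 0.0, 0.0], [0.8, 0.9, 0.6, 0.0, 0.0, 0.0]]
basin_b = [[0.0, 0.0, 0.0, 0.7, 0.9, 0.8], [0.0, 0.0, 0.0, 0.6, 0.8, 0.9]]
runs = basin_a + basin_b

ghost = mean_vector(runs)                       # the averaged explanation
within = cosine(basin_a[0], basin_a[1])         # same basin: close to 1
across = cosine(basin_a[0], basin_b[0])         # disjoint features: 0
ghost_sims = [cosine(ghost, r) for r in runs]   # ghost vs. each real run

print(f"within-basin cosine: {within:.3f}")
print(f"across-basin cosine: {across:.3f}")
print("ghost vs. runs:", [round(s, 3) for s in ghost_sims])
```

In this toy case the averaged vector spreads weight over all six features, so it is less similar to every individual run than runs within a basin are to each other.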

~/audit/explanation_drift.log

$ audit --pipeline tcga_lr_cgrid --seeds 800..899 --lens shap
[ok] predictive accuracy: 0.842 ± 0.011    stable
[ok] prediction agreement: 0.971    stable
[!!] attribution cosine var: 0.184    unstable
[!!] decoupling ratio: ~41×    predictions stable, mechanisms not
$ cluster --within-split --silhouette-min 0.25
→ k* = 3 SHAP basins detected
→ basin centroids encode disjoint biological pathways
→ averaged explanation corresponds to no individual basin
[verdict] ghost average detected: explanation is not mechanistic

02 / approach

A nine-block falsification battery for explanation reliability.

EvoXplain treats the attribution-vector manifold as a topology with reproducible basins that are pipeline-dependent and biologically meaningful. The battery is engineered to try to kill the multiplicity hypothesis under every plausible null: randomness, sampling noise, model instability, hyperparameter drift, and attribution-method choice. Multiplicity that survives every block is treated as real.

block 01

Stochasticity & seed robustness

killed · BC + TCGA

block 02

Sampling noise & null perturbation

killed · BC + TCGA

block 03

Model instability under retraining

killed · BC + TCGA

block 04

Hyperparameter sensitivity (factorial ARM-D)

killed · BC + TCGA

block 05

Attribution-method dependence (multi-lens)

killed · BC + TCGA

block 06

Cross-pipeline persistence (LR · DNN · XGB · ENet)

killed · BC + TCGA

block 07

Within-split clustering integrity

killed · BC + TCGA

block 08

Silhouette thresholding & k=1 null

killed · BC + TCGA

block 09

Semantic content of basin centroids

killed · BC + TCGA
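The silhouette thresholding and k=1 null of block 08 can be sketched as follows. This is an illustrative reading, not the EvoXplain implementation: the function names, the Euclidean distance, and the rule "pick the best-scoring k that clears the threshold, otherwise report k=1" are assumptions, and the candidate labelings are supplied by hand rather than produced by a clustering algorithm. The 0.25 default mirrors the `--silhouette-min 0.25` value shown in the log above.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def silhouette_score(points, labels):
    """Mean silhouette coefficient; needs at least two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette is undefined for a single cluster")
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:                      # singleton cluster scores 0
            scores.append(0.0)
            continue
        # a: mean distance to own cluster; b: mean distance to nearest other cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

def select_k(points, candidate_labelings, silhouette_min=0.25):
    """Return (k*, score). Falls back to (1, None) if no k clears the threshold."""
    best_k, best_score = 1, None
    for k, labels in candidate_labelings.items():
        score = silhouette_score(points, labels)
        if score >= silhouette_min and (best_score is None or score > best_score):
            best_k, best_score = k, score
    return best_k, best_score

# Toy attribution vectors: three well-separated groups in 2-D.
points = [[0, 0], [0, 1], [1, 0], [10, 0], [10, 1], [11, 0], [0, 10], [0, 11], [1, 10]]
candidates = {
    2: [0, 0, 0, 1, 1, 1, 1, 1, 1],   # merges two of the groups
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],   # matches the true grouping
}
k_star, score = select_k(points, candidates)

# Structureless data with an arbitrary split: the k=1 null should win.
line = [[0], [1], [2], [3], [4], [5]]
null_k, null_score = select_k(line, {2: [0, 1, 0, 1, 0, 1]})

print(k_star, null_k)
```

The fallback is the point of the design: a clustering is only reported as multiple basins when its separation clears the threshold, so noise is not promoted to structure.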

03 / results

Validated on TCGA pan-cancer and BRCA subtype benchmarks.

Across a 100×100 design (100 splits × 100 seeds, seeds 800–899) on TCGA pan-cancer and TCGA-BRCA Luminal-vs-Basal, the falsification battery confirmed that explanation multiplicity persists under every null — and that the resulting basins encode distinct biological pathways, not noise.
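As a minimal sketch of how the 100×100 design could be enumerated: the seed range comes from the text above, while the split indices 0 to 99 and the names `grid` and `runs_by_split` are assumptions made for illustration.

```python
from itertools import product

N_SPLITS = 100             # hypothetical split indices 0..99
SEEDS = range(800, 900)    # seeds 800-899, as stated above

# Full design: every split paired with every seed.
grid = list(product(range(N_SPLITS), SEEDS))

# Group seeds by split so that attribution vectors are only compared
# between runs that saw the same train/test partition.
runs_by_split = {}
for split, seed in grid:
    runs_by_split.setdefault(split, []).append(seed)

print(len(grid))  # 10000
```

Grouping by split keeps seed-driven variation separate from variation caused by the data partition, which is what a within-split analysis needs.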

9/9

falsification blocks killed
BC + TCGA pan-cancer

~41×

decoupling ratio
attribution var ÷ accuracy var

k*=3

SHAP basins on TCGA
k*=2 LIME · disjoint pathways

+0.85

inter-basin cosine
BRCA Luminal vs Basal
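The decoupling ratio above is described as attribution variance divided by accuracy variance. The sketch below shows one way such a ratio could be computed; the exact estimator is not specified here, so treating attribution variance as the population variance of pairwise cosine similarities is an assumption, as are the function names and the toy inputs. The numbers it produces bear no relation to the reported ~41×.

```python
import math
from itertools import combinations
from statistics import pvariance

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def decoupling_ratio(attributions, accuracies):
    """Variance of pairwise attribution cosines over variance of accuracy.

    Assumed definition for illustration. Raises ZeroDivisionError if all
    accuracies are identical.
    """
    sims = [cosine(u, v) for u, v in combinations(attributions, 2)]
    return pvariance(sims) / pvariance(accuracies)

# Toy inputs: four runs with near-identical accuracy but two opposed mechanisms.
attributions = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
accuracies = [0.84, 0.85, 0.83, 0.84]

ratio = decoupling_ratio(attributions, accuracies)
print(f"decoupling ratio: {ratio:.0f}x")
```

A ratio well above 1 indicates that explanations vary far more across reruns than predictive performance does.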

04 / applications

Built for environments where the wrong explanation has consequences.

EvoXplain is designed for organisations where ML decisions are subject to scientific scrutiny, regulatory audit, or legal challenge — and where averaged or single-seed explanations are insufficient evidence.

/ 01

Pharma & Biotech

Detect when biomarker attributions, drug-target rankings, or patient-stratification features are pipeline artefacts before they enter a study design. Distinguish reproducible biological signal from explanation drift across retraining runs.

aligned with: FDA AI/ML-based SaMD action plan · GxP model documentation

/ 02

Financial Services

Provide model risk teams with falsifiable evidence that explanation features are mechanistically grounded — not artefacts of the training pipeline. Integrates into model validation, challenger reviews, and adverse-action documentation.

aligned with: SR 11-7 model risk management · EBA model governance · ECOA

/ 03

Policy & Regulation

Equip auditors and conformity assessment bodies with a reproducible test for whether claimed explanations of high-risk AI systems are robust under retraining — addressing the gap between transparency obligations and post-hoc XAI fragility.

aligned with: EU AI Act art. 13/14 · ISO/IEC 42001 · NIST AI RMF

05 / contact

Open research. Commercial deployments by enquiry.

EvoXplain is released under AGPL-3.0 and is the subject of a UK provisional patent application. The codebase, falsification battery, and reproducibility scripts are open. Commercial deployment, dual-licensing, and bespoke integration are handled through direct enquiry.

RESEARCH & OPEN SOURCE

The framework, preprint, and falsification harness are publicly available. Issues, replications, and academic collaborations are welcome.

arxiv.org/abs/2512.22240 →

COMMERCIAL & LICENSING

Integration into regulated ML pipelines, dual-licensing for commercial use, model-risk consulting, and conformity-assessment partnerships.

contact@evoxplain.com →

Legal notice. EvoXplain is the subject of a UK provisional patent application (sole inventor: Chama Bensmail). The framework is licensed under AGPL-3.0; commercial use that does not satisfy AGPL-3.0 obligations requires a separate licence. EvoXplain Ltd (England & Wales).

Identifiability in predictions does not imply identifiability in mechanisms.
