Explainable ML for banks: Domain

Full Analysis

Model risk teams at large Canadian banks have started treating explainability not as a feature but as a validation criterion — meaning a model can be rejected at the governance stage if its explanations do not meet internal standards.

What model risk officers are checking

The evaluation typically covers three dimensions. Fidelity: does the explanation accurately reflect what the model actually computed, rather than offering a plausible-sounding approximation? Consistency: does the same input produce the same explanation across separate inference runs? Stability: do small changes in input produce proportionally small changes in the explanation, or does the explanation flip for minor feature variations? Many post-hoc explanation methods fail the stability test under stress, which model risk officers flag as a reliability concern during validation reviews.
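The three checks above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a validation standard: `predict`, `explain`, `WEIGHTS` and `BASELINE` are hypothetical stand-ins for a bank's model and explanation method, and the toy linear scorer is used only because its correct attributions are known in advance.

```python
import numpy as np

# Hypothetical toy model: a fixed linear scorer standing in for a real credit model.
WEIGHTS = np.array([0.8, -0.5, 0.3])
BASELINE = np.zeros(3)  # assumed reference input for the attributions


def predict(x):
    return float(WEIGHTS @ x)


def explain(x):
    # Per-feature attribution relative to the baseline; exact for a linear model.
    return WEIGHTS * (x - BASELINE)


def fidelity_gap(x):
    # Additive attributions should sum to the change in prediction from the baseline.
    return abs(predict(x) - predict(BASELINE) - explain(x).sum())


def consistency_gap(x, runs=5):
    # Largest difference between explanations of the same input across repeated calls.
    first = explain(x)
    return max(float(np.max(np.abs(explain(x) - first))) for _ in range(runs))


def stability_ratio(x, eps=1e-3, trials=100, seed=0):
    # Largest observed ratio of explanation change to input change
    # over small random perturbations of the input.
    rng = np.random.default_rng(seed)
    base = explain(x)
    worst = 0.0
    for _ in range(trials):
        delta = rng.normal(scale=eps, size=x.shape)
        ratio = np.linalg.norm(explain(x + delta) - base) / np.linalg.norm(delta)
        worst = max(worst, float(ratio))
    return worst


x = np.array([1.0, 2.0, -1.0])
print(fidelity_gap(x), consistency_gap(x), stability_ratio(x))
```

For this toy explainer the fidelity and consistency gaps are zero and the stability ratio cannot exceed the largest absolute weight. Run against a real post-hoc method, the same functions may surface run-to-run variation or sensitivity to small perturbations; any pass or fail threshold would be set by the validation team.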

The gap between vendor claims and validation outcomes

Several major ML platform vendors market their built-in explainability tools as audit-ready. In practice, model risk teams frequently find that these tools satisfy fidelity requirements but fail consistency checks when models are retrained on updated data. Explanations that shift significantly between model versions create a documentation burden — teams must re-validate explanation outputs every time the underlying model changes, which in production environments can mean quarterly reviews.
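One way to make the re-validation burden concrete is to compare attributions from two model versions on the same fixed reference set. The sketch below is illustrative only: `global_importance`, `explanation_drift`, the `tolerance` threshold and the sample numbers are all assumptions, not figures or methods taken from any bank or vendor.

```python
import numpy as np


def global_importance(attributions):
    # Mean absolute attribution per feature over a fixed reference set (rows = cases).
    return np.abs(attributions).mean(axis=0)


def explanation_drift(attr_old, attr_new, tolerance=0.25):
    # tolerance is a hypothetical limit on how far a feature's share
    # of total importance may move between model versions.
    old = global_importance(attr_old)
    new = global_importance(attr_new)
    shift = np.abs(new / new.sum() - old / old.sum())
    rank_changed = not np.array_equal(np.argsort(-old), np.argsort(-new))
    return {
        "max_share_shift": float(shift.max()),
        "rank_changed": rank_changed,
        "needs_revalidation": bool(rank_changed or shift.max() > tolerance),
    }


# Made-up attributions for the same two cases under two model versions.
attr_v1 = np.array([[0.4, -0.1, 0.2], [0.6, -0.3, 0.4]])
attr_v2 = np.array([[0.1, -0.5, 0.3], [0.3, -0.7, 0.3]])

report = explanation_drift(attr_v1, attr_v2)
print(report)
```

Here the second feature moves from least to most important between versions, so the report flags the explanation layer for re-validation. A check of this kind could run automatically at each retraining, so that reviewers only look at versions where explanations actually moved.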

Where internal teams are landing

The banks making the most progress on this have separated model validation from explanation validation into distinct workstreams with separate sign-off requirements. A model can pass performance thresholds while its explanation layer remains under review. This keeps explanation issues from blocking model releases and makes the accountability structure clearer for both technical and compliance stakeholders.

How Model Risk Teams Are Evaluating Explainability in 2024

Three areas covered in depth across the programme

Regulatory Expectations

SHAP and LIME in Credit Scoring

Model Governance Workflows