Explaining Multiclass Compound Activity Predictions Using Counterfactuals and Shapley Values
Most machine learning (ML) models produce black box predictions that are difficult, if not impossible, to understand. In pharmaceutical research, black box predictions work against the acceptance of ML models for guiding experimental work. Hence, there is increasing interest in approaches for explainable ML, which is a part of explainable artificial intelligence (XAI), to better understand prediction outcomes. Herein, we have devised a test system for the rationalization of multiclass compound activity prediction models that combines two approaches from XAI for feature relevance or importance analysis, including counterfactuals (CFs) and Shapley additive explanations (SHAP). For compounds with different single- and dual-target activities, we identified small compound modifications that induce feature changes inverting class label predictions. In combination with feature mapping, CFs and SHAP value calculations provide chemically intuitive explanations for model decisions.