Granger-causal Attentive Mixtures of Experts: Learning Important Features with Neural Networks
- Working Paper
Rights / license: In Copyright - Non-Commercial Use Permitted
Knowledge of the importance of input features to the decisions made by machine-learning models is essential to increase our understanding of both the models and the underlying data. Here, we present a new approach to estimating feature importance with neural networks based on the idea of distributing the features of interest among experts in an attentive mixture of experts (AME). AMEs couple attentive gating networks with a Granger-causal objective to jointly produce accurate predictions as well as estimates of feature importance. Our experiments on an established benchmark and two real-world datasets show (i) that the feature importance estimates provided by AMEs compare favourably to those provided by state-of-the-art methods, (ii) that AMEs are significantly faster than existing methods, and (iii) that the associations discovered by AMEs are consistent with those reported by domain experts. In addition, we analyse the trade-off between predictive performance and estimation accuracy, the degree to which importance estimates of existing methods conform to predictive value, and whether a lower Granger-causal error on held-out data indicates better feature importance estimation accuracy.
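The Granger-causal notion of importance in the abstract can be sketched as follows: a feature is important to the extent that withholding it degrades the model's predictions. This is a simplified, post-hoc reading for illustration only; the paper trains the AME's attention weights jointly against such a signal, and the function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def granger_causal_importance(predict, x, y):
    """Score each feature by how much the squared prediction error grows
    when that feature is zero-masked -- a simplified stand-in for the
    Granger-causal importance signal described in the abstract."""
    base_err = (predict(x) - y) ** 2
    n_features = x.shape[1]
    deltas = np.empty(n_features)
    for i in range(n_features):
        x_masked = x.copy()
        x_masked[:, i] = 0.0  # withhold feature i's information
        masked_err = (predict(x_masked) - y) ** 2
        # error increase attributable to feature i, clipped at zero
        deltas[i] = np.maximum(masked_err - base_err, 0.0).mean()
    total = deltas.sum()
    # normalise to a distribution over features
    return deltas / total if total > 0 else np.full(n_features, 1.0 / n_features)

# Toy check: y depends strongly on feature 0 and only weakly on feature 1.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 2))
y = 3.0 * x[:, 0] + 0.1 * x[:, 1]
predict = lambda z: 3.0 * z[:, 0] + 0.1 * z[:, 1]  # oracle model for the sketch
importance = granger_causal_importance(predict, x, y)
```

With an oracle predictor, nearly all of the normalised importance mass lands on feature 0, matching the intuition that removing it causes the larger error increase.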
Journal / series: arXiv
Organisational unit: 09533 - Karlen, Walter (SNF-Förderprofessur)
167302 - Personalized management of low back pain with mHealth: Big Data opportunities, challenges and solutions (SNF)
Notes: Updated version 28 May 2018.