Enhancing the accuracy of machine learning models using the super learner technique in digital soil mapping
- Journal Article
Digital soil mapping approaches predict soil properties from the relationships between soil observations and related environmental covariates, using techniques such as machine learning (ML) models. In this research, a wide range of ML models (12 base learners) were tested in predicting and mapping soil properties. Furthermore, a super learner approach was used to improve model accuracy by combining the predictions of the base learners. A major challenge of using the super learner and other complex models is that the exact contribution of individual covariates to the overall prediction is not always known. To address this issue, permutation feature importance (PFI) analysis was applied as a model-agnostic interpretation tool. The weights that the super learner assigned to each ML base learner, together with the feature importance values obtained from each base learner, were used to quantify the contribution of individual covariates to the final prediction. The super learner and PFI techniques were tested by predicting a variety of soil physical and chemical properties of the Urmia Lake playa in Iran. As expected, the results indicated that the super learner predicted soil properties substantially more accurately than the individual base learners. For instance, compared with linear regression, the super learner decreased the root mean square error by an average of 46%. The PFI analysis revealed the important contribution of geomorphic and groundwater data in predicting soil properties. Overall, the proposed approach may be used for improving the accuracy of ML models in digital soil mapping.
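The combination of super learner weights with per-learner PFI values described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`permutation_importance_rmse`, `combine_importances`), the use of RMSE increase as the importance score, the clipping of negative importances to zero, and the per-learner normalization are all assumptions made for illustration.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def permutation_importance_rmse(predict, X, y, n_repeats=5, seed=0):
    """Model-agnostic PFI: mean RMSE increase after shuffling each covariate.

    `predict` is any callable mapping an (n_samples, n_covariates) array
    to predictions, so the same function works for every base learner.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    baseline = rmse(y, predict(X))
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle one column to break its link with the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse(y, predict(X_perm)) - baseline)
        scores[j] = np.mean(increases)
    return scores


def combine_importances(weights, importances):
    """Weight each base learner's PFI by its super learner weight.

    weights:      shape (n_learners,), non-negative super learner weights.
    importances:  shape (n_learners, n_covariates), PFI per base learner.
    Assumption: negative importances are clipped to zero and each learner's
    importances are rescaled to sum to one before weighting.
    """
    w = np.asarray(weights, dtype=float)
    imp = np.clip(np.asarray(importances, dtype=float), 0.0, None)
    if w.ndim != 1 or imp.ndim != 2 or imp.shape[0] != w.shape[0]:
        raise ValueError("weights and importances have incompatible shapes")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()
    row_sums = imp.sum(axis=1, keepdims=True)
    # Leave all-zero rows as zeros instead of dividing by zero.
    imp = np.divide(imp, row_sums, out=np.zeros_like(imp), where=row_sums > 0)
    return w @ imp
```

In this sketch, a covariate that matters only to a base learner with a small super learner weight contributes little to the combined score, which mirrors the idea of attributing the final prediction through the ensemble weights.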
Journal / series: Geoderma
Subject: Super learner; Model-agnostic; Digital soil mapping; Saline soils; Urmia Playa Lake; Iran