Same data, different results? Machine learning approaches in bioacoustics


Date

2025-08

Publication Type

Review Article

ETH Bibliography

yes

Abstract

Automated acoustic analysis is increasingly used in behavioural ecology, and determining caller identity is a key element for many investigations. However, variability in feature extraction and classification methods limits the comparability of results across species and studies, constraining the conclusions we can draw about the ecology and evolution of the groups under study. We investigated the impact of using different feature extraction methods (spectro-temporal measurements, linear and Mel-frequency cepstral coefficients (MFCCs), as well as highly comparative time-series analysis) and classification methods (discriminant function analysis, neural networks, random forests (RFs), and support vector machines) on the consistency of caller identity classification accuracy across 16 mammalian datasets. We found that MFCCs and RFs yield consistently reliable results across datasets, facilitating a standardised approach across species that generates directly comparable data. These findings remained consistent across vocalisation sample sizes and numbers of individuals considered. We offer guidelines for processing and analysing mammalian vocalisations, fostering greater comparability and advancing our understanding of the evolutionary significance of acoustic communication in diverse mammalian species.

Publication status

published

Volume

16 (8)

Pages / Article No.

1574 - 1586

Publisher

Wiley

Subject

bioacoustics; call distinctiveness; individual identification; machine learning; method comparison; review; vocal communication
