Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences


Date

2021-06

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this work we compare NBC with random forest classifiers, neural network classifiers, and a perfect classifier that can only fail when different species have identical sequences, and find that in some practical scenarios there is little scope for improving on NBC for taxonomic classification of 16S rRNA gene sequences. Further improvements in taxonomy classification are unlikely to come from novel algorithms alone, and will need to leverage other technological innovations, such as ecological frequency information.
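
The "perfect classifier" idea in the abstract can be illustrated with a small sketch. This is one possible reading rather than the authors' implementation: it assumes the classifier sees exact sequences, must assign the same label to identical sequences, and therefore does best by choosing the most common species for each distinct sequence. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def perfect_classifier_accuracy(sequences, species):
    """Accuracy ceiling when identical sequences must receive the same label.

    Assumption (not taken from the article): the best achievable guess for
    each distinct sequence is its most frequent species label.
    """
    # Count species labels for each distinct sequence.
    labels_by_sequence = defaultdict(Counter)
    for seq, label in zip(sequences, species):
        labels_by_sequence[seq][label] += 1

    # Only the majority label within each group of identical sequences
    # can be predicted correctly; the rest are unavoidable errors.
    correct = sum(
        counts.most_common(1)[0][1] for counts in labels_by_sequence.values()
    )
    return correct / len(sequences)


# Toy data invented for illustration: two species share the sequence "ACGT".
seqs = ["ACGT", "ACGT", "ACGT", "TTGA", "GGCC"]
labels = ["sp_A", "sp_A", "sp_B", "sp_C", "sp_D"]
ceiling = perfect_classifier_accuracy(seqs, labels)
```

In this toy case one of the five reads is unavoidably misclassified, so the ceiling is 0.8. Comparing a real classifier's accuracy against such a ceiling shows how much room for improvement remains.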

Publication status

published

Volume

12

Pages / Article No.

644487

Publisher

Frontiers Media

Subject

Microbiome; Metagenomics; Marker-gene sequencing; Taxonomic classification; Machine learning; Neural networks

Organisational unit

09714 - Bokulich, Nicholas / Bokulich, Nicholas
