Towards a Systematic Investigation of Deep Learning Approaches for Bacterial Taxonomic Classification Using the 16S rRNA Gene

EasyChair Preprint 9925

3 pages. Date: April 4, 2023

Abstract

Modern bacterial taxonomy revolves around bioinformatics-based analysis, leading to deeper insights into microbial communities and their composition. The 16S ribosomal RNA (16S rRNA) gene is a frequently used and well-established phylogenetic marker for in silico bacterial classification. With the growing volume of sequence data, novel machine learning methods are required to deal with the increasing complexity of the analyses. In this project, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and attention-based deep learning models were proposed as efficient alternative approaches to bacterial classification. The models were trained and evaluated on a manually curated 16S dataset. Two sequence encoding strategies, k-mer and one-hot encoding, were studied and evaluated with the CNN- and RNN-based models, respectively. Although one-hot encoding allows for a greater variety of experimental comparisons, k-mer encoding showed superior results. The performance of the deep learning models was compared against the conventional machine learning-based Ribosomal Database Project (RDP) Classifier in terms of accuracy and training time. The CNN model with 8-mer encoding reached 96.33% test accuracy at the genus level, 0.17 percentage points higher than the RDP Classifier, demonstrating the potential of deep learning approaches for bacterial classification.
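
To make the two encoding strategies named in the abstract concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `one_hot_encode` and `kmer_counts` are hypothetical, the alphabet is restricted to A, C, G, and T, and k-mer encoding is assumed here to mean a count vector over all overlapping k-mers. The preprint may tokenize or normalize differently.

```python
from itertools import product

import numpy as np

# Assumption: only the four canonical bases are encoded; ambiguous
# symbols such as N are skipped rather than given their own channel.
ALPHABET = "ACGT"


def one_hot_encode(seq: str) -> np.ndarray:
    """Return a (len(seq), 4) matrix with one row per base.

    Bases outside ALPHABET are left as all-zero rows.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(seq), len(ALPHABET)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded


def kmer_counts(seq: str, k: int) -> np.ndarray:
    """Return a vector of length 4**k counting overlapping k-mers.

    k-mers containing a base outside ALPHABET are not counted.
    """
    vocab = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}
    counts = np.zeros(len(vocab), dtype=np.float32)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if kmer in vocab:
            counts[vocab[kmer]] += 1.0
    return counts
```

Under these assumptions, one-hot encoding preserves base order and yields an input whose length grows with the sequence, whereas k-mer counting yields a fixed-length vector regardless of sequence length (65,536 entries for k = 8).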

Keyphrases: 16S rRNA, Bacterial Classification, Convolutional Neural Network, Recurrent Neural Network, machine learning

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:9925,
  author    = {Yuju Ahn and Robbe Claeys and Moobeom Hong and Jihwan Lim and Inkyun Park},
  title     = {Towards a Systematic Investigation of Deep Learning Approaches for Bacterial Taxonomic Classification Using the 16S rRNA Gene},
  howpublished = {EasyChair Preprint 9925},
  year      = {EasyChair, 2023}}