
Comparison of Different Neural Network Architectures for Spoken Language Identification

EasyChair Preprint 10680, version 2

5 pages
Date: August 15, 2023

Abstract

This paper compares different neural network based architectures on the spoken language identification task. To the best of our knowledge, no such comparison of different models on the same dataset and the same set of languages exists yet. We include 7 different models, covering the latest architectures: a ResNet model based on spectral images, a Convolutional Neural Network, a Bi-directional Long Short-Term Memory network, a Convolutional Recurrent Neural Network, Wav2Vec 2.0, a transformer and a conformer. We also tackle audio with background noise and music by training on data with similar acoustics. Finally, we show that our models generalize well to third-party data.

Keyphrases: Conformer, Language Identification, Wav2vec 2.0, neural networks, transformer

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:10680,
  author    = {Tala Bazazo and Mohammad Zeineldeen and Christian Plahl and Ralf Schlüter and Hermann Ney},
  title     = {Comparison of Different Neural Network Architectures for Spoken Language Identification},
  howpublished = {EasyChair Preprint 10680},
  year      = {EasyChair, 2023}}