Vietnamese Automatic Speech Recognition with Transformer

EasyChair Preprint 7147

4 pages • Date: December 4, 2021

Duong Trinh Anh, Sam Dang Van, Tuan Do Van and Vi Ngo Van

Abstract

Recently, end-to-end models have increasingly become a trend in speech recognition and outperform traditional methods. The most frequently used approaches to supervised automatic speech recognition (ASR) combine an attention mechanism with connectionist temporal classification (CTC). In this paper, we propose a speech recognition model based on the transformer architecture, which placed in the top 3 of the 2021 Vietnamese Language and Speech Processing contest with an 8.83% word error rate (WER) on the private test set.
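
For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the contest's scoring script; the function name `word_error_rate` and whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); assumes whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 8.83% corresponds to roughly 9 word errors per 100 reference words.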

Keyphrases: Attention Mechanism, end-to-end speech recognition, transformer

Links:

https://easychair.org/publications/preprint/7Lnf

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:7147,
  author    = {Duong Trinh Anh and Sam Dang Van and Tuan Do Van and Vi Ngo Van},
  title     = {Vietnamese Automatic Speech Recognition with Transformer},
  howpublished = {EasyChair Preprint 7147},
  year      = {EasyChair, 2021}}
