Vietnamese Automatic Speech Recognition with Transformer

EasyChair Preprint 7147
4 pages • Date: December 4, 2021

Abstract

End-to-end models have recently become the prevailing trend in speech recognition and outperform traditional methods. The most widely used approaches combine an attention mechanism with connectionist temporal classification (CTC) for supervised learning in automatic speech recognition (ASR). In this paper, we propose a speech recognition model based on the transformer architecture. The model placed in the top 3 of the 2021 Vietnamese Language and Speech Processing contest, with a word error rate (WER) of 8.83% on the private test set.

Keyphrases: Attention Mechanism, end-to-end speech recognition, transformer