Cross-Lingual Speech Emotion Recognition Using English and Mandarin on Thai Data

EasyChair Preprint 15443

16 pages. Date: November 20, 2024

Abstract

This study explores the efficacy of cross-lingual Speech Emotion Recognition (SER) with Thai as the target language and English and Mandarin as the training languages. It evaluates how well SER models adapt across linguistic boundaries, emphasizing the challenges and potential of leveraging well-resourced languages to improve emotion recognition in a language with fewer resources. Through a series of experiments, the research investigates three aspects: the performance of same-corpus training within Thai, the application of models trained on English and Mandarin to Thai, and the effectiveness of transfer learning techniques in improving model accuracy. The findings indicate that Mandarin facilitates more effective cross-lingual SER with Thai than English does. However, models trained on Mandarin or English and applied to Thai did not outperform those trained directly on Thai in the same-corpus setting, suggesting that cross-lingual training offers limited benefit without more sophisticated adaptation methods. Transfer learning emerged as a pivotal strategy: in particular, models pre-trained on large Mandarin datasets and then fine-tuned with Thai data showed improved performance, suggesting a scalable approach for deploying SER systems in multilingual contexts.

Keyphrases: Cross-lingual, Thai language, deep learning, speech emotion recognition

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:15443,
  author    = {Kantapong Wonghirunruch},
  title     = {Cross-Lingual Speech Emotion Recognition Using English and Mandarin on Thai Data},
  howpublished = {EasyChair Preprint 15443},
  year      = {EasyChair, 2024}}