Low Resource Malay Dialect Automatic Speech Recognition Modeling Using Transfer Learning from a Standard Malay Model

Tan, Tien Ping and Lei, Qin and Sarah Flora, Samson Juan and Jasmina Khaw, Yen Min (2024) Low Resource Malay Dialect Automatic Speech Recognition Modeling Using Transfer Learning from a Standard Malay Model. Pertanika Journal of Science and Technology, 32 (4). pp. 1545-1563. ISSN 2231-8526

Official URL: http://www.pertanika.upm.edu.my/pjst/browse/regula...

Abstract

Approaches to automatic speech recognition (ASR) have shifted from Hidden Markov Model (HMM)-based systems to deep neural networks. Deep neural network approaches can be developed quickly and perform better when large language resources are available. Nevertheless, dialect speech recognition remains challenging because resources are limited. Transfer learning approaches have been proposed to improve speech recognition for low-resource languages. In the first approach, the model is pre-trained on a large and diverse labeled dataset to learn acoustic and language patterns from the speech signal, and its parameters are then updated by fine-tuning on a low-resource language dataset. Fine-tuning is usually done by freezing the pre-trained layers and training the remaining layers of the model on the low-resource language corpus. In the second approach, a pre-trained model is used to extract compact and meaningful features that serve as input to the encoder. Pre-training in this approach usually relies on unsupervised learning over a large corpus of unlabeled data, which enables the model to learn general patterns and relationships in the input speech signals. This paper proposes a training recipe that uses transfer learning and Standard Malay models to improve automatic speech recognition for the Kelantan and Sarawak Malay dialects.
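
The freeze-and-fine-tune step described in the abstract can be illustrated with a minimal sketch. It is not the paper's training recipe: the two-layer toy model, the weight values, the data, and the names `forward`, `fine_tune`, `pretrained_encoder`, and `target_data` are all hypothetical and chosen only to show the mechanics. The "pre-trained" encoder weights are held fixed while only the output layer is updated on a small target dataset.

```python
def forward(x, encoder_w, head_w):
    """Encoder: one linear layer with ReLU. Head: one linear output unit."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in encoder_w]
    prediction = sum(w * h for w, h in zip(head_w, hidden))
    return hidden, prediction


def mean_squared_error(data, encoder_w, head_w):
    errors = [(forward(x, encoder_w, head_w)[1] - y) ** 2 for x, y in data]
    return sum(errors) / len(errors)


def fine_tune(data, encoder_w, head_w, lr=0.05, epochs=200):
    """Train only the head with SGD; the encoder is frozen (never modified)."""
    head_w = list(head_w)  # work on a copy of the trainable weights
    for _ in range(epochs):
        for x, y in data:
            hidden, prediction = forward(x, encoder_w, head_w)
            error = prediction - y
            # Gradient of the squared error w.r.t. each head weight: 2 * error * hidden
            head_w = [w - lr * 2.0 * error * h for w, h in zip(head_w, hidden)]
    return head_w


# Hypothetical weights standing in for a model pre-trained on a large corpus.
pretrained_encoder = [[0.5, -0.2], [0.1, 0.8]]
initial_head = [0.0, 0.0]

# Toy stand-in for a small low-resource dataset: (features, target) pairs.
target_data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0), ([1.0, 1.0], 3.0)]

loss_before = mean_squared_error(target_data, pretrained_encoder, initial_head)
tuned_head = fine_tune(target_data, pretrained_encoder, initial_head)
loss_after = mean_squared_error(target_data, pretrained_encoder, tuned_head)
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

In a real ASR system the frozen part would be a deep acoustic encoder and the trainable part would be the remaining layers, but the division of labor is the same: the gradient step touches only the parameters that are not frozen.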

Item Type: Article
Uncontrolled Keywords: Automatic speech recognition, Malay dialects, Malay language, transfer learning.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology
Depositing User: Samson Juan
Date Deposited: 06 Aug 2024 04:09
Last Modified: 06 Aug 2024 04:09
URI: http://ir.unimas.my/id/eprint/45519
