Tan, Tien Ping and Lei, Qin and Sarah Flora, Samson Juan and Jasmina Khaw, Yen Min (2024) Low Resource Malay Dialect Automatic Speech Recognition Modeling Using Transfer Learning from a Standard Malay Model. Pertanika Journal of Science and Technology, 32 (4). pp. 1545-1563. ISSN 2231-8526
Abstract
Approaches to automatic speech recognition (ASR) have transitioned from Hidden Markov Model (HMM)-based systems to deep neural networks. Deep neural network approaches can be developed quickly and perform better when large language resources are available. Nevertheless, dialect speech recognition remains challenging because resources are limited. Transfer learning approaches have been proposed to improve speech recognition for low-resource languages. In the first approach, the model is pre-trained on a large and diverse labeled dataset to learn the acoustic and language patterns from the speech signal. The pre-trained model is then fine-tuned, that is, its parameters are updated, on a low-resource language dataset. Fine-tuning is usually done by freezing the pre-trained layers and training the remaining layers of the model on the low-resource language corpus. Another approach is to use a pre-trained model to extract compact and meaningful features as input to the encoder. Pre-training in this approach usually relies on unsupervised learning methods to train models on a large corpus of unlabeled data, which enables the model to learn the general patterns and relationships in the input speech signals. This paper proposes a training recipe using transfer learning and Standard Malay models to improve automatic speech recognition for Kelantan and Sarawak Malay dialects.
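The first approach in the abstract (freeze the pre-trained layers, then train the remaining layers on the low-resource corpus) can be illustrated with a toy sketch. This is not the paper's training recipe: single scalars stand in for network layers, and every name (`Param`, `forward`, `fine_tune`, `mse`) and every number below is a hypothetical choice made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """A single scalar weight standing in for a whole layer (toy assumption)."""
    value: float
    frozen: bool = False


def forward(x, encoder, head):
    # "Encoder" output feeds the "head", as in a two-stage model.
    return head.value * (encoder.value * x)


def mse(data, encoder, head):
    # Mean squared error over the toy dataset.
    return sum((forward(x, encoder, head) - y) ** 2 for x, y in data) / len(data)


def fine_tune(data, encoder, head, lr=0.05, epochs=200):
    # Plain per-sample gradient descent on 0.5 * error**2.
    for _ in range(epochs):
        for x, y in data:
            hidden = encoder.value * x
            error = head.value * hidden - y
            grads = [
                (encoder, error * head.value * x),  # d(loss)/d(encoder)
                (head, error * hidden),             # d(loss)/d(head)
            ]
            for param, grad in grads:
                if not param.frozen:  # frozen layers are never updated
                    param.value -= lr * grad


# Pretend the encoder was pre-trained on a large corpus, then frozen.
encoder = Param(value=2.0, frozen=True)
# The head is the only part trained on the small "dialect" dataset.
head = Param(value=1.0)

# Tiny stand-in for a low-resource dataset: targets follow y = 3 * x.
dialect_data = [(0.5, 1.5), (1.0, 3.0), (1.5, 4.5)]

loss_before = mse(dialect_data, encoder, head)
fine_tune(dialect_data, encoder, head)
loss_after = mse(dialect_data, encoder, head)

print(f"encoder={encoder.value}, head={head.value:.4f}")
print(f"loss before={loss_before:.4f}, after={loss_after:.6f}")
```

Only the unfrozen `head` moves during training, so whatever the frozen `encoder` learned beforehand is preserved while the model adapts to the new data. In a real ASR system the same idea would apply to whole layers of a neural network rather than to scalars.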
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Automatic speech recognition, Malay dialects, Malay language, transfer learning. |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology; Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology |
Depositing User: | Samson Juan |
Date Deposited: | 06 Aug 2024 04:09 |
Last Modified: | 06 Aug 2024 04:09 |
URI: | http://ir.unimas.my/id/eprint/45519 |