Using Closely-related Language to Build an ASR for a Very Under-resourced Language: Iban

Juan, Sarah Samson and Besacier, Laurent and Lecouteux, Benjamin and Tan, Tien-Ping (2014) Using Closely-related Language to Build an ASR for a Very Under-resourced Language: Iban. In: COCOSDA 2014, Phuket, Thailand.

COCOSDA-sarahsamsonjuan.pdf - Submitted Version

This paper describes our work on an automatic speech recognition (ASR) system for an under-resourced language, Iban, which is spoken in Sarawak, a Malaysian state on Borneo. Because no ASR resources existed for this language, we began by collecting 8 hours of speech data. Given this lack of resources, we applied bootstrapping techniques using a closely related language to build the Iban system. Specifically, we used Malay data to bootstrap the grapheme-to-phoneme (G2P) system for the target language. We also developed several G2P systems to produce Iban pronunciation dictionaries, which were then evaluated on the Iban ASR to select the best version. Subsequently, we conducted cross-lingual ASR experiments using subspace Gaussian mixture models (SGMM), in which the shared parameters were obtained in either a monolingual or a multilingual fashion. We observed that using out-of-language data as the source language yielded a lower word error rate (WER) when Iban data is very limited.
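The abstract compares systems by WER. As a reminder of how that metric is usually computed, here is a minimal sketch in Python. It is not the authors' scoring tool: the function name word_error_rate and the placeholder transcripts are assumptions made for illustration. It takes the word-level edit distance (substitutions, deletions, insertions) between a reference and a hypothesis transcript and divides it by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; real evaluations typically also
    normalise text (case, punctuation) before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Placeholder tokens, not real Iban data: one substitution and one
# deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.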

Item Type: Proceeding (Paper)
Uncontrolled Keywords: automatic speech recognition, acoustic modelling, subspace Gaussian mixture model, bootstrapping grapheme-to-phoneme, unimas, university, universiti, Borneo, Malaysia, Sarawak, Kuching, Samarahan, ipta, education, Universiti Malaysia Sarawak
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: Samson Juan
Date Deposited: 16 Oct 2015 01:22
Last Modified: 16 Oct 2015 01:22
