Samson Juan, Sarah and Besacier, Laurent and Lecouteux, Benjamin and Tien-Ping, Tan (2015) Merging of native and non-native speech for low-resource accented ASR. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9449. pp. 255-266. ISSN 0302-9743
Abstract
This paper presents our recent study on a low-resource automatic speech recognition (ASR) system for accented speech. We propose multi-accent Subspace Gaussian Mixture Models (SGMM) and accent-specific Deep Neural Networks (DNN) for improving non-native ASR performance. In the SGMM framework, we present an original language weighting strategy to merge the globally shared parameters of two models based on native and non-native speech, respectively. In the DNN framework, a native deep neural net is fine-tuned to non-native speech. Over the non-native baseline, we achieved a relative improvement of 15% for multi-accent SGMM and 34% for accent-specific DNN with speaker adaptation.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Automatic speech recognition, Cross-lingual acoustic modelling, Non-native speech, Low-resource system, Multi-accent SGMM, Accent-specific DNN |
Subjects: | T Technology > T Technology (General) |
Divisions: | Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology |
Depositing User: | Saman |
Date Deposited: | 20 May 2016 01:14 |
Last Modified: | 21 Oct 2016 07:34 |
URI: | http://ir.unimas.my/id/eprint/12098 |