Merging of native and non-native speech for low-resource accented ASR

Samson Juan, Sarah and Besacier, Laurent and Lecouteux, Benjamin and Tan, Tien-Ping (2015) Merging of native and non-native speech for low-resource accented ASR. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9449. pp. 255-266. ISSN 0302-9743


Abstract

This paper presents our recent study on low-resource automatic speech recognition (ASR) of accented speech. We propose multi-accent Subspace Gaussian Mixture Models (SGMM) and accent-specific Deep Neural Networks (DNN) for improving non-native ASR performance. In the SGMM framework, we present an original language weighting strategy to merge the globally shared parameters of two models trained on native and non-native speech, respectively. In the DNN framework, a native deep neural network is fine-tuned to non-native speech. Over the non-native baseline, we achieved relative improvements of 15% for multi-accent SGMM and 34% for accent-specific DNN with speaker adaptation.
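The abstract's language weighting strategy for merging globally shared SGMM parameters can be pictured, in a strongly simplified form, as a weighted linear interpolation between the corresponding parameters of the native and non-native models. The sketch below is a hypothetical illustration only: the parameter names, the single scalar weight, and the plain element-wise interpolation are assumptions, not the paper's actual formulation.

```python
import numpy as np

def merge_shared_params(native: dict, nonnative: dict, w: float) -> dict:
    """Interpolate globally shared parameters of two acoustic models.

    `w` weights the native model; (1 - w) weights the non-native model.
    This is a toy stand-in for the paper's language weighting strategy.
    """
    return {name: w * native[name] + (1.0 - w) * nonnative[name]
            for name in native}

# Toy "shared parameters" (e.g. subspace projection matrices, weights).
native_model = {"M": np.ones((2, 2)), "w": np.array([0.5, 0.5])}
nonnative_model = {"M": np.zeros((2, 2)), "w": np.array([0.2, 0.8])}

# Weight the non-native (target-accent) model more heavily.
merged = merge_shared_params(native_model, nonnative_model, w=0.25)
```

In practice the weight would be tuned on development data; a weight favouring the non-native model reflects that the target test condition is accented speech.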

Item Type: Article
Uncontrolled Keywords: Automatic speech recognition, Cross-lingual acoustic modelling, Non-native speech, Low-resource system, Multi-accent SGMM, Accent-specific DNN, unimas, university, universiti, Borneo, Malaysia, Sarawak, Kuching, Samarahan, ipta, education, research, Universiti Malaysia Sarawak
Subjects: T Technology > T Technology (General)
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: Saman
Date Deposited: 20 May 2016 01:14
Last Modified: 21 Oct 2016 07:34
URI: http://ir.unimas.my/id/eprint/12098
