BENCHMARKING WHISPER OPENAI ON SARAWAK LANGUAGES

GERALD EINSTEIN CORNELIUS (2023) BENCHMARKING WHISPER OPENAI ON SARAWAK LANGUAGES. [Final Year Project Report] (Unpublished)


Abstract

End-to-end (E2E) models are reshaping automatic speech recognition (ASR), supplanting traditional approaches such as Hidden Markov Model (HMM) systems and Deep Neural Network (DNN)-based hybrid models. An E2E model replaces the separate components of a traditional ASR pipeline with a single-network architecture inside a deep learning framework. This simplification does not degrade recognition performance; it even yields results superior to those of traditional ASR models. Building on this approach, OpenAI developed Whisper, a robust model based on an E2E encoder-decoder transformer. While Whisper performs exceptionally well for English ASR, its performance on low-resource languages has not been established and remains a topic of research interest. This work evaluates the Whisper model on speech data from under-resourced Sarawak languages, namely Sarawak Malay, Iban, Melanau, and the Bidayuh dialects of Jagoi and Bukar Sadong. Its key contributions are a systematic literature review (SLR) and the development of an ASR system built on the Whisper model to measure Whisper's recognition accuracy on Sarawak languages. The experimental results obtained from this system, measured with the Word Error Rate (WER) metric, may serve as a baseline for future work on Whisper-based ASR for under-resourced Sarawak languages.
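
For readers unfamiliar with the WER metric named in the abstract, it is conventionally defined as WER = (S + D + I) / N, where S, D and I are the numbers of substituted, deleted and inserted words in the recognised transcript and N is the number of words in the reference transcript. The following is a minimal Python sketch of that definition using a word-level edit distance. It is an illustration only and not the evaluation code used in the report; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (S + D + I) / N via word-level edit distance.

    Assumption for this sketch: words are separated by whitespace and
    compared exactly (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    # Total edits divided by the number of reference words.
    return prev[-1] / len(ref)
```

With the made-up placeholder strings "a b c d" as the reference and "a x c" as the hypothesis, the function counts one substitution and one deletion over four reference words and returns 0.5. Because insertions are counted against the reference length, WER can exceed 1.0 when the hypothesis is much longer than the reference.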

Item Type: Final Year Project Report
Additional Information: Project Report (BSc.) -- Universiti Malaysia Sarawak, 2023.
Uncontrolled Keywords: Automatic speech recognition, end-to-end, Whisper, under-resourced, Sarawak languages
Subjects: P Language and Literature > PE English
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology
Depositing User: Dan
Date Deposited: 18 Jan 2024 03:34
Last Modified: 18 Jan 2024 03:34
URI: http://ir.unimas.my/id/eprint/44201
