WEB APPLICATION SPEECH RECORDER AND TRANSLATOR USING GOOGLE CLOUD SPEECH-TO-TEXT AND CLOUD TRANSLATION API

Chan, Jiet Ying (2020) WEB APPLICATION SPEECH RECORDER AND TRANSLATOR USING GOOGLE CLOUD SPEECH-TO-TEXT AND CLOUD TRANSLATION API. [Final Year Project Report] (Unpublished)

PDF (Please get the password from TECHNICAL & DIGITIZATION MANAGEMENT UNIT, ext: 082-583913 / 082-583914)
Chan Jiet Ying.pdf
Restricted to Registered users only

Abstract

One of the most popular video websites, YouTube, had about 2 billion users worldwide as of May 2019, and these users speak and understand different languages. Subtitles are essential for users to understand the content of a video. However, not all video owners provide subtitles for their videos, which makes it hard for potential audiences to follow the content. By combining Automatic Speech Recognition (ASR) and translation technologies, it is possible to generate subtitles automatically for viewers. Early ASR technology in the 1950s used template-based recognition, which was unable to match input speech signals against pre-stored acoustic models of different lengths. With the rise of hidden Markov models and artificial neural network models, many applications have been introduced that apply ASR using deep learning. One such application is Google Cloud Speech-to-Text, which combines ASR technology and cloud computing. It is one of the services in Google Cloud Platform, which aims to provide flexible delivery of computing services over the Internet, or the “Cloud”. This project proposes a speech recorder and translator using the Google Cloud Speech-to-Text and Cloud Translation Application Programming Interfaces (APIs), which records the currently playing audio through the device’s microphone and generates translated text of the audio. The requirements of the proposed system were gathered from a questionnaire and from document analysis of current related works. Most of the questionnaire respondents stated that high translation accuracy is important for the proposed system. Testing of the proposed system focuses on translation between English and Chinese for educational videos. The performance of Google Cloud Speech-to-Text and Translation is also evaluated.
This study is expected to contribute to the area of speech recognition in the field of computer science and technology. It should help researchers explore the combination of speech recognition with the Google Cloud Speech-to-Text and Translation APIs.
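
As a rough illustration of the pipeline the abstract describes (recorded audio is transcribed, then the transcript is translated), the following Python sketch may help. It is not taken from the report: the names `transcribe_and_translate`, `SubtitleResult`, `fake_recognize` and `fake_translate` are hypothetical, and the two stand-in functions only mimic what calls to a speech-to-text service and a translation service would return. In a real system they would presumably be replaced by wrappers around the Google Cloud client libraries, which need credentials and network access.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions, not from the report):
#   recognizer(audio_bytes, source_language) -> transcript
#   translator(text, source_language, target_language) -> translated text
Recognizer = Callable[[bytes, str], str]
Translator = Callable[[str, str, str], str]


@dataclass
class SubtitleResult:
    transcript: str   # text recognized from the recorded audio
    translation: str  # the transcript rendered in the target language


def transcribe_and_translate(
    audio: bytes,
    source_lang: str,
    target_lang: str,
    recognize: Recognizer,
    translate: Translator,
) -> SubtitleResult:
    """Run speech recognition first, then translate the transcript."""
    transcript = recognize(audio, source_lang).strip()
    if not transcript:
        # Nothing was recognized, so there is nothing to translate.
        return SubtitleResult("", "")
    return SubtitleResult(transcript, translate(transcript, source_lang, target_lang))


# Stand-ins for the cloud services, used only so the sketch runs offline.
def fake_recognize(audio: bytes, language_code: str) -> str:
    return "hello world" if audio else ""


def fake_translate(text: str, source: str, target: str) -> str:
    table = {("hello world", "en", "zh"): "你好，世界"}
    # Fall back to the untranslated text when the phrase is unknown.
    return table.get((text, source, target), text)


result = transcribe_and_translate(b"\x00\x01", "en", "zh", fake_recognize, fake_translate)
print(result.transcript)
print(result.translation)
```

Passing the recognizer and translator in as plain callables keeps the two stages independent, which would also make it straightforward to evaluate each service separately, as the abstract mentions.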

Item Type: Final Year Project Report
Additional Information: Project Report (BSc.) -- Universiti Malaysia Sarawak, 2020.
Uncontrolled Keywords: video websites, YouTube, Automatic Speech Recognition (ASR), Google Cloud Platform.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: Gani
Date Deposited: 25 Jan 2021 08:52
Last Modified: 25 Jan 2021 08:52
URI: http://ir.unimas.my/id/eprint/34022
