Improving Attentive Sequence-to-Sequence Generative-Based Chatbot Model Using Deep Neural Network Approach

Wan Solehah, Wan Ahmad (2022) Improving Attentive Sequence-to-Sequence Generative-Based Chatbot Model Using Deep Neural Network Approach. Masters thesis, Universiti Malaysia Sarawak.


Abstract

A Deep Neural Network (DNN) combines two subfields of Machine Learning: the Artificial Neural Network (ANN) and Deep Learning (DL). One example of a DNN model is the Attentive Sequence-to-Sequence (Seq2Seq) model, which was first created to tackle a problem setting in language processing. One of its applications is the chatbot, which is designed to respond accurately to users' inquiries. Over the years, chatbot applications have improved, moving from rigid, generic responses to more flexible ones. The adoption of DNN methods in chatbot applications has produced a new generation of chatbots called Generative-Based Chatbots. However, it is difficult to create and train a Generative-Based chatbot model that keeps its generated dialogue relevant over a long conversation. Hence, this research has three objectives: first, to propose an optimization strategy based on Structural Modification and Optimizing Training Network to improve the accuracy of responses in the chatbot application; second, to propose an algorithm enhancement that improves the current attention mechanism in the Attentive Sequence-to-Sequence model, together with a training optimization that addresses the network's inability to memorize the dialogue history; and lastly, to evaluate the response accuracy of the proposed solution through the training loss function and testing on real data. The structural modification is a slight change to the Additive Attention mechanism: a scaling factor is added for the dimension of the decoder hidden state. The training optimization is carried out through hyperparameter optimization, by selecting and fine-tuning high-impact parameters, namely the Optimizer, Learning Rate and Dropout, to reduce the error rate (loss function).
The results show that, after the modification to the algorithm, the model reached a final training accuracy of 81%, compared with 79% for the basic model. After both modification and optimization, the model recorded a final accuracy of 87% and a final loss of 0.51. These results indicate that the optimized model outperforms the baseline model.
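
The structural modification named in the abstract, a scaling factor for the dimension of the decoder hidden state in Additive Attention, can be illustrated with a minimal sketch. The abstract does not state the exact formula, so this example assumes the scaling takes the form of dividing each additive score by the square root of the decoder hidden-state dimension, by analogy with scaled dot-product attention. The function name `additive_attention`, the parameters `W_q`, `W_k` and `v`, and the `scale` flag are hypothetical and are not taken from the thesis.

```python
import math


def additive_attention(query, keys, W_q, W_k, v, scale=True):
    """Additive (Bahdanau-style) attention over encoder states.

    query: decoder hidden state, a list of d_dec floats
    keys:  encoder hidden states, a list of lists of d_enc floats
    W_q:   projection matrix of shape d_att x d_dec
    W_k:   projection matrix of shape d_att x d_enc
    v:     scoring vector of d_att floats
    scale: ASSUMPTION, not from the thesis: when True, each score is
           divided by sqrt(d_dec), the decoder hidden-state dimension.
    """
    d_dec = len(query)
    factor = math.sqrt(d_dec) if scale else 1.0

    def matvec(matrix, vector):
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    # Project the decoder state once; it is shared across encoder steps.
    q_proj = matvec(W_q, query)

    scores = []
    for key in keys:
        k_proj = matvec(W_k, key)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        score = sum(vi * hi for vi, hi in zip(v, hidden))
        scores.append(score / factor)

    # Numerically stable softmax over the encoder steps.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    d_enc = len(keys[0])
    context = [
        sum(w * key[j] for w, key in zip(weights, keys)) for j in range(d_enc)
    ]
    return weights, context
```

Under this assumption, the scaling acts like a softmax temperature: the attention weights become flatter as the decoder hidden state grows, which is one plausible reading of the modification. Calling the function with `scale=False` gives the unscaled additive attention for comparison.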

Item Type: Thesis (Masters)
Subjects: T Technology > T Technology (General)
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology
Depositing User: WAN SOLEHAH BINTI WAN AHMAD
Date Deposited: 27 Oct 2023 04:30
Last Modified: 11 Mar 2024 23:45
URI: http://ir.unimas.my/id/eprint/43212
