The Effect Of Optimizers On The Generalizability Additive Neural Attention For Customer Support Twitter Dataset In Chatbot Application

Sinarwati, Mohamad Suhaili and Naomie, Salim and Mohamad Nazim, Jambli (2024) The Effect Of Optimizers On The Generalizability Additive Neural Attention For Customer Support Twitter Dataset In Chatbot Application. Baghdad Science Journal, 21 (2(SI)). pp. 655-661. ISSN 2411-798

Official URL: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article...

Abstract

When optimizing the performance of neural network-based chatbots, the choice of optimizer is one of the most important decisions. The optimizer controls how model parameters such as weights and biases are adjusted to minimize a loss function during training. Adaptive optimizers such as ADAM have become a standard choice, widely used because the magnitude of their parameter updates is invariant to rescaling of the gradient, but they often generalize poorly. Alternatively, Stochastic Gradient Descent (SGD) with Momentum and ADAMW, an extension of ADAM, offer several advantages. This study compares the effects of these optimizers on the Customer Support Twitter (CST) chatbot dataset. Each optimizer is evaluated by its sparse categorical loss during training and its BLEU score at inference, using a neural generative model with attention based on an additive scoring function. Despite memory constraints that limited ADAMW to ten epochs, this optimizer showed promising results compared to configurations using early stopping. SGD achieved higher BLEU scores, indicating better generalization, but was very time-consuming to train. The results highlight the importance of balancing optimization performance against computational efficiency, positioning ADAMW as a promising alternative when training efficiency and generalization are primary concerns.
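
The three update rules compared in the abstract can be sketched for a single scalar parameter as follows. This is a minimal illustration using the textbook forms of SGD with Momentum, ADAM and ADAMW, not the authors' training code; the function names, the `state` dictionary and all hyperparameter defaults are assumptions chosen for the example.

```python
import math


def sgd_momentum_step(w, g, state, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update for scalar weight w and gradient g."""
    v = momentum * state.get("v", 0.0) + g  # running velocity
    state["v"] = v
    return w - lr * v


def adam_step(w, g, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8,
              weight_decay=0.0, decoupled=False):
    """One ADAM update; with decoupled=True the weight decay follows ADAMW."""
    t = state.get("t", 0) + 1
    if weight_decay and not decoupled:
        # Classic L2 penalty: folded into the gradient, so it is also
        # rescaled by the adaptive denominator below.
        g = g + weight_decay * w
    m = beta1 * state.get("m", 0.0) + (1 - beta1) * g      # first moment
    v = beta2 * state.get("v", 0.0) + (1 - beta2) * g * g  # second moment
    state.update(t=t, m=m, v=v)
    m_hat = m / (1 - beta1 ** t)  # bias correction
    v_hat = v / (1 - beta2 ** t)
    w_new = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    if weight_decay and decoupled:
        # ADAMW: decay is applied to the weight directly, outside the
        # adaptive rescaling.
        w_new -= lr * weight_decay * w
    return w_new


if __name__ == "__main__":
    # Same weight, gradients that differ by a factor of 1000.
    for g in (1.0, 1000.0):
        print("g =", g,
              "| SGD ->", sgd_momentum_step(1.0, g, {}),
              "| ADAM ->", adam_step(1.0, g, {}))
    # Zero gradient, so only the weight decay term moves the weight.
    print("ADAM + L2:", adam_step(1.0, 0.0, {}, weight_decay=0.01))
    print("ADAMW    :", adam_step(1.0, 0.0, {}, weight_decay=0.01,
                                  decoupled=True))
```

On the first step the ADAM update is close to `lr` in magnitude whatever the gradient scale, whereas the SGD step grows in proportion to the gradient, which is the invariance property mentioned above. The last two calls show the difference between ADAM with an L2 penalty and ADAMW: in the former the decay term passes through the adaptive normalization, in the latter it does not.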

Item Type: Article
Uncontrolled Keywords: ADAM, ADAMW, Neural Network-based Chatbot, Optimizer, SGD.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Centre for Pre-University Studies
Faculties, Institutes, Centres > Centre for Pre-University Studies
Depositing User: Mohamad Suhaili
Date Deposited: 03 Apr 2024 07:50
Last Modified: 03 Apr 2024 07:50
URI: http://ir.unimas.my/id/eprint/44542
