Evaluating Generative Neural Attention Weights-Based Chatbot on Customer Support Twitter Dataset

Sinarwati, Mohamad Suhaili and Naomie, Salim and Mohamad Nazim, Jambli (2024) Evaluating Generative Neural Attention Weights-Based Chatbot on Customer Support Twitter Dataset. International Journal of Computer and Systems Engineering, 18 (7). pp. 346-352. ISSN 1307-6892

Official URL: https://publications.waset.org/10013687/evaluating...

Abstract

Sequence-to-sequence (seq2seq) models augmented with attention mechanisms are increasingly important in automated customer service. These models, adept at recognizing complex relationships between input and output sequences, are essential for optimizing chatbot responses. Central to these mechanisms are neural attention weights that determine the model’s focus during sequence generation. Despite their widespread use, there remains a gap in the comparative analysis of different attention weighting functions within seq2seq models, particularly in the context of chatbots utilizing the Customer Support Twitter (CST) dataset. This study addresses this gap by evaluating four distinct attention-scoring functions—dot, multiplicative/general, additive, and an extended multiplicative function with a tanh activation parameter — in neural generative seq2seq models. Using the CST dataset, these models were trained and evaluated over 10 epochs with the AdamW optimizer. Evaluation criteria included validation loss and BLEU scores implemented under both greedy and beam search strategies with a beam size of k = 3. Results indicate that the model with the tanh-augmented multiplicative function significantly outperforms its counterparts, achieving the lowest validation loss (1.136484) and the highest BLEU scores (0.438926 under greedy search, 0.443000 under beam search, k = 3). These findings emphasize the crucial influence of selecting an appropriate attention-scoring function to enhance the performance of seq2seq models for chatbots, particularly highlighting the model integrating tanh activation as a promising approach to improving chatbot quality in customer support contexts.
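The four attention-scoring functions compared in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: all dimensions, weight matrices, and function names are illustrative, and the tanh-augmented multiplicative variant shown is one plausible reading of the extended function the study describes.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8                              # hidden size (illustrative)
T = 5                              # number of encoder time steps
s = rng.standard_normal(d)         # current decoder hidden state
H = rng.standard_normal((T, d))    # encoder hidden states, one row per step

W = rng.standard_normal((d, d))        # learned matrix (random here, for illustration)
Wa = rng.standard_normal((d, 2 * d))   # learned matrix for the additive score
v = rng.standard_normal(d)             # learned vector for the additive score

def dot_score(s, H):
    # score(s, h_t) = s . h_t
    return H @ s

def general_score(s, H, W):
    # multiplicative/general: score(s, h_t) = s^T W h_t
    return H @ (W @ s)

def additive_score(s, H, Wa, v):
    # additive (Bahdanau-style): score(s, h_t) = v^T tanh(Wa [s; h_t])
    concat = np.concatenate([np.broadcast_to(s, H.shape), H], axis=1)
    return np.tanh(concat @ Wa.T) @ v

def tanh_multiplicative_score(s, H, W):
    # extended multiplicative score with a tanh activation (assumed form)
    return np.tanh(H @ (W @ s))

def attention_weights(scores):
    # softmax over encoder positions turns scores into attention weights
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Attention weights sum to 1; the context vector is a weighted sum of encoder states.
a = attention_weights(tanh_multiplicative_score(s, H, W))
context = a @ H
```

Whichever scoring function is used, the resulting weights are normalized with a softmax and applied to the encoder states to form the context vector the decoder attends to at each generation step.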

Item Type: Article
Uncontrolled Keywords: Attention weight, chatbot, encoder-decoder, neural generative attention, score function, sequence-to-sequence.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Centre for Pre-University Studies
Depositing User: Mohamad Suhaili
Date Deposited: 09 Jul 2025 03:28
Last Modified: 09 Jul 2025 03:28
URI: http://ir.unimas.my/id/eprint/48715
