Comparisons of DNA Sequence Representation Methods for Deep Learning Modelling

Shu En, Chia and Lee, Nung Kion (2022) Comparisons of DNA Sequence Representation Methods for Deep Learning Modelling. In: IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), 13-15 September 2022, Kota Kinabalu, Sabah.

Official URL: https://ieeexplore.ieee.org/document/9936754

Abstract

Learning the enhancer sequence grammar from protein-DNA interaction via a computational approach is a challenging task because the features associated with the recognition codes are ill-defined. While sequence features are not the only way to define the sequence characteristics, they are the most effective. Deep learning neural networks have become the key technique for modelling those features for the classification task. Nevertheless, for a deep learning model to learn effectively, enhancer sequence features must be represented and encoded in a suitable matrix form. The aim of this paper is to evaluate six sequence feature representation/encoding methods for convolutional neural network (CNN) modelling. Using a histone marks dataset as input data, our results indicate that k-mer features achieved the best performance, followed by word-based features, which performed better than one-hot encoding. The random-walk feature performed the worst. Moreover, our findings provide strong evidence for using k-mer/word features instead of the popular one-hot encoding for histone sequences in CNN modelling.
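
To make the difference between two of the encodings named in the abstract concrete, the following is a minimal Python sketch of one-hot encoding and overlapping k-mer counting for a DNA string. It is an illustration only: the function names, the A/C/G/T column order, the handling of ambiguous bases, and the choice of k are assumptions, and the abstract does not state how the paper itself implements either encoding.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed column order; the paper's ordering is not stated


def one_hot_encode(seq):
    """Return one row of four 0/1 values per base; unknown bases (e.g. N) stay all-zero."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    matrix = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        matrix.append(row)
    return matrix


def kmer_counts(seq, k=3):
    """Count overlapping k-mers; the result has 4**k entries in lexicographic order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if kmer in index:  # windows containing ambiguous bases are skipped
            counts[index[kmer]] += 1
    return counts


if __name__ == "__main__":
    example = "ACGTAC"  # toy sequence, not taken from the paper's dataset
    print(one_hot_encode(example))
    print(kmer_counts(example, k=2))
```

The one-hot matrix grows with sequence length and preserves position, whereas the k-mer vector has a fixed size of 4**k and discards position; which of these properties matters for a given CNN is a modelling choice outside the scope of this sketch.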

Item Type: Proceeding (Paper)
Uncontrolled Keywords: enhancers, sequence encoding, deep learning, convolutional neural networks, DNA motifs.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General)
Divisions: Academic Faculties, Institutes and Centres > Faculty of Cognitive Sciences and Human Development
Faculties, Institutes, Centres > Faculty of Cognitive Sciences and Human Development
Depositing User: Lee
Date Deposited: 13 Jul 2023 00:20
Last Modified: 13 Jul 2023 00:20
URI: http://ir.unimas.my/id/eprint/42246
