An empirical study of feature selection for text categorization based on term weightage

Bong, Chih How and Kulathuramaiyer, Narayanan (2004) An empirical study of feature selection for text categorization based on term weightage. In: 2004 IEEE/WIC/ACM International Conference on Web Intelligence.

Abstract

This paper proposes a local feature selection (FS) measure, namely the Categorical Descriptor Term (CTD), for text categorization. It is derived from the classic term weighting scheme TFIDF. The method explicitly chooses a feature set for each category by selecting terms only from the relevant category. Although past literature has suggested that using features from irrelevant categories can improve text categorization performance, we believe that incorporating only relevant features can be highly effective. The experimental comparison is carried out between CTD and five well-known feature selection measures: Information Gain, Chi-Square, Correlation Coefficient, Odds Ratio and GSS Coefficient. The results show that our proposed method performs comparably to the other FS measures, especially on collections with highly overlapping topics.
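
The idea of local, TFIDF-derived feature selection can be illustrated with a minimal sketch. The abstract does not give the CTD formula, so the scoring below is an assumption: each category is treated as a single "document", a term's weight is its frequency within the category multiplied by the log of the inverse category frequency, and only terms that occur in the category itself are candidates. The function name `select_local_features`, the parameter `k` and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def select_local_features(docs, k=2):
    """Pick the top-k terms per category from (category, tokens) pairs.

    Assumed scoring (not the paper's CTD formula):
    weight = term frequency in the category * log(N / category frequency).
    """
    # Term frequency of each term within each category (relevant docs only).
    tf = defaultdict(Counter)
    for category, tokens in docs:
        tf[category].update(tokens)

    # Category frequency: number of categories in which each term occurs.
    cf = Counter()
    for counts in tf.values():
        cf.update(counts.keys())

    n_categories = len(tf)
    selected = {}
    for category, counts in tf.items():
        scores = {
            term: freq * math.log(n_categories / cf[term])
            for term, freq in counts.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        selected[category] = ranked[:k]
    return selected


# Toy collection: "team" occurs in both categories, so its weight is zero.
docs = [
    ("sports", ["match", "goal", "team", "goal"]),
    ("sports", ["team", "coach", "goal"]),
    ("finance", ["market", "stock", "team", "stock"]),
    ("finance", ["stock", "bank", "market"]),
]
features = select_local_features(docs, k=2)
```

In this sketch a term shared by every category receives a weight of zero, which is one simple way a TFIDF-style measure can favour terms that describe a single category on collections with overlapping topics.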

Item Type: Proceeding (Paper)
Additional Information: Universiti Malaysia Sarawak, UNIMAS
Uncontrolled Keywords: text categorization, weightage, Universiti Malaysia Sarawak, UNIMAS, IPTA, education, universiti, university, kuching, samarahan, sarawak, malaysia
Subjects: A General Works > AC Collections. Series. Collected works
T Technology > T Technology (General)
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology
Depositing User: Karen Kornalius
Date Deposited: 19 Mar 2014 07:03
Last Modified: 24 Mar 2015 01:04
URI: http://ir.unimas.my/id/eprint/1190
