Joint Distance and Information Content Word Similarity Measure

Issa, Atoum and Bong, Chih How (2014) Joint Distance and Information Content Word Similarity Measure. In: Soft Computing Applications and Intelligent Systems. Springer Berlin Heidelberg, pp. 257-267.

Abstract

Measuring semantic similarity between words is important to many applications in information retrieval and natural language processing. In this paper, we find that distance-based word similarity metrics suffer from a drawback: they assign equal similarity to two word pairs if the pairs have the same path and depth values in WordNet. Likewise, information content methods, which depend on word probabilities in a corpus, tend to exhibit the same drawback. This paper proposes a new hybrid semantic similarity measure that overcomes these drawbacks by exploiting the advantages of the Li and Lin methods. On a benchmark of human judgments from the Miller-Charles and Rubenstein-Goodenough data sets, the proposed approach outperforms existing distance-based and information-content-based methods.
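
The abstract does not give formulas, so the following is only a minimal sketch of the two baseline measures it names, in the forms in which they are commonly stated in the literature; it is not the paper's hybrid measure. The function names, the parameter defaults (alpha=0.2, beta=0.6, values often quoted for the Li measure), and the numbers in the usage lines are assumptions chosen for illustration.

```python
import math


def information_content(probability):
    """IC of a concept from its corpus probability: -log p."""
    return -math.log(probability)


def li_similarity(path_length, lcs_depth, alpha=0.2, beta=0.6):
    """Distance-based measure in the style of Li et al.

    path_length: shortest path length between the two words in the taxonomy.
    lcs_depth:   depth of their lowest common subsumer.
    alpha, beta: scaling parameters (assumed defaults, not from this record).
    """
    return math.exp(-alpha * path_length) * math.tanh(beta * lcs_depth)


def lin_similarity(ic_lcs, ic_a, ic_b):
    """Information-content measure in the style of Lin.

    Returns 2 * IC(lcs) / (IC(a) + IC(b)), or 0.0 when both ICs are zero.
    """
    denominator = ic_a + ic_b
    if denominator == 0:
        return 0.0
    return 2.0 * ic_lcs / denominator


# Hypothetical values: any two word pairs that share the same path length
# and subsumer depth receive the same distance-based score, which is the
# drawback the abstract describes.
pair_one = li_similarity(path_length=4, lcs_depth=6)
pair_two = li_similarity(path_length=4, lcs_depth=6)
print(pair_one == pair_two)
```

Because the distance-based score depends only on path length and depth, and the information-content score depends only on three IC values, distinct word pairs can collide on either measure; a hybrid of the two is one way to separate such pairs.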

Item Type: Book Section
Uncontrolled Keywords: semantic similarity; similarity measures; edge counting; information content; word similarity; WordNet
Subjects: T Technology > T Technology (General)
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: Karen Kornalius
Date Deposited: 03 Aug 2015 08:01
Last Modified: 03 Aug 2015 08:01
URI: http://ir.unimas.my/id/eprint/8458