JOURNAL ARTICLE

Semantic text similarity using corpus-based word similarity and string similarity

Aminul Islam, Diana Inkpen

Year: 2008 Journal: ACM Transactions on Knowledge Discovery from Data Vol: 2 (2) Pages: 1-25 Publisher: Association for Computing Machinery

Abstract

We present a method for measuring the semantic similarity of texts using a corpus-based measure of semantic word similarity and a normalized and modified version of the Longest Common Subsequence (LCS) string matching algorithm. Existing methods for computing text similarity have focused mainly on either large documents or individual words. We focus on computing the similarity between two sentences or two short paragraphs. The proposed method can be exploited in a variety of applications involving textual knowledge representation and knowledge discovery. Evaluation results on two different data sets show that our method outperforms several competing methods.
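The abstract mentions a normalized version of the Longest Common Subsequence (LCS) measure but does not give the formula. The following is a minimal sketch of one way such a string similarity could be computed, assuming the normalization divides the squared LCS length by the product of the two string lengths. That normalization, and the function names `lcs_length` and `normalized_lcs`, are illustrative assumptions, not details taken from this record. The corpus-based word similarity component is not sketched here.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programming, keeping only the previous row.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a: str, b: str) -> float:
    """LCS-based similarity in [0, 1].

    Assumed normalization: len(LCS)^2 / (len(a) * len(b)).
    Empty input is treated as having similarity 0.0 (an assumption).
    """
    if not a or not b:
        return 0.0
    n = lcs_length(a, b)
    return (n * n) / (len(a) * len(b))
```

Under this normalization, identical non-empty strings score 1.0 and strings with no characters in common score 0.0, so the value can be combined with other similarity scores on the same scale.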

Keywords:
Computer science, Similarity (geometry), Semantic similarity, String metric, Artificial intelligence, Word (group theory), Natural language processing, Longest common subsequence problem, Focus (optics), String searching algorithm, String (physics), Representation (politics), Information retrieval, Matching (statistics), Pattern matching, Mathematics, Algorithm

Metrics

Cited By: 480
FWCI (Field Weighted Citation Impact): 20.35
Refs: 60
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Short Answer Grading Using String Similarity And Corpus-Based Similarity

WaelAly A.

Journal: International Journal of Advanced Computer Science and Applications Year: 2012 Vol: 3 (11)
JOURNAL ARTICLE

Short Answer Grading Using String Similarity And Corpus-Based Similarity

WaelAly A.

Journal: Greater South Information System Year: 2012
JOURNAL ARTICLE

WORD SENSE DISAMBIGUATION USING FUZZY SEMANTIC-BASED STRING SIMILARITY MODEL

Amir Abd-Rashid, Shuzlina Abdul-Rahman, Nor Nadiah Yusof, Azlinah Mohamed

Journal: MALAYSIAN JOURNAL OF COMPUTING Year: 2018 Vol: 3 (2) Pages: 154-154
JOURNAL ARTICLE

Semantic Text Similarity

T. Keerthana

Journal: International Journal for Research in Applied Science and Engineering Technology Year: 2020 Vol: 8 (6) Pages: 273-276