JOURNAL ARTICLE

On the Equivalence of Information Retrieval Methods for Automated Traceability Link Recovery

Abstract

We present an empirical study that statistically analyzes the equivalence of several traceability recovery methods based on Information Retrieval (IR) techniques. The analysis relies on Principal Component Analysis and on the overlap among the sets of candidate links returned by each method. The techniques studied are the Jensen-Shannon (JS) method, the Vector Space Model (VSM), Latent Semantic Indexing (LSI), and Latent Dirichlet Allocation (LDA). The results show that while JS, VSM, and LSI are almost equivalent, LDA captures a dimension that none of the other techniques considered captures.
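
The overlap analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact overlap metrics used in the study, so the Jaccard-style ratio below, the helper names `overlap` and `unique_links`, and the toy candidate links are all assumptions made for illustration; they are not the study's definitions or data.

```python
from itertools import combinations


def overlap(a: set, b: set) -> float:
    """Share of links returned by both methods among links returned by either.

    Assumption: a Jaccard-style ratio, chosen for illustration only.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def unique_links(name: str, candidates: dict) -> set:
    """Links returned by `name` and by no other method in `candidates`."""
    others = set().union(
        *(links for other, links in candidates.items() if other != name)
    )
    return candidates[name] - others


# Toy candidate links (source artifact, target artifact). Invented values,
# not results from the study.
candidates = {
    "JS": {("UC1", "ClassA"), ("UC2", "ClassB"), ("UC3", "ClassC")},
    "VSM": {("UC1", "ClassA"), ("UC2", "ClassB"), ("UC3", "ClassC")},
    "LSI": {("UC1", "ClassA"), ("UC2", "ClassB"), ("UC3", "ClassD")},
    "LDA": {("UC1", "ClassA"), ("UC2", "ClassE"), ("UC3", "ClassF")},
}

# Pairwise overlap between every two methods.
for left, right in combinations(candidates, 2):
    print(f"{left} vs {right}: {overlap(candidates[left], candidates[right]):.2f}")

# Links that only one method recovers.
for name in candidates:
    print(f"{name} unique links: {sorted(unique_links(name, candidates))}")
```

In this toy setting, a method whose candidate set overlaps little with the others and contains links nobody else returns would be a candidate for combination with the rest, which is the kind of complementarity the abstract attributes to LDA.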

Keywords:
Latent Dirichlet allocation; Computer science; Probabilistic latent semantic analysis; Principal component analysis; Vector space model; Latent semantic analysis; Equivalence (formal languages); Data mining; Traceability; Set (abstract data type); Search engine indexing; Topic model; Information retrieval; Artificial intelligence; Mathematics

Metrics

Cited By: 195
FWCI (Field Weighted Citation Impact): 48.05
Refs: 33
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Software Engineering Research
Physical Sciences →  Computer Science →  Information Systems
Web Data Mining and Analysis
Physical Sciences →  Computer Science →  Information Systems
Data Quality and Management
Social Sciences →  Decision Sciences →  Management Science and Operations Research