JOURNAL ARTICLE

Automatically Determining Versions of Scholarly Articles

Abstract

Background: Repositories of scholarly articles should provide authoritative information about the materials they distribute and should distribute those materials in keeping with pertinent laws. To do so, it is important to have accurate information about the versions of articles in a collection.

Analysis: This article presents a simple statistical model to classify articles as author manuscripts or versions of record, with parameters trained on a collection of articles that have been hand-annotated for version. The algorithm achieves about 94 percent accuracy on average (cross-validated).

Conclusion and implications: The average pairwise annotator agreement among a group of experts was 94 percent, showing that the method developed in this article displays performance competitive with human experts.
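The agreement figure quoted in the abstract can be illustrated with a short sketch. The abstract does not say exactly how agreement was computed, so this assumes the simplest reading: raw percent agreement between two annotators, averaged over every unordered pair of annotators. The function names, the labels "AM" (author manuscript) and "VoR" (version of record), and the toy data are all hypothetical and are not taken from the article.

```python
from itertools import combinations


def pairwise_agreement(a, b):
    """Fraction of items on which two annotators assigned the same label."""
    if len(a) != len(b):
        raise ValueError("both annotators must label the same items")
    if not a:
        raise ValueError("no items to compare")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def average_pairwise_agreement(annotations):
    """Mean of pairwise_agreement over every unordered pair of annotators.

    `annotations` maps an annotator name to that annotator's list of labels,
    with all lists in the same item order.
    """
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: three annotators labelling the same four articles.
toy = {
    "annotator_1": ["AM", "VoR", "VoR", "AM"],
    "annotator_2": ["AM", "VoR", "AM", "AM"],
    "annotator_3": ["AM", "VoR", "VoR", "VoR"],
}

if __name__ == "__main__":
    print(f"average pairwise agreement: {average_pairwise_agreement(toy):.3f}")
```

A classifier's accuracy against a reference labelling is the same quantity as `pairwise_agreement` with the classifier treated as one annotator, which is presumably why the two 94 percent figures can be compared directly. Raw percent agreement does not correct for chance agreement; a chance-corrected statistic would be a different measure from the one sketched here.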

Keywords:
Pairwise comparison, Computer science, Information retrieval, Simple (philosophy), Data science, Data mining, Artificial intelligence

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 5
Citation Normalized Percentile: 0.03

Topics

Data Mining Algorithms and Applications
Physical Sciences → Computer Science → Information Systems
