JOURNAL ARTICLE

Mining algorithmic complexity in full-text scholarly documents

Abstract

Non-textual document elements (NTDE) such as charts, diagrams, and algorithms play an important role in presenting key information in scientific documents [1]. Recent information retrieval systems tap this information to answer more complex queries by mining the text that pertains to non-textual document elements. However, linking a document element to its corresponding text can be non-trivial. For instance, linking text about algorithmic complexity to the algorithm it describes can be challenging. These elements are sometimes placed at the start or end of a page instead of following the flow of the document text, and the discussion of an element may or may not be on the same page. In recent years, several attempts have been made to extract NTDE [2-3]. These techniques are actively applied to document summarization and to improving existing IR systems. Generally, asymptotic notations are used to identify complexity lines in the full text. We mine the relevant complexities of algorithms from the full text by comparing the metadata of each algorithm with the context of the paragraph in which the authors discuss complexity. In this paper, we present a mechanism for identifying algorithmic complexity lines using regular expressions, compiling algorithmic metadata, and linking complexity-related textual lines to that metadata.
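The two steps named in the abstract, spotting complexity lines through asymptotic notation and linking them to algorithm metadata, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the regular expression, the stopword list, the `id` and `caption` metadata fields, and the token-overlap score used for linking.

```python
import re

# Assumed pattern (not the paper's): O, Theta, or Omega followed by a
# parenthesised expression, allowing one level of nested parentheses.
COMPLEXITY_RE = re.compile(r"(?<![A-Za-z])(?:O|Θ|Ω)\s*\((?:[^()]|\([^()]*\))+\)")

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "in", "on", "of", "is", "its", "we", "for", "to"}


def find_complexity_lines(text):
    """Return (sentence, notations) pairs for sentences with asymptotic notation."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        notations = COMPLEXITY_RE.findall(sentence)
        if notations:
            hits.append((sentence, notations))
    return hits


def tokens(s):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", s.lower())) - STOPWORDS


def link_to_algorithm(paragraph, algorithms):
    """Return the id of the algorithm whose caption shares the most tokens
    with the paragraph, or None when nothing overlaps.

    `algorithms` is a list of dicts with hypothetical keys "id" and "caption".
    """
    context = tokens(paragraph)
    best_id, best_score = None, 0
    for algorithm in algorithms:
        score = len(context & tokens(algorithm["caption"]))
        if score > best_score:
            best_id, best_score = algorithm["id"], score
    return best_id
```

Calling `find_complexity_lines` on a paragraph yields the candidate sentences, and passing the surrounding paragraph to `link_to_algorithm` picks the most similar algorithm caption. Plain token overlap is used only because it is easy to follow; the abstract does not say which similarity measure the authors apply.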

Keywords:
Computer science; Automatic summarization; Metadata; Information retrieval; Context (archaeology); Paragraph; World Wide Web

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.40
Refs: 5
Citation Normalized Percentile: 0.69

Topics

Semantic Web and Ontologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems

Related Documents

JOURNAL ARTICLE

Open access to scholarly full‐text documents

Péter Jacsó

Journal: Online Information Review, Year: 2006, Vol: 30 (5), Pages: 587-594

JOURNAL ARTICLE

Text Mining Scholarly Publications using APIs

Sarraf, Ishita; Fu, Yuanxi; Schneider, Jodi

Journal: Zenodo (CERN European Organization for Nuclear Research), Year: 2023