JOURNAL ARTICLE

Intelligent Information Retrieval: Handling Variability in Document Structure

Aarushi Gupta, Akhil Chawla, K S Shushrutha, Mohana

Year: 2022
Journal: 2022 3rd International Conference on Smart Electronics and Communication (ICOSEC)
Vol: 2
Pages: 1635-1640

Abstract

Every corporation's day-to-day activities entail dealing with a vast array of diverse data formats, such as work orders, techno's, maintenance papers, and so on, many of which are selectable or scanned PDFs. Extracting the necessary data from these papers for further processing and analysis demands several hours of human labor, imposing a significant financial cost on these corporations. As a result, there is enormous potential for a digital solution that enables sophisticated OCR implementation and automates the entire information extraction process. This paper provides a thorough examination of the information extraction process, with a focus on delivering a complete, high-quality, functional solution. It proposes a solution that incorporates the critical preprocessing required for accurate information extraction and uses Faster R-CNN for document layout analysis, along with a range of approaches for efficient data extraction depending on data type. The multistage document analysis and information extraction tool also provides options for template definition, enabling templates to be reused for batch processing of large amounts of unstructured data.
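The template idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Field` type, the `EXTRACTORS` table, the two data types ("text" and "number"), and the representation of a page as a list of `(x, y, text)` tokens are all assumptions made for illustration, standing in for real OCR and layout-analysis output. The sketch only shows how a template defined once might be reapplied across a batch of pages, with the extraction routine chosen by data type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a page is a list of (x, y, text) tokens, as an OCR
# step might produce; the paper's actual data model is not described here.
Token = Tuple[float, float, str]
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass(frozen=True)
class Field:
    """One named region of a template, tagged with an illustrative data type."""
    name: str
    box: Box
    kind: str  # "text" or "number" in this sketch


def tokens_in(box: Box, page: List[Token]) -> List[str]:
    """Return the text of tokens inside the box, ordered top-to-bottom, left-to-right."""
    x0, y0, x1, y1 = box
    hits = [(y, x, t) for x, y, t in page if x0 <= x <= x1 and y0 <= y <= y1]
    return [t for _, _, t in sorted(hits)]


# One extractor per data type (assumed types; a real tool would have more).
# The "number" extractor raises ValueError if the region holds no numeric text.
EXTRACTORS: Dict[str, Callable[[List[str]], object]] = {
    "text": lambda words: " ".join(words),
    "number": lambda words: float("".join(words).replace(",", "")),
}


def apply_template(template: List[Field], page: List[Token]) -> Dict[str, object]:
    """Extract every field of the template from a single page."""
    return {f.name: EXTRACTORS[f.kind](tokens_in(f.box, page)) for f in template}


def batch_extract(template: List[Field], pages: List[List[Token]]) -> List[Dict[str, object]]:
    """Reuse one template definition across many pages."""
    return [apply_template(template, page) for page in pages]


# Usage: one template, two pages that share the same layout.
template = [
    Field("order_id", (0, 0, 100, 20), "text"),
    Field("total", (0, 80, 100, 100), "number"),
]
pages = [
    [(10, 10, "WO-1001"), (10, 90, "1,250.50")],
    [(12, 8, "WO-1002"), (15, 92, "75")],
]
records = batch_extract(template, pages)
```

Keeping the extractors in a lookup table keyed by data type means a new type can be supported without touching the template or batch logic, which is one plausible way to organize the "range of approaches" the abstract mentions.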

Keywords:
Computer science, Information retrieval, Document retrieval, Artificial intelligence

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.12
Refs: 16
Citation Normalized Percentile: 0.31

Topics

Topic Modeling: Physical Sciences → Computer Science → Artificial Intelligence
Advanced Text Analysis Techniques: Physical Sciences → Computer Science → Artificial Intelligence
Semantic Web and Ontologies: Physical Sciences → Computer Science → Artificial Intelligence
