BOOK-CHAPTER

Extracting Relations from Italian Wikipedia Using Self-Training

Abstract

In this paper, we describe a supervised approach for extracting relations from Wikipedia. In particular, we exploit a self-training strategy to enrich a small set of manually labeled triples with new self-labeled examples. We integrate the supervised stage into WikiOIE, an existing framework for unsupervised relation extraction from Wikipedia, and rely on its unsupervised pipeline to extract the initial set of unlabeled triples. An evaluation involving different algorithms and parameters shows that self-training improves performance. Finally, we provide a dataset of about three million triples extracted from the Italian version of Wikipedia and report a preliminary evaluation on a sample dataset, obtaining promising results.
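
The self-training idea mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the chapter does not say which classifiers or features were used, so a nearest-centroid classifier stands in for the supervised model, triples are assumed to be already encoded as numeric feature vectors, and the names self_train, threshold and max_rounds are hypothetical. It is not the WikiOIE implementation.

```python
def centroids(X, y):
    """Compute the mean feature vector of each class label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def predict_with_confidence(cents, x):
    """Return (label, confidence); confidence grows as x gets closer
    to its nearest centroid relative to the other centroids."""
    d = {c: euclidean(x, m) for c, m in cents.items()}
    best = min(d, key=d.get)
    total = sum(d.values())
    conf = 1.0 if total == 0 else 1.0 - d[best] / total
    return best, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Hypothetical self-training loop: train on the labeled set, label the
    unlabeled pool, move confident predictions into the labeled set, repeat."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X_lab, y_lab)  # retrain on the current labeled set
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(cents, x)
            if conf >= threshold:
                # Confident prediction: keep it as a self-labeled example.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:  # nothing confident left, stop early
            break
    return centroids(X_lab, y_lab), X_lab, y_lab, pool
```

In this sketch the confidence threshold controls the trade-off between how many self-labeled examples are added and how noisy they are; examples that never reach the threshold stay in the unlabeled pool.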

Keywords:
Exploit; Computer science; Pipeline (software); Set (abstract data type); Sample (material); Artificial intelligence; Training set; Relationship extraction; Machine learning; Labeled data; Data mining; Information retrieval; Information extraction

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.73
Refs: 8
Citation Normalized Percentile: 0.74

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Wikis in Education and Collaboration
Social Sciences →  Social Sciences →  Communication
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence