JOURNAL ARTICLE

Semi-supervised expert metadata extraction based on co-training style

Abstract

Supervised approaches to expert metadata extraction require large amounts of labeled training data. To address this problem, a semi-supervised expert metadata extraction method based on co-training style is proposed. First, according to the characteristics of expert metadata, we select expert metadata features, label a certain number of metadata samples, and train two classifiers, one with maximum entropy and one with conditional random fields. Second, the two classifiers are used to label metadata items in unlabeled expert home pages. When the classification results for one metadata type in an expert page satisfy the confidence requirement, the differences between the labels assigned to each metadata type by the two classifiers are analyzed; for metadata satisfying the difference requirement, the classifier that performs better on that metadata type is selected to label it, and the resulting labeled expert home page is taken as a labeled sample. Finally, these labeled expert home pages are used to extend the training samples, two new classifiers are retrained, and the process iterates until the two classifiers converge. In the experiment, we collected 2000 expert home pages; the results indicate that the proposed method outperforms a number of supervised methods and effectively reduces the amount of manual labeling work.
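The co-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `VoteClassifier` is a toy stand-in for the maximum entropy and conditional random field models, the two feature functions are hypothetical views, the per-type scores `score_a` and `score_b` are assumed validation accuracies, and "no new samples promoted" is used as a stand-in for convergence. It also labels individual lines rather than whole expert home pages.

```python
from collections import Counter, defaultdict


class VoteClassifier:
    """Toy stand-in for the MaxEnt / CRF models (an assumption, not the paper's models)."""

    def __init__(self, feature_fn):
        self.feature_fn = feature_fn
        self.counts = defaultdict(Counter)

    def fit(self, samples):
        # Count how often each feature co-occurs with each metadata label.
        self.counts = defaultdict(Counter)
        for text, label in samples:
            for feat in self.feature_fn(text):
                self.counts[feat][label] += 1
        return self

    def predict(self, text):
        # Return (label, confidence); confidence is the winning share of votes.
        votes = Counter()
        for feat in self.feature_fn(text):
            votes.update(self.counts.get(feat, {}))
        if not votes:
            return None, 0.0
        label, top = votes.most_common(1)[0]
        return label, top / sum(votes.values())


def word_features(text):   # hypothetical view 1: lowercase tokens
    return text.lower().split()


def shape_features(text):  # hypothetical view 2: character-shape cues
    return ["has_at" if "@" in text else "no_at",
            "has_digit" if any(c.isdigit() for c in text) else "no_digit"]


def co_train(labeled, unlabeled, clf_a, clf_b, score_a, score_b,
             threshold=0.8, max_rounds=10):
    """score_a / score_b: assumed per-type validation accuracy of each classifier."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        clf_a.fit(labeled)
        clf_b.fit(labeled)
        promoted, remaining = [], []
        for text in pool:
            (label_a, conf_a), (label_b, conf_b) = clf_a.predict(text), clf_b.predict(text)
            if min(conf_a, conf_b) < threshold:
                remaining.append(text)            # confidence requirement not met
            elif label_a == label_b:
                promoted.append((text, label_a))  # both classifiers agree
            else:
                # Disagreement: trust the classifier that is stronger on its own label.
                use_a = score_a.get(label_a, 0.0) >= score_b.get(label_b, 0.0)
                promoted.append((text, label_a if use_a else label_b))
        if not promoted:                          # nothing new: treat as converged
            break
        labeled, pool = labeled + promoted, remaining
    clf_a.fit(labeled)                            # final refit on all accepted samples
    clf_b.fit(labeled)
    return clf_a, clf_b, labeled, pool


seed = [("Email: li@example.edu", "email"), ("Email: wang@example.org", "email"),
        ("Phone: 555 0100", "phone"), ("Phone: 555 0199", "phone")]
unlabeled = ["Email: zhao@example.com", "Phone: 555 0142",
             "Research interests: data mining"]
_, _, labeled, leftover = co_train(
    seed, unlabeled, VoteClassifier(word_features), VoteClassifier(shape_features),
    score_a={"email": 0.9, "phone": 0.8}, score_b={"email": 0.7, "phone": 0.9})
print(len(labeled), leftover)
```

In this toy run the two contact lines are promoted into the training set because both views label them confidently, while the research-interests line stays unlabeled because neither view has seen comparable evidence. The threshold of 0.8 and the tie-breaking rule on disagreement are illustrative choices; the abstract does not specify the actual confidence or difference requirements.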

Keywords:
Metadata; Computer science; Classifier (UML); Metadata repository; Data element; Meta Data Services; Artificial intelligence; Supervised learning; Information retrieval; Machine learning; World Wide Web; Artificial neural network

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.00
Refs: 17
Citation Normalized Percentile: 0.15

Topics

Web Data Mining and Analysis
Physical Sciences → Computer Science → Information Systems
Expert finding and Q&A systems
Physical Sciences → Computer Science → Information Systems
Text and Document Classification Technologies
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Question classification based on co-training style semi-supervised learning

Zhengtao Yu, Lei Su, Lina Li, Quan Zhao, Cunli Mao, Jianyi Guo

Journal: Pattern Recognition Letters, Year: 2010, Vol: 31 (13), Pages: 1975-1980

JOURNAL ARTICLE

Semi-supervised Learning Based on Graph Stochastic Co-Training

Victor Sineglazov, Serhii Yarovyi

Journal: Electronics and Control Systems, Year: 2023, Vol: 3 (77), Pages: 9-16

BOOK-CHAPTER

Semi-supervised Co-training Algorithm Based on Assisted Learning

Hongli Wang, Rongyi Cui

Communications in Computer and Information Science, Year: 2011, Pages: 538-545