Sequence labeling assigns a class label to each state of a sequence, given the observed sequence. We develop an approach that uses conditional random fields to automatically learn a sequence labeling model from a limited number of labeled training examples together with unlabeled data. Our approach consists of two phases. The first phase selects useful unlabeled data based on the labels assigned by the currently learned model and its prediction probabilities. The second phase analyzes the selected data with a classification method, which identifies incorrectly labeled states in the useful unlabeled data by considering their observations and the labels assigned by the current conditional random fields model. The useful unlabeled data is then exploited to improve learning. We have conducted extensive experiments to demonstrate the effectiveness of our approach.
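The two phases can be illustrated with a minimal sketch. The abstract does not state the selection criterion or the form of the classifier, so everything below is an assumption made for illustration: the function names `select_useful` and `flag_suspect_states` are hypothetical, the confidence band `[low, high)` is an invented selection rule, and `is_suspect` stands in for whatever classifier is trained on observations and assigned labels.

```python
from typing import Callable, List, Sequence, Tuple

# One prediction per state: (label assigned by the current model,
# probability the model gives to that label).
Prediction = Tuple[str, float]


def select_useful(
    predictions: Sequence[Sequence[Prediction]],
    low: float = 0.5,
    high: float = 0.9,
) -> List[int]:
    """Phase 1 (assumed rule): return indices of unlabeled sequences whose
    least confident state has a probability in [low, high)."""
    chosen = []
    for idx, seq in enumerate(predictions):
        if not seq:
            continue  # nothing to judge in an empty sequence
        weakest = min(prob for _, prob in seq)
        if low <= weakest < high:
            chosen.append(idx)
    return chosen


def flag_suspect_states(
    observations: Sequence[str],
    labels: Sequence[str],
    is_suspect: Callable[[str, str], bool],
) -> List[bool]:
    """Phase 2 (assumed interface): mark each state whose assigned label the
    classifier judges incorrect, given the observation and that label."""
    if len(observations) != len(labels):
        raise ValueError("observations and labels must have equal length")
    return [is_suspect(obs, lab) for obs, lab in zip(observations, labels)]
```

In this sketch, the indices returned by `select_useful` would pick the sequences passed to `flag_suspect_states`, and the resulting mask would tell a later training step which assigned labels not to trust. How the flagged states are then used to improve learning is not specified in the abstract and is left out here.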