As a powerful sequence labeling model, the conditional random field (CRF) has been successfully applied to a number of natural language processing (NLP) tasks. However, the high complexity of CRF training permits only a very small tag (or label) set, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model over all tags, it trains a binary sub-CRF independently for each tag. A predicted tag sequence is then produced by a joint decoding algorithm based on the probabilistic outputs of all the sub-CRFs involved. To test its effectiveness, this approach is applied to Chinese word segmentation (CWS), formulated as a character tagging problem. Our evaluation shows that it reduces time and memory cost by 20-39% and 44-50%, respectively, without any significant performance loss on various large-scale data sets.
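The decompose-then-combine idea can be illustrated with a minimal sketch. Here we assume (the abstract does not specify the decoder's details) that each binary sub-CRF emits, for its own tag, a marginal probability at every character position, and that joint decoding simply picks the highest-scoring tag per position; the paper's actual joint decoder may additionally enforce tag-transition constraints. The function name `joint_decode` and the toy probabilities are hypothetical.

```python
# Standard 4-tag CWS scheme: Begin, Middle, End of word, Single-char word.
TAGS = ["B", "M", "E", "S"]

def joint_decode(sub_crf_probs):
    """Combine per-tag binary sub-CRF outputs into one tag sequence.

    sub_crf_probs: dict mapping each tag to a list of probabilities,
    one per character position (output of that tag's binary sub-CRF).
    Returns the position-wise argmax tag sequence.
    """
    length = len(sub_crf_probs[TAGS[0]])
    return [max(TAGS, key=lambda t: sub_crf_probs[t][i])
            for i in range(length)]

# Toy sub-CRF outputs for a 4-character sentence (hypothetical numbers).
probs = {
    "B": [0.90, 0.10, 0.80, 0.20],
    "M": [0.05, 0.20, 0.10, 0.10],
    "E": [0.03, 0.60, 0.05, 0.10],
    "S": [0.02, 0.10, 0.05, 0.60],
}
print(joint_decode(probs))  # -> ['B', 'E', 'B', 'S']
```

Each sub-CRF is a two-class model, so its training cost does not grow with the full tag set, which is the source of the reported time and memory savings.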