JOURNAL ARTICLE

POC-NLW Template Based Tagging Method for Chinese Word Segmentation

Abstract

In Chinese word segmentation, disambiguation and unknown word identification are the two key issues. This paper presents a two-stage system that addresses both. In the first stage, an n-gram based model performs the basic segmentation and resolves ambiguity to some extent. In the second stage, a language tagging template named POC-NLW drives a character sequence tagging procedure based on a hidden Markov model, which refines the first-stage results and identifies unknown words. Detailed experiments were carried out on the SIGHAN Bakeoff 2005 corpus. The results show that the method achieves high accuracy on both word segmentation and unknown word identification, with appreciable processing efficiency. The method offers good interoperability and extensibility across different kinds of unknown words, which makes it applicable to practical Chinese information processing applications.
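
To make the second stage more concrete, the following is a minimal sketch of character sequence tagging with a hidden Markov model and Viterbi decoding. It is not the POC-NLW template itself, which the abstract does not define: the B/M/E/S position tag set, the add-alpha smoothing, the function names, and the tiny training corpus are all assumptions chosen for illustration. The first-stage n-gram segmenter is omitted.

```python
import math

# Assumed tag set (not from the paper): Begin, Middle, End of a word, Single-character word.
TAGS = ("B", "M", "E", "S")


def words_to_tags(words):
    """Convert a segmented sentence into one position tag per character."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and tags; a word closes on E or S."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # keep a trailing fragment if the tag sequence ends mid-word
        words.append(current)
    return words


def train(corpus, alpha=1.0):
    """Estimate add-alpha smoothed log-probabilities from segmented sentences."""
    start = {t: alpha for t in TAGS}
    trans = {a: {b: alpha for b in TAGS} for a in TAGS}
    emit = {t: {} for t in TAGS}
    vocab = set()
    for words in corpus:
        chars, tags = "".join(words), words_to_tags(words)
        vocab.update(chars)
        start[tags[0]] += 1
        for a, b in zip(tags, tags[1:]):
            trans[a][b] += 1
        for ch, t in zip(chars, tags):
            emit[t][ch] = emit[t].get(ch, 0) + 1

    def normalize(counts):
        total = sum(counts.values())
        return {k: math.log(v / total) for k, v in counts.items()}

    v = len(vocab) + 1  # one extra slot reserves mass for unseen characters

    def emit_lp(tag, ch):
        total = sum(emit[tag].values())
        return math.log((emit[tag].get(ch, 0) + alpha) / (total + alpha * v))

    return normalize(start), {a: normalize(trans[a]) for a in TAGS}, emit_lp


def viterbi(chars, model):
    """Return the most probable tag sequence for a string of characters."""
    if not chars:
        return []
    start, trans, emit_lp = model
    score = [{t: start[t] + emit_lp(t, chars[0]) for t in TAGS}]
    back = []
    for ch in chars[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for t in TAGS:
            best = max(TAGS, key=lambda p: prev[p] + trans[p][t])
            cur[t] = prev[best] + trans[best][t] + emit_lp(t, ch)
            ptr[t] = best
        score.append(cur)
        back.append(ptr)
    path = [max(TAGS, key=lambda t: score[-1][t])]
    for ptr in reversed(back):  # follow back-pointers from the last character
        path.append(ptr[path[-1]])
    return path[::-1]


# Made-up toy corpus, for illustration only (not SIGHAN Bakeoff data).
corpus = [["我", "爱", "北京"], ["北京", "欢迎", "你"], ["我", "喜欢", "中国"]]
model = train(corpus)
sentence = "我爱中国"
print(tags_to_words(sentence, viterbi(sentence, model)))
```

Because the tagger labels characters instead of looking words up in a lexicon, it can propose word boundaries for character strings it has never seen as words, which is the general reason character tagging suits unknown word identification. A realistic system would train on a large segmented corpus and would likely constrain impossible transitions such as B followed by S.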

Keywords:
Computer science; Segmentation; Artificial intelligence; Hidden Markov model; Word (group theory); Character (mathematics); Natural language processing; Sequence labeling; Identification (biology); Text segmentation; Key (lock); Carry (investment); Sequence (biology); Word identification; Pattern recognition (psychology); Speech recognition; Word recognition; Linguistics

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 1.18
Refs: 16
Citation Normalized Percentile: 0.82

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Handwritten Text Recognition Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

POC-NLW Template for Chinese Word Segmentation

Bo Chen, Weiran Xu, Peng Tao, Jun Guo

Journal: Meeting of the Association for Computational Linguistics Year: 2006 Vol: 66 Pages: 177-180
JOURNAL ARTICLE

Chinese Word Segmentation as Character Tagging

Nianwen Xue

Year: 2003 Vol: 8 (1) Pages: 29-48
JOURNAL ARTICLE

Chinese word segmentation as LMR tagging

Nianwen Xue, Libin Shen

Year: 2003 Vol: 17 Pages: 176-179