EXPLOITING SUBTREES IN AUTO‐PARSED DATA TO IMPROVE DEPENDENCY PARSING

Wenliang Chen; Jun’ichi Kazama; Kiyotaka Uchimoto; Kentaro Torisawa

doi:10.1111/j.1467-8640.2012.00451.x

JOURNAL ARTICLE

Year: 2012; Journal: Computational Intelligence; Vol: 28 (3); Pages: 426-451; Publisher: Wiley

Abstract

Dependency parsing has attracted considerable interest from researchers and developers in natural language processing. However, to obtain a high‐accuracy dependency parser, supervised techniques require a large volume of hand‐annotated data, which are extremely expensive. This paper presents a simple and effective approach for improving dependency parsing with subtrees derived from unannotated data, which are easy to obtain. First, we use a baseline parser to parse large‐scale unannotated data. Then, we extract subtrees from dependency parse trees in the auto‐parsed data. Next, the extracted subtrees are classified into several sets according to their frequency. Finally, we design new features based on the subtree sets for parsing algorithms. To demonstrate the effectiveness of our proposed approach, we conduct experiments on the English Penn Treebank and Chinese Penn Treebank. The results show that our approach significantly outperforms baseline systems. It also achieves the best accuracy for the Chinese data and an accuracy competitive with the best known systems for the English data.
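The four steps in the abstract (parse, extract subtrees, group by frequency, derive features) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the auto-parsed data is already available as lists of (word, head index) pairs, that a "subtree" is a single head-dependent arc, and that three frequency sets are used. The function names, the cut-off fractions, and the labels HIGH, MID, LOW, and UNSEEN are all hypothetical choices made for illustration.

```python
from collections import Counter

# Assumed input format: a sentence is a list of (word, head_index) pairs.
# head_index is 0 for the root, otherwise the 1-based position of the head.


def extract_bigram_subtrees(sentence):
    """Yield (head_word, dependent_word, direction) for every arc."""
    for position, (word, head) in enumerate(sentence, start=1):
        if head == 0:
            continue  # the root word has no head inside the sentence
        head_word = sentence[head - 1][0]
        # "R": the dependent is to the right of its head, "L": to the left.
        direction = "R" if head < position else "L"
        yield (head_word, word, direction)


def count_subtrees(corpus):
    """Count how often each subtree occurs in the auto-parsed corpus."""
    counts = Counter()
    for sentence in corpus:
        counts.update(extract_bigram_subtrees(sentence))
    return counts


def classify_by_frequency(counts, high_fraction=0.1, middle_fraction=0.3):
    """Map each subtree to a frequency set based on its rank.

    The fractions are illustrative assumptions, not values from the source.
    """
    ranked = [subtree for subtree, _ in counts.most_common()]
    high_cut = int(len(ranked) * high_fraction)
    middle_cut = int(len(ranked) * middle_fraction)
    labels = {}
    for rank, subtree in enumerate(ranked):
        if rank < high_cut:
            labels[subtree] = "HIGH"
        elif rank < middle_cut:
            labels[subtree] = "MID"
        else:
            labels[subtree] = "LOW"
    return labels


def subtree_feature(labels, head_word, dependent_word, direction):
    """Return the frequency-set label a parser could use as a feature."""
    return labels.get((head_word, dependent_word, direction), "UNSEEN")
```

In a setup like this, a parser scoring a candidate arc would look up the arc's label with `subtree_feature` and add it to its existing feature set, so that arcs seen often in the auto-parsed data are distinguished from rare or unseen ones.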

Keywords: Treebank; Computer science; Parsing; Artificial intelligence; Dependency grammar; Dependency (UML); Natural language processing; Bottom-up parsing; Top-down parsing

Metrics

FWCI (Field Weighted Citation Impact): 2.65

Citation Normalized Percentile: 0.91

Topics

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Data Mining Algorithms and Applications

Physical Sciences → Computer Science → Information Systems

Related Documents

Improving dependency parsing with subtrees from auto-parsed data

Using Short Dependency Relations from Auto-Parsed Data for Chinese Dependency Parsing

Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data

Dependency Graphs and TEITOK: Exploiting Dependency Parsing

Improve Chinese Semantic Dependency Parsing via Syntactic Dependency Parsing