JOURNAL ARTICLE

Building a large Chinese corpus annotated with semantic dependency

Abstract

At present, most corpora are annotated mainly with syntactic knowledge. In this paper, we attempt to build a large corpus annotated with semantic knowledge using dependency grammar. We believe that words are the basic units of semantics, and that the structure and meaning of a sentence consist mainly of a series of semantic dependencies between individual words. A corpus on the scale of 1,000,000 words, annotated with semantic dependencies, has been built. Compared with syntactic knowledge, semantic knowledge is more difficult to annotate because the ambiguity problem is more serious. We address a strategy for improving annotation consistency and define congruence as a measure of the consistency of the tagged corpus. Finally, we compare our corpus with other well-known corpora.

Keywords:
Computer science; Natural language processing; Artificial intelligence; Ambiguity; Dependency (UML); Sentence; Consistency (knowledge bases); Dependency grammar; Semantics (computer science); Semantic role labeling; Information retrieval

Metrics

Cited By: 29
FWCI (Field Weighted Citation Impact): 1.92
Refs: 4
Citation Normalized Percentile: 0.88

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Second Language Acquisition and Learning
Social Sciences →  Psychology →  Developmental and Educational Psychology