
Complex-Valued Relative Positional Encodings for Transformer

Abstract

Recently, the self-attention mechanism of the Transformer has shown clear advantages across various natural language processing (NLP) tasks. Since positional information is crucial to NLP tasks, positional encoding has become a critical factor in improving the Transformer's performance. In this paper, we present a simple but effective complex-valued relative positional encoding (CRPE) method. Specifically, we map the query and key vectors to the complex domain based on their positions. Hence, the attention weights directly encode relative positional information through the dot product between the complex-valued query and key vectors. To demonstrate the effectiveness of our method, we evaluate it on four typical NLP tasks: named entity recognition, text classification, machine translation, and language modeling, whose datasets comprise texts of varying lengths. In the experiments, our method outperforms baseline positional encodings across all datasets. The results show that our method is effective for both long and short texts while using fewer parameters.
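
To make the core idea concrete, below is a minimal NumPy sketch of a complex-valued relative positional encoding. It assumes a rotation-style parameterization in which adjacent dimension pairs of a real vector are read as complex numbers and rotated by a position-dependent phase exp(1j * pos * theta); the frequency schedule theta, the dimension pairing, and the function names are illustrative assumptions, not the paper's exact formulation. The check at the end verifies the property stated in the abstract: the real part of the Hermitian dot product between a rotated query and key depends only on their relative offset.

import numpy as np

def to_complex(x: np.ndarray, pos: int, theta: np.ndarray) -> np.ndarray:
    """Map a real vector to the complex domain via a position-dependent rotation.

    Adjacent dimension pairs (x[2i], x[2i+1]) are read as one complex number
    and rotated by exp(1j * pos * theta[i]).
    """
    z = x[0::2] + 1j * x[1::2]           # pair up real dimensions as complex numbers
    return z * np.exp(1j * pos * theta)  # apply the position-dependent phase

d = 8                                                      # head dimension (must be even)
theta = 1.0 / (10000.0 ** (np.arange(d // 2) / (d // 2)))  # per-pair frequencies (assumed schedule)

rng = np.random.default_rng(0)
q, k = rng.normal(size=d), rng.normal(size=d)

def score(m: int, n: int) -> float:
    """Attention score: real part of the Hermitian dot product conj(k_n) . q_m."""
    qm, kn = to_complex(q, m, theta), to_complex(k, n, theta)
    return float(np.real(np.vdot(kn, qm)))  # np.vdot conjugates its first argument

# conj(k_n) * q_m carries the phase exp(1j * (m - n) * theta), so the score
# depends only on the relative offset m - n, not on absolute positions:
print(np.isclose(score(5, 2), score(10, 7)))  # True: both pairs have offset 3

The key design property is that the absolute phases cancel in the Hermitian product, leaving only the relative offset m - n in the attention weight, and the rotation adds no learned parameters.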

Keywords:
Computer science; Transformer; Artificial intelligence; Natural language processing; Machine translation; Encoding (memory); Machine learning
