JOURNAL ARTICLE

Searching Optimal Solutions for Sequence-to-Sequence Models

Abstract

Sequence-to-sequence generation applications take source texts as inputs and automatically generate new texts that satisfy specific target requirements, such as generating a paraphrase, translating to another language, or answering a question. There are two key challenges in sequence-to-sequence generation applications: first, how to encode source texts into informative representations that preserve rich semantic information; second, how to generate target texts that look like human-generated texts. In this thesis, I develop probabilistic models that encode informative context representations from source texts using variational autoencoders, and I investigate different learning algorithms to train models that can effectively generate better target texts. For learning context representations with variational autoencoders, I identify a limitation of using variational autoencoders in sequence-to-sequence models: applying the standard normal prior is likely to trap the variational posterior in a local optimum, thus preventing the model from learning rich context representations. Therefore, I propose to adapt the attention mechanism and learn empirical priors to help the model escape the local optimum and learn better context representations. For investigating different learning algorithms for sequence-to-sequence models, I present an empirical study of different learning algorithms (e.g., REINFORCE, DAgger) to analyze how they can mitigate the training-inference discrepancy when training sequence-to-sequence models. I apply different learning algorithms to a state-of-the-art model on paraphrase generation tasks, and find that DAgger consistently contributes to better performance.
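The abstract's point about the standard normal prior trapping the variational posterior can be illustrated with the closed-form KL divergence between two Gaussians, the regularization term in a VAE objective. The function name and all numeric values below are illustrative choices of mine, not taken from the thesis; the sketch only shows why a learned (empirical) prior can impose a much smaller penalty on an informative posterior than the fixed N(0, 1) prior does.

```python
import math

def kl_gaussians(mu_q, var_q, mu_p, var_p):
    """Closed-form KL( N(mu_q, var_q) || N(mu_p, var_p) ) for scalar Gaussians."""
    return 0.5 * (math.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

# With the fixed standard normal prior N(0, 1), the KL penalty pulls every
# posterior toward the same point; an informative posterior far from 0 pays
# a large cost, which pushes the model toward a degenerate local optimum
# where the latent code carries little information about the source text.
kl_standard = kl_gaussians(mu_q=2.0, var_q=0.5, mu_p=0.0, var_p=1.0)

# A learned empirical prior fitted near the posterior keeps the penalty
# small while still regularizing, leaving room for rich representations.
kl_learned = kl_gaussians(mu_q=2.0, var_q=0.5, mu_p=1.8, var_p=0.7)

print(kl_standard, kl_learned)  # the learned-prior penalty is far smaller
```

This is only a one-dimensional caricature of the trade-off; the thesis itself operates on full latent vectors and learns the prior parameters jointly with the attention mechanism.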

Keywords:
Context; encoding; probabilistic models; paraphrase; prior probability; semantics; context model

Metrics

Cited By: 0
FWCI (Field-Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.34

Topics

Topic Modeling
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Category: Physical Sciences → Computer Science → Artificial Intelligence

© 2026 ScienceGate Book Chapters — All rights reserved.