JOURNAL ARTICLE

Searching Optimal Solutions for Sequence-to-Sequence Models

Abstract

Sequence-to-sequence generation applications take source texts as inputs and automatically generate new texts that satisfy specific target requirements, such as generating a paraphrase, translating to another language, or answering a question. There are two key challenges in sequence-to-sequence generation applications: first, how to encode source texts into informative representations that preserve rich semantic information; second, how to generate target texts that look like human-generated texts. In this thesis, I develop probabilistic models that encode informative context representations from source texts using variational autoencoders, and I investigate different learning algorithms to train models that can effectively generate better target texts. For learning context representations with variational autoencoders, I identify a limitation of using variational autoencoders in sequence-to-sequence models: applying the standard normal prior is likely to trap the variational posterior in a local optimum, thus preventing the model from learning rich context representations. Therefore, I propose to adapt the attention mechanism and learn empirical priors to help the model escape the local optimum and learn better context representations. For investigating different learning algorithms for sequence-to-sequence models, I present an empirical study of different learning algorithms (e.g., REINFORCE, DAgger) to analyze how they can mitigate the training-inference discrepancy when training sequence-to-sequence models. I apply different learning algorithms to a state-of-the-art model on paraphrase generation tasks, and find that DAgger consistently contributes to better performance.
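The abstract's point about the standard normal prior trapping the variational posterior can be illustrated with the closed-form KL divergence between two Gaussians, the regularization term in a VAE objective. The function name and all numeric values below are illustrative choices of mine, not taken from the thesis; the sketch only shows why a learned (empirical) prior can impose a much smaller penalty on an informative posterior than the fixed N(0, 1) prior does.

```python
import math

def kl_gaussians(mu_q, var_q, mu_p, var_p):
    """Closed-form KL( N(mu_q, var_q) || N(mu_p, var_p) ) for scalar Gaussians."""
    return 0.5 * (math.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

# With the fixed standard normal prior N(0, 1), the KL penalty pulls every
# posterior toward the same point; an informative posterior far from 0 pays
# a large cost, which pushes the model toward a degenerate local optimum
# where the latent code carries little information about the source text.
kl_standard = kl_gaussians(mu_q=2.0, var_q=0.5, mu_p=0.0, var_p=1.0)

# A learned empirical prior fitted near the posterior keeps the penalty
# small while still regularizing, leaving room for rich representations.
kl_learned = kl_gaussians(mu_q=2.0, var_q=0.5, mu_p=1.8, var_p=0.7)

print(kl_standard, kl_learned)  # the learned-prior penalty is far smaller
```

This is only a one-dimensional caricature of the trade-off; the thesis itself operates on full latent vectors and learns the prior parameters jointly with the attention mechanism.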

Keywords:
Context; encoding; probabilistic models; paraphrase; prior probability; semantics; context model

Metrics

Cited By: 0
FWCI (Field-Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.34

Topics

Topic Modeling
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Category: Physical Sciences → Computer Science → Artificial Intelligence

© 2026 ScienceGate Book Chapters — All rights reserved.