DISSERTATION

Encoder-decoder neural networks

Nal Kalchbrenner

Year: 2017   University: University of Oxford (Oxford University Research Archive, ORA)   Publisher: University of Oxford

Abstract

This thesis introduces the concept of an encoder-decoder neural network and develops architectures for the construction of such networks. Encoder-decoder neural networks are probabilistic conditional generative models of high-dimensional structured items such as natural language utterances and natural images. Encoder-decoder neural networks estimate a probability distribution over structured items belonging to a target set conditioned on structured items belonging to a source set. The distribution over structured items is factorized into a product of tractable conditional distributions over individual elements that compose the items. The networks estimate these conditional factors explicitly. We develop encoder-decoder neural networks for core tasks in natural language processing and natural image and video modelling. In Part I, we tackle the problem of sentence modelling and develop deep convolutional encoders to classify sentences; we extend these encoders to models of discourse. In Part II, we go beyond encoders to study the longstanding problem of translating from one human language to another. We lay the foundations of neural machine translation, a novel approach that views the entire translation process as a single encoder-decoder neural network. We propose a beam search procedure to search over the outputs of the decoder to produce a likely translation in the target language. In addition to the known recurrent decoders, we propose a decoder architecture based solely on convolutional layers. Since the publication of these new foundations for machine translation in 2013, encoder-decoder translation models have been richly developed and have displaced traditional translation systems both in academic research and in large-scale industrial deployment. In services such as Google Translate these models process on the order of a billion translation queries a day.
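The beam search procedure mentioned above can be sketched as follows. This is an illustrative sketch, not the thesis's implementation: `next_log_probs` and the `toy_decoder` below are hypothetical stand-ins for a trained decoder's conditional factor p(y_t | y_<t, x).

```python
import math

def beam_search(next_log_probs, vocab, beam_width=2, max_len=4, eos="</s>"):
    """Keep the beam_width highest-scoring partial translations at each step.

    next_log_probs(prefix) returns a dict token -> log-probability, standing
    in for the decoder's conditional distribution over the next token.
    """
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:
                candidates.append((seq, score))  # finished hypothesis, keep as-is
                continue
            log_probs = next_log_probs(seq)
            for tok in vocab:
                candidates.append((seq + [tok], score + log_probs[tok]))
        # prune to the top-scoring hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq and seq[-1] == eos for seq, _ in beams):
            break
    return beams[0]

# Hypothetical toy decoder: prefers "a" first, then ending the sequence.
def toy_decoder(prefix):
    if not prefix:
        return {"a": math.log(0.6), "b": math.log(0.3), "</s>": math.log(0.1)}
    return {"a": math.log(0.2), "b": math.log(0.1), "</s>": math.log(0.7)}

best_seq, best_score = beam_search(toy_decoder, ["a", "b", "</s>"])
```

Because the search is greedy only at the beam level, a hypothesis that starts with a lower-probability token can still survive pruning and win overall, which is the advantage over pure greedy decoding.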
In Part III, we shift from the linguistic domain to the visual one to study distributions over natural images and videos. We describe two- and three-dimensional recurrent and convolutional decoder architectures and address the longstanding problem of learning a tractable distribution over high-dimensional natural images and videos, where the likely samples from the distribution are visually coherent. The empirical validation of encoder-decoder neural networks as state-of-the-art models of tasks ranging from machine translation to video prediction has a two-fold significance. On the one hand, it validates the notions of assigning probabilities to sentences or images and of learning a distribution over a natural language or a domain of natural images; it shows that a probabilistic principle of compositionality, whereby a high-dimensional item is composed from individual elements at the encoder side and whereby a corresponding item is decomposed into conditional factors over individual elements at the decoder side, is a general method for modelling cognition involving high-dimensional items; and it suggests that the relations between the elements are best learnt in an end-to-end fashion as non-linear functions in distributed space. On the other hand, the empirical success of the networks on the tasks characterizes the underlying cognitive processes themselves: a cognitive process as complex as translating from one language to another that takes a human a few seconds to perform correctly can be accurately modelled via a learnt non-linear deterministic function of distributed vectors in high-dimensional space.
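The tractable distribution over an image described above rests on the chain-rule factorization p(x) = ∏ᵢ p(xᵢ | x₍<ᵢ₎), where pixels are visited in a fixed order and each factor conditions on the preceding pixels. A minimal sketch, assuming binary pixels and a toy hand-written conditional (a real model, as in the thesis, would learn the conditionals with a recurrent or masked-convolutional network):

```python
import math

def image_log_likelihood(pixels, conditional):
    """log p(image) as a sum of chain-rule factors log p(x_i | x_<i)."""
    total = 0.0
    for i, value in enumerate(pixels):
        probs = conditional(pixels[:i])  # tractable conditional over one pixel
        total += math.log(probs[value])
    return total

# Hypothetical stand-in for a learnt conditional: neighbouring pixels in
# natural images tend to agree, so repeat the previous value with prob 0.9.
def toy_conditional(prefix):
    if not prefix:
        return {0: 0.5, 1: 0.5}
    last = prefix[-1]
    return {last: 0.9, 1 - last: 0.1}

ll = image_log_likelihood([1, 1, 1, 0], toy_conditional)
```

Each factor is explicit and normalized, so the product is a valid distribution and the exact log-likelihood of any image can be computed in a single pass, which is what makes the factorization tractable.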

Keywords:
Computer science, Machine translation, Encoder, Convolutional neural network, Artificial intelligence, Artificial neural network, Natural language, Sentence, Set (abstract data type), Deep learning, Natural language processing, Theoretical computer science, Speech recognition, Programming language

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks

B. Senthil Kumar, D. Thenmozhi, S. Kayalvizhi

IFIP Advances in Information and Communication Technology   Year: 2020   Pages: 30-42
JOURNAL ARTICLE

Software Reliability Prediction through Encoder-Decoder Recurrent Neural Networks

Chen Li, Junjun Zheng, Hiroyuki Okamura, Tadashi Dohi

Journal: International Journal of Mathematical Engineering and Management Sciences   Year: 2022   Vol: 7 (3)   Pages: 325-340
JOURNAL ARTICLE

Encoder-decoder based convolutional neural networks for image forgery detection

Fatima Zahra El Biach, Imad Iala, Hicham Laanaya, Khalid Minaoui

Journal: Multimedia Tools and Applications   Year: 2021   Vol: 81 (16)   Pages: 22611-22628
JOURNAL ARTICLE

PottsMGNet: A Mathematical Explanation of Encoder-Decoder Based Neural Networks

Xue-Cheng Tai, Hao Liu, Raymond H. Chan

Journal: SIAM Journal on Imaging Sciences   Year: 2024   Vol: 17 (1)   Pages: 540-594