Generative Adversarial learning with Negative Data Augmentation for Semi-supervised Text Classification

Shahriar Shayesteh; Diana Inkpen

doi:10.32473/flairs.v35i.130722

Year: 2022
Journal: Proceedings of the ... International Florida Artificial Intelligence Research Society Conference
Vol: 35
Publisher: George A. Smathers Libraries

Abstract

In recent years, semi-supervised generative adversarial network (SS-GAN) models such as GAN-BERT have achieved promising results on text classification. One technique these models use to protect the generator from mode collapse is feature matching (FM). Although FM addresses some critical issues of SS-GANs, these models still suffer from mode collapse, with missing coverage outside the data manifold. Moreover, FM only loosely matches the distribution of the fake generated samples to that of the real data. As a result, the generator can produce fake samples inside high-density regions of the data manifold, and the discriminator learns to misclassify these regions as lying outside the data manifold. In this work, we employ the negative data augmentation (NDA) technique, for the first time in text classification, to alleviate these problems. NDA produces out-of-distribution fake examples by applying a mixup transformation to the fake samples and augmented real data. In our new model (NDA-GAN), we produce NDA samples by combining the generator's output with the contextual representation of the real data. As a result of the mixing, NDA samples are less likely to fall in high-density regions, and because they are blended with real data representations, they remain reasonably close to the data manifold. Consequently, the NDA samples increase the discriminator's power to find the optimal decision boundary. Our experimental results demonstrate that the negative augmented samples improve the overall accuracy of our proposed model and make it more confident when detecting out-of-distribution samples.
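The two mechanisms named in the abstract, feature matching and mixup-based negative data augmentation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the exact formulas, so the sketch assumes the commonly used form of feature matching (squared distance between batch-mean features) and standard mixup (a convex combination with a Beta-distributed coefficient). The function names, the `alpha` parameter, and the random vectors standing in for generator outputs and contextual representations are all hypothetical.

```python
import numpy as np


def feature_matching_loss(real_feats, fake_feats):
    """Squared L2 distance between the batch-mean features of real and fake samples.

    Assumption: this is the common feature-matching objective; the paper's
    exact loss is not stated in the abstract.
    """
    diff = real_feats.mean(axis=0) - fake_feats.mean(axis=0)
    return float(np.dot(diff, diff))


def nda_mixup(fake_reps, real_reps, alpha=1.0, rng=None):
    """Blend generator outputs with real-data representations, one pair per row.

    Assumption: standard mixup, lam * fake + (1 - lam) * real, with
    lam ~ Beta(alpha, alpha) drawn independently for each row.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha, size=(fake_reps.shape[0], 1))
    return lam * fake_reps + (1.0 - lam) * real_reps


# Toy stand-ins: 8 samples with 16-dimensional representations (hypothetical sizes).
rng = np.random.default_rng(0)
real = rng.normal(loc=1.0, size=(8, 16))   # placeholder for contextual representations
fake = rng.normal(loc=-1.0, size=(8, 16))  # placeholder for generator outputs

nda = nda_mixup(fake, real, alpha=1.0, rng=rng)
fm = feature_matching_loss(real, fake)
```

In a semi-supervised GAN of this kind, the rows of `nda` would presumably be passed to the discriminator as additional fake-class inputs; how NDA-GAN weights them in its loss is not described in the abstract.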

Keywords:

Discriminator; Generator (circuit theory); Computer science; Generative grammar; Manifold (fluid mechanics); Representation (politics); Boundary (topology); Artificial intelligence; Pattern recognition (psychology); Feature (linguistics); Mode (computer interface); Generative model; Decision boundary; Matching (statistics); Mixing (physics); Key (lock); Power (physics); Machine learning; Mathematics; Statistics; Physics; Support vector machine

Metrics

FWCI (Field Weighted Citation Impact): 0.07

Citation Normalized Percentile: 0.24

Topics

Digital Media Forensic Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Generative Adversarial Networks and Image Synthesis

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Anomaly Detection Techniques and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Data augmentation for supervised learning with generative adversarial networks

GSDA: Generative adversarial network-based semi-supervised data augmentation for ultrasound image classification

Semi-Supervised Learning with Generative Adversarial Networks for Pathological Speech Classification

Semi-supervised Text Regression with Conditional Generative Adversarial Networks

Semi-supervised classification-aware cross-modal deep adversarial data augmentation