Convolutional neural networks (CNNs) play a crucial role in data-intensive tasks across various industries. Their integration within Industry 4.0 has accelerated automation, particularly in visual quality control and predictive maintenance. Despite this adoption, CNNs are often applied to underspecified tasks, which can lead them to learn undesirable spurious correlations or biases from the data. The resulting unexpected behaviors undermine the reliability of automated systems. This risk highlights the need for robust explainability methods that ensure models perform their tasks for the right reasons and align with human expert knowledge.

This dissertation addresses CNN interpretability through Explainable Artificial Intelligence (XAI), specifically by developing Concept Extraction (CE) methods. These methods, which aim to automatically extract the distinguishable, high-level patterns a model has learned, have shown promising results, yet they fail when dealing with industrial data and models. The core hypothesis of our work is that by considering the properties of models, such as how they encode the scale or translation of features, we can improve CE methods for industrial applications where these properties are meaningful. Accordingly, this work rethinks CE in CNNs in light of model properties and usage constraints. We extend patch-based methods with SPACE (Scale-Preserving Automatic Concept Extraction) to handle the scale variance of features such as pinholes or scratches. We further introduce ECLAD (Extracting Concepts with Local Aggregated Descriptors), which not only extracts concepts but also localizes them. We validate these methods in real-world industrial applications, where they provide clear explanations of the visual features used by the models and enable the detection of undesired biases. Lastly, we develop CoRe (Concept Regularization) to retrain models and mitigate undesired biases using human feedback and concept-based regularization. Brief illustrative sketches of the core ideas behind these three methods are given below.

In summary, we introduce SPACE, ECLAD, and CoRe, methods tailored to industrial models and data that provide detailed and robust explanations for understanding CNNs. We demonstrate that a global analysis of a trained model allows experts to identify and mitigate undesired biases. Beyond the domain of industrial AI, this thesis also advances CE methods in general: our contributions to XAI include novel methods for extracting scale-sensitive concepts from models, providing pixel-wise localization of concepts, and developing recourse mechanisms to reduce the importance of undesired biases.
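The abstract describes SPACE only as scale-preserving and patch-based. A minimal sketch of that idea, assuming a PyTorch tensor pipeline, is to cut fixed-size patches from an image at its native resolution, so the physical scale of defects such as pinholes or scratches is not normalized away by per-segment resizing; the function name scale_preserving_patches and its parameters are illustrative assumptions, not the thesis's implementation.

```python
import torch

def scale_preserving_patches(image: torch.Tensor, patch: int, stride: int) -> torch.Tensor:
    """Slide a fixed-size window over the image at native resolution.

    image: (C, H, W) tensor. Returns (N, C, patch, patch), one entry per
    window position. Because no resizing happens, small features (pinholes)
    and large features (scratches) keep their relative scale.
    """
    c = image.shape[0]
    patches = image.unfold(1, patch, stride).unfold(2, patch, stride)
    return patches.permute(1, 2, 0, 3, 4).reshape(-1, c, patch, patch)

# Example: 64x64 patches with 50% overlap from a 3x512x512 image.
patches = scale_preserving_patches(torch.randn(3, 512, 512), patch=64, stride=32)
```

Concept mining can then proceed on these patches (for instance, by clustering their activations), with feature scale retained as a distinguishing attribute.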
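For ECLAD, the abstract names local aggregated descriptors without detailing them. One plausible reading, sketched below under the assumption of a PyTorch CNN, is to upscale activation maps from several layers to the input resolution, concatenate them into one descriptor per pixel, and cluster the descriptors; local_aggregated_descriptors, the chosen layers, and n_clusters=8 are illustrative placeholders, not the thesis's actual configuration.

```python
import torch
import torch.nn.functional as F
import torchvision
from sklearn.cluster import MiniBatchKMeans

@torch.no_grad()
def local_aggregated_descriptors(model, layers, images):
    """One descriptor per pixel: upscale each chosen layer's activation map
    to the input resolution and concatenate along the channel axis."""
    acts = []
    hooks = [layer.register_forward_hook(lambda mod, inp, out: acts.append(out))
             for layer in layers]
    model(images)                         # forward pass populates `acts`
    for h in hooks:
        h.remove()
    h_in, w_in = images.shape[-2:]
    upscaled = [F.interpolate(a, size=(h_in, w_in), mode="bilinear",
                              align_corners=False) for a in acts]
    lads = torch.cat(upscaled, dim=1)     # (B, summed channels, H, W)
    return lads.permute(0, 2, 3, 1).reshape(-1, lads.shape[1])

# Placeholder model and data, illustrative only.
model = torchvision.models.resnet18(weights=None).eval()
images = torch.randn(4, 3, 224, 224)
descriptors = local_aggregated_descriptors(model, [model.layer2, model.layer3], images)

# Each descriptor cluster is a candidate concept; a pixel's cluster id yields
# a pixel-wise localization mask for that concept.
kmeans = MiniBatchKMeans(n_clusters=8, random_state=0).fit(descriptors.cpu().numpy())
masks = kmeans.labels_.reshape(images.shape[0], *images.shape[-2:])
```

Associating every pixel with a cluster is what would let such a method not only extract concepts but also localize them, the property the abstract highlights.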
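CoRe's objective is likewise not specified in the abstract. A common way to regularize against an expert-flagged concept is to penalize the model's sensitivity to the flagged image regions; the sketch below implements that generic surrogate (in the spirit of "right for the right reasons" losses), and concept_regularized_loss, bias_masks, and lam are assumed names and parameters, not the exact CoRe loss.

```python
import torch
import torch.nn.functional as F

def concept_regularized_loss(model, images, labels, bias_masks, lam=1.0):
    """Cross-entropy plus a penalty on input gradients inside regions an
    expert flagged as an undesired concept.

    bias_masks: (B, 1, H, W), 1 where the undesired concept is located.
    A generic concept-regularization surrogate, not the thesis's exact loss.
    """
    images = images.detach().requires_grad_(True)
    logits = model(images)
    task_loss = F.cross_entropy(logits, labels)
    # Sensitivity of the log-probabilities to pixels in the flagged regions;
    # create_graph=True keeps the penalty differentiable w.r.t. the weights.
    grads, = torch.autograd.grad(F.log_softmax(logits, dim=1).sum(),
                                 images, create_graph=True)
    penalty = (bias_masks * grads.pow(2)).mean()
    return task_loss + lam * penalty
```

Minimizing such a loss during retraining preserves task accuracy while pushing the model away from the flagged regions, matching the abstract's description of bias mitigation through human feedback.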