MMF-ViT: A multi-scale multi-domain frequency-aware vision Transformer for MRI-based Alzheimer's classification

Ying Liu; Xiaoli Yang

doi:10.3934/era.2025263

JOURNAL ARTICLE

Year: 2025; Journal: Electronic Research Archive; Vol: 33 (10); Pages: 5916-5936; Publisher: American Institute of Mathematical Sciences

Abstract

Alzheimer's disease (AD) is a progressive neurodegenerative disorder that imposes a substantial burden on families and healthcare systems. Mild cognitive impairment (MCI), an intermediate stage between normal aging and AD, can be further divided into progressive MCI (pMCI) and stable MCI (sMCI) based on follow-up outcomes. Unlike cognitively normal (CN) individuals and AD patients, who differ markedly, sMCI and pMCI share highly similar characteristics, making early identification of pMCI extremely challenging. Although deep learning methods based on structural magnetic resonance imaging (sMRI) have advanced AD classification, research on predicting MCI progression remains limited, both because sMCI and pMCI are so similar and because prospectively collecting longitudinal data is costly. Accurate early identification of pMCI is essential for timely intervention, slowing disease progression, and reducing healthcare costs. This study therefore focuses on the early identification of pMCI. We propose a novel vision Transformer framework, the multi-scale multi-domain frequency-aware vision Transformer (MMF-ViT), which employs a multi-scale cross-domain fusion (MSCDF) module to enable deep interaction between spatial-domain and frequency-domain features, thereby enhancing the modeling of fine-grained brain structural variations. A multi-scale frequency encoder (MSFE) and a multi-scale context encoder (MSCE) are designed to extract and fuse frequency and spatial information, improving classification performance. Experimental results on the ADNI dataset show that MMF-ViT achieves an accuracy of 72.84% and an AUC of 72.99% for sMCI versus pMCI classification, significantly outperforming mainstream 2D and 3D models. In AD versus CN classification, MMF-ViT achieves an accuracy of 85.59%, highlighting its strong feature representation capability and practical potential.
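
The abstract does not give the internals of the MSCDF, MSFE, or MSCE modules, so the following is only a minimal sketch of the general idea of pairing spatial and frequency-domain information at several scales. It is not the authors' method: the function names, the average-pooling scales, the summary statistics, and the use of plain concatenation as a stand-in for "fusion" are all assumptions made for illustration, and a random array stands in for an MRI slice.

```python
import numpy as np


def frequency_features(patch: np.ndarray) -> np.ndarray:
    """Log-magnitude 2D Fourier spectrum, with the zero frequency shifted to the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(patch))
    return np.log1p(np.abs(spectrum))


def downsample(image: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool by an integer factor (both sides must be divisible by it)."""
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def multi_scale_fused_descriptor(image: np.ndarray, factors=(1, 2, 4)) -> np.ndarray:
    """Hypothetical fusion: per scale, concatenate spatial and frequency statistics."""
    parts = []
    for factor in factors:
        scaled = downsample(image, factor)
        freq = frequency_features(scaled)
        spatial_stats = np.array([scaled.mean(), scaled.std()])
        frequency_stats = np.array([freq.mean(), freq.std()])
        parts.append(np.concatenate([spatial_stats, frequency_stats]))
    return np.concatenate(parts)


# A random 64x64 array stands in for a single 2D MRI slice (assumption).
rng = np.random.default_rng(0)
slice_2d = rng.random((64, 64))
descriptor = multi_scale_fused_descriptor(slice_2d)
print(descriptor.shape)  # (12,): 3 scales x (2 spatial + 2 frequency) values
```

In a learned model the hand-written statistics would be replaced by trainable encoders and the concatenation by an attention-based interaction; the sketch only shows where the two domains and the multiple scales enter.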

Topics

Brain Tumor Detection and Classification

Life Sciences → Neuroscience → Neurology

Medical Image Segmentation Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Retinal Imaging and Analysis

Health Sciences → Medicine → Radiology, Nuclear Medicine and Imaging

Related Documents

RMSF-ViT: Randomized Multi-scale Fusion Vision Transformer

SQ-ViT: A Multi-Scale Vision Transformer With Quaternion for Endoscopic Images Classification

Aux-ViT: Classification of Alzheimer's Disease from MRI based on Vision Transformer with Auxiliary Branch

Fourier ViT: A Multi-scale Vision Transformer with Fourier Transform for Histopathological Image Classification

Energy-Aware Multi-Modal Vision Transformer (ViT) based C-V2X Cooperative Perception in CAVs