JOURNAL ARTICLE

MMF-ViT: A multi-scale multi-domain frequency-aware vision Transformer for MRI-based Alzheimer's classification

Ying Liu, Xiaoli Yang

Year: 2025
Journal: Electronic Research Archive
Vol: 33 (10)
Pages: 5916-5936
Publisher: American Institute of Mathematical Sciences

Abstract

Alzheimer's disease (AD) is a progressive neurodegenerative disorder that imposes a substantial burden on families and healthcare systems. Mild cognitive impairment (MCI), an intermediate stage between normal aging and AD, can be further divided into progressive MCI (pMCI) and stable MCI (sMCI) based on follow-up outcomes. Unlike the marked differences observed between cognitively normal (CN) individuals and AD patients, sMCI and pMCI share highly similar characteristics, making early identification of pMCI extremely challenging. Although deep learning methods based on structural magnetic resonance imaging (sMRI) have advanced AD classification, research on predicting MCI progression remains limited, both because sMCI and pMCI are so similar and because prospectively collecting longitudinal data is costly. Accurate early identification of pMCI is essential for timely intervention, slowing disease progression, and reducing healthcare costs. This study therefore focuses on the early identification of pMCI. We propose a novel vision Transformer framework, the multi-scale multi-domain frequency-aware vision Transformer (MMF-ViT), which employs a multi-scale cross-domain fusion (MSCDF) module to enable deep interaction between spatial and frequency domain features, thereby enhancing the modeling of fine-grained brain structural variations. A multi-scale frequency encoder (MSFE) and a multi-scale context encoder (MSCE) are designed to extract and fuse frequency and spatial information, improving classification performance. Experimental results on the ADNI dataset demonstrate that MMF-ViT achieves an accuracy of 72.84% and an AUC of 72.99% for sMCI versus pMCI classification, significantly outperforming mainstream 2D and 3D models. For AD versus CN classification, MMF-ViT achieves an accuracy of 85.59%, highlighting its strong feature representation capability and practical potential.
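
The abstract does not say how the MSFE, MSCE, or MSCDF modules are implemented, so the following is only an illustrative sketch of the general idea of pairing spatial and frequency-domain descriptors at several scales. It is not the authors' method: every function name here is hypothetical, the features are hand-crafted rather than learned, a random array stands in for an sMRI slice, and plain concatenation stands in for the learned cross-domain fusion.

```python
import numpy as np


def downsample(image: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool a 2D image by an integer factor (crops any remainder)."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def frequency_features(image: np.ndarray, n_bands: int = 4) -> np.ndarray:
    """Mean log-magnitude of the centered 2D FFT in n_bands radial bands."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    log_mag = np.log1p(np.abs(spectrum))
    h, w = image.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)
    # Small epsilon so the largest radius falls inside the last band.
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bands + 1)
    return np.array([
        log_mag[(radius >= lo) & (radius < hi)].mean()
        for lo, hi in zip(edges[:-1], edges[1:])
    ])


def spatial_features(image: np.ndarray) -> np.ndarray:
    """Simple spatial statistics: mean, std, and mean gradient magnitude."""
    gy, gx = np.gradient(image)
    return np.array([image.mean(), image.std(), np.hypot(gy, gx).mean()])


def multiscale_fused_features(image, factors=(1, 2, 4), n_bands=4):
    """Concatenate spatial and frequency descriptors at each scale.

    Concatenation is an assumed stand-in for a learned fusion module.
    """
    parts = []
    for factor in factors:
        scaled = downsample(image, factor)
        parts.append(np.concatenate([
            spatial_features(scaled),
            frequency_features(scaled, n_bands),
        ]))
    return np.concatenate(parts)


rng = np.random.default_rng(0)
slice_2d = rng.random((64, 64))  # synthetic stand-in for one sMRI slice
features = multiscale_fused_features(slice_2d)
print(features.shape)  # (21,) = 3 scales x (3 spatial + 4 frequency)
```

In a Transformer setting, descriptors like these would be replaced by learned token embeddings from each domain, with attention rather than concatenation handling the interaction between them.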

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 29
Citation Normalized Percentile: 0.49

Topics

Brain Tumor Detection and Classification (Life Sciences → Neuroscience → Neurology)
Medical Image Segmentation Techniques (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Retinal Imaging and Analysis (Health Sciences → Medicine → Radiology, Nuclear Medicine and Imaging)