JOURNAL ARTICLE

MM-Path: Multi-modal, Multi-granularity Path Representation Learning

Abstract

Developing effective path representations has become increasingly essential across various fields within intelligent transportation. Although pre-trained path representation learning models have shown improved performance, they predominantly focus on the topological structures from single-modality data, i.e., road networks, overlooking the geometric and contextual features associated with path-related images, e.g., remote sensing images. Similar to human understanding, integrating information from multiple modalities can provide a more comprehensive view, enhancing both representation accuracy and generalization. However, variations in information granularity impede the semantic alignment of road network-based paths (road paths) and image-based paths (image paths), while the heterogeneity of multi-modal data poses substantial challenges for effective fusion and utilization. In this paper, we propose a novel Multi-modal, Multi-granularity Path Representation Learning Framework (MM-Path), which can learn a generic path representation by integrating modalities from both road paths and image paths. To enhance the alignment of multi-modal data, we develop a multi-granularity alignment strategy that systematically associates nodes, road sub-paths, and road paths with their corresponding image patches, ensuring the synchronization of both detailed local information and broader global contexts. To address the heterogeneity of multi-modal data effectively, we introduce a graph-based cross-modal residual fusion component designed to comprehensively fuse information across different modalities and granularities. Finally, we conduct extensive experiments on two large-scale real-world datasets under two downstream tasks, validating the effectiveness of the proposed MM-Path.
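
The multi-granularity alignment idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the alignment objective, so the code below swaps in a standard symmetric InfoNCE contrastive loss as an assumption; it is not the MM-Path implementation. The function names (`info_nce`, `multi_granularity_alignment_loss`), the temperature, the equal weighting of granularities, and the random toy embeddings are all hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def info_nce(road, image, temperature=0.1):
    # Symmetric InfoNCE (assumed objective): row i of `road` is the
    # positive match for row i of `image`; all other rows are negatives.
    road, image = l2_normalize(road), l2_normalize(image)
    logits = road @ image.T / temperature

    def cross_entropy_diag(z):
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))

def multi_granularity_alignment_loss(pairs, weights=None):
    # `pairs` maps a granularity name to (road_embeddings, image_embeddings).
    # Equal weights per granularity are an assumption of this sketch.
    weights = weights or {name: 1.0 for name in pairs}
    return sum(weights[name] * info_nce(r, i) for name, (r, i) in pairs.items())

# Toy data: image embeddings are noisy copies of the road embeddings.
rng = np.random.default_rng(0)
dim = 16
pairs = {}
for name, count in {"node": 8, "sub_path": 4, "path": 2}.items():
    road = rng.normal(size=(count, dim))
    image = road + 0.01 * rng.normal(size=(count, dim))
    pairs[name] = (road, image)

aligned_loss = multi_granularity_alignment_loss(pairs)

# Reversing the image rows breaks every road/image pairing.
shuffled = {name: (r, i[::-1]) for name, (r, i) in pairs.items()}
misaligned_loss = multi_granularity_alignment_loss(shuffled)
```

In this toy setting the loss over correctly paired embeddings is lower than the loss over mismatched pairs, which is the behavior an alignment objective of this kind is meant to reward at the node, sub-path, and path levels simultaneously.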

Keywords:
Granularity, Path (computing), Computer science, Modal, Representation (politics), Theoretical computer science, Artificial intelligence, Computer network, Materials science, Programming language

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 23.87
Refs: 36
Citation Normalized Percentile: 0.98
Is in top 1%
Is in top 10%

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

Multi-modal Sensors Path Merging

Léo Baudouin, Youcef Mezouar, Omar Ait-Aider, Hélder Araújo

Advances in intelligent systems and computing Year: 2015 Pages: 191-201

JOURNAL ARTICLE

Multi-granularity Waveband- and Wavelength- Path Network

Ken-ichi Sato

Year: 2010 Vol: 1 Pages: CFJ1-CFJ1

BOOK-CHAPTER

Multi-granularity Complex Network Representation Learning

Peisen Li, Guoyin Wang, Jun Hu, Yun Li

Lecture notes in computer science Year: 2020 Pages: 236-250

JOURNAL ARTICLE

Multi-Semantic Path Representation Learning for Travel Time Estimation

Liangzhe Han, Bowen Du, Jingjing Lin, Leilei Sun, Xucheng Li, Yizhou Peng

Journal: IEEE Transactions on Intelligent Transportation Systems Year: 2021 Vol: 23 (8) Pages: 13108-13117