JOURNAL ARTICLE

YuYin: a multi-task learning model of multi-modal e-commerce background music recommendation

Le Ma, Xinda Wu, Ruiyuan Tang, Chongjun Zhong, Kejun Zhang

Year: 2023
Journal: EURASIP Journal on Audio, Speech, and Music Processing
Vol: 2023 (1)
Publisher: Springer Nature

Abstract

Appropriate background music in e-commerce advertisements can help stimulate consumption and build product image. However, many factors, such as emotion and product category, must be taken into account, which makes manual music selection time-consuming and dependent on professional knowledge; automatically recommending music for video therefore becomes crucial. Because no e-commerce advertisement dataset exists, we first establish a large-scale e-commerce advertisement dataset, Commercial-98K, which covers major e-commerce categories. We then propose a video-music retrieval model, YuYin, to learn the correlation between video and music. We introduce a weighted fusion module (WFM) that fuses emotion features and audio features from music to obtain a more fine-grained music representation. Considering the similarity of music within the same product category, YuYin is trained with multi-task learning, exploring the correlation between video and music through cross-matching of video, music, and tag, together with a category prediction task. Extensive experiments show that YuYin achieves a remarkable improvement in video-music retrieval on Commercial-98K.
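
The idea of fusing emotion and audio features into one music representation, then retrieving music for a video by similarity, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the WFM computes its weights or how similarity is measured, so the softmax-normalized gate, the cosine similarity, and all names below (`weighted_fusion`, `rank_music`, `gate_logits`) are assumptions chosen for illustration only.

```python
import math


def softmax(xs):
    """Normalize a list of scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(emotion_vec, audio_vec, gate_logits):
    """Fuse two same-length feature vectors with softmax-normalized weights.

    Assumption: a real model would learn gate_logits; here they are given.
    """
    if len(emotion_vec) != len(audio_vec):
        raise ValueError("feature vectors must have the same length")
    w_emotion, w_audio = softmax(gate_logits)
    return [w_emotion * e + w_audio * a for e, a in zip(emotion_vec, audio_vec)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_music(video_vec, music_library, gate_logits):
    """Return track names ordered from most to least similar to the video."""
    scored = []
    for name, (emotion_vec, audio_vec) in music_library.items():
        fused = weighted_fusion(emotion_vec, audio_vec, gate_logits)
        scored.append((cosine(video_vec, fused), name))
    return [name for _, name in sorted(scored, reverse=True)]


# Toy, hand-made vectors (not data from Commercial-98K).
library = {
    "upbeat": ([1.0, 0.0], [1.0, 0.0]),
    "calm": ([0.0, 1.0], [0.0, 1.0]),
}
ranking = rank_music([1.0, 0.1], library, gate_logits=[0.0, 0.0])
```

With equal gate logits the two feature types contribute equally; shifting the logits toward one side would let emotion or audio features dominate the fused representation.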

Keywords:
Computer science, Task (project management), Modal, Product (mathematics), Similarity (geometry), Matching (statistics), Multimedia, Scale (ratio), Representation (politics), Information retrieval, Artificial intelligence, Image (mathematics)

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.54
Refs: 54
Citation Normalized Percentile: 0.59

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing