Cross-Modal Recipe Embeddings by Disentangling Recipe Contents and Dish Styles

Yu Sugiyama; Keiji Yanai

doi:10.1145/3474085.3475422

JOURNAL ARTICLE

Year: 2021 Pages: 2501-2509

Abstract

Recipe-sharing websites are now widely used and play a major role in everyday home cooking. Because a recipe typically consists of dish photos and recipe text, cross-modal recipe search is being actively explored. To enable cross-modal search, food image features and recipe text features are generally embedded into a shared space. Most existing studies, however, assume a one-to-one correspondence between a recipe text and a dish image in the embedding space, even though an unlimited number of photos with different serving styles and different plates can be associated with the same recipe. In this paper, we propose RDE-GAN (Recipe Disentangled Embedding GAN), which separates food image information into a recipe image feature and a non-recipe shape feature. In addition, we generate a food image by integrating the recipe embedding with a shape feature. Because the proposed embedding is free from serving and plate styles, which are unrelated to the recipe itself, it outperformed existing methods on cross-modal recipe search in our experiments. We also confirmed that the shape elements and the recipe elements can each be changed independently at food image generation time.
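The idea in the abstract can be illustrated with a toy sketch. It is not the RDE-GAN architecture, which this page does not describe; it only assumes, for illustration, that an image feature is a flat vector whose first `recipe_dim` entries carry recipe information and whose remaining entries carry shape (serving and plate) information. The function names `split_image_feature`, `rank_recipes`, and `swap_shape`, the cosine-similarity ranking, and all numeric values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def split_image_feature(feature, recipe_dim):
    """Assumed layout: first recipe_dim entries are the recipe part, the rest the shape part."""
    return feature[:recipe_dim], feature[recipe_dim:]

def rank_recipes(recipe_part, recipe_embeddings):
    """Return recipe indices ordered from most to least similar to the query."""
    scores = [cosine(recipe_part, e) for e in recipe_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def swap_shape(feature_a, feature_b, recipe_dim):
    """Combine the recipe part of feature_a with the shape part of feature_b."""
    recipe_a, _ = split_image_feature(feature_a, recipe_dim)
    _, shape_b = split_image_feature(feature_b, recipe_dim)
    return recipe_a + shape_b

# Toy recipe text embeddings in a 3-dimensional shared space (made-up values).
recipe_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Two photos of the same dish with very different shape parts (made-up values).
photo_on_plate = [0.1, 0.9, 0.2, 5.0, -1.0]
photo_in_bowl = [0.2, 0.8, 0.1, -3.0, 2.0]

for photo in (photo_on_plate, photo_in_bowl):
    recipe_part, _ = split_image_feature(photo, recipe_dim=3)
    print(rank_recipes(recipe_part, recipe_embeddings))

# A mixed feature: recipe content of the first photo, shape of the second.
print(swap_shape(photo_on_plate, photo_in_bowl, recipe_dim=3))
```

In this toy setup both photos rank the recipe at index 1 first, because the shape entries are excluded from the search; a generator conditioned on the output of `swap_shape` would correspond to changing only the shape while keeping the recipe fixed.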

Keywords:

Recipe; Embedding; Modal; Computer science; Feature (linguistics); Artificial intelligence; Image (mathematics); Space (punctuation); Mathematics; Algorithm; Computer vision; Geography; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 1.43

Citation Normalized Percentile: 0.83

Topics

Image Retrieval and Classification Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Video Analysis and Summarization

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Improving Cross-Modal Recipe Embeddings with Cross Decoder

Cross-Modal Recipe Retrieval: How to Cook this Dish?

Transformer-Based Cross-Modal Recipe Embeddings with Large Batch Training

Mask-based Food Image Synthesis with Cross-Modal Recipe Embeddings

Video-Based Cross-Modal Recipe Retrieval