Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows

Shirai, Keisuke, Hashimoto, Atsushi, Nishimura, Taichi, Kameko, Hirotaka, Kurita, Shuhei, Ushiku, Yoshitaka, Mori, Shinsuke

Sep-13-2022–arXiv.org Artificial Intelligence

We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of each cooking action in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. Each state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-modal relation. With our dataset, one can try a range of applications, from multimodal commonsense reasoning to procedural text generation.
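To make the structure described in the abstract concrete, the following is a minimal sketch of how an image pair might be grounded in a recipe flow graph. It is not the dataset's actual schema or file format: the class names `ImagePair` and `RecipeFlowGraph`, the edge label `"target"`, and the image file names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ImagePair:
    """An object's appearance before and after one cooking action (hypothetical type)."""
    before: str  # path to the pre-action image (made-up file name)
    after: str   # path to the post-action image (made-up file name)


@dataclass
class RecipeFlowGraph:
    """Toy r-FG: nodes are expressions from the recipe text, edges are labeled links."""
    nodes: Dict[int, str] = field(default_factory=dict)
    edges: List[Tuple[int, int, str]] = field(default_factory=list)
    # Cross-modal grounding: node id -> image pair showing the state change.
    grounding: Dict[int, ImagePair] = field(default_factory=dict)

    def add_node(self, node_id: int, text: str) -> None:
        self.nodes[node_id] = text

    def add_edge(self, src: int, dst: int, label: str) -> None:
        # Both endpoints must already exist in the graph.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, dst, label))

    def ground(self, node_id: int, pair: ImagePair) -> None:
        # Attach an image pair to a node of the graph.
        if node_id not in self.nodes:
            raise KeyError("cannot ground an unknown node")
        self.grounding[node_id] = pair

    def state_change(self, node_id: int) -> Optional[ImagePair]:
        # Return the grounded image pair, or None if the node has no images.
        return self.grounding.get(node_id)


def build_example() -> RecipeFlowGraph:
    """Build a two-node graph for the instruction 'cut the carrot'."""
    graph = RecipeFlowGraph()
    graph.add_node(0, "carrot")
    graph.add_node(1, "cut")
    graph.add_edge(0, 1, "target")  # hypothetical label: carrot is what gets cut
    graph.ground(1, ImagePair("carrot_whole.jpg", "carrot_sliced.jpg"))
    return graph
```

In this sketch, calling `build_example().state_change(1)` returns the image pair attached to the "cut" action, while a node without images returns `None`. Keeping the grounding in a separate mapping keeps the text-only graph usable on its own.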

annotation, machine learning, natural language

Country:
- North America > United States
  - New York > New York County > New York City (0.04)
- Asia > Japan
  - Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)

Genre:
- Workflow (0.88)
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks (0.95)
