AITopics | Jang, Yunseok

Jang, Yunseok

YTCommentQA: Video Question Answerability in Instructional Videos

Yang, Saelyne, Park, Sunghyun, Jang, Yunseok, Lee, Moontae

arXiv.org Artificial Intelligence, Jan-30-2024

Instructional videos provide detailed how-to guides for various tasks, with viewers often posing questions regarding the content. Addressing these questions is vital for comprehending the content, yet receiving immediate answers is difficult. While numerous computational models have been developed for Video Question Answering (Video QA) tasks, they are primarily trained on questions generated based on video content, aiming to produce answers from within the content. However, in real-world situations, users may pose questions that go beyond the video's informational boundaries, highlighting the necessity to determine if a video can provide the answer. Discerning whether a question can be answered by video content is challenging due to the multi-modal nature of videos, where visual and verbal information are intertwined. To bridge this gap, we present the YTCommentQA dataset, which contains naturally-generated questions from YouTube, categorized by their answerability and required modality to answer -- visual, script, or both. Experiments with answerability classification tasks demonstrate the complexity of YTCommentQA and emphasize the need to comprehend the combined role of visual and script information in video reasoning. The dataset is available at https://github.com/lgresearch/YTCommentQA.

large language model, machine learning, natural language

arXiv:2401.17343

Country:

Europe (0.93)
North America > United States > New York (0.14)
North America > United States > Illinois (0.14)
Asia > Middle East > UAE (0.14)

Genre:

Instructional Material > Course Syllabus & Notes (0.85)
Research Report (0.64)

Industry:

Education > Educational Technology > Media (0.61)
Education > Educational Technology > Audio & Video (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Unsupervised Task Graph Generation from Instructional Video Transcripts

Logeswaran, Lajanugen, Sohn, Sungryull, Jang, Yunseok, Lee, Moontae, Lee, Honglak

arXiv.org Artificial Intelligence, May-2-2023

This work explores the problem of generating task graphs of real-world activities. Different from prior formulations, we consider a setting where text transcripts of instructional videos performing a real-world activity (e.g., making coffee) are provided and the goal is to identify the key steps relevant to the task as well as the dependency relationship between these key steps. We propose a novel task graph generation approach that combines the reasoning capabilities of instruction-tuned language models along with clustering and ranking components to generate accurate task graphs in a completely unsupervised manner. We show that the proposed approach generates more accurate task graphs compared to a supervised learning approach on tasks from the ProceL and CrossTask datasets.

logic & formal reasoning, machine learning, natural language

arXiv:2302.09173

Genre:

Workflow (0.93)
Research Report (0.64)
Instructional Material > Course Syllabus & Notes (0.61)

Industry:

Education > Educational Technology > Audio & Video (0.70)
Education > Educational Technology > Media (0.61)
Education > Health & Safety > School Nutrition (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.47)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)

Multimodal Subtask Graph Generation from Instructional Videos

Jang, Yunseok, Sohn, Sungryull, Logeswaran, Lajanugen, Luo, Tiange, Lee, Moontae, Lee, Honglak

arXiv.org Artificial Intelligence, Feb-16-2023

Real-world tasks consist of multiple inter-dependent subtasks (e.g., a dirty pan needs to be washed before it can be used for cooking). In this work, we aim to model the causal dependencies between such subtasks from instructional videos describing the task. This is a challenging problem since complete information about the world is often inaccessible from videos, which demands robust learning mechanisms to understand the causal structure of events. We present Multimodal Subtask Graph Generation (MSG2), an approach that constructs a Subtask Graph defining the dependencies between a task's subtasks from noisy web videos. Graphs generated by our multimodal approach are closer to human-annotated graphs than those of prior approaches. MSG2 further performs the downstream task of next subtask prediction 85% and 30% more accurately than recent video transformer models on the ProceL and CrossTask datasets, respectively.
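As a rough illustration of what a subtask dependency structure looks like, the sketch below stores each subtask's prerequisites and lists the subtasks that become available once those prerequisites are done. It is a simplified toy, not the MSG2 method: the abstract does not specify the graph representation, and the function name `eligible_subtasks` and the example subtasks are assumptions made for illustration.

```python
def eligible_subtasks(prerequisites, completed):
    """Return, in sorted order, the unfinished subtasks whose
    prerequisites have all been completed.

    prerequisites: dict mapping a subtask name to the set of subtasks
                   that must be finished first (hypothetical format).
    completed:     set of subtask names already finished.
    """
    return sorted(
        task
        for task, needs in prerequisites.items()
        if task not in completed and needs <= completed
    )


# Hypothetical toy graph based on the pan example in the abstract.
prerequisites = {
    "wash pan": set(),
    "heat pan": {"wash pan"},
    "cook": {"heat pan"},
}

print(eligible_subtasks(prerequisites, set()))
print(eligible_subtasks(prerequisites, {"wash pan"}))
```

Next-subtask prediction can then be viewed as choosing among the eligible subtasks, although the model described in the abstract learns the dependencies from noisy videos rather than receiving them by hand.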

logic & formal reasoning, machine learning, natural language

arXiv:2302.08672

Genre:

Research Report (0.81)
Instructional Material > Course Syllabus & Notes (0.61)

Industry:

Education > Educational Technology > Media (0.61)
Education > Educational Technology > Audio & Video (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.47)

Diversity-Sensitive Conditional Generative Adversarial Networks

Yang, Dingdong, Hong, Seunghoon, Jang, Yunseok, Zhao, Tianchen, Lee, Honglak

arXiv.org Machine Learning, Jan-25-2019

We propose a simple yet highly effective method that addresses the mode-collapse problem in the Conditional Generative Adversarial Network (cGAN). Although conditional distributions are multi-modal (i.e., having many modes) in practice, most cGAN approaches tend to learn an overly simplified distribution where an input is always mapped to a single output regardless of variations in the latent code. To address this issue, we propose to explicitly regularize the generator to produce diverse outputs depending on latent codes. The proposed regularization is simple, general, and can be easily integrated into most conditional GAN objectives. Additionally, explicit regularization on the generator allows our method to control the balance between visual quality and diversity. We demonstrate the effectiveness of our method on three conditional generation tasks: image-to-image translation, image inpainting, and future video prediction. We show that the simple addition of our regularization to existing models leads to surprisingly diverse generations, substantially outperforming previous approaches for multi-modal conditional generation that were specifically designed for each individual task.
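The abstract does not state the regularizer's formula, so the following is only a sketch of one plausible form: for a fixed input and two latent codes, measure how far apart the generator outputs are relative to how far apart the codes are, and cap that ratio with a bound. The function name `diversity_term`, the bound `tau`, and the two toy generators are assumptions made for illustration.

```python
import math


def diversity_term(generator, x, z1, z2, tau):
    """Ratio of output distance to latent distance, capped at tau.

    A generator that ignores its latent code scores 0; training would
    aim to increase this value (hypothetical formulation).
    """
    output_gap = math.dist(generator(x, z1), generator(x, z2))
    latent_gap = math.dist(z1, z2)
    return min(output_gap / latent_gap, tau)


def collapsed_generator(x, z):
    # Toy generator that ignores z: same output for every latent code.
    return list(x)


def diverse_generator(x, z):
    # Toy generator whose output shifts with the latent code.
    return [xi + zi for xi, zi in zip(x, z)]


x = [1.0, 2.0]
z1, z2 = [0.0, 0.0], [3.0, 4.0]

print(diversity_term(collapsed_generator, x, z1, z2, tau=10.0))
print(diversity_term(diverse_generator, x, z1, z2, tau=10.0))
```

Under this assumed form, the cap keeps the term from growing without limit, which is one way a single weight on the term could trade off diversity against visual quality.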

deep learning, neural network, regularization

arXiv:1901.09024

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
