AITopics | Jeon, Byeongguk



Latent Action Pretraining from Videos

Ye, Seonghyeon, Jang, Joel, Jeon, Byeongguk, Joo, Sejune, Yang, Jianwei, Peng, Baolin, Mandlekar, Ajay, Tan, Reuben, Chao, Yu-Wei, Lin, Bill Yuchen, Liden, Lars, Lee, Kimin, Gao, Jianfeng, Zettlemoyer, Luke, Fox, Dieter, Seo, Minjoon

arXiv.org Artificial Intelligence, Oct-15-2024

We introduce Latent Action Pretraining for general Action models (LAPA), an unsupervised method for pretraining Vision-Language-Action (VLA) models without ground-truth robot action labels. Existing VLA models require action labels during pretraining, typically collected by human teleoperators, which significantly limits possible data sources and scale. In this work, we propose a method to learn from internet-scale videos that do not have robot action labels. We first train an action quantization model, leveraging a VQ-VAE-based objective, to learn discrete latent actions between image frames, then pretrain a latent VLA model to predict these latent actions from observations and task descriptions, and finally finetune the VLA on small-scale robot manipulation data to map from latent to robot actions. Experimental results demonstrate that our method significantly outperforms existing techniques that train robot manipulation policies from large-scale videos. Furthermore, it outperforms the state-of-the-art VLA model trained with robotic action labels on real-world manipulation tasks that require language conditioning, generalization to unseen objects, and semantic generalization to unseen instructions. Training only on human manipulation videos also shows positive transfer, opening up the potential for leveraging web-scale data for robotics foundation models.
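The first stage described above, assigning a discrete latent action to each pair of consecutive frames, can be illustrated with a toy nearest-codebook lookup. This is only a sketch of the quantization step in a VQ-VAE-style model, not the LAPA implementation: the element-wise frame difference stands in for a learned encoder, nothing is trained, and the names `quantize`, `latent_actions`, `CODEBOOK`, and `FRAMES` are hypothetical.

```python
from typing import List, Sequence


def quantize(vector: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry nearest to `vector` (squared L2)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def latent_actions(frames: Sequence[Sequence[float]],
                   codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each consecutive frame pair to a discrete latent action index.

    Assumption: the "encoder" is just the element-wise difference between
    frames; a real model would use a learned encoder over image frames.
    """
    actions = []
    for prev, nxt in zip(frames, frames[1:]):
        delta = [b - a for a, b in zip(prev, nxt)]
        actions.append(quantize(delta, codebook))
    return actions


# Toy data (hypothetical): 2-D "frames" and a 3-entry codebook meaning
# "no change", "move along x", and "move along y".
CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
FRAMES = [[0.0, 0.0], [0.9, 0.1], [0.9, 1.0], [0.9, 1.0]]
ACTIONS = latent_actions(FRAMES, CODEBOOK)
print(ACTIONS)
```

In the pipeline the abstract outlines, indices like these would serve as pretraining targets for the latent VLA model, which is later finetuned to map latent actions to robot actions.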

artificial intelligence, lapa, latent action, (15 more...)

arXiv:2410.11758

Genre: Research Report > New Finding (0.87)

Industry: Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (0.86)


Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models

Kim, Gangwoo, Kim, Sungdong, Jeon, Byeongguk, Park, Joonsuk, Kang, Jaewoo

arXiv.org Artificial Intelligence, Oct-23-2023

Questions in open-domain question answering are often ambiguous, allowing multiple interpretations. One approach to handling them is to identify all possible interpretations of the ambiguous question (AQ) and to generate a long-form answer addressing them all, as suggested by Stelmakh et al. (2022). While this approach provides a comprehensive response without bothering the user for clarification, considering multiple dimensions of ambiguity and gathering corresponding knowledge remains a challenge. To cope with the challenge, we propose a novel framework, Tree of Clarifications (ToC), which recursively constructs a tree of disambiguations for the AQ, via few-shot prompting that leverages external knowledge, and uses the tree to generate a long-form answer. ToC outperforms existing baselines on ASQA in a few-shot setup across the metrics, while surpassing fully-supervised baselines trained on the whole training set in terms of Disambig-F1 and Disambig-ROUGE. Code is available at https://github.com/gankim/tree-of-clarifications.
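The recursive tree construction described above can be sketched as follows. This is a minimal illustration, not the code from the linked repository: the few-shot prompted, retrieval-augmented model is replaced by a hypothetical lookup table (`TOY`), the example questions are invented, and `Node`, `build_tree`, `leaves`, and `max_depth` are assumed names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One question in the tree; children are its disambiguated variants."""
    question: str
    children: List["Node"] = field(default_factory=list)


def build_tree(question: str,
               disambiguate: Callable[[str], List[str]],
               max_depth: int = 2) -> Node:
    """Recursively expand a question into more specific interpretations."""
    node = Node(question)
    if max_depth > 0:
        for sub_question in disambiguate(question):
            node.children.append(build_tree(sub_question, disambiguate, max_depth - 1))
    return node


def leaves(node: Node) -> List[str]:
    """Collect the most specific interpretations (leaf questions), left to right."""
    if not node.children:
        return [node.question]
    collected: List[str] = []
    for child in node.children:
        collected.extend(leaves(child))
    return collected


# Hypothetical stand-in for the model: a fixed table of disambiguations.
TOY: Dict[str, List[str]] = {
    "Who won the cup?": [
        "Who won the cup in soccer?",
        "Who won the cup in baseball?",
    ],
    "Who won the cup in soccer?": [
        "Who won the men's soccer cup?",
        "Who won the women's soccer cup?",
    ],
}


def toy_disambiguate(question: str) -> List[str]:
    return TOY.get(question, [])


TREE = build_tree("Who won the cup?", toy_disambiguate)
LEAVES = leaves(TREE)
print(LEAVES)
```

In a full system, each leaf interpretation would be answered with retrieved evidence and the answers combined into one long-form response; the depth limit here is an assumed safeguard against unbounded recursion.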

large language model, machine learning, natural language, (20 more...)

arXiv:2310.14696

Country:

Europe (0.93)
North America > United States (0.46)
Asia > Middle East > UAE (0.14)

Genre:

Research Report (0.82)
Personal > Honors (0.46)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Sports > Baseball (1.00)
Media > Television (0.70)
Media > Film (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
