
Collaborating Authors: Chen, Shang-Fu


Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning

arXiv.org Artificial Intelligence

Controllable generation through Stable Diffusion (SD) fine-tuning aims to improve fidelity, safety, and alignment with human guidance. Existing reinforcement learning from human feedback methods usually rely on predefined heuristic reward functions or pretrained reward models built on large-scale datasets, limiting their applicability to scenarios where collecting such data is costly or difficult. To utilize human feedback effectively and efficiently, we develop a framework, HERO, which leverages online human feedback collected on the fly during model learning. Specifically, HERO features two key mechanisms: (1) Feedback-Aligned Representation Learning, an online training method that captures human feedback and provides informative learning signals for fine-tuning, and (2) Feedback-Guided Image Generation, which involves generating images from SD's refined initialization samples, enabling faster convergence towards the evaluator's intent. We demonstrate that HERO requires 4x less online feedback than the best existing method for body-part anomaly correction. Additionally, experiments show that HERO can effectively handle tasks like reasoning, counting, personalization, and reducing NSFW content with only 0.5K pieces of online feedback.
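The two mechanisms can be sketched as an online loop: fit a scorer on (embedding, human feedback) pairs, then rank fresh candidate latents with it and keep the best as refined initializations. This is a minimal numpy sketch of that loop, not the paper's implementation; `generate`, the feedback simulation, and all dimensions are hypothetical stand-ins (real HERO would run Stable Diffusion and query a human evaluator).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

# Hypothetical stand-in: maps an initial latent to an image embedding.
def generate(latent):
    return np.tanh(latent)

# --- (1) Feedback-Aligned Representation Learning (sketch) ---
# Fit a linear scorer by logistic regression so that samples with positive
# human feedback score higher than those with negative feedback.
def fit_scorer(embeddings, feedback, lr=0.5, steps=200):
    w = np.zeros(DIM)
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(embeddings @ w)))
        w += lr * embeddings.T @ (feedback - probs) / len(feedback)
    return w

# --- (2) Feedback-Guided Image Generation (sketch) ---
# Rank fresh candidate latents by the learned scorer and keep the top-k
# as refined initializations for the next generation round.
def refine_initializations(w, n_candidates=64, k=4):
    latents = rng.normal(size=(n_candidates, DIM))
    scores = generate(latents) @ w
    return latents[np.argsort(scores)[-k:]]

# One simulated round: the (hidden) evaluator likes images whose first
# coordinate is positive; feedback stands in for human clicks.
latents = rng.normal(size=(32, DIM))
images = generate(latents)
feedback = (images[:, 0] > 0).astype(float)
w = fit_scorer(images, feedback)
refined = refine_initializations(w)
```

After one round, the refined initializations already lean toward the simulated evaluator's preference, which is the faster-convergence effect the abstract describes.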


Diffusion Model-Augmented Behavioral Cloning

arXiv.org Artificial Intelligence

Imitation learning addresses the challenge of learning by observing an expert's demonstrations without access to reward signals from environments. Most existing imitation learning methods that do not require interacting with environments either model the expert distribution as the conditional probability p(a|s) (e.g., behavioral cloning, BC) or the joint probability p(s, a). Despite its simplicity, modeling the conditional probability with BC usually struggles with generalization. While modeling the joint probability can improve generalization performance, the inference procedure is often time-consuming, and the model can suffer from manifold overfitting. This work proposes an imitation learning framework that benefits from modeling both the conditional and joint probability of the expert distribution. Our proposed diffusion model-augmented behavioral cloning (DBC) employs a diffusion model trained to model expert behaviors and learns a policy to optimize both the BC loss (conditional) and our proposed diffusion model loss (joint). DBC outperforms baselines in various continuous control tasks in navigation, robot arm manipulation, dexterous manipulation, and locomotion. We design additional experiments to verify the limitations of modeling either the conditional probability or the joint probability of the expert distribution, as well as compare different generative models. Ablation studies justify the effectiveness of our design choices. Recently, the success of deep reinforcement learning (DRL) (Mnih et al., 2015; Lillicrap et al., 2016; Arulkumaran et al., 2017) has inspired the research community to develop DRL frameworks to control robots, aiming to automate the process of designing sensing, planning, and control algorithms by letting the robot learn in an end-to-end fashion. 
Yet, acquiring complex skills through trial and error can still lead to undesired behaviors even with sophisticated reward design (Christiano et al., 2017; Leike et al., 2018; Lee et al., 2019). Moreover, the exploration process could damage expensive robotic platforms or even be dangerous to humans (García and Fernández, 2015; Levine et al., 2020). To overcome this issue, imitation learning (i.e., learning from demonstration) (Schaal, 1997; Osa et al., 2018) has received growing attention; it aims to learn a policy from expert demonstrations, which are often more accessible than appropriate reward functions for reinforcement learning. Among various imitation learning directions, adversarial imitation learning (Ho and Ermon, 2016; Zolna et al., 2021; Kostrikov et al., 2019) and inverse reinforcement learning (Ng and Russell, 2000; Abbeel and Ng, 2004) have achieved encouraging results in a variety of domains. Yet, these methods require interacting with environments, which can still be expensive or even dangerous. On the other hand, behavioral cloning (BC) (Pomerleau, 1989; Bain and Sammut, 1995) does not require interacting with environments.
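The combination described above, a BC (conditional) loss plus a diffusion-model (joint) loss over state-action pairs, can be sketched as follows. This is a hedged toy illustration, not the paper's training code: `diffusion_loss` uses a generic denoising-score objective as a surrogate for the proposed diffusion model loss, and the `denoiser`, noise scale, and weighting `lam` are hypothetical.

```python
import numpy as np

# BC (conditional) term: squared error between policy and expert actions.
def bc_loss(policy_action, expert_action):
    return float(np.mean((policy_action - expert_action) ** 2))

# Diffusion (joint) term: how well a denoiser predicts the noise added to
# the (state, action) pair produced by the policy -- a standard denoising
# objective standing in for the paper's diffusion model loss.
def diffusion_loss(state, policy_action, denoiser, rng, sigma=0.1):
    pair = np.concatenate([state, policy_action])
    noise = rng.normal(scale=sigma, size=pair.shape)
    predicted_noise = denoiser(pair + noise)
    return float(np.mean((predicted_noise - noise) ** 2))

# DBC objective (sketch): optimize both terms jointly, weighted by lam.
def dbc_loss(state, policy_action, expert_action, denoiser, rng, lam=0.5):
    return bc_loss(policy_action, expert_action) + lam * diffusion_loss(
        state, policy_action, denoiser, rng)
```

Because the diffusion term scores the joint (s, a) pair, it penalizes actions that are individually close to the expert's but land off the expert state-action manifold, which is the generalization benefit the abstract claims over BC alone.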


Learning Facial Liveness Representation for Domain Generalized Face Anti-spoofing

arXiv.org Artificial Intelligence

Face anti-spoofing (FAS) aims at distinguishing face spoof attacks from authentic faces, which is typically approached by learning proper models for performing the associated classification task. In practice, one would expect such models to generalize to FAS in different image domains. Moreover, it is not practical to assume that the type of spoof attack would be known in advance. In this paper, we propose a deep learning model for addressing the aforementioned domain-generalized face anti-spoofing task. In particular, our proposed network is able to disentangle the facial liveness representation from irrelevant ones (i.e., facial content and image domain features). The resulting liveness representation exhibits sufficient domain-invariant properties, and thus it can be applied for performing domain-generalized FAS. We conduct experiments on five benchmark datasets with various settings, and we verify that our model performs favorably against state-of-the-art approaches in identifying novel types of spoof attacks in unseen image domains.
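The disentanglement idea can be illustrated minimally: if an encoder's output is split into liveness, content, and domain factors and only the liveness factor feeds the spoof classifier, then content and domain variation cannot affect the decision. This sketch assumes a fixed split of a feature vector; the chunk sizes and the linear scorer are hypothetical stand-ins for the paper's learned disentangling network.

```python
import numpy as np

FEAT, LIVE, CONTENT = 12, 4, 4  # hypothetical feature layout

# Split an encoder feature into liveness / content / domain factors.
def split_representation(features):
    liveness = features[:LIVE]
    content = features[LIVE:LIVE + CONTENT]
    domain = features[LIVE + CONTENT:]
    return liveness, content, domain

# Spoof decision uses only the liveness factor, so it is invariant to
# content and domain factors by construction.
def spoof_score(features, w_live):
    liveness, _, _ = split_representation(features)
    return float(liveness @ w_live)
```

The point of the sketch is the invariance property: perturbing the domain chunk of the feature vector leaves the spoof score unchanged, which is the behavior the disentangled liveness representation is trained to achieve.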


Order-Free RNN With Visual Attention for Multi-Label Classification

AAAI Conferences

We propose a recurrent neural network (RNN) based model for image multi-label classification. Our model uniquely integrates visual attention and Long Short-Term Memory (LSTM) layers, which jointly learn the labels of interest and their co-occurrences while the associated image regions are visually attended. Unlike existing approaches that utilize either component alone in their network architectures, training of our model does not require predefined label orders. Moreover, a robust inference process is introduced so that prediction errors would not propagate.

A number of research works (Zhang and Zhou 2006; Nam et al. 2014; Gong et al. 2013; Wei et al. 2014; Wang et al. 2016) have started to advance CNN architectures for multi-label classification. Among them, CNN-RNN (Wang et al. 2016) embeds image and semantic structures by projecting both features into a joint embedding space. By further utilizing a Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber 1997) component, a recurrent neural network (RNN) structure is introduced to memorize long-term label dependency. As a result, CNN-RNN exhibits promising multi-label classification performance with cross-label correlation implicitly preserved.
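The order-free idea above can be sketched in a few lines: at each step, attend over region features to form a context vector, score all labels, and greedily emit whichever remaining label is most confident, so no label ordering is fixed in advance. The linear "attention" and "classifier" weights below are random stand-ins for learned layers, and the greedy selection is a simplification of the paper's LSTM-based decoding.

```python
import numpy as np

rng = np.random.default_rng(1)
R, D, L = 5, 6, 4                 # regions, feature dim, number of labels
regions = rng.normal(size=(R, D)) # hypothetical region features
W_att = rng.normal(size=D)        # attention scorer (stand-in)
W_cls = rng.normal(size=(D, L))   # label classifier (stand-in)

def predict_labels(k=2):
    emitted = []
    remaining = set(range(L))
    for _ in range(k):
        att = np.exp(regions @ W_att)
        att /= att.sum()              # soft attention over image regions
        context = att @ regions       # attended image feature
        scores = context @ W_cls
        # order-free step: emit the most confident label not yet produced
        best = max(remaining, key=lambda lbl: scores[lbl])
        emitted.append(best)
        remaining.remove(best)
    return emitted
```

Because each step picks the currently most confident unused label, the sequence ordering emerges from the model's confidence rather than from a predefined label order, which is the property the abstract emphasizes.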