AITopics | Instructional Material

Instructional Material

Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes

Neural Information Processing Systems | Oct-10-2024, 02:50:29 GMT

Continuously learning to solve unseen tasks with limited experience has been extensively pursued in meta-learning and continual learning, but under restrictive assumptions such as accessible task distributions, independently and identically distributed tasks, and clear task delineations. However, real-world physical tasks frequently violate these assumptions, resulting in performance degradation. This paper proposes a continual online model-based reinforcement learning approach that does not require pre-training to solve task-agnostic problems with unknown task boundaries. We maintain a mixture of experts to handle nonstationarity, and represent each distinct type of dynamics with a Gaussian Process to efficiently leverage collected data and expressively model uncertainty. We propose a transition prior to account for the temporal dependencies in streaming data and update the mixture online via sequential variational inference.

gaussian process, infinite mixture, task-agnostic online reinforcement learning

Neural Information Processing Systems

Genre: Instructional Material > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

UNIQ: Offline Inverse Q-learning for Avoiding Undesirable Demonstrations

Hoang, Huy, Mai, Tien, Varakantham, Pradeep

arXiv.org Artificial Intelligence | Oct-10-2024

We address the problem of offline learning a policy that avoids undesirable demonstrations. Unlike conventional offline imitation learning approaches that aim to imitate expert or near-optimal demonstrations, our setting involves avoiding undesirable behavior (specified using undesirable demonstrations). To tackle this problem, unlike standard imitation learning where the aim is to minimize the distance between learning policy and expert demonstrations, we formulate the learning task as maximizing a statistical distance, in the space of state-action stationary distributions, between the learning policy and the undesirable policy. This significantly different approach results in a novel training objective that necessitates a new algorithm to address it. Our algorithm, UNIQ, tackles these challenges by building on the inverse Q-learning framework, framing the learning problem as a cooperative (non-adversarial) task. We then demonstrate how to efficiently leverage unlabeled data for practical training. Our method is evaluated on standard benchmark environments, where it consistently outperforms state-of-the-art baselines. The code implementation can be accessed at: https://github.com/hmhuy0/UNIQ.

Reinforcement learning (RL) is a powerful framework for learning to maximize expected returns and has achieved remarkable success across various domains. However, applying reinforcement learning to real-world problems is challenging due to difficulties in designing reward functions and the requirement for extensive online interactions with the environment. While some approaches have addressed these challenges, they often rely on costly datasets, requiring either accurate labeling or clean, consistent data, which is often impractical. Imitation learning (Abbeel & Ng, 2004; Ziebart et al., 2008; Kelly et al., 2019) offers a more feasible alternative, enabling agents to learn directly from expert demonstrations without the need for explicit reward signals.

demonstration, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2410.08307

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Singapore (0.04)

Genre:

Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.48)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research

Choi, Seongjin, Jin, Zhixiong, Ham, Seung Woo, Kim, Jiwon, Sun, Lijun

arXiv.org Artificial Intelligence | Oct-10-2024

Deep Generative Models (DGMs) have rapidly advanced in recent years, becoming essential tools in various fields due to their ability to learn complex data distributions and generate synthetic data. Their importance in transportation research is increasingly recognized, particularly for applications like traffic data generation, prediction, and feature extraction. This paper offers a comprehensive introduction and tutorial on DGMs, with a focus on their applications in transportation. It begins with an overview of generative models, followed by detailed explanations of fundamental models, a systematic review of the literature, and practical tutorial code to aid implementation. The paper also discusses current challenges and opportunities, highlighting how these models can be effectively utilized and further developed in transportation research. This paper serves as a valuable reference, guiding researchers and practitioners from foundational knowledge to advanced applications of DGMs in transportation research.

data mining, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2410.07066

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec > Montreal (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
(11 more...)

Genre:

Workflow (1.00)
Research Report > Promising Solution (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Information Technology > Security & Privacy (1.00)
Transportation > Passenger (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(5 more...)

Meta-Learning from Learning Curves for Budget-Limited Algorithm Selection

Nguyen, Manh Hung, Sun-Hosoya, Lisheng, Guyon, Isabelle

arXiv.org Machine Learning | Oct-10-2024

Training a large set of machine learning algorithms to convergence in order to select the best-performing algorithm for a dataset is computationally wasteful. Moreover, in a budget-limited scenario, it is crucial to carefully select an algorithm candidate and allocate a budget for training it, ensuring that the limited budget is distributed to favor the most promising candidates. Casting this problem as a Markov Decision Process, we propose a novel framework in which an agent must select the most promising algorithm during learning, without waiting until it is fully trained. At each time step, given an observation of partial learning curves of algorithms, the agent must decide whether to allocate resources to further train the most promising algorithm (exploitation), to wake up another algorithm previously put to sleep, or to start training a new algorithm (exploration). In addition, our framework allows the agent to meta-learn from learning curves on past datasets along with dataset meta-features and algorithm hyperparameters. By incorporating meta-learning, we aim to avoid myopic decisions based solely on premature learning curves on the dataset at hand. We introduce two benchmarks of learning curves that were used in international competitions at WCCI'22 and AutoML-conf'22, and we analyze the results of those competitions. Our findings show that both meta-learning and the progression of learning curves enhance the algorithm selection process, as evidenced by the methods of the winning teams and our DDQN baseline, compared to heuristic baselines or a random search. Interestingly, our cost-effective baseline, which selects the best-performing algorithm w.r.t. a small budget, can perform decently when learning curves do not intersect frequently.
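The cost-effective baseline mentioned at the end of the abstract can be illustrated with a minimal sketch. The curve format, the function name `best_at_budget`, and the numbers below are hypothetical and are not taken from the paper or its benchmarks; the sketch only shows why picking the leader at a small budget works when curves do not cross and can fail when they do.

```python
def best_at_budget(curves, budget):
    """Return the algorithm whose last score observed within `budget` is highest.

    curves: dict mapping an algorithm name to a list of
            (cumulative_cost, score) pairs sorted by cost (hypothetical format).
    """
    def score_at(points):
        # Scores observed without exceeding the budget; none means "not started".
        reached = [score for cost, score in points if cost <= budget]
        return reached[-1] if reached else float("-inf")

    return max(curves, key=lambda name: score_at(curves[name]))


# Two made-up learning curves that intersect between cost 5 and cost 10.
curves = {
    "A": [(1, 0.50), (5, 0.60), (10, 0.62)],
    "B": [(1, 0.30), (5, 0.55), (10, 0.90)],
}

small_budget_pick = best_at_budget(curves, budget=1)
full_budget_pick = best_at_budget(curves, budget=10)
print(small_budget_pick, full_budget_pick)
```

Because the two curves cross, the small-budget choice differs from the full-budget one, which is the situation in which the abstract suggests such a baseline would struggle.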

algorithm, algorithm selection, dataset, (15 more...)

arXiv.org Machine Learning

2410.07696

Country:

North America > United States > California (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre:

Instructional Material (0.88)
Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Implicit Generation and Modeling with Energy Based Models

Neural Information Processing Systems | Oct-9-2024, 20:11:22 GMT

Energy-based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but have traditionally been difficult to train. We present techniques to scale MCMC-based EBM training on continuous neural networks, and we show their success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples than other likelihood models and nearing the performance of contemporary GAN approaches, while covering all modes of the data. We highlight some unique capabilities of implicit generation such as compositionality and corrupt image reconstruction and inpainting. Finally, we show that EBMs are useful models across a wide variety of tasks, achieving state-of-the-art out-of-distribution classification, adversarially robust classification, state-of-the-art continual online class learning, and coherent long-term predicted trajectory rollouts.
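The abstract does not say which MCMC scheme is used, so the following is only a sketch of one common choice, unadjusted Langevin dynamics, applied to a toy one-dimensional energy E(x) = x^2 / 2 instead of a neural network. The step size, chain length, and function names are assumptions made for illustration.

```python
import math
import random


def energy_grad(x: float) -> float:
    """Gradient of the toy energy E(x) = x^2 / 2, whose density is a standard normal."""
    return x


def langevin_chain(n_steps: int, step: float = 0.1, seed: int = 0) -> list:
    """Unadjusted Langevin dynamics: move down the energy gradient, then add Gaussian noise."""
    rng = random.Random(seed)
    x = rng.uniform(-1.0, 1.0)  # random initialisation of the chain
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step * energy_grad(x) + math.sqrt(step) * noise
        samples.append(x)
    return samples


samples = langevin_chain(20000)[2000:]  # discard the first steps as burn-in
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.2f} var={var:.2f}")
```

For this energy the samples should have mean close to 0 and variance close to 1; in an EBM the hand-written gradient would be replaced by the gradient of a learned energy network.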

classification, implicit generation and modeling

Neural Information Processing Systems

Genre: Instructional Material > Online (0.67)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Aligning Silhouette Topology for Self-Adaptive 3D Human Pose Recovery

Neural Information Processing Systems | Oct-9-2024, 19:20:35 GMT

Articulation-centric 2D/3D pose supervision forms the core training objective in most existing 3D human pose estimation techniques. Except for synthetic source environments, acquiring such rich supervision for each real target domain at deployment is highly inconvenient. However, we observe that standard foreground silhouette estimation techniques (on static camera feeds) remain unaffected by domain shifts. Motivated by this, we propose a novel target adaptation framework that relies only on silhouette supervision to adapt a source-trained model-based regressor. However, in the absence of any auxiliary cue (multi-view, depth, or 2D pose), an isolated silhouette loss fails to provide a reliable pose-specific gradient and must be employed in tandem with a topology-centric loss.

aligning silhouette topology, estimation technique, human pose recovery, (3 more...)

Neural Information Processing Systems

Genre: Instructional Material (0.62)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.64)
Information Technology > Artificial Intelligence > Machine Learning (0.54)

Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization

Neural Information Processing Systems | Oct-9-2024, 17:43:19 GMT

Recently, we have witnessed great progress in the field of medical imaging classification by adopting deep neural networks. However, recent advanced models still require access to sufficiently large and representative datasets for training, which is often infeasible in clinically realistic environments. When trained on limited datasets, a deep neural network lacks generalization capability: a network trained on data from one distribution (e.g. the data captured by a certain device vendor or patient population) may not generalize to data from another distribution. In this paper, we introduce a simple but effective approach to improve the generalization capability of deep neural networks in the field of medical imaging classification. Motivated by the observation that the domain variability of medical images is to some extent compact, we propose to learn a representative feature space through variational encoding with a novel linear-dependency regularization term to capture the shareable information among medical data collected from different domains.

deep neural network, generalization capability, linear-dependency regularization, (5 more...)

Neural Information Processing Systems

Genre:

Instructional Material > Online (0.93)
Instructional Material > Course Syllabus & Notes (0.93)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Online Continual Learning with Maximal Interfered Retrieval

Neural Information Processing Systems | Oct-9-2024, 14:03:12 GMT

Continual learning, the setting where a learning agent is faced with a never-ending stream of data, continues to be a great challenge for modern machine learning systems. In particular, the online or "single pass through the data" setting has gained attention recently as a natural setting that is difficult to tackle. Methods based on replay, either generative or from a stored memory, have been shown to be effective approaches for continual learning, matching or exceeding the state of the art in a number of standard benchmarks. These approaches typically rely on randomly selecting samples from the replay memory or from a generative model, which is suboptimal. In this work, we consider a controlled sampling of memories for replay.

maximal interfered retrieval, online continual learning, replay

Neural Information Processing Systems

Genre: Instructional Material > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Networks Fail to Learn Periodic Functions and How to Fix It

Neural Information Processing Systems | Oct-9-2024, 13:35:11 GMT

Previous literature offers limited clues on how to learn a periodic function using modern neural networks. We start with a study of the extrapolation properties of neural networks; we prove and demonstrate experimentally that the standard activation functions, such as ReLU, tanh, sigmoid, along with their variants, all fail to learn to extrapolate simple periodic functions. We hypothesize that this is due to their lack of a "periodic" inductive bias. As a fix for this problem, we propose a new activation, namely $x + \sin^2(x)$, which achieves the desired periodic inductive bias to learn a periodic function while maintaining the favorable optimization properties of ReLU-based activations. Experimentally, we apply the proposed method to temperature and financial data prediction.
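The proposed activation, read here as x + sin^2(x), is simple enough to sketch directly. The function names below are illustrative and not taken from the paper's code; the sketch only checks two properties that follow from the formula: the derivative 1 + sin(2x) is never negative, and shifting the input by pi shifts the output by exactly pi, so the periodic part repeats indefinitely.

```python
import math


def periodic_activation(x: float) -> float:
    """x + sin^2(x): a linear trend plus a periodic component."""
    return x + math.sin(x) ** 2


def periodic_activation_grad(x: float) -> float:
    """Derivative of x + sin^2(x), i.e. 1 + 2 sin(x) cos(x) = 1 + sin(2x)."""
    return 1.0 + math.sin(2.0 * x)


def relu(x: float) -> float:
    """Standard ReLU, shown for comparison: linear for x > 0, no periodic part."""
    return max(0.0, x)


for x in (0.0, math.pi / 2, math.pi, 10 * math.pi + 0.5):
    print(f"x={x:7.3f} relu={relu(x):7.3f} periodic={periodic_activation(x):7.3f}")
```

The gradient stays within [0, 2], so the function is monotone like ReLU on the positive axis, which is one way to read the abstract's remark about keeping favorable optimization properties.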

artificial intelligence, machine learning, neural network fail, (2 more...)

Neural Information Processing Systems

Genre: Instructional Material (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Improved Schemes for Episodic Memory-based Lifelong Learning

Neural Information Processing Systems | Oct-9-2024, 11:55:25 GMT

Current deep neural networks can achieve remarkable performance on a single task. However, when a deep neural network is continually trained on a sequence of tasks, it gradually forgets previously learned knowledge. This phenomenon is referred to as catastrophic forgetting and motivates the field called lifelong learning. Recently, episodic-memory-based approaches such as GEM and A-GEM have shown remarkable performance. In this paper, we provide the first unified view of episodic-memory-based approaches from an optimization perspective.
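The abstract names GEM and A-GEM without describing their updates. As a point of reference, the sketch below shows the gradient projection usually attributed to A-GEM: when the current-task gradient conflicts with a reference gradient computed on episodic memory, the conflicting component is removed. It illustrates that prior method under this assumption, not the improved schemes this paper proposes, and the plain-list vectors and function names are illustrative.

```python
def dot(a, b):
    """Inner product of two equal-length vectors stored as lists."""
    return sum(x * y for x, y in zip(a, b))


def agem_project(grad, ref_grad):
    """Project `grad` so it no longer points against `ref_grad`.

    grad:     gradient on the current task's mini-batch.
    ref_grad: gradient on a batch sampled from episodic memory.
    """
    alignment = dot(grad, ref_grad)
    if alignment >= 0:
        # No conflict with the memory gradient: use the gradient unchanged.
        return list(grad)
    # Conflict: subtract the component of grad along ref_grad.
    scale = alignment / dot(ref_grad, ref_grad)
    return [g - scale * r for g, r in zip(grad, ref_grad)]


conflicting = agem_project([1.0, -2.0], [0.0, 1.0])
compatible = agem_project([1.0, 1.0], [1.0, 0.0])
print(conflicting, compatible)
```

After projection the update is orthogonal to the memory gradient, so to first order it does not increase the loss on the stored examples.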

episodic memory-based lifelong learning, improved scheme, mega-rom, (3 more...)

Neural Information Processing Systems

Genre: Instructional Material (0.72)

Industry:

Health & Medicine > Consumer Health (0.96)
Education > Educational Setting > Continuing Education (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.96)
