
Channel-Wise MLPs Improve the Generalization of Recurrent Convolutional Networks

Breslow, Nathan

arXiv.org Artificial Intelligence

We investigate the impact of channel-wise mixing via multi-layer perceptrons (MLPs) on the generalization capabilities of recurrent convolutional networks. Specifically, we compare two architectures: DARC (Depth Aware Recurrent Convolution), which employs a simple recurrent convolutional structure, and DAMP (Depth Aware Multi-layer Perceptron), which extends DARC with a gated MLP for channel mixing. Using the Re-ARC benchmark, we find that DAMP significantly outperforms DARC in both in-distribution and out-of-distribution generalization under exact-match grading criteria. These results suggest that explicit channel mixing through MLPs enables recurrent convolutional networks to learn more robust and generalizable computational patterns. Our findings have implications for neural program synthesis and highlight the potential of DAMP as a target architecture for hypernetwork approaches.
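The channel-mixing contrast the abstract draws can be made concrete with a short PyTorch sketch: a weight-tied recurrent convolution (the DARC-style core) followed by a gated channel-wise MLP (the DAMP-style addition). The layer sizes, the sigmoid gating, and the residual connection are illustrative assumptions, not the authors' released architecture.

```python
import torch
import torch.nn as nn

class RecurrentConvCell(nn.Module):
    """DARC-style core: one convolution applied repeatedly (weight-tied over depth)."""
    def __init__(self, channels: int, steps: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.steps = steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x
        for _ in range(self.steps):  # recurrence over depth
            h = torch.relu(self.conv(h))
        return h

class GatedChannelMLP(nn.Module):
    """DAMP-style addition: a gated MLP that mixes channels at each spatial location."""
    def __init__(self, channels: int, expansion: int = 4):
        super().__init__()
        hidden = channels * expansion
        # 1x1 convolutions act as channel-wise (position-wise) linear layers.
        self.up_and_gate = nn.Conv2d(channels, 2 * hidden, kernel_size=1)
        self.down = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u, g = self.up_and_gate(x).chunk(2, dim=1)
        return x + self.down(u * torch.sigmoid(g))  # gated channel mixing + residual
```

Since a 1x1 convolution is exactly a position-wise MLP over channels, the gated block mixes information across channels at every pixel, while the recurrent convolution mixes it spatially.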


Off-Dynamics Reinforcement Learning via Domain Adaptation and Reward Augmented Imitation

Guo, Yihong, Wang, Yixuan, Shi, Yuanyuan, Xu, Pan, Liu, Anqi

arXiv.org Artificial Intelligence

Training a policy in a source domain for deployment in a target domain under a dynamics shift can be challenging, often resulting in performance degradation. Previous work tackles this challenge by training on the source domain with modified rewards derived from matching the distributions of source and target optimal trajectories. However, the modified reward alone only ensures that the learned policy's behavior in the source domain resembles trajectories produced by the target optimal policies; it does not guarantee optimal performance when the learned policy is actually deployed in the target domain. In this work, we propose to use imitation learning to transfer the policy learned via reward modification to the target domain, so that the new policy can generate the same trajectories in the target domain. Our approach, Domain Adaptation and Reward Augmented Imitation Learning (DARAIL), uses reward modification for domain adaptation and follows the general framework of generative adversarial imitation learning from observation (GAIfO), applying a reward-augmented estimator in the policy optimization step. Theoretically, we present an error bound for our method under a mild assumption on the dynamics shift to justify its motivation. Empirically, our method outperforms the pure modified-reward method without imitation learning, as well as other baselines, in benchmark off-dynamics environments.
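Reading the abstract, DARAIL combines two reward signals: a dynamics-gap correction used while training in the source domain, and a GAIfO-style imitation signal blended into the policy update. A minimal NumPy sketch of those two signals follows; the classifier-based form of the correction and the mixing weight `eta` are assumptions for illustration, not the paper's exact estimator.

```python
import numpy as np

def modified_reward(r, log_p_target_sas, log_p_source_sas):
    """Source-domain reward shaped by the dynamics gap (domain adaptation step)."""
    return r + (log_p_target_sas - log_p_source_sas)

def imitation_reward(disc_logit):
    """GAIfO-style reward from a discriminator scoring (s, s') pairs as 'expert'."""
    return -np.log1p(np.exp(-disc_logit))  # log D(s, s'), numerically stable

def augmented_reward(r_env, disc_logit, eta=0.5):
    """Reward-augmented estimator (assumed form): blend the environment reward
    with the imitation signal during policy optimization."""
    return eta * r_env + (1.0 - eta) * imitation_reward(disc_logit)
```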


Double Actor-Critic with TD Error-Driven Regularization in Reinforcement Learning

Chen, Haohui, Chen, Zhiyong, Liu, Aoxiang, Fang, Wentuo

arXiv.org Artificial Intelligence

To obtain better value estimates in reinforcement learning, we propose a novel algorithm based on the double actor-critic framework with temporal difference error-driven regularization, abbreviated as TDDR. TDDR employs double actors, with each actor paired with a critic, thereby fully leveraging the advantages of double critics. Additionally, TDDR introduces an innovative critic regularization architecture. Compared to classical deterministic policy gradient-based algorithms that lack a double actor-critic structure, TDDR provides superior value estimation. Moreover, unlike existing algorithms with double actor-critic frameworks, TDDR introduces no additional hyperparameters, significantly simplifying design and implementation. Experiments demonstrate that TDDR is strongly competitive with benchmark algorithms in challenging continuous control tasks.
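The abstract does not spell out the update, so the PyTorch sketch below only illustrates the general shape of a double actor-critic target with a TD error-driven regularizer: two target actors each propose an action, both critics evaluate each proposal, and the regularization is weighted by the TD error magnitude rather than by an extra hyperparameter. All of these specifics are assumptions reconstructed from the abstract, not the paper's algorithm.

```python
import torch

def tddr_target(r, s_next, done, gamma, actors_t, critics_t):
    """Bootstrapped target using two target actors and two target critics."""
    with torch.no_grad():
        qs = []
        for actor in actors_t:                      # each actor proposes an action
            a = actor(s_next)
            q = torch.min(critics_t[0](s_next, a),  # min over critics guards
                          critics_t[1](s_next, a))  # against overestimation
            qs.append(q)
        q_next = torch.min(qs[0], qs[1])
        return r + gamma * (1.0 - done) * q_next

def critic_loss(critic, s, a, target, q_other_detached):
    """MSE to the target plus a TD error-driven pull toward the other critic."""
    q = critic(s, a)
    td_error = (target - q).detach()
    weight = torch.tanh(td_error.abs())  # assumed hyperparameter-free weighting
    return ((q - target).pow(2) + weight * (q - q_other_detached).pow(2)).mean()
```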


A Trust Region Approach for Few-Shot Sim-to-Real Reinforcement Learning

Daoudi, Paul, Prieur, Christophe, Robu, Bogdan, Barlier, Merwan, Santos, Ludovic Dos

arXiv.org Machine Learning

Simulation-to-Reality Reinforcement Learning (Sim-to-Real RL) seeks to use simulations to minimize the need for extensive real-world interactions. Specifically, in the few-shot off-dynamics setting, the goal is to acquire, despite a dynamics mismatch, a simulator-based policy that can be effectively transferred to the real world using only a handful of real-world transitions. In this context, conventional RL agents tend to exploit simulation inaccuracies, resulting in policies that excel in the simulator but underperform in the real environment. To address this challenge, we introduce a novel approach, inspired by recent advances in Imitation Learning and Trust Region-based RL algorithms, that incorporates a penalty to constrain the trajectories induced by the simulator-trained policy. We evaluate our method across various environments representing diverse Sim-to-Real conditions where access to the real environment is extremely limited. These experiments include high-dimensional systems relevant to real-world applications. Across most tested scenarios, our proposed method demonstrates performance improvements over existing baselines.

Reinforcement Learning (RL) is often applied in simulation before deploying the learned policy on real systems (Ju et al., 2022; Muratore et al., 2019; Kaspar et al., 2020; Witman et al., 2019). This approach is considered one of the safest and most efficient ways of obtaining a near-optimal policy for complex systems (Jiang et al., 2021; Salvato et al., 2021; Hsu et al., 2023), as many of the challenges of applying RL to real-world systems (Dulac-Arnold et al., 2021) are mitigated. The agent can sample the simulator at will (Kamthe & Deisenroth, 2018; Schwarzer et al., 2021) without having to consider any safety constraints (García & Fernández, 2015; Achiam et al., 2017) during training. However, simulators of complex systems are often inaccurate: many physical laws, such as contact forces, material elasticity, and fluid dynamics, are difficult to model, leading simulators to rely on approximations (Koenig & Howard, 2004; Todorov et al., 2012).
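Returning to the penalty described in the abstract above: the text only states that trajectories induced by the simulator-trained policy are constrained, so the sketch below shows one plausible instantiation in PyTorch, shaping the simulator reward with the log-probability that a transition looks "real" according to a classifier trained on the few available real transitions. The classifier interface and the coefficient `lam` are assumptions for illustration, not the paper's method.

```python
import torch
import torch.nn.functional as F

def penalized_reward(r_sim, clf_logit, lam=1.0):
    """r_sim: simulator reward; clf_logit: classifier score that the
    transition (s, a, s') looks 'real' rather than 'simulated'.

    log-sigmoid of the logit is near 0 for realistic transitions and very
    negative for transitions that exploit simulator inaccuracies, so adding
    it penalizes trajectories the real system could not produce.
    """
    realism = F.logsigmoid(clf_logit)
    return r_sim + lam * realism
```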


Reinforcement Learning for Predicting Traffic Accidents

Cho, Injoon, Rajendran, Praveen Kumar, Kim, Taeyoung, Har, Dongsoo

arXiv.org Artificial Intelligence

As the demand for autonomous driving increases, ensuring safety is paramount. Early accident prediction using deep learning methods for driving safety has recently gained much attention. In this task, an early accident prediction and a point prediction of where the driver should look are determined, with dashcam video as input. We propose to exploit the Double Actors and Regularized Critics (DARC) method, for the first time, on this accident-forecasting platform. We draw inspiration from DARC since it is currently a state-of-the-art reinforcement learning (RL) model for continuous action spaces, making it suitable for accident anticipation. Results show that by utilizing DARC, we can make predictions 5% earlier on average while improving on multiple precision metrics compared to existing methods. These results imply that our RL-based problem formulation could significantly increase the safety of autonomous driving.
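The abstract implies a continuous action space combining a gaze point with an accident score. A hedged NumPy sketch of that formulation follows; the names, normalization, and earliness-shaped reward are all illustrative assumptions, not the paper's code.

```python
import numpy as np

def decode_action(policy_out):
    """policy_out: array of 3 floats in [-1, 1] from a DARC-style actor."""
    x, y, s = policy_out
    fixation = ((x + 1) / 2, (y + 1) / 2)  # predicted gaze point, normalized image coords
    accident_score = (s + 1) / 2           # probability-like accident anticipation score
    return fixation, accident_score

def earliness_reward(accident_score, frames_to_accident, horizon=90):
    """Reward confident predictions made earlier before the labeled accident frame."""
    earliness = np.clip(frames_to_accident / horizon, 0.0, 1.0)
    return accident_score * earliness
```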


Researchers use AI-based test to predict the retinal disease geographic atrophy

#artificialintelligence

As part of a study published in Progress in Retinal and Eye Research, 113 patients were examined using Detection of Apoptosing Retinal Cells (DARC) to detect areas of the eye indicative of the retinal disease geographic atrophy. The study was conducted by experts at Imperial College London. "DARC (Detection of Apoptosing Retinal Cells) is a retinal imaging technology that has been developed within the last 2 decades from basic laboratory science to Phase 2 clinical trials," according to the findings. "It uses ANX776 (fluorescently labelled Annexin A5) to identify stressed and apoptotic cells in the living eye. During its development, DARC has undergone biochemistry optimisation, scale-up and GMP manufacture and extensive preclinical evaluation."


It's all in the research: Using AI to solve issues in health care

#artificialintelligence

The University of Alberta uses SAS Viya to help its researchers expand their capacity for big data analysis and support the use of open source software and other tools popular among students. Conducting research is not a straightforward process, and the terabytes of data cascading into labs (both physical and virtual) require serious horsepower to analyze. Personal desktops and small servers are increasingly coming up short in meeting the demands of artificial intelligence and machine learning projects. Data also comes in various shapes and sizes. Researchers often combine data related to diagnostic imaging, risk prediction, clinical trials and much more.


Eye test uses AI to predict macular degeneration

Daily Mail - Science & tech

A new eye test that uses artificial intelligence (AI) to study retina scans can predict age-related macular degeneration (AMD) three years before symptoms start. The first part of the 'pioneering' test, developed by researchers at University College London, is called DARC. DARC involves injecting dye into a person's bloodstream to illuminate 'stressed' endothelial cells in the retina, so they appear bright white under a fluorescent camera. These 'stressed' retinal cells could lead to abnormalities and later leaking blood vessels – causing AMD, which can severely compromise the central field of vision. The second part of the test uses an AI algorithm, trained to detect whether the highlighted white spots are around the macula – which indicates high AMD risk.