Collaborating Authors

Structured Reinforcement Learning for Combinatorial Decision-Making

Hoppe, Heiko, Baty, Léo, Bouvier, Louis, Parmentier, Axel, Schiffer, Maximilian

arXiv.org Machine Learning

Reinforcement learning (RL) is increasingly applied to real-world problems involving complex and structured decisions, such as routing, scheduling, and assortment planning. These settings challenge standard RL algorithms, which struggle to scale, generalize, and exploit structure in the presence of combinatorial action spaces. We propose Structured Reinforcement Learning (SRL), a novel actor-critic framework that embeds combinatorial optimization layers into the actor neural network. We enable end-to-end learning of the actor via Fenchel-Young losses and provide a geometric interpretation of SRL as a primal-dual algorithm in the dual of the moment polytope. Across six environments with exogenous and endogenous uncertainty, SRL matches or surpasses the performance of unstructured RL and imitation learning on static tasks and improves over these baselines by up to 92% on dynamic problems, with improved stability and convergence speed.


Anthropic's Claude can now read your emails

Engadget

Anthropic announced that its Claude AI can integrate with Google Workspace. This tie-in allows the AI assistant to access any information in Gmail, Google Documents and Google Calendar. Enterprise-level customers even get a special cataloguing option for Documents that aims to offer even better speed and accuracy when retrieving information. This update could make Claude more helpful when it comes to using the chatbot for scheduling or accessing information within the Google ecosystem. The blog post with the announcement specified that the Enterprise option comes with special security controls for confidentiality, but doesn't detail if or how other users might be able to keep Claude from accessing sensitive information that might be stored in an email or document.


Optimization-Augmented Machine Learning for Vehicle Operations in Emergency Medical Services

Rautenstrauß, Maximiliane, Schiffer, Maximilian

arXiv.org Artificial Intelligence

Minimizing response times to meet legal requirements and serve patients in a timely manner is crucial for Emergency Medical Service (EMS) systems. Achieving this goal necessitates optimizing operational decision-making to efficiently manage ambulances. Against this background, we study a centrally controlled EMS system for which we learn an online ambulance dispatching and redeployment policy that aims at minimizing the mean response time of ambulances within the system by dispatching an ambulance upon receiving an emergency call and redeploying it to a waiting location upon the completion of its service. We propose a novel combinatorial optimization-augmented machine learning pipeline that makes it possible to learn efficient policies for ambulance dispatching and redeployment. In this context, we further show how to solve the underlying full-information problem to generate training data and propose an augmentation scheme that improves our pipeline's generalization performance by mitigating a possible distribution mismatch with respect to the considered state space. Compared to existing methods that rely on augmentation during training, our approach offers substantial runtime savings of up to 87.9% while yielding competitive performance. To evaluate the performance of our pipeline against current industry practices, we conduct a numerical case study using San Francisco's 911 call data. Results show that the learned policies outperform the online benchmarks across various resource and demand scenarios, yielding a reduction in mean response time of up to 30%.


Preference Elicitation for Multi-objective Combinatorial Optimization with Active Learning and Maximum Likelihood Estimation

Defresne, Marianne, Mandi, Jayanta, Guns, Tias

arXiv.org Artificial Intelligence

Real-life combinatorial optimization problems often involve several conflicting objectives, such as price, product quality and sustainability. A computationally efficient way to tackle multiple objectives is to aggregate them into a single-objective function, such as a linear combination. However, defining the weights of the linear combination upfront is hard; alternatively, the use of interactive learning methods that ask users to compare candidate solutions is highly promising. The key challenges are to generate candidates quickly, to learn an objective function that leads to high-quality solutions and to do so with few user interactions. We build upon the Constructive Preference Elicitation framework and show how each of the three properties can be improved: to increase the interaction speed, we investigate using pools of (relaxed) solutions; to improve the learning, we adopt Maximum Likelihood Estimation of a Bradley-Terry preference model; and to reduce the number of user interactions, we select the pair of candidates to compare with an ensemble-based acquisition function inspired by Active Learning. Our careful experimentation demonstrates each of these improvements: on a PC configuration task and a realistic multi-instance routing problem, our method selects queries faster, needs fewer queries, and synthesizes higher-quality combinatorial solutions than previous CPE methods.
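
As a rough illustration of the Bradley-Terry step mentioned in the abstract above, the sketch below fits objective weights by maximum likelihood from pairwise comparisons. It is not the authors' implementation: the linear utility over objective values, the plain gradient-ascent optimizer, the function name `fit_bradley_terry`, and the toy comparison data are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_bradley_terry(comparisons, n_objectives, lr=0.5, epochs=500):
    """Estimate linear utility weights w from pairwise preferences.

    Assumed model (hypothetical, for illustration):
        P(a preferred over b) = sigmoid(w . (f(a) - f(b)))
    where f(x) is the vector of objective values of solution x.
    `comparisons` is a list of (preferred_features, rejected_features).
    """
    w = [0.0] * n_objectives
    for _ in range(epochs):
        grad = [0.0] * n_objectives
        for preferred, rejected in comparisons:
            diff = [a - b for a, b in zip(preferred, rejected)]
            p = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
            # Gradient of log sigmoid(w . diff) with respect to w.
            for k in range(n_objectives):
                grad[k] += (1.0 - p) * diff[k]
        # Gradient ascent on the average log-likelihood.
        w = [wi + lr * g / len(comparisons) for wi, g in zip(w, grad)]
    return w


# Toy data: two objectives, utilities to be maximized. The last
# comparison contradicts the first one to mimic a noisy user answer.
comparisons = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((1.0, 1.0), (0.0, 1.0)),
    ((1.0, 0.0), (0.0, 0.0)),
    ((0.0, 1.0), (1.0, 0.0)),
]
weights = fit_bradley_terry(comparisons, n_objectives=2)

# Predicted probability that (1, 0) is preferred over (0, 0).
prob = sigmoid(sum(w * d for w, d in zip(weights, (1.0, 0.0))))
```

The learned `weights` could then serve as the coefficients of the aggregated single-objective function; the number of epochs is fixed here, so the sketch does not check convergence.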


Towards Constraint-Based Adaptive Hypergraph Learning for Solving Vehicle Routing: An End-to-End Solution

Wang, Zhenwei, Bai, Ruibin, Zhang, Tiehua

arXiv.org Artificial Intelligence

The application of learning-based methods to vehicle routing problems has emerged as a pivotal area of research in combinatorial optimization. These problems are characterized by vast solution spaces and intricate constraints, making traditional approaches such as exact mathematical models or heuristic methods prone to high computational overhead or reliant on the design of complex heuristic operators to achieve optimal or near-optimal solutions. Meanwhile, although some recent learning-based methods can produce good performance for VRP with straightforward constraint scenarios, they often fail to effectively handle hard constraints that are common in practice. This study introduces a novel end-to-end framework that combines constraint-oriented hypergraphs with reinforcement learning to address vehicle routing problems. A central innovation of this work is the development of a constraint-oriented dynamic hyperedge reconstruction strategy within an encoder, which significantly enhances hypergraph representation learning. Additionally, the decoder leverages a double-pointer attention mechanism to iteratively generate solutions. The proposed model is trained by incorporating asynchronous parameter updates informed by hypergraph constraints and optimizing a dual loss function comprising constraint loss and policy gradient loss. The experimental results on benchmark datasets demonstrate that the proposed approach not only eliminates the need for sophisticated heuristic operators but also achieves substantial improvements in solution quality.


Force Aware Branch Manipulation To Assist Agricultural Tasks

Rijal, Madhav, Shrestha, Rashik, Smith, Trevor, Gu, Yu

arXiv.org Artificial Intelligence

This study presents a methodology to safely manipulate branches to aid various agricultural tasks. Humans in a real agricultural environment often manipulate branches to perform agricultural tasks effectively, but current agricultural robots lack this capability. The proposed strategy can aid various precision agriculture tasks, such as fruit picking in dense foliage, pollinating flowers under occlusion, and moving overhanging vines and branches for navigation. The proposed method modifies RRT* to plan a path that satisfies the branches' geometric constraints and respects their deformable characteristics. Re-planning is done to obtain a path that helps the robot exert force within a desired range so that branches are not damaged during manipulation. Experimentally, this method achieved a success rate of 78% across 50 trials, successfully moving a branch from different starting points to a target region.


Achieving Green AI with Energy-Efficient Deep Learning Using Neuromorphic Computing

Communications of the ACM

Deep learning (DL) systems have been widely adopted in many industrial and business applications, dramatically improving human productivity, and enabling new industries. However, deep learning has a carbon emission problem. For example, training a single DL model can consume as much as 656,347 kilowatt-hours of energy and generate up to 626,155 pounds of CO2 emissions, approximately equal to the total lifetime carbon footprint of five cars. Therefore, in pursuit of sustainability, the computational and carbon costs of DL have to be reduced. Modeled after systems in the human brain and nervous system, neuromorphic computing has the potential to be the implementation of choice for low-power DL systems.


👾 Your guide to AI: March 2023

#artificialintelligence

Welcome to the latest issue of your guide to AI, an editorialized newsletter covering key developments in AI research, industry, geopolitics and startups during February 2023. We wrote an op-ed for Sifted on how generative AI will change the software landscape and commented for TIME's cover story on ChatGPT. On the politics side, we reviewed and recommended spinout policy reform in Tony Blair Institute for Global Change's paper A New National Purpose and were included in Politico's 20 people who matter in UK technology. Air Street was featured in Insider's list of top AI investors. See some of you at London.AI on Thurs 9 March w/DeepMind, Adept, Palantir and Basecamp Research. Register for our one-day RAAIS conference on research and applied AI 23 June 2023 in London. We'll be hosting speakers from Meta AI, Cruise, Intercom, Genentech, Northvolt and more to come! FYI, you might have to read this issue in full online vs. in your inbox. As usual, we love hearing what you're up to and what's on your mind, just hit reply or forward to your friends :-) Building large-scale AI models requires enormous computing power, which has emerged as the soft power of our time.


Developing an aging clock using deep learning on retinal images – Google AI Blog

#artificialintelligence

Aging is a process that is characterized by physiological and molecular changes that increase an individual's risk of developing diseases and eventually dying. Being able to measure and estimate the biological signatures of aging can help researchers identify preventive measures to reduce disease risk and impact. Researchers have developed "aging clocks" based on markers such as blood proteins or DNA methylation to measure individuals' biological age, which is distinct from one's chronological age. These aging clocks help predict the risk of age-related diseases. But because protein and methylation markers require a blood draw, non-invasive ways to find similar measures could make aging information more accessible.