AITopics

2403.07718

Country:

North America > Canada > Quebec > Montreal (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

arXiv.org Artificial IntelligenceJun-14-2024

Mix Q-learning for Lane Changing: A Collaborative Decision-Making Method in Multi-Agent Deep Reinforcement Learning

Bi, Xiaojun, He, Mingjie, Sun, Yiwen

Lane-changing decisions, which are crucial for autonomous vehicle path planning, face practical challenges due to rule-based constraints and limited data. Deep reinforcement learning has become a major research focus due to its advantages in data acquisition and interpretability. However, current models often overlook collaboration, which affects not only impacts overall traffic efficiency but also hinders the vehicle's own normal driving in the long run. To address the aforementioned issue, this paper proposes a method named Mix Q-learning for Lane Changing(MQLC) that integrates a hybrid value Q network, taking into account both collective and individual benefits for the greater good. At the collective level, our method coordinates the individual Q and global Q networks by utilizing global information. This enables agents to effectively balance their individual interests with the collective benefit. At the individual level, we integrated a deep learning-based intent recognition module into our observation and enhanced the decision network. These changes provide agents with richer decision information and more accurate feature extraction for improved lane-changing decisions. This strategy enables the multi-agent system to learn and formulate optimal decision-making strategies effectively. Our MQLC model, through extensive experimental results, impressively outperforms other state-of-the-art multi-agent decision-making methods, achieving significantly safer and faster lane-changing decisions.

agent, lane-changing decision, vehicle, (16 more...)

2406.09755

Country:

Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Beijing > Beijing (0.04)
South America > Brazil > São Paulo (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Transportation > Ground > Road (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Roy, Vivekananda, Khare, Kshitij, Hobert, James P.

The data augmentation algorithm

arXiv.org Machine LearningJun-14-2024

The data augmentation (DA) algorithms are popular Markov chain Monte Carlo (MCMC) algorithms often used for sampling from intractable probability distributions. This review article comprehensively surveys DA MCMC algorithms, highlighting their theoretical foundations, methodological implementations, and diverse applications in frequentist and Bayesian statistics. The article discusses tools for studying the convergence properties of DA algorithms. Furthermore, it contains various strategies for accelerating the speed of convergence of the DA algorithms, different extensions of DA algorithms and outlines promising directions for future research. This paper aims to serve as a resource for researchers and practitioners seeking to leverage data augmentation techniques in MCMC algorithms by providing key insights and synthesizing recent developments.

algorithm, da algorithm, markov chain, (16 more...)

arXiv.org Machine Learning

2406.10464

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > California > Alameda County > Hayward (0.04)

Genre:

Research Report (1.00)
Overview (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Yeganeh, Yavar Taheri, Jafari, Mohsen, Matta, Andrea

Active Inference Meeting Energy-Efficient Control of Parallel and Identical Machines

We investigate the application of active inference in developing energy-efficient control agents for manufacturing systems. Active inference, rooted in neuroscience, provides a unified probabilistic framework integrating perception, learning, and action, with inherent uncertainty quantification elements. Our study explores deep active inference, an emerging field that combines deep learning with the active inference decision-making framework. Leveraging a deep active inference agent, we focus on controlling parallel and identical machine workstations to enhance energy efficiency. We address challenges posed by the problem's stochastic nature and delayed policy response by introducing tailored enhancements to existing agent architectures. Specifically, we introduce multi-step transition and hybrid horizon methods to mitigate the need for complex planning. Our experimental results demonstrate the effectiveness of these enhancements and highlight the potential of the active inference-based approach.

active inference, agent, inference, (15 more...)

2406.09322

Genre:

Research Report (0.70)
Overview (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
(2 more...)

PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance

Gan, Qijun, Wang, Song, Wu, Shengtao, Zhu, Jianke

Recently, artificial intelligence techniques for education have been received increasing attentions, while it still remains an open problem to design the effective music instrument instructing systems. Although key presses can be directly derived from sheet music, the transitional movements among key presses require more extensive guidance in piano performance. In this work, we construct a piano-hand motion generation benchmark to guide hand movements and fingerings for piano playing. To this end, we collect an annotated dataset, PianoMotion10M, consisting of 116 hours of piano playing videos from a bird's-eye view with 10 million annotated hand poses. We also introduce a powerful baseline model that generates hand motions from piano audios through a position predictor and a position-guided gesture generator. Furthermore, a series of evaluation metrics are designed to assess the performance of the baseline model, including motion similarity, smoothness, positional accuracy of left and right hands, and overall fidelity of movement distribution. Despite that piano key presses with respect to music scores or audios are already accessible, PianoMotion10M aims to provide guidance on piano fingering for instruction purposes.

dataset, hand motion, pianomotion10m, (15 more...)

2406.09326

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.50)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Bagewadi, Smita, Chatterjee, Avhishek

Learning the Influence Graph of a High-Dimensional Markov Process with Memory

Motivated by multiple applications in social networks, nervous systems, and financial risk analysis, we consider the problem of learning the underlying (directed) influence graph or causal graph of a high-dimensional multivariate discrete-time Markov process with memory. At any discrete time instant, each observed variable of the multivariate process is a binary string of random length, which is parameterized by an unobservable or hidden [0,1]-valued scalar. The hidden scalars corresponding to the variables evolve according to discrete-time linear stochastic dynamics dictated by the underlying influence graph whose nodes are the variables. We extend an existing algorithm for learning i.i.d. graphical models to this Markovian setting with memory and prove that it can learn the influence graph based on the binary observations using logarithmic (in number of variables or nodes) samples when the degree of the influence graph is bounded. The crucial analytical contribution of this work is the derivation of the sample complexity result by upper and lower bounding the rate of convergence of the observed Markov process with memory to its stationary distribution in terms of the parameters of the influence graph.

algorithm, graph, probability, (13 more...)

2406.09338

Country:

North America > United States (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

ALPHAGMUT: A Rationale-Guided Alpha Shape Graph Neural Network to Evaluate Mutation Effects

Wang, Boshen, Ye, Bowei, Xu, Lin, Liang, Jie

In silico methods evaluating the mutation effects of missense mutations are providing an important approach for understanding mutations in personal genomes and identifying disease-relevant biomarkers. However, existing methods, including deep learning methods, heavily rely on sequence-aware information, and do not fully leverage the potential of available 3D structural information. In addition, these methods may exhibit an inability to predict mutations in domains difficult to formulate sequence-based embeddings. In this study, we introduce a novel rationale-guided graph neural network AlphaGMut to evaluate mutation effects and to distinguish pathogenic mutations from neutral mutations. We compute the alpha shapes of protein structures to obtain atomic-resolution edge connectivities and map them to an accurate residue-level graph representation. We then compute structural-, topological-, biophysical-, and sequence properties of the mutation sites, which are assigned as node attributes in the graph. These node attributes could effectively guide the graph neural network to learn the difference between pathogenic and neutral mutations using k-hop message passing with a short training period. We demonstrate that AlphaGMut outperforms state-of-the-art methods, including DeepMind's AlphaMissense, in many performance metrics. In addition, AlphaGMut has the advantage of performing well in alignment-free settings, which provides broader prediction coverage and better generalization compared to current methods requiring deep sequence-aware information.

graph, mutation, residue, (16 more...)

2406.09159

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Khan, Imran, Lowe, Robert

Surprise! Using Physiological Stress for Allostatic Regulation Under the Active Inference Framework [Pre-Print]

arXiv.org Artificial IntelligenceJun-12-2024

Note: This manuscript has been accepted for publication at a conference in 2024 and will be published under the same title. The version in this pre-print will undergo minor edits and thus does not represent the final version of this work. Abstract-- Allostasis proposes that long-term viability of a living system is achieved through anticipatory adjustments of its physiology and behaviour: emphasising physiological and affective stress as an adaptive state of adaptation that minimizes long-term prediction errors. More recently, the active inference framework (AIF) has also sought to explain action and long-term adaptation through the minimization of future errors (free energy), through the learning of statistical contingencies of the world, offering a formalism for allostatic regulation. We suggest that framing prediction errors through the lens of biological hormonal dynamics proposed by allostasis offers a way to integrate these two models together in a biologically-plausible manner. In this paper, we describe our initial work in developing a model that grounds prediction errors (surprisal) into the secretion of a physiological stress hormone (cortisol) acting as an adaptive, allostatic mediator on a homeostatically-controlled physiology. We evaluate this using a computational model in simulations using an active inference agent endowed with an artificial physiology, regulated through homeostatic and allostatic control in a stochastic environment. Our results find that allostatic functions of cortisol (stress), secreted as a function of prediction errors, provide adaptive advantages to the agent's longterm physiological regulation. We argue that the coupling of information-theoretic prediction errors to low-level, biological hormonal dynamics of stress can provide a computationally efficient model to long-term regulation for embodied intelligent systems. A. Background In both biological and artificial systems, mechanisms of adaptation are critical to long-term stability and viability in dynamic, unpredictable environments.

agent, generative model, prediction error, (15 more...)

2406.08471

Country:

Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceJun-12-2024

Grounding Multimodal Large Language Models in Actions

Szot, Andrew, Mazoure, Bogdan, Agrawal, Harsh, Hjelm, Devon, Kira, Zsolt, Toshev, Alexander

Multimodal Large Language Models (MLLMs) have demonstrated a wide range of capabilities across many domains, including Embodied AI. In this work, we study how to best ground a MLLM into different embodiments and their associated action spaces, with the goal of leveraging the multimodal world knowledge of the MLLM. We first generalize a number of methods through a unified architecture and the lens of action space adaptors. For continuous actions, we show that a learned tokenization allows for sufficient modeling precision, yielding the best performance on downstream tasks. For discrete actions, we demonstrate that semantically aligning these actions with the native output token space of the MLLM leads to the strongest performance. We arrive at these lessons via a thorough study of seven action space adapters on five different environments, encompassing over 114 embodied tasks.

action space, arxiv preprint arxiv, mllm, (14 more...)

2406.07904

Country: North America > United States > Indiana (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceJun-12-2024

A Federated Online Restless Bandit Framework for Cooperative Resource Allocation

Tong, Jingwen, Li, Xinran, Fu, Liqun, Zhang, Jun, Letaief, Khaled B.

Restless multi-armed bandits (RMABs) have been widely utilized to address resource allocation problems with Markov reward processes (MRPs). Existing works often assume that the dynamics of MRPs are known prior, which makes the RMAB problem solvable from an optimization perspective. Nevertheless, an efficient learning-based solution for RMABs with unknown system dynamics remains an open problem. In this paper, we study the cooperative resource allocation problem with unknown system dynamics of MRPs. This problem can be modeled as a multi-agent online RMAB problem, where multiple agents collaboratively learn the system dynamics while maximizing their accumulated rewards. We devise a federated online RMAB framework to mitigate the communication overhead and data privacy issue by adopting the federated learning paradigm. Based on this framework, we put forth a Federated Thompson Sampling-enabled Whittle Index (FedTSWI) algorithm to solve this multi-agent online RMAB problem. The FedTSWI algorithm enjoys a high communication and computation efficiency, and a privacy guarantee. Moreover, we derive a regret upper bound for the FedTSWI algorithm. Finally, we demonstrate the effectiveness of the proposed algorithm on the case of online multi-user multi-channel access. Numerical results show that the proposed algorithm achieves a fast convergence rate of $\mathcal{O}(\sqrt{T\log(T)})$ and better performance compared with baselines. More importantly, its sample complexity decreases with the number of agents.

algorithm, fedtswi algorithm, system dynamic, (15 more...)

2406.07992

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Asia > China > Fujian Province > Xiamen (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)