AITopics

2501.06058

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Law Enforcement & Public Safety > Fire & Emergency Services (0.30)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Ata, Karimeh Ibrahim Mohammad, Hassan, Mohd Khair, Ismaeel, Ayad Ghany, Al-Haddad, Syed Abdul Rahman, Alquthami, Thamer, Alani, Sameer

A Multi-Layer CNN-GRUSKIP model based on transformer for spatial TEMPORAL traffic flow prediction

Traffic flow prediction remains a cornerstone for intelligent transportation systems ITS, influencing both route optimization and environmental efforts. While Recurrent Neural Networks RNN and traditional Convolutional Neural Networks CNN offer some insights into the spatial temporal dynamics of traffic data, they are often limited when navigating sparse and extended spatial temporal patterns. In response, the CNN-GRUSKIP model emerges as a pioneering approach. Notably, it integrates the GRU-SKIP mechanism, a hybrid model that leverages the Gate Recurrent Unit of GRU capabilities to process sequences with the SKIP feature of ability to bypass and connect longer temporal dependencies, making it especially potent for traffic flow predictions with erratic and extended patterns. Another distinctive aspect is its non-standard 6-layer CNN, meticulously designed for in-depth spatiotemporal correlation extraction. The model comprises (1) the specialized CNN feature extraction, (2) the GRU-SKIP enhanced long-temporal module adept at capturing extended patterns, (3) a transformer module employing encoder-decoder and multi-attention mechanisms to hone prediction accuracy and trim model complexity, and (4) a bespoke prediction module. When tested against real-world datasets from California of Caltrans Performance Measurement System PeMS, specifically PeMS districts 4 and 8, the CNN-GRUSKIP consistently outperformed established models such as ARIMA, Graph Wave Net, HA, LSTM, STGCN, and APTN. With its potent predictive prowess and adaptive architecture, the CNN-GRUSKIP model stands to redefine ITS applications, especially where nuanced traffic dynamics are in play.

ain sham engineering journal 15, module, prediction, (10 more...)

doi: 10.1016/j.asej.2024.103045

2501.07593

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.25)
Asia > Malaysia (0.05)
Europe > Switzerland (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Consumer Products & Services > Travel (0.95)
Energy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Generative Flow Networks: Theory and Applications to Structure Learning

Deleu, Tristan

forward and backward transition probability, intractable normalization constant, terminating state distribution, (17 more...)

Without any assumptions about data generation, multiple causal models may explain our observations equally well. To avoid selecting a single arbitrary model that could result in unsafe decisions if it does not match reality, it is therefore essential to maintain a notion of epistemic uncertainty about our possible candidates. This thesis studies the problem of structure learning from a Bayesian perspective, approximating the posterior distribution over the structure of a causal model, represented as a directed acyclic graph (DAG), given data. It introduces Generative Flow Networks (GFlowNets), a novel class of probabilistic models designed for modeling distributions over discrete and compositional objects such as graphs. They treat generation as a sequential decision making problem, constructing samples of a target distribution defined up to a normalization constant piece by piece. In the first part of this thesis, we present the mathematical foundations of GFlowNets, their connections to existing domains of machine learning and statistics such as variational inference and reinforcement learning, and their extensions beyond discrete problems. In the second part of this thesis, we show how GFlowNets can approximate the posterior distribution over DAG structures of causal Bayesian Networks, along with the parameters of its causal mechanisms, given observational and experimental data.

2501.05498

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
North America > Canada > Ontario > Toronto (0.13)
North America > Canada > Quebec > Montreal (0.04)
(8 more...)

Genre:

Overview (0.92)
Personal > Honors (0.67)
Research Report > New Finding (0.45)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

CoDe: Communication Delay-Tolerant Multi-Agent Collaboration via Dual Alignment of Intent and Timeliness

Song, Shoucheng, Lin, Youfang, Han, Sheng, Yao, Chang, Wu, Hao, Wang, Shuo, Lv, Kai

Communication has been widely employed to enhance multi-agent collaboration. Previous research has typically assumed delay-free communication, a strong assumption that is challenging to meet in practice. However, real-world agents suffer from channel delays, receiving messages sent at different time points, termed {\it{Asynchronous Communication}}, leading to cognitive biases and breakdowns in collaboration. This paper first defines two communication delay settings in MARL and emphasizes their harm to collaboration. To handle the above delays, this paper proposes a novel framework, Communication Delay-tolerant Multi-Agent Collaboration (CoDe). At first, CoDe learns an intent representation as messages through future action inference, reflecting the stable future behavioral trends of the agents. Then, CoDe devises a dual alignment mechanism of intent and timeliness to strengthen the fusion process of asynchronous messages. In this way, agents can extract the long-term intent of others, even from delayed messages, and selectively utilize the most recent messages that are relevant to their intent. Experimental results demonstrate that CoDe outperforms baseline algorithms in three MARL benchmarks without delay and exhibits robustness under fixed and time-varying delays.

agent, communication, zhang, (9 more...)

2501.05207

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Kortus, Tobias, Keidel, Ralf, Gauger, Nicolas R., Kieseler, Jan

Constrained Optimization of Charged Particle Tracking with Multi-Agent Reinforcement Learning

Reinforcement learning demonstrated immense success in modelling complex physics-driven systems, providing end-to-end trainable solutions by interacting with a simulated or real environment, maximizing a scalar reward signal. In this work, we propose, building upon previous work, a multi-agent reinforcement learning approach with assignment constraints for reconstructing particle tracks in pixelated particle detectors. Our approach optimizes collaboratively a parametrized policy, functioning as a heuristic to a multidimensional assignment problem, by jointly minimizing the total amount of particle scattering over the reconstructed tracks in a readout frame. To satisfy constraints, guaranteeing a unique assignment of particle hits, we propose a safety layer solving a linear assignment problem for every joint action. Further, to enforce cost margins, increasing the distance of the local policies predictions to the decision boundaries of the optimizer mappings, we recommend the use of an additional component in the blackbox gradient estimation, forcing the policy to solutions with lower total assignment costs. We empirically show on simulated data, generated for a particle detector developed for proton imaging, the effectiveness of our approach, compared to multiple single- and multi-agent baselines. We further demonstrate the effectiveness of constraints with cost margins for both optimization and generalization, introduced by wider regions with high reconstruction performance as well as reduced predictive instabilities. Our results form the basis for further developments in RL-based tracking, offering both enhanced performance with constrained policies and greater flexibility in optimizing tracking algorithms through the option for individual and team rewards.

agent, particle, university, (14 more...)

2501.05113

Country:

Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.05)
Europe > Norway > Eastern Norway > Oslo (0.04)
Europe > Hungary > Budapest > Budapest (0.04)
(15 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Nuclear Medicine (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Zhao, Yujie, Escamill, Jose Efraim Aguilar, Lu, Weyl, Wang, Huazheng

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning

Reinforcement Learning from Human Feedback (RLHF) has recently surged in popularity, particularly for aligning large language models and other AI systems with human intentions. At its core, RLHF can be viewed as a specialized instance of Preference-based Reinforcement Learning (PbRL), where the preferences specifically originate from human judgments rather than arbitrary evaluators. Despite this connection, most existing approaches in both RLHF and PbRL primarily focus on optimizing a mean reward objective, neglecting scenarios that necessitate risk-awareness, such as AI safety, healthcare, and autonomous driving. These scenarios often operate under a one-episode-reward setting, which makes conventional risk-sensitive objectives inapplicable. To address this, we explore and prove the applicability of two risk-aware objectives to PbRL: nested and static quantile risk objectives. We also introduce Risk-Aware-PbRL (RA-PbRL), an algorithm designed to optimize both nested and static objectives. Additionally, we provide a theoretical analysis of the regret upper bounds, demonstrating that they are sublinear with respect to the number of episodes, and present empirical results to support our findings.

algorithm, objective, trajectory, (15 more...)

2410.23569

Country:

North America > United States > Oregon (0.04)
North America > United States > California > Yolo County > Davis (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Junior, Francisco Erivaldo Fernandes, Oulasvirta, Antti

AgentForge: A Flexible Low-Code Platform for Reinforcement Learning Agent Design

Developing a reinforcement learning (RL) agent often involves identifying values for numerous parameters, covering the policy, reward function, environment, and agent-internal architecture. Since these parameters are interrelated in complex ways, optimizing them is a black-box problem that proves especially challenging for nonexperts. Although existing optimization-as-a-service platforms (e.g., Vizier and Optuna) can handle such problems, they are impractical for RL systems, since the need for manual user mapping of each parameter to distinct components makes the effort cumbersome. It also requires understanding of the optimization process, limiting the systems' application beyond the machine learning field and restricting access in areas such as cognitive science, which models human decision-making. To tackle these challenges, the paper presents AgentForge, a flexible low-code platform to optimize any parameter set across an RL system. Available at https://github.com/feferna/AgentForge, it allows an optimization problem to be defined in a few lines of code and handed to any of the interfaced optimizers. With AgentForge, the user can optimize the parameters either individually or jointly. The paper presents an evaluation of its performance for a challenging vision-based RL problem.

agent, gentf orge, optimization, (13 more...)

2410.19528

Country:

Europe > Finland (0.05)
South America > Ecuador > Pichincha Province > Quito (0.04)
North America > United States (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Artificial IntelligenceJan-8-2025

Microservice Deployment in Space Computing Power Networks via Robust Reinforcement Learning

Yu, Zhiyong, Jiang, Yuning, Liu, Xin, Shi, Yuanming, Jiang, Chunxiao, Kuang, Linling

With the growing demand for Earth observation, it is important to provide reliable real-time remote sensing inference services to meet the low-latency requirements. The Space Computing Power Network (Space-CPN) offers a promising solution by providing onboard computing and extensive coverage capabilities for real-time inference. This paper presents a remote sensing artificial intelligence applications deployment framework designed for Low Earth Orbit satellite constellations to achieve real-time inference performance. The framework employs the microservice architecture, decomposing monolithic inference tasks into reusable, independent modules to address high latency and resource heterogeneity. This distributed approach enables optimized microservice deployment, minimizing resource utilization while meeting quality of service and functional requirements. We introduce Robust Optimization to the deployment problem to address data uncertainty. Additionally, we model the Robust Optimization problem as a Partially Observable Markov Decision Process and propose a robust reinforcement learning algorithm to handle the semi-infinite Quality of Service constraints. Our approach yields sub-optimal solutions that minimize accuracy loss while maintaining acceptable computational costs. Simulation results demonstrate the effectiveness of our framework.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2501.06244

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States (0.04)
Europe > Switzerland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry: Energy (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Shakhadri, Syed Abdul Gaffar, KR, Kruthika, Angadi, Kartik Basavaraj

Samba-ASR: State-Of-The-Art Speech Recognition Leveraging Structured State-Space Models

arXiv.org Artificial IntelligenceJan-8-2025

The rapid evolution of deep learning has significantly transformed Automatic Speech Recognition (ASR), shifting from traditional systems such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) to advanced end-to-end neural architectures. While innovations such as Connectionist Temporal Classification (CTC) and attentionbased encoder-decoder models have established new baselines [1], transformer-based models like OpenAI's Whisper have further pushed the boundaries, setting state-of-the-art benchmarks for multilingual, multitask ASR systems [2]. Despite their successes, transformer architectures face inherent challenges in scaling to long sequences, particularly those encountered in extended audio recordings.

architecture, samba-asr, sequence, (14 more...)

2501.02832

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.46)
Media (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Ke, Naichang, Tanaka, Ryogo, Kawahara, Yoshinobu

Learning Stochastic Nonlinear Dynamics with Embedded Latent Transfer Operators

arXiv.org Artificial IntelligenceJan-8-2025

We consider an operator-based latent Markov representation of a stochastic nonlinear dynamical system, where the stochastic evolution of the latent state embedded in a reproducing kernel Hilbert space is described with the corresponding transfer operator, and develop a spectral method to learn this representation based on the theory of stochastic realization. The embedding may be learned simultaneously using reproducing kernels, for example, constructed with feed-forward neural networks. We also address the generalization of sequential state-estimation (Kalman filtering) in stochastic nonlinear systems, and of operator-based eigen-mode decomposition of dynamics, for the representation. Several examples with synthetic and real-world data are shown to illustrate the empirical characteristics of our methods, and to investigate the performance of our model in sequential state-estimation and mode decomposition.

decomposition, experiment, operator, (15 more...)

2501.02721

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)