AITopics | Markov Models

Markov Models

X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner

Jiang, Haoyuan, Li, Ziyue, Wei, Hua, Xiong, Xuantang, Ruan, Jingqing, Lu, Jiaming, Mao, Hangyu, Zhao, Rui

arXiv.org Artificial Intelligence, Jun-17-2024

The effectiveness of traffic light control has been significantly improved by current reinforcement learning-based approaches via better cooperation among multiple traffic lights. However, a persisting issue remains: how to obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named X-Light. We input the full Markov Decision Process trajectories; the Lower Transformer aggregates the states, actions, and rewards among the target intersection and its neighbors within a city, and the Upper Transformer learns the general decision trajectories across different cities. This dual-level approach bolsters the model's robust generalization and transferability. Notably, when directly transferring to unseen scenarios, X-Light surpasses all baseline methods by 7.91% on average, and by as much as 16.3% in some cases, yielding the best results.

intersection, scenario, transformer, (15 more...)

2404.1209

Country:

Asia > China (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arizona (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Reward Machines for Deep RL in Noisy and Uncertain Environments

Li, Andrew C., Chen, Zizhao, Klassen, Toryn Q., Vaezipoor, Pashootan, Icarte, Rodrigo Toro, McIlraith, Sheila A.

arXiv.org Artificial Intelligence, Jun-17-2024

Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure, they enable counterfactual learning updates that have resulted in impressive sample efficiency gains. While Reward Machines have been employed in both tabular and deep RL settings, they have typically relied on a ground-truth interpretation of the domain-specific vocabulary that forms the building blocks of the reward function. Such ground-truth interpretations can be elusive in many real-world settings, due in part to partial observability or noisy sensing. In this paper, we explore the use of Reward Machines for Deep RL in noisy and uncertain environments. We characterize this problem as a POMDP and propose a suite of RL algorithms that leverage task structure under uncertain interpretation of domain-specific vocabulary. Theoretical analysis exposes pitfalls in naive approaches to this problem, while experimental results show that our algorithms successfully leverage task structure to improve performance under noisy interpretations of the vocabulary. Our results provide a general framework for exploiting Reward Machines in partially observable environments.
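For readers unfamiliar with the structure this abstract builds on, a reward machine can be sketched as a small finite-state machine whose transitions fire on propositions and emit rewards. The sketch below is illustrative only and is not the authors' code: the class `RewardMachine`, the states `u0`/`u1`/`u_done`, and the propositions `key` and `door` are all hypothetical. It also assumes the propositions are observed exactly, which is precisely the ground-truth interpretation the paper says is often unavailable.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Toy reward machine: transitions fire on propositions and emit rewards."""

    initial_state: str
    # Maps (state, proposition) -> (next_state, reward). Hypothetical layout.
    transitions: dict
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial_state

    def reset(self):
        self.state = self.initial_state

    def step(self, true_props):
        """Advance on the propositions that hold this step; return the reward."""
        for prop in sorted(true_props):
            key = (self.state, prop)
            if key in self.transitions:
                self.state, reward = self.transitions[key]
                return reward
        return 0.0  # no matching transition: stay in place, no reward


# Hypothetical task: "pick up the key, then reach the door".
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "key"): ("u1", 0.0),
        ("u1", "door"): ("u_done", 1.0),
    },
)

# Assumes a perfect labelling of each step (no sensing noise).
rewards = [rm.step(props) for props in [{"door"}, {"key"}, set(), {"door"}]]
```

Reaching the door before holding the key earns nothing, because the machine is still in `u0`; the reward only arrives once the steps happen in the specified order. With noisy labels, the agent would have to track a belief over machine states instead of a single `state` value.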

abstraction model, agent, international conference, (13 more...)

2406.0012

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Chile (0.04)
Europe > Italy (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Transportation (0.32)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

Sparsity-Constraint Optimization via Splicing Iteration

Wang, Zezhi, Zhu, Jin, Zhu, Junxian, Tang, Borui, Lin, Hongmei, Wang, Xueqin

arXiv.org Machine Learning, Jun-17-2024

Sparsity-constraint optimization has wide applicability in signal processing, statistics, and machine learning. Existing fast algorithms require burdensome parameter tuning, such as choosing the step size or implementing precise stopping criteria, which may be challenging to determine in practice. To address this issue, we develop an algorithm named Sparsity-Constraint Optimization via sPlicing itEration (SCOPE) to optimize nonlinear differentiable objective functions with strong convexity and smoothness in low dimensional subspaces. Algorithmically, the SCOPE algorithm converges effectively without tuning parameters. Theoretically, SCOPE has a linear convergence rate and converges to a solution that recovers the true support set when it correctly specifies the sparsity. We also develop parallel theoretical results without restricted-isometry-property-type conditions. We demonstrate SCOPE's versatility and power by solving sparse quadratic optimization, learning sparse classifiers, and recovering sparse Markov networks for binary variables. The numerical results on these specific tasks reveal that SCOPE perfectly identifies the true support set with a 10x to 1000x speedup over the standard exact solver, confirming SCOPE's algorithmic and theoretical merits. Our open-source Python package skscope, based on a C++ implementation, is publicly available on GitHub, reaching a ten-fold speedup over the competing convex relaxation methods implemented by the cvxpy library.

algorithm, algorithm 1, sparsity-constraint optimization, (13 more...)

2406.12017

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Position: Understanding LLMs Requires More Than Statistical Generalization

Reizinger, Patrik, Ujváry, Szilvia, Mészáros, Anna, Kerekes, Anna, Brendel, Wieland, Huszár, Ferenc

arXiv.org Machine Learning, Jun-17-2024

The last decade has seen blossoming research in deep learning theory attempting to answer, "Why does deep learning generalize?" A powerful shift in perspective precipitated this progress: the study of overparametrized models in the interpolation regime. In this paper, we argue that another perspective shift is due, since some of the desirable qualities of LLMs are not a consequence of good statistical generalization and require a separate theoretical explanation. Our core argument relies on the observation that autoregressive (AR) probabilistic models are inherently non-identifiable: models that are zero or near-zero KL divergence apart, and thus have equivalent test loss, can exhibit markedly different behaviors. We support our position with mathematical examples and empirical observations, illustrating why non-identifiability has practical relevance through three case studies: (1) the non-identifiability of zero-shot rule extrapolation; (2) the approximate non-identifiability of in-context learning; and (3) the non-identifiability of fine-tunability. We review promising research directions focusing on LLM-relevant generalization measures, transferability, and inductive biases.
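The non-identifiability claim can be made concrete with a toy construction, which is an illustration of the general idea and not an example taken from the paper. The two hypothetical next-token models below share every conditional that is reachable with nonzero probability and differ only after the prefix "b", which neither model ever generates. The names `make_model`, `sequence_prob`, and `kl_divergence`, and the two-token vocabulary, are assumptions made for this sketch.

```python
import math
from itertools import product

VOCAB = ("a", "b")


def make_model(after_b):
    """Next-token conditionals; `after_b` is the only part that varies."""
    return {
        "": {"a": 1.0, "b": 0.0},   # every generated sequence starts with "a"
        "a": {"a": 0.5, "b": 0.5},
        "b": after_b,               # never reached when sampling from the model
    }


def sequence_prob(model, seq):
    """Chain-rule probability of a length-2 sequence."""
    first, second = seq
    return model[""][first] * model[first][second]


def kl_divergence(p_model, q_model):
    """KL(p || q) over all length-2 sequences, skipping zero-probability terms."""
    total = 0.0
    for seq in product(VOCAB, repeat=2):
        p = sequence_prob(p_model, seq)
        q = sequence_prob(q_model, seq)
        if p > 0:
            total += p * math.log(p / q)
    return total


model_1 = make_model({"a": 0.9, "b": 0.1})
model_2 = make_model({"a": 0.1, "b": 0.9})

kl = kl_divergence(model_1, model_2)
# Behaviour when prompted with the out-of-distribution prefix "b":
continuation_1 = max(model_1["b"], key=model_1["b"].get)
continuation_2 = max(model_2["b"], key=model_2["b"].get)
```

Here `kl` is zero, so no amount of held-out data drawn from either model can tell them apart, yet the most likely continuation of the prompt "b" is "a" for one model and "b" for the other.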

arxiv, generalization, inductive bias, (13 more...)

2405.01964

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(10 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

UniZero: Generalized and Efficient Planning with Scalable Latent World Models

Pu, Yuan, Niu, Yazhe, Ren, Jiyuan, Yang, Zhenjie, Li, Hongsheng, Liu, Yu

arXiv.org Artificial Intelligence, Jun-15-2024

Learning predictive world models is essential for enhancing the planning capabilities of reinforcement learning agents. Notably, the MuZero-style algorithms, based on the value equivalence principle and Monte Carlo Tree Search (MCTS), have achieved superhuman performance in various domains. However, in environments that require capturing long-term dependencies, MuZero's performance deteriorates rapidly. We identify that this is partially due to the entanglement of latent representations with historical information, which results in incompatibility with the auxiliary self-supervised state regularization. To overcome this limitation, we present UniZero, a novel approach that disentangles latent states from implicit latent history using a transformer-based latent world model. By concurrently predicting latent dynamics and decision-oriented quantities conditioned on the learned latent history, UniZero enables joint optimization of the long-horizon world model and policy, facilitating broader and more efficient planning in latent space. We demonstrate that UniZero, even with single-frame inputs, matches or surpasses the performance of MuZero-style algorithms on the Atari 100k benchmark. Furthermore, it significantly outperforms prior baselines in benchmarks that require long-term memory. Lastly, we validate the effectiveness and scalability of our design choices through extensive ablation studies, visual analyses, and multi-task learning results. The code is available at https://github.com/opendilab/LightZero.

latent state, unizero, world model, (14 more...)

2406.10667

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Alberta (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Education (0.67)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Order-theoretic models for decision-making: Learning, optimization, complexity and computation

Hack, Pedro

arXiv.org Artificial Intelligence, Jun-15-2024

The study of intelligent systems explains behaviour in terms of economic rationality. This results in an optimization principle involving a function or utility, which states that the system will evolve until the configuration of maximum utility is achieved. Recently, this theory has incorporated constraints, i.e., the optimum is achieved when the utility is maximized while respecting some information-processing constraints. This is reminiscent of thermodynamic systems. As such, the study of intelligent systems has benefited from the tools of thermodynamics. The first aim of this thesis is to clarify the applicability of these results in the study of intelligent systems. We can think of the local transition steps in thermodynamic or intelligent systems as being driven by uncertainty. In fact, the transitions in both systems can be described in terms of majorization. Hence, real-valued uncertainty measures like Shannon entropy are simply a proxy for their more involved behaviour. More generally, real-valued functions are fundamental to the study of optimization and complexity in the order-theoretic approach to several topics, including economics, thermodynamics, and quantum mechanics. The second aim of this thesis is to improve on this classification. The basic similarity between thermodynamic and intelligent systems is based on an uncertainty notion expressed by a preorder. We can also think of the transitions in the steps of a computational process as a decision-making procedure. In fact, by adding some requirements on the considered order structures, we can build an abstract model of uncertainty reduction that allows us to incorporate computability, that is, to distinguish the objects that can be constructed by following a finite set of instructions from those that cannot. The third aim of this thesis is to clarify the requirements on the order structure that allow such a framework.
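The remark that Shannon entropy is only a proxy for majorization can be checked on small examples. The sketch below uses the standard definition (p majorizes q when every partial sum of p's entries, sorted in decreasing order, is at least the corresponding partial sum for q); the function names and the four example distributions are chosen here for illustration and do not come from the thesis.

```python
import math


def majorizes(p, q, tol=1e-12):
    """Return True if distribution p majorizes distribution q."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    cum_p = cum_q = 0.0
    # Compare partial sums of the entries sorted in decreasing order.
    for a, b in zip(sorted(p, reverse=True), sorted(q, reverse=True)):
        cum_p += a
        cum_q += b
        if cum_p < cum_q - tol:
            return False
    return True


def shannon_entropy(p):
    """Shannon entropy in bits, treating 0 * log(0) as 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)


# A comparable pair: `sharp` majorizes `flat`, and has lower entropy.
sharp = [0.7, 0.2, 0.1]
flat = [0.4, 0.3, 0.3]

# An incomparable pair: neither majorizes the other,
# yet entropy still ranks them.
a = [0.5, 0.25, 0.25]
b = [0.4, 0.4, 0.2]

comparable = majorizes(sharp, flat)
incomparable = not majorizes(a, b) and not majorizes(b, a)
entropy_gap = shannon_entropy(b) - shannon_entropy(a)
```

Entropy is Schur-concave, so it never contradicts majorization on comparable pairs, but as a single real number it imposes a total order and therefore also ranks pairs such as `a` and `b` that the preorder leaves incomparable.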

information theory and statistical mechanics, multi-utility injective monotone strict monotone, recursive function and effective computability, (15 more...)

doi: 10.18725/OPARU-52612

2406.1073

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > Alameda County > Berkeley (0.13)
North America > United States > California > San Francisco County > San Francisco (0.13)
(14 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry:

Education > Educational Setting (0.45)
Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
(2 more...)

Horizon-wise Learning Paradigm Promotes Gene Splicing Identification

Li, Qi-Jie, Sun, Qian, Zhang, Shao-Qun

arXiv.org Artificial Intelligence, Jun-15-2024

Identifying gene splicing is a core and significant task confronted in modern collaboration between artificial intelligence and bioinformatics. Past decades have witnessed great efforts on this concern, such as the bio-plausible splicing pattern AT-CG and the famous SpliceAI. In this paper, we propose a novel framework for the task of gene splicing identification, named Horizon-wise Gene Splicing Identification (H-GSI). The proposed H-GSI follows the horizon-wise identification paradigm and comprises four components: the pre-processing procedure transforming string data into tensors, the sliding window technique handling long sequences, the SeqLab model, and the predictor. In contrast to existing studies that process gene information with a truncated fixed-length sequence, H-GSI employs a horizon-wise identification paradigm in which all positions in a sequence are predicted with only one forward computation, improving accuracy and efficiency. The experiments conducted on the real-world Human dataset show that our proposed H-GSI outperforms SpliceAI and achieves the best accuracy of 97.20%. The source code is available from this link.

prediction, sequence, splice site, (15 more...)

2406.119

Country: Asia > China > Jiangsu Province > Nanjing (0.05)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Bridging the Communication Gap: Artificial Agents Learning Sign Language through Imitation

Tavella, Federico, Galata, Aphrodite, Cangelosi, Angelo

arXiv.org Artificial Intelligence, Jun-14-2024

Artificial agents, particularly humanoid robots, interact with their environment, objects, and people using cameras, actuators, and physical presence. Their communication methods are often pre-programmed, limiting their actions and interactions. Our research explores acquiring non-verbal communication skills through learning from demonstrations, with potential applications in sign language comprehension and expression. In particular, we focus on imitation learning for artificial agents, exemplified by teaching a simulated humanoid American Sign Language. We use computer vision and deep learning to extract information from videos, and reinforcement learning to enable the agent to replicate observed actions. Compared to other methods, our approach eliminates the need for additional hardware to acquire information. We demonstrate how the combination of these different techniques offers a viable way to learn sign language. Our methodology successfully teaches 5 different signs involving the upper body (i.e., arms and hands). This research paves the way for advanced communication skills in artificial agents.

artificial intelligence, machine learning, pattern recognition, (17 more...)

2406.10043

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Gesture Recognition (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Interpretable Cascading Mixture-of-Experts for Urban Traffic Congestion Prediction

Jiang, Wenzhao, Han, Jindong, Liu, Hao, Tao, Tao, Tan, Naiqiang, Xiong, Hui

arXiv.org Artificial Intelligence, Jun-14-2024

Rapid urbanization has significantly escalated traffic congestion, underscoring the need for advanced congestion prediction services to bolster intelligent transportation systems. As one of the world's largest ride-hailing platforms, DiDi places great emphasis on the accuracy of congestion prediction to enhance the effectiveness and reliability of its real-time services, such as travel time estimation and route planning. Although numerous efforts have been made on congestion prediction, most of them fall short in handling heterogeneous and dynamic spatio-temporal dependencies (e.g., periodic and non-periodic congestion), particularly in the presence of noisy and incomplete traffic data. In this paper, we introduce a Congestion Prediction Mixture-of-Experts, CP-MoE, to address the above challenges. We first propose a sparsely-gated Mixture of Adaptive Graph Learners (MAGLs) with congestion-aware inductive biases to improve the model capacity for efficiently capturing complex spatio-temporal dependencies in varying traffic scenarios. Then, we devise two specialized experts to help identify stable trends and periodic patterns within the traffic data, respectively. By cascading these experts with MAGLs, CP-MoE delivers congestion predictions in a more robust and interpretable manner. Furthermore, an ordinal regression strategy is adopted to facilitate effective collaboration among diverse experts. Extensive experiments on real-world datasets demonstrate the superiority of our proposed method compared with state-of-the-art spatio-temporal prediction models. More importantly, CP-MoE has been deployed in DiDi to improve the accuracy and reliability of the travel time estimation system.

cp-moe, prediction, proceedings, (12 more...)

2406.12923

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
Asia > China > Shanghai > Shanghai (0.05)
(5 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Information Technology (1.00)
Consumer Products & Services > Travel (0.88)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(2 more...)

Fundamental operating regimes, hyper-parameter fine-tuning and glassiness: towards an interpretable replica-theory for trained restricted Boltzmann machines

Fachechi, Alberto, Agliari, Elena, Aquaro, Miriam, Coolen, Anthony, Mulder, Menno

arXiv.org Artificial Intelligence, Jun-14-2024

We consider restricted Boltzmann machines with a binary visible layer and a Gaussian hidden layer trained by an unlabelled dataset composed of noisy realizations of a single ground pattern. We develop a statistical mechanics framework to describe the network generative capabilities, by exploiting the replica trick and assuming self-averaging of the underlying order parameters (i.e., replica symmetry). In particular, we outline the effective control parameters (e.g., the relative number of weights to be trained, the regularization parameter), whose tuning can yield qualitatively-different operative regimes. Further, we provide analytical and numerical evidence for the existence of a sub-region in the space of the hyperparameters where replica-symmetry breaking occurs.

boltzmann machine, regime, transition, (14 more...)

2406.09924

Country:

Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > Italy (0.04)
Europe > Finland (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)
