AITopics

Current research on human activity recognition (HAR) mainly assumes that training and testing data are drawn from the same distribution to achieve a generalised model, which means all the data are considered to be independent and identically distributed $\displaystyle (i.i.d.) $. In many real-world applications, this assumption does not hold, and collected training and target testing datasets have non-uniform distribution, such as in the case of cross-user HAR. Domain adaptation is a promising approach for cross-user HAR tasks. Existing domain adaptation works based on the assumption that samples in each domain are $\displaystyle i.i.d. $ and do not consider the knowledge of temporal relation hidden in time series data for aligning data distribution. This strong assumption of $\displaystyle i.i.d. $ may not be suitable for time series-related domain adaptation methods because the samples formed by time series segmentation and feature extraction techniques are only coarse approximations to $\displaystyle i.i.d. $ assumption in each domain. In this paper, we propose the temporal relation optimal transport (TROT) method to utilise temporal relation and relax the $\displaystyle i.i.d. $ assumption for the samples in each domain for accurate and efficient knowledge transfer. We obtain the temporal relation representation and implement temporal relation alignment of activities via the Hidden Markov model (HMM) and optimal transport (OT) techniques. Besides, a new regularisation term that preserves temporal relation order information for an improved optimal transport mapping is proposed to enhance the domain adaptation performance. Comprehensive experiments are conducted on three public activity recognition datasets (i.e. OPPT, PAMAP2 and DSADS), demonstrating that TROT outperforms other state-of-the-art methods.

activity recognition, domain adaptation, target user, (11 more...)

2403.15423

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.05)
North America > Canada > Newfoundland and Labrador > Labrador (0.04)
Europe > Italy > Sicily (0.04)
Europe > Greece (0.04)

Genre:

Research Report > Promising Solution (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Consumer Health (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Rizvi, Michael, Lizaire, Maude, Lacroce, Clara, Rabusseau, Guillaume

Simulating Weighted Automata over Sequences and Trees with Transformers

Transformers are ubiquitous models in the natural language processing (NLP) community and have shown impressive empirical successes in the past few years. However, little is understood about how they reason and the limits of their computational capabilities. These models do not process data sequentially, and yet outperform sequential neural models such as RNNs. Recent work has shown that these models can compactly simulate the sequential reasoning abilities of deterministic finite automata (DFAs). This leads to the following question: can transformers simulate the reasoning of more complex finite state machines? In this work, we show that transformers can simulate weighted finite automata (WFAs), a class of models which subsumes DFAs, as well as weighted tree automata (WTA), a generalization of weighted automata to tree structured inputs. We prove these claims formally and provide upper bounds on the sizes of the transformer models needed as a function of the number of states the target automata. Empirically, we perform synthetic experiments showing that transformers are able to learn these compact solutions via standard gradient-based training.

automata, construction, transformer, (14 more...)

2403.09728

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

SpaceOctopus: An Octopus-inspired Motion Planning Framework for Multi-arm Space Robot

Zhao, Wenbo, Wang, Shengjie, Fan, Yixuan, Gao, Yang, Zhang, Tao

Space robots have played a critical role in autonomous maintenance and space junk removal. Multi-arm space robots can efficiently complete the target capture and base reorientation tasks due to their flexibility and the collaborative capabilities between the arms. However, the complex coupling properties arising from both the multiple arms and the free-floating base present challenges to the motion planning problems of multi-arm space robots. We observe that the octopus elegantly achieves similar goals when grabbing prey and escaping from danger. Inspired by the distributed control of octopuses' limbs, we develop a multi-level decentralized motion planning framework to manage the movement of different arms of space robots. This motion planning framework integrates naturally with the multi-agent reinforcement learning (MARL) paradigm. The results indicate that our method outperforms the previous method (centralized training). Leveraging the flexibility of the decentralized framework, we reassemble policies trained for different tasks, enabling the space robot to complete trajectory planning tasks while adjusting the base attitude without further learning. Furthermore, our experiments confirm the superior robustness of our method in the face of external disturbances, changing base masses, and even the failure of one arm.

agent, robot, space robot, (13 more...)

2403.08219

Country: Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency

Zhao, Yanxiao, Qian, Yangge, Wang, Tianyi, Shan, Jingyang, Qin, Xiaolin

Deep reinforcement learning (DRL) algorithms require substantial samples and computational resources to achieve higher performance, which restricts their practical application and poses challenges for further development. Given the constraint of limited resources, it is essential to leverage existing computational work (e.g., learned policies, samples) to enhance sample efficiency and reduce the computational resource consumption of DRL algorithms. Previous works to leverage existing computational work require intrusive modifications to existing algorithms and models, designed specifically for specific algorithms, lacking flexibility and universality. In this paper, we present the Snapshot Reinforcement Learning (SnapshotRL) framework, which enhances sample efficiency by simply altering environments, without making any modifications to algorithms and models. By allowing student agents to choose states in teacher trajectories as the initial state to sample, SnapshotRL can effectively utilize teacher trajectories to assist student agents in training, allowing student agents to explore a larger state space at the early training phase. We propose a simple and effective SnapshotRL baseline algorithm, S3RL, which integrates well with existing DRL algorithms. Our experiments demonstrate that integrating S3RL with TD3, SAC, and PPO algorithms on the MuJoCo benchmark significantly improves sample efficiency and average return, without extra samples and additional computational resources.

agent, algorithm, snapshotrl, (13 more...)

2403.00673

Country:

Asia > China > Sichuan Province > Chengdu (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(6 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Education (1.00)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Neural Information Processing SystemsMar-11-2024, 20:10:56 GMT

Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net

We propose a novel method to directly learn a stochastic transition operator whose repeated application provides generated samples. Traditional undirected graphical models approach this problem indirectly by learning a Markov chain model whose stationary distribution obeys detailed balance with respect to a parameterized energy function. The energy function is then modified so the model and data distributions match, with no guarantee on the number of steps required for the Markov chain to converge. Moreover, the detailed balance condition is highly restrictive: energy based models corresponding to neural networks must have symmetric weights, unlike biological neural circuits. In contrast, we develop a method for directly learning arbitrarily parameterized transition operators capable of expressing nonequilibrium stationary distributions that violate detailed balance, thereby enabling us to learn more biologically plausible asymmetric neural networks and more general non-energy based dynamical systems. The proposed training objective, which we derive via principled variational methods, encourages the transition operator to "walk back" (prefer to revert its steps) in multi-step trajectories that start at datapoints, as quickly as possible back to the original data points. We present a series of experimental results illustrating the soundness of the proposed approach, Variational Walkback (VW), on the MNIST, CIFAR-10, SVHN and CelebA datasets, demonstrating superior samples compared to earlier attempts to learn a transition operator. We also show that although each rapid training trajectory is limited to a finite but variable number of steps, our transition operator continues to generate good samples well past the length of such trajectories, thereby demonstrating the match of its non-equilibrium stationary distribution to the data distribution.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Workflow (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

arXiv.org Artificial IntelligenceMar-11-2024

On the Global Convergence of Policy Gradient in Average Reward Markov Decision Processes

Kumar, Navdeep, Murthy, Yashaswini, Shufaro, Itai, Levy, Kfir Y., Srikant, R., Mannor, Shie

We present the first finite time global convergence analysis of policy gradient in the context of infinite horizon average reward Markov decision processes (MDPs). Specifically, we focus on ergodic tabular MDPs with finite state and action spaces. Our analysis shows that the policy gradient iterates converge to the optimal policy at a sublinear rate of $O\left({\frac{1}{T}}\right),$ which translates to $O\left({\log(T)}\right)$ regret, where $T$ represents the number of iterations. Prior work on performance bounds for discounted reward MDPs cannot be extended to average reward MDPs because the bounds grow proportional to the fifth power of the effective horizon. Thus, our primary contribution is in proving that the policy gradient algorithm converges for average-reward MDPs and in obtaining finite-time performance guarantees. In contrast to the existing discounted reward performance bounds, our performance bounds have an explicit dependence on constants that capture the complexity of the underlying MDP. Motivated by this observation, we reexamine and improve the existing performance bounds for discounted reward MDPs. We also present simulations to empirically evaluate the performance of average reward policy gradient algorithm.

average reward, convergence, policy gradient, (15 more...)

2403.06806

Country:

Asia > Middle East > Israel (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

arXiv.org Artificial IntelligenceMar-11-2024

A Survey of Explainable Knowledge Tracing

Bai, Yanhong, Zhao, Jiabao, Wei, Tingjiang, Cai, Qing, He, Liang

With the long term accumulation of high quality educational data, artificial intelligence has shown excellent performance in knowledge tracing. However, due to the lack of interpretability and transparency of some algorithms, this approach will result in reduced stakeholder trust and a decreased acceptance of intelligent decisions. Therefore, algorithms need to achieve high accuracy, and users need to understand the internal operating mechanism and provide reliable explanations for decisions. This paper thoroughly analyzes the interpretability of KT algorithms. First, the concepts and common methods of explainable artificial intelligence and knowledge tracing are introduced. Next, explainable knowledge tracing models are classified into two categories: transparent models and black box models. Then, the interpretable methods used are reviewed from three stages: ante hoc interpretable methods, post hoc interpretable methods, and other dimensions. It is worth noting that current evaluation methods for explainable knowledge tracing are lacking. Hence, contrast and deletion experiments are conducted to explain the prediction results of the deep knowledge tracing model on the ASSISTment2009 by using three XAI methods. Moreover, this paper offers some insights into evaluation methods from the perspective of educational stakeholders. This paper provides a detailed and comprehensive review of the research on explainable knowledge tracing, aiming to offer some basis and inspiration for researchers interested in the interpretability of knowledge tracing.

international conference, interpretability, knowledge, (14 more...)

2403.07279

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre:

Overview (1.00)
Instructional Material (1.00)
Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(6 more...)

arXiv.org Artificial IntelligenceMar-11-2024

A Survey of Learned Indexes for the Multi-dimensional Space

Al-Mamun, Abdullah, Wu, Hao, He, Qiyang, Wang, Jianguo, Aref, Walid G.

A recent research trend involves treating database index structures as Machine Learning (ML) models. In this domain, single or multiple ML models are trained to learn the mapping from keys to positions inside a data set. This class of indexes is known as "Learned Indexes." Learned indexes have demonstrated improved search performance and reduced space requirements for one-dimensional data. The concept of one-dimensional learned indexes has naturally been extended to multi-dimensional (e.g., spatial) data, leading to the development of "Learned Multi-dimensional Indexes". This survey focuses on learned multi-dimensional index structures. Specifically, it reviews the current state of this research area, explains the core concepts behind each proposed method, and classifies these methods based on several well-defined criteria. We present a taxonomy that classifies and categorizes each learned multi-dimensional index, and survey the existing literature on learned multi-dimensional indexes according to this taxonomy. Additionally, we present a timeline to illustrate the evolution of research on learned indexes. Finally, we highlight several open challenges and future research directions in this emerging and highly active field.

learned index, multi-dimensional index, proceedings, (14 more...)

2403.06456

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
(4 more...)

arXiv.org Artificial IntelligenceMar-10-2024

Risk-Sensitive RL with Optimized Certainty Equivalents via Reduction to Standard RL

Wang, Kaiwen, Liang, Dawen, Kallus, Nathan, Sun, Wen

We study Risk-Sensitive Reinforcement Learning (RSRL) with the Optimized Certainty Equivalent (OCE) risk, which generalizes Conditional Value-at-risk (CVaR), entropic risk and Markowitz's mean-variance. Using an augmented Markov Decision Process (MDP), we propose two general meta-algorithms via reductions to standard RL: one based on optimistic algorithms and another based on policy optimization. Our optimistic meta-algorithm generalizes almost all prior RSRL theory with entropic risk or CVaR. Under discrete rewards, our optimistic theory also certifies the first RSRL regret bounds for MDPs with bounded coverability, e.g., exogenous block MDPs. Under discrete rewards, our policy optimization meta-algorithm enjoys both global convergence and local improvement guarantees in a novel metric that lower bounds the true OCE risk. Finally, we instantiate our framework with PPO, construct an MDP, and show that it learns the optimal risk-sensitive policy while prior algorithms provably fail.

algorithm, augmdp, mdp, (15 more...)

2403.06323

Genre:

Workflow (0.68)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Benhamou, Jonas, Bonnabel, Silvère, Chapdelaine, Camille

Backpropagation-Based Analytical Derivatives of EKF Covariance for Active Sensing

arXiv.org Artificial IntelligenceMar-8-2024

In robotics, perception-aware (PA) approaches, [1, 2, 3, 4], or active sensing approaches, seek trajectories that maximize information gathered from sensors so as to perform robotic tasks safely. Notably, in the context of ground vehicles, when localization is based on ranging or bearing measurements relative to beacons, the efficiency of active sensing has been shown by [5, 6]. In [7], trajectories are generated to perform optimal online calibration between GPS and inertial measurement unit (IMU), see also [8]. In [3], for visual-inertial navigation systems, the authors have optimized the duration in which landmarks remain within the field of view. In the context of simultaneous localization and mapping (SLAM), those methods pertain to active SLAM, see [9].

equation, gradient, trajectory, (15 more...)

2402.17569

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.50)

Industry: Automobiles & Trucks (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.43)